eBPF Memory Leak Detection — Potential Q&A

Based on slides 36–45 of the presentation. Organized by topic.

1. eBPF vs Traditional Tools (Slide 37)

Q1: What’s the overhead of eBPF memleak in production?

A: Typically less than 5% CPU. Each uprobe or kprobe hit adds on the order of a few microseconds, so the total cost scales with how often the probed functions are called. For functions called fewer than 10K times per second, the impact is negligible. On extremely hot paths (>100K calls/sec), the overhead becomes noticeable.

Q2: Why not just use ASan in production? It’s only 2-3x overhead.

A: 2-3x is still too much for most production services: you would need 2-3x the hardware to maintain the same throughput, and ASan’s shadow memory also increases memory usage. ASan requires recompilation with -fsanitize=address, which changes the binary, and it cannot be attached to an already-running process. eBPF can attach and detach at any time without restarting.

Q3: Can eBPF detect use-after-free at all?

A: Not directly. eBPF memleak tracks alloc/free pairing: it knows whether memory was freed, but not whether it was accessed after being freed. For use-after-free detection, you still need Valgrind (Memcheck) or ASan, which maintain shadow memory that tracks the state of every byte.

Q4: What about memory leak detection in Go or Java programs?

A: Go and Java use garbage collectors, so traditional malloc/free leaks don’t apply to their managed heaps. They can still have logical leaks (growing maps, goroutines blocked forever on channels, accumulating references). eBPF memleak is primarily for C/C++/Rust programs that use manual memory management or the system allocator; in Go or Java it is mainly useful for native allocations made through cgo or JNI. For Go, use pprof; for Java, use JVM heap dumps.

Q5: Does it work on containers/Kubernetes?

A: Yes. eBPF runs at the kernel level, so it can trace any process on the host regardless of whether it’s in a container. You specify the target PID (as seen from the host’s PID namespace) or the binary path. In Kubernetes, you’d run the memleak tool on the node where the pod is scheduled, targeting the specific container process.

Q6: What Linux kernel version do we actually need?

A: We treat 4.9 as the practical minimum. (BPF programs could attach to kprobes from 4.1 and to uprobes from 4.3, and the stack-trace map type that memleak depends on arrived in 4.6.) For a stable memleak experience with stack traces, we recommend 4.14+. CAP_BPF, which allows tracing without full root, requires 5.8+. Our product targets recent kernels, so this isn’t a practical limitation.

2. What We Capture (Slide 38)

Q7: Does memleak slow down malloc/free calls?

A: Each uprobe hit adds approximately 2-4 microseconds. For a typical application that calls malloc a few thousand times per second, this is imperceptible. For applications with millions of malloc calls per second (e.g., high-frequency trading), that cost adds up to whole CPU cores, so you’d want to sample rather than trace every call.
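
A quick way to sanity-check the overhead figures is to multiply call rate by per-hit cost. The sketch below is back-of-the-envelope arithmetic only; the function name is made up for illustration, and the 3 µs figure is simply the midpoint of the 2-4 µs range quoted above.

```python
def probe_cpu_fraction(calls_per_sec: float, cost_us: float) -> float:
    """Fraction of one CPU core spent inside probes (1.0 == one full core)."""
    return calls_per_sec * cost_us / 1_000_000


if __name__ == "__main__":
    # Assumed per-hit cost: 3 microseconds (midpoint of the quoted 2-4 us).
    for rate in (5_000, 100_000, 2_000_000):
        fraction = probe_cpu_fraction(rate, 3.0)
        print(f"{rate:>9} calls/s -> {fraction:.1%} of one core")
```

At a few thousand calls per second the cost is around 1-2% of a core; at 100K calls per second it is roughly 30%; at millions of calls per second it exceeds several cores, which is why tracing every call stops being viable there.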

Q8: How do you get the stack trace? Does it require debug symbols?

A: We use the kernel’s BPF stack trace helper. For user-space, it walks the frame pointers. This means:
- You need the binary compiled with -fno-omit-frame-pointer for reliable user-space stacks.
- Without frame pointers, you’ll get partial or missing stacks.
- Kernel-space stacks work reliably because the kernel unwinds its own stack with its built-in unwinder (frame pointers or ORC, depending on how the kernel was built).
- Symbols are not required at collection time. Afterward, the ELF symbol table is enough to translate addresses to function names; debug info (DWARF) is needed for file and line numbers.

Q9: What happens if the allocation is in a shared library (libc, etc.)?

A: memleak traces through shared libraries transparently. Since we hook the actual malloc/free symbols (which live in libc.so), we capture all allocations regardless of which library or module called them. The stack trace shows the full call chain from the application through any libraries.

Q10: Can you filter by specific process or thread?

A: Yes. You can target a specific PID to trace only that process. The tool also records the TID (thread ID) of each allocation, so you can filter results by thread in post-processing.

Q11: How much memory does the eBPF memleak tool itself consume?

A: The BPF maps (hash maps storing allocation records) grow in proportion to the number of outstanding (unfreed) allocations being tracked. For a typical application with thousands of active allocations, this is a few megabytes. The stack trace storage is the largest consumer; each unique stack is stored once and referenced by ID.

3. User-Space Tracing (Slides 39-40)

Q12: What if the program uses a custom allocator instead of malloc?

A: It depends on how the allocator is wired in. A pool or arena allocator that obtains its memory by calling malloc is still visible at the malloc level. Drop-in replacements such as jemalloc, tcmalloc, and mimalloc do not wrap the system malloc: they export their own malloc/free and get memory from the kernel via mmap, so probes on libc’s malloc never fire. In that case, attach the uprobes to the allocation/deallocation functions in the allocator’s own library (upstream BCC memleak has an --obj option for this). A fully custom allocator that never calls malloc needs uprobes on its specific functions in the same way.

Q13: Does it work with C++ new/delete?

A: Yes, with the default runtime. In libstdc++ and libc++, operator new calls malloc and operator delete calls free, so hooking malloc/free covers C++ allocations as well. The exception is a program that replaces the global operator new/delete with its own allocator.

Q14: Can it detect leaks in short-lived programs?

A: Yes, but you need to start memleak before or at the same time as the target program. For short-lived programs, any allocation that wasn’t freed by program exit is reported as a leak. Note that one-time allocations deliberately left for the OS to reclaim at exit may show up as false positives.

Q15: How do you distinguish between intentional long-lived allocations and actual leaks?

A: This is where human judgment comes in. memleak reports all unfreed memory. The key indicators of a real leak are:
1. Allocations that keep growing over time (not just one-time initialization)
2. Multiple allocations from the same call stack accumulating
3. Memory that grows in proportion to load or elapsed time

Our HTML report’s “continuous leak detection” feature specifically flags pattern #1.

4. Kernel-Space Tracing (Slide 41)

Q16: Is it safe to trace kernel allocations in production?

A: Yes, with the usual care. kprobes are a well-established kernel tracing mechanism used in production by many organizations. The BPF verifier rejects probe programs that could access invalid memory or loop without bound, so a loaded program should not be able to crash the kernel. The per-hit overhead is similar to user-space, a few microseconds per probe hit, but kernel allocation functions can be called very frequently, so check the event rate first.

Q17: Can you trace specific kernel modules?

A: Yes, in post-processing. The tool captures all kernel allocations; you then filter by kernel function name patterns. For example, to see only allocations from a network driver, look for stacks containing that driver’s functions.

Q18: What about SLUB/SLAB allocator tracking?

A: kmem_cache_alloc is the slab allocator interface (SLAB or SLUB, depending on the kernel), and we hook it directly. This covers most kernel object allocations (inodes, dentries, sk_buffs, etc.). kmalloc is backed by the same slab allocator internally, so we capture both paths.

5. Rust Projects (Slide 42)

Q19: How exactly can Rust leak memory if it has ownership?

A: Several ways:
1. Rc/Arc reference cycles: two objects point to each other, so the reference count never reaches zero
2. mem::forget(): takes ownership of a value without running its destructor
3. Box::leak(): intentionally leaks to get a 'static reference
4. FFI boundaries: memory allocated in C code called from Rust and never handed back to be freed
5. Unbounded collections: a Vec/HashMap that keeps growing without bound

Rust’s safety guarantee is about memory safety (no UB), not memory efficiency. Leaking is safe because it doesn’t cause undefined behavior.

Q20: Does Rust always use system malloc? What about custom allocators?

A: By default, Rust uses the system allocator (malloc/free on Linux). You can set a custom global allocator via #[global_allocator]. Allocators such as jemalloc or mimalloc do not call the system malloc underneath; they manage memory obtained from mmap themselves. With one of them installed, the probe targets must be changed to that allocator’s own allocation functions.

Q21: Can you detect Rc/Arc cycle leaks specifically?

A: memleak will show you that memory allocated by Rc/Arc is never freed: it will appear in the “outstanding allocations” report. However, it won’t tell you why it wasn’t freed (i.e., that it’s a cycle). You’d need to analyze the code to determine the cause. cargo-fuzz is a fuzzing harness rather than a cycle detector, so at best it helps exercise the leaking path; runtime cycle detection crates, or replacing one side of the cycle with Weak, address cycles specifically.

6. HTML Visual Report (Slides 43-45)

Q22: Can the report handle very large logs (hours of tracing)?

A: Yes. The report aggregates data by snapshot intervals, so even hours of tracing produce a manageable number of data points. The main constraint is browser rendering performance for the charts; we’ve tested with reports covering thousands of snapshots without issues.

Q23: How does the “continuous leak detection” algorithm work?

A: It compares each allocation source across all snapshots. If a source’s total outstanding bytes increase from every snapshot to the next (strictly monotonically increasing), it’s flagged as a suspected continuous leak. This filters out one-time allocations and focuses on memory that is genuinely growing.
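
The monotonic-growth check described above can be sketched in a few lines. This is an illustration of the idea, not the report’s actual implementation: the data layout (a mapping from source name to per-snapshot byte totals), the function name, and the `min_snapshots` guard are all assumptions made for the example.

```python
from typing import Dict, List


def continuously_growing(history: Dict[str, List[int]],
                         min_snapshots: int = 3) -> List[str]:
    """Return sources whose outstanding bytes rise in every consecutive snapshot."""
    flagged = []
    for source, sizes in history.items():
        if len(sizes) < min_snapshots:
            continue  # too few data points to call it a trend
        if all(later > earlier for earlier, later in zip(sizes, sizes[1:])):
            flagged.append(source)
    return flagged


if __name__ == "__main__":
    history = {
        "cache_init": [4096, 4096, 4096, 4096],      # one-time allocation, plateaus
        "handle_request": [1024, 2048, 3072, 4096],  # grows in every snapshot
        "parse_config": [512, 768, 640, 900],        # fluctuates
    }
    print(continuously_growing(history))
```

Only `handle_request` is flagged here. A strict every-snapshot rule is easy to reason about but is sensitive to a single flat or dipping sample, which is one reason a minimum snapshot count is worth having.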

Q24: Can we integrate this into CI/CD?

A: Yes. The memleak tool can run headlessly and output structured logs. You can add a pipeline step that:
1. Runs your service under memleak for N seconds
2. Parses the output
3. Fails the build if outstanding allocations exceed a threshold or if continuous growth is detected

The HTML report generation is a separate post-processing step that can also be automated.
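
As a rough sketch of steps 2 and 3, the snippet below parses snapshot text and decides whether the build should pass. The line format is modeled on upstream BCC memleak’s text output (“N bytes in M allocations from stack”); our tool’s structured log format may differ, so treat the regular expression, the function names, and the thresholds as assumptions.

```python
import re
from typing import List

# Assumed line format, modeled on upstream BCC memleak text output.
STACK_LINE = re.compile(r"^\s*(\d+) bytes in (\d+) allocations from stack")


def outstanding_bytes(snapshot_text: str) -> int:
    """Sum the outstanding bytes reported in one snapshot."""
    total = 0
    for line in snapshot_text.splitlines():
        match = STACK_LINE.match(line)
        if match:
            total += int(match.group(1))
    return total


def build_passes(snapshots: List[str], max_bytes: int) -> bool:
    """Fail if the last snapshot exceeds max_bytes or totals grow every snapshot."""
    totals = [outstanding_bytes(text) for text in snapshots]
    if not totals:
        return True  # nothing was reported
    over_threshold = totals[-1] > max_bytes
    growing = len(totals) >= 3 and all(
        later > earlier for earlier, later in zip(totals, totals[1:])
    )
    return not (over_threshold or growing)
```

In a pipeline step, the result would typically be turned into the process exit code, for example `sys.exit(0 if build_passes(snapshots, 1 << 20) else 1)`.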

Q25: What’s do_anonymous_page? Why does it dominate the kernel allocations?

A: do_anonymous_page is the kernel function that handles page faults for anonymous memory (heap, stack, mmap without a file backing). It’s normal for it to dominate, because the first touch of every freshly allocated user-space page goes through this path, including pages behind malloc. It typically represents your program’s working set, not a kernel bug.

Q26: Is kernfs_fop_open a real kernel leak?

A: It depends on context. kernfs_fop_open is called when a file on a kernfs-backed filesystem, such as sysfs or cgroupfs, is opened (procfs is not built on kernfs). If your monitoring tools are repeatedly reading /sys or cgroup files and the allocations keep growing, it could indicate a kernel reference counting bug, or it could be normal caching behavior. The continuous growth pattern flagged in our report suggests it’s worth investigating.

7. General / Architecture Questions

Q27: What’s the difference between BCC memleak and bpftrace for leak detection?

A: BCC memleak is a purpose-built tool with allocation tracking logic built in: it handles the alloc/free pairing, stack collection, and reporting. bpftrace is a general-purpose tracing language. You could write a leak detector in it, but you’d be reinventing what memleak already provides. Use memleak for this specific task; use bpftrace for ad-hoc custom tracing.

Q28: Can this detect memory fragmentation?

A: Not directly. memleak tracks allocation/free events, not how memory is laid out. Heap fragmentation is about the unusable gaps between live allocations inside the allocator’s address ranges. For that, use malloc_info() or allocator-specific introspection (jemalloc’s malloc_stats_print); /proc/buddyinfo covers a different problem, fragmentation of physical pages in the kernel’s buddy allocator.

Q29: What’s the maximum duration you can run memleak?

A: There’s no hard limit. However, the BPF hash map that stores active allocations has a configurable maximum size. If your program has millions of concurrent unfreed allocations, you might hit that limit. In practice, we’ve run memleak for 24+ hours on production services without issues. The map size can be tuned at startup.

Q30: Does this work on ARM/aarch64?

A: Yes. eBPF bytecode is architecture-independent; the kernel verifies it and JIT-compiles it for the host CPU. Both x86_64 and aarch64 are fully supported. The only architecture-specific concern is stack unwinding, which may require different compilation flags on ARM to preserve frame pointers.

Q31: How does this compare to Brendan Gregg’s original memleak tool?

A: Our tool is based on the memleak tool from the BCC (BPF Compiler Collection) project. That tool was written by Sasha Goldshtein; Brendan Gregg is a major BCC contributor. We’ve extended it with:
- The HTML visual reporting layer
- Continuous leak detection algorithm
- Rust project support validation
- Integration with our product’s workflow

The core BPF tracing logic is the same proven approach.

8. Performance & Production Concerns

Q32: What if memleak itself crashes? Does it affect the traced process?

A: No. eBPF programs are verified by the kernel before loading, so they should not be able to crash the kernel or the traced process. If the memleak user-space component crashes, the BPF probes are normally cleaned up automatically. The traced process continues running unaffected.

Q33: Can I run memleak on multiple processes simultaneously?

A: Yes. You can either run separate memleak instances targeting different PIDs, or run it in system-wide mode to trace all processes. System-wide mode captures every malloc/free on the system, which is useful when you don’t know which process is leaking, at the cost of proportionally higher overhead.

Q34: How do you handle multi-threaded programs? Is the tracing thread-safe?

A: Yes. Updates to BPF hash maps are safe under concurrency; the kernel handles the locking. Each allocation event is recorded together with its TID, so events from different threads do not get mixed up, even in highly concurrent programs.

Q35: What’s the memory map size limit? What happens when it’s full?

A: The default BPF hash map size is typically 10240 entries. When the map is full, new allocations are silently dropped from tracking; the allocation itself still succeeds, we just lose visibility. You can increase the map size with a command-line flag. For programs with massive numbers of concurrent allocations, we recommend increasing it to 100K+ entries.

Q36: Does memleak affect the timing/scheduling of the traced program?

A: Minimally. Each uprobe adds 2-4µs of latency to the malloc/free call. This happens in the context of the traced thread, so it can in principle affect timing-sensitive code. For real-time systems or latency-critical paths, consider sampling mode or limiting the tracing duration.

Q37: Can we use this in a 24/7 always-on monitoring setup?

A: It’s possible but not recommended for production hot paths, because the per-hit cost is paid on every allocation for as long as the probes are attached. A better approach is periodic sampling: run memleak for 60 seconds every hour, generate a report, and compare trends. This gives you leak detection without continuous overhead.

9. Stack Traces & Symbols

Q38: Why do I sometimes see incomplete stack traces?

A: Common causes:
1. Missing frame pointers: the binary was compiled with -fomit-frame-pointer (the gcc default when optimizing)
2. Inlined functions: the compiler inlines small functions, so they disappear from the stack
3. Stack depth limit: BPF stack traces have a maximum depth (usually 127 frames)
4. JIT/interpreted code: dynamically generated code often lacks frame pointers and symbol information

Fix: recompile with -fno-omit-frame-pointer. DWARF-based unwinding of user-space stacks is an alternative, but the BPF stack helper does not do it; it has to be performed by a user-space unwinder. The kernel’s ORC unwinder applies only to kernel stacks.

Q39: How do I resolve addresses to function names and line numbers?

A: Two approaches:
1. At analysis time: use addr2line or llvm-symbolizer with the debug symbols file to convert addresses to source locations
2. In our HTML report: we automatically resolve symbols if the binary with debug info is available at report generation time

You don’t need debug symbols on the production machine. Just keep them available offline for analysis.

Q40: Does it work with stripped binaries?

A: Partially. You’ll still get allocation tracking (address, size, timing) and kernel-space stacks, but user-space stack traces will show raw addresses instead of function names. You can resolve them later using a separate debug symbols file (.debug file or unstripped binary copy).

Q41: What about Position Independent Executables (PIE)? Do addresses change?

A: Yes. A PIE is loaded at a base address randomized by ASLR, so absolute addresses differ between runs. memleak captures the absolute virtual address at runtime, but the offset from the binary’s load base is constant. To resolve symbols, you need the load base address (from /proc/PID/maps). Our tool handles this automatically.
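
To make the base-address arithmetic concrete, here is a minimal sketch that maps a runtime address back to a file offset using lines in /proc/PID/maps format. The function name and the sample mapping are invented for the example, and this is not necessarily how our tool does it. For a typical PIE whose text segment has equal virtual address and file offset, the result is the address you would hand to addr2line or llvm-symbolizer.

```python
from typing import Iterable, Optional, Tuple


def file_offset(maps_lines: Iterable[str], address: int) -> Optional[Tuple[str, int]]:
    """Return (path, offset in file) for a runtime address, or None if unmapped."""
    for line in maps_lines:
        # Fields: range, perms, offset, dev, inode, path (path may be absent).
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no backing file
        start_text, end_text = fields[0].split("-")
        start, end = int(start_text, 16), int(end_text, 16)
        if start <= address < end:
            mapping_offset = int(fields[2], 16)
            return fields[5].strip(), address - start + mapping_offset
    return None


if __name__ == "__main__":
    # Hypothetical mappings for a PIE loaded at a randomized base.
    maps = [
        "55d4c3a00000-55d4c3a01000 r--p 00000000 fd:01 1234 /usr/bin/app",
        "55d4c3a01000-55d4c3a21000 r-xp 00001000 fd:01 1234 /usr/bin/app",
    ]
    path, offset = file_offset(maps, 0x55D4C3A01136)
    print(path, hex(offset))
```

The same address in a different run would have a different absolute value but would resolve to the same offset, which is what makes offline symbolization possible.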

10. Comparison Deep Dives

Q42: Valgrind Memcheck vs eBPF memleak — when to use which?

A: It comes down to two questions: where are you running, and what are you looking for?

If you’re in a dev or test environment and you want to catch everything (use-after-free, buffer overflows, double-free, and leaks), Valgrind is the right tool. It gives you byte-level precision on all memory errors. The downside is that it slows your program by ten to twenty times or more. That’s fine on your laptop, but unacceptable in production.

If you’re in production and you suspect a leak, eBPF memleak is the answer. It attaches to a running process with low overhead: no restart, no recompile, no downtime. It tracks leaks in both user-space and kernel-space. But it only detects leaks; it won’t tell you about use-after-free or overflows.

There’s also a key operational difference: Valgrind requires you to start your program under it from the beginning. You can’t attach it to something already running. eBPF, on the other hand, can attach and detach at any time, like a stethoscope you press against a running system.

So the rule of thumb: Valgrind during development for comprehensive checking, eBPF in production for leak hunting. The two complement each other.

Q43: How does ASan’s leak detector (LeakSanitizer) compare?

A: LeakSanitizer (LSan) does a reachability analysis at program exit: it scans memory for pointers to determine what’s truly unreachable. This is more precise than eBPF, which just reports unfreed memory. But LSan requires rebuilding the program with the sanitizer, reports only at exit by default (not continuously), and doesn’t work on kernel memory. The 2-3x overhead applies when it runs as part of ASan; in standalone mode LSan adds very little overhead until the exit-time scan.

Q44: What about Electric Fence or DUMA?

A: These tools place guard pages around each allocation so that an overflow triggers a segfault immediately. They are extremely memory-hungry (each allocation gets its own page) and slow. They detect buffer overflows instantly but are not primarily leak detectors, and they’re unsuitable for production use.

Q45: What about using /proc/PID/smaps for leak detection?

A: /proc/PID/smaps shows memory regions and RSS but not individual allocations. It can tell you “heap is growing” but not “which malloc call is responsible.” It’s a coarse-grained tool for identifying the symptom, while memleak identifies the root cause with exact stack traces.

11. Edge Cases & Troubleshooting

Q46: What about memory pools? (pre-allocate a big chunk, sub-allocate from it)

A: If a program uses its own memory pool (allocates a large block once via malloc, then manages sub-allocations internally), memleak will only see the initial large malloc, not the internal sub-allocations. The initial block appears as “allocated, never freed,” which looks like a leak but isn’t. To trace pool-internal allocations, you’d need to uprobe the pool’s custom allocate/free functions.

Q47: How do you handle realloc? Does it look like a leak?

A: memleak handles realloc by tracking it as a free of the old address plus an allocation of the new address. If realloc returns a different pointer (because it moved the block), the old address is marked as freed. This avoids false positives from realloc patterns.
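
The bookkeeping behind alloc/free pairing, realloc handling, and a bounded map can be modeled in user space. The class below is a plain-Python illustration with invented names. It is not BPF code and not our tool’s implementation, but it shows why a moved realloc does not look like a leak and what “silently dropped” means when the map is full.

```python
from typing import Dict


class AllocationTracker:
    """User-space model of a bounded map of live allocations (address -> size)."""

    def __init__(self, max_entries: int = 10240) -> None:
        self.max_entries = max_entries
        self.live: Dict[int, int] = {}
        self.dropped = 0  # allocations we could not record because the map was full

    def on_alloc(self, address: int, size: int) -> None:
        if address == 0:
            return  # failed allocation, nothing to track
        if address not in self.live and len(self.live) >= self.max_entries:
            self.dropped += 1  # the program's allocation succeeded; we lost visibility
            return
        self.live[address] = size

    def on_free(self, address: int) -> None:
        self.live.pop(address, None)  # unknown addresses are ignored

    def on_realloc(self, old_address: int, new_address: int, size: int) -> None:
        if new_address == 0:
            return  # a failed realloc leaves the old block valid
        self.on_free(old_address)
        self.on_alloc(new_address, size)

    def outstanding_bytes(self) -> int:
        return sum(self.live.values())
```

After `on_realloc(0x1000, 0x2000, 128)` only the new address is live, so the moved block is counted once. With `max_entries=2`, a third concurrent allocation increments `dropped` instead of appearing in the report.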

Q48: What about programs that intentionally never free (cleanup-at-exit pattern)?

A: Many programs allocate at startup and rely on process exit for cleanup. These are technically leaks but not problematic. Our “continuous leak detection” feature handles this well: one-time allocations that plateau don’t trigger alerts. Only allocations that keep growing across multiple snapshots get flagged.

Q49: Can memleak detect leaks in signal handlers?

A: Yes. A malloc call inside a signal handler is traced just like any other malloc call; the uprobe fires regardless of the execution context. Note, however, that calling malloc in a signal handler is itself a bug, because malloc is not async-signal-safe.

Q50: What if dlopen/dlclose loads and unloads shared libraries?

A: Uprobes are registered against the library file (inode and offset), not against a particular mapping. When a library is unloaded with dlclose, its probes simply stop firing in that process, and they take effect again if the library is reloaded. If the library leaked memory before being unloaded, those allocations still appear in the report, with stack traces pointing into the (now unloaded) library. Address resolution may fail for unloaded libraries unless you keep the .so file for offline analysis.

Q51: Does fork() affect memleak tracking?

A: After fork(), the child process gets a copy of the parent’s memory but runs under a different PID. memleak keeps tracking the original (parent) process; allocations and frees made by the child are not attributed to it. To track the child, attach a new memleak instance to the child’s PID. Allocations inherited from the parent are not double-counted.

12. Deployment & Workflow

Q52: What’s the recommended workflow for investigating a suspected leak?

A:
1. Confirm the symptom: check RSS growth in monitoring (Grafana/Prometheus)
2. Attach memleak: run for 2-5 minutes with periodic snapshots (every 5-10 seconds)
3. Generate the HTML report: look for continuously growing sources
4. Identify the top offenders: expand stack traces to find the code path
5. Analyze the code: determine if it’s a real leak or intentional retention
6. Fix and verify: patch the code, then run memleak again to confirm the leak is gone

Q53: Can we save the raw data and generate the report later?

A: Yes. memleak outputs text logs to stdout or a file. You can collect the raw output, transfer it to another machine, and run the HTML report generator offline. This is useful when the production machine has limited tooling.

Q54: How do we handle false positives in reports?

A: Common false positives and how to handle them:
1. Startup allocations: filter by looking at the growth pattern (should plateau)
2. Cache/pool allocations: these grow then stabilize; the continuous-growth detector ignores them
3. Intentional leaks (daemon patterns): create a baseline and compare against it
4. Third-party library allocations: filter by stack trace function names

You can also maintain a “known allocations” allowlist to suppress known non-issues.
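
Filtering by stack trace function names and a “known allocations” list might look like the sketch below. The record layout (a dict with `bytes` and a `stack` list of function names) and the pattern list are assumptions for illustration; the real report data may be shaped differently.

```python
from typing import Dict, Iterable, List, Union

Record = Dict[str, Union[int, List[str]]]


def suppress_known(records: List[Record], known_patterns: Iterable[str]) -> List[Record]:
    """Drop records whose stack contains any frame matching a known pattern."""
    patterns = tuple(known_patterns)
    kept = []
    for record in records:
        frames = record["stack"]
        is_known = any(pattern in frame for frame in frames for pattern in patterns)
        if not is_known:
            kept.append(record)
    return kept


if __name__ == "__main__":
    records = [
        {"bytes": 65536, "stack": ["malloc", "tls_library_init", "main"]},
        {"bytes": 4096, "stack": ["malloc", "handle_request", "worker_loop"]},
    ]
    # Hypothetical pattern for a third-party init path that is known to be harmless.
    for record in suppress_known(records, ["tls_library_init"]):
        print(record["bytes"], " <- ".join(record["stack"]))
```

Substring matching keeps the list short but can over-match; exact frame names or anchored regular expressions are the stricter alternative.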

Q55: What privileges exactly do we need? Can we avoid running as root?

A: Options:
1. Root: always works, simplest
2. CAP_BPF + CAP_PERFMON (Linux 5.8+): minimum capabilities for BPF tracing without full root
3. CAP_SYS_ADMIN: also works but grants more than needed
4. Unprivileged BPF (sysctl kernel.unprivileged_bpf_disabled=0): not an option. It is discouraged for security reasons, and unprivileged users cannot load the tracing program types memleak needs anyway.

In production, we recommend creating a dedicated service account with only CAP_BPF + CAP_PERFMON.

Q56: Is there a way to get real-time alerts when a leak is detected?

A: Not built into the tool directly, but you can script it:
1. Run memleak with periodic output
2. Pipe the output to a script that watches for allocation count growth
3. Trigger alerts (PagerDuty, Slack webhook, etc.) when growth exceeds a threshold

Alternatively, integrate with your monitoring stack: export allocation metrics to Prometheus and set up Grafana alerts.
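
The “growth exceeds threshold” decision in such a script can be as simple as a rate check over a window of samples. The sketch below assumes the script has already reduced memleak’s periodic output to (timestamp in seconds, outstanding bytes) pairs; the function names and the threshold are placeholders, and the actual alert delivery is left out.

```python
from typing import List, Tuple

Sample = Tuple[float, int]  # (timestamp in seconds, outstanding bytes)


def growth_rate(samples: List[Sample]) -> float:
    """Average growth in bytes per second between the first and last sample."""
    (start_time, start_bytes), (end_time, end_bytes) = samples[0], samples[-1]
    if end_time <= start_time:
        return 0.0
    return (end_bytes - start_bytes) / (end_time - start_time)


def should_alert(samples: List[Sample], max_bytes_per_sec: float) -> bool:
    """True when there are at least two samples and growth exceeds the limit."""
    return len(samples) >= 2 and growth_rate(samples) > max_bytes_per_sec
```

For samples `[(0, 1000), (10, 1500), (20, 3000)]` the rate is 100 bytes per second, so a 50 B/s limit would alert and a 200 B/s limit would not.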