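eBPF is widely used to monitor the network stack, system calls, and file systems, because it improves the observability, performance, and security of the Linux kernel. Compilers that target eBPF already exist, but existing tools often miss key optimization opportunities, which results in suboptimal performance. In addition, any optimization of an eBPF program has to respect the kernel's safety requirements, which makes the optimization problem harder.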
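The paper addresses how to apply multi-tier optimization to eBPF code so that it runs faster and is more compact inside the Linux kernel. eBPF is a virtual machine that runs in the kernel; it lets users safely execute custom programs that observe, analyze, and modify kernel behavior. However, an eBPF program must pass the kernel's strict verification before it runs, and it is limited in instruction count and program length, so improving performance without sacrificing safety is a challenge.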
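A brief summary of the paper's innovations: according to the abstract, the paper introduces Merlin, an optimization framework that uses customized LLVM passes to transform the IR and bytecode rewriting to refine the generated bytecode. Merlin relies on two primary strategies, instruction merging and strength reduction, and applies them before eBPF verification, so the optimized program still goes through the kernel verifier.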
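To make the two strategy names from the abstract concrete, here is a minimal Python sketch of what strength reduction and instruction merging look like as peephole rewrites. It is an illustration under stated assumptions, not Merlin's implementation: the tuple-based instruction format, the opcode names (`mul_imm`, `lsh_imm`, `add_imm`, `mov_imm`), and the functions `strength_reduce`, `merge_adds`, and `optimize` are all hypothetical and do not correspond to real eBPF encodings or to the paper's LLVM passes.

```python
from typing import List, Tuple

# Toy instruction: (opcode, destination register, immediate).
# This is a hypothetical format, NOT the real eBPF instruction encoding.
Insn = Tuple[str, str, int]


def strength_reduce(prog: List[Insn]) -> List[Insn]:
    """Replace a multiply by a power of two with a cheaper left shift."""
    out: List[Insn] = []
    for op, dst, imm in prog:
        is_power_of_two = imm > 1 and (imm & (imm - 1)) == 0
        if op == "mul_imm" and is_power_of_two:
            # x * 2**k == x << k under wrapping integer arithmetic
            out.append(("lsh_imm", dst, imm.bit_length() - 1))
        else:
            out.append((op, dst, imm))
    return out


def merge_adds(prog: List[Insn]) -> List[Insn]:
    """Fold adjacent add-immediates on the same register into one.

    Assumption: the summed immediate still fits the immediate field;
    a real pass would have to check that before merging.
    """
    out: List[Insn] = []
    for op, dst, imm in prog:
        if out and op == "add_imm" and out[-1][0] == "add_imm" and out[-1][1] == dst:
            _, _, prev_imm = out.pop()
            out.append(("add_imm", dst, prev_imm + imm))
        else:
            out.append((op, dst, imm))
    return out


def optimize(prog: List[Insn]) -> List[Insn]:
    """Run both toy rewrites in sequence."""
    return merge_adds(strength_reduce(prog))


program = [
    ("mul_imm", "r1", 8),
    ("add_imm", "r1", 4),
    ("add_imm", "r1", 6),
    ("mov_imm", "r2", 3),
]
optimized = optimize(program)
print(len(program), "->", len(optimized), "instructions")
```

In this toy program the multiply becomes a shift and the two additions collapse into one, so the instruction count drops from four to three. A shorter program matters for eBPF specifically because, as noted above, programs are limited in instruction count and must still pass the verifier after rewriting.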
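Results: the abstract reports an evaluation on 19 XDP programs (drawn from the Linux kernel, Meta, hXDP, and Cilium) and three eBPF-based systems (Sysdig, Tetragon, and Tracee). All optimized programs pass kernel verification. Compared with the original programs, Merlin can reduce the number of instructions by 73% and runtime overhead by 60%. Compared with the state-of-the-art technique K2, it improves throughput by 0.59% and reduces latency by 5.31%, while running 106 times faster and scaling to larger and more complex programs without additional manual effort.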
@inproceedings{10.1145/3620666.3651387,
author = {Mao, Jinsong and Ding, Hailun and Zhai, Juan and Ma, Shiqing},
title = {Merlin: Multi-tier Optimization of eBPF Code for Performance and Compactness},
year = {2024},
isbn = {9798400703867},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3620666.3651387},
doi = {10.1145/3620666.3651387},
abstract = {eBPF (extended Berkeley Packet Filter) significantly enhances observability, performance, and security within the Linux kernel, playing a pivotal role in various real-world applications. Implemented as a register-based kernel virtual machine, eBPF features a customized Instruction Set Architecture (ISA) with stringent kernel safety requirements, e.g., a limited number of instructions. This constraint necessitates substantial optimization efforts for eBPF programs to meet performance objectives. Despite the availability of compilers supporting eBPF program compilation, existing tools often overlook key optimization opportunities, resulting in suboptimal performance. In response, this paper introduces Merlin, an optimization framework leveraging customized LLVM passes and bytecode rewriting for Instruction Representation (IR) transformation and bytecode refinement. Merlin employs two primary optimization strategies, i.e., instruction merging and strength reduction. These optimizations are deployed before eBPF verification. We evaluate Merlin across 19 XDP programs (drawn from the Linux kernel, Meta, hXDP, and Cilium) and three eBPF-based systems (Sysdig, Tetragon, and Tracee, each comprising several hundred eBPF programs). The results show that all optimized programs pass the kernel verification. Meanwhile, Merlin can reduce number of instructions by 73\% and runtime overhead by 60\% compared with the original programs. Merlin can also improve the throughput by 0.59\% and reduce the latency by 5.31\%, compared to state-of-the-art technique K2, while being 106 times faster and more scalable to larger and more complex programs without additional manual efforts.},
booktitle = {Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 3},
pages = {639–653},
numpages = {15},
keywords = {eBPF optimization, LLVM},
location = {La Jolla, CA, USA},
series = {ASPLOS '24}
}