Today, deep insight into system activity is critical for ensuring performance, security, and dependability. Traditional observability techniques offer some of this information, but they frequently fall short when it comes to obtaining granular kernel-level data without incurring substantial overhead.
Enter eBPF (extended Berkeley Packet Filter), a technology that allows high-performance, safe, and customizable monitoring of system events at the kernel level.
Why is eBPF So Important for Observability?
eBPF allows a (privileged) user to run general-purpose programs that inspect activities and processes down to the kernel level. This matters because kernel-level events underlie all system operations.
eBPF code is event-driven: programs are attached to specified events of interest (known as hook points) and run whenever those events occur. These hooks commonly include system calls, network events such as new connections or particular packets, function entry and exit, kernel tracepoints, and dynamic probes (kprobes and uprobes).
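To make the event-driven model concrete, here is a minimal sketch in plain Python. It is a user-space model only, not real eBPF: the `HookRegistry` class, the hook names, and the event fields are all hypothetical stand-ins for kernel hook points and the programs attached to them.

```python
from collections import defaultdict


class HookRegistry:
    """Toy model of kernel hook points (hypothetical; not a real eBPF API)."""

    def __init__(self):
        # hook name -> list of attached handler functions
        self._handlers = defaultdict(list)

    def attach(self, hook, handler):
        """Attach a handler so it runs whenever `hook` fires."""
        self._handlers[hook].append(handler)

    def fire(self, hook, event):
        """Simulate reaching a hook point; return how many handlers ran."""
        handlers = self._handlers.get(hook, [])
        for handler in handlers:
            handler(event)
        return len(handlers)


registry = HookRegistry()
seen = []  # telemetry collected by the attached handler

# Attach one handler to a hypothetical "sys_enter_openat" hook.
registry.attach("sys_enter_openat", lambda ev: seen.append((ev["pid"], ev["path"])))

# Only events on the attached hook trigger the handler.
ran_open = registry.fire("sys_enter_openat", {"pid": 42, "path": "/etc/hosts"})
ran_other = registry.fire("sys_enter_write", {"pid": 42, "fd": 1})
```

The point of the model is that nothing runs until an event of interest occurs: the handler is idle for the `sys_enter_write` event because nothing is attached to that hook.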
eBPF Features and Capabilities
Beyond capturing system information during events, eBPF programs can store data in key-value data structures (maps), call helper functions (for example, to look up, insert, and delete map entries or to generate pseudo-random numbers), and flag events, making the resulting metadata and telemetry highly valuable for subsequent analytics.
What enhances eBPF’s power is its ability to call kernel-provided helper functions, chain to other eBPF programs, communicate with user space, and perform various tasks when these hooks fire. Let’s have a look at how eBPF provides both security and performance:
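The map operations can be sketched with a small user-space model. This is plain Python rather than real eBPF; `ToyHashMap` and `count_event` are hypothetical names, and the fixed `max_entries` capacity is included on the assumption that it mirrors the way eBPF maps are created with a fixed size.

```python
class ToyHashMap:
    """User-space stand-in for an eBPF hash map (hypothetical class)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries  # capacity is fixed at creation
        self._data = {}

    def lookup(self, key):
        """Return the stored value, or None if the key is absent."""
        return self._data.get(key)

    def update(self, key, value):
        """Insert or overwrite; refuse new keys once the map is full."""
        if key not in self._data and len(self._data) >= self.max_entries:
            return False
        self._data[key] = value
        return True

    def delete(self, key):
        """Remove a key; return True if it existed."""
        if key in self._data:
            del self._data[key]
            return True
        return False


def count_event(counts, pid):
    """Typical per-event pattern: look up, increment, write back."""
    current = counts.lookup(pid) or 0
    return counts.update(pid, current + 1)


counts = ToyHashMap(max_entries=2)
for pid in [100, 200, 100, 300]:  # pid 300 arrives when the map is full
    count_event(counts, pid)
```

After the loop, the map holds counts for PIDs 100 and 200 only; the event for PID 300 is dropped because the map is full, which is one reason map sizing matters in practice.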
Highly Secure
- Strict verification – Before any eBPF program is loaded into the kernel, the eBPF verifier checks it, rejecting code that could crash the kernel, fail to terminate, or access memory out of bounds.
- Sandboxed – eBPF programs run in a restricted environment within the kernel. They cannot access arbitrary kernel memory or data structures directly; access is mediated by the verifier and by helper functions.
- Limited operations – eBPF programs are often written in a restricted subset of the C language and compiled to a bytecode with a limited instruction set. This restricts what eBPF programs can do, lowering the likelihood of security vulnerabilities.
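The idea of checking a program before it is allowed to run can be illustrated with a deliberately simplified model. The instruction format below (tuples of opcode and operand), the `MAX_INSNS` limit, and the specific checks are assumptions chosen for illustration. The real in-kernel verifier is far more thorough (it tracks register state and memory bounds, among other things), and newer kernels also accept loops that can be proven bounded.

```python
MAX_INSNS = 16  # hypothetical size limit for this toy model


def verify(program):
    """Return (ok, reason) for a toy program.

    A program is a list of (opcode, operand) tuples. For "jmp", the
    operand is a relative offset counted from the next instruction.
    """
    if not program:
        return False, "empty program"
    if len(program) > MAX_INSNS:
        return False, "program too large"
    for index, (opcode, operand) in enumerate(program):
        if opcode == "jmp":
            if operand < 0:
                # A backward jump could loop forever, so reject it.
                return False, f"backward jump at {index}"
            if index + 1 + operand >= len(program):
                return False, f"jump out of bounds at {index}"
    if program[-1][0] != "exit":
        return False, "program does not end with exit"
    return True, "ok"


safe = [("mov", 1), ("jmp", 1), ("mov", 2), ("exit", 0)]
looping = [("mov", 1), ("jmp", -2), ("exit", 0)]
```

In this model `verify(safe)` accepts the program, while `verify(looping)` rejects it because of the backward jump. The design point is that rejection happens before execution, so an unsafe program never runs at all.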
High-Performance
- Run as native machine code – eBPF bytecode is typically compiled to native CPU instructions by a just-in-time (JIT) compiler. This leads to faster execution than interpreting the bytecode.
- No context switches – Traditional monitoring tools frequently transition between user space and kernel space to gather data, which consumes significant resources. In contrast, eBPF programs run inside the kernel, where they can read kernel data structures without that switching.
- Event-driven – eBPF programs run only in response to specific kernel events rather than running constantly. This minimizes overhead.
As a result, eBPF provides a secure and efficient programming interface to the kernel. Given that everything passes via the kernel, this opens up a number of previously unthinkable options.
How Does eBPF Affect Observability?
To answer this, let’s step out of the eBPF world for a moment and look at the components of a typical observability solution. Any observability solution has four main components:
- Data collection – Receiving telemetry data from apps and infrastructure.
- Data processing – This phase involves applying filters, indexes, and calculations to the gathered information.
- Data storage – This includes both short-term and long-term storage.
- The user experience layer – This determines how data is consumed by the user.
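The four components can be pictured as stages of a pipeline. The sketch below is a toy model in plain Python: the function names, the sample records, and the use of a dict as the "database" are all hypothetical. eBPF, as discussed in this article, fits into the first stage.

```python
from statistics import mean


def collect():
    """Data collection: hypothetical latency samples in milliseconds."""
    return [
        {"service": "api", "latency_ms": 12.0},
        {"service": "api", "latency_ms": 48.0},
        {"service": "db", "latency_ms": 5.0},
        {"service": "api", "latency_ms": -1.0},  # malformed sample
    ]


def process(samples):
    """Data processing: drop invalid samples, then average per service."""
    by_service = {}
    for sample in samples:
        if sample["latency_ms"] < 0:
            continue
        by_service.setdefault(sample["service"], []).append(sample["latency_ms"])
    return {svc: mean(values) for svc, values in by_service.items()}


def store(database, aggregates):
    """Data storage: a dict stands in for a time-series database."""
    database.update(aggregates)
    return database


def present(database):
    """User experience layer: render one line per service."""
    return [f"{svc}: {avg:.1f} ms" for svc, avg in sorted(database.items())]


report = present(store({}, process(collect())))
```

Keeping the stages as separate functions mirrors the separation in the list above: the collection stage could be swapped for an eBPF-based collector without touching processing, storage, or presentation.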
How Does eBPF Observability Work?
To completely comprehend the underlying mechanics that drive eBPF observability, we must first grasp the idea of hooks. As we previously stated, eBPF programs are generally event-driven, meaning they are activated once a certain event happens. For example, whenever a function call is performed, an eBPF program can be invoked to collect data for observability reasons.
- First, these hooks might reside in either kernel or user space. eBPF may therefore be used to monitor both user-space apps and kernel-level events.
- Second, these hooks may be either predetermined/static or introduced dynamically into a running system.
Combining these two dimensions gives four distinct eBPF hook mechanisms, covering static and dynamic hooks in both kernel and user space:
- Kernel tracepoints – Static hooks that attach to events predefined by kernel developers.
- USDT (User Statically-Defined Tracing) – Static hooks that attach to tracepoints placed in application code by its developers.
- Kprobes (kernel probes) – Dynamic hooks that can attach to almost any point in the kernel code at runtime.
- Uprobes (user probes) – Dynamic hooks that can attach to any portion of a user-space program at runtime.
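The two dimensions (kernel versus user space, static versus dynamic) and the four mechanisms they produce can be written down as a small lookup. This is only a classification aid in plain Python; the enum and function names are hypothetical, and the "prefer static" rule is a rule of thumb, assuming that predefined hooks are a more stable interface than probes placed on arbitrary code.

```python
from enum import Enum


class Space(Enum):
    KERNEL = "kernel"
    USER = "user"


class Kind(Enum):
    STATIC = "static"    # predefined by developers
    DYNAMIC = "dynamic"  # inserted into a running system


# (where the hook lives, how it was defined) -> eBPF mechanism
HOOKS = {
    (Space.KERNEL, Kind.STATIC): "kernel tracepoint",
    (Space.KERNEL, Kind.DYNAMIC): "kprobe",
    (Space.USER, Kind.STATIC): "USDT",
    (Space.USER, Kind.DYNAMIC): "uprobe",
}


def pick_hook(space, kind):
    """Return the mechanism for one cell of the two-by-two grid."""
    return HOOKS[(space, kind)]


def prefer_static(space, static_available):
    """Rule of thumb: use a static hook if one exists, else go dynamic."""
    kind = Kind.STATIC if static_available else Kind.DYNAMIC
    return pick_hook(space, kind)
```

For example, `prefer_static(Space.KERNEL, False)` falls back to a kprobe when no tracepoint covers the code of interest.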
There are various predefined hooks in kernel space to which an eBPF program can be easily attached (for example, system calls, network events, and other kernel tracepoints). Similarly, in user space, many language runtimes, database systems, and software stacks ship with predefined USDT probes that eBPF-based tools, such as the Linux BCC utilities, can attach to.
eBPF vs Current Instrumentation Methods
Aside from eBPF, there are currently two primary methods for instrumenting applications and infrastructure for observability:
- Agent-based instrumentation – Telemetry data is gathered by SDKs, libraries, or standalone agents that are integrated into the application code or installed on infrastructure nodes.
- Sidecar proxy-based instrumentation – Sidecars are lightweight, standalone processes that run alongside an application or service. They are widely used in microservices and container-based architectures, such as those running on Kubernetes.
eBPF Observability Use Cases
eBPF can be leveraged for practically all existing observability use cases while also opening up new possibilities.
System and Infrastructure Monitoring
eBPF enables detailed monitoring of system-level events such as CPU utilization, memory allocation, disk I/O, and network traffic. For example, LinkedIn employs eBPF for all of its infrastructure monitoring.
Container and Kubernetes Monitoring
eBPF gives visibility into the health and resource utilization of individual containers and pods, as well as Kubernetes-specific data.
Application Performance Monitoring (APM)
eBPF provides fine-grained observability of user-space applications, including throughput, error rates, latency, and tracing.
Advanced Observability
eBPF is suitable for advanced observability tasks such as live debugging, low-overhead application profiling, and system call tracing.
Wrapping Up
In summary, by providing a considerably improved instrumentation method, eBPF has the potential to profoundly alter our approach to observability in the coming years. While this article focused on eBPF’s usage in data collection and instrumentation, it might also be employed in data processing or even data storage layers in the future. The possibilities are vast and as yet untapped.