eBPF is (now!) a cross-platform technology with origins in the Linux Kernel that can run sandboxed programs in a privileged context such as the operating system kernel. It is used to safely and efficiently extend the capabilities of the kernel without requiring changes to the kernel source code or drivers written against the native kernel APIs.
Since May 2021, Microsoft has been hard at work on bringing eBPF to Windows. This post is meant to provide a user’s view of the project circa early 2023. We’ll be looking at setting up a Windows-eBPF build environment, followed by creating a sample project to pass around data between a userspace program and an eBPF program running in the kernel.
Installation
To run an eBPF Program, we need a Windows VM with test-signing enabled or a kernel-debugger attached. eBPF drivers cannot be production signed at the current state of the project (Security Hardening is still in progress).
To set up the VM, follow the instructions in the repo here. Inside the VM, install the eBPF runtime by following the Method 1 instructions here. Make sure to check only the Runtime Components for install. We’ll get the development files through a different method.
To get the eBPF development files, we have three options:
- The eBPF-for-Windows nuget package.
- Build the project in the repo and grab the newly built .msi installer from x64/Debug/ebpf-for-windows.msi. Instructions for building the project can be found here.
- The 0.6.0 release .msi from the Releases section here.
At the time of writing, the 0.6.0 .msi has a few bugs regarding the directory structure that are yet to be ironed out. We’ll be using the nuget package for our development.
- Grab NuGet Windows x86 Commandline (version 6.31 or higher), which can be installed to a location such as "C:\Program Files (x86)\NuGet". Be sure to add nuget.exe to your PATH.
- In the directory where you want to download the eBPF files, open a command prompt and run nuget install eBPF-for-Windows -Version 0.6.0. This should create a directory called eBPF-for-Windows.0.6.0 in your working directory.
- After installing the nuget package, as a one-time operation, you will currently need to run the export_program_info.exe tool from the command line to complete the install. This tool can be found in the eBPF-for-Windows.0.6.0\build\native\bin directory.
That’s it. We’ll see what to do with the downloaded files later in this post.
The eBPF Programming Model
eBPF programs are executed by an eBPF runtime driver in the kernel. On Linux systems, this runtime ships with the kernel. On Windows, this runtime (ebpfcore.sys) ships with the .msi installer.
Let’s examine a high level view of how eBPF programs are built and run on Windows.
-
We start with our source code for the eBPF program written in a restricted set of C. This is the program that will run in the kernel.
-
We compile this program with a compiling toolchain that can emit eBPF bytecode. Currently, this can be done with Clang/LLVM.
-
Using an application written by you, or the netsh utility, the bytecode is fed into the PREVAIL Verifier through a userspace API (EbpfApi.lib/ebpfapi.dll) which exposes functions for userspace manipulation of an eBPF program.
-
The verifier checks the program for invalid memory accesses, termination etc. This is why eBPF Programs are written in a restricted subset of C, so that another piece of software can verify it.
If you do manage to write a program which can check whether any piece of code terminates or not, please announce it to the world because you have just solved The Halting Problem. You’re going to be making a lot of money.
-
The valid program can be either Just-In-Time (JIT) compiled with the uBPF JIT Compiler into native code for the kernel, or interpreted with the uBPF Interpreter. (It can also produce native driver code as .sys files). Either the bytecode or native code (JIT Compiled) is loaded into the kernel.
-
But wait! How does it know what part of the kernel to attach itself to? eBPF Programs are designed to be attached to specific subsystems of the kernel. This attachment happens through an eBPF shim which contains the hooks and helpers to interact with the kernel subsystems, e.g., we can hook into the Network Stack to observe packets.
-
Once loaded into the kernel and attached to a subsystem, the eBPF program can be invoked for execution.
Build Environment Setup
-
Install the eBPF development files using the .msi installer in the VM.
-
Clone the repo here. This will set up a convenient directory structure for us to use, as follows. The repo also contains some helpful build scripts which will let us hit the ground running in our development.
Directory Structure
ebpf-starter-project: root directory
|
├───.vscode:vscode config, delete if not needed
|
├───bin:Generated by the build process that holds the userspace exe
|
├───build: holds all the build scripts for the userspace exe
|
├───include: common includes between the userspace and the kernel program
|
├───intermediate: generated by build with the intermediate .obj files of the userspace exe
|
├───kernelsrc: holds the source code and the binary for the eBPF kernel program
|
├───libs: holds our static libraries
|
|──src: source code for our userspace exe
|
|───vendor: source code from external vendors like the eBPF includes
|
|───eBPF: directory of the eBPF includes
-
Next, edit LINE 5 in the makefile in /root/build/ and the build.bat file in /root/kernelsrc/ to point to the location of the include directory of the eBPF development files you downloaded through the nuget package, i.e. eBPF-for-Windows.0.6.0/build/native/include.
(NOTE): By default, both the makefile and build.bat assume the eBPF includes to be in /vendor/eBPF/include. This is because I’m lazy and copied the include folder from the eBPF directory.
-
Likewise, edit LINE 12 of the makefile to point to the location of the lib directory containing EbpfApi.lib from the eBPF install directory, i.e. eBPF-for-Windows.0.6.0/build/native/lib.
(NOTE): Just like before, the makefile defaults to the lazy route of expecting EbpfApi.lib to be in /root/libs/EbpfApi.lib, because I copied it there.
Let’s write an eBPF Program
We’ll be writing an eBPF program to attach itself to the Layer 2 of the Network Stack, intercept the packets before they reach the kernel for processing, parse them ourselves and print out a few fields.
To intercept the packet at the earliest possible time, we need to write an eBPF program which will get attached to the XDP Hook. When an eBPF program is attached to an XDP Hook, the program will be invoked on every incoming packet.
Writing and Compiling a simple eBPF program
We’ll start by writing the simplest possible eBPF program that emits some observable behavior for us.
Create a .c file in the kernelsrc directory. We’ll call ours myxdp.c and write the following.
// myxdp.c
#include "bpf_helpers.h"
#include "stdint.h"
SEC("xdp")
int32_t our_program(){
return 2;
}
Let’s examine the above code in detail.
-
bpf_helpers.h contains struct definitions and macros that we’ll need for our eBPF programs.
-
SEC("xdp") is a macro to create a section called xdp in our object file and place the our_program function code inside of the xdp section. The section name identifies where to hook the eBPF program to.
-
From the ebpf-for-windows docs:
Hook points are callouts exposed by the system to which eBPF programs can attach. By convention, the section name of the eBPF program in an ELF file is commonly used to designate which hook point the eBPF program is designed for. Specifically, a set of prefix strings are typically used to match against the section name. For example, any section name starting with "xdp" is meant as an XDP layer program.
-
Don’t worry, we’ll get to the return 2 in a minute.
Compile the above file by opening a command prompt in the kernelsrc directory and running build.bat. This batch file will run the clang compiler to emit bytecode in the form of myxdp.o using the following command:
clang -I ..\src\vendor\eBPF\include -target bpf -Werror -O2 -c -g myxdp.c -o myxdp.o
- In this command, we specify the eBPF include directory using the -I flag.
- We specify our -target as bpf bytecode.
- -Werror specifies that warnings are errors, -O2 is for compiling an optimized build, -c is compile-only, and -g keeps the symbol information.
💡 We can look at the disassembly of the eBPF program by running llvm-objdump -S myxdp.o, where the -S flag shows the corresponding source code for the assembly.
// myxdp.c
#include "bpf_helpers.h"
#include "stdint.h"
SEC("xdp")
int32_t our_program(){
return 2;
}
BPF assembly conventions specify that a function places its return value in the R0 register before returning. We can see that we’re doing exactly that with our return value of 2.
For a detailed overview of the BPF instruction set, see here and here.
Time to run our eBPF program? Not so fast! Let’s feed our program to the PREVAIL Verifier and check its correctness. Copy myxdp.o over to your VM. Open an administrative command prompt in the directory where you pasted the program.
Run the verifier using netsh ebpf show verification myxdp.o:
$ netsh ebpf show verification myxdp.o
Verification succeeded
Program terminates within 6 instructions
Our program is correct! Rejoice! We’re capable of writing a return statement. Very cool.
Can we please finally see something happen? Yes, yes we can. Load our program into the kernel with netsh ebpf add program myxdp.o. If you’ve followed all the steps correctly up to this point, you should see the program loaded message with a random ID:
$ netsh ebpf add program myxdp.o
Loaded with ID 65543 // This ID will be different for you
If you get an error that says could not attach program, either you’ve messed up, or I’ve messed up, or the Microsoft/PREVAIL teams have made breaking changes or introduced a bug since the time of writing. The last one is quite possible, with the state of the project still being pre-release.
Wait, why are you coming towards me? You strut over passionately and say.
You: You there 👉! You said we’d see observable behavior, where’s my observable behavior?
I reply, in my youthful voice.
You expectantly open a browser on your VM and try to open your favorite website https://www.subcom.tech.
You: Nothing’s happening. Wait what? Nothing’s happening???
Author: YES! NOTHING IS HAPPENING! WE ARE DROPPING ALL INCOMING PACKETS! NO MORE EMAILS FROM YOUR MANAGER AT WORK! REJECT MODERNITY! RETURN TO CAVE
You: ಠ_ಠ. Okay, give me my internet back.
Author: Ugh, fine.
To return to the 21st century, unload the eBPF program from the kernel with netsh ebpf delete program 65543. Make sure you replace the ID with the ID of your program.
$ netsh ebpf delete program 65543
Unpinned 65543 from our_program
💡 If you need to check the ID of your program, you can view the list of active programs by using netsh ebpf show programs.
$ netsh ebpf show programs
ID Pins Links Mode Type Name
====== ==== ===== ========= ============= ==============
65543 1 1 JIT xdp our_program
Alternatively, you can use bpftool prog show (or bpftool prog list), same as on Linux.
Writing a more complex eBPF Program
Now that we know how to run an eBPF program in the kernel, let’s try writing a slightly more complex program to parse packet data and print out the type specified in the packet’s EtherType field.
- Back to myxdp.c then. Introduce a new parameter to our eBPF program and change the return value to XDP_DROP. At the moment, this program will behave identically to the previous iteration.
#include "bpf_helpers.h"
#include "stdint.h"
SEC("xdp") //++⌄⌄⌄⌄⌄⌄⌄⌄++
int32_t packet_parse(xdp_md_t* ctx){
return XDP_DROP; //modified
}
- bpf_helpers.h internally includes ebpf_nethooks.h, which provides definitions for xdp constructs like the xdp_md_t struct and the xdp_action_t enum.
- xdp_md_t* ctx is a pointer to a struct which holds pointers to the current packet data.
- XDP_DROP is an enum value of type xdp_action_t. Can you guess what its integer value is supposed to be? Hint: it rhymes with Gentoo.
- XDP programs return a value at the end of their execution to signal what should be done with the packet. We were returning XDP_DROP all along, which drops the packet.
- Other possible actions can be found in ebpf_nethooks.h in the enum xdp_action_t.
💡 ebpf-for-windows currently only supports a subset of the XDP_ACTIONS available on Linux. These are:
//ebpf_nethooks.h
//This file contains APIs for hooks and helpers that are
//exposed by netebpfext.sys for use by eBPF programs.
typedef enum xdp_action{
XDP_PASS = 1, //Allow the packet to pass.
XDP_DROP, //Drop the packet.
XDP_TX //Bounce the received packet back out the same NIC it arrived on.
} xdp_action_t;
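To make the connection between the bare return 2 from our first program and XDP_DROP explicit, here is a minimal sketch in plain C. It copies the enum locally so that it builds without the eBPF headers; first_program_action is a hypothetical name used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the enum shown above, so this sketch builds on its own.
 * In a real eBPF program these names come from ebpf_nethooks.h. */
typedef enum xdp_action {
    XDP_PASS = 1, /* Allow the packet to pass. */
    XDP_DROP,     /* Implicitly 2: drop the packet. */
    XDP_TX        /* Implicitly 3: bounce the packet back out. */
} xdp_action_t;

/* Enumerators without an explicit value continue from the previous one. */
static_assert(XDP_DROP == 2, "XDP_DROP follows XDP_PASS = 1");
static_assert(XDP_TX == 3, "XDP_TX follows XDP_DROP");

/* Hypothetical stand-in for our first program, which did a bare `return 2;`. */
int32_t first_program_action(void)
{
    return 2;
}
```

Since the enumerators count up from XDP_PASS = 1, the literal 2 and XDP_DROP are the same value, which is why the first program silently dropped every packet.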
- To access and therefore parse the packet data, we need to use the ctx pointer. Let’s look at the fields of the struct it points to.
// ebpf_nethooks.h
// XDP hook.
typedef struct xdp_md{
void *data; // Pointer to start of packet data.
void *data_end; // Pointer to end of packet data.
uint64_t data_meta; // Packet metadata.
uint32_t ingress_ifindex; // Ingress interface index.
} xdp_md_t;
- The fields we’re interested in are the data and data_end pointers. For convenience, let’s create byte pointers from them.
SEC("xdp")
int32_t packet_parse(xdp_md_t *ctx) {
uint8_t* data = ctx->data; //+++
uint8_t* data_end = ctx->data_end; //+++
return XDP_DROP;
}
- The Layer-2 (The layer we’re working with) Ethernet IEEE 802.3 Frame Format is as follows:
To parse the packet, we can impose a type on the data pointer by creating a struct whose fields have the appropriate number of bytes. Fortunately for us, we already have an Ethernet header struct in net/if_ether.h.
// net/if_ether.h
typedef struct _ETHERNET_HEADER{
uint8_t Destination[6];
uint8_t Source[6];
union{
uint16_t Type; // Ethernet
uint16_t Length; // IEEE 802
};
} ETHERNET_HEADER;
To parse the header, we only need to parse the Destination MAC, Source MAC, and the EtherType field.
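As a quick sanity check of that layout, the following plain C sketch mirrors the struct locally (so it builds without the eBPF headers) and lets the compiler confirm where each field sits. The helper ethertype_end_offset is a hypothetical name introduced here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of ETHERNET_HEADER from net/if_ether.h, as shown above. */
typedef struct _ETHERNET_HEADER {
    uint8_t Destination[6];
    uint8_t Source[6];
    union {
        uint16_t Type;   /* Ethernet */
        uint16_t Length; /* IEEE 802 */
    };
} ETHERNET_HEADER;

/* Compile-time checks: the fields land at byte offsets 0, 6 and 12. */
static_assert(offsetof(ETHERNET_HEADER, Destination) == 0, "Destination at 0");
static_assert(offsetof(ETHERNET_HEADER, Source) == 6, "Source at 6");
static_assert(offsetof(ETHERNET_HEADER, Type) == 12, "Type at 12");
static_assert(sizeof(ETHERNET_HEADER) == 14, "header is 14 bytes");

/* Hypothetical helper: how many bytes of the packet must exist before
 * reading the Type field is safe. */
size_t ethertype_end_offset(void)
{
    return offsetof(ETHERNET_HEADER, Type) + sizeof(uint16_t);
}
```

So reading Type touches bytes 12 and 13 of the frame, which means at least 14 bytes of packet data must be present.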
- We can create a pointer of type ETHERNET_HEADER and cast the start of the packet to it, to access the fields.
- To print out the Type field of our struct, we can use bpf_printk.
- We also need to correct for network byte order (big-endian) for multi-byte fields, if we’re on a little-endian machine. We can use the bpf_ntohs function in bpf_endian.h for a 2-byte field.
//omitted other includes
#include "bpf_endian.h"
SEC("xdp")
int32_t packet_parse(xdp_md_t *ctx) {
uint8_t* data = ctx->data;
uint8_t* data_end = ctx->data_end;
ETHERNET_HEADER *eth_hdr =(ETHERNET_HEADER*)data; //+++
bpf_printk("%x", bpf_ntohs(eth_hdr->Type)); //+++ %x to print the number in hex
return XDP_DROP;
}
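To see what the byte-order correction does, here is a userspace sketch in plain C. The functions swap16, my_ntohs and read_ethertype are hypothetical stand-ins written for this illustration; they are not the implementation behind bpf_ntohs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Swap the two bytes of a 16-bit value. */
uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Returns 1 when the least significant byte is stored first. */
int host_is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const uint8_t *)&probe == 1;
}

/* Hypothetical stand-in for bpf_ntohs: network (big-endian) to host order. */
uint16_t my_ntohs(uint16_t net)
{
    return host_is_little_endian() ? swap16(net) : net;
}

/* Reads the two EtherType bytes exactly as they sit in the frame and
 * converts them to a host-order number. */
uint16_t read_ethertype(const uint8_t *type_field)
{
    uint16_t raw;
    assert(type_field != NULL);
    memcpy(&raw, type_field, sizeof raw);
    return my_ntohs(raw);
}
```

For example, an ARP frame carries the bytes 0x08 0x06 in its Type field. A little-endian machine loads them as 0x0608, and the swap turns that into the expected 0x0806.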
- Build the program and copy the object file over to the VM. We can skip feeding our program to the verifier manually, since netsh automatically does so when we use add program. So use the add program command.
$ netsh ebpf add program myxdp.o
error 0: could not load program
5: Upper bound must be at most packet_size (valid_access(r1.offset+12, width=2)
_(⊙◎)?__
Oh well, looks like manually feeding it to the verifier it is. We need a more detailed error log.
$ netsh ebpf show verification myxdp.o
Verification failed
Verification report:
; C:projectssubcomdevebpf-starter-projectkernelsrc/myxdp.c:14
; bpf_printk("%x", eth_hdr->Type);
5: Upper bound must be at most packet_size (valid_access(r1.offset+12, width=2)
1 errors
You: Okay, so it looks like our bpf_printk function is accessing something out of bounds?
Author: I guess so. ¯\_(ツ)_/¯
You: But eth_hdr->Type is only 12 bytes into the packet, and we know that the Ethernet frame format specifies that there should be 2 bytes of data specifying the EtherType at that position. So, our program is correct, isn’t it?
Author: That’s what the specification says yes.
You: Then why doesn’t it work?
Author: Because that’s just a spec, not a physical law.
Packets can arrive in any sort of broken form, headers can be incomplete or corrupted due to problems in the transmission medium, the NIC, hax0rs etc. A valid eBPF program CANNOT crash the kernel. That’s sort of the primary reason we’re not writing kernel modules or drivers natively. The verifier can’t ensure that you’re not accessing beyond the bounds of the packet, and hence rejects the program.
To fix this, we just need to do a bounds check before we try the memory access.
// omitted above lines
int32_t packet_parse(xdp_md_t *ctx) {
uint8_t* data = ctx->data;
uint8_t* data_end = ctx->data_end;
ETHERNET_HEADER *eth_hdr =(ETHERNET_HEADER*)data;
//* Bounds check on the ethernet header
if ((uint8_t*)(eth_hdr + 1) > data_end) goto done; // <+++++++++++
// NOTE: the above check is the same as
// (uint8_t*)eth_hdr + sizeof(ETHERNET_HEADER) > data_end
bpf_printk("%x", bpf_ntohs(eth_hdr->Type));
done: // <++++++++++
return XDP_DROP;
}
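The same bounds-check idea can be tried out in ordinary userspace C, without the verifier or the eBPF headers. The sketch below uses a hypothetical parse_ethertype function and two made-up constants; it is an illustration of the check, not the eBPF program itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HEADER_LEN 14   /* 6 + 6 + 2 bytes */
#define ETHERTYPE_OFFSET 12 /* Type follows the two MAC addresses */

/* Hypothetical userspace analogue of the check above.
 * Returns 0 and stores the EtherType in host order on success,
 * or -1 when the frame is too short to contain a full header. */
int parse_ethertype(const uint8_t *data, const uint8_t *data_end,
                    uint16_t *ethertype)
{
    assert(data != NULL && data_end != NULL && ethertype != NULL);

    /* Compare lengths instead of forming data + 14, which would be an
     * invalid pointer in plain C if the buffer is shorter than that. */
    if (data_end < data || (size_t)(data_end - data) < ETH_HEADER_LEN)
        return -1;

    /* Assemble the big-endian field byte by byte. */
    *ethertype = (uint16_t)((data[ETHERTYPE_OFFSET] << 8) |
                            data[ETHERTYPE_OFFSET + 1]);
    return 0;
}
```

A full 14-byte frame yields its EtherType, while a frame cut off at 13 bytes is rejected before any byte of the Type field is read. That early exit is the property the verifier wants to see proven.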
Build, copy and verify the bytecode.
$ netsh ebpf show verification myxdp.o
Verification succeeded
Program terminates within 52 instructions
Success! Now load it into the kernel with netsh ebpf add program myxdp.o (or bpftool prog load myxdp.o).
You: So the program is now running right?
Author: Right.
You: Then why is there no output printed? Aren’t we printing the EtherType for every packet?
Author: We are, just not in stdout.
💡 The PREVAIL Verifier is still under development. In verifying more complex programs, it can and will fail correct programs. The Linux verifier has a big head start in this regard, but will still sometimes reject correct programs. For catharsis, see every Boring Problem Found in eBPF (tmpout.sh).
My favorite excerpt from the above link about poor eBPF documentation.
And while you're writing them, the helpers available to you can vary wildly. And the documentation is incomplete, scattered, and often out of date, because...
- Because of the overlap, documentation of the pure BPF interface(s) (there's a plethora, we'll cover that) is lacking. The people that maintain it write the userspace tooling, so they don't need in-depth documentation. Seriously, go check out the BPF man page for whatever distro you're on. Chances are it's missing a ton of helpers and there's more than one "TODO: fill this out" that's been sitting there for years. Why not use their userspace tooling? Well...
- Their userspace tooling is a magic labyrinth. In order to get close to CO-RE in a backwards compatible way, it's filled with kludges you probably don't need. Ideally, you'd interface directly with the underlying syscalls and use only what you need. But doing that is undocumented. And, because of the documentation issues, there's really no community drive to simplify these libraries. Because these libraries cover the majority of historical use cases, there's no drive to improve the documentation. Even if you did, you'd have to backport and patch your documentation to cover all the little idiosyncrasies across kernel versions, and boy are there a lot of those.
In your eBPF programming journey, you’ll come to realize that writing verifier-approved programs can sometimes be more of an art than a science. A program that verifies on Linux might not verify on PREVAIL, or vice versa.
It is what it is ¯\_(ツ)_/¯
-
eBPF on Windows uses ETW (Event Tracing for Windows) for logging traces. To view traces in real time, the tracelog.exe and tracefmt.exe commands from the WDK can be used. Since we’re running eBPF on Windows in a VM, you can either install the full WDK in the VM (see the Prerequisites section) or just copy the two executables into the VM in the same directory as myxdp.o.
-
The executables can be found in your WDK install directory. For me, it was: C:\Program Files (x86)\Windows Kits\10\bin\10.0.22621.0\x64.
To view the event logs:
-
Create a trace session with some name such as MyTrace:
tracelog -start MyTrace -guid "%ProgramFiles%\[eBPF for Windows install folder]\ebpf-printk.guid" -rt
Replace the path before ebpf-printk.guid with your eBPF install folder. This command will only display the bpf_printk traces.
-
View the session in real time on stdout. This will print logs until you hit Ctrl+C:
tracefmt -rt MyTrace -displayonly -jsonMeta 0
-
Close the trace session:
tracelog -stop MyTrace
You’ll see something like:
$ tracefmt -rt MyTrace -displayonly -jsonMeta 0
Setting RealTime mode for MyTrace
Examining C:\Users\User\Desktop\New folder\default.tmf for message formats, none found, file not found
Searching for TMF files on path: C:\Users\User\Desktop\New folder
[0]0000.0000::02/21/2023-22:44:08.219 [EbpfForWindowsProvider]{"Message":"806"}
[0]0000.0000::02/21/2023-22:44:08.789 [EbpfForWindowsProvider]{"Message":"806"}
[0]0000.0000::02/21/2023-22:44:09.798 [EbpfForWindowsProvider]{"Message":"806"}
[0]0000.0000::02/21/2023-22:44:11.287 [EbpfForWindowsProvider]{"Message":"806"}
[0]0000.0000::02/21/2023-22:44:11.799 [EbpfForWindowsProvider]{"Message":"806"}
[0]0000.0000::02/21/2023-22:44:12.788 [EbpfForWindowsProvider]{"Message":"806"}
...Ctrl+C
Your “Message” will vary depending on the packets you receive. For example, my tracing printed 806, and 0806 is the EtherType in hex for ARP packets.
Therefore, at the time of tracing, I was receiving ARP packets.
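If you want to decode the traced values without looking them up each time, a small userspace helper can do it. The sketch below is an illustration with hypothetical names (parse_trace_message, ethertype_name); it only knows three well-known EtherTypes and assumes the "Message" text is the bare hex number that our %x format prints.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parses the hex text from a trace "Message" field, e.g. "806".
 * Returns 0 when the text is empty, not hex, or wider than 16 bits. */
uint16_t parse_trace_message(const char *hex)
{
    char *end = NULL;
    unsigned long value;

    assert(hex != NULL);
    if (strlen(hex) == 0)
        return 0;
    value = strtoul(hex, &end, 16);
    if (*end != '\0' || value > 0xFFFFul)
        return 0;
    return (uint16_t)value;
}

/* Maps a few well-known EtherType values to readable names. */
const char *ethertype_name(uint16_t ethertype)
{
    switch (ethertype) {
    case 0x0800: return "IPv4";
    case 0x0806: return "ARP";
    case 0x86DD: return "IPv6";
    default:     return "other";
    }
}
```

Note that %x drops leading zeros, so the trace shows 806 rather than 0806. Parsing the text as a number sidesteps that difference.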
Working with Userspace
In this section, we’ll see how to manage the lifetime of an eBPF program through userspace and pass around data using eBPF maps. This post and the starter-project use MSVC 2022 from the command line in the form of cl.exe. As such, you’ll need an installation of Visual Studio 2022 or the MS Build tools. The code should compile without problems on other compilers; however, it has not been tested with anything other than MSVC.
Loading the eBPF Program through a userspace application
Let’s see how we can load an eBPF program into the kernel, through our own code.
-
We’ll start by creating a cpp or c file in the src directory. I’ll call mine main.cpp.
// main.cpp
#include <stdio.h>
int main(){
    printf("Hello");
}
-
To build the userspace application, we’ll use the build scripts in the build directory.
ebpf-starter-project:
├───bin
|   ├───main.exe: Userspace application
|
├───build
|   ├───build.bat: Build the application
|   |
|   ├───build_all.bat: Do a clean build of the application
|   |
|   ├───clean.bat: Clean any previous builds
|   |
|   ├───makefile: Makefile with build steps
|   |
|   ├───run.bat: Will run main.exe from the bin directory (of limited use since most of the code will only work on the VM)
-
Open a Visual Studio Developer Command Prompt in the build directory and run build_all.bat or build.bat to build main.exe. You can use run.bat to run main.exe easily.
> build_all
Microsoft (R) Program Maintenance Utility Version 14.34.31937.0
Copyright (C) Microsoft Corporation. All rights reserved.
Compiling ..\src
-----------------------------------------------
cl /D_CRT_SECURE_NO_WARNINGS /nologo /MT /Zi /FC /c /EHsc /W4 /Od /wd4505 /wd4200 /wd4201 /wd4100 /wd4189 /wd4312 /std:c++20 /Fd..\bin\ /I..\src\vendor\eBPF\include /I..\include /Fo..\intermediate\ ..\src\*.cpp
main.cpp
Linking main.exe
-----------------------------------------------
link /DEBUG:FULL /LIBPATH:..\libs /out:..\bin\main.exe ..\intermediate\*.obj EbpfApi.lib
Microsoft (R) Incremental Linker Version 14.34.31937.0
Copyright (C) Microsoft Corporation. All rights reserved.
> run
Hello
-
We’ll create a global struct ProgramData to store some important pointers to different BPF constructs. We’ll put all the loading code in the ebpf_load_program function.
This function will take the path of the eBPF program object file we built as a parameter. In our case, filepath = "myxdp.o", as we’ll be putting the object file in the same directory as main.exe. The prog_name parameter is the name of the function in our eBPF program, i.e. packet_parse.
A fatal helper function is also created for easy reporting of errors and exiting the program.
// main.cpp
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
// These headers contain some bpf definitions and helpers
#include <bpf/bpf.h>
#include <bpf/libbpf.h>
struct ProgramData {
    bpf_object* object;
    bpf_program* program;
    bpf_link* link;
};
// optionally static to limit all the bpf-specific logic to main.cpp
static ProgramData prog_data;
//* Helper for error reporting
void fatal(const char* message){
    printf("Fatal error: %s: %d\n", message, errno);
    exit(1);
}
ProgramData ebpf_load_program(const char* filepath, const char* prog_name){
    //* Loading logic goes here
}
-
The lifetime of an eBPF program goes through the following phases:
(a) Open phase: The object file is parsed. Definitions of global variables, maps, etc. are discovered, but the variables, maps are not yet created.
We can use the bpf_object__open function provided by libbpf.h to open the file.
// main.cpp
ProgramData ebpf_load_program(const char * filepath, const char * prog_name){
bpf_object* object = bpf_object__open(filepath);
if(!object) fatal("Could not open the object file, check the filepath");
}
(b). Load phase: eBPF maps are created, various parameters and relocations are resolved. The eBPF program is verified and loaded into the kernel. At this point, the program exists in the kernel memory but is yet to be executed. Initial state like map entries can now be set up for the eBPF program to use during its execution.
We can use bpf_object__load to verify and load the program into the kernel memory.
// main.cpp
ProgramData ebpf_load_program(const char* filepath, const char* prog_name){
bpf_object* object = bpf_object__open(filepath);
if (!object) fatal("Could not open the object file, check the filepath");
if (bpf_object__load(object) < 0) //+++++
fatal("Loading the object into the kernel failed"); //++++++
}
(c). Attach phase: eBPF programs get attached to the appropriate hook point. After this phase, BPF programs start executing and doing useful work.
We first find the packet_parse function as the entry point to our eBPF program using bpf_object__find_program_by_name. We then attach the program to a hook using bpf_program__attach. The correct hook is inferred by the program attach routine through our section name xdp.
At the end, we return the pointers to the object, program, and link in a ProgramData struct.
// main.cpp
ProgramData ebpf_load_program(const char* filepath, const char* prog_name){
bpf_object* object = bpf_object__open(filepath);
if (!object) fatal("Could not open the object file, check the filepath");
if (bpf_object__load(object) < 0)
fatal("Loading the object into the kernel failed");
bpf_program* program = bpf_object__find_program_by_name(object, prog_name); //++++
if (!program) fatal("Could not find program by name"); //++++
bpf_link* link = bpf_program__attach(program); //++++
if (!link) fatal("Could not attach program"); //++++
return { object, program, link };
}
💡 There is a fine distinction between attaching and linking an eBPF program. Technically we just linked it. The distinction is not that important in practice, as linking is an abstraction over attaching. For further reading see here.
(d) Teardown phase: After we are done with our useful computation from the eBPF program, we can detach it from its hook point and free any associated resources.
We’ll create a separate function unload_ebpf_program to handle the teardown, along with some helpers to call it and report errors. unload_ebpf_program will be called whenever we want to exit our application.
// main.cpp
void unload_ebpf_program(){
    printf("Destroying eBPF Link\n");
    // Call outside of assert() so the link is still destroyed when NDEBUG is defined
    int result = bpf_link__destroy(prog_data.link);
    assert(result == 0);
    (void)result;
    printf("Unloading eBPF program\n");
    bpf_program__unload(prog_data.program);
    printf("Closing eBPF object\n");
    bpf_object__close(prog_data.object);
}
void exit_program(){
    unload_ebpf_program();
    printf("Exiting process\n");
    exit(1);
}
void fatal_with_cleanup(const char* message){
    printf("Fatal error: %s: %d\n", message, errno);
    exit_program();
}
- Let’s bring it all together and do some tertiary work. Create a control handler to handle Ctrl+C and window close events. MAKE SURE THAT windows.h is at the TOP of the file as the first include, otherwise it’s going to break other include files. The exit logic is unrelated to eBPF and a bit beyond the scope of this post; please consult your nearest C++ masochist for clarification.
//main.cpp
#define WIN32_LEAN_AND_MEAN //Reduces stuff imported from windows.h
#define VC_EXTRALEAN //Reduces stuff imported from windows.h
#include <windows.h>
// omitted includes
#include <mutex>
#include <condition_variable>
std::mutex _wait_for_shutdown_mutex;
std::condition_variable _wait_for_shutdown;
bool _shutdown = false;
BOOL WINAPI CtrlHandler(DWORD fdwCtrlType){
switch (fdwCtrlType) {
// Handle the CTRL-C signal.
case CTRL_C_EVENT:
case CTRL_CLOSE_EVENT:
case CTRL_BREAK_EVENT:
case CTRL_LOGOFF_EVENT:
case CTRL_SHUTDOWN_EVENT: {
std::unique_lock lock(_wait_for_shutdown_mutex);
_shutdown = true;
_wait_for_shutdown.notify_all();
return TRUE;
}
default:
return FALSE;
}
}
- Final bit of code in this section, I promise. Time to fill out our main function:
  - At the start, we’ll register our control handler function.
  - Load the eBPF program with ebpf_load_program and store the ProgramData in the prog_data global.
  - Wait for a Ctrl+C event to happen.
  - Clean up resources with unload_ebpf_program.
//main.cpp
int main(int argc, char** argv){
    //* Register control handler
    if (SetConsoleCtrlHandler(CtrlHandler, TRUE)) {
        printf("\nThe Control Handler is installed.\n");
    }
    else {
        printf("\nERROR: Could not set control handler\n");
        return 1;
    }
    //* Load the eBPF program and store ProgramData struct in prog_data global.
    prog_data = ebpf_load_program("myxdp.o", "packet_parse");
    printf("eBPF program loaded\n");
    //* Wait for Ctrl+C
    {
        std::unique_lock lock(_wait_for_shutdown_mutex);
        _wait_for_shutdown.wait(lock, []() { return _shutdown; });
    }
    //* Resource cleanup
    unload_ebpf_program();
    printf("Exiting process\n");
}
That’s it! Build the application and copy main.exe over to the VM, into the same directory as myxdp.o. Then open an administrative command prompt in that directory and run main.
If all goes well, you should see:
$ main
The Control Handler is installed.
eBPF program loaded
Now open another command prompt in the directory with tracelog and tracefmt. Use the tracing commands to start tracing and you should see the same output as when we were using netsh for program loading.
To stop the eBPF program, simply press Ctrl+C in the terminal where main.exe is running. Our code should successfully unload the program and we should see the following.
$ main
The Control Handler is installed.
eBPF program loaded
Destroying eBPF Link
Unloading eBPF program //<+++++
Closing eBPF object //<+++++
Exiting process //<+++++
Now if you run netsh ebpf show programs, you should see no active programs listed. Hence, the lifetime of our eBPF program was controlled by our userspace application.
User-Kernel Communication with maps
Next up, we’re going to examine how we can pass data between userspace and kernelspace programs using eBPF maps. We will send the five-tuple information of the IPv4 and UDP packets we receive over to userspace. We could create a more generic parser which supports more kinds of packets, but the scope of that is needlessly large for a demonstration of maps.
But what are maps and why do we need them?
eBPF maps are data structures that provide generic storage for sharing data between kernel and userspace. For most map types, the actual map memory (i.e., the physical location of the backing store) is addressable only by the kernel.
User programs can use the Map API to perform CRUD operations on maps. Specific bits of map data (e.g. key/value entries) are copied over to the other memory region when communication is required.
In contrast, a RingBuffer map is implemented using a shared region of main memory. The kernel has read/write access, whereas the user application can only read it. The kernel notifies the user application via async I/O that there is new data to be consumed.
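To make the producer/consumer idea concrete, here is a tiny single-threaded model of a byte ring buffer in plain C++. It is only a sketch: the ByteRing class, its 4-byte length prefix and its copy-out consumer are assumptions made for illustration, and they do not describe how ebpf-for-windows actually lays out or shares the memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical model: records are stored as [uint32_t length][payload bytes].
class ByteRing {
public:
    explicit ByteRing(std::size_t capacity) : storage_(capacity) {}

    // Producer side: copy one record in, or return false when it does not fit.
    bool output(const void* data, uint32_t len) {
        const std::size_t needed = sizeof(uint32_t) + len;
        const std::size_t used = produced_ - consumed_;
        if (needed > storage_.size() - used) return false;
        write_bytes(&len, sizeof(len));
        write_bytes(data, len);
        return true;
    }

    // Consumer side: hand every pending record to on_record; returns the count.
    std::size_t consume(const std::function<void(const uint8_t*, uint32_t)>& on_record) {
        std::size_t count = 0;
        while (consumed_ < produced_) {
            uint32_t len = 0;
            read_bytes(&len, sizeof(len));
            std::vector<uint8_t> record(len);
            read_bytes(record.data(), len);
            on_record(record.data(), len);
            ++count;
        }
        return count;
    }

private:
    void write_bytes(const void* src, std::size_t n) {
        const uint8_t* p = static_cast<const uint8_t*>(src);
        for (std::size_t i = 0; i < n; ++i)
            storage_[(produced_++) % storage_.size()] = p[i];
    }
    void read_bytes(void* dst, std::size_t n) {
        uint8_t* p = static_cast<uint8_t*>(dst);
        for (std::size_t i = 0; i < n; ++i)
            p[i] = storage_[(consumed_++) % storage_.size()];
    }

    std::vector<uint8_t> storage_;
    std::size_t produced_ = 0;  // total bytes ever written
    std::size_t consumed_ = 0;  // total bytes ever read
};
```

In this model a 16-byte ring holds two 4-byte records (each costs 8 bytes with its prefix), rejects a third, and accepts new records again once the consumer has drained it.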
Maps get created by the loader of the eBPF program during the loading phase, i.e. by bpf_object__load. We will see further in the post how we interact with a RingBuffer map using the libbpf API and a map descriptor. Map descriptors are held by the userspace to identify which map in memory to issue operations to.
A reference counter is maintained by the kernel whenever a process holds a map descriptor. When all the holders of the map descriptor exit, the map also gets cleaned up by the kernel. The same applies to the eBPF programs themselves. For more information on the lifetime of BPF objects based on reference counts, see Lifetime of BPF objects.
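If reference counting feels abstract, the same lifetime rule can be mimicked in userspace with std::shared_ptr. This is an analogy only: FakeMap, FakeDescriptor and map_alive_after are hypothetical names invented for this sketch, and the real counting happens inside the kernel.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a kernel-side map object (not a real eBPF type).
struct FakeMap {
    explicit FakeMap(bool* freed_flag) : freed(freed_flag) {}
    ~FakeMap() { *freed = true; }  // plays the role of the kernel's cleanup
    bool* freed;
};

// Each shared_ptr copy plays the role of one map descriptor held by a process.
using FakeDescriptor = std::shared_ptr<FakeMap>;

// Opens `holders` descriptors to one map, closes `released` of them again,
// and reports whether the map is still alive at that point.
bool map_alive_after(int holders, int released) {
    assert(holders > 0 && released >= 0 && released <= holders);
    bool freed = false;
    std::vector<FakeDescriptor> open;
    open.push_back(std::make_shared<FakeMap>(&freed));
    for (int i = 1; i < holders; ++i) {
        FakeDescriptor copy = open.front();  // another holder of the same map
        open.push_back(std::move(copy));
    }
    for (int i = 0; i < released; ++i) {
        open.pop_back();  // one holder exits
    }
    return !freed;
}
```

With three holders the map survives two of them exiting, and is cleaned up only when the third one goes away.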
Since maps reside in main memory, they are transient storage. Maps can be pinned to a virtual filesystem path, but there is no file I/O involved with your on-disk filesystem. Therefore, maps do not persist between reboots. For more details, see the eBPF virtual filesystem section in eBPF Updates #3: Atomics Operations, Socket Options Retrieval, Syscall Tracing Benchmarks, eBPF in the Supply Chain.
Creating a common struct
We need a common type to be passed around between the userspace and kernelspace programs.
Create a common header in root/include/ called packets.h. We’ll create a FiveTuple struct here. Then include this file in both myxdp.c and main.cpp.
#pragma once
#include "stdint.h"
#include "net/if_ether.h"
#include "net/ip.h"
#include "net/udp.h"
//I also moved some packet header includes here,
//since both the user and kernel programs will need it.
typedef struct FiveTuple {
uint32_t src_address;
uint32_t dst_address;
uint16_t src_port;
uint16_t dst_port;
uint8_t proto;
}FiveTuple;
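Since both sides have to agree on the exact layout of FiveTuple, it is worth knowing how big the struct really is. The sketch below repeats the struct and checks its layout; the helper names are made up for this example, and the numbers assume a typical ABI where uint32_t is 4-byte aligned (which holds for x64 builds as far as I know).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef struct FiveTuple {
    uint32_t src_address;
    uint32_t dst_address;
    uint16_t src_port;
    uint16_t dst_port;
    uint8_t proto;
} FiveTuple;

// Bytes that actually carry data: 4 + 4 + 2 + 2 + 1.
constexpr std::size_t kFiveTuplePayloadBytes = 13;

// Trailing padding the compiler adds so that arrays of FiveTuple keep the
// uint32_t members 4-byte aligned.
constexpr std::size_t five_tuple_padding() {
    return sizeof(FiveTuple) - kFiveTuplePayloadBytes;
}

// Compile-time guard: the build fails if proto is not where we expect it.
static_assert(offsetof(FiveTuple, proto) == 12,
              "proto should directly follow the two ports");
```

Under those assumptions sizeof(FiveTuple) is 16, not 13, so each record we send over carries 3 bytes of padding.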
Modifying the eBPF program to use a RingBuffer map
First, we need to create a map definition. The map definition is a struct named ebpf_map_definition_in_file_t, which holds the necessary parameters for our map to be constructed at runtime.
Map definitions are to be placed inside the maps section of our object file. So we’ll use SEC("maps") to create a maps section in myxdp.c. We’ll name the map xdp_map.
// myxdp.c
SEC("maps")
ebpf_map_definition_in_file_t xdp_map = {
.type = BPF_MAP_TYPE_RINGBUF,
.max_entries = sizeof(FiveTuple) * 1024,
};
For a ringbuffer, the only fields we need to specify are type and max_entries. max_entries takes the size, in bytes, of the ringbuffer to be allocated. So sizeof(FiveTuple) * 1024 will allocate enough memory for 1024 FiveTuple(s).
💡 The currently supported maps in ebpf-for-windows can be seen in ebpf_structs.h inside the eBPF include directory.
//ebpf_structs.h
typedef enum bpf_map_type{
BPF_MAP_TYPE_UNSPEC = 0, ///< Unspecified map type.
BPF_MAP_TYPE_HASH = 1, ///< Hash table.
BPF_MAP_TYPE_ARRAY = 2, ///< Array, where the map key is the array index
//omitted
BPF_MAP_TYPE_QUEUE = 10, ///< Queue.
//omitted
BPF_MAP_TYPE_RINGBUF = 13 ///< Ring buffer.
} ebpf_map_type_t;
Different maps need different fields to be filled out in ebpf_map_definition_in_file_t. For example, a queue map:
SEC("maps")
ebpf_map_definition_in_file_t xdp_map = {
.type = BPF_MAP_TYPE_QUEUE,
.key_size = 0,
.value_size = sizeof(FiveTuple),
.max_entries = 1024,
};
Queue maps have no keys, since access is FIFO (first-in-first-out). However, they do have a value, the byte size of which is specified in the value_size field. Here the max_entries field refers to the number of elements and not the size in bytes of the underlying array. The runtime will automatically calculate the size needed for the underlying array by multiplying value_size and max_entries.
The takeaway here is that you need to carefully read up on the parameters required for each map through some examples online, before using them.
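The two meanings of max_entries described above can be written down as a small calculation. This is a sketch with hypothetical names (MapKind, MapDef, backing_bytes are not part of ebpf-for-windows), and it ignores any bookkeeping overhead the runtime may add on top.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified view of a map definition for this example only.
enum class MapKind { RingBuf, Queue };

struct MapDef {
    MapKind kind;
    uint32_t value_size;   // bytes per element (unused for ring buffers)
    uint32_t max_entries;  // bytes for RingBuf, element count for Queue
};

// Bytes of backing storage implied by the definition.
constexpr uint64_t backing_bytes(const MapDef& def) {
    return def.kind == MapKind::RingBuf
               ? uint64_t{def.max_entries}
               : uint64_t{def.value_size} * def.max_entries;
}
```

With a 16-byte FiveTuple, a ring buffer with max_entries = 16 * 1024 and a queue with value_size = 16 and max_entries = 1024 both come out at 16384 bytes, even though the two max_entries values differ by a factor of 16.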
Let’s create a FiveTuple instance called tuple inside our packet_parse function. We’ll create a new function called parse_ethernet_packet to hold all our parsing logic and fill out the tuple object. parse_ethernet_packet will return a bool indicating whether the current packet was successfully parsed and determined to be a UDP packet or not.
//myxdp.c
//* Returns true on successful parse of an ethernet packet, else false
//* (declaration only for now; the body is filled in below)
bool parse_ethernet_packet(uint8_t* start, uint8_t* end, FiveTuple* tuple);
SEC("xdp")
int packet_parse(xdp_md_t* ctx){
uint8_t* data = (uint8_t*)ctx->data;
uint8_t* data_end = (uint8_t*)ctx->data_end;
FiveTuple tuple = {0};
bool is_parse_success = parse_ethernet_packet(data, data_end, &tuple);
return XDP_PASS;
}
Time to fill out the parse_ethernet_packet function. The parsing process is very similar to the ethernet header parsing we did earlier. All the headers we need to parse have premade structs shipped with the eBPF includes; these structs are in the includes we’ve put in packets.h. We simply need to find the start location of each header and impose the relevant pointer type to access the fields. If we find any indication of a malformed packet while parsing, we simply jump to the end and return false.
//myxdp.c
// Returns true on successful parse of an ethernet packet, else false
// If parsing was a success, all the fields of the tuple are filled.
bool parse_ethernet_packet(uint8_t* start, uint8_t* end, FiveTuple* tuple){
bool is_parse_success = false;
ETHERNET_HEADER* ethernet_header = (ETHERNET_HEADER*)start;
//Bounds check on ethernet header
if ((uint8_t*)(ethernet_header + 1) > end)
goto ipv4_udp_parse_done;
//Skip if packet is not ipv4
if (!(ethernet_header->Type == bpf_ntohs(ETHERNET_TYPE_IPV4)))
goto ipv4_udp_parse_done;
//Move sizeof(ETHERNET_HEADER) bytes ahead to the next header
IPV4_HEADER* ipv4_header = (IPV4_HEADER*)(ethernet_header + 1);
//*Bounds check on ipv4 header
if ((uint8_t*)(ipv4_header + 1) > end)
goto ipv4_udp_parse_done;
tuple->proto = ipv4_header->protocol;
tuple->src_address = ipv4_header->SourceAddress;
tuple->dst_address = ipv4_header->DestinationAddress;
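// Only UDP packets are parsed further; everything else is skipped
if (tuple->proto == IPPROTO_UDP) {
// Assumes a 20-byte IPv4 header (no options), so UDP starts right after it
UDP_HEADER* udp_header = (UDP_HEADER*)(ipv4_header + 1);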
//*Bounds check on udp header
if ((uint8_t*)(udp_header + 1) > end)
goto ipv4_udp_parse_done;
tuple->src_port = udp_header->srcPort;
tuple->dst_port = udp_header->destPort;
}
else
goto ipv4_udp_parse_done;
is_parse_success = true;
ipv4_udp_parse_done:
return is_parse_success;
}
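The bounds-check-then-read pattern is easy to try out in userspace before involving the verifier. The sketch below is not the eBPF program: EthHdr, Ipv4Hdr, UdpHdr, wire16, take and make_udp_packet are hypothetical stand-ins written for this example, and like the code above it assumes an IPv4 header without options.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-ins for ETHERNET_HEADER, IPV4_HEADER and UDP_HEADER.
#pragma pack(push, 1)
struct EthHdr  { uint8_t dst[6]; uint8_t src[6]; uint16_t type; };
struct Ipv4Hdr { uint8_t ver_ihl, tos; uint16_t total_len, id, frag;
                 uint8_t ttl, protocol; uint16_t checksum; uint32_t src, dst; };
struct UdpHdr  { uint16_t src_port, dst_port, length, checksum; };
#pragma pack(pop)

struct FiveTuple {
    uint32_t src_address, dst_address;
    uint16_t src_port, dst_port;
    uint8_t proto;
};

// Returns the uint16_t whose in-memory bytes are `value` in network order.
uint16_t wire16(uint16_t value) {
    const uint8_t bytes[2] = {static_cast<uint8_t>(value >> 8),
                              static_cast<uint8_t>(value & 0xFF)};
    uint16_t out;
    std::memcpy(&out, bytes, sizeof(out));
    return out;
}

// Copies one header out of the packet if it fits, then advances the cursor.
template <typename Header>
bool take(const uint8_t*& cursor, const uint8_t* end, Header* out) {
    if (static_cast<std::size_t>(end - cursor) < sizeof(Header)) return false;
    std::memcpy(out, cursor, sizeof(Header));
    cursor += sizeof(Header);
    return true;
}

// Fills the tuple and returns true only for a complete Ethernet/IPv4/UDP packet.
bool parse_packet(const uint8_t* start, const uint8_t* end, FiveTuple* tuple) {
    const uint8_t* cursor = start;
    EthHdr eth;
    Ipv4Hdr ip;
    UdpHdr udp;
    if (!take(cursor, end, &eth) || eth.type != wire16(0x0800)) return false;
    if (!take(cursor, end, &ip) || ip.protocol != 17) return false;  // 17 = UDP
    if (!take(cursor, end, &udp)) return false;
    tuple->proto = ip.protocol;
    tuple->src_address = ip.src;
    tuple->dst_address = ip.dst;
    tuple->src_port = udp.src_port;  // still in network byte order
    tuple->dst_port = udp.dst_port;
    return true;
}

// Builds a minimal Ethernet + IPv4 + UDP packet with zeroed addresses.
std::vector<uint8_t> make_udp_packet(uint16_t src_port, uint16_t dst_port) {
    EthHdr eth{};
    eth.type = wire16(0x0800);
    Ipv4Hdr ip{};
    ip.ver_ihl = 0x45;
    ip.protocol = 17;
    UdpHdr udp{};
    udp.src_port = wire16(src_port);
    udp.dst_port = wire16(dst_port);
    std::vector<uint8_t> pkt(sizeof(eth) + sizeof(ip) + sizeof(udp));
    std::memcpy(pkt.data(), &eth, sizeof(eth));
    std::memcpy(pkt.data() + sizeof(eth), &ip, sizeof(ip));
    std::memcpy(pkt.data() + sizeof(eth) + sizeof(ip), &udp, sizeof(udp));
    return pkt;
}
```

Feeding parse_packet a buffer from make_udp_packet succeeds, while chopping one byte off the end or changing the EtherType makes it return false, which mirrors the early exits in the eBPF version.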
Back to the packet_parse function. If the parse was successful, add the packet to the ringbuffer using bpf_ringbuf_output. The userspace program will then be notified that there is new consumable data in the ringbuffer (we’ll see how to consume it later). We pass the packet up to the OS network stack in all cases using XDP_PASS.
SEC("xdp")
int packet_parse(xdp_md_t* ctx){
FiveTuple tuple = {0};
bool is_parse_success = parse_ethernet_packet(ctx->data, ctx->data_end, &tuple);
if (!is_parse_success)
goto done;
bpf_ringbuf_output(&xdp_map, &tuple, sizeof(FiveTuple), 0);
done:
return XDP_PASS;
}
At this point, you, the smart performance-oriented programmer, start wondering aloud.
You: Isn’t copying the packet contents into a tuple, only to then copy it again to the ringbuffer, wasteful?
Author: ヽ(°_°)ノ Kind of… There is a tradeoff here in terms of space saved per packet, since we’re not copying the entire headers into userspace, only the bits we want. However, yes, we are making extra copies. If our use case allowed us to copy some n bytes of data directly from ctx into the xdp_map ringbuffer, we would use bpf_ringbuf_reserve and bpf_ringbuf_submit.
You: Okay, I want to see how to do that. Show me.
Author: Can’t. Sorry. ebpf-for-windows doesn’t support those functions yet because the verifier is missing some required reference tracking semantics. It is what it is ¯(◉◡◔)/¯. Maybe by the time you’re reading this, they’ve been implemented. In which case, see here.
That’s it for the eBPF program, time to consume the data in userspace.
Consuming the map data in User Application
Head on back to main.cpp. In the main function, we need to get a pointer to the map and then get its descriptor. We’ll do this after we have loaded the program but before we begin waiting for the Ctrl+C event.
Then we create a ring buffer handle using ring_buffer__new and register a callback, process_packet, which will be called whenever there is data in the ringbuffer. We’ll create the callback in the next step.
//main.cpp inside main function
prog_data = ebpf_load_program("myxdp.o", "packet_parse");
printf("eBPF program loaded\n");
//+++++++++++++++++++
bpf_map* map = bpf_object__find_map_by_name(prog_data.object, "xdp_map");
if (!map) fatal_with_cleanup("Map could not be found");
int map_fd = bpf_map__fd(map);
struct ring_buffer* ring_buf = ring_buffer__new(map_fd, process_packet, nullptr, nullptr);
if (!ring_buf) fatal_with_cleanup("Ring buffer could not be attached");
//++++++++++++++++++++
//* Wait for Ctrl+C
The process_packet callback needs to have the specific signature defined by the ringbuffer API and needs to use the C calling convention. C++ would otherwise name-mangle it, so we declare it extern "C".
//main.cpp
extern "C" {
int process_packet(void* ctx, void* data, size_t len);
}
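The first parameter of the callback, ctx, is how per-consumer state can reach it: in libbpf, the pointer given as the third argument of ring_buffer__new is handed back on every call (we pass nullptr in this post). The sketch below shows the idea without any eBPF involved; Stats, count_samples and deliver are hypothetical names, and deliver merely stands in for whatever invokes the callback.

```cpp
#include <cassert>
#include <cstddef>

extern "C" {
// Same shape as the callback declared above: (ctx, data, len) -> int.
typedef int (*sample_fn)(void* ctx, void* data, std::size_t len);
int count_samples(void* ctx, void* data, std::size_t len);
}

// Hypothetical state object passed to the callback through ctx.
struct Stats {
    std::size_t samples = 0;
    std::size_t bytes = 0;
};

// Callback with C linkage that only updates the counters behind ctx.
extern "C" int count_samples(void* ctx, void* /*data*/, std::size_t len) {
    Stats* stats = static_cast<Stats*>(ctx);
    stats->samples += 1;
    stats->bytes += len;
    return 0;
}

// Stand-in for the runtime delivering one record (not a libbpf function).
int deliver(sample_fn callback, void* ctx, void* data, std::size_t len) {
    return callback(ctx, data, len);
}
```

Delivering a 16-byte and a 4-byte record to count_samples with the same Stats object leaves it at 2 samples and 20 bytes, with no global variables involved.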
Time to implement process_packet. This will be a simple function which prints out the fields of the FiveTuple struct in a JSON representation. print_ipv4_addr and ntohs are a couple of helpers for process_packet.
void print_ipv4_addr(uint32_t addr){
printf("%u.%u.%u.%u", (addr & 0x000000FF), (addr & 0x0000FF00) >> 8, (addr & 0x00FF0000) >> 16, (addr & 0xFF000000) >> 24);
}
inline uint16_t ntohs(uint16_t us){
return us << 8 | us >> 8;
}
int process_packet(void* ctx, void* data, size_t len){
if (len == sizeof(FiveTuple)) {
printf("{\n");
FiveTuple* tuple = (FiveTuple*)data;
printf(" \"ip_proto\": \"IPV4\",\n");
printf(" \"src_address\": \"");
print_ipv4_addr(tuple->src_address);
printf("\",\n");
printf(" \"dst_address\": \"");
print_ipv4_addr(tuple->dst_address);
printf("\",\n");
printf(" \"Transport Proto\": \"UDP\",\n");
printf(" \"src_port\": \"%u\",\n \"dst_port\": \"%u\",\n", ntohs(tuple->src_port), ntohs(tuple->dst_port));
printf("},\n\n");
}
return 0;
}
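The two helpers are easier to reason about when they return values instead of printing. Here is a minimal sketch of that variant; format_ipv4 and swap16 are names chosen for this example, and both assume a little-endian host (true for x64 Windows) reading values that arrived in network byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Formats an IPv4 address that was read from the wire into a uint32_t on a
// little-endian host, so the first octet sits in the lowest byte.
std::string format_ipv4(uint32_t addr) {
    char text[16];  // "255.255.255.255" plus the terminating NUL
    std::snprintf(text, sizeof(text), "%u.%u.%u.%u",
                  static_cast<unsigned>(addr & 0xFF),
                  static_cast<unsigned>((addr >> 8) & 0xFF),
                  static_cast<unsigned>((addr >> 16) & 0xFF),
                  static_cast<unsigned>((addr >> 24) & 0xFF));
    return std::string(text);
}

// Swaps the two bytes of a 16-bit value, which converts a port between
// network and host order on a little-endian machine.
uint16_t swap16(uint16_t value) {
    return static_cast<uint16_t>((value << 8) | (value >> 8));
}
```

For instance, the wire bytes 6A 33 2A 38 read as 0x382A336A format to 106.51.42.56, and the wire bytes 01 BB read as 0xBB01 swap to port 443.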
We’re done! Build and copy over main.exe and myxdp.o to the VM and run main. You should see the five-tuple output of the captured UDP packets.
$ main
The Control Handler is installed.
eBPF program loaded
{
"ip_proto": "IPV4",
"src_address": "106.51.42.56",
"dst_address": "172.20.168.164",
"Transport Proto": "UDP",
"src_port": "443",
"dst_port": "52952",
},
....Ctrl+C
Destroying eBPF Link
Unloading eBPF program
Closing eBPF object
Exiting process
Epilogue
- All the source code for the userspace application, as well as the eBPF program we created in this post can be found here.
- This blog post could not have been this detailed without help from the ebpf-for-windows team. If you’re interested in discussing your use-case, issues, enhancements or just generally tracking the project, consider joining the weekly dev meeting on Monday at 8:30am Pacific. More details here: Zoom Meeting Series · Discussion #427 · microsoft/ebpf-for-windows (github.com).
- You can also check the ebpf-for-windows - Cilium & eBPF - Slack for discussions.
- Tutorial resources for ebpf-for-windows are quite sparse at the time of writing. The end goal of the project is to have as much cross-platform parity with Linux as possible. However, the project is still quite deep in development and a significant portion of the equivalent Linux functionality is yet to be implemented. Perhaps one day you’ll be able to open any Linux eBPF guide and follow along effortlessly on the Windows version. Till then, ask the devs for what you need.
- Special thanks to dthaler (Dave Thaler) (github.com) from the ebpf-for-windows team for reviewing this post.
Well, here we are, somehow you made it to the end. Or just scrolled down to see how long the post is. Either way 👋
Hi Gurnoor,
Thanks for the article. It helped a lot.
I tried it and it works for me up to verification. But when I try to add the program, I get Error 131: could not load program.
Did you hit a similar issue during your exploration? If yes, please share your fix.
Thanks in advance.