Playing with uBPF

Published 04-02-2020 00:00:00

Introduction

uBPF is a user-land implementation of the eBPF VM.

Installation

$ git clone https://github.com/iovisor/ubpf.git
$ cd ubpf
$ make -C vm

This builds the uBPF library (libubpf.a) and a test program (vm/test) that uses it.

Run a simple eBPF program with uBPF

static int idouble(int a) {
        return (a * 2);
}

int bpf_prog(void *ctx) {
        int a = 1;
        a = idouble(a);

        return (a);
}

Save this as hello.c and compile it to a BPF object file, which can be executed by the kernel eBPF VM or, in our case, by the user-land uBPF VM.

$ clang -O2 -target bpf -c hello.c -o hello.o

Execute our BPF code with the uBPF VM.

$ vm/test hello.o
0x2

BPF program with an argument

Let’s use the ctx variable.

#include <stdint.h>

static uint32_t idouble(uint32_t a) {
        return (a * 2);
}

uint32_t bpf_prog(int32_t *arg) {
        uint32_t result = 0;
        result = idouble(*arg);

        return result;
}

Then create a file representing the memory that will be passed to the bpf_prog function:

$ printf "%b" '\x03\x00\x00\x00' > integer_3.mem

Recompile hello.c as before, then execute the new program, giving the VM the memory file:

$ vm/test hello.o --mem integer_3.mem
0x6

Under the hood, a pointer to the memory buffer is passed in the VM’s r1 register, which holds the first function argument.
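To make the content of integer_3.mem concrete, here is a minimal host-side sketch. It assumes a little-endian BPF target (the object file is reported as LSB), so the four bytes 03 00 00 00 decode to the integer 3. The helpers load_le32 and run_on_buffer are hypothetical names introduced for this illustration and are not part of uBPF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same logic as the BPF program above, compiled for the host. */
static uint32_t idouble(uint32_t a) {
        return (a * 2);
}

uint32_t bpf_prog(int32_t *arg) {
        return idouble((uint32_t)*arg);
}

/* Decode 4 bytes as a little-endian 32-bit integer
 * (assumption: the BPF target is little-endian). */
uint32_t load_le32(const unsigned char *mem) {
        assert(mem != NULL);
        return (uint32_t)mem[0] |
               ((uint32_t)mem[1] << 8) |
               ((uint32_t)mem[2] << 16) |
               ((uint32_t)mem[3] << 24);
}

/* Hypothetical helper: run bpf_prog on a raw 4-byte buffer
 * laid out like integer_3.mem. */
uint32_t run_on_buffer(const unsigned char *mem) {
        int32_t value = (int32_t)load_le32(mem);
        return bpf_prog(&value);
}
```

Calling run_on_buffer on the bytes 03 00 00 00 yields 6, matching the 0x6 printed by vm/test.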

Another example

Let’s now create a BPF program that takes an IP datagram and returns 1 if the destination address is 1.1.1.1.

Create the IP datagrams

We use scapy to do so.

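from scapy.all import IP, raw

# Datagram whose destination is 1.1.1.1
p = IP(src='127.0.0.1', dst='1.1.1.1')
with open('ip_1_1_1_1.mem', 'wb') as f:
    f.write(raw(p))
# Datagram whose destination is 1.1.1.2
p = IP(src='127.0.0.1', dst='1.1.1.2')
with open('ip_1_1_1_2.mem', 'wb') as f:
    f.write(raw(p))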

Create the BPF program

#include <arpa/inet.h>
#include <stdint.h>

#define ONE_ONE_ONE_ONE 0x01010101

struct ipv4_header {
    uint8_t ver_ihl;
    uint8_t tos;
    uint16_t total_length;
    uint16_t id;
    uint16_t frag;
    uint8_t ttl;
    uint8_t proto;
    uint16_t csum;
    uint32_t src;
    uint32_t dst;
};

int is_dst_one_one_one_one(void *opaque) {
    struct ipv4_header *ipv4_header = (struct ipv4_header*)opaque;

    if (ntohl(ipv4_header->dst) == ONE_ONE_ONE_ONE) {
        return 1;
    }

    return 0;
}

The function is now called is_dst_one_one_one_one. In fact, it can be called anything as long as it is the only non-static, exported function.
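Before running the program in the VM, the same logic can be checked natively. The sketch below is an assumption about how one might test it on a Linux host. It verifies at compile time that the struct mirrors the 20-byte IPv4 header, and it builds a bare header in C. make_header is a hypothetical helper introduced here, and the checksum is deliberately left at zero.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONE_ONE_ONE_ONE 0x01010101

struct ipv4_header {
    uint8_t ver_ihl;
    uint8_t tos;
    uint16_t total_length;
    uint16_t id;
    uint16_t frag;
    uint8_t ttl;
    uint8_t proto;
    uint16_t csum;
    uint32_t src;
    uint32_t dst;
};

/* The struct must mirror the wire layout: 20 bytes, dst at byte 16. */
static_assert(sizeof(struct ipv4_header) == 20, "unexpected padding");
static_assert(offsetof(struct ipv4_header, dst) == 16, "dst misplaced");

int is_dst_one_one_one_one(void *opaque) {
    struct ipv4_header *ipv4_header = (struct ipv4_header *)opaque;

    return ntohl(ipv4_header->dst) == ONE_ONE_ONE_ONE;
}

/* Hypothetical helper: build a minimal header with the given
 * dotted-quad destination. Addresses are stored in network byte
 * order, as they would appear in a real datagram. */
struct ipv4_header make_header(const char *dst) {
    struct ipv4_header h;

    memset(&h, 0, sizeof h);
    h.ver_ihl = 0x45;                 /* IPv4, 5 x 32-bit words */
    h.total_length = htons(sizeof h);
    h.ttl = 64;
    h.src = inet_addr("127.0.0.1");
    h.dst = inet_addr(dst);
    return h;
}
```

Passing the address of make_header("1.1.1.1") to is_dst_one_one_one_one returns 1, while a header built with "1.1.1.2" returns 0.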

Compile the BPF program:

$ clang -O2 -target bpf -c ip_dst.c -o ip_dst.o

Then execute it with each of our IP datagrams:

$ vm/test ip_dst.o --mem ip_1_1_1_1.mem
0x1
$ vm/test ip_dst.o --mem ip_1_1_1_2.mem
0x0

From there, we can imagine anything.

Let’s dissect the compiled object file

It’s an ELF file:

$ file ip_dst.o
ip_dst.o: ELF 64-bit LSB relocatable, eBPF, version 1 (SYSV), not stripped

ELF sections:

$ readelf -SW ip_dst.o
There are 5 section headers, starting at offset 0x118:

Section Headers:
  [Nr] Name              Type            Address          Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            0000000000000000 000000 000000 00      0   0  0
  [ 1] .strtab           STRTAB          0000000000000000 0000c8 00004c 00      0   0  1
  [ 2] .text             PROGBITS        0000000000000000 000040 000028 00  AX  0   0  8
  [ 3] .llvm_addrsig     LOOS+0xfff4c03  0000000000000000 0000c8 000000 00   E  4   0  1
  [ 4] .symtab           SYMTAB          0000000000000000 000068 000060 18      1   3  8
[...]

The .text section contains our BPF code, with a size of 40 (0x28) bytes. The content of this section is:

$ readelf -x .text ip_dst.o

Hex dump of section '.text':
  0x00000000 61111000 00000000 b7000000 01000000 a...............
  0x00000010 15010100 01010101 b7000000 00000000 ................
  0x00000020 95000000 00000000                   ........
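Each eBPF instruction is 8 bytes long, so these 40 bytes hold five instructions. An instruction consists of an 8-bit opcode, a byte packing the destination register (low nibble) and source register (high nibble), a 16-bit offset, and a 32-bit immediate. The following sketch decodes those fields from the dumped bytes, assuming little-endian encoding. The struct and function names are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The 40 bytes of the .text section, copied from the hex dump above. */
const uint8_t IP_DST_TEXT[] = {
    0x61, 0x11, 0x10, 0x00, 0x00, 0x00, 0x00, 0x00,
    0xb7, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00,
    0x15, 0x01, 0x01, 0x00, 0x01, 0x01, 0x01, 0x01,
    0xb7, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
    0x95, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
};
static_assert(sizeof(IP_DST_TEXT) == 5 * 8, "five 8-byte instructions");

/* Decoded fields of one instruction (names chosen for this sketch). */
struct insn_fields {
    uint8_t opcode;
    uint8_t dst_reg;
    uint8_t src_reg;
    int16_t offset;
    int32_t imm;
};

/* Decode the instruction at the given index (0-based). The caller
 * must ensure that index is within the code buffer. */
struct insn_fields decode_insn(const uint8_t *code, size_t index) {
    const uint8_t *p = code + index * 8;
    struct insn_fields f;

    f.opcode = p[0];
    f.dst_reg = p[1] & 0x0f;          /* low nibble */
    f.src_reg = p[1] >> 4;            /* high nibble */
    f.offset = (int16_t)(uint16_t)(p[2] | (p[3] << 8));
    f.imm = (int32_t)((uint32_t)p[4] | ((uint32_t)p[5] << 8) |
                      ((uint32_t)p[6] << 16) | ((uint32_t)p[7] << 24));
    return f;
}
```

For instance, decode_insn(IP_DST_TEXT, 0) gives opcode 0x61 with both registers set to 1 and an offset of 0x10, and the last instruction has opcode 0x95.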

Use llvm-objdump to get more information about the disassembled code of the .text section.

$ llvm-objdump -d -r --section .text -print-imm-hex  ip_dst.o

ip_dst.o:	file format ELF64-BPF


Disassembly of section .text:

0000000000000000 is_dst_one_one_one_one:
       0:	61 11 10 00 00 00 00 00	r1 = *(u32 *)(r1 + 0x10)
       1:	b7 00 00 00 01 00 00 00	r0 = 0x1
       2:	15 01 01 00 01 01 01 01	if r1 == 0x1010101 goto +0x1 <LBB0_2>
       3:	b7 00 00 00 00 00 00 00	r0 = 0x0

0000000000000020 LBB0_2:
       4:	95 00 00 00 00 00 00 00	exit
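Two details of this listing are worth spelling out. The load at r1 + 0x10 reads the dst field, which sits 16 bytes into the IPv4 header. There is also no byte-swap instruction for ntohl, presumably because the compiler folded the swap into the constant, and 1.1.1.1 reads the same in both byte orders. The sketch below illustrates that reasoning for a little-endian target. SWAP32 and raw_compare_value are hypothetical names introduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Byte-swap a 32-bit value: what ntohl does on a little-endian machine. */
#define SWAP32(v) ((((v) & 0x000000ffu) << 24) | \
                   (((v) & 0x0000ff00u) << 8)  | \
                   (((v) & 0x00ff0000u) >> 8)  | \
                   (((v) & 0xff000000u) >> 24))

/* 1.1.1.1 is identical in both byte orders, so the raw load can be
 * compared against 0x1010101 directly. */
static_assert(SWAP32(0x01010101u) == 0x01010101u, "palindromic address");

/* Hypothetical helper: the constant a raw 32-bit load of dst would be
 * compared against for a given host-order address. */
uint32_t raw_compare_value(uint32_t host_order_addr) {
    return SWAP32(host_order_addr);
}
```

For a non-palindromic address such as 1.1.1.2 (0x01010102 in host order), the raw comparison value would be 0x02010101 under these assumptions.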

We can extract the raw bytecode of the .text section with:

$ objcopy -O binary -I elf64-little --only-section=.text ip_dst.o ip_dst.ebpf
$ hexdump -C ip_dst.ebpf
00000000  61 11 10 00 00 00 00 00  b7 00 00 00 01 00 00 00  |a...............|
00000010  15 01 01 00 01 01 01 01  b7 00 00 00 00 00 00 00  |................|
00000020  95 00 00 00 00 00 00 00                           |........|
00000028

This is the minimal set of data to execute this BPF program.

Is it possible to modify memory?

If, in the previous ip_dst.c, we modify the IP datagram itself, the store succeeds:

if (ntohl(ipv4_header->dst) == ONE_ONE_ONE_ONE) {
    ipv4_header->dst = htonl(0x01010142);
    return 1;
}

However, if we try to write to an area outside the memory buffer, the VM rejects the store:

if (ntohl(ipv4_header->dst) == ONE_ONE_ONE_ONE) {
    *((int *)opaque + 1024) = 0;
    return 1;
}

The execution then fails with:

uBPF error: out of bounds memory store at PC 4, addr 0x7ffef649cf50, size 4
mem 0x7ffef649bf50/20 stack 0x7ffef649be80/128
0xffffffffffffffff
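The message suggests that a store is accepted only if it falls entirely inside the memory buffer or the stack. Here, the faulting address is the start of the buffer plus 4096 bytes (1024 ints), far beyond its 20 bytes. The following is a minimal sketch of such a check, written for illustration. The function names are assumptions, and uBPF’s actual implementation may differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if [addr, addr + size) lies entirely inside [base, base + len).
 * Written with subtractions to avoid integer overflow. */
static bool in_region(uint64_t addr, uint64_t size,
                      uint64_t base, uint64_t len) {
    return addr >= base && size <= len && addr - base <= len - size;
}

/* A store is allowed only inside the memory buffer or the stack. */
bool store_allowed(uint64_t addr, uint64_t size,
                   uint64_t mem, uint64_t mem_len,
                   uint64_t stack, uint64_t stack_len) {
    return in_region(addr, size, mem, mem_len) ||
           in_region(addr, size, stack, stack_len);
}

/* Replays the values from the error message above. */
void replay_error_message(void) {
    const uint64_t mem = 0x7ffef649bf50, stack = 0x7ffef649be80;

    /* mem + 4096: outside both regions, so the store is rejected. */
    assert(!store_allowed(0x7ffef649cf50, 4, mem, 20, stack, 128));
    /* mem + 16: the dst field, so the store is accepted. */
    assert(store_allowed(mem + 16, 4, mem, 20, stack, 128));
}
```

With these values, the write to the dst field at offset 16 passes the check, while the write at offset 4096 does not, which is consistent with the two experiments above.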