Gigue presentation and binary generation

My goal is to run raw binary code on top of the Rocket emulator. The code itself is the output of Gigue, my JIT code generator. It generates random JIT methods and Polymorphic Inline Caches (PICs, machine-code switches that dispatch between different methods). The raw binary it outputs consists of three consecutive parts: the interpretation loop, a filler, and the JIT code.
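
To make this layout concrete, here is a minimal Python sketch of how the three parts could be concatenated. It is not Gigue's actual implementation: the layout function, its parameter names and the choice of nop instructions as filler are assumptions made for illustration.

```python
import struct

# RISC-V `addi x0, x0, 0` (nop), little-endian. Using nops as filler is an assumption.
NOP = struct.pack("<I", 0x00000013)


def layout(interp: bytes, jit: bytes, interp_addr: int, jit_addr: int) -> bytes:
    """Concatenate loop, filler and JIT code so that the JIT code starts at jit_addr."""
    gap = jit_addr - (interp_addr + len(interp))
    if gap < 0 or gap % 4:
        raise ValueError("JIT code must start after the loop, on a 4-byte boundary")
    return interp + NOP * (gap // 4) + jit


# Dummy contents: a 3-instruction "loop" and a 2-instruction "JIT region".
image = layout(interp=NOP * 3, jit=NOP * 2, interp_addr=0, jit_addr=32)
print(len(image))  # 12 bytes of loop + 20 bytes of filler + 8 bytes of JIT code
```

The filler only exists so that the JIT code lands at the requested address relative to the interpretation loop.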

The generated code is customizable by defining:

  • the start address of the interpretation loop and JIT code
  • the number of JIT elements (methods + PICs)
  • the method/PICs ratio
  • the max call depth (nested calls) of a method
  • the max call number of a method
  • the max method size
  • etc.

The generated code is tested by disassembling it with capstone and simulating it with unicorn.

Gigue uses pipenv and can be installed as follows:

$ git clone git@github.com:QDucasse/gigue.git
$ cd gigue
$ pipenv install --dev  # dev for capstone/unicorn
$ pipenv shell

Many command-line arguments help customize the generated code. The ones we need are:

  • -I or --intaddr the address of the interpretation loop
  • -J or --jitaddr the address of the JIT code
  • -N or --nbelt the number of JIT elements
  • -R or --picratio the proportion of PICs among the JIT elements (0 means no PICs, 1 means only PICs)
  • -O or --out the name of the output binary file
  • -S or --metmaxsize the maximum size of a method

For now, let’s generate a small binary with 2 methods of max size 5 instructions, no PICs, and a small gap between the interpretation loop and the JIT code by using:

$ python -m gigue -N 2 -S 5 -R 0 -I 0 -J 224

The resulting binary is generated in bin/out.bin!
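
As a rough illustration of what the command above configures, the flags can be mirrored with argparse. This is a hypothetical parser, not Gigue's own: the flag names come from the list above, while the types, the defaults and the program name are assumptions.

```python
import argparse

# Hypothetical parser: flag names are taken from the list above,
# types and default values are assumptions.
parser = argparse.ArgumentParser(prog="gigue-sketch")
parser.add_argument("-I", "--intaddr", type=int, default=0, help="interpretation loop address")
parser.add_argument("-J", "--jitaddr", type=int, default=4096, help="JIT code address")
parser.add_argument("-N", "--nbelt", type=int, default=10, help="number of JIT elements")
parser.add_argument("-R", "--picratio", type=float, default=0.0, help="proportion of PICs")
parser.add_argument("-O", "--out", default="bin/out.bin", help="output binary file")
parser.add_argument("-S", "--metmaxsize", type=int, default=20, help="max method size")

args = parser.parse_args("-N 2 -S 5 -R 0 -I 0 -J 224".split())

# Distance between the two start addresses, in bytes and in 4-byte instruction slots.
offset_bytes = args.jitaddr - args.intaddr
print(args.nbelt, args.metmaxsize, args.picratio, offset_bytes, offset_bytes // 4)
```

With -I 0 and -J 224, the JIT code starts 224 bytes (56 instruction slots) after the start of the interpretation loop.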

First run of the binary

The resulting bin/out.bin is raw data but we can still look at it using objdump:

$ riscv64-unknown-elf-objdump \
    --adjust-vma=0x1000 \
    -m riscv \
    -b binary \
    -D out.bin > out.dis
$ head out.dis
Disassembly of section .data:

0000000000001000 <.data>:
    1000:	fd410113          	addi	sp,sp,-44
    1004:	00812023          	sw	s0,0(sp)
    1008:	00912223          	sw	s1,4(sp)
    100c:	01212423          	sw	s2,8(sp)
    1010:	01312623          	sw	s3,12(sp)
    1014:	01412823          	sw	s4,16(sp)
    1018:	01512a23          	sw	s5,20(sp)
    101c:	01612c23          	sw	s6,24(sp)
    1020:	01712e23          	sw	s7,28(sp)
    1024:	03812023          	sw	s8,32(sp)
    1028:	03912223          	sw	s9,36(sp)
    102c:	02112423          	sw	ra,40(sp)
    1030:	00005097          	auipc	ra,0x5
    1034:	00c080e7          	jalr	12(ra) # 0x603c
    1038:	00009097          	auipc	ra,0x9
    103c:	044080e7          	jalr	68(ra) # 0xa07c
    1040:	00006097          	auipc	ra,0x6
    1044:	26c080e7          	jalr	620(ra) # 0x72ac
    1048:	00006097          	auipc	ra,0x6
    104c:	39c080e7          	jalr	924(ra) # 0x73e4
    1050:	00007097          	auipc	ra,0x7
    1054:	ae4080e7          	jalr	-1308(ra) # 0x7b34
    1058:	00004097          	auipc	ra,0x4
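
To double-check what objdump prints, the encodings can be decoded by hand. The sketch below decodes an I-type instruction and computes the target of an auipc/jalr pair using the standard RISC-V instruction formats; the helper names are mine and only serve this illustration.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's complement number."""
    mask = 1 << (bits - 1)
    return (value ^ mask) - mask


def decode_i_type(word: int) -> dict:
    """Split a 32-bit I-type instruction (addi, jalr, ...) into its fields."""
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "imm": sign_extend(word >> 20, 12),
    }


def call_target(pc: int, auipc: int, jalr: int) -> int:
    """Target of an auipc/jalr pair, with the auipc located at pc."""
    upper = sign_extend(auipc & 0xFFFFF000, 32)
    return (pc + upper + decode_i_type(jalr)["imm"]) & 0xFFFFFFFFFFFFFFFF


first = decode_i_type(0xFD410113)  # first instruction of the dump: addi sp,sp,-44
print(first["rd"], first["rs1"], first["imm"])
print(hex(call_target(0x1030, 0x00005097, 0x00C080E7)))  # the call at 0x1030
```

Register 2 is sp, so the first word does decode to addi sp,sp,-44, and the first call resolves to 0x603c as annotated by objdump.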

As you can see, the whole code is wrapped in the .data section, starting at the address specified with --adjust-vma. Rocket expects an ELF file to run on top of its emulator, so let’s see how we can wrap our file.

We will use objcopy to turn the raw binary input into an ELF64 little-endian RISC-V file.

$ riscv64-unknown-elf-objcopy \
    -I binary \
    -O elf64-littleriscv \
    -B riscv \
    out.bin out.o

Reading the file now with readelf shows us:

$ riscv64-unknown-elf-readelf -hS out.o
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           RISC-V
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          41528 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           64 (bytes)
  Number of section headers:         5
  Section header string table index: 4

Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .data             PROGBITS         0000000000000000  00000040
       000000000000a12c  0000000000000000  WA       0     0     1
  [ 2] .symtab           SYMTAB           0000000000000000  0000a170
       0000000000000060  0000000000000018           3     1     8
  [ 3] .strtab           STRTAB           0000000000000000  0000a1d0
       0000000000000040  0000000000000000           0     0     1
  [ 4] .shstrtab         STRTAB           0000000000000000  0000a210
       0000000000000021  0000000000000000           0     0     1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings), I (info),
  L (link order), O (extra OS processing required), G (group), T (TLS),
  C (compressed), x (unknown), o (OS specific), E (exclude),
  D (mbind), p (processor specific)

Note: -h for the header, -S for the sections!

The header shows Flags: 0x0 (where we would expect the ABI information to be encoded), and the file is Relocatable rather than Executable. If we try to feed it to Rocket’s emulator as is, it fails with the following error:

$ ./emulator-freechips.rocketchip.system-freechips.rocketchip.system.DefaultConfig \
    +max-cycles=100000000 \
    +verbose \
    path/to/out.o
... ../fesvr/elfloader.cc:35: std::map<std::__cxx11::basic_string<char>, long unsigned int> load_elf(const char*, memif_t*, reg_t*): Assertion `IS_ELF_EXEC(*eh64)' failed.
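
The assertion presumably compares the e_type field of the ELF header with ET_EXEC (2). A minimal Python sketch that reads the fields discussed above from a little-endian ELF64 header, following the standard ELF64 header layout; summarize_elf64 is a hypothetical helper, and the header is rebuilt from the readelf output so the example stays self-contained:

```python
import struct

# ELF64 header, little-endian: e_ident[16], e_type, e_machine, e_version,
# e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
ET_NAMES = {1: "REL", 2: "EXEC", 3: "DYN"}


def summarize_elf64(header: bytes) -> dict:
    """Return type, flags and program header count of a little-endian ELF64 header."""
    fields = ELF64_HEADER.unpack(header[:ELF64_HEADER.size])
    ident, e_type, e_flags, e_phnum = fields[0], fields[1], fields[7], fields[10]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian ELF64 header")
    return {"type": ET_NAMES.get(e_type, hex(e_type)), "flags": e_flags, "phnum": e_phnum}


# Values copied from the readelf output above (machine 243 is RISC-V).
ident = bytes.fromhex("7f454c46020101000000000000000000")
header = ELF64_HEADER.pack(ident, 1, 243, 1, 0, 0, 41528, 0, 64, 0, 0, 64, 5, 4)
print(summarize_elf64(header))
```

On a real file, the same function can be fed the first 64 bytes, for instance summarize_elf64(open("out.o", "rb").read(64)).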

We can try to hack our way into the ELF and make it executable with elfedit:

$ riscv64-unknown-elf-elfedit out.o --input-type=rel --output-type=exec
$ riscv64-unknown-elf-readelf -hS out.o | grep "Type:"
  Type:            EXEC (Executable file)
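
Conceptually, for our little-endian ELF64 file, this change amounts to rewriting the two bytes of the e_type field. The sketch below mimics that on a dummy header; it is an illustration of the effect, not elfedit's implementation, and the helper names are assumptions.

```python
import struct

ET_REL, ET_EXEC = 1, 2
E_TYPE_OFFSET = 16   # e_type follows the 16-byte e_ident array
E_PHNUM_OFFSET = 56  # position of e_phnum in an ELF64 header


def set_exec_type(image: bytearray) -> None:
    """Overwrite e_type with ET_EXEC, leaving every other byte untouched."""
    struct.pack_into("<H", image, E_TYPE_OFFSET, ET_EXEC)


def read_u16(image: bytes, offset: int) -> int:
    return struct.unpack_from("<H", image, offset)[0]


# Dummy 64-byte header: magic, relocatable type, no program headers.
dummy = bytearray(64)
dummy[:4] = b"\x7fELF"
struct.pack_into("<H", dummy, E_TYPE_OFFSET, ET_REL)

set_exec_type(dummy)
print(read_u16(dummy, E_TYPE_OFFSET), read_u16(dummy, E_PHNUM_OFFSET))
```

The type now reads as executable, but the number of program headers is still 0: the file describes no segment to load, which is plausibly one reason why flipping the type alone is not enough.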

Running it now launches the emulator for a bit, fails with an error and loops over NULL values:

$ ./emulator-freechips.rocketchip.system-freechips.rocketchip.system.DefaultConfig \
    +max-cycles=100000000 \
    +verbose \
    gigue/bin/out.o
...
C0:        240 [1] pc=[0000000000000828] W[r 8=0000000000000000][1] R[r 8=0000000000000002] R[r 0=0000000000000000] inst=[00147413] DASM(00147413)
C0:        241 [1] pc=[000000000000082c] W[r 0=0000000000000000][0] R[r 8=0000000000000000] R[r 0=0000000000000000] inst=[00040863] DASM(00040863)
C0:        245 [1] pc=[000000000000083c] W[r 8=0000000000000000][1] R[r 0=0000000000000000] R[r 0=0000000000000000] inst=[f1402473] DASM(f1402473)
C0:        265 [1] pc=[0000000000000840] W[r 0=0000000000000000][0] R[r 0=0000000000000000] R[r 8=0000000000000000] inst=[10802423] DASM(10802423)
C0:        266 [1] pc=[0000000000000844] W[r 8=e291ba58c157406e][1] R[r 0=0000000000000000] R[r 0=0000000000000000] inst=[7b202473] DASM(7b202473)
C0:        267 [1] pc=[0000000000000848] W[r 0=0000000000000000][0] R[r 0=0000000000000000] R[r 0=0000000000000000] inst=[7b200073] DASM(7b200073)
C0:        286 [1] pc=[0000000000010058] W[r 0=000000000001005a][1] R[r 0=0000000000000000] R[r 0=0000000000000000] inst=[0000bff5] DASM(0000bff5)
C0:        288 [1] pc=[0000000000010054] W[r 0=0000000000000000][0] R[r 0=0000000000000000] R[r 0=0000000000000000] inst=[10500073] DASM(10500073)
warning: tohost and fromhost symbols not in ELF; can't communicate with target
C0:        291 [0] pc=[0000000000010058] W[r 0=0000000000000000][0] R[r31=0000000000000000] R[r 0=0000000000000000] inst=[0000bff5] DASM(0000bff5)
C0:        296 [1] pc=[0000000000000800] W[r 0=0000000000000804][1] R[r 0=0000000000000000] R[r 0=0000000000000000] inst=[00c0006f] DASM(00c0006f)
C0:        297 [1] pc=[000000000000080c] W[r 0=0000000000000000][0] R[r 0=0000000000000000] R[r 0=0000000000000000] inst=[0ff0000f] DASM(0ff0000f)
C0:        298 [1] pc=[0000000000000810] W[r 0=e291ba58c157406e][1] R[r 8=e291ba58c157406e] R[r 0=0000000000000000] inst=[7b241073] DASM(7b241073)
C0:        303 [1] pc=[0000000000000814] W[r 8=0000000000000000][1] R[r 0=0000000000000000] R[r 0=0000000000000000] inst=[f1402473] DASM(f1402473)
C0:        306 [1] pc=[0000000000000818] W[r 0=0000000000000000][0] R[r 0=0000000000000000] R[r 8=0000000000000000] inst=[10802023] DASM(10802023)
C0:        312 [1] pc=[000000000000081c] W[r 8=0000000000000000][0] R[r 8=0000000000000000] R[r 0=0000000000000000] inst=[40044403] DASM(40044403)
C0:        320 [1] pc=[0000000000000820] W[r 8=0000000000000000][1] R[r 8=0000000000000000] R[r 0=0000000000000000] inst=[00347413] DASM(00347413)
...

The tohost and fromhost symbols are not defined in our executable, so the emulator cannot communicate with the target…
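
For context, tohost is the memory word the target writes to in order to talk to the host. A minimal sketch of the exit encoding, assuming the convention used by the riscv-tests syscalls.c (low bit set marks an exit, the remaining bits carry the exit code); the function names are hypothetical:

```python
def encode_exit(code: int) -> int:
    """Value the target writes to tohost to terminate with the given exit code."""
    return (code << 1) | 1


def decode_tohost(value: int) -> tuple:
    """Interpret a tohost value under the assumed convention."""
    if value == 0:
        return ("idle", 0)          # nothing written yet
    if value & 1:
        return ("exit", value >> 1)
    return ("request", value)       # e.g. the address of a syscall argument block


print(encode_exit(0), decode_tohost(encode_exit(3)))
```

Without these symbols in the ELF, the host side has no address to poll, which matches the warning in the trace.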

Linking our raw binary

With the previous issue in mind, we will follow the way the benchmark tests link and wrap binaries! We need to gather several files: the common directory (which contains crt.S, the startup code; syscalls.c, the system call wrappers; util.h, some helpers; and test.ld, the linker script) as well as encoding.h, which defines the instruction and CSR encodings (taken from the riscv-test-env repository).

Everything is stored in the resources directory:

$ tree resources
resources
└── common
    ├── crt.S
    ├── encoding.h
    ├── syscalls.c
    ├── test.ld
    ├── template.S
    └── util.h

1 directory, 6 files

Note that to embed our raw binary in an ELF file, rather than using objcopy, we can use the assembler’s .incbin directive! We use a template assembly file that includes the raw binary and redefines the main function to call it and then exit. Our main function overrides the one defined in syscalls.c, as that one is declared with the weak attribute. We use the following template.S file:

.global gigue_start
gigue_start:
    .incbin "bin/out.bin"

.global gigue_end
gigue_end:

.global gigue_size
gigue_size:
    .int gigue_end - gigue_start


/* .text.startup: */
    .global main
main:
    call gigue_start
    j exit

Running make at the root of the Gigue repository should compile each file and link them together!

$ export RISCV=/path/to/toolchain
$ make
/opt/riscv-rocket/bin/riscv64-unknown-elf-gcc -Iresources/common -march=rv64gc -mabi=lp64d -DPREALLOCATE=1 -mcmodel=medany -static -std=gnu99 -O2 -ffast-math -fno-common -fno-builtin-printf resources/common/syscalls.c -c -o bin/syscalls.o 
/opt/riscv-rocket/bin/riscv64-unknown-elf-gcc -Iresources/common -march=rv64gc -mabi=lp64d -DPREALLOCATE=1 -mcmodel=medany -static -std=gnu99 -O2 -ffast-math -fno-common -fno-builtin-printf resources/common/crt.S -c -o bin/crt.o 
/opt/riscv-rocket/bin/riscv64-unknown-elf-gcc -Iresources/common -march=rv64gc -mabi=lp64d -DPREALLOCATE=1 -mcmodel=medany -static -std=gnu99 -O2 -ffast-math -fno-common -fno-builtin-printf resources/common/template.S -c -o bin/template.o 
/opt/riscv-rocket/bin/riscv64-unknown-elf-gcc -static -nostdlib -nostartfiles -lm -lgcc -T resources/common/test.ld bin/syscalls.o bin/crt.o bin/template.o -o bin/out

We can also generate dumps of the different binaries: out.bin.dump to look at the Gigue-generated binary (passed through objcopy) and out.dump to look at the complete executable:

$ make dump
/opt/riscv-rocket/bin/riscv64-unknown-elf-objdump --disassemble-all --disassemble-zeroes --section=.text --section=.text.startup --section=.text.init --section=.data bin/out > bin/out.dump
/opt/riscv-rocket/bin/riscv64-unknown-elf-objcopy -I binary -O elf64-littleriscv -B riscv --rename-section .data=.text bin/out.bin bin/out.bin.dump.temp
/opt/riscv-rocket/bin/riscv64-unknown-elf-objdump --disassemble-all --disassemble-zeroes --section=.text --section=.text.startup --section=.text.init --section=.data bin/out.bin.dump.temp > bin/out.bin.dump
rm bin/out.bin.dump.temp

Running it back on Rocket

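Now let’s retry on Rocket, this time piping the trace through the spike-dasm disassembler and writing the result to the out.dis file (the trace is printed on stderr, hence the redirections that swap stdout and stderr before the pipe):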

$ ./emulator-freechips.rocketchip.system-freechips.rocketchip.system.DefaultConfig \
    +max-cycles=100000000 \
    +verbose \
    gigue/bin/out \
    3>&1 1>&2 2>&3 | spike-dasm > gigue/bin/out.dis

Inspecting the disassembly output, our code has been executed!!!

C0:     133134 [1] pc=[0000000080002734] W[r 2=0000000080022a28][1] R[r 2=0000000080022a80] R[r 0=0000000000000000] inst=[fa810113] addi    sp, sp, -88
C0:     133163 [1] pc=[0000000080002738] W[r 0=0000000000000000][0] R[r 2=0000000080022a28] R[r 8=0000000000000000] inst=[00813023] sd      s0, 0(sp)
C0:     133164 [1] pc=[000000008000273c] W[r 0=0000000000000000][0] R[r 2=0000000080022a28] R[r 9=0000000000000000] inst=[00913423] sd      s1, 8(sp)
C0:     133180 [1] pc=[0000000080002740] W[r 0=0000000000000000][0] R[r 2=0000000080022a28] R[r18=0000000080002b28] inst=[01213823] sd      s2, 16(sp)
C0:     133197 [1] pc=[0000000080002744] W[r 0=0000000000000000][0] R[r 2=0000000080022a28] R[r19=0000000000000000] inst=[01313c23] sd      s3, 24(sp)
C0:     133198 [1] pc=[0000000080002748] W[r 0=0000000000000000][0] R[r 2=0000000080022a28] R[r20=0000000000000001] inst=[03413023] sd      s4, 32(sp)
C0:     133199 [1] pc=[000000008000274c] W[r 0=0000000000000000][0] R[r 2=0000000080022a28] R[r21=0000000080002b40] inst=[03513423] sd      s5, 40(sp)
C0:     133200 [1] pc=[0000000080002750] W[r 0=0000000000000000][0] R[r 2=0000000080022a28] R[r22=0000000000000000] inst=[03613823] sd      s6, 48(sp)
C0:     133201 [1] pc=[0000000080002754] W[r 0=0000000000000000][0] R[r 2=0000000080022a28] R[r23=0000000000000000] inst=[03713c23] sd      s7, 56(sp)
C0:     133202 [1] pc=[0000000080002758] W[r 0=0000000000000000][0] R[r 2=0000000080022a28] R[r24=0000000000000000] inst=[05813023] sd      s8, 64(sp)
C0:     133203 [1] pc=[000000008000275c] W[r 0=0000000000000000][0] R[r 2=0000000080022a28] R[r25=0000000000000000] inst=[05913423] sd      s9, 72(sp)
C0:     133204 [1] pc=[0000000080002760] W[r 0=0000000000000000][0] R[r 2=0000000080022a28] R[r 1=0000000080002948] inst=[04113823] sd      ra, 80(sp)
C0:     133205 [1] pc=[0000000080002764] W[r 1=0000000080002764][1] R[r 0=0000000000000000] R[r 0=0000000000000000] inst=[00000097] auipc   ra, 0x0
C0:     133206 [1] pc=[0000000080002768] W[r 1=000000008000276c][1] R[r 1=0000000080002764] R[r 0=0000000000000000] inst=[0b0080e7] jalr    ra, ra, 176
C0:     133244 [1] pc=[0000000080002814] W[r 2=0000000080022a10][1] R[r 2=0000000080022a28] R[r 0=0000000000000000] inst=[fe810113] addi    sp, sp, -24
C0:     133245 [1] pc=[0000000080002818] W[r 0=0000000000000000][0] R[r 2=0000000080022a10] R[r 8=0000000000000000] inst=[00813023] sd      s0, 0(sp)
C0:     133246 [1] pc=[000000008000281c] W[r 7=0000000000000000][0] R[r29=0000000000000000] R[r31=0000000000000000] inst=[03feb3b3] mulhu   t2, t4, t6
C0:     133256 [1] pc=[0000000080002820] W[r 7=fffffffffffffd0c][1] R[r17=0000000000000000] R[r 0=0000000000000000] inst=[d0c8839b] addiw   t2, a7, -756
C0:     133257 [1] pc=[0000000080002824] W[r10=0000000000000000][1] R[r 6=0000000000000000] R[r 0=0000000000000000] inst=[e8232513] slti    a0, t1, -382
C0:     133258 [1] pc=[0000000080002828] W[r10=0000000000000000][1] R[r11=0000000000000000] R[r31=0000000000000000] inst=[01f5f533] and     a0, a1, t6
C0:     133259 [1] pc=[000000008000282c] W[r 8=0000000000000000][1] R[r 2=0000000080022a10] R[r 0=0000000000000000] inst=[00013403] ld      s0, 0(sp)
C0:     133260 [1] pc=[0000000080002830] W[r 2=0000000080022a28][1] R[r 2=0000000080022a10] R[r 0=0000000000000000] inst=[01810113] addi    sp, sp, 24
C0:     133261 [1] pc=[0000000080002834] W[r 0=0000000080002838][1] R[r 1=000000008000276c] R[r 0=0000000000000000] inst=[00008067] ret
C0:     133262 [1] pc=[000000008000276c] W[r 5=0000000000000001][1] R[r 0=0000000000000000] R[r 0=0000000000000000] inst=[00100293] li      t0, 1
C0:     133263 [1] pc=[0000000080002770] W[r 1=0000000080002770][1] R[r 0=0000000000000000] R[r 0=0000000000000000] inst=[00000097] auipc   ra, 0x0
C0:     133264 [1] pc=[0000000080002774] W[r 1=0000000080002778][1] R[r 1=0000000080002770] R[r 0=0000000000000000] inst=[0c8080e7] jalr    ra, ra, 200
C0:     133268 [1] pc=[0000000080002838] W[r 6=0000000000000001][1] R[r 0=0000000000000000] R[r 0=0000000000000000] inst=[00100313] li      t1, 1
C0:     133269 [1] pc=[000000008000283c] W[r 0=0000000000000000][0] R[r 6=0000000000000001] R[r 5=0000000000000001] inst=[00531463] bne     t1, t0, pc + 8
C0:     133335 [1] pc=[0000000080002840] W[r 0=0000000080002844][1] R[r 0=0000000000000000] R[r 0=0000000000000000] inst=[01c0006f] j       pc + 0x1c
C0:     133337 [1] pc=[000000008000285c] W[r 0=0000000080002860][1] R[r 1=0000000080002778] R[r 0=0000000000000000] inst=[00008067] ret
C0:     133339 [1] pc=[0000000080002778] W[r 8=0000000000000000][1] R[r 2=0000000080022a28] R[r 0=0000000000000000] inst=[00013403] ld      s0, 0(sp)
C0:     133340 [1] pc=[000000008000277c] W[r 9=0000000000000000][1] R[r 2=0000000080022a28] R[r 0=0000000000000000] inst=[00813483] ld      s1, 8(sp)
C0:     133341 [1] pc=[0000000080002780] W[r18=0000000080002b28][1] R[r 2=0000000080022a28] R[r 0=0000000000000000] inst=[01013903] ld      s2, 16(sp)
C0:     133342 [1] pc=[0000000080002784] W[r19=0000000000000000][1] R[r 2=0000000080022a28] R[r 0=0000000000000000] inst=[01813983] ld      s3, 24(sp)
C0:     133343 [1] pc=[0000000080002788] W[r20=0000000000000001][1] R[r 2=0000000080022a28] R[r 0=0000000000000000] inst=[02013a03] ld      s4, 32(sp)
C0:     133344 [1] pc=[000000008000278c] W[r21=0000000080002b40][1] R[r 2=0000000080022a28] R[r 0=0000000000000000] inst=[02813a83] ld      s5, 40(sp)
C0:     133345 [1] pc=[0000000080002790] W[r22=0000000000000000][1] R[r 2=0000000080022a28] R[r 0=0000000000000000] inst=[03013b03] ld      s6, 48(sp)
C0:     133346 [1] pc=[0000000080002794] W[r23=0000000000000000][1] R[r 2=0000000080022a28] R[r 0=0000000000000000] inst=[03813b83] ld      s7, 56(sp)
C0:     133347 [1] pc=[0000000080002798] W[r24=0000000000000000][1] R[r 2=0000000080022a28] R[r 0=0000000000000000] inst=[04013c03] ld      s8, 64(sp)
C0:     133348 [1] pc=[000000008000279c] W[r25=0000000000000000][1] R[r 2=0000000080022a28] R[r 0=0000000000000000] inst=[04813c83] ld      s9, 72(sp)
C0:     133349 [1] pc=[00000000800027a0] W[r 1=0000000080002948][1] R[r 2=0000000080022a28] R[r 0=0000000000000000] inst=[05013083] ld      ra, 80(sp)
C0:     133350 [1] pc=[00000000800027a4] W[r 2=0000000080022a80][1] R[r 2=0000000080022a28] R[r 0=0000000000000000] inst=[05810113] addi    sp, sp, 88
C0:     133351 [1] pc=[00000000800027a8] W[r 0=00000000800027ac][1] R[r 1=0000000080002948] R[r 0=0000000000000000] inst=[00008067] ret

We can distinguish the interpretation loop prologue and epilogue (the sd/ld sequences) and, in between, the two JIT methods (each ending with a ret)!
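
Reading such a trace by eye gets tedious quickly, so here is a small Python sketch that extracts the cycle, pc, raw instruction and disassembly from each line. The regular expression is written against the line format shown above and is an assumption about that format rather than a documented interface; the two sample lines are copied from the trace.

```python
import re

TRACE_RE = re.compile(
    r"C(?P<core>\d+):\s+(?P<cycle>\d+) \[(?P<valid>[01])\] "
    r"pc=\[(?P<pc>[0-9a-f]+)\].*?inst=\[(?P<inst>[0-9a-f]+)\]\s+(?P<asm>.*)"
)


def parse_trace_line(line: str):
    """Return a dict describing one trace line, or None if it does not match."""
    m = TRACE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "cycle": int(m["cycle"]),
        "pc": int(m["pc"], 16),
        "inst": int(m["inst"], 16),
        "asm": " ".join(m["asm"].split()),  # collapse the column padding
    }


sample = [
    "C0:     133134 [1] pc=[0000000080002734] W[r 2=0000000080022a28][1] "
    "R[r 2=0000000080022a80] R[r 0=0000000000000000] inst=[fa810113] addi    sp, sp, -88",
    "C0:     133261 [1] pc=[0000000080002834] W[r 0=0000000080002838][1] "
    "R[r 1=000000008000276c] R[r 0=0000000000000000] inst=[00008067] ret",
]
entries = [parse_trace_line(line) for line in sample]
returns = [hex(e["pc"]) for e in entries if e and e["asm"] == "ret"]
print(entries[0]["asm"], returns)
```

Filtering on ret (or on jalr) this way gives a quick view of where each call and return happens in the executed code.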