Investigating VM runtime crashes can be obscure, especially when dealing with code that has been recompiled to machine code by the JIT compiler. This post will show the way to track the execution path within gdb
. Note that you will need gdb
to be available to debug the VM, it is not possible to do this through a simple qemu
image (at least by default) or user-space simulation such as qemu-debootstrap
. The solution is to debug directly on the hardware (provided it exists!).
Pharo VM Compilation
The Pharo VM is a meta-circular VM written in the language it aims to execute (here SmallTalk!). This means that the VM code can all be inspected and executed in the Pharo environment. However, to generate an actual executable, the VM code is translated by Slang, a Smalltalk-to-C transpiler. The translation is only made possible due to the fact that the VM Smalltalk code is written in a restricted SmallTalk and annotated so that most of the ambiguity is cleared. Once the code is translated, it can be compiled down to an executable. Two flavors are available: the StackVM
that only has the interpreter and the CoInterpreter
that has the Cogit JIT compiler. Hitting against a SEGFAULT
at runtime raises questions at several levels: issues in the translation? issues in the build? issues in the runtime environment?
Running the Pharo VM through gdb
To provide an example, we will trigger a SEGFAULT
in the x86 Pharo VM. Let’s get our hands on a fresh Pharo VM and image. Several are available at http://getpharo.org/ and can be obtained with:
$ mkdir getpharo; cd getpharo
$ curl https://get.pharo.org | bash # or wget -O- https://get.pharo.org | bash
The first step is to launch the Pharo VM with gdb
. This can be done with:
$ ./pharo-ui -gdb
(gdb) r Pharo.image
This should open a Pharo environment and you can see the different threads launched in gdb
:
[New Thread 0x7ffff76b3700 (LWP 43777)]
[New Thread 0x7ffff49b6700 (LWP 43778)]
[New Thread 0x7fffec06b700 (LWP 43779)]
[New Thread 0x7fffeb818700 (LWP 43780)]
...
Triggering a SEGFAULT
From within the Pharo environment, we can trigger a SEGFAULT
by defining an external address to some random number and reading it. This can be done by opening a playground and writing:
(ExternalAddress fromAddress: 1234567) readStringUTF8
Pressing Ctrl-D
will trigger the SEGFAULT
, freeze the image and bring the control back to gdb
.
Note: It can also be triggered as a standalone script that contains the line from the playground using:
(gdb) r --headless Pharo.image segfault.st
Inspecting the call stack
Running through gdb and inspecting the call stack through bt
(stands for backtrace). In our case, we can see:
(gdb) bt
#0 primitiveLoadUInt8FromExternalAddress ()
at /builds/workspace/pharo-vm_pharo-9/build-stockReplacement/generated/64/vm/src/gcc3x-cointerp.c:15026
#1 0x00007ffff7ce12a2 in interpret ()
at /builds/workspace/pharo-vm_pharo-9/build-stockReplacement/generated/64/vm/src/gcc3x-cointerp.c:6253
#2 0x00007ffff7ce6b02 in enterSmalltalkExecutiveImplementation ()
at /builds/workspace/pharo-vm_pharo-9/build-stockReplacement/generated/64/vm/src/gcc3x-cointerp.c:17418
#3 0x00007ffff7cd9a76 in interpret ()
at /builds/workspace/pharo-vm_pharo-9/build-stockReplacement/generated/64/vm/src/gcc3x-cointerp.c:2986
#4 0x00007ffff7c51c25 in vm_run_interpreter ()
at /builds/workspace/pharo-vm_pharo-9/repository/src/client.c:90
#5 0x00007ffff7c51c7e in runVMThread (p=p@entry=0x7fffffffd530)
at /builds/workspace/pharo-vm_pharo-9/repository/src/client.c:253
#6 0x00007ffff7c51f17 in runOnMainThread (parameters=0x7fffffffd530)
at /builds/workspace/pharo-vm_pharo-9/repository/src/client.c:261
#7 vm_main_with_parameters (parameters=0x7fffffffd530)
at /builds/workspace/pharo-vm_pharo-9/repository/src/client.c:151
#8 0x00007ffff7c521a8 in vm_main (argc=<optimized out>, argv=<optimized out>, env=<optimized out>)
at /builds/workspace/pharo-vm_pharo-9/repository/src/client.c:204
#9 0x00007ffff7a5e0b3 in __libc_start_main (main=0x4005e0 <main>, argc=2, argv=0x7fffffffd6a8,
init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>,
stack_end=0x7fffffffd698)
at ../csu/libc-start.c:308
#10 0x0000000000400619 in _start ()
Now we can also look at the call stack on the Pharo side with the method (translated to C!) printCallStack()
:
(gdb) p printCallStack()
0x7ffffffc8308 I ExternalAddress>unsignedByteAt: 0x693dfd8: a(n) ExternalAddress
0x7ffffffc8368 I ExternalData>readStringUTF8 0x693e250: a(n) ExternalData
0x7ffffffc83a8 I ExternalAddress(ByteArray)>readStringUTF8 0x693dfd8: a(n) ExternalAddress
0x7ffffffc83d8 M UndefinedObject>DoIt 0x6b43c00: a(n) UndefinedObject
0x7ffffffc8408 M [] in OpalCompiler>evaluate 0x69129e0: a(n) OpalCompiler
0x7ffffffc8438 M FullBlockClosure(BlockClosure)>on:do: 0x6913f48: a(n) FullBlockClosure
0x7ffffffc8490 I OpalCompiler>evaluate 0x69129e0: a(n) OpalCompiler
...
What do we get from those? From the OS side, the application triggered a SEGFAULT
in the interpretation loop when calling a primitive. From the Pharo side, the call stack can be deciphered as a succession of frames where each present:
- The address they are located at (starting with
0x7fffff...
) - An identifier (I for Interpreted, M for Machine code and S for Single)
- The method class and name (note that in the case of blocks it presents
[]
) - The address of the receiver
- The type of the receiver
Diving in a Frame
Now that we have the big picture of what is happening in the call stack, let’s look into a frame in more detail. The one we will dive into is:
0x7ffffffc83d8 M UndefinedObject>DoIt 0x6b43c00: a(n) UndefinedObject
We can print all the information it holds by using the method (translated to C once again!) printFrame()
:
(gdb) p printFrame(0x7ffffffc83d8)
0x7ffffffc83d8 M UndefinedObject>DoIt 0x6b43c00: a(n) UndefinedObject
0x7ffffffc83e8: rcvr/clsr: 0x6b43c00 =nil
0x7ffffffc83e0: caller ip: 0x551e475=89252981
0x7ffffffc83d8: saved fp: 0x7ffffffc8408=140737488126984
0x7ffffffc83d0: method: 0x553e308 0x692cf10: a(n) CompiledMethod
0x7ffffffc83d0: mcfrm flags: 0x0 numArgs: 0 noContext notBlock
0x7ffffffc83c8: context: 0x6b43c00 =nil
0x7ffffffc83c0: receiver: 0x6b43c00 =nil
0x7ffffffc83b0: frame pc: 0x553e3c5=89383877
A frame holds a lot of information such as the caller instruction pointer, the receiver, the frame instruction pointer and the method itself. This is from the frame that holds the DoIt
method which takes no argument and was identified with M. If we compare it with for example:
0x7ffffffc8308 I ExternalAddress>unsignedByteAt: 0x693dfd8: a(n) ExternalAddress
This frame holds the method unsignedByteAt:
which takes one argument and the I identifier. Printing the frame shows:
(gdb) p printFrame(0x7ffffffc8308)
0x7ffffffc8308 I ExternalAddress>unsignedByteAt: 0x693dfd8: a(n) ExternalAddress
0x7ffffffc8320: rcvr/clsr: 0x693dfd8 =a(n) ExternalAddress
0x7ffffffc8318: arg0: 0x9 =1(0x1)
0x7ffffffc8310: caller ip: 0x884a9dc=142911964
0x7ffffffc8308: saved fp: 0x7ffffffc8368=140737488126824
0x7ffffffc8300: method: 0x88473c8 0x88473c8: a(n) CompiledMethod
0x7ffffffc82f8: context: 0x6b43c00 =nil
0x7ffffffc82f0:intfrm flags: 0x101=257 numArgs: 1 noContext notBlock
0x7ffffffc82e8: saved ip: 0x0 0
0x7ffffffc82e0: receiver: 0x693dfd8 =a(n) ExternalAddress
0x7ffffffc82d8: stck: 0x693dfd8 =a(n) ExternalAddress
0x7ffffffc82d0: stck: 0x1 =0(0x0)
This time, an arg0
field can be found as well as a receiver and the flags field has changed from machine code flags to interpreter flags.
Extracting the method information
To find if a method is a machine code method, the function methodFor(<method_address>)
can output either the location of the machine code or NULL
if it is not a machine code method. In the two cases presented earlier, the outputs are:
(gdb) p methodFor(0x553e308) # frame DoIt
(CogMethod *) 0x553e308
(gdb) p methodFor(0x88473c8) # frame unsignedByteAt:
(CogMethod *) 0x0
Using this information, we can use another print to get more information from the machine code method, printCogMethod(<cog_method_pointer>)
:
(gdb) p printCogMethod((CogMethod *) 0x553e308)
0x553e308 <-> 0x553e3d8: method: 0x692cf10 selector: 0x6b43c00 (nil: DoIt)
<-> 0x553e3d8: method: 0x692cf10 selector: 0x6b43c00 (nil: DoIt)
This output presents:
-
The start of the machine code method
-
The end of the machine code method
-
The location of the compiled method (in bytecodes)
Note: This could be seen in the frame description side by side with its machine code one, and the two are identical in the case of a method that has not been recompiled which is the case for the second frame!
-
The location of the selector (here it is
nil
) -
If it is a primitive, its id is printed as well
When compiling a method to machine code it is split into 3 parts: header metadata (size depends on the machine), the machine code itself and more metadata. To disassemble the correct part, we need to go over the header. Its size can be found with the cmEntryOffset
and cmNoCheckEntryOffset
methods. In our case:
(gdb) print cmEntryOffset
48
(gdb) print cmNoCheckEntryOffset
74
Disassembling the JITed method
Now that we have the address of the JITed method as well as the offset size, we can disassemble it with the disassemble
instruction from gdb
. We need to pass it the address with the offset added as well as end address (can be an offset from the start address starting with a +):
(gdb) disassemble 0x553e308+48,+150
Dump of assembler code from 0x553e338 to 0x553e3ce:
0x000000000553e338: mov %rdx,%rax
0x000000000553e33b: and $0x7,%rax
0x000000000553e33f: jne 0x553e34d
0x000000000553e341: mov (%rdx),%rax
0x000000000553e344: and $0x3fffff,%rax
0x000000000553e34a: nop
0x000000000553e34b: nop
0x000000000553e34c: nop
0x000000000553e34d: cmp %rcx,%rax
0x000000000553e350: jne 0x553e333
0x000000000553e352: mov (%rsp),%r9
0x000000000553e356: mov %rdx,(%rsp)
0x000000000553e35a: push %r9
0x000000000553e35c: push %rbp
0x000000000553e35d: mov %rsp,%rbp
0x000000000553e360: lea -0x5f(%rip),%r8 # 0x553e308
0x000000000553e367: push %r8
0x000000000553e369: mov $0x6b43c00,%r9
0x000000000553e370: push %r9
0x000000000553e372: push %rdx
0x000000000553e373: movabs 0x7ffff7f391f8,%rax
0x000000000553e37d: cmp %rax,%rsp
0x000000000553e380: jb 0x553e330
0x000000000553e382: movabs $0x6bcb408,%rax
0x000000000553e38c: nop
0x000000000553e38d: mov (%rax),%r15
0x000000000553e390: and $0x3ffff7,%r15
0x000000000553e397: jne 0x553e39f
0x000000000553e399: mov 0x8(%rax),%rax
0x000000000553e39d: jmp 0x553e38d
0x000000000553e39f: mov 0x10(%rax),%r15
0x000000000553e3a3: mov $0x96b439,%rdi
0x000000000553e3aa: mov %r15,%rdx
0x000000000553e3ad: mov $0x2,%rcx
0x000000000553e3b4: callq 0x53ee0c0
0x000000000553e3b9: mov $0x3,%rcx
0x000000000553e3c0: callq 0x53ee080
0x000000000553e3c5: mov %rbp,%rsp
0x000000000553e3c8: pop %rbp
0x000000000553e3c9: retq $0x8
0x000000000553e3cc: int3
0x000000000553e3cd: int3
End of assembler dump.
This is the routine of our Cog method. The usual routine works as follows:
-
Abort routine
-
Entry point and type check
-
Frame building
-
Stack overflow and maybe context switch
-
Message send
-
Frame un-building and return
To see those steps in our methods, we can look at what the different calls are pointing at, that should be different trampolines. To print trampoline info, using printTrampolineTable()
will show the addresses and names of the ones known by the JIT compiler.
(gdb) p printTrampolineTable()
0x53ee000: ceGetFP
0x53ee008: ceGetSP
0x53ee010: ceCaptureCStackPointers
0x53ee030: ceDereferenceSelectorIndex
0x53ee080: ceSend0Args
0x53ee0c0: ceSend1Args
0x53ee108: ceSend2Args
0x53ee150: ceSendNArgs
...
0x53ef5d8: ceCallPIC0Args
0x53ef5f8: ceCallPIC1Args
0x53ef618: ceCallPIC2Args
0x53ef638: ceTraceLinkedSendTrampoline
0x53ef680: ceTraceBlockActivationTrampoline
0x53ef6c8: ceTraceStoreTrampoline
0x53ef710: methodZoneBase
Note: The
ce
prefix means call execution
Linking the trampolines to the different calls and looking into the jumps of our method we can annotate it as:
Dump of assembler code from 0x553e338 to 0x553e3ce:
____________________________________________________________________________________________________
Entry point and type check
____________________________________________________________________________________________________
0x000000000553e338: mov %rdx,%rax
0x000000000553e33b: and $0x7,%rax
0x000000000553e33f: jne 0x553e34d <-- Jump over the next type check
0x000000000553e341: mov (%rdx),%rax
0x000000000553e344: and $0x3fffff,%rax
0x000000000553e34a: nop
0x000000000553e34b: nop
0x000000000553e34c: nop
0x000000000553e34d: cmp %rcx,%rax
0x000000000553e350: jne 0x553e333 <-- Jump to the abort routine before the entry
____________________________________________________________________________________________________
Frame building
____________________________________________________________________________________________________
0x000000000553e352: mov (%rsp),%r9
0x000000000553e356: mov %rdx,(%rsp)
0x000000000553e35a: push %r9
0x000000000553e35c: push %rbp
0x000000000553e35d: mov %rsp,%rbp
0x000000000553e360: lea -0x5f(%rip),%r8 # 0x553e308
0x000000000553e367: push %r8
0x000000000553e369: mov $0x6b43c00,%r9
0x000000000553e370: push %r9
0x000000000553e372: push %rdx
____________________________________________________________________________________________________
Stack overflow and maybe context change
____________________________________________________________________________________________________
0x000000000553e373: movabs 0x7ffff7f391f8,%rax
0x000000000553e37d: cmp %rax,%rsp
0x000000000553e380: jb 0x553e330 <-- Jump to the abort routine before the entry
0x000000000553e382: movabs $0x6bcb408,%rax
0x000000000553e38c: nop
0x000000000553e38d: mov (%rax),%r15
0x000000000553e390: and $0x3ffff7,%r15
0x000000000553e397: jne 0x553e39f <-- Jump over next move
0x000000000553e399: mov 0x8(%rax),%rax
0x000000000553e39d: jmp 0x553e38d <-- Jump back to check again
____________________________________________________________________________________________________
Message sends
____________________________________________________________________________________________________
0x000000000553e39f: mov 0x10(%rax),%r15
0x000000000553e3a3: mov $0x96b439,%rdi <-- Our fake address 1234567
0x000000000553e3aa: mov %r15,%rdx
0x000000000553e3ad: mov $0x2,%rcx
0x000000000553e3b4: callq 0x53ee0c0 <-- Trampoline ceSend1Args unsignedByteAt:
0x000000000553e3b9: mov $0x3,%rcx
0x000000000553e3c0: callq 0x53ee080 <-- Trampoline ceSend0Args DoIt
____________________________________________________________________________________________________
Frame unbuilding and ret
____________________________________________________________________________________________________
0x000000000553e3c5: mov %rbp,%rsp
0x000000000553e3c8: pop %rbp
0x000000000553e3c9: retq $0x8
0x000000000553e3cc: int3
0x000000000553e3cd: int3
End of assembler dump.
The most important part of this is the Message sends
. If we look at what this part looks in the JIT intermediate representation CogRTL
, it is presented as:
MoveRR <receiver> ReceiverResultReg
MovePatcheableC32R <selector index> ClassReg
Call <trampoline>
This is the initial state of a JITed method, however the objective is to redirect it to the correct method by patching it into a monomorphic call for example:
MoveRR <receiver> ReceiverResultReg
MovePatcheableC32R <expected class of the receiver> ClassReg
Call <jited method address>
That can itself be patched again to a polymorphic call by redirecting the cal to a small piece of logic:
MoveRR <receiver> ReceiverResultReg
MovePatcheableC32R <expected class of the receiver> ClassReg
Call <polymorphic call logic>
Inspecting trampolines
The trampolines can be inspected too by disassembling them the same way it is done for methods. Looking around the ceSend0Args(0x53ee080)
:
(gdb) disassemble 0x53ee080,+70
Dump of assembler code from 0x53ee080 to 0x53ee0c6:
0x00000000053ee080: mov (%rsp),%r9 <-- ceSend0Args
0x00000000053ee084: mov %rdx,(%rsp)
0x00000000053ee088: push %r9
0x00000000053ee08a: callq 0x53ee030
0x00000000053ee08f: mov %rbp,0x30(%rbx)
0x00000000053ee093: mov %rsp,0x40(%rbx)
0x00000000053ee097: mov 0x8d608(%rbx),%rsp
0x00000000053ee09e: mov %rcx,%rdi
0x00000000053ee0a1: xor %rsi,%rsi
0x00000000053ee0a4: xor %rcx,%rcx
0x00000000053ee0a7: movabs $0x7ffff7cd79c0,%rax
0x00000000053ee0b1: callq *%rax
0x00000000053ee0b3: mov 0x40(%rbx),%rsp
0x00000000053ee0b7: mov 0x30(%rbx),%rbp
0x00000000053ee0bb: retq
0x00000000053ee0bc: int3
0x00000000053ee0bd: int3
0x00000000053ee0be: int3
0x00000000053ee0bf: int3
0x00000000053ee0c0: mov (%rsp),%r9 <-- ceSend1Args
0x00000000053ee0c4: mov %rdx,(%rsp)
End of assembler dump.