Let me tell you a little about how ELFs work, since there has been some wheel reinventing going on. I'll go from memory, using a simple example of what things would look like if relocation was implemented.
SCENE
Code: // marty.h
// All symbols here are exported
#include <not.important> // Just defined DLL_EXPORT, compiler-specific
extern DLL_EXPORT const char *marty_name;
extern DLL_EXPORT void marty_invoke();
Code: // marty.c
#include "marty.h"
#include <stdio.h>
const char *marty_name = "Marty McFly";
void marty_invoke() {
puts("Since when can weathermen predict the weather, let alone the future?");
}
Code: // doc.h
// All symbols here are exported
#include <not.important> // Just defined DLL_EXPORT, compiler-specific
extern DLL_EXPORT const char *doc_name;
extern DLL_EXPORT void doc_invoke();
Code: // doc.c
#include "doc.h"
#include <stdio.h>
const char *doc_name = "Emmett Brown";
void doc_invoke() {
puts("Oh, my God. They found me. I don't know how, but they found me. Run for it, Marty!");
}
Code: // delorean.c
#include "marty.h"
#include "doc.h"
#include <stdio.h>
int main() {
printf("Hey, %s, what's new?\n", marty_name);
marty_invoke();
printf("Hey, %s, what's new?\n", doc_name);
doc_invoke();
}
Well, I had fun. Now, I'm not going to go into the details; I'm just showing what it looks like from a coder's perspective. Tools can be made to generate shared libraries, and code can be made to dynamically link objects. I already wrapped my head around it when working on the Prizm.
Firstly, let's take the code above and build ELFs: 2 shared libraries for our stars and 1 runnable binary for the car. The fun happens when you poke into how the ELF was made.
Marty!
libmarty.so has quite a few symbols exported, but among them are marty_name and marty_invoke. Now, the library object is made up of different sections. The dynamic symbol section contains all symbols that can be used externally with metadata about them. They contain the symbol value, size, type, binding, visibility, and a few other things. Now, this is specific to the ELF format and can be changed for other implementations, I'm just saying what is being used today.
Ok, so, you have a dynamic symbol table looking something like:
Code: Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND
1: 00000000000005c8 0 SECTION LOCAL DEFAULT 9
2: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_deregisterTMCloneTab
3: 0000000000000000 0 FUNC GLOBAL DEFAULT UND puts@GLIBC_2.2.5 (2)
4: 0000000000000000 0 NOTYPE WEAK DEFAULT UND __gmon_start__
5: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _Jv_RegisterClasses
6: 0000000000000000 0 NOTYPE WEAK DEFAULT UND _ITM_registerTMCloneTable
7: 0000000000000000 0 FUNC WEAK DEFAULT UND __cxa_finalize@GLIBC_2.2.5 (2)
8: 0000000000201038 0 NOTYPE GLOBAL DEFAULT 23 _edata
9: 0000000000201030 8 OBJECT GLOBAL DEFAULT 23 marty_name
10: 0000000000201040 0 NOTYPE GLOBAL DEFAULT 24 _end
11: 0000000000000730 19 FUNC GLOBAL DEFAULT 11 marty_invoke
12: 0000000000201038 0 NOTYPE GLOBAL DEFAULT 24 __bss_start
13: 00000000000005c8 0 FUNC GLOBAL DEFAULT 9 _init
14: 0000000000000744 0 FUNC GLOBAL DEFAULT 12 _fini
Hey, marty_invoke and marty_name are there! They are symbols #9 and #11! Great! First thing, look at the values. marty_name is an 8-byte pointer. The value is the virtual address (can be modified). That address points to the .data section, and at that location is the value 0x750. 0x750 is mapped directly in virtual memory and in the ELF, and guess what's at 0x750? "Marty McFly\0"
Now, hold on, Marty. I can take an entry, find out where that symbol's value is, and a few other things. How... do I find which symbol is what? There's another section for that! Meet the dynamic string section! It contains all symbol names as zero-terminated strings, and each dynsym entry stores the offset of its name in there. Want to look up a symbol's index in the dynsym section? Walk the entries and compare each one's string in the dynstr section against the name you want. (Now, there are extensions to this, but they aren't needed: there are optional hash tables that can speed up symbol lookup times.)
I can now take, say, "marty_name" and through some logic find that the string is at 0x750. Great Scott!
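To make the "some logic" part a bit more concrete, here is a minimal sketch of a name lookup over a toy dynsym/dynstr pair. The struct layout and the names dyn_sym, dynsym, dynstr and find_symbol are made up for illustration and are much simpler than the real ELF structures; only the two values (0x201030 and 0x730) come from the table above.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a dynamic symbol entry. */
struct dyn_sym {
    uint32_t name_off;  /* offset of the name inside the string table */
    uint64_t value;     /* symbol value (virtual address) */
    uint64_t size;      /* size in bytes */
};

/* Toy string table: zero-terminated names packed back to back.
 * Offset 0 holds the empty string. */
static const char dynstr[] = "\0marty_name\0marty_invoke";

/* Toy symbol table; values borrowed from the libmarty.so dump. */
static const struct dyn_sym dynsym[] = {
    { 0,  0x0,      0  },  /* reserved null entry */
    { 1,  0x201030, 8  },  /* marty_name   (name at offset 1)  */
    { 12, 0x730,    19 },  /* marty_invoke (name at offset 12) */
};

/* Linear scan: return the index of the symbol called `name`, or -1. */
static int find_symbol(const char *name) {
    size_t count = sizeof dynsym / sizeof dynsym[0];
    for (size_t i = 1; i < count; i++) {  /* skip the null entry */
        if (strcmp(dynstr + dynsym[i].name_off, name) == 0)
            return (int)i;
    }
    return -1;
}
```

With these toy tables, find_symbol("marty_name") gives index 1, and dynsym[1].value is the address to read the pointer from. A hash table would only replace the linear scan; the tables stay the same.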
But Doc...
libdoc.so isn't anything special, it comes down to being the same with different names and values. Nothing interesting there.
Where we're going, we don't need roads.
Now for the fun part, the DeLorean! It isn't much fun if it can't find Doc and Marty, right?
Now, remember, delorean is still an ELF, but it has an entry point (Yes, libraries can be run, too. Try running libc.so.6) unlike the above 2. The ELF still has dynsym and dynstr sections (Doh, since I used the same headers in delorean without doing anything special about visibility, this ELF has marty_* and doc_* symbols exported. No worries), but there are sections that are worth looking at now.
Relocation!
There's a relocation section (.rela.*) that is important for letting the DeLorean know about Marty and Doc. There are 2 subsections, one for function calls (.rela.plt) and one for everything else, such as data symbols (.rela.dyn). (There is also one for compile-time relocation; I'm ignoring it since it isn't in the final ELF.) Let's take a peek into these sections:
Code: Relocation section '.rela.dyn' at offset 0x6d8 contains 3 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000600ff8 000400000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
000000601048 000a00000005 R_X86_64_COPY 0000000000601048 marty_name + 0
000000601050 000e00000005 R_X86_64_COPY 0000000000601050 doc_name + 0
Relocation section '.rela.plt' at offset 0x720 contains 5 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000601018 000200000007 R_X86_64_JUMP_SLO 0000000000000000 printf + 0
000000601020 000300000007 R_X86_64_JUMP_SLO 0000000000000000 __libc_start_main + 0
000000601028 000400000007 R_X86_64_JUMP_SLO 0000000000000000 __gmon_start__ + 0
000000601030 000500000007 R_X86_64_JUMP_SLO 0000000000000000 doc_invoke + 0
000000601038 000600000007 R_X86_64_JUMP_SLO 0000000000000000 marty_invoke + 0
Hey, that looks right. From the headers, you can see that these tables show the offset (But for what, might you ask?), info (a bit cryptic), the type of relocation operation to perform, symbol value (wait, why again?), and the name and addend. Now, wait, these aren't all actually what is stored. Think about it, you already have some of this information stored (in terms of the ELF, you can move this into these sections if you want). The ELF has a symbol table with symbol names, values, etc. Each entry in this section just has the offset (into the program's virtual memory), the info field (which packs the relocation type together with the symbol's index in the symbol table), and the addend. readelf looked up the rest for display, sorry!
So, ok, our program has relocation tables. How are they used? Well, this is where ELF gets more complicated as it thinks bigger, but this is what has been tossed around.
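The cryptic info column is easier to read once it is split in two. A small sketch, assuming the ELF64 packing (symbol index in the high 32 bits, relocation type in the low 32 bits); the struct and helper names here are made up, the real definitions live in the system's elf.h:

```c
#include <stdint.h>

/* Hypothetical mirror of the three fields stored per ".rela" entry. */
struct rela_entry {
    uint64_t offset;  /* where to write the fix-up (virtual address) */
    uint64_t info;    /* symbol index (high 32 bits) + type (low 32 bits) */
    int64_t  addend;  /* constant added to the symbol value */
};

/* Symbol table index encoded in the info field. */
static uint32_t rela_sym(uint64_t info) {
    return (uint32_t)(info >> 32);
}

/* Relocation type encoded in the info field. */
static uint32_t rela_type(uint64_t info) {
    return (uint32_t)(info & 0xffffffffu);
}
```

Feeding it the marty_name row (info 0x000a00000005) gives symbol index 10 and type 5, which readelf printed as R_X86_64_COPY. The doc_invoke row (0x000500000007) gives symbol index 5 and type 7, the jump slot type.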
GOT and PLT
The Global Offset Table, or GOT/.got, is a reserved chunk of memory that is allocated at runtime for the process. This table stores the addresses to use for symbols. There is a second section for the Procedure Linkage Table's share of the GOT, the GOT PLT/.got.plt (also allocated and stored in RAM). Now, in this process, there is another special section named the PLT (not part of the GOT, this is different. This is in the program, not copied). The PLT section holds a number of "trampolines" (different from GCC trampolines). You'll see why they are called trampolines further down.
Ok, that's a lot to drop on you. 2 GOT tables and a PLT. Wat.
How should I start... Let's look at how the program runs as-is, ok?
Code: 00000000004008f6 <main>:
4008f6: 55 push %rbp
4008f7: 48 89 e5 mov %rsp,%rbp
4008fa: 48 8b 05 47 07 20 00 mov 0x200747(%rip),%rax # 601048 <__TMC_END__>
400901: 48 89 c6 mov %rax,%rsi
400904: bf e0 09 40 00 mov $0x4009e0,%edi
400909: b8 00 00 00 00 mov $0x0,%eax
40090e: e8 9d fe ff ff callq 4007b0 <printf@plt>
400913: b8 00 00 00 00 mov $0x0,%eax
400918: e8 d3 fe ff ff callq 4007f0 <marty_invoke@plt>
40091d: 48 8b 05 2c 07 20 00 mov 0x20072c(%rip),%rax # 601050 <doc_name>
400924: 48 89 c6 mov %rax,%rsi
400927: bf e0 09 40 00 mov $0x4009e0,%edi
40092c: b8 00 00 00 00 mov $0x0,%eax
400931: e8 7a fe ff ff callq 4007b0 <printf@plt>
400936: b8 00 00 00 00 mov $0x0,%eax
40093b: e8 a0 fe ff ff callq 4007e0 <doc_invoke@plt>
400940: b8 00 00 00 00 mov $0x0,%eax
400945: 5d pop %rbp
400946: c3 retq
400947: 66 0f 1f 84 00 00 00 nopw 0x0(%rax,%rax,1)
40094e: 00 00
Whoa, whoa, hold on, what am I looking at? This is in fact x64 assembly. Don't be scared, you don't need to know it to understand what's going on, so long as you understand assembly in general.
Ok, reading this... some registers are being moved... Oh look, comments! And symbols! 0x4008fa, this is where the magic happens. There's an offset to %rip (Note, this is an x64 feature to make PIC code easier, x86 has slightly different magic). objdump kindly told me the target is 0x601048, or __TMC_END__. Yes..., but it just picked the first equate for that address. Looking at the symbols, marty_name is there. Hey, that's cool! From the symbol table:
Code: 57: 0000000000601048 8 OBJECT GLOBAL DEFAULT 25 marty_name
Ok, but uh, that's not part of our program. And this address, where is it? readelf tells me it is in the .bss section (or uninitialized global data). Huh, why is marty_name in this section? I thought the GOT has the addresses, what gives? The answer is in the .rela.dyn section from before. Look at the offset for marty_name. 0x601048. This particular relocation type says to copy the actual symbol's value to that offset, in the .bss section. Cool, that's how that library symbol was relocated (I'll elaborate on how this happens once I talk about how it happens with functions). But, what about function calls?
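Stripped of all the bookkeeping, the copy relocation described above is a memcpy of the symbol's size. The sketch below fakes both sides with two ordinary variables; lib_marty_name, exe_marty_name and apply_copy_reloc are invented names, and a real dynamic linker would of course find the source and destination through the symbol and relocation tables instead.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the initialised pointer inside libmarty.so's .data. */
static const char *lib_marty_name = "Marty McFly";

/* Stand-in for the 8 bytes reserved for marty_name in delorean's .bss. */
static const char *exe_marty_name;

/* What a COPY-style relocation boils down to: copy `size` bytes of
 * initial data from the library's symbol to the executable's slot. */
static void apply_copy_reloc(void *dst, const void *src, size_t size) {
    memcpy(dst, src, size);
}
```

Calling apply_copy_reloc(&exe_marty_name, &lib_marty_name, sizeof exe_marty_name) leaves the executable's copy pointing at the same "Marty McFly" string, which is why main can read it straight out of .bss without going through the GOT.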
Looking at 0x400904, 0x4009e0 is stored to the %edi register. Most likely this is the string "Hey, %s, what's new?\n" (which makes sense, below the same reference is used for printf). Now, this call to 0x4007b0. Hey, it says printf@plt, that looks like magic! It says that's in the PLT, so, what does the PLT look like?
Code: Disassembly of section .plt:
00000000004007a0 <printf@plt-0x10>:
4007a0: ff 35 62 08 20 00 pushq 0x200862(%rip) # 601008 <_GLOBAL_OFFSET_TABLE_+0x8>
4007a6: ff 25 64 08 20 00 jmpq *0x200864(%rip) # 601010 <_GLOBAL_OFFSET_TABLE_+0x10>
4007ac: 0f 1f 40 00 nopl 0x0(%rax)
00000000004007b0 <printf@plt>:
4007b0: ff 25 62 08 20 00 jmpq *0x200862(%rip) # 601018 <_GLOBAL_OFFSET_TABLE_+0x18>
4007b6: 68 00 00 00 00 pushq $0x0
4007bb: e9 e0 ff ff ff jmpq 4007a0 <_init+0x28>
<snip>
00000000004007e0 <doc_invoke@plt>:
4007e0: ff 25 4a 08 20 00 jmpq *0x20084a(%rip) # 601030 <_GLOBAL_OFFSET_TABLE_+0x30>
4007e6: 68 03 00 00 00 pushq $0x3
4007eb: e9 b0 ff ff ff jmpq 4007a0 <_init+0x28>
00000000004007f0 <marty_invoke@plt>:
4007f0: ff 25 42 08 20 00 jmpq *0x200842(%rip) # 601038 <_GLOBAL_OFFSET_TABLE_+0x38>
4007f6: 68 04 00 00 00 pushq $0x4
4007fb: e9 a0 ff ff ff jmpq 4007a0 <_init+0x28>
Hmm, so it calls the printf@plt trampoline (but why call it a trampoline?!), which jumps to the address stored in the GOT PLT (a different area than the GOT). Hey, that sounds exactly like what the GOT PLT is for, right? Storing the address of relocated functions? Yes! Well, erm, not quite yet. .got.plt + 0x18 (address 0x601018, important) contains 0x4007b6. Welp, that's shot me back into the PLT. Bounced into the GOT PLT, now bouncing to the dynamic linker, kinda trampoline-like.
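The addresses objdump prints in its comments can be checked by hand. A %rip-relative operand is measured from the address of the next instruction, so the target is instruction address + instruction length + displacement. A tiny sketch (rip_target is a made-up helper name):

```c
#include <stdint.h>

/* Target of a %rip-relative operand: address of the *next* instruction
 * plus the signed 32-bit displacement encoded in the instruction. */
static uint64_t rip_target(uint64_t insn_addr, unsigned insn_len, int32_t disp) {
    return insn_addr + insn_len + (uint64_t)(int64_t)disp;
}
```

For the printf@plt jump at 0x4007b0 (6 bytes long, displacement 0x200862) this gives 0x601018, the GOT PLT slot. For the 7-byte mov at 0x4008fa in main with displacement 0x200747 it gives 0x601048, which is marty_name.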
Under the hood
Ok, now what's this _init+0x28? That's just objdump's name for 0x4007a0, the first entry in the PLT. Ding ding ding, that makes a call into the dynamic linker (well, through the GOT PLT, but it gets there)! Yup, ELF resolves functions lazily (usually called lazy binding) because these lookups can get quite long (Exceptions apply depending on situation). If you have a large project with tons of deeply nested symbols in other libraries, looking up every symbol at startup would be noticeable. (And again, you don't have to do this, but look at how easy lazy resolution is. If the dynamic linker has to get much more complex to do single shot resolution, you might do away with it; it's up to the implementer of the dynamic linker.)
Now, this linker code, what is it doing? Well, see that pushq $0x0? That's the index into the .rela.plt table for the function it wants to invoke. And that row in the .rela.plt table has the address 0x601018, which is the printf entry in the GOT PLT. Once the linker has resolved the symbol, it replaces that placeholder entry in the GOT PLT with the resolved address of the function. From then on, the trampoline bounces directly into the intended function. Amazing!
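The whole bounce can be mimicked in plain C with a table of function pointers. This is only a toy model of the idea, not how a real dynamic linker is written: got_plt, rela_plt, resolve, lookup and the stub are all invented names, and marty_invoke_impl returns a number just so the result can be checked.

```c
#include <stddef.h>
#include <string.h>

typedef int (*func_ptr)(void);

/* Stand-in for the real function living in the library. */
static int marty_invoke_impl(void) { return 1985; }

/* Toy .rela.plt row: which symbol, and which GOT PLT slot to patch. */
struct plt_reloc { const char *name; size_t got_index; };

static int resolver_stub_0(void);  /* defined below */

/* Toy GOT PLT: the slot starts out pointing back at the stub. */
static func_ptr got_plt[1] = { resolver_stub_0 };
static const struct plt_reloc rela_plt[1] = { { "marty_invoke", 0 } };
static int resolve_count;  /* how often the "linker" was entered */

/* Toy symbol lookup; a real linker searches the loaded libraries. */
static func_ptr lookup(const char *name) {
    return strcmp(name, "marty_invoke") == 0 ? marty_invoke_impl : NULL;
}

/* The "dynamic linker": resolve row `index`, patch the slot, return it. */
static func_ptr resolve(size_t index) {
    const struct plt_reloc *r = &rela_plt[index];
    func_ptr target = lookup(r->name);
    got_plt[r->got_index] = target;  /* later calls skip the linker */
    resolve_count++;
    return target;
}

/* Plays the role of "pushq $index; jmp <first PLT entry>". */
static int resolver_stub_0(void) { return resolve(0)(); }

/* Plays the role of marty_invoke@plt: jump through the slot. */
static int call_marty_invoke(void) { return got_plt[0](); }
```

The first call_marty_invoke() goes stub, linker, function; the second goes straight through the patched slot, and resolve_count stays at 1.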
But... but...
Remember the *_name variables? On startup, the dynamic linker runs for all non-lazy relocations all at once. Nothing fancy here besides what was already said.
Ok, the linker is told to find these symbols. There is yet another section in the ELF (.dynamic) that lists the names of the shared libraries and their relation (are they required or not). Some other metadata is stored, like the rpath (where to search for these), but that's not required afaik. (For calcs, you'd need a list of appvar names. Not the name of the library; you want these to be easily shared and interchangeable. That makes searching the VAT easier afaik, else you'd need to read everything and find their name.)
Relocation, location, what's your version?
Now, version differences were brought up on IRC. How do I know the function I am calling in the library will work right? What if the version changed? "We're all adults": this is less a problem that ELF would handle and more a problem for the developers of the library when they change the API of existing functions. In short, define your API. Specify a solid declaration. Define how the inputs are handled and what the outputs are. If you break API compatibility, release a differently named library so the dynamic linker will not pick it. Just like how requiring libusb-0.1.so will only pick that specific name, not libusb-1.0.so which has a different API.
(Yes, ELF has a section for versioning of libraries. It is mostly used for libc.so*; otherwise versions are mostly seen as different library names. If you have a newer version of the library, you'd see a symbolic link made to the specific version. For a calc, maybe an API version should be put in the ELF objects. Only increment it when the existing API changes, not when existing functions are changed but have the same functionality or the API is expanded.)
(Just an idea, you could add versions to functions, but I'd see that as less optimal in terms of planning your API)
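For the calc-side API version idea, the check itself could be as small as the sketch below. Everything in it is hypothetical (the lib_info record, the api_version field and api_compatible are not part of ELF or any existing tool); it only illustrates the rule of bumping one number when existing API breaks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-library record; not part of ELF. */
struct lib_info {
    const char *name;
    uint16_t    api_version;  /* bumped only when existing API breaks */
};

/* A program records the API version it was built against; the loader
 * refuses the library if that number no longer matches. */
static bool api_compatible(const struct lib_info *lib, uint16_t required) {
    return lib->api_version == required;
}
```

Since the number is not bumped when the API is only expanded, a program that needs a newly added function would still pass this check against an older library and then fail at symbol lookup, so the lookup needs to report missing symbols cleanly either way.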
Great Scott! We made it!
Well, I drove this example at 88mph and didn't explode, which is great. This example is limited, so not all relocation types were shown. Also, I used x64 so I cheated a bit with %rip. How is it normally done for architectures that don't cheat? The code dedicates an index register to hold the address of the GOT. The program code indexes from this register to get any needed addresses. Someone fill me in, it looks like the PLT is only really needed for x64 platforms because of a limitation on the instructions. Are the PLT trampolines needed because you can't invoke a function at an address stored at an offset from an index register? Sounds like way too much for 1 instruction. In that case, the GOT PLT alone would suffice?
But wait, what happened to the lonely GOT? Our name variables were put into .bss! The GOT is for PIC code in your current binary (Oh god, this gets even more complex, turn off your brain for now). Let's say you have this code:
Code: int x;
void inc() {
x++;
}
This is a masterpiece. Code quality aside, if you want PIC enforced in your code, fine, you can't have an absolute address built into your code. BUT WAIT, where's x? You don't know where! And you don't care, because the dynamic linker relocated the data section the variable was in and stored the address in the GOT! You now just use your GOT-set index register to get the address of x, load the value, increment it, and write it back.
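Written out by hand, the indirection that PIC code does for inc() looks roughly like this. It is a toy model: got, GOT_X and toy_relocate are invented names, and the array stands in for the table the index register would point at.

```c
static int x;  /* the variable, wherever the loader ended up putting it */

/* Toy GOT: one slot per symbol, filled in at load time. */
enum { GOT_X = 0 };
static void *got[1];

/* Plays the role of the dynamic linker storing x's address in the GOT. */
static void toy_relocate(void) {
    got[GOT_X] = &x;
}

/* What PIC code effectively does for x++: fetch the address from the
 * GOT, then load, increment and store through it. */
static void inc(void) {
    int *px = got[GOT_X];
    (*px)++;
}
```

The code in inc() never contains the address of x itself, only the slot number, which is why the same machine code works no matter where the data ends up.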
Aaaand here's the bad part. You have a library that uses the GOT. You have a program that uses the GOT. You have a conflict, oh joy. You need to have multiple GOTs/GOT PLTs for each shared object. This is the part I like the least. How this is done, information is scarce. And it depends on the architecture. Please, someone fill me in on this!
And now, the larch
I think that just about covers how relocation and PIC work in ELF objects, enough for those who haven't slaved over the technical details, ELF dumps, and disassemblies to have an idea of what goes on. I had loads of fun reading up on this, as I already had plans of writing the dynamic linker for the Prizm for fun.
So, I don't do dev for TIOS, and my z80 knowledge didn't carry into the ez80. The above highlights how things work today for ELF binaries; a lot can carry into assembly programs on ez80 calcs with some design changes. (Here I go talking about an OS I don't know about, for an architecture whose limits I don't know):
- Assembly programs are copied to a place in RAM, great. What was it, a duplicate VAT entry is made of the same size and the program is copied there? If so, find somewhere to allocate the GOT and GOT PLT (ELF-wise, they are stored right after each other).
- The real end-goal is to make a shell/kernel for the calcs with the dynamic linker contained. It should be able to take a relocation entry w/symbol entry (for name), search the VAT for the appvar or asm program (or w/e you want), optionally check the API versions match, then...? Find a way to make it accessible. Functions are easy, you can handle the calls (proxy the call to the flash, changing pages if needed). I don't know about the ez80, if you can map pages into memory so that all code is accessible, great, everything is there. Otherwise, that makes the following difficult:
- Figure out what to do with relocating objects (non-functions). Pointers can get nasty if you can't map them into memory (either copying or mapping pages.)
- A way of outputting in a different format is needed. ZDS/etc. output in a flat format that doesn't support unresolved symbols in the final output AFAIK. A linker'd be needed to build a program in the right format with all needed sections/metadata.
- ABI needs to change to allow these features. Getting the ZDS C compiler to bend appropriately might not be possible. Note, that's really needed for PIC generation, so libraries would have issues. Regular C programs could use PIC'd asm libraries, you'd just need to link accordingly and use a dynamic linker.
Cites Worked
- https://www.cs.stevens.edu/~jschauma/631/elf.html - Nice technical details on the ELF sections
- http://bottomupcs.sourceforge.net/csbu/x3824.htm (and the next page) - Simpler approach that walks through the discovery process
- https://docs.oracle.com/cd/E23824_01/html/819-0690/chapter6-46512.html#scrolltoc - Bunch of pages that detailed a bit more about the ELF format
- http://linux.die.net/man/5/elf - Structures, defines, not that much help on the low-level aspects
- http://sco.com/developers/gabi/latest/contents.html - The System V ABI Draft (2013), describes ELF object files, loading, and dynamic linking.
tl;dr, Tiny elves inside your computer write magical symbols that assemble themselves into a completed puzzle.