stage 00 readme done

2021-08-31 02:10:17 -04:00 · 2021-08-31 02:10:17 -04:00 · d052391270
commit d052391270
parent 9bcbd94e46
7 changed files with 525 additions and 174 deletions
--- a/00/README.md
+++ b/00/README.md
@ -0,0 +1,383 @@
 # stage 00
 This directory contains the file `hexcompile`, a handwritten executable. It
 takes input file `A` containing space/newline/[any character]-separated
 hexadecimal numbers and outputs them as bytes to the file `B`. On 64-bit Linux,
 try running `./hexcompile` from this directory (I've already provided an `A`
 file), and you will get a file named `B` containing the text `Hello, world!`.
 This stage is needed so that you can use your favorite text editor to write
 executables by hand (which have bytes outside of ASCII/UTF-8).  I wrote it with
 a program called hexedit, which can be found on most Linux distributions. Only
 64-bit Linux is supported, because each OS/architecture combination would need
 its own separate executable. The executable is 632 bytes long, and you could
 definitely make it smaller if you wanted to, especially if you didn't limit it
 to the set of instructions I've decided on. Let's take a look at what's inside
 (`od -t x1 -An hexcompile`):
 ```
 7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
 02 00 3e 00 01 00 00 00 78 00 40 00 00 00 00 00
 40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 00 00 00 00 40 00 38 00 01 00 00 00 00 00 00 00
 01 00 00 00 07 00 00 00 78 00 00 00 00 00 00 00
 78 00 40 00 00 00 00 00 00 00 00 00 00 00 00 00
 00 02 00 00 00 00 00 00 00 02 00 00 00 00 00 00
 00 10 00 00 00 00 00 00 48 b8 74 02 40 00 00 00
 00 00 48 89 c7 48 b8 00 00 00 00 00 00 00 00 48
 89 c6 48 89 c2 48 b8 02 00 00 00 00 00 00 00 0f
 05 48 89 c5 48 b8 76 02 40 00 00 00 00 00 48 89
 c7 48 b8 41 00 00 00 00 00 00 00 48 89 c6 48 b8
 a4 01 00 00 00 00 00 00 48 89 c2 48 b8 02 00 00
 00 00 00 00 00 0f 05 48 89 ef 48 b8 68 02 40 00
 00 00 00 00 48 89 c6 48 b8 03 00 00 00 00 00 00
 00 48 89 c2 48 b8 00 00 00 00 00 00 00 00 0f 05
 48 89 c3 48 b8 03 00 00 00 00 00 00 00 48 39 d8
 0f 8f 37 01 00 00 48 b8 68 02 40 00 00 00 00 00
 48 89 c3 48 8b 03 48 89 c3 48 89 c7 48 b8 ff 00
 00 00 00 00 00 00 48 21 d8 48 89 c6 48 b8 39 00
 00 00 00 00 00 00 48 89 c3 48 89 f0 48 39 d8 0f
 8f 1e 00 00 00 48 b8 30 00 00 00 00 00 00 00 48
 f7 d8 48 89 f3 48 01 d8 e9 26 00 00 00 00 00 00
 00 00 00 48 b8 a9 ff ff ff ff ff ff ff 48 89 f3
 48 01 d8 e9 0b 00 00 00 00 00 00 00 00 00 00 00
 00 00 00 48 89 c2 48 b8 ff 00 00 00 00 00 00 00
 48 89 c3 48 89 f8 48 c1 e8 08 48 21 d8 48 93 48
 b8 39 00 00 00 00 00 00 00 48 93 48 39 d8 0f 8f
 1f 00 00 00 48 89 c3 48 b8 d0 ff ff ff ff ff ff
 ff 48 01 d8 e9 2a 00 00 00 00 00 00 00 00 00 00
 00 00 00 48 89 c3 48 b8 a9 ff ff ff ff ff ff 48
 01 d8 e9 0c 00 00 00 00 00 00 00 00 00 00 00 00
 00 00 00 48 89 c7 48 89 d0 48 c1 e0 04 48 89 fb
 48 09 d8 48 93 48 b8 68 02 40 00 00 00 00 00 48
 93 48 89 03 48 89 de 48 b8 04 00 00 00 00 00 00
 00 48 89 c7 48 b8 01 00 00 00 00 00 00 00 48 89
 c2 0f 05 e9 8f fe ff ff 00 00 00 00 00 48 b8 3c
 00 00 00 00 00 00 00 0f 05 00 00 00 00 00 00 00
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 00 00 00 00 41 00 42 00
 ```
 Okay, that doesn't tell us much. I'll annotate it below. You might notice that
 all the numbers are backwards, e.g. `3e 00` for the number 0x003e (62 decimal).
 This is because almost all modern architectures (including x86-64) are
 little-endian, meaning that the *least significant byte* goes first, and the
 most significant byte goes last. There are various reasons why this is easier to
 deal with, but I won't explain that here.
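If the byte order still feels odd, a few lines of Python make it concrete. This is only an illustration using values from the dump above; nothing in this stage depends on Python.

```python
# Decode the two bytes `3e 00` from the dump as a little-endian integer.
raw = bytes.fromhex("3e 00")
value = int.from_bytes(raw, "little")
print(hex(value), value)  # 0x3e 62

# Going the other way: the entry point 0x400078 as 8 little-endian bytes.
entry = (0x400078).to_bytes(8, "little")
print(entry.hex(" "))  # 78 00 40 00 00 00 00 00
```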
 ## ELF header
 This header has a bunch of metadata about the executable.
 - `7f 45 4c 46` Special identifier saying that this is an ELF file (ELF is the
 format of almost all Linux executables)
 - `02` 64-bit
 - `01` Little-endian
 - `01` ELF version 1 (there is no version 2 yet)
 - `00 00 00 00 00 00 00 00 00` Reserved (not important yet, but may be in a later
 version of ELF)
 - `02 00` Object type = executable file (not a dynamic library/etc.)
 - `3e 00` Architecture x86-64
 - `01 00 00 00` Version 1 of ELF, again 
 - `78 00 40 00 00 00 00 00` **Entry point of the executable** = 0x400078 (explained later)
 - `40 00 00 00 00 00 00 00` Program header table offset in bytes from start of file (see below)
 - `00 00 00 00 00 00 00 00` Section header table offset (we're not using sections)
 - `00 00 00 00` Flags (not important)
 - `40 00` The size of this header, in bytes = 64
 - `38 00` Size of the program header (see below) = 56
 - `01 00` Number of program headers = 1
 - `00 00` Size of each section header (unused)
 - `00 00` Number of section headers (unused)
 - `00 00` Index of special .shstrtab section (unused)
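As a cross-check of the field list above, here is a small Python sketch that unpacks the same 64 bytes with the `struct` module. The variable names follow the usual `Elf64_Ehdr` field names from `elf.h`; the format string is my own transcription of the list above, so treat it as a sketch rather than a reference parser.

```python
import struct

# The first 64 bytes of the dump above (the ELF header).
header = bytes.fromhex(
    "7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 "
    "02 00 3e 00 01 00 00 00 78 00 40 00 00 00 00 00 "
    "40 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "00 00 00 00 40 00 38 00 01 00 00 00 00 00 00 00 "
)

# "<" = little-endian with no padding, s = raw bytes,
# H = 2 bytes, I = 4 bytes, Q = 8 bytes.
(e_ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
 e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum,
 e_shstrndx) = struct.unpack("<16sHHIQQQIHHHHHH", header)

print(e_ident[:4])               # b'\x7fELF'
print(hex(e_machine))            # 0x3e (x86-64)
print(hex(e_entry))              # 0x400078
print(e_phoff, e_ehsize, e_phentsize, e_phnum)  # 64 64 56 1
```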
 ## program header
 The program header describes a segment of data that is loaded into memory when
 the program starts. Normally, you would have more than one of these, maybe 
 one for code, one for read-only data, and one for read-write data, but to
 simplify things we've only got one, which we'll use for any code and any data
 we need. This means it'll have to be read-enabled, write-enabled, and
 execute-enabled. Normally people don't do this, for security, but we won't worry
 about that (don't compile any untrusted code with any compiler from this series!).
 Without further ado, here are the contents of the program header:
 - `01 00 00 00` Segment type 1 (this should be loaded into memory)
 - `07 00 00 00` Flags = RWE (readable, writeable, and executable)
 - `78 00 00 00 00 00 00 00` Offset in file = 120
 - `78 00 40 00 00 00 00 00` Virtual address = 0x400078
 **wait a minute, what's that?**
 We just specified the *virtual address* of this segment. This is the virtual
 memory address that the segment will be loaded to. Virtual memory means that
 memory addresses in our program do not actually correspond to where the memory
 is physically stored in RAM. There are many reasons for it, including allowing
 different processes to have overlapping memory addresses, making sure that some
 memory can't be read/written/executed, etc. You can read more about it
 elsewhere.
 - `00 00 00 00 00 00 00 00` Physical address (not applicable)
 - `00 02 00 00 00 00 00 00` Size of this segment in the executable file = 512
 bytes
 - `00 02 00 00 00 00 00 00` Size of this segment when loaded into memory = also
 512 bytes
 - `00 10 00 00 00 00 00 00` Segment alignment = 4096 bytes
 That last field, segment alignment, is needed because on default-settings Linux
 each page (block) of memory is 4096 bytes long, and has to start at an address
 that is a multiple of 4096. Our program needs to be loaded into a memory page,
 so the page it lives in has to start at a multiple of 4096. We're using
 `0x400000`. But wait! Didn't we use `0x400078` for the virtual address? Well,
 yes, but that's because the *data in the file* is loaded to address `0x400078`.
 The actual page of memory that the OS will allocate for our code will start at
 `0x400000`. The reason we need to start `0x78` bytes in is that Linux expects
 the data *in the file* to sit at the same position within its page as it will
 once loaded, and it appears at offset `0x78` in our file. Don't worry if you
 didn't understand all of that.
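The relationship between file offset, virtual address and alignment can be checked numerically. The sketch below unpacks the 56 program header bytes from the dump (field names follow the usual `Elf64_Phdr` names) and verifies that the offset and the virtual address agree modulo the page size. It is only a sanity check of the numbers above, not a description of how the kernel loader is implemented.

```python
import struct

# Bytes 64..119 of the dump: the single program header.
phdr = bytes.fromhex(
    "01 00 00 00 07 00 00 00 78 00 00 00 00 00 00 00 "
    "78 00 40 00 00 00 00 00 00 00 00 00 00 00 00 00 "
    "00 02 00 00 00 00 00 00 00 02 00 00 00 00 00 00 "
    "00 10 00 00 00 00 00 00 "
)

(p_type, p_flags, p_offset, p_vaddr, p_paddr,
 p_filesz, p_memsz, p_align) = struct.unpack("<IIQQQQQQ", phdr)

# The page that holds the segment starts at the previous multiple of p_align.
page_start = p_vaddr - (p_vaddr % p_align)
print(hex(page_start))                         # 0x400000

# Position inside the page must match position inside the file.
print(p_vaddr % p_align == p_offset % p_align)  # True

# The segment covers the file from offset 0x78 to the very end (632 bytes).
print(p_offset + p_filesz)                     # 632
```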
 ## the code
 Now we get to the actual code in our executable (well there's a bit of data here
 too). We specified `0x400078` as the *entry point* of our executable, which
 means that the program will start executing from there. That virtual address
 corresponds to the start of the code right here:
 The first thing we want to do is open our input file, `A`:
 - `48 b8 74 02 40 00 00 00 00 00` `mov rax, 0x400274`
 - `48 89 c7` `mov rdi, rax`
 - `48 b8 00 00 00 00 00 00 00 00` `mov rax, 0`
 - `48 89 c6` `mov rsi, rax`
 - `48 89 c2` `mov rdx, rax`
 - `48 b8 02 00 00 00 00 00 00 00` `mov rax, 2`
 - `0f 05` `syscall`
 These instructions execute syscall `2` with arguments `0x400274`, `0`, `0`.
 If you're familiar with C code, this is `open("A", O_RDONLY, 0)`.
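Most of the ten-byte lines in this walkthrough are `mov rax, imm64`, which `instructions.txt` lists as `48 b8` followed by the 8-byte immediate. A tiny Python helper (the function name is made up for this illustration) shows how those byte strings come about:

```python
import struct

def mov_rax_imm64(value: int) -> bytes:
    """Encode `mov rax, imm64` as 48 b8 followed by 8 little-endian bytes."""
    # Wrap negative numbers into 64 bits (two's complement) before packing.
    return b"\x48\xb8" + struct.pack("<Q", value & 0xFFFFFFFFFFFFFFFF)

print(mov_rax_imm64(0x400274).hex(" "))  # 48 b8 74 02 40 00 00 00 00 00
print(mov_rax_imm64(2).hex(" "))         # 48 b8 02 00 00 00 00 00 00 00
```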
 A syscall is the mechanism which lets software ask the kernel to do things.
 [Here](https://filippo.io/linux-syscall-table/) is a nice table of syscalls you
 can look through if you're interested.
 Syscall #2, on Linux, is `open`. It's used to open a file. On Linux, you can
 read about it by running `man 2 open`.
 The first argument, `0x400274`, is a pointer to some data at the very end of
 this segment (scroll down). Specifically, it holds the byte `41` (ASCII `A`),
 followed by `00` (null byte). This indicates the name of the file, "A". The
 second argument (`O_RDONLY`, or 0) specifies that we will be reading from this
 file.  The third is only really needed when creating new files, but I've just
 set it to 0, why not.
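To see why `0x400274` points at the file name, you can translate the virtual address back into a file offset using the two numbers from the program header. The helper below is a hypothetical illustration; the only data it uses is the last row of the dump.

```python
SEGMENT_VADDR = 0x400078   # virtual address from the program header
SEGMENT_OFFSET = 0x78      # file offset from the program header

def file_offset(vaddr: int) -> int:
    """Map a virtual address inside the segment to an offset in hexcompile."""
    return vaddr - SEGMENT_VADDR + SEGMENT_OFFSET

# Last row of the dump; it starts at file offset 0x270 (39 full rows of 16).
LAST_ROW_OFFSET = 0x270
last_row = bytes.fromhex("00 00 00 00 41 00 42 00")

start = file_offset(0x400274) - LAST_ROW_OFFSET
end = last_row.index(0, start)        # stop at the null terminator
print(hex(file_offset(0x400274)))     # 0x274
print(last_row[start:end])            # b'A'
```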
 This call gives us back a *file descriptor*, used later to read from the file,
 in register `rax`.
 - `48 89 c5` `mov rbp, rax` Store the file descriptor for later
 Now we'll open the output file
 - `48 b8 76 02 40 00 00 00 00 00` `mov rax, 0x400276`
 - `48 89 c7` `mov rdi, rax`
 - `48 b8 41 00 00 00 00 00 00 00` `mov rax, 0x41`
 - `48 89 c6` `mov rsi, rax`
 - `48 b8 a4 01 00 00 00 00 00 00` `mov rax, 0o644`
 - `48 89 c2` `mov rdx, rax`
 - `48 b8 02 00 00 00 00 00 00 00` `mov rax, 2`
 - `0f 05` `syscall`
 These instructions execute the syscall `open("B", O_WRONLY|O_CREAT, 0644)`. This
 is similar to our first one, but with some important differences. First, the
 second argument specifies both that we are writing to a file (`0x01`), and that
 we want to create the file if it doesn't exist (`0x40`). Secondly, the third
 argument specifies the permissions that the file should be created with (`644` -
 user read/write, group read, others read). This here isn't particularly
 important to how the program works.
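The two magic numbers in that call, `0x41` and `0x1a4`, can be reproduced from Python's `os` and `stat` modules. The flag values printed here are the ones Linux uses on x86-64; other systems may number them differently, so take the first line as Linux-specific.

```python
import os
import stat

# O_WRONLY (0x01) combined with O_CREAT (0x40).
flags = os.O_WRONLY | os.O_CREAT
print(hex(flags))          # 0x41 on Linux x86-64

# The permission bits are written in octal; in hex they are a4 01 little-endian.
mode = 0o644
print(hex(mode))           # 0x1a4

# Render the mode the way `ls -l` would for a regular file.
print(stat.filemode(stat.S_IFREG | mode))  # -rw-r--r--
```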
 - `48 89 ef` `mov rdi, rbp`
 - `48 b8 68 02 40 00 00 00 00 00` `mov rax, 0x400268`
 - `48 89 c6` `mov rsi, rax`
 - `48 b8 03 00 00 00 00 00 00 00` `mov rax, 3`
 - `48 89 c2` `mov rdx, rax`
 - `48 b8 00 00 00 00 00 00 00 00` `mov rax, 0`
 - `0f 05` `syscall`
 Here we call syscall #0 (`read`) to read from a file. The arguments are:
 - `fd (rdi) = rbp` read from the file descriptor we stored away earlier
 - `buf (rsi) = 0x400268` output to a part of this segment I've left empty
 - `count (rdx) = 3` read 3 bytes
 The number of bytes *actually* read (taking into account the fact that we might
 have reached the end of the file) is stored in `rax`.
 Note that we read the entire file 3 bytes at a time, which is a *terrible* idea
 for performance. Syscalls take quite a while (3 microseconds or so, which would
 make this very slow for a several-megabyte file), so modern programs tend to
 read ~4KB at a time. But our programs will be small, and we don't care a lot
 about performance, so it's okay.
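To put rough numbers on that, here is a back-of-the-envelope calculation. The 3 microsecond figure is the estimate quoted above, the 3,000,000-byte file is an arbitrary example, and only `read` calls are counted (the program also does one `write` per output byte).

```python
SYSCALL_SECONDS = 3e-6   # rough per-syscall cost, as estimated above

def read_calls(file_size: int, chunk: int) -> int:
    """Number of read() calls needed to get through file_size bytes."""
    return (file_size + chunk - 1) // chunk   # integer ceiling division

size = 3_000_000  # an example "several-megabyte" input
for chunk in (3, 4096):
    calls = read_calls(size, chunk)
    print(chunk, calls, f"{calls * SYSCALL_SECONDS:.4f} s")
```

With 3-byte reads that comes to a million calls, or about three seconds of pure syscall overhead; with 4096-byte reads it is a few hundred calls.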
 - `48 89 c3` `mov rbx, rax`
 - `48 b8 03 00 00 00 00 00 00 00` `mov rax, 3`
 - `48 39 d8` `cmp rax, rbx`
 - `0f 8f 37 01 00 00` `jg 0x40024d`
 Together, these instructions say to jump to a different part of the code
 (explained later) if we ended up reading fewer than 3 bytes, i.e. we reached
 the end of the file. Note that rather than specifying the *address* to jump to,
 we specify the *relative address* (it's relative to the address of the first
 byte after the jump instruction). In other words, we're adding `0x137` to the
 program counter, `rip`. There are several reasons for encoding jumps this way,
 including saving space.
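Here is the relative-jump arithmetic spelled out in Python. The address of the byte after each jump (`0x400116` and `0x400248`) is not stated anywhere in this README; I got those by counting instruction bytes from the entry point, so treat them as my own working. The second example is the backward `e9 8f fe ff ff` jump near the end of the listing, whose offset is negative.

```python
def jump_target(next_instruction: int, rel32: bytes) -> int:
    """Target of a relative jump: address after the jump plus a signed offset."""
    return next_instruction + int.from_bytes(rel32, "little", signed=True)

# `0f 8f 37 01 00 00`: the jg ends at 0x400116 (derived by counting bytes).
print(hex(jump_target(0x400116, bytes.fromhex("37 01 00 00"))))  # 0x40024d

# `e9 8f fe ff ff`: the jmp ends at 0x400248 (derived by counting bytes).
print(hex(jump_target(0x400248, bytes.fromhex("8f fe ff ff"))))  # 0x4000d7
```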
 - `48 b8 68 02 40 00 00 00 00 00` `mov rax, 0x400268`
 - `48 89 c3` `mov rbx, rax`
 - `48 8b 03` `mov rax, qword [rbx]`
 This copies out 8 bytes of the data that was just read into the 64-bit register
 rax. We only read 3 bytes of data from the file, but the rest will just be
 zeros (because that's what we put at offset `0x268` of the file).
 - `48 89 c3` `mov rbx, rax`
 - `48 89 c7` `mov rdi, rax`
 Here we copy away this data for later use.
 - `48 b8 ff 00 00 00 00 00 00 00` `mov rax, 0xff`
 - `48 21 d8` `and rax, rbx`
 This grabs the first byte of data we read and stores it in `rax`. This will be
 the code of the first ASCII character of the hexadecimal number in our input
 file.
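In Python terms, the 8-byte load and the mask look roughly like this. The input `48 ` (two hex digits and a space) is just an example of what might be in the buffer after a read; the shift by 8 is the same trick the program uses further down to get at the second character.

```python
# Three bytes read from A, followed by the five zero bytes already in the buffer.
buffer = b"48 " + bytes(5)

# `mov rax, qword [rbx]` loads all 8 bytes as one little-endian number.
rax = int.from_bytes(buffer, "little")
print(hex(rax))                 # 0x203834

first = rax & 0xFF              # lowest byte = first character read
second = (rax >> 8) & 0xFF      # next byte = second character read
print(chr(first), chr(second))  # 4 8
```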
 - `48 89 c6` `mov rsi, rax`
 - `48 b8 39 00 00 00 00 00 00 00` `mov rax, 0x39 ('9')`
 - `48 89 c3` `mov rbx, rax`
 - `48 89 f0` `mov rax, rsi`
 - `48 39 d8` `cmp rax, rbx`
 - `0f 8f 1e 00 00 00` `jg 0x400173`
 These instructions compare that character code against the character code for
 `9`. If it's greater, then it's one of the hex digits `a` through `f`, which are
 handled separately later.
 - `48 b8 30 00 00 00 00 00 00 00` `mov rax, 0x30 ('0')`
 - `48 f7 d8` `neg rax`
 - `48 89 f3` `mov rbx, rsi`
 - `48 01 d8` `add rax, rbx`
 Subtract the character code for `0` from the character code we read in, to get
 the *number* corresponding to the first hex digit in the pair.
 - `e9 26 00 00 00` `jmp 0x400193`
 Go to a different part of the program (we'll get there later).
 - `00 00 00 00 00 00`
 Unneeded 0 bytes I left in, to make room in case I needed it.
 Now we get to the `a`-`f` handling code:
 - `48 b8 a9 ff ff ff ff ff ff ff` `mov rax, -87`
 - `48 89 f3` `mov rbx, rsi`
 - `48 01 d8` `add rax, rbx`
 - `e9 0b 00 00 00` `jmp 0x400193`
 - `00 00 00 00 00 00 00 00 00 00 00` (unused)
 If our character code is one of `abcdef`, we add `-87` to it (i.e. subtract
 `87`), to convert the character code to the numerical value of the digit. Here I
 decided to just set `rax` to the two's complement encoding for `-87`, but you
 could also use the `neg` instruction, like I did last time. <s>I just wanted to
 show two different ways of doing it</s> I thought of the better way the second
 time around.
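Both branches boil down to the following, written here as a Python sketch (the function name is mine). It also shows where the bytes `a9 ff ff ff ff ff ff ff` come from. As far as I can tell from the code above, uppercase `A`-`F` are not handled, so the sketch doesn't handle them either.

```python
def hex_digit_value(code: int) -> int:
    """Convert the ASCII code of 0-9 or a-f to its numeric value."""
    if code > ord("9"):     # the `jg` branch: assume a lowercase letter
        return code - 87    # 'a' is 97, so 'a' -> 10 ... 'f' -> 15
    return code - 48        # '0' is 48, so '0' -> 0 ... '9' -> 9

print([hex_digit_value(c) for c in b"09af"])  # [0, 9, 10, 15]

# -87 in 64-bit two's complement, stored little-endian:
MASK64 = 0xFFFFFFFFFFFFFFFF
print(hex(-87 & MASK64))                                  # 0xffffffffffffffa9
print((-87 & MASK64).to_bytes(8, "little").hex(" "))      # a9 ff ff ff ff ff ff ff
```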
 Now we get to `0x400193`, the common place we jumped to from both branches.
 - `48 89 c2` `mov rdx, rax`
 Store away the first digit in the pair into `rdx`.
 - `48 b8 ff 00 00 00 00 00 00 00` `mov rax, 0xff`
 - `48 89 c3` `mov rbx, rax`
 - `48 89 f8` `mov rax, rdi`
 - `48 c1 e8 08` `shr rax, 8`
 - `48 21 d8` `and rax, rbx`
 Now we extract the second character code we read from the file.
 The entire character code to number conversion is rewritten here, but slightly
 differently this time because I came up with some new ideas.
 - `48 93` `xchg rax, rbx`
 - `48 b8 39 00 00 00 00 00 00 00` `mov rax, 0x39 ('9')`
 - `48 93` `xchg rax, rbx`
 - `48 39 d8` `cmp rax, rbx`
 - `0f 8f 1f 00 00 00` `jg 0x4001e3 ('a'-'f' handling code)`
 - `48 89 c3` `mov rbx, rax`
 - `48 b8 d0 ff ff ff ff ff ff ff` `mov rax, -48`
 - `48 01 d8` `add rax, rbx`
 - `e9 2a 00 00 00` `jmp 0x400203`
 - `00 00 00 00 00 00 00 00 00 00` (unused)
 ('a'-'f' handling)
 - `48 89 c3` `mov rbx, rax`
 - `48 b8 a9 ff ff ff ff ff ff` `mov rax, -87`
 - `48 01 d8` `add rax, rbx`
 - `e9 0c 00 00 00` `jmp 0x400203`
 - `00 00 00 00 00 00 00 00 00 00 00 00` (unused)
 (common code)
 - `48 89 c7` `mov rdi, rax`
 Okay now we've read the first hex digit into `rdx`, and the second into `rdi`.
 - `48 89 d0` `mov rax, rdx`
 - `48 c1 e0 04` `shl rax, 4`
 - `48 89 fb` `mov rbx, rdi`
 - `48 09 d8` `or rax, rbx`
 Okay, now we have the full hexadecimal number in `rax`!
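The shift-and-or step is easy to try out in Python. The digits 4 and 8 are just an example, chosen because `48` is the first byte of `Hello, world!`.

```python
def combine(high: int, low: int) -> int:
    """Merge two 4-bit digit values into one byte, like `shl` then `or`."""
    return (high << 4) | low

byte = combine(4, 8)
print(hex(byte), chr(byte))  # 0x48 H
```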
 - `48 93` `xchg rax, rbx`
 - `48 b8 68 02 40 00 00 00 00 00` `mov rax, 0x400268`
 - `48 93` `xchg rax, rbx`
 - `48 89 03` `mov qword [rbx], rax`
 This stores the byte we want to write to the file at address `0x400268`. This is
 the same address we used to read in the input text; again, it's just part of
 this segment I've left blank.
 - `48 89 de` `mov rsi, rbx`
 - `48 b8 04 00 00 00 00 00 00 00` `mov rax, 4`
 - `48 89 c7` `mov rdi, rax`
 - `48 b8 01 00 00 00 00 00 00 00` `mov rax, 1`
 - `48 89 c2` `mov rdx, rax`
 - `0f 05` `syscall`
 Here we call syscall #1, `write`, with arguments:
 - `fd = 4` we could have stored away the file descriptor we got before for the
 output file, like we did with the input file, but I was out of easy-to-use
 registers! Instead, we can use the fact that Linux assigns file descriptors
 sequentially starting from 3 (0, 1, and 2 are standard input, output, and
 error), so we know our output file, the second file we opened, will have
 descriptor 4.
 - `buf = 0x400268` where we put our data
 - `count = 1` write 1 byte
 - `e9 8f fe ff ff` `jmp 0x4000d7`
 - `00 00 00 00 00` (unused)
 Now we go back to read in the next pair of digits! Finally...
 - `48 b8 3c 00 00 00 00 00 00 00` `mov rax, 0x3c`
 - `0f 05` `syscall`
 This is where we conditionally jumped to way back when we determined if we
 reached the end of the file. This just calls syscall #60, `exit`, to exit our
 program nicely. We didn't specify the exit code, but that's okay for our
 purposes.
 And we could close the files (syscall #3), to tell Linux we're done with them,
 but we don't need to. It'll close all our open file descriptors when our program
 exits.
 - `00 00 00 00 00 00 00 00 00 00 00 00 00 00 00` Unused bytes (I wasn't
 sure exactly how long the program would be)
 - `00 00 00 00 00 00 00 00` This is where we read/wrote the file data!
 - `00 00 00 00` More unused bytes
 - `41 00` Input file name, `"A"`
 - `42 00` Output file name, `"B"`
 That's quite a lot to take in for such a simple program, but here we are! We now
 have something that will let us write individual bytes with an ordinary text
 editor and get them translated into a binary file.
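If it helps, here is the whole algorithm again as a short Python sketch. It is my reading of the walkthrough above, not a replacement for the executable: it takes the input as a `bytes` object instead of reading `A`, and returns the output instead of writing `B`. Note two consequences of reading 3 bytes at a time: the third byte of every group is ignored whatever it is, and a final pair with no separator after it is dropped.

```python
def hex_digit_value(code: int) -> int:
    # Same two branches as the machine code: above '9' means a-f.
    return code - 87 if code > ord("9") else code - 48

def hexcompile(source: bytes) -> bytes:
    """Translate 'xx?' groups (two hex digits plus any one byte) to bytes."""
    out = bytearray()
    for pos in range(0, len(source), 3):
        chunk = source[pos:pos + 3]      # like read(fd, buf, 3)
        if len(chunk) < 3:               # short read: jump to exit
            break
        high = hex_digit_value(chunk[0])
        low = hex_digit_value(chunk[1])
        out.append(((high << 4) | low) & 0xFF)   # like write(4, buf, 1)
    return bytes(out)

print(hexcompile(b"48 65 6c 6c 6f\n"))  # b'Hello'
print(hexcompile(b"48 65"))             # b'H'
```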
--- a/00/README.txt
+++ b/00/README.txt
@ -1,147 +0,0 @@
 --- stage 00 ---
 This directory contains the file 'hexcompile', a handwritten executable.
 It takes an input file A containing space/newline/[any character]-separated
 hexadecimal numbers and outputs them as bytes to the file B. On 64-bit Linux,
 try running ./hexcompile from this directory (I've already provided an A file),
 and you will get a file named B containing the text "Hello, world!".
 I made this program so that you can use your favorite text editor to write
 executables by hand (which have bytes outside of ASCII/UTF-8).
 I wrote it with a program called hexedit, which can be found on most Linux
 distributions. Only 64-bit Linux is supported, because each OS/architecture
 combination would need its own separate executable. The executable is 632 bytes
 long, and you could definitely make it smaller if you wanted to. Let's take a
 look at what's inside (see hexdump -C hexcompile):
 7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00
 02 00 3e 00 01 00 00 00  78 00 40 00 00 00 00 00
 40 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 00 00 00 00 40 00 38 00  01 00 00 00 00 00 00 00
 01 00 00 00 07 00 00 00  78 00 00 00 00 00 00 00
 78 00 40 00 00 00 00 00  00 00 00 00 00 00 00 00
 00 02 00 00 00 00 00 00  00 02 00 00 00 00 00 00
 00 10 00 00 00 00 00 00  48 b8 74 02 40 00 00 00
 00 00 48 89 c7 48 b8 00  00 00 00 00 00 00 00 48
 89 c6 48 89 c2 48 b8 02  00 00 00 00 00 00 00 0f
 05 48 89 c5 48 b8 76 02  40 00 00 00 00 00 48 89
 c7 48 b8 41 00 00 00 00  00 00 00 48 89 c6 48 b8
 a4 01 00 00 00 00 00 00  48 89 c2 48 b8 02 00 00
 00 00 00 00 00 0f 05 48  89 c1 48 89 ef 48 b8 68
 02 40 00 00 00 00 00 48  89 c6 48 b8 03 00 00 00
 00 00 00 00 48 89 c2 48  b8 00 00 00 00 00 00 00
 00 0f 05 48 89 c3 48 b8  03 00 00 00 00 00 00 00
 48 39 d8 0f 8f 37 01 00  00 48 b8 68 02 40 00 00
 00 00 00 48 89 c3 48 8b  03 48 89 c3 48 89 c7 48
 b8 ff 00 00 00 00 00 00  00 48 21 d8 48 89 c6 48
 b8 39 00 00 00 00 00 00  00 48 89 c3 48 89 f0 48
 39 d8 0f 8f 1e 00 00 00  48 b8 30 00 00 00 00 00
 00 00 48 f7 d8 48 89 f3  48 01 d8 e9 26 00 00 00
 00 00 00 00 00 00 48 b8  a9 ff ff ff ff ff ff ff
 48 89 f3 48 01 d8 e9 0b  00 00 00 00 00 00 00 00
 00 00 00 00 00 00 48 89  c2 48 b8 ff 00 00 00 00
 00 00 00 48 89 c3 48 89  f8 48 c1 e8 08 48 21 d8
 48 93 48 b8 39 00 00 00  00 00 00 00 48 93 48 39
 d8 0f 8f 1f 00 00 00 48  89 c3 48 b8 d0 ff ff ff
 ff ff ff ff 48 01 d8 e9  2a 00 00 00 00 00 00 00
 00 00 00 00 00 00 48 89  c3 48 b8 a9 ff ff ff ff
 ff ff 48 01 d8 e9 0c 00  00 00 00 00 00 00 00 00
 00 00 00 00 00 00 48 89  c7 48 89 d0 48 c1 e0 04
 48 89 fb 48 09 d8 48 93  48 b8 68 02 40 00 00 00
 00 00 48 93 48 89 03 48  89 de 48 b8 04 00 00 00
 00 00 00 00 48 89 c7 48  b8 01 00 00 00 00 00 00
 00 48 89 c2 0f 05 e9 8f  fe ff ff 00 00 00 00 00
 48 b8 3c 00 00 00 00 00  00 00 0f 05 00 00 00 00
 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 00 00 00 00 41 00 42 00
 Okay, that doesn't tell us much, I'll annotate it below. You might notice that
 all the numbers are backwards, e.g. 3e 00 for the number 0x003e (62 decimal).
 This is because almost all modern architectures (including x86-64) are
 little-endian, meaning that the *least significant byte* goes first, and the
 most significant byte goes last. There are various reasons why this is easier to
 deal with, which I won't explain here.
 -- ELF header --
 This header has a bunch of metadata about the executable.
 7f 45 4c 46 - Special identifier saying that this is an ELF file (ELF is the
 format of almost all Linux executables)
 02 - 64-bit
 01 - Little-endian
 01 - ELF version 1 (there is no version 2 yet)
 00 00 00 00 00 00 00 00 00 - Reserved (not important yet, but may be in a later
 version of ELF)
 02 00 - This is an executable file (not a dynamic library/etc)
 3e 00 - Architecture x86-64
 01 00 00 00 - Version 1 of ELF (minor version or something)
 78 00 40 00 00 00 00 00 - **Entry point of the executable** = 0x400078 (explained later)
 40 00 00 00 00 00 00 00 - Program header table offset in bytes from start of file (see below)
 00 00 00 00 00 00 00 00 - Section header table offset (we're not using sections)
 00 00 00 00 - Flags (not important)
 40 00 - The size of this header, in bytes = 64
 38 00 - Size of the program header (see below) = 56
 01 00 - Number of program headers = 1
 00 00 - Size of each section header (unused)
 00 00 - Number of section headers (unused)
 00 00 - Index of special .shstrtab section (unused)
 -- Program header --
 The program header describes a segment of data that is loaded into memory when
 the program starts. Normally, you would have more than one of these, one for
 code, one for read-only data, and one for read-write data, perhaps, but to
 simplify things we've only got one, which we'll use for any code and any data
 we need. This means it'll have to be read-enabled, write-enabled, *and*
 execute-enabled. Normally people don't do this, for security, but we won't worry
 about that (don't compile any untrusted code with any compiler from this series!)
 Without further ado, here's the contents of the program header:
 01 00 00 00 - Segment type 1 (this should be loaded into memory)
 07 00 00 00 - Flags = RWE (readable, writeable, and executable)
 78 00 00 00 00 00 00 00 - Offset in file = 120
 78 00 40 00 00 00 00 00 - Virtual address = 0x400078
 - Wait a minute, what's that? -
 We just specified the *virtual address* of this segment. This is the virtual
 memory address that the segment will be loaded to. Virtual memory means that
 memory addresses in our program do not actually correspond to where the memory
 is physically stored in RAM. There are many reasons for it, including allowing
 different processes to have overlapping memory addresses, making sure that some
 memory can't be read/written/executed, etc. You can read more about it
 elsewhere.
 00 00 00 00 00 00 00 00 - Physical address (not applicable)
 00 02 00 00 00 00 00 00 - Size of this segment in the executable file = 512
 bytes
 00 02 00 00 00 00 00 00 - Size of this segment when loaded into memory = also
 512 bytes
 00 10 00 00 00 00 00 00 - Segment alignment = 4096 bytes
 48 b8 74 02 40 00 00 00
 00 00 48 89 c7 48 b8 00  00 00 00 00 00 00 00 48
 89 c6 48 89 c2 48 b8 02  00 00 00 00 00 00 00 0f
 05 48 89 c5 48 b8 76 02  40 00 00 00 00 00 48 89
 c7 48 b8 41 00 00 00 00  00 00 00 48 89 c6 48 b8
 a4 01 00 00 00 00 00 00  48 89 c2 48 b8 02 00 00
 00 00 00 00 00 0f 05 48  89 c1 48 89 ef 48 b8 68
 02 40 00 00 00 00 00 48  89 c6 48 b8 03 00 00 00
 00 00 00 00 48 89 c2 48  b8 00 00 00 00 00 00 00
 00 0f 05 48 89 c3 48 b8  03 00 00 00 00 00 00 00
 48 39 d8 0f 8f 37 01 00  00 48 b8 68 02 40 00 00
 00 00 00 48 89 c3 48 8b  03 48 89 c3 48 89 c7 48
 b8 ff 00 00 00 00 00 00  00 48 21 d8 48 89 c6 48
 b8 39 00 00 00 00 00 00  00 48 89 c3 48 89 f0 48
 39 d8 0f 8f 1e 00 00 00  48 b8 30 00 00 00 00 00
 00 00 48 f7 d8 48 89 f3  48 01 d8 e9 26 00 00 00
 00 00 00 00 00 00 48 b8  a9 ff ff ff ff ff ff ff
 48 89 f3 48 01 d8 e9 0b  00 00 00 00 00 00 00 00
 00 00 00 00 00 00 48 89  c2 48 b8 ff 00 00 00 00
 00 00 00 48 89 c3 48 89  f8 48 c1 e8 08 48 21 d8
 48 93 48 b8 39 00 00 00  00 00 00 00 48 93 48 39
 d8 0f 8f 1f 00 00 00 48  89 c3 48 b8 d0 ff ff ff
 ff ff ff ff 48 01 d8 e9  2a 00 00 00 00 00 00 00
 00 00 00 00 00 00 48 89  c3 48 b8 a9 ff ff ff ff
 ff ff 48 01 d8 e9 0c 00  00 00 00 00 00 00 00 00
 00 00 00 00 00 00 48 89  c7 48 89 d0 48 c1 e0 04
 48 89 fb 48 09 d8 48 93  48 b8 68 02 40 00 00 00
 00 00 48 93 48 89 03 48  89 de 48 b8 04 00 00 00
 00 00 00 00 48 89 c7 48  b8 01 00 00 00 00 00 00
 00 48 89 c2 0f 05 e9 8f  fe ff ff 00 00 00 00 00
 48 b8 3c 00 00 00 00 00  00 00 0f 05 00 00 00 00
 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 00 00 00 00 41 00 42 00
--- a/00/hexcompile
+++ b/00/hexcompile
--- a/README.md
+++ b/README.md
@ -0,0 +1,99 @@
 # bootstrapping a (Linux x86-64) C compiler
 Compilers nowadays are written in languages like C, which themselves need to be
 compiled. But then, you need a C compiler to compile your C compiler! Of course,
 the very first C compiler was not written in C (because how would it be
 compiled?). Instead, it was slowly built up, starting from a very basic
 assembler, eventually reaching a full-scale compiler. This process is known as
 bootstrapping. In this repository, we'll explore how that's done. Each directory
 represents a new "stage" in the process. The first one, `00`, is a hand-written
 executable, and the last one will be a C compiler. Each directory has its own
 README explaining what's going on.
 You can run `bootstrap.sh` to run through and test every stage.
 ## the basics
 In this series, I want to explain *everything* that's going on. I'm going to
 need to assume some passing knowledge about computers, but here's a quick
 overview of what you'll want to know before starting. I can't explain everything
 so you may need to do your own research. You don't need to understand each of
 these in full, just get a general idea at least:
 - what an operating system is
 - what memory is
 - what a programming language is
 - what a compiler is
 - what an executable file is
 - number bases -- if a number is preceded by 0x, 0o, or 0b in this series, that
 means hexadecimal/octal/binary respectively. So 0xff = FF hexadecimal = 255
 decimal.
 - what a CPU is
 - what a CPU architecture is
 - what a CPU register is
 - what a pointer is
 - bits, bytes, kilobytes, etc.
 - bitwise operations (not, or, and, xor, left shift, right shift)
 - 2's complement
 - null-terminated strings
 - how floating-point numbers work
 - maybe some basic Intel-style x86-64 assembly (you can probably pick it up on
 the way though)
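Python happens to use the same `0x`/`0o`/`0b` prefixes, so it makes a handy calculator for the number-base and bitwise items in the list above. This is only a convenience; the series itself doesn't use Python.

```python
# The same number written in different bases.
print(0xff, 0o644, 0b1010)           # 255 420 10
print(hex(255), oct(420), bin(10))   # 0xff 0o644 0b1010

# A few bitwise operations on small values.
print(0b1100 & 0b1010, 0b1100 | 0b1010, 0b1100 ^ 0b1010)  # 8 14 6
print(1 << 4, 0x80 >> 3)             # 16 16
```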
 ## instruction set
 x86-64 has a *gigantic* instruction set. The manual for it is over 2,000 pages
 long! So, it makes sense to select only a small subset of it to use for all the
 stages of our compiler. The set I've chosen can be found in `instructions.txt`.
 I think it achieves a pretty good balance between having few enough
 instructions to be manageable and having enough instructions to be useable.
 To be clear, you don't need to read that file to understand the series, at least
 not right away.
 ## principles
 - as simple as possible
 Bootstrapping a compiler is not an easy task, so we're trying to make it as easy
 as possible. We don't even necessarily need a standard-compliant C compiler; we
 only need enough to compile someone else's C compiler, specifically TCC
 (https://bellard.org/tcc/) since that's a compiler with very few dependencies.
 - efficiency is not a concern
 We will create big and slow executables, and that's okay. It doesn't really
 matter if compiling TCC takes 8 as opposed to 0.01 seconds; once we compile TCC
 with itself, we'll get the same executable either way.
 ## reflections on trusting trust
 In 1984, Ken Thompson wrote the well-known article
 [*Reflections on Trusting Trust*](http://users.ece.cmu.edu/~ganger/712.fall02/papers/p761-thompson.pdf).
 This is one of the things that inspired me to start this project. To summarize
 the article: it is possible to create a malicious C compiler which will
 replicate its own malicious functionalities (e.g. detecting password-checking
 routines to make them also accept another password the attacker knows) when used
 to compile other C compilers. For all we know, such a compiler was used to
 compile GCC, say, and so all programs around today could be compromised. Of
 course, this is almost certainly not the case, but it's still an
 interesting experiment to try to create a fully trustable compiler.  This
 project can't necessarily even do that though, because the Linux kernel, which
 we depend on, is compiled from C, so we can't fully trust *it*. To *truly*
 create a fully trustable compiler, you'd need to manually write to a USB with a
 circuit, create an operating system from nothing (without even a text editor),
 and then follow this series, or maybe you don't even trust your CPU vendor...
 I'll leave that to someone else.
 ## license
 ```
 This project is in the public domain. Any copyright protections from any law
 for this project are forfeited by the author(s). No warranty is provided for
 this project, and the author(s) shall not be held liable in connection with it.
 ```
 ## contributing
 If you notice a mistake/want to clarify something, you can submit a pull request
 via GitHub, or email `pommicket at pommicket.com`. Translations are welcome!
--- a/README.txt
+++ b/README.txt
@ -1,25 +0,0 @@
 --- boostrapping a (Linux x86-64) C compiler ---
 Compilers nowadays are written in languages like C, which themselves need to be
 compiled. But then, you need a C compiler to compile your C compiler! Of course,
 the very first C compiler was not written in C (because how would it be
 compiled?). Instead, it was slowly built up, starting from a very basic
 assembler, eventually reacing a full-scale compiler. This process is known as
 bootstrapping. In this repository, we'll explore how that's done. Each directory
 represents a new "stage" in the process. The first one, "00", is a hand-written
 executable, and the last one will be a C compiler. Each directory has its own
 README.txt explaining in full what's going on.
 -- instruction set --
 x86-64 has a *gigantic* instruction set. The manual for it is over 2,000 pages
 long! So, it makes sense to select only a small subset of it to use for all the
 stages of our compiler. The set I've chosen can be found in instructions.txt (a
 work in progress). I think it achieves a pretty good balance between 
 having few enough instructions to be manageable and having enough
 instructions to be useable.
 -- license --
 This software is in the public domain. Any copyright protections from any law
 for this software are forfeited by the author(s). No warranty is provided for
 this software, and the author(s) shall not be held liable in connection with it.
--- a/bootstrap.sh
+++ b/bootstrap.sh
@ -0,0 +1,39 @@
 #!/bin/sh
 # check OS/architecture
 esc() {
 	: # comment out the following line to disable color output
 	printf '\33[%dm' "$1"
 }
 echo_red() {
 	esc 31
 	echo "$1"
 	esc 0
 }
 echo_green() {
 	esc 32
 	echo "$1"
 	esc 0
 }
 if uname -a | grep -i 'x86_64' | grep -i -q 'linux'; then
 	: # all good
 else
 	echo_red "Only 64-bit Linux is supported. This doesn't seem to be 64-bit Linux."
 	exit 1
 fi
 cd 00
 rm -f B
 ./hexcompile A
 if [ "$(cat B)" != 'Hello, world!' ]; then
 	echo_red 'Stage 00 failed.'
 	exit 1
 fi
 rm -f B
 cd ..
 echo_green 'Done all stages!'
--- a/instructions.txt
+++ b/instructions.txt
@ -1,7 +1,9 @@
-SYSCALL CALLING CONVENTION
+Linux syscall calling convention:
-rdi rsi rdx r10 r8 r9
+rax  - syscall number
 rdi, rsi, rdx, r10, r8, r9  - arguments
 return value placed in rax
 Instruction set:
 mov rax, imm64
 >48 b8 IMM64