It's me, bugging around again.
I finally got back to work on my compilers-related series, while it is focused on people untouched by automatons, VMs and compilers in general (or more likely the ones that are using it, while not knowing what goes inside), I tried to keep code-base quite small (and still manageable and readable by average viewer).
The funny thing around this is, that I've never actually worked on any big compiler (like GCC or such). So the series might give a look of someone who tried to understand how it works and how to do it, with formal languages and automatons theory in head (or at least whatever stayed there after university). While doing the compiler by myself as a form of education and challenge, I realized how little of this is covered in general - and that a lot of people don't fully understand what goes on behind the scenes - which motivated me to learn about compilers and languages more in theory and write at least a few-article series to introduce others into this problematic (and possibly help them avoid a lot of troubles I've hit while making my first compiler).
Actually the original final version of the compiler (even before I started working on articles) was able to compile something like this:int test(int* x){ x[0] = *x * 2; return -1;}int mult(int** x){ int* y = x[0]; y[0] = x[0][0] * 5; return -1;}int main(){ int z[3] = {1, 2, 3}; int x = 7; int y = test(&x); y = mult(&z); while (x > 0) { y = y * 2; x = x - 1; } return z[0];}
Into an assembly like this:.data.texttest: mov eax, 0 push eax mov ebx, [ebp+0] mov eax, ebx mov eax, [eax] push eax mov eax, 2 pop ebx mul eax, ebx pop ebx mov edx, 4 mul ebx, edx mov edx, [ebp+0] mov [ebx+edx], eax mov eax, 1 neg eax pop ebx push eax mov eax, 1 add ebx, eax mov eip, ebxmult: mov eax, 0 mov ebx, 4 mul eax, ebx mov ebx, [ebp+0] mov eax, [eax+ebx] mov [ebp+8], eax add esp, 4 mov eax, 0 push eax mov eax, 0 mov ebx, 4 mul eax, ebx mov ebx, [ebp+0] mov eax, [eax+ebx] mov ebx, eax mov eax, 0 mul eax, 4 mov eax, [ebx+eax] push eax mov eax, 5 pop ebx mul eax, ebx pop ebx mov edx, 4 mul ebx, edx mov edx, [ebp+8] mov [ebx+edx], eax mov eax, 1 neg eax sub esp, 4 pop ebx push eax mov eax, 1 add ebx, eax mov eip, ebxmain: mov eax, esp add eax, 4 mov [ebp+4], eax add esp, 4 mov eax, 1 push eax mov eax, 2 push eax mov eax, 3 push eax mov eax, 7 mov [ebp+20], eax add esp, 4 push ebp mov ecx, esp mov ebx, [ebp+20] mov eax, ebx mov eax, edx push eax mov ebp, ecx push eip call test pop eax sub esp, 4 mov esp, ebp pop ebp mov [ebp+24], eax add esp, 4 push ebp mov ecx, esp mov ebx, [ebp+4] mov eax, ebx mov eax, edx push eax mov ebp, ecx push eip call mult pop eax sub esp, 4 mov esp, ebp pop ebp mov [ebp+24], eaxL0: mov ebx, [ebp+20] mov eax, ebx push eax mov eax, 0 pop ebx sub eax, ebx neg eaxjle L1 mov ebx, [ebp+24] mov eax, ebx push eax mov eax, 2 pop ebx mul eax, ebx mov [ebp+24], eax mov ebx, [ebp+20] mov eax, ebx push eax mov eax, 1 pop ebx sub eax, ebx neg eax mov [ebp+20], eax jmp L0L1: mov eax, 0 mov ebx, 4 mul eax, ebx mov ebx, [ebp+4] mov eax, [eax+ebx] sub esp, 24 pop ebx push eax mov eax, 1 add ebx, eax mov eip, ebx__start: push ebp mov ebp, esp push eip call main pop eax sub esp, 0 mov esp, ebp pop ebp
While it was structured as procedural-based compilers it wasn't good enough to be presented. So for article purposes I tried to re-work everything in better structure from scratch. Not covering just practical implementation, but hitting a theory from time to time.
At that point I realized what all I want to put in the article(s):
- Arithmetic math operations on integers
- Arithmetic math on pointers and their reference/dereference operators
- Variables and arrays
- Standard control constructs (if, do-while, while, for)
- Function calling (e.g. define and use a call convention)
- Interaction with host application (the one running VM)
- More types - shorts, bytes and floats (and of course variable promotion)
So far, the articles are still somewhere in the middle, slowly getting finished. Although I'm already planning what is going to be next, but let's keep that for next entry.