diff --git a/blog-post-3.md b/blog-post-3.md
index e69de29..d31d766 100644
--- a/blog-post-3.md
+++ b/blog-post-3.md
@@ -0,0 +1,202 @@
+# Blog Entry #3 — Function Calls in Spider
+
+In this entry I will explain how Spider handles function calls at the
+VM level. This is one of the most important concepts in the language
+because it defines how functions communicate with each other, how
+arguments are passed, and how the machine state is preserved and
+restored after every call.
+
+---
+
+## What is a Calling Convention?
+
+A calling convention is a set of rules that defines:
+
+- How arguments are passed to a function
+- Where the return value is stored
+- Which registers must be preserved across a call
+- Who is responsible for cleaning up after the call
+
+Without this agreement, two parts of a program could make conflicting
+assumptions and corrupt each other's data. Spider's calling convention
+is designed to be efficient by prioritizing registers over the stack,
+which is critical for constrained hardware like the ATmega328p (2KB RAM).
+
+---
+
+## The Registers Involved
+
+Before explaining the steps, it helps to know which registers play
+a role in a function call:
+
+| Register         | Role                                          |
+|------------------|-----------------------------------------------|
+| `RA`             | First argument and return value               |
+| `RB`, `RC`, `RD` | Second, third, fourth arguments               |
+| `R8`, `R9`       | Fifth and sixth arguments                     |
+| `R0`–`R3`        | Caller-saved: may be destroyed by the callee  |
+| `R4`–`R7`        | Callee-saved: must be restored by the callee  |
+| `RS`             | Stack pointer, tracks the top of the stack    |
+| `RZ`             | Stack base pointer, reference for local vars  |
+
+---
+
+## Step by Step: Making a Function Call
+
+Consider this high-level Spider call:
+```
+result = square(x)
+```
+
+The compiler translates this into the following sequence of operations:
+
+### Step 1 — Save caller-saved registers
+
+Before touching anything, the caller pushes R0–R3 onto the stack.
+These registers may be freely overwritten by the called function,
+so if the caller needs their values after the call, they must be
+saved first.
+```
+Stack: [R0][R1][R2][R3] ← RS points here
+```
+
+### Step 2 — Handle large return values
+
+If the function returns more than 16 bytes (128 bits), the caller
+reserves space on the stack for the result and adds a pointer to
+that space at the front of the argument list. For normal cases
+(return value fits in registers), this step is skipped.
+
+### Step 3 — Place arguments in registers
+
+Arguments are placed in registers in order: RA, RB, RC, RD, R8, R9.
+For our example, `x` goes into `RA`:
+```
+RA = x
+```
+
+**Special case — Booleans:**
+Boolean values (1 bit) are too small to occupy a full register each.
+Instead they are accumulated in a queue and packed together before
+being placed in a single register. If the queue fills 64 bits it is
+sent immediately. Any remaining booleans are zero-padded and sent
+when arguments are exhausted.
+
+### Step 4 — Overflow arguments go to the stack
+
+If there are more arguments than available registers (more than 6),
+the remaining ones are pushed onto the stack in this order:
+
+1. Small arguments (up to 8 bytes) are pushed first
+2. Large arguments (more than 8 bytes) are pushed last
+```
+Stack: [R0][R1][R2][R3][arg7][arg8] ← RS points here
+```
+
+### Step 5 — Execute CALL
+
+The compiler emits the `CALL` instruction with the function's address.
+This automatically pushes the return address onto the stack and jumps
+to the function. `RI` (the instruction register) now points to the
+first instruction of the called function.
+```
+Stack: [R0][R1][R2][R3][return address] ← RS points here
+RI → points to square()
+RZ → updated to the new stack frame base
+```
+
+### Step 6 — The function executes
+
+Inside `square()`, the function reads its arguments from RA, RB, etc.,
+does its work, and places the return value in `RA` before returning.
+The callee-saved registers R4–R7, if used, must be restored before
+returning.
+
+### Step 7 — Return
+
+The function executes `RET`. This pops the return address from the
+stack and jumps back to the instruction immediately after the `CALL`.
+```
+RI → back to the instruction after CALL
+RZ → restored to the caller's frame
+```
+
+### Step 8 — Collect the result
+
+The caller reads the result from `RA`. For return values up to 16 bytes,
+RA, RB, RC and RD are used. For larger values, the result was already
+written to the space reserved in Step 2.
+```
+result = RA
+```
+
+### Step 9 — Clean up the stack
+
+The caller removes any arguments that were pushed to the stack in Step 4,
+and pops the caller-saved registers in reverse order to restore them.
+```
+Stack: [] ← clean, exactly as before the call
+RS = 0
+R0, R1, R2, R3 restored to their original values
+```
+
+---
+
+## Complete State Diagram
+```
+BEFORE CALL:
+  RA = ?    RB = ?    RS → bottom of stack
+  R0 = 5    R1 = 3    Stack: []
+
+AFTER do_function_call(x):
+  RA = x              ← argument placed
+  RS → 5              ← 4 caller-saved + return address
+  Stack: [R0][R1][R2][R3][return addr]
+
+INSIDE square():
+  RA = x * x          ← result computed
+  RZ → new frame base
+
+AFTER undo_function_call():
+  RA = result         ← return value
+  R0 = 5              ← restored
+  R1 = 3              ← restored
+  RS → 0              ← stack clean
+  Stack: []
+```
+
+---
+
+## Why registers instead of the stack?
+
+Many virtual machines (like the JVM) are stack-based, meaning almost
+every operation pushes and pops values from the stack. Spider instead
+uses registers as the primary mechanism for passing data.
+
+The reason is performance and memory efficiency. On a microcontroller
+with only 2KB of RAM, every push and pop costs precious memory and
+time. Registers are fixed, always available, and require no memory
+allocation. A well-written Spider program can make complex function
+calls with almost no stack usage beyond saving the caller-saved registers.
+
+---
+
+## Python Implementation
+
+As part of this internship, I implemented a Python simulation of this
+calling convention in `calling-convention/calling-convention.ipynb`.
+The simulation models the Spider VM as a Python dictionary with all
+registers and a stack represented as a list.
+
+The two key functions implemented are:
+
+- `do_function_call(input_params, output_return)` — simulates all steps
+  before and including the CALL instruction
+- `undo_function_call(input_params, output_return)` — simulates all steps
+  after the CALL, collecting the result and restoring the machine state
+
+Three test cases were validated: a single byte argument, multiple
+arguments overflowing to the stack, and packed boolean arguments.
+All three confirmed that the stack returns to its exact original state
+after every call, proving the algorithm correctly preserves the machine
+context.
\ No newline at end of file
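The register-first flow described in Steps 1 to 9 can be sketched in a few lines of Python. This is not the notebook's code, only a minimal illustration under stated assumptions: the `new_vm` helper, the `vm` dictionary layout, and the simplified signatures `do_function_call(vm, args, return_address)` and `undo_function_call(vm, overflow_count)` are hypothetical and differ from the signatures listed in the post. Arguments are treated as small integers, and boolean packing, large return values (Step 2), and the `RZ` frame pointer are left out.

```python
# Hypothetical sketch of the calling convention described in the post.
# Register names come from the post; everything else is an assumption.
ARG_REGS = ["RA", "RB", "RC", "RD", "R8", "R9"]   # argument order (Step 3)
CALLER_SAVED = ["R0", "R1", "R2", "R3"]           # pushed by the caller (Step 1)


def new_vm():
    """Model the VM as a dict holding the registers and a list used as the stack."""
    regs = {name: 0 for name in ARG_REGS + CALLER_SAVED}
    return {"regs": regs, "stack": []}


def do_function_call(vm, args, return_address):
    """Steps 1, 3, 4 and 5: save registers, place arguments, push the return address."""
    regs, stack = vm["regs"], vm["stack"]
    for name in CALLER_SAVED:                  # Step 1: save R0-R3
        stack.append(regs[name])
    for name, value in zip(ARG_REGS, args):    # Step 3: first six args in registers
        regs[name] = value
    overflow = list(args[len(ARG_REGS):])      # Step 4: the rest go to the stack
    stack.extend(overflow)
    stack.append(return_address)               # Step 5: what CALL pushes
    return len(overflow)


def undo_function_call(vm, overflow_count):
    """Steps 7 to 9: pop the return address, read RA, clean up, restore R0-R3."""
    regs, stack = vm["regs"], vm["stack"]
    return_address = stack.pop()               # Step 7: what RET pops
    result = regs["RA"]                        # Step 8: result is in RA
    for _ in range(overflow_count):            # Step 9: drop overflow arguments
        stack.pop()
    for name in reversed(CALLER_SAVED):        # Step 9: restore in reverse order
        regs[name] = stack.pop()
    return result, return_address


vm = new_vm()
vm["regs"]["R0"], vm["regs"]["R1"] = 5, 3
overflow_count = do_function_call(vm, [7], return_address=0x10)
vm["regs"]["RA"] = vm["regs"]["RA"] ** 2       # stand-in for the body of square()
vm["regs"]["R0"] = 99                          # the callee may clobber R0-R3
result, return_address = undo_function_call(vm, overflow_count)
print(result, vm["regs"]["R0"], vm["stack"])   # 49 5 []
```

Returning the overflow count from `do_function_call` is a choice made for this sketch so that the cleanup in Step 9 knows how many stack slots to drop; a real implementation could equally derive that number from the function's signature.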