blog 3

2026-03-09 17:10:07 -06:00
parent 98d1cdee0a
commit f681e7ef53
1 changed files with 202 additions and 0 deletions
--- a/blog-post-3.md
+++ b/blog-post-3.md
@@ -0,0 +1,202 @@
+# Blog Entry #3 — Function Calls in Spider
+
+In this entry I will explain how Spider handles function calls at the 
+VM level. This is one of the most important concepts in the language 
+because it defines how functions communicate with each other, how 
+arguments are passed, and how the machine state is preserved and 
+restored after every call.
+
+---
+
+## What is a Calling Convention?
+
+A calling convention is a set of rules that defines:
+
+- How arguments are passed to a function
+- Where the return value is stored
+- Which registers must be preserved across a call
+- Who is responsible for cleaning up after the call
+
+Without this agreement, two parts of a program could make conflicting 
+assumptions and corrupt each other's data. Spider's calling convention 
+favors registers over the stack for efficiency, which matters on 
+constrained hardware like the ATmega328P (2 KB of SRAM).
+
+---
+
+## The Registers Involved
+
+Before explaining the steps, it helps to know which registers play 
+a role in a function call:
+
+| Register         | Role                                               |
+|------------------|----------------------------------------------------|
+| `RA`             | First argument and return value                    |
+| `RB`, `RC`, `RD` | Second, third, and fourth arguments                |
+| `R8`, `R9`       | Fifth and sixth arguments                          |
+| `R0`–`R3`        | Caller-saved: may be overwritten by the callee     |
+| `R4`–`R7`        | Callee-saved: must be restored by the callee       |
+| `RS`             | Stack pointer, tracks the top of the stack         |
+| `RZ`             | Stack base pointer, reference for local variables  |
+| `RI`             | Instruction register, points to the instruction to execute |
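
To make the table concrete, here is a minimal sketch of a machine state in Python. The register names come from the table; the dictionary layout, the constant names, and the `make_vm` helper are assumptions made for illustration, not the actual Spider implementation.

```python
# Hypothetical layout: only the register names are taken from the table.
ARG_REGISTERS = ["RA", "RB", "RC", "RD", "R8", "R9"]   # arguments 1 to 6
CALLER_SAVED = ["R0", "R1", "R2", "R3"]                # callee may overwrite
CALLEE_SAVED = ["R4", "R5", "R6", "R7"]                # callee must restore


def make_vm():
    """Return a fresh machine state: every register zeroed, empty stack."""
    names = ARG_REGISTERS + CALLER_SAVED + CALLEE_SAVED + ["RS", "RZ"]
    vm = {name: 0 for name in names}
    vm["stack"] = []   # the stack is modeled as a plain Python list
    return vm


vm = make_vm()
```

Keeping the three register groups as separate lists makes the later steps (save, pass arguments, restore) simple loops over the right group.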
+
+---
+
+## Step by Step: Making a Function Call
+
+Consider this high-level Spider call:
+```
+result = square(x)
+```
+
+The compiler translates this into the following sequence of operations:
+
+### Step 1 — Save caller-saved registers
+
+Before touching anything, the caller pushes R0–R3 onto the stack. 
+These registers may be freely overwritten by the called function, 
+so if the caller needs their values after the call, they must be 
+saved first.
+```
+Stack: [R0][R1][R2][R3]  ← RS points here
+```
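
Step 1 can be sketched as follows. This assumes the machine state is a dictionary with a list-based stack, and that `RS` simply counts occupied stack slots; `save_caller_saved` is a hypothetical helper name.

```python
CALLER_SAVED = ["R0", "R1", "R2", "R3"]


def save_caller_saved(vm):
    """Push R0..R3 in order and keep RS equal to the number of stack slots."""
    for name in CALLER_SAVED:
        vm["stack"].append(vm[name])
    vm["RS"] = len(vm["stack"])   # assumption: RS is a slot count


vm = {"R0": 5, "R1": 3, "R2": 0, "R3": 0, "RS": 0, "stack": []}
save_caller_saved(vm)
print(vm["stack"], vm["RS"])  # [5, 3, 0, 0] 4
```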
+
+### Step 2 — Handle large return values
+
+If the function returns more than 16 bytes (128 bits), the caller 
+reserves space on the stack for the result and adds a pointer to 
+that space at the front of the argument list. For normal cases 
+(return value fits in registers), this step is skipped.
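
A minimal sketch of this rule, under stated assumptions: the reserved space is modeled as a single `bytearray` pushed onto a list-based stack, and the "pointer" is just the index of that slot. The helper name `prepare_return_slot` is hypothetical.

```python
MAX_REGISTER_RETURN = 16  # bytes; the limit stated in Step 2


def prepare_return_slot(vm, return_size, args):
    """Reserve stack space for an oversized result and pass its address first."""
    if return_size <= MAX_REGISTER_RETURN:
        return list(args)                 # normal case: the step is skipped
    slot = len(vm["stack"])               # assumed addressing: stack index
    vm["stack"].append(bytearray(return_size))
    return [slot] + list(args)            # pointer goes to the front


vm = {"stack": []}
print(prepare_return_slot(vm, 8, [7]))    # [7]
print(prepare_return_slot(vm, 32, [7]))   # [0, 7]
```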
+
+### Step 3 — Place arguments in registers
+
+Arguments are placed in registers in order: RA, RB, RC, RD, R8, R9.
+For our example, `x` goes into `RA`:
+```
+RA = x
+```
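
The register assignment can be sketched as a loop over the argument registers in order. The helper name `place_arguments` and the dictionary-based machine state are assumptions; arguments that do not fit are returned so that a later step can push them.

```python
ARG_REGISTERS = ["RA", "RB", "RC", "RD", "R8", "R9"]


def place_arguments(vm, args):
    """Copy the first six arguments into registers; return the rest."""
    for name, value in zip(ARG_REGISTERS, args):
        vm[name] = value
    return list(args[len(ARG_REGISTERS):])


vm = {name: 0 for name in ARG_REGISTERS}
leftover = place_arguments(vm, [7])
print(vm["RA"], leftover)  # 7 []
```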
+
+**Special case — Booleans:**
+A boolean carries a single bit, so giving each one its own register 
+would waste argument registers. Instead, booleans are collected in a 
+queue and packed together into one register. When the queue reaches 
+64 bits it is emitted immediately as the next argument; once the 
+arguments are exhausted, any remaining bits are zero-padded and 
+emitted the same way.
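
The packing rule can be illustrated with a short sketch. The bit order (first boolean in the lowest bit) is an assumption, since the text does not specify it, and `pack_booleans` is a hypothetical name.

```python
WORD_BITS = 64  # queue size stated in the text


def pack_booleans(flags):
    """Pack booleans into 64-bit words, first flag in the lowest bit (assumed)."""
    words, current, count = [], 0, 0
    for flag in flags:
        if flag:
            current |= 1 << count
        count += 1
        if count == WORD_BITS:        # queue full: emit it immediately
            words.append(current)
            current, count = 0, 0
    if count:                         # leftover bits; unused high bits stay zero
        words.append(current)
    return words


print(pack_booleans([True, False, True]))  # [5]
```

Each returned word would then be treated like any other argument and placed in the next free argument register.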
+
+### Step 4 — Overflow arguments go to the stack
+
+If there are more arguments than available registers (more than 6), 
+the remaining ones are pushed onto the stack in this order:
+
+1. Small arguments (up to 8 bytes) are pushed first
+2. Large arguments (more than 8 bytes) are pushed last
+```
+Stack: [R0][R1][R2][R3][arg7][arg8]  ← RS points here
+```
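
A sketch of the push order, assuming each overflow argument is described by a `(value, size_in_bytes)` pair and occupies one slot of a list-based stack regardless of its size. Both simplifications, and the name `push_overflow`, are assumptions.

```python
SMALL_LIMIT = 8  # bytes; boundary between small and large arguments


def push_overflow(vm, overflow):
    """Push small arguments first, then large ones; return the slot count."""
    small = [value for value, size in overflow if size <= SMALL_LIMIT]
    large = [value for value, size in overflow if size > SMALL_LIMIT]
    for value in small + large:
        vm["stack"].append(value)
    vm["RS"] = len(vm["stack"])
    return len(small) + len(large)    # the caller pops this many slots later


vm = {"RS": 0, "stack": []}
pushed = push_overflow(vm, [("big", 16), ("a", 4), ("b", 8)])
print(pushed, vm["stack"])  # 3 ['a', 'b', 'big']
```

Returning the number of pushed slots gives the caller what it needs to remove exactly those arguments after the call.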
+
+### Step 5 — Execute CALL
+
+The compiler emits a `CALL` instruction with the function's address.
+When it executes, `CALL` pushes the return address onto the stack and 
+jumps to the function, so `RI` (the instruction register) now points 
+to the first instruction of the called function. `RZ` is updated to 
+the base of the new stack frame.
+```
+Stack: [R0][R1][R2][R3][return address]  ← RS points here
+RI → points to square()
+RZ → updated to the new stack frame base
+```
+
+### Step 6 — The function executes
+
+Inside `square()`, the function reads its arguments from RA, RB, etc.,
+does its work, and places the return value in `RA` before returning.
+If it uses any of the callee-saved registers R4–R7, it must save them 
+first and restore them before returning.
+
+### Step 7 — Return
+
+The function executes `RET`. This pops the return address from the 
+stack and jumps back to the instruction immediately after the `CALL`.
+```
+RI → back to the instruction after CALL
+RZ → restored to the caller's frame
+```
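
`CALL` and `RET` are easiest to see as a pair. The sketch below assumes that each instruction occupies one address unit, so the return address is `RI + 1`, and it leaves `RZ` out because the text does not say where the caller's frame base is kept. The function names are hypothetical.

```python
def call(vm, target):
    """Push the address of the next instruction, then jump to target."""
    vm["stack"].append(vm["RI"] + 1)   # assumption: one address per instruction
    vm["RS"] = len(vm["stack"])
    vm["RI"] = target


def ret(vm):
    """Pop the return address and resume right after the CALL."""
    vm["RI"] = vm["stack"].pop()
    vm["RS"] = len(vm["stack"])


vm = {"RI": 10, "RS": 0, "stack": []}
call(vm, 100)
inside = vm["RI"]                      # executing the callee
ret(vm)
print(inside, vm["RI"], vm["stack"])   # 100 11 []
```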
+
+### Step 8 — Collect the result
+
+A result that fits in one register is read from `RA`. Return values 
+of up to 16 bytes are spread across RA, RB, RC and RD. For larger 
+values, the result was already written to the space reserved in Step 2.
+```
+result = RA
+```
+
+### Step 9 — Clean up the stack
+
+The caller removes any arguments that were pushed to the stack in Step 4,
+and pops the caller-saved registers in reverse order to restore them.
+```
+Stack: []  ← clean, exactly as before the call
+RS = 0
+R0, R1, R2, R3 restored to their original values
+```
+
+---
+
+## Complete State Diagram
+```
+BEFORE CALL:
+  RA = ?   RB = ?   RS → 0     ← stack empty
+  R0 = 5   R1 = 3   Stack: []
+
+AFTER do_function_call(x):
+  RA = x              ← argument placed
+  RS → 5              ← 4 caller-saved + return address
+  Stack: [R0][R1][R2][R3][return addr]
+
+INSIDE square():
+  RA = x * x          ← result computed
+  RZ → new frame base
+
+AFTER undo_function_call():
+  RA = result         ← return value
+  R0 = 5              ← restored
+  R1 = 3              ← restored
+  RS → 0              ← stack clean
+  Stack: []
+```
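
The diagram can be replayed with a small end-to-end sketch for a one-argument call. The helpers `sketch_do_call` and `sketch_undo_call` only mirror the two phases shown in the diagram; their names and signatures are simplified assumptions and are not the notebook's functions. The return address `42` is an arbitrary placeholder.

```python
SAVED = ["R0", "R1", "R2", "R3"]


def sketch_do_call(vm, x, return_address):
    """Steps 1, 3 and 5 for a one-argument call."""
    for name in SAVED:
        vm["stack"].append(vm[name])
    vm["RA"] = x
    vm["stack"].append(return_address)
    vm["RS"] = len(vm["stack"])


def square(vm):
    """Step 6: the callee may clobber R0..R3 and leaves x * x in RA."""
    vm["R0"] = vm["R1"] = -1
    vm["RA"] = vm["RA"] * vm["RA"]


def sketch_undo_call(vm):
    """Steps 7 to 9: drop the return address, read RA, restore R3..R0."""
    vm["stack"].pop()
    result = vm["RA"]
    for name in reversed(SAVED):
        vm[name] = vm["stack"].pop()
    vm["RS"] = len(vm["stack"])
    return result


vm = {"RA": 0, "R0": 5, "R1": 3, "R2": 0, "R3": 0, "RS": 0, "stack": []}
sketch_do_call(vm, x=7, return_address=42)
depth_during_call = vm["RS"]
square(vm)
result = sketch_undo_call(vm)
print(depth_during_call, result, vm["R0"], vm["R1"], vm["RS"], vm["stack"])
# 5 49 5 3 0 []
```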
+
+---
+
+## Why registers instead of the stack?
+
+Many virtual machines (like the JVM) are stack-based, meaning almost 
+every operation pushes and pops values from the stack. Spider instead 
+uses registers as the primary mechanism for passing data.
+
+The reason is performance and memory efficiency. On a microcontroller 
+with only 2 KB of RAM, every value pushed occupies scarce memory and 
+every push and pop costs cycles. Registers are fixed, always available, 
+and require no memory allocation. A well-written Spider program can make 
+complex function calls with almost no stack usage beyond the saved 
+caller-saved registers and the return address.
+
+---
+
+## Python Implementation
+
+As part of this internship, I implemented a Python simulation of this 
+calling convention in `calling-convention/calling-convention.ipynb`. 
+The simulation models the Spider VM as a Python dictionary with all 
+registers and a stack represented as a list.
+
+The two key functions implemented are:
+
+- `do_function_call(input_params, output_return)` — simulates all steps 
+  before and including the CALL instruction
+- `undo_function_call(input_params, output_return)` — simulates all steps 
+  after the CALL, collecting the result and restoring the machine state
+
+Three test cases were validated: a single byte argument, multiple 
+arguments overflowing to the stack, and packed boolean arguments.
+In all three, the stack returned to its exact original state after 
+the call, which indicates that the algorithm preserves the machine 
+context for these cases.
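
The "stack returns to its original state" check can be written as a reusable helper. This is a sketch under assumptions, not the notebook's test code: `assert_state_preserved` and `balanced_call` are hypothetical names, and the set of registers expected to survive a call is taken from the register roles described in this post.

```python
import copy

PRESERVED = ["R0", "R1", "R2", "R3", "R4", "R5", "R6", "R7", "RS", "RZ"]


def assert_state_preserved(vm, run_call):
    """Run one simulated call; fail if the stack or a preserved register changed."""
    before = copy.deepcopy(vm)
    run_call(vm)
    assert vm["stack"] == before["stack"], "stack was not restored"
    for name in PRESERVED:
        assert vm[name] == before[name], f"{name} was not restored"


def balanced_call(vm):
    """Stand-in for a full call sequence: saves R0, clobbers it, restores it."""
    vm["stack"].append(vm["R0"])
    vm["R0"] = -1
    vm["RA"] = 49                     # the return value is allowed to change
    vm["R0"] = vm["stack"].pop()


vm = {name: 0 for name in PRESERVED}
vm.update({"RA": 7, "R0": 5, "stack": []})
assert_state_preserved(vm, balanced_call)
print("state preserved, RA =", vm["RA"])
```

`RA` is deliberately excluded from the comparison, since it carries the return value back to the caller.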