blog 3

2026-03-09 17:10:07 -06:00
parent 98d1cdee0a
commit f681e7ef53
1 changed files with 202 additions and 0 deletions
--- a/blog-post-3.md
+++ b/blog-post-3.md
@@ -0,0 +1,202 @@
 # Blog Entry #3 — Function Calls in Spider
 In this entry I will explain how Spider handles function calls at the 
 VM level. This is one of the most important concepts in the language 
 because it defines how functions communicate with each other, how 
 arguments are passed, and how the machine state is preserved and 
 restored after every call.
 ---
 ## What is a Calling Convention?
 A calling convention is a set of rules that defines:
 - How arguments are passed to a function
 - Where the return value is stored
 - Which registers must be preserved across a call
 - Who is responsible for cleaning up after the call
 Without this agreement, two parts of a program could make conflicting 
 assumptions and corrupt each other's data. Spider's calling convention 
 favors registers over the stack for efficiency, which matters on 
 constrained hardware like the ATmega328P (2 KB of RAM).
 ---
 ## The Registers Involved
 Before explaining the steps, it helps to know which registers play 
 a role in a function call:
 | Register       | Role                                          |
 |----------------|-----------------------------------------------|
 | `RA`           | First argument and return value               |
 | `RB`, `RC`, `RD` | Second, third, fourth arguments             |
 | `R8`, `R9`     | Fifth and sixth arguments                     |
 | `R0`–`R3`      | Caller-saved: may be destroyed by the callee  |
 | `R4`–`R7`      | Callee-saved: must be restored by the callee  |
 | `RS`           | Stack pointer, tracks the top of the stack    |
 | `RZ`           | Stack base pointer, reference for local vars  |
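The table above can be modelled in a few lines of Python. This is only a sketch: the helper name `make_vm`, the zero initial values, and the choice to keep `RS` equal to the number of stack entries are assumptions made for illustration, not details taken from the Spider sources.

```python
ARG_REGISTERS = ["RA", "RB", "RC", "RD", "R8", "R9"]   # argument order
CALLER_SAVED = ["R0", "R1", "R2", "R3"]                # callee may overwrite
CALLEE_SAVED = ["R4", "R5", "R6", "R7"]                # callee must restore

def make_vm():
    """Return a fresh machine state: all registers zeroed, empty stack."""
    names = ARG_REGISTERS[:4] + [f"R{i}" for i in range(10)] + ["RS", "RZ", "RI"]
    vm = {name: 0 for name in names}
    vm["stack"] = []   # assumption: RS is kept equal to len(vm["stack"])
    return vm

vm = make_vm()
print(sorted(vm))
```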
 ---
 ## Step by Step: Making a Function Call
 Consider this high-level Spider call:
 ```
 result = square(x)
 ```
 The compiler translates this into the following sequence of operations:
 ### Step 1 — Save caller-saved registers
 Before touching anything, the caller pushes R0–R3 onto the stack. 
 These registers may be freely overwritten by the called function, 
 so if the caller needs their values after the call, they must be 
 saved first.
 ```
 Stack: [R0][R1][R2][R3]  ← RS points here
 ```
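A minimal Python sketch of this step, assuming the VM is a dictionary of registers plus a list for the stack, and that `RS` simply counts stack entries. The helper names `push` and `save_caller_saved` are illustrative only.

```python
CALLER_SAVED = ["R0", "R1", "R2", "R3"]

def push(vm, value):
    """Push one value and keep RS in sync with the stack height."""
    vm["stack"].append(value)
    vm["RS"] = len(vm["stack"])

def save_caller_saved(vm):
    """Step 1: push R0..R3 in order so they can be restored after the call."""
    for name in CALLER_SAVED:
        push(vm, vm[name])

vm = {"R0": 5, "R1": 3, "R2": 0, "R3": 0, "RS": 0, "stack": []}
save_caller_saved(vm)
print(vm["stack"], vm["RS"])  # [5, 3, 0, 0] 4
```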
 ### Step 2 — Handle large return values
 If the function returns more than 16 bytes (128 bits), the caller 
 reserves space on the stack for the result and adds a pointer to 
 that space at the front of the argument list. For normal cases 
 (return value fits in registers), this step is skipped.
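The decision in this step might be sketched as follows. The function name `prepare_return_slot` is hypothetical, and using the slot's stack index as the "pointer" is a simplification chosen for this example.

```python
MAX_REGISTER_RETURN = 16  # bytes; larger results use caller-reserved stack space

def prepare_return_slot(stack, args, return_size):
    """Reserve stack space for an oversized result and pass its location first.

    Returns the argument list to use for the rest of the call sequence.
    """
    if return_size <= MAX_REGISTER_RETURN:
        return list(args)              # normal case: nothing to reserve
    slot_index = len(stack)            # assumption: stack index stands in for a pointer
    stack.append(bytearray(return_size))
    return [slot_index] + list(args)

stack = []
print(prepare_return_slot(stack, [7], 8))    # [7]
print(prepare_return_slot(stack, [7], 24))   # [0, 7]
```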
 ### Step 3 — Place arguments in registers
 Arguments are placed in registers in order: RA, RB, RC, RD, R8, R9.
 For our example, `x` goes into `RA`:
 ```
 RA = x
 ```
 **Special case — Booleans:**
 A Boolean carries only 1 bit, so giving each one a full register would 
 waste space. Instead, Booleans are accumulated in a queue and packed 
 together into a single register. As soon as the queue holds 64 bits, it 
 is flushed to a register immediately. Any remaining Booleans are 
 zero-padded and flushed once the arguments are exhausted.
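The packing rule can be illustrated with a short Python sketch. It handles a list of Booleans on its own rather than Booleans interleaved with other arguments, and the bit order (first Boolean in bit 0) is an assumption, since the text does not specify it.

```python
WORD_BITS = 64

def pack_booleans(flags):
    """Pack booleans into 64-bit words, zero-padding the last partial word."""
    words = []
    current, count = 0, 0
    for flag in flags:
        current |= int(bool(flag)) << count   # assumption: first boolean -> bit 0
        count += 1
        if count == WORD_BITS:                # queue full: emit immediately
            words.append(current)
            current, count = 0, 0
    if count:                                 # leftovers: upper bits stay zero
        words.append(current)
    return words

print(pack_booleans([True, False, True]))   # [5]
print(len(pack_booleans([True] * 70)))      # 2
```

Each returned word would then occupy one argument register (or one stack slot, if the registers are already taken).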
 ### Step 4 — Overflow arguments go to the stack
 If there are more arguments than available registers (more than 6), 
 the remaining ones are pushed onto the stack in this order:
 1. Small arguments (up to 8 bytes) are pushed first
 2. Large arguments (more than 8 bytes) are pushed last
 ```
 Stack: [R0][R1][R2][R3][arg7][arg8]  ← RS points here
 ```
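Steps 3 and 4 together could look like the sketch below. It assumes each of the first six arguments fits in one register, represents every argument as a hypothetical `(value, size_in_bytes)` pair, and keeps `RS` equal to the stack height.

```python
ARG_REGISTERS = ["RA", "RB", "RC", "RD", "R8", "R9"]
SMALL_LIMIT = 8  # bytes

def place_arguments(vm, args):
    """Fill the argument registers, then push overflow: small first, large last."""
    for name, (value, _size) in zip(ARG_REGISTERS, args):
        vm[name] = value
    overflow = args[len(ARG_REGISTERS):]
    small = [a for a in overflow if a[1] <= SMALL_LIMIT]
    large = [a for a in overflow if a[1] > SMALL_LIMIT]
    for value, _size in small + large:
        vm["stack"].append(value)
    vm["RS"] = len(vm["stack"])
    return len(overflow)   # the caller needs this count to clean up later

vm = {name: 0 for name in ARG_REGISTERS}
vm.update({"RS": 0, "stack": []})
args = [(1, 1), (2, 1), (3, 1), (4, 1), (5, 1), (6, 1), ("big", 16), (8, 2)]
pushed = place_arguments(vm, args)
print(vm["RA"], vm["R9"])   # 1 6
print(vm["stack"], pushed)  # [8, 'big'] 2
```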
 ### Step 5 — Execute CALL
 The compiler emits the `CALL` instruction with the function's address.
 This automatically pushes the return address onto the stack and jumps 
 to the function. `RI` (the instruction register) now points to the 
 first instruction of the called function.
 ```
 Stack: [R0][R1][R2][R3][return address]  ← RS points here
 RI → points to square()
 RZ → updated to the new stack frame base
 ```
 ### Step 6 — The function executes
 Inside `square()`, the function reads its arguments from RA, RB, etc.,
 does its work, and places the return value in `RA` before returning.
 If the function uses any of the callee-saved registers R4–R7, it must 
 save them first and restore them before returning.
 ### Step 7 — Return
 The function executes `RET`. This pops the return address from the 
 stack and jumps back to the instruction immediately after the `CALL`.
 ```
 RI → back to the instruction after CALL
 RZ → restored to the caller's frame
 ```
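The pairing of `CALL` (Step 5) and `RET` can be sketched in Python as below. The sketch tracks only `RI`, `RS` and the stack; how `RZ` is saved and restored is not modelled, and treating the return address as `RI + 1` is an assumption about instruction addressing.

```python
def call(vm, target):
    """CALL: push the address of the next instruction, then jump to target."""
    vm["stack"].append(vm["RI"] + 1)   # assumption: instructions are one address apart
    vm["RS"] = len(vm["stack"])
    vm["RI"] = target

def ret(vm):
    """RET: pop the return address and resume there."""
    vm["RI"] = vm["stack"].pop()
    vm["RS"] = len(vm["stack"])

vm = {"RI": 10, "RS": 0, "stack": []}
call(vm, 100)
print(vm["RI"], vm["stack"])  # 100 [11]
ret(vm)
print(vm["RI"], vm["stack"])  # 11 []
```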
 ### Step 8 — Collect the result
 The caller reads the result from `RA`. Return values of up to 16 bytes 
 can span RA, RB, RC and RD. For larger values, the result was already 
 written to the space reserved in Step 2.
 ```
 result = RA
 ```
 ### Step 9 — Clean up the stack
 The caller removes any arguments that were pushed to the stack in Step 4,
 and pops the caller-saved registers in reverse order to restore them.
 ```
 Stack: []  ← clean, exactly as before the call
 RS = 0
 R0, R1, R2, R3 restored to their original values
 ```
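A possible Python rendering of the clean-up, assuming the caller remembers how many overflow arguments it pushed. The name `clean_up` and the sample register values are illustrative.

```python
CALLER_SAVED = ["R0", "R1", "R2", "R3"]

def clean_up(vm, overflow_count):
    """Drop stacked arguments, then restore R3..R0 in reverse push order."""
    for _ in range(overflow_count):
        vm["stack"].pop()
    for name in reversed(CALLER_SAVED):
        vm[name] = vm["stack"].pop()
    vm["RS"] = len(vm["stack"])

# State right after RET: saved R0-R3 plus two overflow arguments on the stack,
# while the callee has overwritten R0-R3 with its own values.
vm = {"R0": 99, "R1": 99, "R2": 99, "R3": 99, "RS": 6,
      "stack": [5, 3, 0, 0, "arg7", "arg8"]}
clean_up(vm, overflow_count=2)
print(vm["R0"], vm["R1"], vm["stack"], vm["RS"])  # 5 3 [] 0
```

Popping in reverse order matters: the last register pushed (`R3`) sits on top of the stack, so it has to be restored first.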
 ---
 ## Complete State Diagram
 ```
 BEFORE CALL:
  RA = ?   RB = ?   RS → bottom of stack
  R0 = 5   R1 = 3   Stack: []
 AFTER do_function_call(x):
  RA = x              ← argument placed
  RS → 5              ← 4 caller-saved + return address
  Stack: [R0][R1][R2][R3][return addr]
 INSIDE square():
  RA = x * x          ← result computed
  RZ → new frame base
 AFTER undo_function_call():
  RA = result         ← return value
  R0 = 5              ← restored
  R1 = 3              ← restored
  RS → 0              ← stack clean
  Stack: []
 ```
 ---
 ## Why registers instead of the stack?
 Many virtual machines (like the JVM) are stack-based, meaning almost 
 every operation pushes and pops values from the stack. Spider instead 
 uses registers as the primary mechanism for passing data.
 The reason is performance and memory efficiency. On a microcontroller 
 with only 2KB of RAM, every push and pop costs precious memory and 
 time. Registers are fixed, always available, and require no memory 
 allocation. A well-written Spider program can make complex function 
 calls with almost no stack usage beyond saving the caller-saved registers.
 ---
 ## Python Implementation
 As part of this internship, I implemented a Python simulation of this 
 calling convention in `calling-convention/calling-convention.ipynb`. 
 The simulation models the Spider VM as a Python dictionary with all 
 registers and a stack represented as a list.
 The two key functions implemented are:
 - `do_function_call(input_params, output_return)` — simulates all steps 
  before and including the CALL instruction
 - `undo_function_call(input_params, output_return)` — simulates all steps 
  after the CALL, collecting the result and restoring the machine state
 Three test cases were validated: a single byte argument, multiple 
 arguments overflowing to the stack, and packed boolean arguments.
 In all three, the stack returned to its exact original state after 
 the call, which is evidence that the algorithm correctly preserves 
 the machine context.
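The kind of check described above can be sketched as a single round trip. This is not the notebook's code: `simulate_call` is a hypothetical, heavily simplified stand-in that handles one register argument, uses a placeholder for the return address, and deliberately clobbers R0–R3 to mimic a callee overwriting them.

```python
import copy

CALLER_SAVED = ["R0", "R1", "R2", "R3"]

def simulate_call(vm, func, arg):
    """Run one single-argument call through save, CALL, body, RET, restore."""
    for name in CALLER_SAVED:                 # Step 1: save caller-saved registers
        vm["stack"].append(vm[name])
    vm["RA"] = arg                            # Step 3: argument goes in RA
    vm["stack"].append("return address")      # Step 5: placeholder value
    for name in CALLER_SAVED:                 # callee is free to clobber these
        vm[name] = -1
    vm["RA"] = func(vm["RA"])                 # Step 6: result left in RA
    vm["stack"].pop()                         # Step 7: RET pops the return address
    result = vm["RA"]                         # Step 8: collect the result
    for name in reversed(CALLER_SAVED):       # Step 9: restore in reverse order
        vm[name] = vm["stack"].pop()
    vm["RS"] = len(vm["stack"])
    return result

vm = {"RA": 0, "R0": 5, "R1": 3, "R2": 0, "R3": 0, "RS": 0, "stack": []}
before = copy.deepcopy(vm)
result = simulate_call(vm, lambda x: x * x, 7)

assert result == 49
assert vm["stack"] == before["stack"] and vm["RS"] == before["RS"]
assert all(vm[r] == before[r] for r in CALLER_SAVED)
```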