blog 3
This commit is contained in:
202
blog-post-3.md
202
blog-post-3.md
@@ -0,0 +1,202 @@
|
|||||||
|
# Blog Entry #3 — Function Calls in Spider
|
||||||
|
|
||||||
|
In this entry I will explain how Spider handles function calls at the
|
||||||
|
VM level. This is one of the most important concepts in the language
|
||||||
|
because it defines how functions communicate with each other, how
|
||||||
|
arguments are passed, and how the machine state is preserved and
|
||||||
|
restored after every call.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## What is a Calling Convention?
|
||||||
|
|
||||||
|
A calling convention is a set of rules that defines:
|
||||||
|
|
||||||
|
- How arguments are passed to a function
|
||||||
|
- Where the return value is stored
|
||||||
|
- Which registers must be preserved across a call
|
||||||
|
- Who is responsible for cleaning up after the call
|
||||||
|
|
||||||
|
Without this agreement, two parts of a program could make conflicting
|
||||||
|
assumptions and corrupt each other's data. Spider's calling convention
|
||||||
|
is designed to be efficient by prioritizing registers over the stack,
|
||||||
|
which is critical for constrained hardware like the ATmega328p (2KB RAM).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## The Registers Involved
|
||||||
|
|
||||||
|
Before explaining the steps, it helps to know which registers play
|
||||||
|
a role in a function call:
|
||||||
|
|
||||||
|
| Register | Role |
|
||||||
|
|----------------|-----------------------------------------------|
|
||||||
|
| `RA` | First argument and return value |
|
||||||
|
| `RB`, `RC`, `RD` | Second, third, fourth arguments |
|
||||||
|
| `R8`, `R9` | Fifth and sixth arguments |
|
||||||
|
| `R0`–`R3` | Caller-saved: may be destroyed by the callee |
|
||||||
|
| `R4`–`R7` | Callee-saved: must be restored by the callee |
|
||||||
|
| `RS` | Stack pointer, tracks the top of the stack |
|
||||||
|
| `RZ` | Stack base pointer, reference for local vars |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Step by Step: Making a Function Call
|
||||||
|
|
||||||
|
Consider this high-level Spider call:
|
||||||
|
```
|
||||||
|
result = square(x)
|
||||||
|
```
|
||||||
|
|
||||||
|
The compiler translates this into the following sequence of operations:
|
||||||
|
|
||||||
|
### Step 1 — Save caller-saved registers
|
||||||
|
|
||||||
|
Before touching anything, the caller pushes R0–R3 onto the stack.
|
||||||
|
These registers may be freely overwritten by the called function,
|
||||||
|
so if the caller needs their values after the call, they must be
|
||||||
|
saved first.
|
||||||
|
```
|
||||||
|
Stack: [R0][R1][R2][R3] ← RS points here
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 2 — Handle large return values
|
||||||
|
|
||||||
|
If the function returns more than 16 bytes (128 bits), the caller
|
||||||
|
reserves space on the stack for the result and adds a pointer to
|
||||||
|
that space at the front of the argument list. For normal cases
|
||||||
|
(return value fits in registers), this step is skipped.
|
||||||
|
|
||||||
|
### Step 3 — Place arguments in registers
|
||||||
|
|
||||||
|
Arguments are placed in registers in order: RA, RB, RC, RD, R8, R9.
|
||||||
|
For our example, `x` goes into `RA`:
|
||||||
|
```
|
||||||
|
RA = x
|
||||||
|
```
|
||||||
|
|
||||||
|
**Special case — Booleans:**
|
||||||
|
Boolean values (1 bit) are too small to occupy a full register each.
|
||||||
|
Instead they are accumulated in a queue and packed together before
|
||||||
|
being placed in a single register. If the queue fills 64 bits it is
|
||||||
|
sent immediately. Any remaining booleans are zero-padded and sent
|
||||||
|
when arguments are exhausted.
|
||||||
|
|
||||||
|
### Step 4 — Overflow arguments go to the stack
|
||||||
|
|
||||||
|
If there are more arguments than available registers (more than 6),
|
||||||
|
the remaining ones are pushed onto the stack in this order:
|
||||||
|
|
||||||
|
1. Small arguments (up to 8 bytes) are pushed first
|
||||||
|
2. Large arguments (more than 8 bytes) are pushed last
|
||||||
|
```
|
||||||
|
Stack: [R0][R1][R2][R3][arg7][arg8] ← RS points here
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 5 — Execute CALL
|
||||||
|
|
||||||
|
The compiler emits the `CALL` instruction with the function's address.
|
||||||
|
This automatically pushes the return address onto the stack and jumps
|
||||||
|
to the function. `RI` (the instruction register) now points to the
|
||||||
|
first instruction of the called function.
|
||||||
|
```
|
||||||
|
Stack: [R0][R1][R2][R3][return address] ← RS points here
|
||||||
|
RI → points to square()
|
||||||
|
RZ → updated to the new stack frame base
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 6 — The function executes
|
||||||
|
|
||||||
|
Inside `square()`, the function reads its arguments from RA, RB, etc.,
|
||||||
|
does its work, and places the return value in `RA` before returning.
|
||||||
|
The callee-saved registers R4–R7, if used, must be restored before
|
||||||
|
returning.
|
||||||
|
|
||||||
|
### Step 7 — Return
|
||||||
|
|
||||||
|
The function executes `RET`. This pops the return address from the
|
||||||
|
stack and jumps back to the instruction immediately after the `CALL`.
|
||||||
|
```
|
||||||
|
RI → back to the instruction after CALL
|
||||||
|
RZ → restored to the caller's frame
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 8 — Collect the result
|
||||||
|
|
||||||
|
The caller reads the result from `RA`. For return values up to 16 bytes,
|
||||||
|
RA, RB, RC and RD are used. For larger values, the result was already
|
||||||
|
written to the space reserved in Step 2.
|
||||||
|
```
|
||||||
|
result = RA
|
||||||
|
```
|
||||||
|
|
||||||
|
### Step 9 — Clean up the stack
|
||||||
|
|
||||||
|
The caller removes any arguments that were pushed to the stack in Step 4,
|
||||||
|
and pops the caller-saved registers in reverse order to restore them.
|
||||||
|
```
|
||||||
|
Stack: [] ← clean, exactly as before the call
|
||||||
|
RS = 0
|
||||||
|
R0, R1, R2, R3 restored to their original values
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Complete State Diagram
|
||||||
|
```
|
||||||
|
BEFORE CALL:
|
||||||
|
RA = ? RB = ? RS → bottom of stack
|
||||||
|
R0 = 5 R1 = 3 Stack: []
|
||||||
|
|
||||||
|
AFTER do_function_call(x):
|
||||||
|
RA = x ← argument placed
|
||||||
|
RS → 5 ← 4 caller-saved + return address
|
||||||
|
Stack: [R0][R1][R2][R3][return addr]
|
||||||
|
|
||||||
|
INSIDE square():
|
||||||
|
RA = x * x ← result computed
|
||||||
|
RZ → new frame base
|
||||||
|
|
||||||
|
AFTER undo_function_call():
|
||||||
|
RA = result ← return value
|
||||||
|
R0 = 5 ← restored
|
||||||
|
R1 = 3 ← restored
|
||||||
|
RS → 0 ← stack clean
|
||||||
|
Stack: []
|
||||||
|
```
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Why registers instead of the stack?
|
||||||
|
|
||||||
|
Many virtual machines (like the JVM) are stack-based, meaning almost
|
||||||
|
every operation pushes and pops values from the stack. Spider instead
|
||||||
|
uses registers as the primary mechanism for passing data.
|
||||||
|
|
||||||
|
The reason is performance and memory efficiency. On a microcontroller
|
||||||
|
with only 2KB of RAM, every push and pop costs precious memory and
|
||||||
|
time. Registers are fixed, always available, and require no memory
|
||||||
|
allocation. A well-written Spider program can make complex function
|
||||||
|
calls with almost no stack usage beyond saving the caller-saved registers.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Python Implementation
|
||||||
|
|
||||||
|
As part of this internship, I implemented a Python simulation of this
|
||||||
|
calling convention in `calling-convention/calling-convention.ipynb`.
|
||||||
|
The simulation models the Spider VM as a Python dictionary with all
|
||||||
|
registers and a stack represented as a list.
|
||||||
|
|
||||||
|
The two key functions implemented are:
|
||||||
|
|
||||||
|
- `do_function_call(input_params, output_return)` — simulates all steps
|
||||||
|
before and including the CALL instruction
|
||||||
|
- `undo_function_call(input_params, output_return)` — simulates all steps
|
||||||
|
after the CALL, collecting the result and restoring the machine state
|
||||||
|
|
||||||
|
Three test cases were validated: a single byte argument, multiple
|
||||||
|
arguments overflowing to the stack, and packed boolean arguments.
|
||||||
|
All three confirmed that the stack returns to its exact original state
|
||||||
|
after every call, proving the algorithm correctly preserves the machine
|
||||||
|
context.
|
||||||
Reference in New Issue
Block a user