# Blog Entry #3 — Function Calls in Spider

In this entry I will explain how Spider handles function calls at the
VM level. This is one of the most important concepts in the language
because it defines how functions communicate with each other, how
arguments are passed, and how the machine state is preserved and
restored after every call.

---

## What is a Calling Convention?

A calling convention is a set of rules that defines:

- How arguments are passed to a function
- Where the return value is stored
- Which registers must be preserved across a call
- Who is responsible for cleaning up after the call

Without this agreement, two parts of a program could make conflicting
assumptions and corrupt each other's data. Spider's calling convention
is designed to be efficient by prioritizing registers over the stack,
which is critical for constrained hardware like the ATmega328p (2KB RAM).

---

## The Registers Involved

Before explaining the steps, it helps to know which registers play
a role in a function call:

| Register         | Role                                          |
|------------------|-----------------------------------------------|
| `RA`             | First argument and return value               |
| `RB`, `RC`, `RD` | Second, third, fourth arguments               |
| `R8`, `R9`       | Fifth and sixth arguments                     |
| `R0`–`R3`        | Caller-saved: may be destroyed by the callee  |
| `R4`–`R7`        | Callee-saved: must be restored by the callee  |
| `RS`             | Stack pointer, tracks the top of the stack    |
| `RZ`             | Stack base pointer, reference for local vars  |
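
As a rough illustration, the table can be written down as a Python dictionary plus a few role lists. The `make_vm` helper and the constant names are hypothetical, chosen only for this sketch, and treating `RS` as a count of used stack slots is a modelling assumption rather than something the Spider VM is known to do.

```python
# Register roles from the table above (constant names are illustrative).
ARG_REGS = ["RA", "RB", "RC", "RD", "R8", "R9"]   # argument order
CALLER_SAVED = ["R0", "R1", "R2", "R3"]           # callee may destroy these
CALLEE_SAVED = ["R4", "R5", "R6", "R7"]           # callee must restore these


def make_vm():
    """Return a fresh machine state: every register zeroed, empty stack."""
    names = ARG_REGS + CALLER_SAVED + CALLEE_SAVED + ["RS", "RZ"]
    vm = {name: 0 for name in names}
    vm["stack"] = []   # assumption: RS mirrors the number of stack slots in use
    return vm


vm = make_vm()
print(len(vm), vm["RS"], vm["stack"])   # 17 0 []
```
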

---

## Step by Step: Making a Function Call

Consider this high-level Spider call:

```
result = square(x)
```

The compiler translates this into the following sequence of operations:

### Step 1 — Save caller-saved registers

Before touching anything, the caller pushes R0–R3 onto the stack.
These registers may be freely overwritten by the called function,
so if the caller needs their values after the call, they must be
saved first.

```
Stack: [R0][R1][R2][R3]   ← RS points here
```
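
Step 1 can be sketched in Python as follows. The `push` and `save_caller_saved` helpers are hypothetical names, and the machine state is modelled as a plain dictionary with a list for the stack, with `RS` assumed to hold the number of slots in use.

```python
CALLER_SAVED = ["R0", "R1", "R2", "R3"]


def push(vm, value):
    """Push one value and keep RS equal to the number of used slots."""
    vm["stack"].append(value)
    vm["RS"] = len(vm["stack"])


def save_caller_saved(vm):
    """Step 1: push R0..R3 in order so they can be popped in reverse later."""
    for name in CALLER_SAVED:
        push(vm, vm[name])


vm = {"R0": 5, "R1": 3, "R2": 0, "R3": 0, "RS": 0, "stack": []}
save_caller_saved(vm)
print(vm["stack"], vm["RS"])   # [5, 3, 0, 0] 4
```
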

### Step 2 — Handle large return values

If the function returns more than 16 bytes (128 bits), the caller
reserves space on the stack for the result and adds a pointer to
that space at the front of the argument list. For normal cases
(return value fits in registers), this step is skipped.
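
A minimal sketch of this decision, under stated assumptions: the `prepare_return_slot` name is hypothetical, the reserved space is modelled as one stack slot per byte, and the "pointer" is simply the list index of the first reserved slot.

```python
MAX_REGISTER_RETURN = 16   # bytes; larger results are returned through memory


def prepare_return_slot(vm, return_size, args):
    """Step 2: for large results, reserve stack space and pass its address first.

    Returns the (possibly extended) argument list.
    """
    if return_size <= MAX_REGISTER_RETURN:
        return list(args)                    # normal case: the step is skipped
    slot_address = len(vm["stack"])          # index of the first reserved slot
    vm["stack"].extend([0] * return_size)    # assumption: one slot per byte
    vm["RS"] = len(vm["stack"])
    return [slot_address] + list(args)       # hidden pointer goes in front


vm = {"RS": 0, "stack": []}
print(prepare_return_slot(vm, 8, [7]))    # [7]
print(prepare_return_slot(vm, 24, [7]))   # [0, 7]
print(vm["RS"])                           # 24
```
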

### Step 3 — Place arguments in registers

Arguments are placed in registers in order: RA, RB, RC, RD, R8, R9.
For our example, `x` goes into `RA`:

```
RA = x
```

**Special case — Booleans:**
Boolean values (1 bit) are too small to occupy a full register each.
Instead they are accumulated in a queue and packed together before
being placed in a single register. As soon as the queue holds 64 bits
it is emitted as one argument. Any remaining booleans are zero-padded
and emitted once the argument list is exhausted.
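
The packing rule can be illustrated with a short Python sketch. The `pack_booleans` name is hypothetical, and the bit order (first boolean in the least significant bit) is an assumption, since the text above does not specify one.

```python
def pack_booleans(flags, width=64):
    """Pack booleans into integers of `width` bits, first flag in bit 0.

    A word is emitted as soon as it is full; a final partial word keeps
    zeros in its unused high bits.
    """
    words = []
    current, used = 0, 0
    for flag in flags:
        if flag:
            current |= 1 << used
        used += 1
        if used == width:        # queue is full: emit it immediately
            words.append(current)
            current, used = 0, 0
    if used:                     # leftovers once the arguments are exhausted
        words.append(current)
    return words


print(pack_booleans([True, False, True]))   # [5]
print(len(pack_booleans([True] * 70)))      # 2
```

With 70 booleans the first word is full and the second carries the remaining six bits, so a call with many flags still needs only two argument slots.
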

### Step 4 — Overflow arguments go to the stack

If there are more arguments than available registers (more than 6),
the remaining ones are pushed onto the stack in this order:

1. Small arguments (up to 8 bytes) are pushed first
2. Large arguments (more than 8 bytes) are pushed last

```
Stack: [R0][R1][R2][R3][arg7][arg8]   ← RS points here
```
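
Steps 3 and 4 together might look like the sketch below. It assumes that every argument occupies exactly one register regardless of its size and that overflow arguments keep their source order within each size group; neither detail is stated above, and `place_arguments` is a hypothetical name.

```python
ARG_REGS = ["RA", "RB", "RC", "RD", "R8", "R9"]
SMALL_LIMIT = 8   # bytes


def place_arguments(vm, args):
    """Steps 3 and 4: `args` is a list of (value, size_in_bytes) pairs."""
    in_registers, overflow = args[:len(ARG_REGS)], args[len(ARG_REGS):]
    for name, (value, _size) in zip(ARG_REGS, in_registers):
        vm[name] = value
    # Stable sort: small arguments first, large ones last.
    overflow = sorted(overflow, key=lambda arg: arg[1] > SMALL_LIMIT)
    for value, _size in overflow:
        vm["stack"].append(value)
    vm["RS"] = len(vm["stack"])
    return len(overflow)   # the caller needs this count for cleanup in Step 9


vm = {name: 0 for name in ARG_REGS}
vm.update(RS=0, stack=[])
args = [(i, 1) for i in range(1, 7)] + [("big", 16), (7, 1), (8, 2)]
pushed = place_arguments(vm, args)
print(vm["RA"], vm["R9"], vm["stack"], pushed)   # 1 6 [7, 8, 'big'] 3
```
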

### Step 5 — Execute CALL

The compiler emits the `CALL` instruction with the function's address.
This automatically pushes the return address onto the stack and jumps
to the function. `RI` (the instruction register) now points to the
first instruction of the called function.

```
Stack: [R0][R1][R2][R3][return address]   ← RS points here
RI → points to square()
RZ → updated to the new stack frame base
```
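
The effect of `CALL` on `RI` and the stack can be modelled in a few lines of Python. This sketch assumes each instruction occupies one address, uses made-up addresses, and leaves the `RZ` update out; `call` is a hypothetical helper name.

```python
def call(vm, target):
    """Step 5: push the return address, then jump to `target`."""
    return_address = vm["RI"] + 1        # instruction right after the CALL
    vm["stack"].append(return_address)
    vm["RS"] = len(vm["stack"])
    vm["RI"] = target


vm = {"RI": 10, "RS": 4, "stack": [5, 3, 0, 0]}   # state after Step 1
call(vm, target=100)                              # 100: made-up address of square()
print(vm["RI"], vm["RS"], vm["stack"])            # 100 5 [5, 3, 0, 0, 11]
```
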

### Step 6 — The function executes

Inside `square()`, the function reads its arguments from RA, RB, etc.,
does its work, and places the return value in `RA` before returning.
If the function uses any of the callee-saved registers R4–R7, it must
save them first and restore them before returning.
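
A small sketch of a callee that respects this rule. Here a Python local variable stands in for the stack slot a real callee would use to save `R4`, which is a simplification made for this example.

```python
def square(vm):
    """Callee body that borrows R4 as scratch space."""
    saved_r4 = vm["R4"]              # prologue: remember the callee-saved register
    vm["R4"] = vm["RA"]              # R4 can now be used freely
    vm["RA"] = vm["R4"] * vm["R4"]   # the return value goes in RA
    vm["R4"] = saved_r4              # epilogue: restore before returning


vm = {"RA": 7, "R4": 42}
square(vm)
print(vm["RA"], vm["R4"])   # 49 42
```
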

### Step 7 — Return

The function executes `RET`. This pops the return address from the
stack and jumps back to the instruction immediately after the `CALL`.

```
RI → back to the instruction after CALL
RZ → restored to the caller's frame
```
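
`RET` is the mirror image of `CALL`. The sketch below uses a hypothetical `ret` helper and made-up addresses, and again does not model `RZ`.

```python
def ret(vm):
    """Step 7: pop the return address and resume the caller there."""
    vm["RI"] = vm["stack"].pop()
    vm["RS"] = len(vm["stack"])


vm = {"RI": 104, "RS": 5, "stack": [5, 3, 0, 0, 11]}   # somewhere inside the callee
ret(vm)
print(vm["RI"], vm["RS"], vm["stack"])                  # 11 4 [5, 3, 0, 0]
```
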

### Step 8 — Collect the result

A result that fits in one register is read from `RA`. Return values of
up to 16 bytes are spread across RA, RB, RC and RD. For larger values,
the result was already written to the space reserved in Step 2.

```
result = RA
```

### Step 9 — Clean up the stack

The caller removes any arguments that were pushed to the stack in Step 4,
and pops the caller-saved registers in reverse order to restore them.

```
Stack: []   ← clean, exactly as before the call
RS = 0
R0, R1, R2, R3 restored to their original values
```
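
The cleanup can be sketched as below, assuming the caller remembers how many overflow arguments it pushed. `clean_up` and `pushed_args` are hypothetical names, and the register values of 99 simply stand for whatever the callee left behind.

```python
CALLER_SAVED = ["R0", "R1", "R2", "R3"]


def clean_up(vm, pushed_args):
    """Step 9: drop stack arguments, then restore R3..R0 in reverse order."""
    for _ in range(pushed_args):
        vm["stack"].pop()
    for name in reversed(CALLER_SAVED):
        vm[name] = vm["stack"].pop()
    vm["RS"] = len(vm["stack"])


# State right after RET: saved R0..R3, then two overflow arguments.
vm = {"R0": 99, "R1": 99, "R2": 99, "R3": 99, "RS": 6,
      "stack": [5, 3, 0, 0, "arg7", "arg8"]}
clean_up(vm, pushed_args=2)
print(vm["R0"], vm["R1"], vm["RS"], vm["stack"])   # 5 3 0 []
```
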

---

## Complete State Diagram

```
BEFORE CALL:
  RA = ?    RB = ?    RS → bottom of stack
  R0 = 5    R1 = 3    Stack: []

AFTER do_function_call(x):
  RA = x              ← argument placed
  RS → 5              ← 4 caller-saved + return address
  Stack: [R0][R1][R2][R3][return addr]

INSIDE square():
  RA = x * x          ← result computed
  RZ → new frame base

AFTER undo_function_call():
  RA = result         ← return value
  R0 = 5              ← restored
  R1 = 3              ← restored
  RS → 0              ← stack clean
  Stack: []
```
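
The whole diagram can be replayed with a compact Python sketch for a one-argument call. This is an independent illustration and not the notebook code: `simulate_call` is a hypothetical name, the return address is a placeholder string, and the callee is modelled as an ordinary Python function that is assumed to clobber R0–R3.

```python
CALLER_SAVED = ["R0", "R1", "R2", "R3"]


def simulate_call(vm, function, argument):
    """Run Steps 1, 3, 5, 6, 7, 8 and 9 for a one-argument call."""
    for name in CALLER_SAVED:                 # Step 1: save R0..R3
        vm["stack"].append(vm[name])
    vm["RA"] = argument                       # Step 3: argument in RA
    vm["stack"].append("return addr")         # Step 5: CALL pushes the return address
    vm["RS"] = len(vm["stack"])
    depth_inside = vm["RS"]
    vm["RA"] = function(vm["RA"])             # Step 6: callee leaves its result in RA
    for name in CALLER_SAVED:                 # worst case: callee clobbers R0..R3
        vm[name] = 0
    vm["stack"].pop()                         # Step 7: RET pops the return address
    result = vm["RA"]                         # Step 8: collect the result
    for name in reversed(CALLER_SAVED):       # Step 9: restore R3..R0
        vm[name] = vm["stack"].pop()
    vm["RS"] = len(vm["stack"])
    return result, depth_inside


vm = {"RA": 0, "R0": 5, "R1": 3, "R2": 0, "R3": 0, "RS": 0, "stack": []}
result, depth = simulate_call(vm, lambda x: x * x, 6)
print(result, depth, vm["R0"], vm["R1"], vm["RS"], vm["stack"])
# 36 5 5 3 0 []
```
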

---

## Why registers instead of the stack?

Many virtual machines (like the JVM) are stack-based, meaning almost
every operation pushes and pops values from the stack. Spider instead
uses registers as the primary mechanism for passing data.

The reason is performance and memory efficiency. On a microcontroller
with only 2KB of RAM, every push and pop costs precious memory and
time. Registers are fixed, always available, and require no memory
allocation. A well-written Spider program can make complex function
calls with almost no stack usage beyond saving the caller-saved registers.

---

## Python Implementation

As part of this internship, I implemented a Python simulation of this
calling convention in `calling-convention/calling-convention.ipynb`.
The simulation models the Spider VM as a Python dictionary with all
registers and a stack represented as a list.

The two key functions implemented are:

- `do_function_call(input_params, output_return)` — simulates all steps
  before and including the CALL instruction
- `undo_function_call(input_params, output_return)` — simulates all steps
  after the CALL, collecting the result and restoring the machine state

Three test cases were validated: a single byte argument, multiple
arguments overflowing to the stack, and packed boolean arguments.
In all three, the stack returned to its exact original state after the
call, which shows that the algorithm preserves the machine context in
these scenarios.