4.7 KiB
ECE 222: Digital Computers
Exceptions
In ARM, anything that interrupts the normal control flow of a program is an exception.
- An interrupt from an interrupt request (IRQ) occurs when a peripheral wants to interrupt the current flow
- A fault indicates a CPU error (e.g., division by zero) and returns to the faulty instruction
- A trap runs the interrupt handler and returns to the next instruction
Exceptions are handled by running an exception handler then returning to the original line.
Vector table
A vector table is an array of handler addresses. Each index contains a number (a “vector”) and a priority.
Exception handling
First, in hardware: If the exception priority is higher than the current operating priority, the exception is immediately handled.
- the current context is pushed to the stack
- the operating mode is set to privileged
- the operating priority is set to the exception priority
- the program counter is set to the address of the exception
(
vector_table[exception_num]
)
Next, the handler runs, and it should:
- preserve the any R4-R11 it modifies
- clear the interrupt request (IRQ)
- restore R4-R11
- return with
BX LR
Finally, in hardware:
- the previous context is restored
- the previous operating priority and mode are restored
!!! warning Interrupts can interrupt other interrupts, if their priority is sufficiently high!
!!! example How to interrupt-driven I/O:
**Write the ISR:** Assuming that the IRQ bit is cleared if `R0` is read:
```asm
ISR PUSH {R4-R11} ; save previous state onto stack
LDR R3, [R0] ; clear the IRQ by reading from it
POP {R4-R11} ; restore state
BX LR ; return to original address
```
**Store the interrupt handler in the vector table:** Assuming that the vector number is `22` and the vector table starts 16 addresses after the 0x00:
```asm
MOV32 R0, #ISR ; handler address
MOV R1, #38 * 4 ; offset: (16 + 22) * 4 bytes per address
STR R1, [R0] ; save address to table
```
**Enable interrupt requests:**
```asm
MOV32 R0, #ADDRESS_INTERRUPT_ENABLE
MOV R1, #1
STR R1, [R0] ; enable interrupts
```
Processor design
Comparing the complex instruction set computer architecture to the reduced instruction set computer architecture:
Task | CISC | RISC |
---|---|---|
ALU operands can come from? | registers, memory | registers (load/store) |
Addressing mode | complex | simple |
Binary size | small | large (~30% larger) |
Instruction size | variable | fixed |
Pipelining | difficult | simple |
Operation encoding
The R-format is used for operations of the form
ADD Rd, Rn, Rm
:
\[\underbrace{\text{op-code}}_\text{11 b}\ \ \overbrace{\text{Rm}}^\text{5 b}\ \ \underbrace{\text{shift amount}}_\text{6 b}\ \ \overbrace{\text{Rn}}^\text{5 b}\ \ \underbrace{\text{Rd}}_\text{5 b}\]
The D-format is used for operations of the form
LDR Rt, [Rn, #offset]
:
\[\underbrace{\text{op-code}}_\text{11 b}\ \ \overbrace{\text{offset}}^\text{9 b}\ \ 00\ \ \overbrace{\text{Rn}}^\text{5 b}\ \ \underbrace{\text{Rt}}_\text{5 b}\]
The CB-format is used for operations of the form
CBZ Rt, LABEL
:
\[\underbrace{\text{op-code}}_\text{8 b}\ \ \overbrace{\text{offset}}^\text{19 b}\ \ \underbrace{\text{Rt}}_\text{5 b}\]
Instruction data path
To execute an instruction, the following steps are observed:
- Instruction fetch (IF)
- fetch the instruction from instruction memory
- increment the instruction address (
PC += 4
), latchedd into PC register at the end of the CPU cycle
- Instruction decode (ID)
- decode fields like the op-code, offset
- read recoded registers
- Execute (EX)
- ALU calculates ADD, SUB, etc, as well as addresses for LDR/STR, sets zero status for CBZ
- branch adder calculates any branch target addresses
- Memory (ME)
- if memory needs to be reached, either
Write
orRead
must be asserted to prepare for it - write to memory
- Writeback (WB)
- write results to registers from memory, the ALU, or another register
Performance
Each step in the instruction data path has a varying time, so the clock period must be at least as long as the slowest step.
Performance is usually compared by comparing the execution times of standard benchmarks, such that:
\[\text{time}=n_{instructions}\times\underbrace{\frac{\text{cycles}}{\text{instruction}}}_\text{CPI}\times\frac{\text{seconds}}{\text{cycle}}\]
Pipelining
Pipelining changes the granularity of a clock cycle to be per step, instead of per-instruction. This allows multiple instructions to be processed concurrently.
(Source: Wikimedia Commons)