# ECE 222: Digital Computers

## Exceptions

In ARM, anything that interrupts the normal control flow of a program is an exception.

- An **interrupt** from an **interrupt request (IRQ)** occurs when a peripheral wants to interrupt the current flow
- A **fault** indicates a CPU error (e.g., division by zero) and returns to the faulty instruction
- A **trap** runs the exception handler and returns to the next instruction

Exceptions are handled by running an exception handler, then returning to the original code.

### Vector table

A vector table is an array of handler addresses. Each entry has a number (a "vector") and a priority.

### Exception handling

First, in hardware:

If the exception priority is higher than the current operating priority, the exception is handled immediately.

- the current context is pushed to the stack
- the operating mode is set to privileged
- the operating priority is set to the exception priority
- the program counter is set to the address of the exception handler (`vector_table[exception_num]`)

Next, the handler runs, and it should:

- preserve any of R4-R11 that it modifies
- clear the interrupt request (IRQ)
- restore R4-R11
- return with `BX LR`

Finally, in hardware:

- the previous context is restored
- the previous operating priority and mode are restored

!!! warning
    Interrupts can interrupt other interrupts if their priority is sufficiently high!

!!! example
    How to set up interrupt-driven I/O:

    **Write the ISR:**

    Assuming that the IRQ bit is cleared when the address in `R0` is read:

    ```asm
    ISR
        PUSH {R4-R11}   ; save previous state onto stack
        LDR R3, [R0]    ; clear the IRQ by reading from the address in R0
        POP {R4-R11}    ; restore state
        BX LR           ; return to original address
    ```

    **Store the interrupt handler in the vector table:**

    Assuming that the vector number is `22` and the vector table starts 16 addresses after `0x00`:

    ```asm
    MOV32 R0, #ISR      ; handler address
    MOV R1, #38 * 4     ; offset: (16 + 22) * 4 bytes per address
    STR R0, [R1]        ; save handler address to the table entry
    ```

    **Enable interrupt requests:**

    ```asm
    MOV32 R0, #ADDRESS_INTERRUPT_ENABLE
    MOV R1, #1
    STR R1, [R0]        ; enable interrupts
    ```

## Processor design

Comparing the **complex instruction set computer** (CISC) architecture to the **reduced instruction set computer** (RISC) architecture:

| Property | CISC | RISC |
| ---- | ---- | ---- |
| ALU operands can come from? | registers, memory | registers (load/store) |
| Addressing mode | complex | simple |
| Binary size | small | large (~30% larger) |
| Instruction size | variable | fixed |
| Pipelining | difficult | simple |

### Operation encoding

The **R-format** is used for operations of the form `ADD Rd, Rn, Rm`:

$$\underbrace{\text{op-code}}_\text{11 b}\ \ \overbrace{\text{Rm}}^\text{5 b}\ \ \underbrace{\text{shift amount}}_\text{6 b}\ \ \overbrace{\text{Rn}}^\text{5 b}\ \ \underbrace{\text{Rd}}_\text{5 b}$$

The **D-format** is used for operations of the form `LDR Rt, [Rn, #offset]`:

$$\underbrace{\text{op-code}}_\text{11 b}\ \ \overbrace{\text{offset}}^\text{9 b}\ \ 00\ \ \overbrace{\text{Rn}}^\text{5 b}\ \ \underbrace{\text{Rt}}_\text{5 b}$$

The **CB-format** is used for operations of the form `CBZ Rt, LABEL`:

$$\underbrace{\text{op-code}}_\text{8 b}\ \ \overbrace{\text{offset}}^\text{19 b}\ \ \underbrace{\text{Rt}}_\text{5 b}$$

### Instruction data path

To execute an instruction, the following steps are performed:

1. Instruction fetch (IF)
    - fetch the instruction from instruction memory
    - increment the instruction address (`PC += 4`), which is latched into the PC register at the end of the CPU cycle
2. Instruction decode (ID)
    - decode fields like the op-code and offset
    - read the registers named in the decoded fields
3. Execute (EX)
    - the ALU calculates ADD, SUB, etc., as well as addresses for LDR/STR, and sets the zero status for CBZ
    - the branch adder calculates any branch target addresses
4. Memory (ME)
    - if memory needs to be accessed, either `Write` or `Read` must be asserted to prepare for it
    - read from or write to memory
5. Writeback (WB)
    - write results to registers from memory, the ALU, or another register

### Performance

Each step in the instruction data path takes a different amount of time, so the clock period must be at least as long as the slowest step.

Performance is usually compared using the execution times of standard benchmarks, where:

$$\text{time}=n_\text{instructions}\times\underbrace{\frac{\text{cycles}}{\text{instruction}}}_\text{CPI}\times\frac{\text{seconds}}{\text{cycle}}$$

### Pipelining

Pipelining changes the granularity of a clock cycle to be per step instead of per instruction. This allows multiple instructions to be processed concurrently.

(Source: Wikimedia Commons)

### Data forwarding

If an instruction needs the result of a prior instruction that has not yet been written back (a **read-after-write** data hazard), a pipeline stall would normally be required to wait for the desired result.

However, a processor can mitigate this hazard by allowing the stalled instruction to read directly from the prior instruction's result instead.
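
The effect of forwarding on read-after-write hazards can be illustrated with a rough Python sketch that counts stall cycles for a short instruction sequence. It is a simplified model, not a description of any particular ARM core. It assumes the five-step data path above with in-order issue, a register file that can be written and read within the same cycle, and forwarding paths from the EX and ME outputs back into EX. The names `Instr` and `count_stalls` are made up for this illustration.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str                    # mnemonic, e.g. "ADD" or "LDR"
    dest: Optional[str]        # register written, or None (e.g. for STR)
    srcs: Tuple[str, ...]      # registers read


def count_stalls(program, forwarding):
    """Count stall cycles caused by read-after-write hazards.

    Assumed timing (a simplification):
    - without forwarding, a consumer's EX must be at least 3 cycles after
      the producer's EX (the consumer's ID overlaps the producer's WB)
    - with forwarding, an ALU result can feed the next cycle's EX (gap 1),
      while a loaded value is only ready after ME (gap 2)
    """
    last_writer = {}   # register -> (EX cycle of producer, producer op)
    prev_ex = 0        # EX cycle of the previous instruction
    stalls = 0
    for instr in program:
        earliest = prev_ex + 1          # EX cycle if there were no hazard
        ex = earliest
        for reg in instr.srcs:
            if reg in last_writer:
                producer_ex, producer_op = last_writer[reg]
                if forwarding:
                    gap = 2 if producer_op == "LDR" else 1
                else:
                    gap = 3
                ex = max(ex, producer_ex + gap)
        stalls += ex - earliest
        prev_ex = ex
        if instr.dest is not None:
            last_writer[instr.dest] = (ex, instr.op)
    return stalls


program = [
    Instr("ADD", "R1", ("R2", "R3")),   # produces R1
    Instr("SUB", "R4", ("R1", "R5")),   # needs R1 right away
    Instr("LDR", "R6", ("R4",)),        # needs R4 right away, loads R6
    Instr("ADD", "R7", ("R6", "R1")),   # needs the loaded R6 right away
]

print("stalls without forwarding:", count_stalls(program, forwarding=False))
print("stalls with forwarding:   ", count_stalls(program, forwarding=True))
```

Under these assumptions, the sequence needs 6 stall cycles without forwarding and 1 with it. The remaining stall comes from the load, whose value is not available until after the memory step, so forwarding reduces that hazard but does not remove it.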