# ECE 222: Digital Computers

## Exceptions

In ARM, anything that interrupts the normal control flow of a program is an exception.

- An **interrupt** from an **interrupt request (IRQ)** occurs when a peripheral wants to interrupt the current flow
- A **fault** indicates a CPU error (e.g., division by zero) and returns to the faulty instruction
- A **trap** runs its exception handler and returns to the next instruction

Exceptions are handled by running an exception handler and then returning to the interrupted program.

### Vector table

A vector table is an array of handler addresses. Each exception has a number (its "vector"), which is its index into the table, and a priority.

### Exception handling

First, in hardware: If the exception priority is higher than the current operating priority, the exception is immediately handled.

  - the current context is pushed to the stack
  - the operating mode is set to privileged
  - the operating priority is set to the exception priority
  - the program counter is set to the address of the exception handler (`vector_table[exception_num]`)

Next, the handler runs, and it should:

- preserve any of R4-R11 that it modifies
- clear the interrupt request (IRQ)
- restore R4-R11
- return with `BX LR`

Finally, in hardware:

- the previous context is restored
- the previous operating priority and mode are restored
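
The hardware entry and exit steps above can be modelled in a few lines of Python. This is a minimal sketch, not the real ARM mechanism: the `Cpu` class, its field names, the handler addresses, and the priority values are all hypothetical, and it assumes (as on Cortex-M) that a lower numeric value means a higher priority. The saved context is reduced to the program counter, mode, and priority.

```python
THREAD_PRIORITY = 256  # hypothetical "lowest" priority used by normal code


class Cpu:
    def __init__(self, vector_table, priorities):
        self.vector_table = vector_table  # exception number -> handler address
        self.priorities = priorities      # exception number -> priority
        self.pc = 0x1000                  # hypothetical address in the main program
        self.privileged = False
        self.priority = THREAD_PRIORITY
        self.stack = []

    def raise_exception(self, num):
        """Enter the handler if the exception outranks the current priority."""
        if self.priorities[num] >= self.priority:
            return False  # not urgent enough: the exception has to wait
        self.stack.append((self.pc, self.privileged, self.priority))  # push context
        self.privileged = True
        self.priority = self.priorities[num]
        self.pc = self.vector_table[num]  # vector_table[exception_num]
        return True

    def return_from_exception(self):
        """Restore the previous context, mode, and priority."""
        self.pc, self.privileged, self.priority = self.stack.pop()


cpu = Cpu(vector_table={22: 0x2000, 23: 0x3000}, priorities={22: 2, 23: 1})
cpu.raise_exception(22)        # the main program is interrupted by exception 22
cpu.raise_exception(23)        # exception 23 outranks 22, so it nests
print(hex(cpu.pc), len(cpu.stack))
cpu.return_from_exception()    # back in the handler for exception 22
cpu.return_from_exception()    # back in the main program
print(hex(cpu.pc), cpu.privileged)
```

Because each entry pushes the previous priority, returning from a nested handler drops back to the priority of the handler it interrupted rather than straight to the main program.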

!!! warning
    Interrupts can interrupt other interrupts, if their priority is sufficiently high!

!!! example
    How to set up interrupt-driven I/O:
    
    **Write the ISR:** Assuming that the IRQ bit is cleared when the address held in `R0` is read:
    
    ```asm
    ISR		PUSH	{R4-R11}	; save previous state onto stack
    		LDR		R3, [R0]	; clear the IRQ by reading from it
    		POP		{R4-R11}	; restore state
    		BX		LR			; return to original address
    ```
    
    **Store the interrupt handler in the vector table:** Assuming that the vector number is `22` and that the vector table starts 16 entries after address `0x00`:
    
    ```asm
    MOV32	R0, #ISR	; handler address
    MOV		R1, #38 * 4	; offset: (16 + 22) * 4 bytes per address
    STR		R0, [R1]	; store the handler address in the table entry
    ```
    
    **Enable interrupt requests:**
    
    ```asm
    MOV32	R0, #ADDRESS_INTERRUPT_ENABLE
    MOV		R1, #1
    STR		R1, [R0]	; enable interrupts
    ```
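
The offset arithmetic used when storing the handler can be checked with a short Python sketch. The constant and function names are hypothetical; the 16-entry gap, the 4-byte entry size, and vector number 22 are taken from the example's stated assumptions.

```python
ENTRY_BYTES = 4      # each table entry holds one 32-bit handler address
SKIPPED_ENTRIES = 16  # entries before the numbered vectors (from the example)


def vector_offset(vector_number, table_base=0x00):
    """Byte address of the table entry for the given vector number."""
    return table_base + (SKIPPED_ENTRIES + vector_number) * ENTRY_BYTES


offset = vector_offset(22)
print(offset, hex(offset))  # 152 0x98
```

This matches the `#38 * 4` operand in the assembly: (16 + 22) entries of 4 bytes each.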

## Processor design

Comparing the **complex instruction set computer** architecture to the **reduced instruction set computer** architecture:

| Property | CISC | RISC |
| ---- | ---- | ---- |
| ALU operand sources | registers, memory | registers only (load/store) |
| Addressing modes | complex | simple |
| Binary size | small | large (~30% larger) |
| Instruction size | variable | fixed |
| Pipelining | difficult | simple |

### Operation encoding

The **R-format** is used for operations of the form `ADD Rd, Rn, Rm`:

$$\underbrace{\text{op-code}}_\text{11 b}\ \ \overbrace{\text{Rm}}^\text{5 b}\ \ \underbrace{\text{shift amount}}_\text{6 b}\ \ \overbrace{\text{Rn}}^\text{5 b}\ \ \underbrace{\text{Rd}}_\text{5 b}$$

The **D-format** is used for operations of the form `LDR Rt, [Rn, #offset]`:

$$\underbrace{\text{op-code}}_\text{11 b}\ \ \overbrace{\text{offset}}^\text{9 b}\ \ 00\ \ \overbrace{\text{Rn}}^\text{5 b}\ \ \underbrace{\text{Rt}}_\text{5 b}$$

The **CB-format** is used for operations of the form `CBZ Rt, LABEL`:

$$\underbrace{\text{op-code}}_\text{8 b}\ \ \overbrace{\text{offset}}^\text{19 b}\ \ \underbrace{\text{Rt}}_\text{5 b}$$
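
To make the field layout concrete, the sketch below packs and unpacks an R-format word using the bit widths given above. The function names are hypothetical, and the `ADD` op-code value is an assumption taken from the LEGv8/ARMv8 encoding; check it against the course reference sheet before relying on it.

```python
def encode_r(opcode, rm, shamt, rn, rd):
    """Pack R-format fields (11 + 5 + 6 + 5 + 5 bits) into one 32-bit word."""
    assert 0 <= opcode < 2**11 and 0 <= shamt < 2**6
    assert all(0 <= r < 2**5 for r in (rm, rn, rd))
    return (opcode << 21) | (rm << 16) | (shamt << 10) | (rn << 5) | rd


def decode_r(word):
    """Split a 32-bit R-format word back into its fields."""
    return {
        "opcode": (word >> 21) & 0x7FF,
        "rm": (word >> 16) & 0x1F,
        "shamt": (word >> 10) & 0x3F,
        "rn": (word >> 5) & 0x1F,
        "rd": word & 0x1F,
    }


ADD_OPCODE = 0b10001011000  # assumed op-code for ADD

# ADD Rd, Rn, Rm with register numbers Rd = 9, Rn = 20, Rm = 21
word = encode_r(ADD_OPCODE, rm=21, shamt=0, rn=20, rd=9)
print(f"{word:#010x}")  # 0x8b150289
print(decode_r(word))
```

The D- and CB-formats can be handled the same way by changing the shift amounts and masks to match their field widths.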

### Instruction data path

To execute an instruction, the following steps are observed:

1. Instruction fetch (IF)
  - fetch the instruction from instruction memory
  - increment the instruction address (`PC += 4`), latched into the PC register at the end of the CPU cycle
2. Instruction decode (ID)
  - decode fields like the op-code, offset
  - read the registers named in the decoded fields
3. Execute (EX)
  - ALU calculates ADD, SUB, etc., as well as addresses for LDR/STR, and sets the zero status for CBZ
  - branch adder calculates any branch target addresses
4. Memory (ME)
  - if memory needs to be accessed, either `Write` or `Read` must be asserted to prepare for it
  - read from or write to memory
5. Writeback (WB)
  - write results to registers from memory, the ALU, or another register
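
The five steps above can be traced with a toy single-instruction interpreter. This is a sketch under several assumptions that are not from the notes: instructions are Python dictionaries rather than encoded words, branch offsets are in bytes relative to the current instruction, and the `step` function and its field names are hypothetical.

```python
def step(pc, imem, regs, dmem):
    """Run one instruction through IF, ID, EX, ME, WB and return the next PC."""
    # IF: fetch the instruction and compute the incremented address
    instr = imem[pc]
    next_pc = pc + 4

    # ID: decode the fields and read the registers they name
    op = instr["op"]
    rn_val = regs[instr.get("rn", 0)]
    rm_val = regs[instr.get("rm", 0)]
    rt_val = regs[instr.get("rt", 0)]
    offset = instr.get("offset", 0)

    # EX: the ALU produces a result or an address; the branch adder a target
    if op == "ADD":
        alu = rn_val + rm_val
    elif op == "SUB":
        alu = rn_val - rm_val
    elif op in ("LDR", "STR"):
        alu = rn_val + offset
    elif op == "CBZ":
        alu = rt_val
    else:
        raise ValueError(f"unsupported op: {op}")
    zero = alu == 0
    branch_target = pc + offset

    # ME: access data memory only for loads and stores
    loaded = None
    if op == "LDR":
        loaded = dmem.get(alu, 0)
    elif op == "STR":
        dmem[alu] = rt_val

    # WB: write the result back to the register file
    if op in ("ADD", "SUB"):
        regs[instr["rd"]] = alu
    elif op == "LDR":
        regs[instr["rt"]] = loaded

    return branch_target if (op == "CBZ" and zero) else next_pc


imem = {
    0: {"op": "LDR", "rt": 1, "rn": 0, "offset": 8},
    4: {"op": "ADD", "rd": 2, "rn": 1, "rm": 1},
    8: {"op": "STR", "rt": 2, "rn": 0, "offset": 16},
    12: {"op": "CBZ", "rt": 3, "offset": 8},
}
regs = [0] * 32
regs[0] = 100
dmem = {108: 21}

pc = 0
while pc in imem:
    pc = step(pc, imem, regs, dmem)
print(regs[1], regs[2], dmem[116], pc)
```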

### Performance

Each step in the instruction data path takes a different amount of time, so the clock period must be at least as long as the slowest step.

Performance is usually compared using the execution times of standard benchmarks, where:

$$\text{time}=n_{instructions}\times\underbrace{\frac{\text{cycles}}{\text{instruction}}}_\text{CPI}\times\frac{\text{seconds}}{\text{cycle}}$$
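
As a worked example of the formula, the sketch below computes an execution time and the minimum clock period for a set of step delays. All numbers (instruction count, CPI, clock rate, step delays) are made up for illustration, and the function names are hypothetical.

```python
def execution_time(n_instructions, cpi, clock_hz):
    """time = instructions x (cycles / instruction) x (seconds / cycle)."""
    return n_instructions * cpi * (1 / clock_hz)


def min_clock_period(step_delays):
    """The clock period must cover the slowest step."""
    return max(step_delays)


# Hypothetical benchmark: 1e9 instructions, CPI of 1.5, 2 GHz clock
t = execution_time(1_000_000_000, 1.5, 2_000_000_000)
print(f"{t:.2f} s")

# Hypothetical step delays in picoseconds for IF, ID, EX, ME, WB
period_ps = min_clock_period([200, 100, 200, 200, 100])
print(period_ps, "ps")
```

Halving the CPI or doubling the clock rate each halve the execution time, which is why all three factors have to be considered together when comparing designs.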

### Pipelining

Pipelining changes the granularity of a clock cycle to be per-step instead of per-instruction. This allows multiple instructions to be processed concurrently, each in a different step.

<img src="https://upload.wikimedia.org/wikipedia/commons/c/cb/Pipeline%2C_4_stage.svg" width=500> (Source: Wikimedia Commons)
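
The overlap can be quantified with a small sketch. It assumes an ideal pipeline (one step per cycle, no stalls or hazards) and compares it against running every step of one instruction before starting the next; the function names are hypothetical.

```python
def unpipelined_cycles(n_stages, n_instructions):
    """Each instruction finishes all of its steps before the next begins."""
    return n_stages * n_instructions


def pipelined_cycles(n_stages, n_instructions):
    """The first instruction fills the pipeline; each later one adds a cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def schedule(stages, n_instructions):
    """Map each cycle to the (instruction, stage) pairs active in that cycle."""
    table = {}
    for i in range(n_instructions):
        for s, stage in enumerate(stages):
            table.setdefault(i + s, []).append((i, stage))
    return table


stages = ["IF", "ID", "EX", "ME", "WB"]
print(unpipelined_cycles(len(stages), 100), pipelined_cycles(len(stages), 100))
for cycle, active in sorted(schedule(stages, 3).items()):
    print(cycle, active)
```

For long instruction streams the ideal speedup approaches the number of stages, since the fill time of the first instruction becomes negligible.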

### Data forwarding

If an instruction needs the result of a prior operation that has not yet been written back, a pipeline stall would normally be required to wait for that result (a **read-after-write** data hazard). However, a processor can mitigate this hazard by forwarding the prior instruction's result directly to the stalled instruction instead of waiting for it to reach the register file.
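
A minimal sketch of how such hazards can be spotted: the function below flags each instruction that reads a register written by the instruction immediately before it, which is where forwarding would apply. The instruction representation and the name `raw_hazards` are hypothetical, and only adjacent instructions are checked; a real pipeline also has to consider producers further back.

```python
def raw_hazards(program):
    """Return (producer, consumer, register) for adjacent read-after-write pairs."""
    hazards = []
    for i in range(len(program) - 1):
        producer, consumer = program[i], program[i + 1]
        dest = producer["dest"]
        if dest is not None and dest in consumer["srcs"]:
            hazards.append((i, i + 1, dest))
    return hazards


program = [
    {"text": "ADD R1, R2, R3", "dest": "R1", "srcs": ["R2", "R3"]},
    {"text": "SUB R4, R1, R5", "dest": "R4", "srcs": ["R1", "R5"]},  # reads R1
    {"text": "ADD R6, R7, R8", "dest": "R6", "srcs": ["R7", "R8"]},
]

for producer, consumer, reg in raw_hazards(program):
    print(f"forward {reg} from '{program[producer]['text']}' "
          f"to '{program[consumer]['text']}'")
```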