eifueo/docs/2a/ece222.md
2023-11-07 12:57:16 -05:00

4.7 KiB

ECE 222: Digital Computers

Exceptions

In ARM, anything that interrupts the normal control flow of a program is an exception.

  • An interrupt from an interrupt request (IRQ) occurs when a peripheral wants to interrupt the current flow
  • A fault indicates a CPU error (e.g., division by zero) and returns to the faulty instruction
  • A trap runs the interrupt handler and returns to the next instruction

Exceptions are handled by running an exception handler then returning to the original line.

Vector table

A vector table is an array of handler addresses. Each index contains a number (a “vector”) and a priority.

Exception handling

First, in hardware: If the exception priority is higher than the current operating priority, the exception is immediately handled.

  • the current context is pushed to the stack
  • the operating mode is set to privileged
  • the operating priority is set to the exception priority
  • the program counter is set to the address of the exception (vector_table[exception_num])

Next, the handler runs, and it should:

  • preserve the any R4-R11 it modifies
  • clear the interrupt request (IRQ)
  • restore R4-R11
  • return with BX LR

Finally, in hardware:

  • the previous context is restored
  • the previous operating priority and mode are restored

!!! warning Interrupts can interrupt other interrupts, if their priority is sufficiently high!

!!! example How to interrupt-driven I/O:

**Write the ISR:** Assuming that the IRQ bit is cleared if `R0` is read:

```asm
ISR     PUSH    {R4-R11}    ; save previous state onto stack
        LDR     R3, [R0]    ; clear the IRQ by reading from it
        POP     {R4-R11}    ; restore state
        BX      LR          ; return to original address
```

**Store the interrupt handler in the vector table:** Assuming that the vector number is `22` and the vector table starts 16 addresses after the 0x00:

```asm
MOV32   R0, #ISR    ; handler address
MOV     R1, #38 * 4 ; offset: (16 + 22) * 4 bytes per address
STR     R1, [R0]    ; save address to table
```

**Enable interrupt requests:**

```asm
MOV32   R0, #ADDRESS_INTERRUPT_ENABLE
MOV     R1, #1
STR     R1, [R0]    ; enable interrupts
```

Processor design

Comparing the complex instruction set computer architecture to the reduced instruction set computer architecture:

Task CISC RISC
ALU operands can come from? registers, memory registers (load/store)
Addressing mode complex simple
Binary size small large (~30% larger)
Instruction size variable fixed
Pipelining difficult simple

Operation encoding

The R-format is used for operations of the form ADD Rd, Rn, Rm:

\[\underbrace{\text{op-code}}_\text{11 b}\ \ \overbrace{\text{Rm}}^\text{5 b}\ \ \underbrace{\text{shift amount}}_\text{6 b}\ \ \overbrace{\text{Rn}}^\text{5 b}\ \ \underbrace{\text{Rd}}_\text{5 b}\]

The D-format is used for operations of the form LDR Rt, [Rn, #offset]:

\[\underbrace{\text{op-code}}_\text{11 b}\ \ \overbrace{\text{offset}}^\text{9 b}\ \ 00\ \ \overbrace{\text{Rn}}^\text{5 b}\ \ \underbrace{\text{Rt}}_\text{5 b}\]

The CB-format is used for operations of the form CBZ Rt, LABEL:

\[\underbrace{\text{op-code}}_\text{8 b}\ \ \overbrace{\text{offset}}^\text{19 b}\ \ \underbrace{\text{Rt}}_\text{5 b}\]

Instruction data path

To execute an instruction, the following steps are observed:

  1. Instruction fetch (IF)
  • fetch the instruction from instruction memory
  • increment the instruction address (PC += 4), latchedd into PC register at the end of the CPU cycle
  1. Instruction decode (ID)
  • decode fields like the op-code, offset
  • read recoded registers
  1. Execute (EX)
  • ALU calculates ADD, SUB, etc, as well as addresses for LDR/STR, sets zero status for CBZ
  • branch adder calculates any branch target addresses
  1. Memory (ME)
  • if memory needs to be reached, either Write or Read must be asserted to prepare for it
  • write to memory
  1. Writeback (WB)
  • write results to registers from memory, the ALU, or another register

Performance

Each step in the instruction data path has a varying time, so the clock period must be at least as long as the slowest step.

Performance is usually compared by comparing the execution times of standard benchmarks, such that:

\[\text{time}=n_{instructions}\times\underbrace{\frac{\text{cycles}}{\text{instruction}}}_\text{CPI}\times\frac{\text{seconds}}{\text{cycle}}\]

Pipelining

Pipelining changes the granularity of a clock cycle to be per step, instead of per-instruction. This allows multiple instructions to be processed concurrently.

(Source: Wikimedia Commons)