RISC V Pipelined Processor

In this blog post, we will explore the implementation of a pipelined RISC-V processor using Verilog, a hardware description language.

A pipelined processor is a symphony of synchronized stages

In this blog we'll be looking at the implementation of a 4-stage pipelined processor.

The processor supports:

  1. addition (add)
  2. shift left logical (sll)
  3. unconditional jump (j)

The processor implements forwarding to resolve data hazards.

Inputs
  1. Reset
  2. Clock
Outputs

None

Components
  1. Instruction Fetch Unit (IF)
  2. Register File (eight 8-bit registers)
  3. Execution Unit (EX)
  4. Writeback Unit (WB)

Read and write operations on the register file can happen simultaneously and should be independent of the clock.
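
To make the register file behaviour concrete, here is a minimal behavioural sketch in Python rather than Verilog. It is not the code from this project: the class name, method names and the `reg_write` flag are assumptions chosen for illustration. Reads return the current contents immediately, and writes take effect as soon as the write enable is set, with no clock involved.

```python
class RegisterFile:
    """Behavioural model of eight 8-bit registers (illustrative, not the post's Verilog)."""

    NUM_REGS = 8
    MASK = 0xFF  # keeps every stored value within 8 bits

    def __init__(self, initial=None):
        # 'initial' stands in for predefined values loaded into the register file.
        values = initial if initial is not None else [0] * self.NUM_REGS
        if len(values) != self.NUM_REGS:
            raise ValueError("expected exactly 8 initial values")
        self.regs = [v & self.MASK for v in values]

    def read(self, index):
        # Clock-independent read: returns the current contents immediately.
        return self.regs[index]

    def write(self, index, value, reg_write=True):
        # Clock-independent write: takes effect at once when reg_write is set.
        if reg_write:
            self.regs[index] = value & self.MASK


rf = RegisterFile(initial=[1, 2, 3, 4, 5, 6, 7, 8])
rf.write(2, 0x1FF)                   # truncated to 8 bits before being stored
rf.write(3, 0x55, reg_write=False)   # ignored because the write enable is low
```

Because nothing here waits for a clock edge, a read issued right after a write to the same register already sees the new value.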

Pipelining

Pipelining is a technique used to enhance processor performance by overlapping instruction execution. It divides the processor's execution path into several stages, allowing multiple instructions to be processed simultaneously. Each stage performs a specific operation, such as instruction fetch, decode, execute and writeback. By breaking down the execution into smaller tasks, pipelining significantly improves the processor's throughput and performance.
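
The throughput gain can be estimated with a small calculation. The sketch below assumes an ideal pipeline (one cycle per stage, no stalls or flushes) and compares it with a design that finishes each instruction completely before starting the next; both function names are made up for this example.

```python
def cycles_unpipelined(num_instructions, num_stages):
    # Each instruction occupies the whole datapath for num_stages cycles.
    return num_instructions * num_stages


def cycles_pipelined(num_instructions, num_stages):
    # The first instruction needs num_stages cycles to fill the pipeline;
    # every later instruction completes one cycle after the previous one.
    if num_instructions == 0:
        return 0
    return num_stages + num_instructions - 1


sequential = cycles_unpipelined(10, 4)   # 10 instructions, 4 stages
overlapped = cycles_pipelined(10, 4)
```

For 10 instructions on a 4-stage pipeline this gives 40 cycles without pipelining and 13 cycles with it. Hazards and jumps reduce the gain in practice.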

Pipelined Registers

For pipelining to work, the results of each functional unit must be stored at the end of every clock cycle so that the next unit in the pipeline can pick them up in the following cycle. For this we'll be using pipeline registers:

  1. IF/ID
  2. ID/EX
  3. EX/WB

When reset is activated, the program counter and the pipeline registers are initialized to 0, and the instruction memory and register file are loaded with predefined values.

When the instruction fetch unit starts fetching the first instruction, the pipeline registers do not yet contain any meaningful instruction data.

When the second instruction is being fetched by the IF unit, the IF/ID register holds the instruction code of the first instruction.

When the third instruction is being fetched by the IF unit, the IF/ID register contains the instruction code of the second instruction, the ID/EX register contains information relevant to the first instruction, and so on. This, in summary, is the essence of pipelining.
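
The flow of instructions through the pipeline registers can be traced with a short Python sketch. It is only a model of the description above, not the Verilog from this project: the function name is hypothetical, instructions are plain labels, `None` marks a register that holds no valid instruction yet, and jumps are ignored.

```python
def simulate(program, cycles):
    """Return, per cycle, what IF is fetching and what each pipeline register holds."""
    if_id = id_ex = ex_wb = None   # nothing valid in the pipeline registers yet
    pc = 0
    trace = []
    for cycle in range(1, cycles + 1):
        fetching = program[pc] if pc < len(program) else None
        trace.append((cycle, fetching, if_id, id_ex, ex_wb))
        # Clock edge: every register takes over the value of the stage before it.
        ex_wb = id_ex
        id_ex = if_id
        if_id = fetching
        pc += 1
    return trace


trace = simulate(["I1", "I2", "I3"], cycles=4)
for row in trace:
    print(row)
```

In the third row of the trace, IF is fetching `I3` while IF/ID holds `I2` and ID/EX holds `I1`, which is exactly the situation described above.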

We will further assume an 8-bit program counter. Addresses and data are 8 bits wide as well.

Instruction Classes

The first class of instructions operates on two registers:

add R2, R0 (R2 ← R2 + R0)

The second class of instructions takes a register and an immediate value:

sll R0, 4 (R0 ← R0 << 4)

Finally, the jump instruction looks like this:

j L1 (the jump address is calculated using pseudo-direct addressing)
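
Pseudo-direct addressing builds the jump target by keeping the upper bits of the program counter and replacing the lower bits with the address field of the instruction. The post does not state how wide that field is, so the sketch below assumes a 6-bit field inside the 8-bit address purely for illustration; the constant and function names are hypothetical as well.

```python
PC_BITS = 8           # program counter width stated in the post
ADDR_FIELD_BITS = 6   # ASSUMPTION: width of the jump address field


def pseudo_direct_target(pc, addr_field):
    """Combine the upper PC bits with the instruction's address field."""
    low_mask = (1 << ADDR_FIELD_BITS) - 1
    high_mask = ((1 << PC_BITS) - 1) & ~low_mask
    return (pc & high_mask) | (addr_field & low_mask)


# PC = 0b1000_0101 and address field = 0b01_0011
target = pseudo_direct_target(0b10000101, 0b010011)
```

Here the two upper bits `10` come from the program counter and the six lower bits `010011` from the instruction, giving `0b10010011`.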

Processor Overview
Overview of Final Design

Before delving into the gory details of every module, let's look at all the components and their connections in all their glory. We'll first look at a simplified diagram and then at a more detailed one.

Overview of the processor architecture generated with Quartus Prime

As we can see, the clock and reset inputs get fed to the instruction fetch, instruction decode and pipeline registers. A more detailed diagram is shown below:

Detailed processor functional architecture generated with Quartus Prime

The blocks individually are as shown below:

  1. Instruction Fetch Unit
    The IF stage is responsible for fetching instructions from memory based on the program counter (PC) value. It retrieves the instruction from memory and passes it to the next stage, the Instruction Decode (ID) stage. In Verilog, this stage involves reading the instruction memory and updating the program counter accordingly.
Instruction Fetch Unit
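
As a rough behavioural model of the fetch stage (Python, not the project's Verilog), the sketch below reads the instruction at the current PC and then updates the PC. It assumes one instruction per address, so the PC advances by 1, and it takes the jump decision and target as hypothetical inputs supplied from outside.

```python
class FetchUnit:
    PC_MASK = 0xFF  # 8-bit program counter

    def __init__(self, program):
        # Pad instruction memory to the 256 addresses an 8-bit PC can reach.
        self.imem = list(program) + [None] * (256 - len(program))
        self.pc = 0

    def reset(self):
        # Reset puts the program counter back to 0.
        self.pc = 0

    def fetch(self, jump=False, target=0):
        # Read the instruction at the current PC, then update the PC.
        instruction = self.imem[self.pc]
        if jump:
            self.pc = target & self.PC_MASK
        else:
            self.pc = (self.pc + 1) & self.PC_MASK  # wraps around after 255
        return instruction


fu = FetchUnit(["add R2, R0", "sll R0, 4", "j L1"])
first = fu.fetch()                       # PC moves from 0 to 1
second = fu.fetch(jump=True, target=0)   # PC is redirected to 0
third = fu.fetch()                       # fetches address 0 again
```

The returned instruction is what would be latched into the IF/ID register at the end of the cycle.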

2. IF/ID Register File:

IF/ID Register File

3. Instruction Decode Unit
 In the ID stage, the fetched instruction is decoded to determine the operation to be performed. It involves extracting the opcode and operands from the instruction and fetching the corresponding values from the register file.

Instruction Decode Unit

4. ID/EX Register File

ID/EX Register File

5. Forwarding Units

Forwarding Unit

6. Execution Unit
 The EX stage executes the arithmetic and logical operations specified by the instruction. It performs calculations, such as addition and logical shifts, on the operands obtained from the ID stage. The ALU (Arithmetic Logic Unit) is a critical component of this stage.

Execution Unit
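
A minimal Python sketch of the ALU for the two arithmetic instructions follows. It assumes that an ALU_op value of 0 selects addition and 1 selects shift left logical, which is inferred from the control signal table in this post rather than taken from the Verilog source. Results are truncated to the 8-bit data width.

```python
ALU_ADD = 0   # assumed meaning of ALU_op = 0
ALU_SLL = 1   # assumed meaning of ALU_op = 1
DATA_MASK = 0xFF


def alu(alu_op, operand_a, operand_b):
    """Compute an 8-bit result for add or shift left logical."""
    if alu_op == ALU_ADD:
        result = operand_a + operand_b
    elif alu_op == ALU_SLL:
        result = operand_a << operand_b
    else:
        raise ValueError("unsupported ALU operation")
    return result & DATA_MASK   # drop any bits beyond the 8-bit data width


sum_result = alu(ALU_ADD, 200, 100)      # 300 does not fit in 8 bits
shift_result = alu(ALU_SLL, 0x0F, 4)
```

The addition wraps around to 44 because 300 exceeds the 8-bit range, and the shift produces `0xF0`.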

7. EX/WB Register File

EX/WB Register File

8. Write Back Unit
 The final stage, WB, is responsible for writing the result back to the register file. It receives the data from the previous stage and updates the destination register with the computed value. This stage ensures that the final result is correctly stored for future instructions or output.

Write Back Unit

Control Signals

Instruction   ALU_Src   ALU_op   RegWrite
add           0         0        1
sll           1         1        1
j             X         X        0
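
The control signals above can be written as a small lookup table. The Python sketch below is illustrative only: `None` stands for a don't-care value (X), and the meaning of ALU_Src (0 selects the register operand, 1 selects the immediate) is an assumption inferred from the fact that only sll carries an immediate.

```python
CONTROL = {
    "add": {"ALU_Src": 0,    "ALU_op": 0,    "RegWrite": 1},
    "sll": {"ALU_Src": 1,    "ALU_op": 1,    "RegWrite": 1},
    "j":   {"ALU_Src": None, "ALU_op": None, "RegWrite": 0},  # None = don't care
}


def control_signals(mnemonic):
    """Look up the control signals for one instruction mnemonic."""
    return CONTROL[mnemonic]


def select_operand_b(alu_src, register_value, immediate):
    # ASSUMPTION: ALU_Src = 1 picks the immediate, 0 picks the register value.
    return immediate if alu_src == 1 else register_value


sll_ctrl = control_signals("sll")
operand_b = select_operand_b(sll_ctrl["ALU_Src"], register_value=7, immediate=4)
```

The jump instruction never writes a register, so RegWrite is 0 and the ALU signals do not matter for it.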

Data Hazards

A data hazard is detected when either of the following holds:

  1. The destination register in the EX/WB pipeline register is the same as the destination register in the ID/EX pipeline register
    OR
  2. The destination register in the EX/WB pipeline register is the same as the source register in the ID/EX pipeline register.
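
The two conditions above can be sketched as a small check. The first condition matters because the destination register doubles as an operand (add R2, R0 reads R2 as well as writing it). The Python below is a behavioural illustration with hypothetical names, and the extra requirement that the older instruction actually writes a register (RegWrite set) is an assumption that the post does not state.

```python
def detect_hazard(ex_wb_dest, id_ex_dest, id_ex_src, ex_wb_reg_write=True):
    """Return (forward_dest_operand, forward_src_operand)."""
    if not ex_wb_reg_write:
        # ASSUMPTION: an instruction that writes no register causes no hazard.
        return (False, False)
    return (ex_wb_dest == id_ex_dest, ex_wb_dest == id_ex_src)


def forward(register_value, ex_wb_result, use_forwarded):
    # Pick the newer EX/WB result instead of the stale register file value.
    return ex_wb_result if use_forwarded else register_value


# add R2, R0 sits in EX/WB while sll R2, 4 sits in ID/EX (no source register).
fwd_dest, fwd_src = detect_hazard(ex_wb_dest=2, id_ex_dest=2, id_ex_src=None)
operand = forward(register_value=5, ex_wb_result=9, use_forwarded=fwd_dest)
```

In this example the shift must use the forwarded sum (9) rather than the old contents of R2 (5), so the first flag is set and the second is not.
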
Conclusion

Implementing a pipelined RISC-V processor in Verilog provides a deeper understanding of processor architecture and the intricacies of instruction execution. Breaking execution into separate stages, connecting them with pipeline registers and resolving data hazards through forwarding are the same building blocks that larger, higher-performance designs rely on.

All the code for this processor architecture can be found here