Let us design a simple MIPS-based processor and write Verilog code for it.
Let us first individually examine the typical components of a generic processor and then put them all together to build the complete design of the processor.
We will be designing a 32-bit processor. This means that we will be handling one 32-bit word (4 bytes) of data at a time.
We will use a RISC Instruction Set Architecture. In this design, each instruction executes in exactly 1 clock cycle.
Let us build a simple processor that supports only the following instructions:
AND, OR, ADD, SUB, SLT, NOR, LW, SW, BEQ.
The processor broadly consists of a Datapath and a Control Unit.
Datapath refers to all the elements that work with data and process them. We will look at all the datapath components in detail in this section.
Control Unit generates control signals in order to direct the processor operation under different situations. We will see this in the next section.
Single clock cycle vs Pipelined execution:
Not all instructions take the same time to execute. Hence, for single clock cycle execution, we need to ensure that the clock period is long enough to accommodate the execution of the slowest instruction.
To speed up execution, the pipelining technique is commonly used, where multiple instructions are overlapped in execution. Pipelining, however, is prone to different 'hazards', which we will not discuss here.
Here, we have considered single clock cycle execution.
Components of the processor datapath:
1. Instruction Memory:
- To store the instructions of a program.
- Given the address, it must supply the instruction located at that address.
- We can load the instruction memory from a mem file using the $readmemh system task. The contents of the mem file will be explained in a later section of the tutorial.
module Instruction_Memory(
    instrn_address,
    instrn
);
    input [31:0] instrn_address;
    output wire [31:0] instrn;

    // Byte-addressable memory: 32 bytes hold 8 instructions of 32-bit width
    reg [7:0] instrn_mem [31:0];

    initial begin
        $readmemh("instrn_memory.mem", instrn_mem); // load initial values
    end

    // Assemble one 32-bit instruction from 4 consecutive bytes
    // (the byte at the lowest address is the least significant byte)
    assign instrn = {instrn_mem[instrn_address+3], instrn_mem[instrn_address+2],
                     instrn_mem[instrn_address+1], instrn_mem[instrn_address]};
endmodule
2. Program Counter (PC):
- Holds the address of the current instruction.
- Since each instruction is 32 bits (4 bytes) wide, we must increment the address by 4 to fetch the next instruction. The adder module that does this will be connected to the PC.
- The address must update at every clock cycle, so the PC is implemented as a 32-bit register of D flip-flops.
- A new instruction is executed every clock cycle.
module Program_Counter(
    clk,
    rst_n,
    in_address,
    out_address
);
    input clk, rst_n;
    input [31:0] in_address;
    output reg [31:0] out_address;

    always @ (posedge clk or negedge rst_n) begin
        if (!rst_n)
            out_address <= 32'd0;
        else
            out_address <= in_address;
    end
endmodule
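The PC description mentions an adder that produces the address of the next instruction, but no code is given for it. A minimal sketch is shown here; the module name PC_Adder and its port names are assumptions made for illustration and are not taken from the original design.

```verilog
// Hypothetical PC + 4 adder (module and port names are illustrative)
module PC_Adder(
    pc_in,
    pc_plus4
);
    input  [31:0] pc_in;         // current instruction address from the PC
    output wire [31:0] pc_plus4; // address of the next sequential instruction

    // Each instruction is 4 bytes wide, so the next one is 4 bytes further on
    assign pc_plus4 = pc_in + 32'd4;
endmodule
```

When no branch is taken, pc_plus4 would drive the in_address input of Program_Counter, so the PC advances by 4 on every rising clock edge.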
3. Register File:
- This module encloses all the general-purpose registers of the processor and supports read and write operations on them.
- MIPS has 32 built-in registers, as shown in the table below. We will use the same configuration for our design.
- R-format instructions have three register operands, so we need to read 2 words from the register file (2 read ports) and write 1 word (1 write port), along with a write enable signal that indicates when the data has to be written.
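- Example instruction: add $t1, $t2, $t3
To execute this instruction as described here, we read the two registers $t1 and $t2, add them, and then write the result to register $t3. Note that this lists the operands in source, source, destination order; standard MIPS assembler syntax puts the destination first, so the same operation would be written add $t3, $t1, $t2.
- In the Verilog code below, the read is combinational, done from the register memory using assign statements. In many designs, however, the read is registered (flopped), so the read data appears only after 1 clock cycle.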
- We can load the register memory from a mem file using the $readmemh system task.
module Register_File(
    clk,
    rst_n,
    read_addr1,
    read_addr2,
    write_en,
    write_addr,
    write_data,
    read_data1,
    read_data2
);
    input clk;
    input rst_n;
    input [4:0] read_addr1;
    input [4:0] read_addr2;
    input write_en;
    input [4:0] write_addr;
    input [31:0] write_data;
    output wire [31:0] read_data1;
    output wire [31:0] read_data2;

    // 32 registers of 32-bit width
    reg [31:0] reg_mem [31:0];

    initial begin
        $readmemh("reg_memory.mem", reg_mem); // Load initial values
    end

    // Combinational read ports
    assign read_data1 = reg_mem[read_addr1];
    assign read_data2 = reg_mem[read_addr2];

    // Synchronous write port; reset leaves the register contents unchanged
    always @ (posedge clk or negedge rst_n) begin
        if (!rst_n) begin
            reg_mem[write_addr] <= reg_mem[write_addr];
        end else if (write_en) begin
            reg_mem[write_addr] <= write_data;
        end
    end
endmodule
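For comparison with the combinational read used above, here is a rough sketch of what a registered (flopped) read could look like. The module name Register_File_SyncRead is hypothetical, and the reset input and the $readmemh initialisation are left out to keep the sketch short.

```verilog
// Hypothetical variant with registered read ports (illustrative only)
module Register_File_SyncRead(
    clk,
    read_addr1,
    read_addr2,
    write_en,
    write_addr,
    write_data,
    read_data1,
    read_data2
);
    input clk;
    input [4:0] read_addr1;
    input [4:0] read_addr2;
    input write_en;
    input [4:0] write_addr;
    input [31:0] write_data;
    output reg [31:0] read_data1;
    output reg [31:0] read_data2;

    reg [31:0] reg_mem [31:0];

    always @ (posedge clk) begin
        if (write_en)
            reg_mem[write_addr] <= write_data;
        // Registered reads: the data for an address presented now
        // becomes visible only after the next rising clock edge.
        // A read of the register being written returns the old value.
        read_data1 <= reg_mem[read_addr1];
        read_data2 <= reg_mem[read_addr2];
    end
endmodule
```

A single-cycle design needs the operands in the same cycle in which the instruction is fetched, which is why the combinational read fits the processor built in this tutorial.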
4. ALU:
- The ALU performs the required operations on the data provided to it.
- For our processor, we need the following operations: add, subtract, and, or, nor, and set on less than (for SLT).
- The post on ALU design using the MIPS instruction set explains this ALU design. We will use the same ALU here.
AND, OR, ADD, SUB, SLT, NOR
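Since the ALU code itself lives in a separate post, here is a minimal sketch of an ALU covering the six operations listed above. The module name ALU_Sketch, the port names, and the 4-bit control encoding (which follows a common textbook convention) are assumptions; the ALU from the referenced post may differ in all of these.

```verilog
// Hypothetical ALU sketch (names and control encoding are assumptions)
module ALU_Sketch(
    a,
    b,
    alu_ctrl,
    result,
    zero
);
    input  [31:0] a;
    input  [31:0] b;
    input  [3:0]  alu_ctrl;
    output reg [31:0] result;
    output wire zero;

    always @ (*) begin
        case (alu_ctrl)
            4'b0000: result = a & b;     // AND
            4'b0001: result = a | b;     // OR
            4'b0010: result = a + b;     // ADD (also used for lw/sw addresses)
            4'b0110: result = a - b;     // SUB (also used for the beq comparison)
            4'b0111: result = ($signed(a) < $signed(b)) ? 32'd1 : 32'd0; // SLT
            4'b1100: result = ~(a | b);  // NOR
            default: result = 32'd0;     // unused encodings
        endcase
    end

    // High when the result is all zeros, e.g. when a - b == 0 for beq
    assign zero = (result == 32'd0);
endmodule
```

The zero output is what a branch-on-equal check would use: subtracting two equal register values gives a zero result.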
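5. Sign Extension Unit:
- See the format of the lw and sw instructions, written here in the order base register, data register, offset:
lw $t1, $t2, offset
sw $t1, $t2, offset
where offset is a signed 16-bit value. (Standard MIPS assembler syntax writes these as lw $t2, offset($t1) and sw $t2, offset($t1).)
- Since we are working with 32-bit values, we need to sign extend the 16-bit offset to 32 bits, and so we require a Sign Extension Unit.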
module Sign_Extension(
    bits16_in,
    bits32_out
);
    input [15:0] bits16_in;
    output wire [31:0] bits32_out;

    // Replicate the sign bit (bit 15) into the upper 16 bits
    assign bits32_out = {{16{bits16_in[15]}}, bits16_in[15:0]};
endmodule
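To see what sign extension does to a positive and a negative offset, the small simulation sketch below applies the same replication expression to two sample values. The module name Sign_Extension_Example and the chosen values are illustrative only.

```verilog
// Hypothetical stand-alone example of 16-bit to 32-bit sign extension
module Sign_Extension_Example;
    reg  [15:0] offset;
    // Same expression as in the sign extension unit
    wire [31:0] extended = {{16{offset[15]}}, offset};

    initial begin
        offset = 16'h0004;   // +4: sign bit is 0, upper bits become 0
        #10 $display("%h -> %h", offset, extended); // extended = 32'h00000004
        offset = 16'hFFFC;   // -4: sign bit is 1, upper bits become 1
        #10 $display("%h -> %h", offset, extended); // extended = 32'hFFFFFFFC
    end
endmodule
```

In both cases the 32-bit value represents the same signed number as the 16-bit input, which is what allows a negative offset to reach addresses below the base address.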
6. Data Memory:
- For lw and sw instructions, we compute a data memory address from which we fetch data or to which we store data.
- lw $t1, $t2, offset means
Fetch the data from the data memory address (value present in $t1 + sign-extended offset value) and store it in $t2
- sw $t1, $t2, offset means
Store the data present in $t2 at the calculated data memory address (value present in $t1 + sign-extended offset value)
- Below is the Verilog code. It is similar to the other memories we have designed above.
- We can load the data memory from a mem file using the $readmemh system task.
module Data_Memory(
    clk,
    address,
    write_en,
    write_data,
    read_data
);
    input clk;
    input [31:0] address;
    input write_en;
    input [31:0] write_data;
    output wire [31:0] read_data;

    // Byte-addressable memory: 32 bytes hold 8 words of 32-bit width
    reg [7:0] data_mem [31:0];

    initial begin
        $readmemh("data_memory.mem", data_mem);
    end

    // Combinational read: assemble one word from 4 consecutive bytes
    // (the byte at the lowest address is the least significant byte)
    assign read_data = {data_mem[address+3], data_mem[address+2],
                        data_mem[address+1], data_mem[address]};

    // Synchronous write: store the word one byte at a time
    always @ (posedge clk) begin
        if (write_en) begin
            data_mem[address]   <= write_data[7:0];
            data_mem[address+1] <= write_data[15:8];
            data_mem[address+2] <= write_data[23:16];
            data_mem[address+3] <= write_data[31:24];
        end
    end
endmodule
7. Shifter:
- The beq instruction compares two register values, $t1 and $t2, for equality (this can be done by subtracting them in the ALU and checking for a zero result).
- If the condition is false, we directly execute the next instruction (no issues here).
- If the condition is true, we branch to the instruction at address = next instruction address + 18-bit offset (the 16-bit offset shifted left by 2 bits, with its sign preserved). To enable this shifting, we require a shifter module as well. For the address addition, we require another small adder.
- We shift left by 2 bits because the offset counts instructions, and each instruction occupies 4 bytes of address space.
Example: Offset 1 means 1 << 2 = 4, meaning the branch target is next instruction address + 4
module Shifter(
    indata,
    shift_amt,
    shift_left,
    outdata
);
    input [31:0] indata;
    input [1:0] shift_amt;
    input shift_left;
    output wire [31:0] outdata;

    // shift_left = 1 shifts left, shift_left = 0 shifts right
    assign outdata = shift_left ? indata << shift_amt : indata >> shift_amt;
endmodule
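The branch address calculation described above (sign extend the offset, shift it left by 2, add it to the next instruction address) can be sketched in one place as follows. The module name Branch_Target and its port names are assumptions for illustration; the fixed shift by 2 used here corresponds to driving the Shifter with shift_amt = 2 and shift_left = 1.

```verilog
// Hypothetical branch target calculation (names are illustrative)
module Branch_Target(
    pc_plus4,
    offset16,
    branch_target
);
    input  [31:0] pc_plus4;       // address of the next sequential instruction
    input  [15:0] offset16;       // signed 16-bit offset from the beq instruction
    output wire [31:0] branch_target;

    // Sign extend to 32 bits, then multiply by 4 (shift left by 2)
    wire [31:0] offset_ext     = {{16{offset16[15]}}, offset16};
    wire [31:0] offset_shifted = offset_ext << 2;

    // Separate adder for the branch address
    assign branch_target = pc_plus4 + offset_shifted;
endmodule
```

For example, with pc_plus4 = 8 and an offset of 1, the target is 8 + 4 = 12. With an offset of -2 (16'hFFFE), the shifted offset is -8, so the target is 0, which is a backward branch.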