Saturday, October 24, 2020

ASIC Synthesis using Synopsys Design Compiler

Synthesis is the process of converting synthesizable HDL code into a gate-level netlist, using specifications and parameters obtained from the design library.

Let us quickly go through the basics:
ASIC stands for Application-Specific Integrated Circuit. Unlike an FPGA, here we start from a blank slate and build the design from the generated netlist, using the component specifications obtained from the design library.

Synthesis Basics


A netlist is a gate-level representation of a design, generated from its HDL description. The design code must use only synthesizable constructs.

The Design Library comprises the design elements (or cells), together with their logic functionality, that are used to build the design.

It is extremely cumbersome to build an ASIC starting from the most basic component, the transistor. Hence, blocks that implement basic logic functions (e.g. AND, OR) or storage functions (e.g. flip-flops) are made readily available; these are called standard cells.

We also need to specify a set of constraints, called Design Constraints, so that the design is synthesized according to our requirements.
Design Constraints can be of various types, such as:
1. Area constraints
2. Power constraints
3. Timing constraints
4. Environmental constraints

The Synopsys Design Constraints (SDC) commands are documented in the SDC user guide. Most of them are also supported by the Cadence Genus synthesis tool.

Synopsys Design Compiler


For detailed information about the commands and the complete flow, refer to the Synopsys Design Compiler User Guide.
Design Compiler can perform synthesis as well as timing analysis.

Libraries used for synthesis 
  1. Technology Library: Describes the characteristics and function of each cell. The basic logic function, timing and power values are obtained from this library.
  2. Symbol Library: Provides the symbols used to draw the design schematic.
  3. Synthetic Library: A Synopsys-specific library (e.g. DesignWare) of reusable design blocks.
Before performing synthesis, the libraries must be loaded into the tool.
The design must use only synthesizable constructs in an HDL such as Verilog or VHDL. All design and include files must be specified to the tool.
Set the following using Tcl commands:
  1. set target_library (.db), symbol_library (.sdb), synthetic_library (.sldb), link_library
  2. set search_path variable to the library folder
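
As a rough sketch, the library setup above might look like this in dc_shell. The directory ./libs and the file names stdcells_typ.db and stdcells.sdb are made-up placeholders, not names from any real PDK; dw_foundation.sldb is the usual DesignWare foundation library, assuming it is licensed at your site.

```tcl
# Hypothetical library directory: replace with the location of your own libraries
set search_path [concat $search_path ./libs]

set target_library    "stdcells_typ.db"      ;# hypothetical technology library (.db)
set symbol_library    "stdcells.sdb"         ;# hypothetical symbol library (.sdb)
set synthetic_library "dw_foundation.sldb"   ;# DesignWare foundation library (.sldb)

# "*" tells the linker to look at designs already in memory first,
# then at the technology and synthetic libraries
set link_library [concat "*" $target_library $synthetic_library]
```

Because search_path is set, the library variables can hold bare file names instead of full paths.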

Reading the Design


The HDL design can be read using the analyze and elaborate commands, or both steps can be done at once using the read_file command.
To read multiple design files, pass a list of files as the file_list argument.
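
The two ways of reading a design could be sketched as follows. The file name clkdiv.v and the module name clkdiv are taken from the example later in this post; use only one of the two options in a real script.

```tcl
# Option 1: analyze checks the syntax and stores an intermediate form,
# elaborate then builds the design from it
analyze -format verilog {clkdiv.v}
elaborate clkdiv

# Option 2 (alternative to Option 1): read and elaborate in a single step
# read_file -format verilog {clkdiv.v}

# Select the top-level design and resolve its references against link_library
current_design clkdiv
link
```

For several source files, put them all inside the braces, separated by spaces.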

Design Environment Constraints


Before design optimization, the design environment must be defined using the commands shown in the figure (image source: SDC User guide).
Meaning of above constraints:
  1. set_operating_conditions: This constraint specifies the PVT (process, voltage and temperature) conditions under which the design is analyzed, so that variations in them are accounted for.
  2. set_wire_load_mode: There are three wire load modes: top, enclosed and segmented. The simplest is top, which uses the same wire load model for all modules in the design.
  3. set_wire_load_model: This selects the wire load model, e.g. 10x10. The report_lib command lists the wire load models available in a specific technology library.
  4. set_drive: This command sets the drive resistance of an input port directly as a value. To model the drive of an existing library cell instead, use set_driving_cell.
    Note 1: By default, the drive resistance of an input port is zero, meaning infinite drive strength.
    Note 2: Transition time at input port = Drive resistance x Capacitive load at input port
  5. set_load: This command sets a capacitive load value on the input and output ports of the design.
    Note: By default, the capacitive load is zero on all ports.
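
A minimal sketch of these environment constraints is given below. The operating condition name WCCOM and all numeric values are assumptions for illustration; the real names and units come from your technology library (check them with report_lib). The port names clk, rst and N are those of the clkdiv example later in this post.

```tcl
# Hypothetical operating condition name: pick one defined in your library
set_operating_conditions WCCOM

# Use one wire load model for the whole hierarchy
set_wire_load_mode top
set_wire_load_model -name "10x10"

# Ideal (zero resistance) drive on the clock, a finite drive on the data inputs
set_drive 0   [get_ports clk]
set_drive 1.5 [get_ports {rst N[*]}]   ;# illustrative value, library resistance units

# Capacitive load seen by every output port
set_load 0.05 [all_outputs]            ;# illustrative value, library capacitance units
```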

Timing Constraints


After setting the above environmental constraints, further design constraints can be added to guide synthesis and optimization of the design.
These are the primary timing constraints:
  1. create_clock: This constraint defines a clock with the desired period (and hence frequency).
  2. set_input_delay: This specifies the delay, relative to a clock edge, after which a signal arrives at an input port, i.e. the time consumed outside the module before the signal enters it.
  3. set_output_delay: This specifies the delay, relative to a clock edge, that a signal still needs outside the module after it leaves an output port.
  4. set_clock_groups: This constraint declares the relationship between groups of clocks, e.g. that they are asynchronous to each other.
There are many other max/min constraints (e.g. set_max_transition, set_max_fanout) that can be added to keep values within the required range.
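
The timing constraints might be written roughly as follows. Note that the SDC command for declaring clock relationships is set_clock_groups. The second clock clk_b is purely hypothetical and is left commented out; all numbers are illustrative and use the time units of the library.

```tcl
# Clock on port clk with a period of 10 time units
create_clock -name clk -period 10 [get_ports clk]

# Constrain every input except the clock port itself
set data_inputs [remove_from_collection [all_inputs] [get_ports clk]]
set_input_delay  2 -clock clk $data_inputs
set_output_delay 2 -clock clk [all_outputs]

# Only needed if the design has a second, unrelated clock (hypothetical here):
# create_clock -name clk_b -period 7 [get_ports clk_b]
# set_clock_groups -asynchronous -group {clk} -group {clk_b}

# Examples of max constraints (design rule limits)
set_max_transition 0.5 [current_design]
set_max_fanout     16  [current_design]
```

Excluding the clock port from set_input_delay is a common habit rather than a requirement.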

Compile


After this, the design can be compiled and optimized using the compile or compile_ultra command.
To check for potential errors in the design, use the check_design command.
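
One possible ordering of these steps is sketched below. Running check_design before compile is a choice made for this sketch, so that problems show up before the longer optimization step; it can equally be run afterwards.

```tcl
# Resolve all references against link_library
link

# Report connectivity and hierarchy problems before spending time on optimization
check_design

# Map the design to library cells and optimize it
compile -map_effort medium

# Higher-effort alternative (use instead of compile):
# compile_ultra
```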

Viewing Results and Reports


Various reports can be generated from Design Compiler using these commands:
Area report: report_area
Power report: report_power
Timing report: report_timing

From the area report, we get the area in square microns.
Another important parameter is the gate count, i.e. how many equivalent gates of that particular library have been used.
Gate count = Total area / Area of the smallest standard cell
The cell with the smallest area can be obtained from the library handbook. In practice, the 2-input NAND gate is commonly used as the reference cell.
Memories are usually black-boxed and excluded from synthesis, so we report the size of the memory in bits instead.
Memory size = Data width x Depth
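
The two formulas above can be worked through with plain Tcl arithmetic. All numbers here are invented for illustration and do not come from a real report or library handbook.

```tcl
# Hypothetical values for illustration only
set total_area    1250.0   ;# total cell area from report_area, in square microns
set smallest_cell 0.5      ;# area of the smallest standard cell, in square microns

set gate_count [expr {$total_area / $smallest_cell}]
puts "Gate count: $gate_count"          ;# prints "Gate count: 2500.0"

# Memory of 1024 words, each 32 bits wide
set data_width 32
set depth      1024

set memory_bits [expr {$data_width * $depth}]
puts "Memory size: $memory_bits bits"   ;# prints "Memory size: 32768 bits"
```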

Synopsys Design Compiler is one of the most popular tools for performing synthesis. However, Cadence Genus is also gaining popularity, with many users claiming that it provides better-optimized area and timing results.

Example:

Let us write a simple script and perform synthesis for a simple Verilog file using Synopsys Design Compiler. We will extract the reports as well for area, timing and power.

Verilog Code:
Consider the Verilog code of a simple clock divider (for even N) from an earlier post.

module clkdiv (
	clk,
	rst,
	N,
	out_clk
	);

input clk;
input rst;
input [2:0] N;
output reg out_clk;
reg next_out_clk;
reg [2:0] cnt;
reg [2:0] next_cnt;

always @ ( posedge clk )
begin
  if(!rst)  // synchronous, active-low reset
  begin
    out_clk <= 1'b0;
    cnt <= 3'd0;
  end
  else
  begin
    out_clk <= next_out_clk;
    cnt <= next_cnt;
  end
end

always @ (*)
begin
  if(cnt==(N>>1))
  begin
    next_out_clk = ~out_clk;
    next_cnt = 3'd1;  // constant sized to match the 3-bit counter
  end
  else
  begin
    next_out_clk = out_clk;
    next_cnt = cnt+1;
  end
end

endmodule

Here is a simple Tcl script that can be sourced inside the Design Compiler shell (dc_shell).
The redirection operator '>' writes the output of a command to a file.

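# Library setup: replace <path> with the library files of your technology
set target_library <path>
set symbol_library <path>
set link_library <path>

# Read the RTL
read_file -format verilog clkdiv.v

# Timing constraints: clock period of 10 (in the time units of the library)
create_clock clk -name clk -period 10
set_input_delay 2 [all_inputs] -clock clk
set_output_delay 2 [all_outputs] -clock clk

# Synthesize, then generate the reports
compile > compile.log
check_design > check.rpt
report_area > area.rpt
report_power > power.rpt
report_timing > timing.rpt

# Write out the gate-level netlist (Verilog) and the design database (ddc)
write -format verilog -hierarchy -output netlist.v
write -format ddc -hierarchy -output clkdiv.ddc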

Using a TSMC 28nm library, I obtained the netlist as shown:
Netlist:

module clkdiv ( clk, rst, N, out_clk );
  input [2:0] N;
  input clk, rst;
  output out_clk;
  wire   \next_cnt[2] , N4, N5, N6, N8, n5, n6, n7, n8, n9, N15, N14, N13, n10,
         n11, n12, n13;
  wire   [2:0] cnt;

  DFFQL_X1M_A12TS_C31 \cnt_reg[2]  ( .D(N6), .CK(clk), .Q(cnt[2]) );
  DFFQL_X1M_A12TS_C31 \cnt_reg[1]  ( .D(N5), .CK(clk), .Q(cnt[1]) );
  DFFQL_X1M_A12TS_C31 \cnt_reg[0]  ( .D(N4), .CK(clk), .Q(cnt[0]) );
  DFFQL_X1M_A12TS_C31 out_clk_reg ( .D(n9), .CK(clk), .Q(out_clk) );
  NOR2B_X1M_A12TS_C31 U13 ( .AN(n12), .B(n11), .Y(N4) );
  NOR2_X1A_A12TS_C31 U14 ( .A(n10), .B(n11), .Y(N5) );
  XOR2_X1M_A12TS_C31 U15 ( .A(N14), .B(n12), .Y(n10) );
  INV_X1M_A12TS_C31 U16 ( .A(N13), .Y(n12) );
  NAND3XXB_X1M_A12TS_C31 U17 ( .CN(cnt[2]), .A(n7), .B(n8), .Y(N8) );
  XNOR2_X1M_A12TS_C31 U18 ( .A(cnt[1]), .B(N[2]), .Y(n7) );
  XNOR2_X1M_A12TS_C31 U19 ( .A(cnt[0]), .B(N[1]), .Y(n8) );
  INV_X1M_A12TS_C31 U20 ( .A(rst), .Y(n11) );
  NOR2B_X1M_A12TS_C31 U21 ( .AN(\next_cnt[2] ), .B(n11), .Y(N6) );
  XOR2_X1M_A12TS_C31 U22 ( .A(N15), .B(n13), .Y(\next_cnt[2] ) );
  AND2_X1M_A12TS_C31 U23 ( .A(N8), .B(cnt[2]), .Y(N15) );
  AND2_X1M_A12TS_C31 U24 ( .A(N14), .B(N13), .Y(n13) );
  OAI31_X1M_A12TS_C31 U25 ( .A0(n11), .A1(out_clk), .A2(n5), .B0(n6), .Y(n9)
         );
  NAND2_X1A_A12TS_C31 U26 ( .A(out_clk), .B(n5), .Y(n6) );
  AND2_X1M_A12TS_C31 U27 ( .A(N8), .B(rst), .Y(n5) );
  AND2_X1M_A12TS_C31 U28 ( .A(cnt[1]), .B(N8), .Y(N14) );
  AND2_X1M_A12TS_C31 U29 ( .A(cnt[0]), .B(N8), .Y(N13) );
endmodule

As can be seen, the source code has been converted to a netlist that contains only library cells. Signals from multi-bit buses appear as single-bit wires (e.g. \next_cnt[2] ); a netlist in which buses are split up this way is called a bit-blasted netlist.

Check Design:

The check_design command performs a lint-style check on the design and reports any violations. Here, N[0] is flagged as unconnected, but this can be waived: the RTL code right-shifts N to divide it by 2, so the LSB of N is discarded and is not needed.

Area Report:

The area report shows how much area the design occupies, in square microns.

Timing Report:

The timing report shows whether there are any slack violations. A violation occurs when data does not arrive within the required time on a path from one point of the design to another.
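
As a sketch, setup and hold paths can be reported separately, and all violated constraints can be listed in one place. The report file names are arbitrary choices for this example.

```tcl
# Worst setup (max delay) paths
report_timing -delay_type max -max_paths 5 > timing_setup.rpt

# Worst hold (min delay) paths
report_timing -delay_type min -max_paths 5 > timing_hold.rpt

# Summary of every constraint that is currently violated
report_constraint -all_violators > violators.rpt
```

In these reports, a negative slack value marks a path that misses its required time.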

Power Report:

As we move to smaller and smaller process geometries, low-power design is becoming increasingly important. This report displays the power consumption of the design. We can reduce power consumption by reducing unnecessary toggling of bits (switching power) and by turning off components during the intervals when they are not required.

References:

Synopsys Design Compiler User Guide
