Conventional number systems are weighted, fixed, positive-radix number systems, in which a signed number uses the sign as a symbol followed by the number portion in either magnitude or r's complement notation. Addition in conventional number systems requires carry propagation (serial signal propagation) from the LSD to the MSD, so the addition time depends on the word length. This is the main limitation on VLSI performance.
Redundant number systems (RNS), by contrast, allow the addition of two numbers with no serial signal propagation along the adder; that is, the duration of the operation is independent of the length of the operands and equals the time required for the addition of two digits. This is the advantage of RNS over conventional number systems.
Because of this advantage, this thesis proposes the design of an FIR filter based on RNS. To implement the FIR filter, it is necessary to design an adder, a multiplier and a D-FF. For the implementation, the following building blocks are designed: PPM adder, MMP subtractor, D-FF and digit-serial multiplier.
In this thesis, a 368.18 MHz 3-tap FIR filter and an 80 MHz box-car FIR filter are designed with a bottom-up design flow using the CADENCE 5.1.41 IC design environment. The design is based on the 90 nm CMOS technology process. The bottom-level transistors are taken from the gpdk090 library. The advantages of full custom design are maximum circuit performance, minimum design size, and minimum high-volume production cost.
Contents
Chapter 1 Introduction
Introduction
Motivation and Goals
VLSI Design Flow
Introduction on Bottom-up and Top-down Design Flow
Bottom-up Design Flow
Thesis Organization
Chapter 2 Computer Binary Number Systems
Binary Number Systems
Signed Digit Number Systems
Redundant Number Systems (RNS)
Arithmetic Operations of RNS
Carry-Free Radix-2 Addition (PPM Adder)
Radix-2 Subtraction (MMP Subtractor)
Digit-serial SBD Redundant Adder
Radix-2 Redundant Binary Multiplier
Redundant Binary to Binary Conversion
Chapter 3 FIR Filter
FIR Filter Theory
Architecture of FIR Filter
Box-car FIR Filter
Chapter 4 Architecture, Design and Implementation of Different Digital Cells
Architecture and Design of Different Digital Cells
CMOS Inverter
Design
Simulation
Layout
Parametric Extraction and Post-layout Simulation
NAND-2
NAND-3
D-FF (Delay)
PPM Adder
MMP Subtractor
Digit-serial SBD Redundant Adder
Box-car FIR Filter
Digit-serial Multiplier
3-tap FIR Filter
Chapter 5 Conclusions and Future Works
Conclusion
Future Work
References
Appendix I
Appendix II
List of Figures
Fig 1.3.1 Bottom-up Design Flow
Fig 2.5.1 PPM Adder
Fig 2.5.2 LSD PPM Adder
Fig 2.5.3 4-digit PPM Adder
Fig 2.7.1 Digit-serial SBD Adder
Fig 4.2.1 CMOS Inverter schematic
Fig 4.2.3 Layout of inverter
Fig 4.3.1 NAND2 schematic
Fig 4.3.2 Layout of NAND2
Fig 4.3.3 Extraction of NAND2
Fig 4.4.1 NAND3 schematic
Fig 4.4.2 NAND3 layout
Fig 4.5.1 D-FF schematic
Fig 4.5.2 D-FF layout
Fig 4.5.3 Extraction of D-FF
Fig 4.6.1 PPM Adder schematic
Fig 4.6.2 PPM Adder layout
Fig 4.6.3 Extraction of PPM Adder
Fig 4.7.1 MMP Subtractor schematic
Fig 4.7.2 Layout of MMP Subtractor
Fig 4.8.1 SBD adder
Fig 4.8.2 Simulation waveform of SBD adder
Fig 4.9.1 4-tap Box-car FIR filter (1-bit input) schematic
Fig 4.9.2 4-tap Box-car FIR filter (4-bit input) schematic
Fig 4.9.3 Simulation waveforms of Box-car FIR filter
Fig 4.10.3 Digit-serial multiplier schematic
Fig 4.11.1 9-FA schematic
Fig 4.11.2 9-DFF schematic
Fig 4.11.3 FIR filter schematic
Fig 4.11.4 Test-bench of FIR filter
Fig 4.11.5 Simulation waveforms of FIR filter
List of Tables
Table 2.5.1 Digit sets in addition
Table 2.6.1 Digit sets in subtraction
Table 2.8.1 Recoding of bj
Chapter 1
Introduction
1.1 Introduction
Since the theory of digital signal processing (DSP) was developed and applied to electrical engineering, digital filtering has always played a very important role. Digital filtering techniques are used to suppress noise, enhance signals in selected frequency ranges, constrain bandwidth, remove or attenuate specific frequencies, and perform other special operations. Digital filters are classified into finite impulse response (FIR) and infinite impulse response (IIR) filters. FIR digital filters can have an exactly linear phase response and a very regular architecture, and they suffer less from the effects of finite word length than IIR digital filters. This thesis presents the design and implementation of such a filter based on redundant binary number systems. The main components of an FIR filter are the adder, the multiplier and the delay. The carry propagation delay is a limiting factor of the adder and the multiplier. Using redundant numbers, the adders and multipliers are designed in such a way that the propagation delay of the FIR filter is reduced.
In this thesis the FIR filter is designed with a bottom-up (full custom) design flow using CDS 5.1.41. The advantages of full custom design are maximum circuit performance, minimum design size, and minimum high-volume production cost. Finally, the design of a box-car FIR filter and a 3-tap FIR filter (with 4-bit multiplier coefficients) can serve IC design students as a basis and a tool for understanding digital design. It is also a stepping-stone for students designing other CMOS chips in 90 nm CMOS technology, and it encourages them to make improvements in the design.
1.2 Motivation and Goals
Area, delay (performance) and power are the three important design constraints for embedded real-time digital signal processing systems. The area constraint is imposed mainly by considerations of cost. An area-efficient implementation results in a smaller die size and is therefore more cost effective. It also enables integrating more functionality on a single chip. The performance requirements of a system are driven by its data processing needs. For DSP systems, throughput is the primary performance criterion. The performance constraint therefore depends on the rate at which the input signals are sampled and on the complexity of the processing to be performed. Low power dissipation is a key requirement for portable, battery-operated systems because it extends battery life. Low power dissipation also helps reduce the packaging cost (plastic instead of ceramic), eliminate or reduce cooling (heat sink) overhead, and increase the reliability of the device.
To meet the demands of high-speed and low-power applications, the development and implementation of high-speed FIR digital filters need both increased parallelism and reduced complexity in order to meet both sampling rate and power dissipation goals. In this thesis, the FIR filter is designed based on RNS to achieve high-speed operation. A bottom-up design flow is used for maximum circuit performance, minimum design size, and minimum high-volume production cost.
1.3 VLSI Design Flow
1.3.1 Introduction on Bottom-up and Top-down Design Flow
The designer usually follows a sequence of design stages to complete a project. At the beginning, the designer has to specify the functionality of the system. The basic blocks of the hardware are identified and their interfaces, composed of data and control signals, are fixed. Today, there are two main ways to design a VLSI circuit with the traditional tools that have been developed in recent years. The designer can freely choose a bottom-up or a top-down flow, but sometimes the choice is forced by particular design requirements or by the circuit structure. Top-down is a process of iterative refinements. The designer starts with a top view of the system and decomposes single blocks into smaller ones. The bottom-up flow starts with low-level building blocks and interconnects them into larger ones. In reality, these two techniques are not really incompatible; for instance, the designer can choose to use particular self-made cells and leave their structure untouched within a top-down approach [10]. The bottom-up approach is preferred in digital design if the designer wants to build a particular cell that achieves a specific performance with full-custom designed transistors and then wants to replicate this structure in the project.
1.3.2 Bottom-up Design Flow
The bottom-up design flow is given in Fig 1.3.1. It starts with a set of design specifications. The "specs" typically describe the expected functionality of the designed circuit as well as other properties such as delay times, area, etc. To meet the various design specifications, certain design trade-offs (area versus delay) are required [10].
Design Specifications
Schematic Capture
Create Symbol
Pre-layout Simulation
Layout
Design Rule Check
Extraction
Layout Versus Schematic Check
Post-Layout Simulation
Fig 1.3.1 Bottom-up Design Flow
A. Schematic Capture
A schematic editor is used for capturing (i.e. describing) the transistor-level design. Schematic editors provide simple, intuitive means to draw, place and connect the individual components that make up the design. The resulting schematic drawing must accurately describe the main electrical properties of all components and their interconnections. Also included in the schematic are the supply connections (VDD and gnd), as well as all pins for the input and output signals of the circuit. From the schematic, a netlist is generated, which is used in later stages of the design. The generation of a complete circuit schematic is therefore the first important step of the transistor-level design.
B. Symbol Creation
A symbol view of the circuit is also required for some of the subsequent simulation steps or for documentation purposes. Therefore, the schematic capture of the circuit topology is usually followed by the creation of a symbol to represent the entire circuit. The shape of the icon used for the symbol may suggest the function of the module (logic gates such as AND, OR, etc.), but the default symbol icon is a simple rectangular box with input and output pins. Symbol creation also helps the circuit designer to create a system-level design consisting of multiple hierarchy levels.
C. Layout
The creation of the mask layout is one of the most important steps in the full-custom design flow. Using a layout editor, the designer describes the detailed geometries and the relative positioning of each mask layer to be used in actual fabrication. Physical layout design is very tightly linked to overall circuit performance, since the physical structures determine the transconductances of the transistors, the parasitic capacitances and resistances, and obviously the silicon area used to realize a certain function. However, layout is a very intensive and time-consuming design effort. It is also extremely important that the layout does not violate any of the layout design rules, in order to ensure a defect-free fabrication of the design. The layout process can be manual, in which the layout of each design is drawn by hand, or automatic, using a CAD tool. The quality of layouts produced by automatic processes is still far from that of hand-optimized layouts.
D. Design Rule Check (DRC)
The created mask layout must conform to a complex set of design rules, in order to ensure a lower probability of fabrication defects. A tool built into the layout editor, called the Design Rule Checker, is used to detect any design rule violations during and after the mask layout design. If errors are detected, they should be removed from the mask layout before the final design is saved.
E. Circuit Extraction
After the mask layout has been made free of design rule errors, circuit extraction is performed to create a detailed netlist for the simulation of the circuit. The circuit extractor identifies the individual transistors and their connections as well as the parasitic capacitances and resistances that are inevitably present. The extracted netlist can give a very accurate estimate of the device dimensions and device parasitics that ultimately determine the circuit performance. The extracted netlist is used in transistor-level simulations and in the Layout Versus Schematic comparison.
F. Layout Versus Schematic Check
After the mask layout design of the circuit is completed, the design should be checked against the schematic circuit description created earlier. The Layout Versus Schematic (LVS) check compares the original network with the one extracted from the mask layout. The LVS step provides an additional level of confidence in the integrity of the design, and ensures that the mask layout is a correct realization of the intended circuit topology. Note that a successful LVS does not guarantee that the extracted circuit will actually satisfy the performance requirements, since the LVS check guarantees only a topological match. Any errors that show up during LVS should be corrected before proceeding to post-layout simulation.
G. Post-Layout Simulation
The electrical performance of a full custom design can best be analyzed by performing a post-layout simulation on the extracted circuit netlist. The detailed simulation performed with the extracted netlist provides a clear assessment of the circuit speed and of the influence of circuit parasitics. If the results of the post-layout simulation are not satisfactory, the designer should modify the transistor dimensions or the circuit topology in order to achieve the desired circuit performance. Several iterations on the design may therefore be needed until the post-layout simulation results satisfy the original design requirements.
Finally, it should be noted that a satisfactory result in post-layout simulation is still no guarantee of a completely successful product, since the actual performance of the chip can only be verified by testing the fabricated prototype.
1.4 Thesis Organization
The organization of this thesis is as follows. Chapter 2 reviews computer binary number systems, redundant number systems and their arithmetic operations. Chapter 3 describes FIR filter theory, the box-car FIR filter and the components of the filter. Chapter 4 gives the architecture, design and implementation of the different digital cells and, finally, the implemented FIR filter. Conclusions and future work are given in Chapter 5.
Chapter 2
Computer Binary Number Systems
2.1 Binary Number Systems
A number system is defined by the set of values that each digit can assume and by an interpretation rule that defines the mapping between sequences of digits and their numerical values. There are two types of number systems, namely conventional (e.g. binary, decimal) and unconventional (e.g. signed-digit). In conventional number systems, every number has a unique representation, i.e. no two digit sequences have the same numerical value, and hence these systems are called non-redundant number systems [2]. In conventional digital computers, integers are represented as binary numbers of fixed length n. A binary number X = (x_{n-1} ... x_1 x_0) represents the integer value
X = \sum_{i=0}^{n-1} x_i 2^i    (2.1)
The weight of the digit x_i is the ith power of 2, where 2 is the radix of the binary number system; e.g. the integer X = 5 can be represented as 5 = 1*2^2 + 0*2^1 + 1*2^0. Because of the trade-offs between word length, hardware size and propagation delay, various types of number representations have been proposed. In non-redundant number systems, carry propagation is the limitation on VLSI implementation of high-speed multiplication and addition. In the addition of two conventional binary numbers, the carry may propagate all the way from the least significant digit to the most significant digit. The addition time is therefore dependent on the word length. To reduce the addition time, i.e. the propagation delay, another kind of number system is needed, called an unconventional or signed-digit number system [2].
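The dependence of addition time on word length can be made concrete with a small sketch. The Python code below is an illustration only; the function names ripple_add and value, and the LSB-first list encoding, are assumptions made for this example. It adds two binary numbers digit by digit and records the longest run of consecutive positions through which a carry ripples.

```python
def ripple_add(x_bits, y_bits):
    """Add two equal-length LSB-first bit lists.

    Returns (sum_bits, longest_carry_chain), where the chain length is the
    largest number of consecutive positions that produce a carry-out.
    """
    carry = 0
    chain = longest = 0
    out = []
    for x, y in zip(x_bits, y_bits):
        total = x + y + carry
        out.append(total & 1)      # sum bit of this position
        carry = total >> 1         # carry into the next position
        chain = chain + 1 if carry else 0
        longest = max(longest, chain)
    out.append(carry)              # final carry becomes the extra MSB
    return out, longest


def value(bits):
    """Integer value of an LSB-first bit list, as in equation (2.1)."""
    return sum(b << i for i, b in enumerate(bits))


# 0111 + 0001: the carry generated at the LSB ripples through three positions.
s, chain = ripple_add([1, 1, 1, 0], [1, 0, 0, 0])
print(value(s), chain)   # prints: 8 3
```

In the worst case the chain length grows with the number of digits, which is the word-length dependence described above.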
2.2 Signed Digit Number Systems
In a conventional radix-r number system, a digit can take on the values {0, 1, 2, ..., r-1}, whereas in a signed-digit radix-r number system the digit set is S = {-(r-1), -(r-2), ..., -1, 0, 1, ..., (r-1)}. For example, the digit set {-1, 0, 1} is used for the radix-2 (r = 2) number system. A signed-digit number is represented by the digits z_i and has the algebraic value
Z = \sum_{i=0}^{n-1} z_i r^i    (2.2.1)
In this case, the number 3 can be represented as 0011 or as the digit sequence 0, 1, 0, -1. Hence every number allows multiple representations in signed-digit format, and such systems are called redundant number systems. Signed-digit representations limit carry propagation to one position to the left during addition and subtraction in digital computers [1-2]. Carry propagation chains are eliminated by the use of redundant representations for the operands.
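The redundancy of the signed-digit format can be checked numerically. The following minimal sketch (the helper name sd_value and the MSD-first list encoding are assumptions for illustration) evaluates a digit sequence as a weighted sum of powers of the radix and shows that different radix-2 digit sequences have the same value 3.

```python
def sd_value(digits, radix=2):
    """Algebraic value of a signed-digit number given as an MSD-first list."""
    v = 0
    for d in digits:
        v = v * radix + d      # Horner evaluation of the weighted digit sum
    return v


# Two different radix-2 signed-digit sequences, both with the value 3.
print(sd_value([0, 0, 1, 1]), sd_value([0, 1, 0, -1]))   # prints: 3 3
```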
2.3 Redundant Number Systems (RNS)
The class of signed-digit or redundant number representations is derived from four requirements that are postulated as necessary for number representations in fast parallel arithmetic.
The purpose of redundant number representations is to allow addition and subtraction of two numbers with no serial signal propagation along the adder; that is, the duration of the operation is independent of the length of the operands and is equal to the time required for the addition or subtraction of two digits. The signed-digit representation must have a unique representation of the zero algebraic value of a number [3]. A redundant number represented by n+m+1 digits z_i (i = -n, ..., -1, 0, 1, ..., m) has the algebraic value
Z = \sum_{i=-n}^{m} z_i r^{-i}    (2.3.1)
where the values of r and z_i are such that the following requirements are satisfied:
1. The radix r is a positive integer.
2. The algebraic value Z = 0 has a unique representation.
3. There exist transformations between the conventional representation and the signed-digit representation for every algebraic value Z within a specified range.
4. Totally parallel addition and subtraction is possible for all digits in corresponding positions of two representations.
2.4 Arithmetic Operations of RNS
The arithmetic operations of totally parallel addition and subtraction of two digits z_i and y_i from corresponding positions of the representations of the numbers Z and Y are defined as follows [3]:
Definition 1: Addition of the digits z_i and y_i is totally parallel if the following two conditions are satisfied:
(a) The sum digit s_i (the ith digit of the sum S = Z + Y) is a function only of the augend digit z_i, the addend digit y_i and the transfer digit t_i from the (i+1)th position on the right: s_i = f(z_i, y_i, t_i). The term "transfer digit" is used here instead of the commonly used terms "carry" or "borrow" for two reasons. First, the transfer digit may assume both positive and negative values in addition or subtraction; second, unlike the "carry" or "borrow" of conventional addition or subtraction, the transfer digit is never propagated past the first adder position on the left.
(b) The transfer digit t_{i-1} to the (i-1)th position on the left is a function only of the augend digit z_i and the addend digit y_i: t_{i-1} = f(z_i, y_i).
Definition 2: Totally parallel subtraction of the subtrahend digit y_i from the minuend digit z_i is performed as the totally parallel addition of the additive inverse of y_i, i.e., z_i - y_i = z_i + (-y_i).
The addition of two digits is performed in two successive steps. First, an outgoing transfer digit t_{i-1} and an interim sum digit w_i are formed:
z_i + y_i = r t_{i-1} + w_i    (2.4.1)
Then the sum digit s_i is formed:
s_i = w_i + t_i    (2.4.2)
Definition 1 will be satisfied if the range of values which s_i may assume in (2.4.2) does not exceed the allowed range of values for the digits z_i and y_i in (2.4.1). Definition 2 will be satisfied if, for every allowed nonzero value y_i = a, there exists an allowed value y_i = -a such that a + (-a) = 0.
The requirement for a unique representation of the zero value of a number will be satisfied by the condition:
|z_i| \le r-1    (2.4.3)
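The two-step procedure of (2.4.1) and (2.4.2) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the design used in this thesis: the radix 10, the digit set -6..6 and the interim-sum range -5..5 are chosen only because they make the totally parallel rule easy to follow, and the lists are stored LSD-first, so the "position on the left" of the text is the next higher list index.

```python
R = 10       # radix (assumed for this example)
A = 6        # digits are restricted to -A..A (assumed digit set)
W_MAX = 5    # interim sum digits are kept within -W_MAX..W_MAX


def sd_add(z, y):
    """Add two LSD-first signed-digit lists of equal length."""
    transfers, interim = [], []
    # Step 1, equation (2.4.1): every position forms its outgoing transfer
    # digit and interim sum from its own two digits only.
    for zi, yi in zip(z, y):
        total = zi + yi
        if total > W_MAX:
            t, w = 1, total - R
        elif total < -W_MAX:
            t, w = -1, total + R
        else:
            t, w = 0, total
        transfers.append(t)
        interim.append(w)
    # Step 2, equation (2.4.2): each sum digit absorbs the transfer digit
    # of its right-hand neighbour; nothing propagates any further.
    incoming = [0] + transfers
    return [w + t for w, t in zip(interim + [0], incoming)]


def value(digits):
    """Algebraic value of an LSD-first signed-digit list."""
    return sum(d * R**i for i, d in enumerate(digits))


z = [6, -6, 5]                    # 6 - 60 + 500 = 446
s = sd_add(z, z)
print(s, value(s))                # prints: [2, -1, -1, 1] 892
```

Because every sum digit stays within -6..6, the result is again a valid operand, which is the condition stated for Definition 1.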
2.5 Carry-Free Radix-2 Addition (PPM Adder)
Redundant number representations limit carry propagation to a few bit positions, a number that is usually independent of the word length W. This carry-propagation-free feature enables fast addition.
A radix-2 signed-digit number is coded using two unsigned binary numbers, one positive and the other negative, as X = X^+ - X^-. Hence each signed digit is represented using 2 bits as x_i = x_i^+ - x_i^-, where x_i^+, x_i^- ∈ {0, 1} and x_i ∈ {-1, 0, 1}. In the adder shown in Fig 2.5.1, a signed-digit number x_i is added to an unsigned number y_i. This addition can be carried out in two steps. The first step is carried out in parallel for all bit positions i (0 ≤ i ≤ W-1). An intermediate sum p_i = x_i + y_i is computed, which lies in the range {-1, 0, 1, 2} [1], [5-7]. This addition is expressed as
x_i + y_i = p_i = 2t_i + u_i    (2.5.1)
Fig. 2.5.1 PPM adder
where t_i is the transfer digit, which has the value 0 or 1 and is denoted t_i^+, and u_i is the interim sum, which has the value -1 or 0 and is denoted -u_i^-. The least significant transfer digit t_{-1} is assigned the value zero, as is the most significant interim sum digit u_W. In the second step, the sum digit s_i is formed by combining t_{i-1}^+ and u_i^- into one digit, as shown in Fig 2.5.2:
s_i = t_{i-1} + u_i    (2.5.2)
Table 2.5.1 summarizes the digit sets involved in the adder operation.
The addition operation performed by the adder is then
x_i^+ - x_i^- + y_i^+ = 2t_i^+ - u_i^-    (2.5.3)
This arithmetic operation is performed by the adder known as the plus-plus-minus (PPM) adder. The PPM adder is also called the Redundant Binary Full Adder (RBFA).
Fig 2.5.2 LSD PPM Adder
Table 2.5.1 Digit sets in addition.
Digit                    Digit Set          Binary Code
x_i                      {-1, 0, 1}         x_i^+ - x_i^-
y_i                      {0, 1}             y_i^+
p_i = x_i + y_i          {-1, 0, 1, 2}      ...
u_i                      {-1, 0}            -u_i^-
t_i                      {0, 1}             t_i^+
s_i = u_i + t_{i-1}      {-1, 0, 1}         s_i^+ - s_i^-
Fig 2.5.3 shows the structure of a 4-digit parallel adder. In the figure the sum has 5 digits, i.e., one more digit than the addends [1].
Fig 2.5.3 4-digit PPM adder
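The behaviour of the PPM cell can be checked with a short behavioural model. The sketch below is a software illustration of equations (2.5.1) to (2.5.3), not the transistor-level design; the function names ppm_digit, ppm_add and rb_value, and the LSD-first encoding of each digit as a pair (x+, x-), are assumptions made for the example.

```python
def ppm_digit(x_plus, x_minus, y_plus):
    """One PPM cell: x+ - x- + y+ = 2*t+ - u-. Returns (t_plus, u_minus)."""
    p = x_plus - x_minus + y_plus      # intermediate sum in {-1, 0, 1, 2}
    t_plus = 1 if p >= 1 else 0        # transfer digit t in {0, 1}
    u_minus = 2 * t_plus - p           # interim sum u = -u_minus in {-1, 0}
    return t_plus, u_minus


def ppm_add(x, y):
    """Add redundant x (LSD-first list of (x+, x-) pairs) to unsigned bits y."""
    cells = [ppm_digit(xp, xm, yp) for (xp, xm), yp in zip(x, y)]
    t = [0] + [tp for tp, _ in cells]  # t_{-1} = 0
    u = [um for _, um in cells] + [0]  # u_W = 0
    # Sum digit s_i is coded as (s+, s-) = (t_{i-1}^+, u_i^-): no logic needed.
    return list(zip(t, u))


def rb_value(digits):
    """Value of an LSD-first list of (plus, minus) bit pairs."""
    return sum((p - m) * 2**i for i, (p, m) in enumerate(digits))


x = [(1, 0), (0, 1), (1, 0), (0, 0)]   # digits 1, -1, 1, 0 (LSD first) = 3
y = [1, 1, 0, 1]                       # unsigned 11
print(rb_value(ppm_add(x, y)))         # prints: 14
```

Each cell uses only its own inputs, so all W cells can operate at the same time; the result has W+1 digits, as noted for Fig 2.5.3.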
2.6 Radix-2 Subtraction (MMP Subtractor)
The subtractor shown in Fig 2.6.1 can subtract an unsigned number from a signed-digit number. A radix-2 signed-digit number is coded using two unsigned binary numbers, one positive and the other negative, as X = X^+ - X^-. Hence each signed digit is represented using 2 bits as x_i = x_i^+ - x_i^-, where x_i^+, x_i^- ∈ {0, 1} and x_i ∈ {-1, 0, 1}. An unsigned number y_i is to be subtracted from a signed-digit number x_i. This subtraction can be carried out in two steps. In the first step, an intermediate difference p_i = x_i - y_i is computed independently for each digit. It lies in the range {-2, -1, 0, 1}, as shown in Table 2.6.1, and is expressed using the following equation [1]:
x_i - y_i = p_i = 2t_i + u_i    (2.6.1)
where the transfer digit t_i has the value -1 or 0 and is denoted -t_i^-, and the interim difference u_i has the value 0 or 1 and is denoted u_i^+. In the second step, the result digit s_i is formed by combining t_{i-1}^- and u_i^+ into one digit:
s_i = t_{i-1} + u_i    (2.6.2)
The subtraction operation performed by the subtractor is then
x_i^+ - x_i^- - y_i^- = -2t_i^- + u_i^+    (2.6.3)
Fig 2.6.1 MMP adder
This arithmetic operation is performed by the subtractor known as the minus-minus-plus (MMP) subtractor or type-2 full adder. Fig 2.6.2 shows the structure of a 4-digit parallel radix-2 subtractor.
Fig 2.6.2 4-digit MMP subtractor.
Table 2.6.1 Digit sets in subtraction
Digit                    Radix-2 Digit Set    Binary Code
x_i                      {-1, 0, 1}           x_i^+ - x_i^-
y_i                      {0, 1}               y_i^-
p_i = x_i - y_i          {-2, -1, 0, 1}       ...
u_i                      {0, 1}               u_i^+
t_i                      {-1, 0}              -t_i^-
s_i = u_i + t_{i-1}      {-1, 0, 1}           s_i^+ - s_i^-
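A behavioural model of the MMP cell mirrors the PPM case with the signs of the transfer and interim digits exchanged. The sketch below illustrates equations (2.6.1) to (2.6.3); the names mmp_digit, mmp_sub and rb_value and the LSD-first (plus, minus) pair encoding are assumptions for this example only.

```python
def mmp_digit(x_plus, x_minus, y_minus):
    """One MMP cell: x+ - x- - y- = -2*t- + u+. Returns (t_minus, u_plus)."""
    p = x_plus - x_minus - y_minus     # intermediate difference in {-2, -1, 0, 1}
    t_minus = 1 if p < 0 else 0        # transfer digit t = -t_minus in {-1, 0}
    u_plus = p + 2 * t_minus           # interim difference u in {0, 1}
    return t_minus, u_plus


def mmp_sub(x, y):
    """Subtract unsigned bits y from redundant x (LSD-first (x+, x-) pairs)."""
    cells = [mmp_digit(xp, xm, ym) for (xp, xm), ym in zip(x, y)]
    t = [0] + [tm for tm, _ in cells]  # t_{-1} = 0
    u = [up for _, up in cells] + [0]  # u_W = 0
    # Result digit s_i is coded as (s+, s-) = (u_i^+, t_{i-1}^-).
    return list(zip(u, t))


def rb_value(digits):
    """Value of an LSD-first list of (plus, minus) bit pairs."""
    return sum((p - m) * 2**i for i, (p, m) in enumerate(digits))


x = [(1, 0), (0, 1), (1, 0), (0, 0)]   # digits 1, -1, 1, 0 (LSD first) = 3
y = [1, 1, 0, 1]                       # unsigned 11
print(rb_value(mmp_sub(x, y)))         # prints: -8
```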
2.7 Digit-serial SBD redundant adder
In the digit-serial SBD adder shown in Fig 2.7, two redundant binary numbers x_i (= x_i^+ - x_i^-) and y_i (= y_i^+ - y_i^-) are added simultaneously, and the result is a redundant binary sum s_i (= s_i^+ - s_i^-). This adder consists of a PPM adder, an MMP subtractor and a D-FF (delay). The adder behaves as a pipelined architecture, which shortens the critical path and hence reduces the propagation delay [1].
Fig 2.7 Digit-serial SBD redundant adder.
2.8 Radix-2 Redundant Binary Multiplier
Consider the bit-serial multiplication of two W-bit numbers a and b to yield a product P, as described by the algorithm below [9]:
Algorithm
Input: a, b
Output: P
Initialize: a_i, b_i = 0 for i > W-1
            c_{i,j}, s_{i,j} = 0 for all i, j
Begin
  for i = 0 to W-1
  Begin
    for j = 0 to W
    Begin
      a_i * b_j + c_{i,j-1} + s_{i-1,j+1} = 2c_{i,j} + s_{i,j}    (2.8.1)
    End
  End
End
Using the systolic design method, the bit-serial multiplier that results from the above equation is shown in Fig 2.8.1 [1, 13, 14].
Fig 2.8.1 redundant multiplier architecture.
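The recurrence (2.8.1) can be exercised in software before mapping it onto cells. The sketch below is an illustration under assumptions that are not stated in the text: both operands are treated as unsigned, each row i keeps W+1 sum bits, the bit s_{i,0} of every row except the last is taken as a finished product bit, and the last row supplies the remaining upper bits. The function name bit_serial_multiply is hypothetical.

```python
def bit_serial_multiply(a, b, W):
    """Multiply unsigned W-bit integers a and b with the cell recurrence
    a_i*b_j + c_{i,j-1} + s_{i-1,j+1} = 2*c_{i,j} + s_{i,j}."""
    a_bits = [(a >> i) & 1 for i in range(W)]
    b_bits = [(b >> j) & 1 for j in range(W + 2)]   # b_j = 0 for j > W-1
    prev_row = [0] * (W + 2)                        # s_{-1,j} = 0
    product = 0
    for i in range(W):
        row = [0] * (W + 2)
        carry = 0                                   # c_{i,-1} = 0
        for j in range(W + 1):
            total = a_bits[i] * b_bits[j] + carry + prev_row[j + 1]
            row[j] = total & 1                      # s_{i,j}
            carry = total >> 1                      # c_{i,j}
        if i < W - 1:
            product |= row[0] << i                  # s_{i,0} is product bit i
        else:
            # The last row holds all remaining product bits, weighted by 2^i.
            product |= sum(s << j for j, s in enumerate(row)) << i
        prev_row = row
    return product


print(bit_serial_multiply(13, 11, 4))   # prints: 143
```

Reading s_{i-1,j+1} instead of s_{i-1,j} is what shifts the partial sum one position to the right in every row, so each row only ever adds one multiple of b.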
Consider the multiplicand B (b3 b2 b1 b0) to be a radix-2 redundant number and the number A (a3 a2 a1 a0) to be in unsigned representation. Each digit b_j = b_j^+ - b_j^- of the radix-2 redundant number B is recoded (see Table 2.8.1) using a sign bit and a magnitude bit as follows:
Table 2.8.1 Recoding of b_j
b_j               -1    0    1
sign(b_j)          1    0    0
magnitude(b_j)     1    0    1
If the input bit b_j is positive, then the adder cells corresponding to the coefficients a_i, 0 ≤ i ≤ 2, in Fig 2.8.1 can be implemented as full adders; the last adder cell, which involves the most significant sign bit of A with negative weight, carries out the following computation [1]:
-a3 * b_j + carry_in + sum_in = 2 * carry_out - sum_out    (2.8.2)
which can be implemented as a PPM adder consisting of a full adder and 2 inverters. The case where the input digit b_j is negative is handled by combining the above two equations; the detailed multiplier circuit is shown in Fig 2.8.2.
Fig 2.8.2 redundant multiplier with PPM
Fig 2.8.3 Recoding of bj
2.9 Redundant Binary to Binary Conversion
The conversion from redundant binary to binary format in LSD-first mode can be carried out by regarding x^+ and x^- as 2 independent unsigned numbers and subtracting x^- from x^+ as follows:
x_i^+ - x_i^- - c_i = -2c_{i+1} + s_i    (2.9.1)
where an MMP adder is used at each bit position. The LSD-first redundant binary to binary conversion circuit is shown in Fig 2.9.1 for a word length of 4 bits. In this circuit the carry-out at any stage can be either 0 or -1 [1, 14].
Fig 2.9 RB-to-binary conversion.
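The conversion rule (2.9.1) amounts to a borrow-propagating subtraction of x^- from x^+. The sketch below models it in software; the function name rb_to_binary, the LSD-first (x+, x-) pair encoding, and the choice to append the final borrow as a two's-complement sign bit are assumptions made for this illustration.

```python
def rb_to_binary(x):
    """LSD-first conversion of (x+, x-) digit pairs to two's-complement bits.

    Each stage solves x+ - x- - c_i = -2*c_{i+1} + s_i with c, s in {0, 1}.
    """
    borrow = 0                          # c_0 = 0
    bits = []
    for x_plus, x_minus in x:
        d = x_plus - x_minus - borrow   # lies in {-2, -1, 0, 1}
        s = d & 1                       # s_i in {0, 1}
        borrow = (s - d) >> 1           # c_{i+1}, carried with negative weight
        bits.append(s)
    bits.append(borrow)                 # final borrow acts as the sign bit
    return bits


# Digits 0, 1, 0, -1 (MSD first) represent 3; stored LSD first below.
print(rb_to_binary([(0, 1), (0, 0), (1, 0), (0, 0)]))   # prints: [1, 1, 0, 0, 0]
```

Unlike the PPM and MMP steps, this conversion does propagate a borrow across the word, which is why it is performed only once, at the output.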
Chapter 3
3.1 FIR Filter Theory
A filter is used to remove some component or modify some characteristic of a signal, but often the two terms are used interchangeably. A digital filter is simply a discrete-time, discrete-amplitude convolver. Basic Fourier transform theory states that the linear convolution of two sequences in the time domain is the same as the multiplication of the two corresponding spectral sequences in the frequency domain. Filtering is in essence the multiplication of the signal spectrum by the frequency domain impulse response of the filter [1].
A finite impulse response (FIR) filter performs a weighted average of a finite number of samples of the input sequence. The basic input-output structure of the FIR filter is a time-domain computation based on a feed-forward difference equation. Figure 3.1 shows a flow diagram of a standard 3-tap FIR filter. The filter has two data registers. The FIR filter is often termed a transversal filter since the input data traverses the data registers in shift register fashion. The output of each register (D1 to D2) is called a tap and is termed x[n], where n is the tap number. Each tap is multiplied by a coefficient c_k and the resulting products are summed. A general expression for the FIR filter's output can be derived in terms of the impulse response. Since the filter coefficients are identical to the impulse response values, the general form of a standard N-tap FIR filter can be represented as Equation 3.1.
y[n] = \sum_{k=0}^{N-1} c_k x[n-k]    (3.1)
When the relation between the input and the output of the FIR filter is expressed in terms of the input and the impulse response, it is called a finite convolution sum. We say that the output is obtained by convolving the sequences x[n] and h[n]. There is a simple interpretation that leads to a better algorithm for computing the convolution. This algorithm can be implemented using a tableau that tracks the relative position of the signal values. The example in Figure 2.3 shows how to convolve x[n] with h[n]. The choice of filter coefficients determines the characteristics of the FIR filter.
Fig 3.1 FIR filter.
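The finite convolution sum can be written directly as a pair of nested loops. The sketch below is a plain software illustration of a direct-form FIR filter, assuming that samples before n = 0 are zero; the function name fir_filter and the example coefficients are chosen only for this example.

```python
def fir_filter(x, c):
    """Direct-form FIR: y[n] = sum over k of c[k] * x[n-k], x[n] = 0 for n < 0."""
    y = []
    for n in range(len(x)):
        acc = 0
        for k, ck in enumerate(c):
            if n - k >= 0:              # taps that reach before the first sample add 0
                acc += ck * x[n - k]
        y.append(acc)
    return y


# A unit impulse at the input reproduces the coefficients at the output,
# which is why the coefficients are identical to the impulse response.
print(fir_filter([1, 0, 0, 0, 0], [3, 2, 1]))   # prints: [3, 2, 1, 0, 0]
```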
3.2 Architecture of FIR filter
The speed of the filter is defined as the rate at which input samples can be processed. To increase the speed it is necessary to reduce the critical path between input and output. The critical path is defined as the path with the longest computation time among all paths that contain zero delays. For the direct-form filter of Fig 3.1, the chain of one multiplier followed by N-1 adders has an estimated delay of
T_chain = T_m + (N-1) T_a    (3.2)
where T_m is the multiplication time and T_a is the addition time. The sample period (T_sample) is given by
T_sample \ge T_m + (N-1) T_a    (3.3)
Therefore the sampling frequency (f_sample) is given by
f_sample \le 1 / (T_m + (N-1) T_a)    (3.4)
For the 3-tap FIR filter, the critical path delay is (T_m + 2T_a). Pipelining reduces the effective critical path by introducing pipelining latches along the data path. The critical path is thereby reduced from T_m + 2T_a to T_m + T_a, as shown in Fig 3.2a and Fig 3.2b. In this arrangement, while the left adder initiates the computation of the current iteration, the right adder is completing the computation of the previous iteration result [1].
Fig 3.2a datapath
Fig 3.2b 2-level pipelined structure
Another FIR filter structure, known as the transposed or data-broadcast structure, is shown in Fig 3.3. The critical path of the filter of Fig 3.1 can be reduced without introducing any pipelining latches by transposing the structure. The propagation delay is then T_m + T_a.
Fig 3.3 transposed FIR filter.
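The effect of the structure on the achievable sample rate can be estimated from the critical-path expressions above. In the sketch below, the function name max_sample_rate and the delay values T_M and T_A are hypothetical placeholders chosen for illustration; they are not measured results from this design.

```python
def max_sample_rate(t_m, t_a, n_taps, structure="direct"):
    """Upper bound on f_sample in Hz for delays given in seconds."""
    if structure == "direct":
        t_crit = t_m + (n_taps - 1) * t_a   # one multiplier, N-1 adders in series
    elif structure == "transposed":
        t_crit = t_m + t_a                  # one multiplier and one adder
    else:
        raise ValueError("unknown structure: " + structure)
    return 1.0 / t_crit


T_M, T_A = 2.0e-9, 0.5e-9    # hypothetical multiplier and adder delays
for s in ("direct", "transposed"):
    print(s, round(max_sample_rate(T_M, T_A, 3, s) / 1e6, 1), "MHz")
```

For the direct form the bound falls as taps are added, while for the transposed form it does not depend on the number of taps.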
3.3 Box-car FIR Filter
If all the multiplier coefficients of the filter are 1, the filter is called a box-car FIR filter. Its critical path depends only on the time needed for the addition operations.
The FIR filter consists of three chief constituents:
A D-FF to implement a simple hold.
A Multiplier to implement the coefficients.
An Adder to sum the nodes at the terminal of each pat.
Chapter 4
4.1 Architecture, Design and Implementation of Different Digital Cells
The 3-tap FIR filter is implemented in 90 nm CMOS technology, and all architecture and simulation work is done using Cadence Design Environment 5.1.41. The FIR filter IC design consists of the D-FF, the multiplier and the adder. All individual digital logic cells are designed using a functional description approach. Because functional description relies on modular design, the design is easier to understand: since each block is thought out separately, the designer has intimate knowledge of how the circuit works. The downside of this method is that the circuit may be of less than optimal size. For the transistor-level to gate-level design of the different digital cells such as the D-FF, adder and multiplier, the CMOS inverter is first taken as the reference cell.
4.2 CMOS Inverter
For 90 nm CMOS technology, the power supply VDD is 1.8 V. The schematic of the inverter is shown in Fig 4.2.1.
Using the gpdk090 technology library (see Appendix I):
µn COX = 300 µA/V^2, µp COX = 170 µA/V^2
Width of PMOS = Wp
Width of NMOS = Wn
Length of PMOS and NMOS = Lp = Ln = 100 nm.
For a better noise margin, a symmetrical inverter design is used. The voltage V_I, called the inverter gate threshold voltage, is defined by the point where the voltage transfer curve intersects the unity-gain line defined by Vout = Vin [3].
The device transconductance is βn = kn (W/L)n for the NMOS and βp = kp (W/L)p for the PMOS, where kn = µn COX and kp = µp COX.
(βn/βp) = 1.083
But (βn/βp) = 1.76 (Wn/Wp), so
Wp/Wn = 1.63.
Fig 4.2.1 CMOS Inverter schematic.
Specification of CMOS inverter
Maximum switching frequency fmax = 10 GHz, rise time ( tr ) = fall time ( tf )
4.2.1 Design
The sum of the transient times ( tr + tf ) represents the minimum time needed for a gate to undergo a complete switching cycle, i.e., for the output to change from a logic 1 to a logic 0 voltage, and then back up to a logic 1 value. We may use this to define the maximum switching frequency by
fmax = 1/ ( tr + tf ) ( 4.2.1 )
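With the specification fmax = 10 GHz and tr = tf, equation ( 4.2.1 ) fixes each transition time. The short sketch below only restates that arithmetic; the function and constant names are hypothetical.

```python
F_MAX_HZ = 10e9  # maximum switching frequency from the specification


def transition_time(f_max_hz):
    """Solve fmax = 1 / (tr + tf) for tr under the assumption tr == tf."""
    return 1.0 / (2.0 * f_max_hz)


t_r = transition_time(F_MAX_HZ)
print(f"tr = tf = {t_r * 1e12:.1f} ps")
```

This gives the 1/ ( 20 GHz ) value used as the design target for the rise and fall times.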
The switching performance of CMOS digital circuits is characterized by the time intervals required to charge and discharge the capacitances at the output nodes. CMOS inverters use transistors to provide current flow paths to the power supply ( through Mp ) and to ground ( through Mn ) . All switching times are therefore set by the current levels and the value of Cout. The output high-to-low time represents the time interval needed for the output capacitance to discharge through the n-channel MOSFET Mn when Mp is in cutoff. This interval is also referred to as the fall time ( tf ) of the circuit, since it gives the time needed for the output to decay from a well-defined logic 1 state to a well-defined logic 0 state. The low-to-high time, also known as the rise time ( tr ) , represents the time interval needed for the output capacitance to charge through the p-channel MOSFET Mp. During this time interval, Mn is in cutoff while Mp is conducting from the power supply [ 3 ] .
From the design specifications ( fmax = 10 GHz, tr = tf ) :
tr = tf = 1/ ( 2 fmax ) = 1/ ( 20 GHz ) = 50 ps
( 4.2.2 )
( 4.2.3 )
Where VTn: threshold voltage of the NMOS.
Rn: resistance of the NMOS.
Cout: capacitive load applied to the output of the inverter = 50 fF.
V0: 0.1 VDD.
V1: 0.9 VDD.
Putting all the values together, we have
( W/L ) n = 5.725
Wn = 5.725 * 100 nm = 572.5 nm
But ( Wp/Wn ) = 1.63
Wp = 1.63 * Wn = 933.175 nm.
Using the gpdk090nm CMOS technology in IC 5.1.41, the widths are set to Wp = 935 nm and Wn = 575 nm. The schematic of the inverter is shown in Fig 4.2.1 [ 11 ] .
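The sizing arithmetic above can be reproduced in a few lines. This is a sketch under stated assumptions: the helper names are hypothetical, and the step that snaps each width up to a 5 nm increment is only a guess at how 572.5 nm and 933.175 nm became 575 nm and 935 nm, since the text does not state the rounding rule.

```python
import math

LENGTH_NM = 100.0      # channel length Lp = Ln
WL_N = 5.725           # (W/L)n obtained from the transition-time equations
KN_UA_V2 = 300.0       # process transconductance of the NMOS
KP_UA_V2 = 170.0       # process transconductance of the PMOS
BETA_RATIO = 1.083     # target ratio of the device transconductances


def inverter_widths(wl_n, length_nm, kn, kp, beta_ratio):
    """Return (Wn, Wp/Wn, Wp) in nm from the design ratios."""
    wn = wl_n * length_nm
    wp_over_wn = (kn / kp) / beta_ratio
    return wn, wp_over_wn, wp_over_wn * wn


def snap_up(width_nm, grid_nm=5.0):
    """ASSUMPTION: widths are rounded up to a 5 nm increment."""
    return math.ceil(width_nm / grid_nm) * grid_nm


wn, ratio, wp = inverter_widths(WL_N, LENGTH_NM, KN_UA_V2, KP_UA_V2, BETA_RATIO)
print(f"Wn = {wn:.1f} nm, Wp/Wn = {ratio:.2f}, Wp = {wp:.1f} nm")
print(f"drawn: Wn = {snap_up(wn):.0f} nm, Wp = {snap_up(wp):.0f} nm")
```

Because the sketch carries kn/kp at full precision instead of the rounded 1.76, the intermediate Wp differs slightly from 933.175 nm, but the drawn widths come out the same.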
4.2.2 Simulation of CMOS inverter
Transistor-level circuits are simulated using the SPECTRE simulator in the Analog Design Environment. Both DC and transient analyses are performed, as shown in Fig 4.2.2. The Affirm Analog test bench was then created to test the schematics [ 11 ] . The analog simulations show the effects of capacitance related to transistor size, so that clock skew, signal delays and setup-and-hold violations become apparent.
The propagation delay time tP is the logic delay through a gate. Physically, we interpret it as the average time needed for the output to respond to a change in the input logic state. By definition,
tP = ( tPHL + tPLH ) /2 ( 4.3.1 )
where tPHL and tPLH represent the propagation delays for a high-to-low and a low-to-high transition, respectively. With the 50 % voltage points taken as the reference levels, tPHL and tPLH are defined by the time intervals between the input and output voltages crossing those points.
From the simulation
Fig 4.2.2 transient and DC analysis waveforms.
4.2.3 Layout
Fig 4.2.3 CMOS inverter layout
4.2.4 Parametric Extraction and Post-layout simulation
Fig 4.2.4 shows the parametric extraction of the CMOS inverter. After post-layout simulation the propagation delay is 90.2 psec.
Fig 4.2.4 avs extraction of CMOS Inverter.
4.3 NAND-2
Two PMOS transistors are connected in parallel and two NMOS transistors are connected in series, as shown in Fig 4.3.1 [ 3 ] .
( Wn ) NAND = 2 ( Wn ) INV =575*2=1150nm
( Wp ) NAND = ( Wp ) INV= 935nm.
Fig 4.3.1 NAND2 schematics.
The layout and parametric extraction are shown in Fig 4.3.2 and Fig 4.3.3. The propagation delay after post-layout simulation is 71.41 psec.
Fig 4.3.2 NAND2 Layout
Fig 4.3.3 Extraction of NAND2
4.4 NAND3
Three PMOS transistors are connected in parallel and three NMOS transistors are connected in series, as shown in Fig 4.4.1 [ 3 ] .
( Wn ) NAND = 3 ( Wn ) INV =575*3=1725nm.
( Wp ) NAND = ( Wp ) INV= 935nm.
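The sizing rule used for both NAND gates can be stated compactly: the NMOS devices sit in series, so each one is widened by the number of inputs to keep the pull-down path comparable to that of the reference inverter, while the parallel PMOS devices keep the inverter width. A minimal sketch, with hypothetical names:

```python
WN_INV_NM = 575   # NMOS width of the reference inverter
WP_INV_NM = 935   # PMOS width of the reference inverter


def nand_widths(n_inputs, wn_inv=WN_INV_NM, wp_inv=WP_INV_NM):
    """Return (Wn, Wp) in nm for an n-input NAND sized from the inverter.

    Series NMOS transistors are scaled by n_inputs; parallel PMOS
    transistors keep the inverter width.
    """
    return n_inputs * wn_inv, wp_inv


print(nand_widths(2))  # NAND2
print(nand_widths(3))  # NAND3
```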
Fig 4.4.1 NAND3 schematics.
The layout and parametric extraction are shown in Fig 4.4.2 and Fig 4.4.3. The propagation delay in both pre-layout and post-layout simulation is measured as 56 psec.
Fig 4.4.2 layout of NAND3.
Fig 4.4.3 Extraction of NAND3.
4.5 D-FF ( Delay )
A D flip-flop was made from NAND3s, NAND2s, and an inverter, as shown in Fig 4.5.1.
D-FF with Set
With S = 1.8V, Q = D
With S = 0V, Q = 0V
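The behaviour of the S input can be captured in a small behavioural model. This is only a sketch: the class and method names are hypothetical, logic levels stand in for 1.8 V and 0 V, and the model applies S at the clock event because the text does not say whether S acts synchronously or asynchronously.

```python
class DFlipFlopWithSet:
    """Behavioural model: Q follows D when S is high, Q is 0 when S is low."""

    def __init__(self):
        self.q = 0

    def clock(self, d, s):
        """Apply one clock event with data d and control s (both 0 or 1)."""
        self.q = d if s else 0
        return self.q


ff = DFlipFlopWithSet()
print(ff.clock(d=1, s=1))  # S high: Q takes the value of D
print(ff.clock(d=1, s=0))  # S low: Q is forced to 0
```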
Fig 4.5.1 D-FF schematics
After simulation of the D-FF, the result gives
Fig 4.5.2 layout of D-FF.
Fig 4.5.3 extraction of D-FF.
4.6 PPM Adder
The PPM adder performs the following operation:
( 4.6.1 )
Using the above equation, t+ and u- are represented as
( 4.6.2 )
( 4.6.3 )
Based on equations ( 4.6.2, 4.6.3 ) , u- is built from two XNOR gates and t+ is built from XNOR, XOR and pass transistors, as shown in Fig 4.6.1. The adder consists of 10 transistors: two 4-transistor XNOR gates generate u-, and two pass gates generate t+. These gates are based on pass-transistor logic, which causes threshold voltage ( VTn ) losses for specific input sets. This is because n-MOS pass transistors suffer a voltage loss when transmitting logic 1, while p-MOS pass transistors degrade the transmission of logic 0, leaving the level at VTp instead of 0 [ 3 ] , [ 5-7 ] . As a result, the logic 1 output voltage of the 10-transistor PPM adder is degraded below VDD, and the logic 0 output sits at 2VT instead of 0 V, as shown in Fig 4.6.2.
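The gate structure described above can be checked against the arithmetic it is meant to implement. The sketch below assumes the usual definition of a plus-plus-minus cell in redundant arithmetic, x + y - z = 2t - u with all variables single bits. Equations ( 4.6.1 ) to ( 4.6.3 ) are not reproduced in this section, so the relation, the selection expression for t and all names are assumptions rather than the thesis equations.

```python
from itertools import product


def xnor(a, b):
    """Two-input XNOR on bits."""
    return 1 - (a ^ b)


def ppm_adder(x, y, z):
    """Return (t, u) such that x + y - z == 2*t - u (assumed PPM relation)."""
    u = xnor(xnor(x, y), z)            # two cascaded XNOR gates
    t = (1 - z) if (x ^ y) else x      # 2:1 selection, as with two pass gates
    return t, u


# Exhaustive check over all eight input combinations.
for x, y, z in product((0, 1), repeat=3):
    t, u = ppm_adder(x, y, z)
    assert x + y - z == 2 * t - u
    assert u == x ^ y ^ z              # cascaded XNORs equal the 3-input XOR
```

Under this assumption, two cascaded XNOR gates are enough for u, and t needs only a two-way selection controlled by x XOR y, which is consistent with the 10-transistor structure described in the text.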
Fig 4.6.1 PPM Adder schematics.
Fig 4.6.2 waveform of PPM Adder.
The layout and extraction are shown in Fig 4.6.3 and Fig 4.6.4. A comparison of tP between pre-layout and post-layout simulation is shown below:
              Pre-layout     Post-layout
tPHL          75.91 psec     83.06 psec
tPLH          79.85 psec     81.44 psec
tP            77.74 psec     82.25 psec
Avg. power    1.968 µW       3.305 µW
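Since tP is the average of the high-to-low and low-to-high delays, the tP row can be recomputed from the two rows above it. A minimal sketch using the post-layout values of the PPM adder; the function name is hypothetical.

```python
def propagation_delay(t_phl_ps, t_plh_ps):
    """Average propagation delay tP in picoseconds."""
    return (t_phl_ps + t_plh_ps) / 2.0


ppm_post_layout = propagation_delay(83.06, 81.44)
print(f"tP (post-layout) = {ppm_post_layout:.2f} psec")
```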
Fig 4.6.3 PPM Adder layout.
Fig 4.6.4 Extraction PPM Adder.
4.7 MMP Subtractor
The MMP subtractor performs the following operation:
( 4.7.1 )
Using the above equation, t- and u+ are represented as
( 4.7.2 )
( 4.7.3 )
Based on equations ( 4.7.2, 4.7.3 ) , u+ is built from two XNOR gates and t- is built from XNOR, XOR and pass transistors, as shown in Fig 4.7.1. The subtractor consists of 10 transistors: two 4-transistor XNOR gates generate u+, and two pass gates generate t-. These gates are based on pass-transistor logic, which causes threshold voltage ( VTn ) losses for specific input sets. This is because n-MOS pass transistors suffer a voltage loss when transmitting logic 1, while p-MOS pass transistors degrade the transmission of logic 0, leaving the level at VTp instead of 0. As a result, the logic 1 output voltage of the 10-transistor MMP subtractor is degraded below VDD, and the logic 0 output sits at 2VT instead of 0 V, as shown in Fig 4.7.2 [ 3 ] , [ 6 ] .
Fig 4.7.1 MMP Subtractor schematics.
Fig 4.7.2 waveform of MMP Subtractor.
Fig 4.7.3 layout of MMP Subtractor.
The layout and extraction of the MMP subtractor are shown in Fig 4.7.3 and Fig 4.7.4, and a comparison of the simulation results is shown below:
              Pre-layout     Post-layout
tPHL          75.91 psec     84.39 psec
tPLH          79.53 psec     81.42 psec
tP            77.22 psec     82.91 psec
Avg. power    3.49 µW        3.299 µW
Fig 4.7.4 Extraction of MMP Subtractor.
4.8 Digit-serial SBD redundant adder
The digit-serial SBD redundant adder consists of three components: the PPM adder, the MMP subtractor and delays, as shown in Fig 4.8.1. The simulation waveform is shown in Fig 4.8.2, and the propagation delay and average power dissipation are measured.
Fig 4.8.1 SBD Adder schematic.
Fig 4.8.2 Simulation waveform of SBD Adder.
4.9 Box-car FIR filter
The 4-tap Box-car FIR filters with 1-bit input and 4-bit input are shown in Fig 4.9.1 and Fig 4.9.2. The simulation waveforms are shown in Fig 4.9.3, and the measured propagation delay is 11.48 nsec. The average power is 663.4 µW for the 1-bit input and 2.653 mW for the 4-bit input.
Fig 4.9.1 4-tap Box-car FIR filter ( 1-bit input ) .
Fig 4.9.2 4-tap Box-car FIR filter ( 4-bit input ) .
Fig 4.9.3 Simulation waveforms of Box-car FIR filter.
4.10 Digit-serial Multiplier
Multiplication is simply a series of repeated additions that are shifted. Consider the following signed binary multiplication of two 4-bit integer values. The multiplicand B ( b3b2b1b0 ) is in signed binary and the multiplier A ( a3a2a1a0 ) is in normal binary representation. Depending on the value of bj, it is recoded ( using Table 2.8 ) , and its gate implementation is shown in Fig 4.10.1. In the digit-serial multiplier, for every bit, starting with the most significant bit ( MSB ) and ending with the least significant bit ( LSB ) , the multiplier is multiplied with the multiplicand. Every multiplication bit is simply a combination of XOR and AND operations [ 1,9,13 ] , as shown in Fig 4.10.2. For an N-bit wide multiplicand and multiplier ( an N x N multiplication ) , the product is 2N bits wide. The desired 4 x 4 multiplication would therefore have an 8-bit product, but here it is 9-bit because PPM and MMP adders are used. At the final stage the 9-bit multiplier output is in normal binary form, as shown in Fig 4.10.3.
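The MSB-first shift-and-add idea can be sketched at the arithmetic level as follows. This is not the gate-level circuit and does not use the recoding of Table 2.8; the function name is hypothetical and the digits of B are simply taken from the signed-digit set {-1, 0, 1}.

```python
from itertools import product


def multiply_msb_first(a, b_digits_msb_first):
    """Multiply unsigned integer a by a signed-binary-digit number B.

    B is given as digits in {-1, 0, 1}, most significant digit first.
    Each step doubles the running sum (a shift) and adds digit * a.
    """
    acc = 0
    for digit in b_digits_msb_first:
        if digit not in (-1, 0, 1):
            raise ValueError("signed binary digits must be -1, 0 or 1")
        acc = 2 * acc + digit * a
    return acc


# B = 1 0 -1 1 represents 8 - 2 + 1 = 7, so 13 * B = 91.
print(multiply_msb_first(13, [1, 0, -1, 1]))

# Range of all 4-digit by 4-bit products.
results = [
    multiply_msb_first(a, digits)
    for a in range(16)
    for digits in product((-1, 0, 1), repeat=4)
]
print(min(results), max(results))
```

In this model the products span -225 to 225, which does not fit in 8 bits of two's complement but does fit in 9, an observation that is at least consistent with the 9-bit output reported above.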
Fig 4.10.1 Recoding of bj ( schematics ) .
Fig 4.10.2 multiplier ( schematics ) .
The simulation results of the digit-serial multiplier are shown below:
Fig 4.10.3 Digit-serial multiplier.
Fig 4.10.4 symbol of multiplier
4.11 3-tap FIR filter
Here we have designed the data-broadcast or transposed 3-tap FIR filter. It consists of three multipliers and two adders with D-FFs. X ( n ) is the 4-bit impulse input, and the 4-bit multiplier coefficient has a value less than 1. We choose the coefficient A to be 0.125 ( see APPENDIX II ) . A 9-bit output is produced at each multiplier, and these outputs are connected to the 9-FA with 9-D-FFs shown in Fig 4.11.1 and Fig 4.11.2. Finally, a 10-bit output ( S9...S0 ) is produced at the output of the filter.
Fig 4.11.1 9-FA schematic.
Fig 4.11.2 9-DFF schematic.
The schematic and test bench of the 3-tap FIR filter are shown in Fig 4.11.3 and Fig 4.11.4. The simulation waveforms are shown in Fig 4.11.5.
Fig 4.11.3 FIR filter schematic.
Fig 4.11.4 test-bench of FIR filter.
Fig 4.11.5 simulation waveforms of FIR filter.
The simulation results are summarized below:
As the propagation delay of the designed filter is 2.716 nsec, the frequency of operation is 368.18 MHz. The sampling frequency of the filter is therefore 368.18 MHz.
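The operating frequency follows directly from the propagation delay as its reciprocal. A minimal sketch of that conversion, with a hypothetical function name:

```python
def max_clock_mhz(delay_ns):
    """Maximum clock frequency in MHz for a critical-path delay in ns."""
    return 1e3 / delay_ns


print(f"{max_clock_mhz(2.716):.2f} MHz")
```

For a delay of 2.716 nsec this agrees with the quoted 368.18 MHz to within rounding of the last digit.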
Chapter 5
5.1 Conclusion
1 ) A bottom-up 3-tap FIR filter is designed with a sampling frequency of 368.18 MHz, i.e. the design can run at 368.18 MHz, and uses 7.825 mW of power per clock.
2 ) The design complexity is high.
3 ) A high-performance Box-car FIR filter was designed.
4 ) Because FIR filters are such an important element of DSP design, it is good to do a project like this to strengthen understanding of the concept.
5 ) This project is a good start for students to learn the IC design flow with the CDS tool.
5.2 Future Work
It is impossible to design an N-tap FIR filter using the bottom-up flow. A top-down ASIC design flow can be used to design an N-tap filter with an optimization algorithm technique.