# Performance Analysis of Modified Feedthrough Logic for Low Power and High Speed

Sauvagya Ranjan Sahoo<sup>1</sup>, Kamala Kanta Mahapatra<sup>2</sup>

Department of Electronics & Communication Engineering
National Institute of Technology Rourkela
Rourkela, India

<sup>1</sup>sauvagya.nitrkl@gmail.com <sup>2</sup>kmaha2@gmail.com

Abstract— This paper presents the design of a low-power, high-performance dynamic circuit using a CMOS domino logic family called feedthrough logic (FTL). The need for faster circuits with low power dissipation has made feedthrough logic an attractive choice. The proposed low-power circuit reduces dynamic power consumption compared to the existing feedthrough logic. To further improve speed, we propose a second circuit that improves speed at the cost of higher dynamic power consumption. The proposed circuits are simulated in a 0.18 µm, 1.8 V CMOS process technology. Extensive simulation results in the Cadence environment show that the proposed low-power structure reduces dynamic power by approximately 35%, and that the proposed high-performance structure achieves a speed-up factor of 1.3 for a 10-stage inverter chain and an 8-bit ripple carry adder, in comparison to the existing feedthrough logic. The need for an output inverter and the restriction to non-inverting logic are also eliminated in the proposed design.

Index Terms— Feedthrough logic (FTL); dynamic CMOS logic circuit; high performance; low-power adder.

### I. INTRODUCTION

Reducing the power consumption of CMOS integrated circuits while improving their performance has been a topic of great interest in recent years. The various design techniques proposed in the last two decades trade power for performance. This is achieved through a mix of dynamic and static circuit styles [1], the use of dual-threshold-voltage transistors [2] and dual supply voltages [3]. For many applications, speed improvement is achieved at the expense of power.

Domino logic circuits are widely used in high-performance integrated circuits. They reduce silicon area and transistor count and improve performance compared to static CMOS logic [4, 5, 6]. The major drawbacks of domino logic are its excessive power dissipation due to switching activity and the need for inverters when cascading logic blocks.

To improve the power consumption further, a logic family called feedthrough logic (FTL) was proposed in [7]. This logic works on the domino concept, with the important feature that the output is evaluated before all the inputs are valid. This feature results in very fast evaluation in computational blocks. The problems associated with domino logic circuits [4], such as charge sharing, charge leakage and non-inverting logic, are completely eliminated.

The dynamic design in [3] uses a high supply voltage for logic evaluation and a low supply voltage for clocking the dynamic logic. The adder designed in [8] uses an architectural technique to reduce the short-circuit current, and the work in [9] uses two dynamic gates between three static gates.

The dynamic power dissipation in a CMOS circuit is given by [10]:

$$P_{dynamic} = V_{dd} F_{clk} \sum_{i} V_{i,swing}\, C_{i,load}\, \alpha_{i}$$

where $\alpha_i$ is the switching factor at node $i$, $C_{i,load}$ is the load capacitance at node $i$, $V_{i,swing}$ is the voltage swing at node $i$, $F_{clk}$ is the system clock frequency and $V_{dd}$ is the supply voltage.
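The expression above can be evaluated numerically with a few lines of Python. The following is a minimal sketch, not part of the reported simulations: the `Node` container, the function name `dynamic_power` and all node values are hypothetical and chosen only to show how a smaller voltage swing lowers the sum.

```python
from dataclasses import dataclass


@dataclass
class Node:
    v_swing: float  # voltage swing at the node, in volts
    c_load: float   # load capacitance at the node, in farads
    alpha: float    # switching factor at the node (between 0 and 1)


def dynamic_power(v_dd, f_clk, nodes):
    """Return V_dd * F_clk * sum_i(V_swing_i * C_load_i * alpha_i) in watts."""
    return v_dd * f_clk * sum(n.v_swing * n.c_load * n.alpha for n in nodes)


# Hypothetical example: ten identical nodes with 20 fF loads and alpha = 0.5,
# once with a full 1.8 V swing and once with an assumed reduced 1.2 V swing.
nodes_full = [Node(v_swing=1.8, c_load=20e-15, alpha=0.5) for _ in range(10)]
nodes_reduced = [Node(v_swing=1.2, c_load=20e-15, alpha=0.5) for _ in range(10)]

p_full = dynamic_power(1.8, 100e6, nodes_full)
p_reduced = dynamic_power(1.8, 100e6, nodes_reduced)
print(f"full swing: {p_full * 1e6:.2f} uW, reduced swing: {p_reduced * 1e6:.2f} uW")
```

Because the swing enters the sum linearly, the power in this model scales in direct proportion to the swing at each node. The model covers only capacitive switching power and ignores any direct-path current.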

In this paper we present the design of a low power FTL (LP-FTL) circuit that further improves the power consumption of FTL in [7] at the cost of propagation delay. In order to prove the usefulness of the proposed circuit we designed an 8-bit ripple carry adder (RCA) structure. The LP-FTL circuit is further modified to improve speed.

The rest of the paper is organized as follows. Section II explains the operating principle of the existing basic FTL structure. Section III presents the proposed modified FTL structures. Section IV presents the performance analysis of both proposed structures using a long chain of inverters. Section V gives the performance analysis of an 8-bit ripple carry adder (RCA) structure, and conclusions are drawn in Section VI.



Fig. 1. (a) Basic structure of FTL. (b) Transistor-level circuit diagram of a long chain of inverters using FTL (10 stages). (c) Output voltages from the 1<sup>st</sup> stage (N1) to the 10<sup>th</sup> stage (N10) of inverters designed using basic FTL.

### II. CONVENTIONAL FTL PRINCIPLE OF OPERATION

The basic structure of FTL is shown in Fig. 1(a). It consists of an NMOS reset transistor $M_r$ for resetting the output node to the low logic level, a pull-up PMOS load transistor $M_p$ and an NMOS block. $M_p$ and $M_r$ are controlled by the clock signal CLK.

The basic principle of operation of an FTL circuit was presented in [7] and is summarized here. When CLK = 1 (reset phase), $M_r$ is turned on and the output node is pulled to ground through $M_r$. Since the output node is pulled to ground during the reset phase, the need for an inverter when cascading is eliminated. When CLK goes low (evaluation phase), $M_r$ is turned off and the output node conditionally evaluates to logic high $(V_{OH})$ or low $(V_{OL})$ depending on the inputs to the NMOS block. If the NMOS block evaluates to high, the output node is pulled toward $V_{DD}$; otherwise it remains at logic low.

A long chain of inverters designed using FTL is shown in Fig. 1(b). When CLK = 1, all the output nodes are at logic zero. When CLK goes low, the output nodes of the cascaded gates rise to the gate threshold voltage $V_{TH}$, as shown in Fig. 1(c). At this point, any small variation at the input node causes a fast variation in voltage at the output node. When the inputs to a gate become valid, its output node makes only a partial transition from $V_{TH}$ to $V_{OH}$ or $V_{OL}$. Because the transition at the output node occurs only from $V_{TH}$ to $V_{OH}$ or $V_{OL}$, both the low-to-high and high-to-low propagation delays are reduced.

Despite its performance advantage, FTL suffers from reduced noise margin, direct-path current and, most importantly, a non-zero nominal low output voltage (i.e. $V_{OL} \neq 0$) caused by the contention between the PMOS load $(M_p)$ and the NMOS transistors in the evaluation block. This increases the dynamic power consumption of the circuit.

### III. PROPOSED MODIFIED FTL

### A. Low power proposed modified FTL (LP-FTL)

The proposed modified circuit is shown in Fig. 2(a). This circuit reduces $V_{OL}$ by using one additional PMOS transistor $M_{p2}$ in series with $M_{p1}$. The operation of this circuit is similar to that of the FTL in [7]. During the reset phase, i.e. when CLK = 1, the output node is pulled to ground (GND) through $M_r$. When CLK goes low (evaluation phase), $M_r$ is turned off, the output node charges through $M_{p1}$ and $M_{p2}$, and it conditionally evaluates to logic high $(V_{OH})$ or low $(V_{OL})$ depending on the inputs to the NMOS block. If the NMOS block evaluates to high, the output node is pulled toward $V_{DD}$, i.e. $V_{OH} = V_{DD}$; otherwise it remains at logic low, i.e. $V_{OL}$. Since $M_{p1}$ and $M_{p2}$ are in series, the voltage at the drain of $M_{p1}$ is less than $V_{DD}$. During evaluation, because of the ratioed logic, the output node is pulled to a logic-low voltage $V_{OL}$ that is lower than the $V_{OL}$ of the existing FTL. This reduction in $V_{OL}$ causes a significant reduction in dynamic power consumption, but the insertion of the PMOS transistor $M_{p2}$ increases the propagation delay of the proposed LP-FTL in Fig. 2(a).
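One way to see why a series PMOS lowers $V_{OL}$ is a first-order resistive-divider view of ratioed logic. The sketch below is illustrative only and is not the model used in the simulations: it treats conducting transistors as fixed resistors, assumes $M_{p2}$ has the same on-resistance as $M_{p1}$, and uses hypothetical resistance values and function names.

```python
def ratioed_v_ol(v_dd, r_pull_up, r_pull_down):
    """Low output level of a ratioed gate modelled as a resistive divider."""
    return v_dd * r_pull_down / (r_pull_up + r_pull_down)


def contention_current(v_dd, r_pull_up, r_pull_down):
    """Direct-path current while pull-up and pull-down conduct together."""
    return v_dd / (r_pull_up + r_pull_down)


V_DD = 1.8   # supply voltage in volts
R_P = 10e3   # hypothetical on-resistance of one PMOS load, in ohms
R_N = 2e3    # hypothetical on-resistance of the NMOS block, in ohms

# Single PMOS load (basic FTL) versus two PMOS devices in series (LP-FTL).
v_ol_ftl = ratioed_v_ol(V_DD, R_P, R_N)
v_ol_lp = ratioed_v_ol(V_DD, 2 * R_P, R_N)
i_ftl = contention_current(V_DD, R_P, R_N)
i_lp = contention_current(V_DD, 2 * R_P, R_N)

print(f"V_OL: {v_ol_ftl:.3f} V (FTL) vs {v_ol_lp:.3f} V (LP-FTL)")
print(f"contention current: {i_ftl * 1e6:.1f} uA vs {i_lp * 1e6:.1f} uA")
```

In this simplified picture, the larger pull-up resistance lowers both the low output level and the contention current. It also means less current is available to charge the output, which is consistent with the longer delay of LP-FTL.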

The propagation delay of the proposed FTL circuit can be improved by using an NMOS transistor as shown in Fig. 2(b).

### B. High speed proposed modified FTL (HS-FTL)

To improve the speed of the proposed LP-FTL structure, the reset transistor $M_r$ is connected to $V_{DD}/2$, as shown in Fig. 2(b). The circuit operates as follows. When CLK = 1, the output node (OUT) charges to the threshold voltage $V_{TH}$. During the evaluation phase, depending on the input value, the output node makes only a partial transition from $V_{TH}$ to $V_{OH}$ or $V_{OL}$. Because the output node makes only partial transitions during the evaluation phase, the propagation delay improves. An inverter designed using HS-FTL is shown in Fig. 2(c).
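The benefit of a partial transition can be illustrated with a single-pole RC charging model. This is a rough sketch under stated assumptions, not the circuit's actual behaviour: the node is treated as a linear RC network, $V_{DD}/2$ stands in for $V_{TH}$, the 90% target level and the resistance value are hypothetical, and the function name is illustrative.

```python
import math


def rc_transition_time(r, c, v_start, v_final, v_target):
    """Time for an RC node to move from v_start to v_target while
    settling exponentially toward v_final."""
    return r * c * math.log((v_final - v_start) / (v_final - v_target))


R = 10e3     # hypothetical effective pull-up resistance, in ohms
C = 20e-15   # load capacitance, in farads
V_DD = 1.8   # supply voltage, in volts

# Rising output: time to reach 90% of V_DD from ground versus from V_DD/2.
t_from_gnd = rc_transition_time(R, C, 0.0, V_DD, 0.9 * V_DD)
t_from_mid = rc_transition_time(R, C, V_DD / 2, V_DD, 0.9 * V_DD)

print(f"from GND: {t_from_gnd * 1e12:.0f} ps, from V_DD/2: {t_from_mid * 1e12:.0f} ps")
print(f"ratio: {t_from_gnd / t_from_mid:.2f}")
```

In this model, the ratio of the two times depends only on the start and target voltages, not on the chosen R and C. Starting from the midpoint shortens the transition regardless of the exact load.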



Fig. 2. (a) Proposed modified low power FTL structure (LP-FTL). (b) Proposed modified high speed FTL structure (HS-FTL). (c) Circuit diagram of inverter using HS-FTL structure.



Fig. 3. Plot of the output voltages from 1st stage (N1) to 10th stage (N10) of inverters. (a) For FTL structure in [7]. (b) For LP-FTL structure. (c) For HS-FTL structure.

### IV. PERFORMANCE ANALYSIS OF PROPOSED MODIFIED FTL

The performance of the two proposed modified FTL structures is verified against the existing FTL structure in [7] by designing a long chain of inverters consisting of 10 stages. We used the 0.18 µm CMOS process technology model library from UMC, with the parameters for the typical process corner at $25\,^{\circ}$C. The power supply $V_{DD}$ is constant for all simulations and equal to 1.8 V. The circuits are simulated in the HSPICE simulator.

Fig. 3(a)-(c) show the output voltages from the 1st to the 10th inverter stage with 20 fF capacitive loads for the existing FTL in [7], the proposed LP-FTL and the proposed HS-FTL, respectively. Fig. 3(a) and 3(b) show that the $V_{OL}$ of LP-FTL is lower than that of the existing FTL in [7]. The transition for HS-FTL occurs from $V_{TH}$ to either $V_{OH}$ or $V_{OL}$, as shown in Fig. 3(c).

Table I compares the dynamic power, average propagation delay ($t_p$) and power delay product (PDP) of the two proposed modified FTL structures and the existing FTL in [7] for a 20 fF capacitive load at 100 MHz. The LP-FTL structure reduces power consumption because of its lower $V_{OL}$: its power consumption is 36.8% less than that of the FTL in [7]. The HS-FTL structure improves the average propagation delay by a factor of 1.83 with respect to LP-FTL and 1.36 with respect to the FTL in [7]. The power delay product of both proposed structures is better than that of the existing FTL structure.

TABLE I. SIMULATION RESULTS FOR FTL, LP-FTL, HS-FTL (10-INVERTER CHAIN)

| Logic family    | Power (µW) | t<sub>p</sub> (ns) | PDP (µW*ns) |
|-----------------|------------|--------------------|-------------|
| FTL in [7]      | 290.1      | 1.294              | 375.38      |
| Proposed LP-FTL | 183.2      | 1.743              | 319.31      |
| Proposed HS-FTL | 316.9      | 0.95               | 301.05      |
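The comparison figures quoted for the inverter chain can be recomputed directly from the Table I entries. The short script below is a sketch for checking the arithmetic. The function names are illustrative, and power reduction, speed-up and PDP are taken with their usual definitions, which is an assumption about how the figures were derived.

```python
# Values transcribed from Table I: (power in uW, average delay in ns).
TABLE_I = {
    "FTL [7]": (290.1, 1.294),
    "LP-FTL": (183.2, 1.743),
    "HS-FTL": (316.9, 0.95),
}


def power_reduction_pct(p_ref, p_new):
    """Percentage by which p_new is lower than p_ref."""
    return 100.0 * (p_ref - p_new) / p_ref


def speed_up(t_ref, t_new):
    """Ratio of reference delay to new delay (above 1 means faster)."""
    return t_ref / t_new


def pdp(power_uw, delay_ns):
    """Power delay product in uW*ns."""
    return power_uw * delay_ns


p_ftl, t_ftl = TABLE_I["FTL [7]"]
p_lp, t_lp = TABLE_I["LP-FTL"]
p_hs, t_hs = TABLE_I["HS-FTL"]

lp_saving = power_reduction_pct(p_ftl, p_lp)
hs_vs_lp = speed_up(t_lp, t_hs)
hs_vs_ftl = speed_up(t_ftl, t_hs)

print(f"LP-FTL power reduction vs FTL: {lp_saving:.1f}%")
print(f"HS-FTL speed-up vs LP-FTL: {hs_vs_lp:.2f}, vs FTL: {hs_vs_ftl:.2f}")
for name, (power, delay) in TABLE_I.items():
    print(f"{name}: PDP = {pdp(power, delay):.2f} uW*ns")
```

The computed ratios reproduce the 36.8%, 1.83 and 1.36 quoted in the text, and the PDP products agree with the table to within the last digit.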

### V. RIPPLE CARRY ADDER DESIGN AND PERFORMANCE ANALYSIS

A full adder is designed using the basic sum and carry cells shown in Fig. 4(a) and (b). These basic cells are designed using the proposed LP-FTL structure and are used to build an 8-bit ripple carry adder as in [4]. The cells for the HS-FTL structure are similar.

All the 8-bit ripple carry adders, designed with the various structures, are simulated using the 0.18 µm CMOS process technology model library from UMC, with the parameters for the typical process corner at $25\,^{\circ}$C. The power supply $V_{DD}$ is constant for all simulations and equal to 1.8 V.

Table II shows the dynamic power consumption, propagation delay $(t_p)$ and power delay product (PDP) of the existing FTL structure in [7] and of the LP-FTL and HS-FTL structures for 10 fF capacitive loads at 100 MHz.

With respect to the existing FTL structure, the proposed LP-FTL structure provides a reduction in dynamic power of about 39% (249.91 µW versus 409.68 µW in Table II). The proposed HS-FTL structure achieves a speed-up factor of 2.65 with respect to the LP-FTL structure and 1.96 with respect to the existing FTL structure.

The power delay product is shown in Fig. 5. The chart shows that the PDP of both proposed structures is better than that of the existing FTL structure. The PDP improves because of the reduction in power in the LP-FTL structure and the reduction in average propagation delay in the HS-FTL structure.

Fig. 6 and Fig. 7 plot the propagation delay versus load capacitance (varied from 1 fF to 20 fF) and versus temperature (varied from $-20\,^{\circ}$C to $120\,^{\circ}$C), respectively.

The effect of inter-stage load capacitance on dynamic power, obtained by varying the load capacitance from 1 fF to 20 fF, is shown in Fig. 8. The power consumption of the proposed LP-FTL structure is lower than that of the other structures.

TABLE II. SIMULATION RESULTS FOR DYNAMIC POWER FOR AN 8-BIT RIPPLE CARRY ADDER DESIGNED BY PROPOSED FTL CIRCUITS AND THE EXISTING FTL STRUCTURE [7].

| Logic family    | Power (µW) | t<sub>p</sub> (ns) | PDP (µW*ns) |
|-----------------|-----------|---------------------|----------------|
| FTL in [7]      | 409.68    | 0.63                | 258.09         |
| Proposed LP-FTL | 249.91    | 0.85                | 212.42         |
| Proposed HS-FTL | 540.12    | 0.32                | 172.83         |
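As a worked check, the figures of merit for the adder can be recomputed from the Table II entries. Power reduction is taken here as the fractional drop relative to the existing FTL, speed-up as a ratio of delays, and PDP as power times delay. These are the usual definitions and are assumed rather than stated in the source.

$$\begin{aligned}
\text{Power reduction (LP-FTL vs. FTL)} &= \frac{409.68 - 249.91}{409.68} \approx 0.390 \\
\text{Speed-up (HS-FTL vs. LP-FTL)} &= \frac{0.85}{0.32} \approx 2.656 \\
\text{Speed-up (HS-FTL vs. FTL)} &= \frac{0.63}{0.32} \approx 1.969 \\
\text{PDP}_{\text{FTL}} &= 409.68 \times 0.63 \approx 258.10\ \mu\text{W}\cdot\text{ns} \\
\text{PDP}_{\text{LP-FTL}} &= 249.91 \times 0.85 \approx 212.42\ \mu\text{W}\cdot\text{ns} \\
\text{PDP}_{\text{HS-FTL}} &= 540.12 \times 0.32 \approx 172.84\ \mu\text{W}\cdot\text{ns}
\end{aligned}$$

The products agree with the PDP column of Table II to within the last digit.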





Fig. 4. LP-FTL structure of (a) sum cell, (b) carry cell.



Fig. 5. Power delay product for RCA.



Fig. 6. Effect of output load on propagation delay.



Fig. 7. Effect of temperature on propagation delay.



Fig. 8. Effect of output load capacitance on dynamic power dissipation.

### VI. CONCLUSION

In this paper we proposed a low-power dynamic circuit and a high-speed variant of it. The proposed circuits are simulated in 0.18 µm CMOS process technology from UMC. Compared with the existing FTL structure in [7], the LP-FTL structure reduces dynamic power consumption by at least 35%, and the HS-FTL structure provides a speed-up factor of at least 1.35 over both the LP-FTL and the existing FTL structure. Simulations of a long chain of inverters (10 stages) and an 8-bit ripple carry adder were carried out in this work. The simulation results confirm that, for a given load and the same operating frequency, the power delay product of both proposed circuits is better than that of the existing FTL structure.

### ACKNOWLEDGMENT

The authors acknowledge DIT (Ministry of Information & Communication Technology) for the financial support for carrying out this research work.

#### REFERENCES

- [1] S. Mathew, M. Anders, R. Krishnamurthy, and S. Borkar, "A 4 GHz 130 nm address generation unit with 32-bit sparse-tree adder core," in Proc. *IEEE VLSI Circuits Symp.*, Honolulu, HI, Jun. 2002, pp. 126-127.
- [2] S. Vangal, Y. Hoskote, D. Somasekhar, V. Erraguntla, J. Howard, G. Ruhl, V. Veeramachaneni, D. Finan, S. Mathew, and N. Borkar, "A 5-GHz floating point multiply-accumulator in 90-nm dual V<sub>T</sub> CMOS," in Proc. *IEEE Int. Solid-State Circuits Conf.*, San Francisco, CA, Feb. 2003, pp. 334-335.
- [3] R. K. Krishnamurthy, S. Hsu, M. Anders, B. Bloechel, B. Chatterjee, M. Sachdev, and S. Borkar, "Dual supply voltage clocking for 5 GHz 130 nm integer execution core," in Proc. *IEEE VLSI Circuits Symp.*, Honolulu, HI, Jun. 2002, pp. 128-129.
- [4] J. M. Rabaey, A. Chandrakasan, and B. Nikolic, *Digital Integrated Circuits: A Design Perspective*, 2nd ed., Prentice-Hall, Upper Saddle River, NJ, 2002.
- [5] S. M. Kang and Y. Leblebici, *CMOS Digital Integrated Circuits: Analysis & Design*, 3rd ed., Tata McGraw-Hill, 2003.
- [6] N. Weste and K. Eshraghian, *Principles of CMOS VLSI Design: A Systems Perspective*, Addison-Wesley, MA, 1988.
- [7] V. Navarro-Botello, J. A. Montiel-Nelson, and S. Nooshabadi, "Analysis of high performance fast feedthrough logic families in CMOS," *IEEE Trans. Circuits Syst. II*, vol. 54, no. 6, Jun. 2007, pp. 489-493.
- [8] Y. Jiang, A. Al-Sheraidah, Y. Wang, E. Sha, and J. Chung, "A novel multiplexer based low-power full adder," *IEEE Trans. Circuits Syst. II*, vol. 52, 2004, pp. 345-348.
- [9] S. Mathew, M. Anders, R. Krishnamurthy, and S. Borkar, "A 4 GHz 130 nm address generation unit with 32-bit sparse-tree adder core," *IEEE J. Solid-State Circuits*, vol. 38, no. 5, 2003, pp. 689-695.
- [10] K. S. Yeo and K. Roy, *Low-Voltage, Low-Power VLSI Subsystems*.