Low Power, High Performance 4bit/8bit Hybrid Multiplier Architecture



Abstract: This paper reviews the hybrid radix-4/radix-8 architecture for wide bit-width multipliers, which is a compromise between the high speed of a radix-4 multiplier and the lower power dissipation of a radix-8 multiplier. For a 64 x 64-bit multiplier, the hybrid architecture reduces power consumption by 13% at the cost of a 9% increase in delay compared with a conventional radix-4 multiplier. When the supply voltage is scaled so that the hybrid architecture and the basic radix-4 architecture exhibit the same delay, the power dissipation of the hybrid multiplier is still lower than that of the radix-4 multiplier. The hybrid architecture is therefore appropriate for applications that require minimal power consumption together with high performance.

Keywords: Radix-4 Multiplier, Booth Multiplier, Hybrid Multiplier.


Signal processing systems and arithmetic-based systems employ high-speed multipliers as fundamental elements. Wider multipliers make it possible to explore large architectures that were impractical for narrow multiplier circuits. These architectures are designed so that the circuit elements operate at high speed. At the circuit level, power dissipation and speed are considered the main design parameters.

Figure 1. Multiplier Power Factor Data

The data in Figure 1 present the power factors for a number of recent multiplier implementations. Sharma et al. utilized Booth radix-4 encoding along with a reduction array of carry save adders (CSAs) generated by a recursive algorithm to produce the 16 x 16-bit multiplier in [2]. In [3], Yano et al. introduced the complementary pass-transistor logic family (CPL) and implemented a 16 x 16-bit multiplier in CPL which used no encoding but did use a Wallace tree for partial product reduction. Nagamatsu et al. presented a 32 x 32-bit multiplier in which Booth radix-4 was used to generate the partial products and a tree of 4:2 counters was used to reduce these partial products [4]. Mori et al. designed a 54 x 54-bit multiplier similar in structure to that of [4], also utilizing Booth radix-4 and 4:2 counters [5]. In [6], Goto et al. presented a 54 x 54-bit multiplier with Booth radix-4 partial product generation, but used a regularly structured tree for partial product reduction, thereby simplifying the physical layout. Lu and Samueli were most concerned with throughput in the design of the multiplier-accumulator described in [7], and thus they presented a 13-stage, deeply pipelined 12 x 12-bit multiplier-accumulator which used no encoding and was implemented with a quasi-domino dynamic logic family. The data point representing this work is a 64 x 64-bit multiplier using both Booth radix-4 and radix-8 encoding with a Dadda reduction tree.
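Most of the designs above reduce their partial products with carry save adders arranged in a tree. The following Python sketch illustrates the idea only; it is not taken from any of the cited designs. The function names are hypothetical, and the simple grouping of rows in threes is an assumed Wallace-style schedule rather than a Dadda tree or a tree of 4:2 counters.

```python
def carry_save_add(x, y, z):
    """3:2 compression: three operands in, a sum word and a carry word out.

    No carry propagates between bit positions, which is why this step is fast.
    """
    s = x ^ y ^ z                               # bitwise sum without carries
    c = ((x & y) | (x & z) | (y & z)) << 1      # majority bits, shifted left
    return s, c


def reduce_partial_products(partial_products):
    """Reduce rows to two with carry save adders, then do one final add.

    Returns the total and the number of carry-save levels used.
    """
    rows = list(partial_products)
    levels = 0
    while len(rows) > 2:
        next_rows = []
        # Compress each complete group of three rows into two rows.
        for i in range(0, len(rows) - 2, 3):
            next_rows.extend(carry_save_add(rows[i], rows[i + 1], rows[i + 2]))
        # Rows left over (at most two) pass through to the next level.
        next_rows.extend(rows[len(rows) - len(rows) % 3:])
        rows = next_rows
        levels += 1
    # Final carry-propagate addition of the remaining rows.
    return sum(rows), levels
```

In this sketch only the last step needs a full carry-propagate adder; every earlier level has the delay of a single full adder cell regardless of the word width.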

In this paper, a hybrid radix-4/radix-8 multiplier architecture is presented as a method to trade off speed and power dissipation in two’s complement signed multipliers. Its characteristics are compared with those of the standard radix-4 and radix-8 multipliers.


In a radix-8 architecture, the multiplication process is serially dependent upon the time required to generate 3B, three times the multiplicand B: while 3B is being generated by a high speed adder, no other processing can take place within the multiplier. This requirement to generate 3B leads to a significant delay penalty, on the order of 10-20%, as compared with a radix-4 architecture (where the partial products may be generated by simple shifting and/or complementing).
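The difference between the two encodings can be made concrete with a small behavioural model. The Python sketch below is an illustration under stated assumptions, not the circuit described in the paper: the names booth_digits and booth_multiply are hypothetical, the multiplier is called a and the multiplicand b, and Python integers stand in for hardware words.

```python
def booth_digits(a, n_bits, r):
    """Recode the n_bits-wide two's complement multiplier `a` into signed
    radix-2**r Booth digits (r=2 is radix-4, r=3 is radix-8), LSB digit first."""
    def bit(i):
        # Python integers sign-extend, so bits above n_bits-1 repeat the sign.
        return 0 if i < 0 else (a >> i) & 1

    n_digits = -(-n_bits // r)          # ceiling division
    digits = []
    for k in range(n_digits):
        p = r * k                       # lowest bit position of this group
        d = bit(p - 1) - (bit(p + r - 1) << (r - 1))
        for j in range(r - 1):
            d += bit(p + j) << j
        digits.append(d)
    return digits


def booth_multiply(a, b, n_bits, r):
    """Multiply by summing one partial product per Booth digit."""
    three_b = (b << 1) + b              # the only multiple that needs an adder
    multiples = {0: 0, 1: b, 2: b << 1, 3: three_b, 4: b << 2}
    total = 0
    for k, d in enumerate(booth_digits(a, n_bits, r)):
        pp = multiples[abs(d)]          # radix-4 never selects entries 3 or 4
        if d < 0:
            pp = -pp                    # complementing in hardware
        total += pp << (r * k)
    return total
```

With this recoding a 64-bit multiplier produces 32 digits in radix-4 and 22 digits in radix-8. The radix-4 digits lie in the range -2 to 2, so every partial product is a shift or a complement of b, while the radix-8 digits can be -3 or 3, which is where the 3B term enters.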


In the hybrid radix-4/radix-8 architecture, a subset of the partial products is generated using radix-4 modified Booth encoding. Reduction begins on these radix-4 partial products while 3B is simultaneously being generated by a high speed adder. Upon generating 3B, the remaining partial products are generated using radix-8 encoding, and these partial products are subsequently included within the reduction tree.
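A behavioural sketch of this scheme is given below in Python. It rests on several assumptions that the text does not state: the low-order bits of the multiplier are taken as the radix-4 subset, the parameter n_radix4 (the number of radix-4 digits) is a free choice, and the names booth_digit and hybrid_multiply are hypothetical. A sequential program also cannot show the parallelism itself, so the comments only mark which work would overlap in hardware.

```python
def booth_digit(a, p, r):
    """Signed Booth digit for the r-bit group of multiplier `a` starting at bit p."""
    def bit(i):
        # Python integers sign-extend, so high bits repeat the sign bit.
        return 0 if i < 0 else (a >> i) & 1

    d = bit(p - 1) - (bit(p + r - 1) << (r - 1))
    for j in range(r - 1):
        d += bit(p + j) << j
    return d


def hybrid_multiply(a, b, n_bits, n_radix4):
    """Multiply a * b with n_radix4 low-order radix-4 digits, radix-8 above.

    Returns (product, number of radix-4 rows, number of radix-8 rows).
    """
    # Early partial products: radix-4, available by shifting/complementing b.
    early = []
    for k in range(n_radix4):
        p = 2 * k
        d = booth_digit(a, p, 2)
        magnitude = (0, b, b << 1)[abs(d)]
        early.append((-magnitude if d < 0 else magnitude) << p)

    # In hardware, reduction of `early` would start here while a fast adder
    # computes 3B at the same time.
    three_b = (b << 1) + b

    # Late partial products: radix-8 for the remaining multiplier bits.
    late = []
    p = 2 * n_radix4
    while p < n_bits:
        d = booth_digit(a, p, 3)
        magnitude = (0, b, b << 1, three_b, b << 2)[abs(d)]
        late.append((-magnitude if d < 0 else magnitude) << p)
        p += 3

    # The late rows join the same reduction tree; here both are simply summed.
    return sum(early) + sum(late), len(early), len(late)
```

The mixed recoding stays exact because each group, whatever its width, uses the top bit of the group below it as its look-behind bit. Choosing n_radix4 sets the trade-off in this model: more radix-4 digits give the 3B adder more time to finish but add rows to the reduction tree.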


In this section the performance of the hybrid multiplier is explained. The power dissipation and propagation delay results are compared to analyze the performance of the hybrid architecture against the conventional radix-4 and radix-8 multipliers.

The power dissipation of the hybrid multiplier architecture is reduced considerably, with only a minimal increase in delay. The transistor count is 25,678 for the hybrid 32 x 32-bit architecture and 90,238 for the 64 x 64-bit multiplier architecture.

Voltage scaling, that is, reducing the power supply voltage, may be applied to higher speed multipliers to reduce the power dissipation of these circuits, at the cost of increased delay.
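The trade-off can be illustrated with first-order textbook CMOS models, which are an assumption here and are not taken from the reviewed paper: dynamic power proportional to C * Vdd^2 * f, and gate delay proportional to Vdd / (Vdd - Vt)^alpha. The function names, the threshold voltage of 0.7 V, the exponent alpha = 2 and the two supply voltages in the Python sketch below are all hypothetical example values.

```python
def dynamic_power(c_eff, vdd, freq):
    """First-order dynamic power: effective switched capacitance * Vdd^2 * f."""
    return c_eff * vdd ** 2 * freq


def relative_delay(vdd, vt, alpha=2.0):
    """First-order gate delay model, up to a constant factor.

    alpha = 2 corresponds to the classic long-channel approximation.
    """
    if vdd <= vt:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return vdd / (vdd - vt) ** alpha


# Example: scale an assumed 5.0 V supply down to 3.3 V at a fixed clock rate.
power_ratio = dynamic_power(1.0, 3.3, 1.0) / dynamic_power(1.0, 5.0, 1.0)
delay_ratio = relative_delay(3.3, 0.7) / relative_delay(5.0, 0.7)
print(f"power scales by {power_ratio:.2f}, delay scales by {delay_ratio:.2f}")
```

Under these models power falls with the square of the supply voltage while delay grows, which is why a faster architecture can give up part of its speed margin in exchange for lower power.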


In the proposed work, the transistor count will be reduced by replacing the multiplexer circuit with a pass-transistor logic (PTL) circuit, which is expected to yield better results for the hybrid architecture. Voltage scaling of the power supply will help to reduce power consumption, and the propagation delay may also be optimized in the proposed circuit. The conventional radix-4 multiplier can likewise be improved by introducing PTL into the circuit.


We have studied and reviewed the radix-4 and radix-8 Booth multiplier architectures along with their characteristics. The hybrid architecture reduces power dissipation at the cost of a small increase in delay. These parameters will be further improved in the architecture to be proposed in the research paper.


[1] H. Sam and A. Gupta, “A Generalized Multi-bit Recoding of Two’s Complement Binary Numbers and Its Proof with Application in Multiplier Implementations,” IEEE Transactions on Computers, Vol. C-39, No. 8, pp. 1006-1015, August 1990.

[2] B. Millar, P. E. Madrid, and E. E. Swartzlander, Jr., “A Fast Hybrid Multiplier Combining Booth and Wallace/Dadda Algorithms,” Proceedings of the 35th IEEE Midwest Symposium on Circuits and Systems, pp. 158-165, August 1992.

[3] C. S. Wallace, “A Suggestion for a Fast Multiplier,” IEEE Transactions on Electronic Computers, Vol. EC-13, pp. 14-17, February 1964.

[4] L. Dadda, “Some Schemes for Parallel Multipliers,” Alta Frequenza, Vol. 34, No. 5, pp. 349-356, May 1965.

[5] S. M. Kang, “Accurate Simulation of Power Dissipation in VLSI Circuits,” IEEE Journal of Solid-State Circuits, Vol. SC-21, No. 5, pp. 889-891, October 1986.

[6] B. S. Cherkauer and E. G. Friedman, “A Unified Design Methodology for CMOS Tapered Buffers,” IEEE Transactions on VLSI Systems, Vol. VLSI-3, No. 1, pp. 99-111, March 1995.

[7] T. K. Callaway and E. E. Swartzlander, Jr., “Estimating the Power Consumption of CMOS Adders,” Proceedings of the 11th IEEE Symposium on Computer Arithmetic, pp. 210-216, June/July 1993.

[8] A. P. Chandrakasan, S. Sheng, and R. W. Brodersen, “Low-Power CMOS Digital Design,” IEEE Journal of Solid-State Circuits, Vol. SC-27, No. 4, pp. 473-483, April 1992.