# A New Dual-Modulus Divider Circuit Technique

Michael D. Pierschel and Hans Gustat

IHP Im Technologiepark 25 D-15236 Frankfurt (Oder) Germany email: pierschel@ihp-ffo.de

## Abstract

We report a new dual-modulus divider circuit technique, which avoids the frequency limitations due to commonly used additional logic in the high speed divider chain. The new pulse-swallow approach does not introduce additional delay. This circuit technique is not restricted to CMOS.

Prototypes of a 16/17 and a 32/33 divider in a 0.4 micron digital CMOS technology have been measured to run up to 2.825 GHz with 21.7 mW of total power. Between 2.3 GHz and 2.4 GHz, the circuit draws 7.3 mA current from a 2.7 V supply. The measured input sensitivity is below 0 dBm between 1.0 GHz and 2.4 GHz.

## 1. Introduction

Dual-modulus frequency dividers are crucial parts of frequency synthesiser blocks used in integrated transceivers.

Commonly, dual-modulus frequency dividers are built up from synchronous binary dividers and logic blocks to set the desired divider ratio [1], [2]. The logic block delay is a fundamental speed limitation in such an architecture, since the logic decision must be completed before the next input clock cycle begins. This significantly reduces the maximum input clock frequency below that corresponding to binary division.

The limitation due to logic delay can be alleviated by a phase-selection (phase rotation) approach [3], which uses an asynchronous full-speed divide-by-two circuit as the first stage.

The goal of this work is to develop a new dual-modulus divider architecture that comes close to a binary divider in terms of power consumption at a given operating frequency and technology.

## 2. Novel dual-modulus divider technique

### 2.1. Pulse-swallow concept

We present a dual-modulus divider concept, which is based on switching between two pure divide-by-two (div2) circuit structures rather than inserting additional logic or selection circuitry in the signal path.

For normal div2 operation, the active circuit is identical to a binary divider circuit comprising a fully differential master and slave latch in closed loop.

Correct divide-by-two function of this Johnson divider requires one logic inversion within the loop. Because the signals are fully differential, inversion simply means crossing the positive and negative signal lines.





Regardless of whether the signal crossing is located between latch A and B (Fig. 1) or between B and A (Fig. 2), both circuits act as constant-rate div2 circuits.



Fig. 2: Divide-by-two circuit with fully differential signals. Signal crossing between A and B.

For clock swallowing, we assume the location of signal crossing to be changed using ideal, delay-free switches, represented by boxes in Fig. 3.



Fig. 3: Divide-by-two circuit with ideal switches for signal crossing.

The location of the signal inversion point can be set according to a switch signal SW, thus allowing one to switch between the circuit structures shown in Fig. 1 and Fig. 2, respectively.

In binary division mode, each latch samples a logic state inverse to its current state at every input clock cycle.

When switching takes place, each latch input is inverted, resulting in sampling a logic state equal to current state. Consequently, the circuit is in a stationary state for this time and no changes in the output signal appear.

We obtain exactly a one-cycle delay before the normal div2 operation is executed again. This delay is equivalent to swallowing a full input clock cycle, as will be illustrated in the circuit solution section below. To implement the delay-free switch function, we use modified D-latches, including a switch signal input to select one of two parallel input stages (Fig. 4).



Fig. 4: Pulse-swallow divider concept
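The stationary-state argument of this section can be checked with a small behavioral model. The following Python sketch is only an illustration under stated assumptions, not the circuit itself: the latches are ideal and level-sensitive, latch A is assumed transparent while the clock is high and latch B while it is low, and each latch input is assumed to be switched while that latch is holding (A first, B half a cycle later). The names `simulate_div2`, `cross_a` and `cross_b` are hypothetical.

```python
def simulate_div2(num_cycles, switch_cycles=()):
    """Idealized two-latch loop; returns latch A's output once per input cycle.

    cross_a / cross_b tell whether the input of latch A / B is currently
    crossed (inverted). In steady state exactly one of them is True, which
    is the single inversion the loop needs for divide-by-two operation.
    """
    a, b = 0, 0
    cross_a, cross_b = True, False   # crossing located between B and A
    trace = []
    for cycle in range(num_cycles):
        switching = cycle in switch_cycles
        if switching:
            # Assumption: A's input pair was exchanged while A was holding
            # (during the preceding clock-low phase).
            cross_a = not cross_a
        a = b ^ cross_a              # clock high: A transparent, B holds
        if switching:
            # Assumption: B's input pair is exchanged while B is holding.
            cross_b = not cross_b
        b = a ^ cross_b              # clock low: B transparent, A holds
        trace.append(int(a))
    return trace


print(simulate_div2(8))                      # [1, 0, 1, 0, 1, 0, 1, 0]
print(simulate_div2(8, switch_cycles={2}))   # [1, 0, 0, 1, 0, 1, 0, 1]
```

In this idealized model, the latch that is switched first samples a state equal to its current state, so its output is held for one extra input cycle before normal div2 operation resumes.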

### 2.2. Circuit solution

The divider circuit is based on the standard binary divider circuit built of two symmetrical latches in current-mode logic (CML). A standard single CML latch is shown in Fig. 5.



Fig. 5: Standard CML D-Latch

To derive the modified CML D-Latch version, the input part is extended by a second input transistor pair and a switch transistor pair to select one of the input pairs D or O in mutual exclusion (Fig. 6).



Fig. 6: Modified version of CML D-Latch

Both input pairs have their gates always connected to their respective nodes. Switching is done by activating the current flow through one pair. Therefore, the switching process has little effect on intermediate node voltages, and is executed rapidly.

This circuit provides a mechanism for producing a delay of one input clock cycle at each transition of the swallow signal. An N+1 divider with $N = 2^{M}$ can be built using this pulse-swallow divider circuit as the first divider stage, followed by M asynchronous div2 stages. The output of the chain is fed back to the swallow input of the first stage. The M-th stage provides the desired output frequency $f_{out}$.
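The resulting division ratio can be illustrated with a counter-level sketch of such a chain. This is a simplified model, not the transistor-level circuit: the asynchronous stages are assumed to toggle on the falling edge of the preceding stage, the pulse-swallow stage is counted as stage 1 so that $f_{out}$ is taken after a total division by $N = 2^{M}$, and every transition of the fed-back chain output is assumed to make the first stage hold for exactly one input cycle. The function name `measure_output_period` is hypothetical.

```python
def measure_output_period(m, swallow_enabled, periods=3):
    """Return successive f_out periods, measured in input clock cycles."""
    q = [0] * (m + 1)      # q[0]: pulse-swallow stage, q[1..m]: async div2 stages
    hold_next = False      # True when the next input cycle is to be swallowed
    rising_edges = []      # input-cycle index of each rising edge of f_out
    cycle = 0
    while len(rising_edges) <= periods:
        cycle += 1
        if hold_next:
            hold_next = False          # swallowed cycle: first stage does not toggle
            continue
        for i in range(m + 1):
            q[i] ^= 1
            if i == m - 1 and q[i] == 1:
                rising_edges.append(cycle)   # f_out is taken after division by 2**m
            if i == m and swallow_enabled:
                hold_next = True             # swallow-signal transition
            if q[i] == 1:
                break                        # only falling edges ripple onward
    return [t1 - t0 for t0, t1 in zip(rising_edges, rising_edges[1:])]


print(measure_output_period(4, swallow_enabled=False))  # [16, 16, 16]
print(measure_output_period(4, swallow_enabled=True))   # [17, 17, 17]
```

With the feedback disabled the chain behaves as a plain binary divider; with it enabled, one input cycle is swallowed per output period, giving N+1.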

The following characteristics are based on circuit simulation (BSIM level 3 V.3) in the 1.6 - 2.4 GHz range:

- operating current $I = 570\,\mu$A,
- load resistance $R_l = 1100\,\Omega$,
- estimated parasitic capacitance $C_p = 20$ fF.
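As a rough plausibility check of these values, the sketch below computes two first-order quantities. Both formulas are textbook approximations assumed here, not results taken from the simulation: the single-ended swing of a CML stage is taken as the operating current times the load resistance (full current steering), and the output node is treated as a single RC time constant. The variable and function names are hypothetical.

```python
# Latch parameters quoted for the circuit simulation.
I_TAIL = 570e-6    # operating current in A
R_LOAD = 1100.0    # load resistance in ohm
C_PAR = 20e-15     # estimated parasitic capacitance in F

# Assumption: full current steering, so the swing is I * R.
swing_v = I_TAIL * R_LOAD
# Assumption: single-pole output node, so tau = R * C.
tau_s = R_LOAD * C_PAR


def half_period_in_tau(f_in_hz):
    """Half an input clock period, expressed in output time constants."""
    return (0.5 / f_in_hz) / tau_s


print(f"swing = {swing_v * 1e3:.0f} mV")     # about 627 mV
print(f"tau = {tau_s * 1e12:.1f} ps")        # about 22 ps
for f_in in (1.6e9, 2.4e9):
    print(f"{f_in / 1e9:.1f} GHz: half period = {half_period_in_tau(f_in):.1f} tau")
```

Under these assumptions, half an input period still spans several time constants at the upper end of the simulated range, which is consistent with the latch settling within one clock phase.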

In Fig. 7, a Cadence Spectre transient simulation illustrates a transition of the switch signal SW.

Regardless of the switch signal setting, the input circuit drives the same output nodes (Q and $\overline{Q}$ in Fig. 6).

The power consumption of this circuit in dual-modulus mode equals the power consumption in constant-rate mode.



Fig. 7: Simulated transient response to a switch transition

### 2.3. Switching control

A common switch signal applied to both D-latches of Fig. 3 that is not synchronous to the input clock can lead to unexpected behavior when used for dual-modulus division. For example, a stable N+0.5 division ratio can appear when the rising and falling edges of the switch signal have different slopes.

For correct pulse-swallow function, the newly activated input stage should have enough active time to charge/discharge the output nodes corresponding to the input signal levels. Otherwise, glitches and metastable states affect the pulse-swallow behavior, and can even lead to a pulse generation function, instead of swallowing (division by N-1 instead of N+1).

This timing sensitivity requires a synchronization of the switch signal with the input clock. To avoid deterioration of the binary division function, the master and the slave latches are each switched in their inactive clock phase. Two subsequent current-mode latches (E and F in Fig. 8) provide the control signals for this two-step switching.

Since the input pair exchange of the inactive latch is faster than the output signal change of the active one, switching is completed before the end of the input clock cycle, and divider performance is not degraded by the swallow procedure.

Circuits C to F consist of standard current-mode logic D-Latches (Fig. 5) in series to convert CMOS to CML logic levels and provide half-cycle delayed switching signals to the A and B latches. This provides precise synchronous timing control at the expense of additional power consumption in the synchronisation latches.

### 2.4. Circuit application

In our implementation, the high-speed pulse-swallow stage described above is followed by three CML div2 dividers and a level shifter to CMOS logic. The circuit is completed by two standard CMOS toggle flip-flop (FF) circuits, some low-speed CMOS logic gates, and a CMOS output driver block. The first standard CMOS FF is configured as pass-through or div2 for selecting the division ratio N of 16 or 32, and the last stage halves this output frequency to generate one swallow signal transition for each $f_{out}$ cycle.

Fig. 8: Novel pulse-swallow divide-by-two core circuit

The power consumption breakdown at 2.7 V supply and 2.4 GHz input frequency is as follows: 7 mW in the synchronisation block; 3.1 mW in the high-speed div2/3 swallow circuit; 3.1 mW in the CML div8 block; and 0.84 mW in the level shifter.



Fig. 9: Final circuit block diagram, including power distribution

Figure 10 shows the chip layout including DC blocking capacitances and internal voltage references. The circuit is designed to work from 2.7 V to 3.3 V supply voltage. The total simulated supply power of the circuit is 18.9 mW driving a  $50\Omega$  load.



Fig. 10: Novel dual modulus prescaler chip photo

To facilitate simple measurement, the clock inputs are connected to the pads with integrated DC blocking capacitances.
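The power figures quoted in this section can be put side by side with a few lines of arithmetic. The sketch below assumes that the per-block breakdown and the total simulated supply power of 18.9 mW refer to the same operating point, which the text does not state explicitly; the remainder is not itemised in the text, and attributing it to the CMOS flip-flops, logic gates and output driver would be a further assumption. All variable names are hypothetical.

```python
# Per-block power in mW, as listed for 2.7 V supply and 2.4 GHz input.
breakdown_mw = {
    "synchronisation block": 7.0,
    "pulse-swallow div2/3 stage": 3.1,
    "CML div8 block": 3.1,
    "level shifter": 0.84,
}
TOTAL_SIMULATED_MW = 18.9   # total simulated supply power, 50 ohm load

listed_mw = sum(breakdown_mw.values())
# Assumption: same operating point for the breakdown and the total.
remainder_mw = TOTAL_SIMULATED_MW - listed_mw
shares = {name: p / TOTAL_SIMULATED_MW for name, p in breakdown_mw.items()}

print(f"listed blocks: {listed_mw:.2f} mW")      # 14.04 mW
print(f"not itemised:  {remainder_mw:.2f} mW")   # 4.86 mW
for name, share in shares.items():
    print(f"{name}: {share:.0%} of total")
```

Under this assumption, the synchronisation block alone accounts for more than a third of the total, more than twice the consumption of the pulse-swallow stage itself.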

## 3. Obtained results

### 3.1. Experimental results

Chips were manufactured using MOSIS in the 0.4 micron scalable digital CMOS technology of TSMC.



Fig. 11: Test PCB assembly

The chips were bonded to test printed-circuit boards as shown in Fig. 11, and measured using the setup shown schematically in Fig. 12 for differential mode measurement.



Fig. 12: Schematic of differential measurement setup

The input signal sine wave is split into a reference input signal to the oscilloscope and then split again to match the differential input requirements.



Fig. 13: Dual-modulus prescaler sensitivity

In single-ended mode, the circuit was connected directly to the signal source, with the second input shorted to ground. Fig. 13 displays the sensitivities obtained using both topologies.

If the RF signal is generated on chip, the integrated DC blocking capacitances including parasitic capacitances of about equal value can be omitted. This would increase the input sensitivity given in Fig. 13 by about 3 dB.

Measurements were carried out with the pulse-swallow function enabled (div33 mode) and disabled (div32 mode), respectively.

Figures 14 and 15 present oscilloscope traces of input clock and divider output for both cases.



Fig. 14: Input-output trace @ 2.3GHz/div32 mode



Fig. 15: Input-output trace @ 2.3GHz/div33 mode



Fig. 16: Dual-mode trace @ 2.3 GHz

The total power consumption of the test chip was measured over a frequency range from 200 MHz to 2.8 GHz.



Fig. 17: Total power consumption

To operate the circuit over a wide frequency range, the internal voltage reference of 1.47 V was raised externally to 1.7 V or 2.45 V. Fig. 17 illustrates the measured total power consumption.



Fig. 18: Maximum input frequency vs. temperature

This test chip runs without any temperature correction circuitry. Power consumption therefore drops with increasing temperature, and there is a strong roll-off in the maximum operating frequency. Fig. 18 shows the temperature dependence of this frequency limit in div32 and div33 modes, respectively. We find a similar circuit performance versus temperature in both modes. Over the temperature range -60 °C to 120 °C, the pulse-swallow function has very little effect on performance limits, compared to binary division.

### 3.2. Comparison with other prescalers

To compare the obtained results with existing dual-modulus prescaler approaches, we have chosen two key parameters for evaluation: maximum input frequency and power consumption at this frequency.

While the maximum input frequency is an easy criterion, the aspect of power consumption is slightly more complicated. For example, one can trade DC power consumption for RF input power by choosing another prescaler approach. Further, prescalers differ in division ratio; a higher division ratio requires subsequent divider stages and increases power.

Thus, benchmarking these circuits strongly depends on the viewpoint, or on the cost factors assigned to various details. To find a compromise based on practical aspects, we consider a single-chip transceiver application. The RF signal has been assumed to be generated on chip with a 50 % power efficiency from DC supply, and RF power is derived from RF input voltage using a 50  $\Omega$  resistance. Presumably, the former assumption is too optimistic (compared to 25% for a sine wave), but may be balanced by the errors created by the latter one. However, this rough approximation might still discriminate designs with high RF input impedance.

For dual-modulus prescalers with a division ratio less than 128/129, we imagine subsequent asynchronous divide-by-two stages added until this ratio is achieved. Each stage ideally consumes half the power of the preceding one. Summing up the given DC and calculated RF power dissipation results in a normalised power consumption, which serves as the second key parameter.
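The normalisation described above can be sketched as a short function. This is one possible reading of the procedure, not the exact calculation behind the comparison: the RF input voltage is assumed to be an rms value, the power of the last existing divider stage (`last_stage_mw`) is a hypothetical input that the text does not say how to obtain, and the function name and example numbers are illustrative only.

```python
import math


def normalised_power_mw(dc_mw, rf_vrms, ratio, last_stage_mw,
                        target_ratio=128, r_ohm=50.0, efficiency=0.5):
    """Normalised power in mW: DC power + supply cost of RF drive + added stages."""
    # RF power derived from the input voltage across 50 ohm (assumed rms).
    rf_mw = rf_vrms ** 2 / r_ohm * 1e3
    # On-chip RF generation at the assumed 50 % efficiency from the DC supply.
    rf_supply_mw = rf_mw / efficiency
    # Number of div2 stages needed to reach the target ratio (powers of two assumed).
    extra_stages = max(0, (target_ratio // ratio).bit_length() - 1)
    # Each added stage consumes half the power of the preceding one.
    extra_mw = sum(last_stage_mw / 2 ** k for k in range(1, extra_stages + 1))
    return dc_mw + rf_supply_mw + extra_mw


# Illustrative numbers only: 20 mW DC, 1 mW of RF drive, a 32/33 prescaler
# whose last stage is assumed to consume 1 mW.
example = normalised_power_mw(20.0, math.sqrt(0.05), 32, 1.0)
print(f"{example:.2f} mW")   # 22.75 mW
```

A prescaler that already provides a 128/129 ratio gets no additional stages in this sketch, so only its RF drive is added to the DC power.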

These key parameters are assigned to the axes of the diagram in Fig. 19. Points representing higher performance designs tend to the lower right corner of the diagram.



Fig. 19: Prescaler power vs. frequency

In addition to measured results of this work, all RF dual-modulus prescalers in CMOS and SIMOX technology from 1985 to date known to the authors are represented here. For simplification, reference assignments to the respective points are restricted to an input frequency of 1.5 GHz and above.

Reference [4] is marked with an asterisk, because this design was based on an experimental P/N balanced technology rather than being compatible with digital CMOS.

The power consumption of the prescaler proposed here is still high when considered as part of a single-chip transceiver.

A clear disadvantage of this solution is the requirement to synchronise the swallow signal with the RF clock. Several prescaler principles require such synchronisation, which increases power dissipation considerably over that of a pure binary divider chain.

## 4. Conclusions

A new low-power dual-modulus divider circuit technique has been demonstrated with a prototype chip. This chip supports the div16/17 and the div32/33 division modes. The circuit is fabricated in $0.4\,\mu$m CMOS. We reached 2.825 GHz operating frequency with 21.7 mW total power consumption. Below 2.4 GHz, the power consumption is less than 20 mW. The input sensitivity is below 0 dBm over the frequency range of 1.0 GHz to 2.4 GHz.

The pulse-swallow technique applied here is relatively insensitive to the maximum input frequency, thus exhibiting a low power/frequency ratio. The circuit technique could be adapted to bipolar or GaAs circuits. Work is in progress to significantly reduce the power consumption due to the synchronisation circuit.

## 5. Acknowledgments

The authors express thanks to P. Weger and A. Ourmazd for helpful support.

## 6. References

- [1] Francesco Piazza and Qiuting Huang, "A Low Power CMOS Dual Modulus Prescaler for Frequency Synthesizers", IEICE Trans. Electron., Vol. E80-C, No. 2, Feb. 1997, pp. 314-319
- [2] Noriyuki Hirakata, Mitsuaki Fujihira, Akihiro Nakamura, and Tomihiro Suzuki, "3 V-Operation GaAs Prescaler IC with Power Saving Function", IEICE Trans. Electron., Vol. E75-C, No. 10, Oct. 1992, pp. 1115-1120
- [3] Jan Craninckx and Michael S. J. Steyaert, "A 1.75-GHz/3-V Dual-Modulus Divide-by-128/129 Prescaler in 0.7-μm CMOS", IEEE Journal of Solid State Circuits, Vol. 31, No. 7, July 1996, pp. 890-897
- [4] Cong, H.-I. et al., "Multigigahertz CMOS Dual-Modulus Prescalar IC", IEEE Journal of Solid State Circuits, Vol. 23, No. 5, Oct. 1988, pp. 1189-1194
- [5] Tiebout, M. J., "A 480μW 2GHz Ultra Low Power Dual-Modulus Prescaler in 0.25μm Standard CMOS", to be published at the 2000 IEEE International Symposium on Circuits and Systems, May 2000, Geneva
- [6] Kado, Y. et al., "A 1-GHz/0.9mW CMOS/SIMOX Divide-by-128/129 Dual-Modulus Prescaler Using a Divide-by-2/3 Synchronous Counter", IEEE Journal of Solid State Circuits, Vol. 28, No. 4, April 1993, pp. 513-517
- [7] Kamgar, A. et al., "Ultra-High Speed CMOS Circuits in Thin SIMOX Films", IEEE IEDM 1989, pp. 829-832
- [8] Larsson, P., "High-Speed Architecture for a Programmable Frequency Divider and a Dual-Modulus Prescaler", IEEE Journal of Solid State Circuits, Vol. 31, No. 5, May 1996, pp. 744-748
- [9] Soares, J. N. Jr. and Van Noije, W. A. M., "A 1.6-GHz Dual Modulus Prescaler Using the Extended True-Single-Phase-Clock CMOS Circuit Technique (E-TSPC)", IEEE Journal of Solid State Circuits, Vol. 34, No. 1, Jan. 1999, pp. 97-102
- [10] Huang, Q. and Rogenmoser, R., "Speed Optimization of Edge-Triggered CMOS Circuits for Gigahertz Single-Phase Clocks", IEEE Journal of Solid State Circuits, Vol. 31, No. 3, March 1996, pp. 456-465