Study Of Clock Skew And Design Computer Science Essay


Clock skew and clock distribution networks play a major part in synchronous circuits, where the clock signal must be available at every flip-flop across the integrated circuit. Several designs have been proposed for effective clock distribution networks. The continuing exponential reduction in transistor feature size necessitates high-speed clock signals, which makes it harder to design an efficient clock distribution network in practice. This report discusses the clock distribution techniques used in conventional sequential circuits and some of the emerging designs.

Index Terms - Clock distribution networks, clock skew, Buffer Clock Distribution, H-trees, Interconnect widening, Threshold tracking, wireless clock distribution.


Clock signals synchronize data signals arriving from different parts of the integrated circuit, so that the correct data is available at every stage of computation. Because of the impedance of the interconnects, the clock reaches spatially separated registers at different times. These mismatches in arrival time are known as clock skew. "Most synchronous digital systems consist of cascaded banks of sequential registers with combinatorial logic between each set of registers. The functional requirements of the digital system are satisfied by the stages of logic.[11]" "The global performance of the system and local timing requirements are satisfied by the careful analysis of time windows to satisfy critical worst case timing constraints.[2]" The proper design of the clock distribution network further ensures that these critical timing requirements are satisfied and that no race conditions exist. "Due [to] noises caused by other interconnect lines running in parallel with the clock signals, clock signals arriving at two different registers with the same clock input experience a phase noise, commonly known as clock jitter [1]."

A clock distribution network is designed to minimize clock skew and jitter. Designing the network is a cumbersome task, however, and the designer must decide on the clock distribution before the circuit is designed, because the difficulty of designing an efficient clock distribution network increases in the later stages of design [1]. Techniques such as H-trees, buffered clock trees and meshed clock networks are used in the design of clock networks. Clock signals are typically loaded with the greatest fanout, travel over the longest distances, and operate at the highest speeds of any signal, whether control or data, within the entire system. Since the data signals are timed relative to the clock signals, clean and sharp clock waveforms are essential. Because clock interconnects do not scale in proportion to the rapidly shrinking transistors, which operate at high clock frequencies, designing an efficient clock distribution network becomes difficult [6].

The paper is arranged in sections. Section II discusses clock considerations and the clock distribution techniques in use. Section III discusses future alternatives for clock distribution networks.

Figure 1 Types of Clock Skew [2]

Clock Considerations and Clock distribution networks

Clock Considerations

Delay mismatches arise from the resistance of the clock interconnect and from the delay of the combinational circuits between sequential elements. A definite clock period is therefore required to ensure that data is latched properly in each flip-flop. An incorrect clock period gives rise to race conditions, which produce incorrect outputs from the combinational circuit, and to glitches at the output, which cause unwanted power loss and heat the chip.

1) Clock Skew: The difference in clock signal arrival time between two sequentially-adjacent registers is the clock skew, tSKEW. If the clock signals Ci and Cj are in complete synchronism (i.e., the clock signals arrive at their respective registers at exactly the same time), the clock skew is zero. A definition of clock skew is provided below.

Definition: "Given two sequentially-adjacent registers, Ri and Rj, and an equipotential clock distribution network, the clock skew between these two registers is defined as

tSKEWij = TCi - TCj

where TCi and TCj are the clock delays from the clock source to the registers Ri and Rj, respectively." [10]
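As a small illustration of the definition, the sketch below evaluates tSKEWij = TCi - TCj for a few registers. The register names and the delay values (in picoseconds) are hypothetical and chosen only for the example; they do not come from the cited work.

```python
def clock_skew(t_ci: int, t_cj: int) -> int:
    """Skew between registers Ri and Rj, following tSKEWij = TCi - TCj."""
    return t_ci - t_cj


# Hypothetical clock delays (ps) from the clock source to each register.
clock_delay_ps = {"R1": 1200, "R2": 1350, "R3": 1200}

skew_12 = clock_skew(clock_delay_ps["R1"], clock_delay_ps["R2"])
skew_13 = clock_skew(clock_delay_ps["R1"], clock_delay_ps["R3"])

print(f"tSKEW12 = {skew_12} ps")
print(f"tSKEW13 = {skew_13} ps")
```

Here R1 and R3 receive the clock at the same time, so their skew is zero, while the clock reaches R2 150 ps later than R1.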

There are two types of clock skew, as shown in Figure 1, depending on which flip-flop is latched first: positive clock skew and negative clock skew. Positive clock skew occurs when the second register, which is connected to the output of the first register with or without a combinational circuit between them, is latched after the first register. Negative clock skew occurs when the second register is latched before the first register [6]. A properly designed clock signal accounts for both conditions and ensures that data is latched correctly. It must satisfy the following requirements.

Let tLogic denote the delay of the combinational path, tD-FF the internal delay of the flip-flop, tSETUP and tHOLD the setup time and hold time requirements, and tSKEW the difference in arrival time of the clock edges at the two registers. Since data is latched into each register within the same clock period, the latest data arrival time is set by tLogic(MAX) and the earliest by tLogic(MIN). The total clock period TCLK must then satisfy

TCLK ≥ tLogic + tD-FF + tSETUP + tHOLD + tSKEW

This requirement ensures that the register latches the right data to the output.
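The bound can be checked numerically. The sketch below simply adds the terms of the inequality as stated in the text and compares the result with a candidate clock period. The function names and all delay values (in picoseconds) are assumptions made for illustration only.

```python
def min_clock_period(t_logic: int, t_dff: int, t_setup: int,
                     t_hold: int, t_skew: int) -> int:
    """Smallest TCLK allowed by the inequality given in the text (all in ps)."""
    return t_logic + t_dff + t_setup + t_hold + t_skew


def meets_timing(t_clk: int, **delays: int) -> bool:
    """True when the candidate period satisfies TCLK >= sum of the delay terms."""
    return t_clk >= min_clock_period(**delays)


# Hypothetical delays for one register-to-register path, in picoseconds.
path = dict(t_logic=2000, t_dff=300, t_setup=200, t_hold=100, t_skew=150)

required = min_clock_period(**path)
print(f"minimum period: {required} ps")
print("3000 ps clock ok:", meets_timing(3000, **path))
print("2500 ps clock ok:", meets_timing(2500, **path))
```

With these assumed numbers the path needs a period of at least 2750 ps, so a 3000 ps clock satisfies the bound and a 2500 ps clock does not.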

To minimize positive clock skew the time tSKEW must be [3]


where TREG(MAX) = tLogic(MAX) + tD-FF + tINT

Similarly to minimize negative clock skew,


where TREG(MIN) = tLogic(MIN) + tD-FF + tINT

Structured VLSI Circuits Clock Distribution Networks

Different methods are employed to satisfy the clock conditions given above. Many use a tree-like structure in which a main "trunk" supplies the global clock and branches at various points in the circuit according to the loads. Such clock distribution networks include the buffered clock distribution network, the mesh clock distribution network, the H-tree distribution network and the X-tree distribution network. To keep the clock load balanced at each branch of an H-tree or X-tree network, the interconnect that carries the clock signal is scaled by 1/3 at each branch [2].

Buffered Clock Distribution Networks

Buffered clock distribution networks consist of a global clock signal that is buffered at several points along the interconnect to boost the clock signal. The unique clock source is frequently called the root of the tree, the initial portion of the tree the trunk, the individual paths driving each register the branches, and the registers being driven the leaves. The main disadvantages of this distribution network are that the buffers add delay to the path and that the clock transitions at the buffer inputs and outputs cause additional dynamic power loss [6]. Because the conventional clock switches at high speed, these power losses are significant and reduce the overall efficiency of the chip.

Figure 2 Three-Level Buffered Clock Distribution Network [2]

Mesh Clock Distribution Network

Mesh clock distribution networks are very similar to buffered clock distribution networks. Additional buffers are used along the interconnect path to reduce interconnect resistance within the clock tree. The branch resistances are effectively placed in parallel, minimizing the clock skew [2]. The buffers serve a dual purpose: they minimize the path resistance and amplify the clock signal along the path so that the potential of the clock is maintained. The main drawback is the same as in buffered clock distribution: the buffers cause additional delay and power loss.

H-Tree and X-Tree Clock Distribution Networks

H-tree and X-tree clock distribution networks branch from a global clock interconnect into a symmetrical H- or X-shaped structure. This branching continues until the registers are reached. Each branch interconnect is scaled by 1/3 [2] to balance the load at the branches. This type of clock distribution network resolves the issue of buffer delay but is inefficient because of the power loss in the clock interconnect. It is shown in [1] that the thermal power loss in such distribution networks is lower on the branches than in the global clock network. This may cause overheating in the digital circuit, degrading the overall performance of the chip or even causing permanent failure.
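The symmetry of the H-tree can be illustrated with a short geometric sketch. It builds an idealized H-tree recursively and checks that every leaf lies at the same wire length from the root, which is why a symmetric tree nominally has zero skew. The function name, the arm length and the number of levels are assumptions for illustration. Wire length is used as a stand-in for delay, and width scaling and loading are ignored.

```python
def h_tree_leaves(x, y, arm, levels, length=0.0):
    """Return (x, y, wire_length) for every leaf of an idealized H-tree.

    Each level draws one "H" centred on (x, y): a horizontal bar reaching
    `arm` to either side, then vertical strokes reaching `arm` up and down.
    The four corners become the centres of the next, half-sized level.
    """
    if levels == 0:
        return [(x, y, length)]
    leaves = []
    for dx in (-arm, arm):            # left and right ends of the bar
        for dy in (-arm, arm):        # top and bottom of each stroke
            leaves += h_tree_leaves(x + dx, y + dy, arm / 2,
                                    levels - 1, length + 2 * arm)
    return leaves


leaves = h_tree_leaves(0.0, 0.0, arm=4.0, levels=3)
wire_lengths = {wire for _, _, wire in leaves}

print("number of leaves:", len(leaves))
print("distinct root-to-leaf wire lengths:", wire_lengths)
```

Three levels give 64 leaves, all reached through the same total wire length, so any skew in a real H-tree comes from mismatches in loading and process rather than from the geometry itself.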

Figure 3 H-Tree & X-Tree Clock Distribution Network[11]

Process Insensitive Clock Distribution Networks

A primary disadvantage of clock distribution networks is that the delays of the elements of a clock path, namely the distributed buffers and the interconnect impedances, are highly sensitive to the geometric, material, and environmental variations that exist in an implementing technology. Thus, as device and interconnect parameters vary between processes, the specific performance characteristics of the clock distribution network may change. This phenomenon can severely affect both the performance and the reliability of a digital circuit, undermining both the precision and the design structure of the clock distribution network. Various clock distribution network designs have been developed that mitigate the effect of process tolerances while maintaining an effective design methodology [2].

Threshold Tracking to Control Clock Skew

The technique addresses the MOS circuit characteristic that P-channel and N-channel parameters tend not to track each other as the process varies, and arranges the clock paths so that the effective skew is minimized. The primary objective is to match the two clock edges (of either an N-channel or a P-channel transistor) as the process parameters vary. Two rules minimize the effects of process variations on clock skew:

1) "Match the sum of the pull-up delays of the P-channel MOSFET with the pull-up delays of any related clock signal paths"[2].

2) "Match the sum of the pull-down delays of the N-channel MOSFET with the pull-down delays of any related clock signal paths"[2].

The clock delay along a given path may change with process variations, but the delays of the matched paths will track each other, keeping the skew small. Figure 4 shows a circuit utilising this technique.

The N-channel and P-channel transistors of both branches individually track each other, making the system more tolerant to variations in the N-channel and P-channel transistor characteristics.

TA + TB = T1 + T2 + T3, with a stricter constraint of,

TA = T1 + T3

TB = T2

This design technique can be used to make circuits less sensitive to process variations and environmental conditions even if the circuits are more general forms of logic gates. Similar behaviour is assured when interconnect impedances are included within the circuit. Simulated worst-case clock skews of circuits using this technique are 10% less than those of conventionally designed circuits [8].
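The difference between the two constraints can be sketched numerically. The model below is an assumption made for illustration: TA, T1 and T3 are taken to be set by one transistor type (labelled "P") and TB and T2 by the other (labelled "N"), and a process shift multiplies every delay of a given type by a common factor. The function names, the nominal delays and the scale factors are hypothetical.

```python
def path_delay(stages, k_p, k_n):
    """Total delay of a path whose stages are (transistor_type, nominal_delay)."""
    scale = {"P": k_p, "N": k_n}
    return sum(scale[kind] * nominal for kind, nominal in stages)


def skew(path_a, path_b, k_p, k_n):
    """Delay difference between two clock paths under a given process shift."""
    return path_delay(path_a, k_p, k_n) - path_delay(path_b, k_p, k_n)


three_stage = [("P", 1.0), ("N", 1.0), ("P", 1.0)]   # T1, T2, T3
matched = [("P", 2.0), ("N", 1.0)]      # TA = T1 + T3 and TB = T2
sum_only = [("P", 1.5), ("N", 1.5)]     # only TA + TB = T1 + T2 + T3

# Assumed process shift: P-type delays 25% slower, N-type delays 25% faster.
skew_matched = skew(matched, three_stage, k_p=1.25, k_n=0.75)
skew_sum_only = skew(sum_only, three_stage, k_p=1.25, k_n=0.75)

print("skew with per-type matching:", skew_matched)
print("skew with only the totals matched:", skew_sum_only)
```

Both designs have zero skew at nominal conditions, but only the per-type matching keeps the skew at zero once the two transistor types drift in opposite directions.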

Figure 4 Elimination of Process Sensitive clock skew by scaling matched transistor types.[2]

Interconnect Widening to Minimize Clock Skew Sensitivity

One approach to keeping the skew close to zero is to lengthen specific clock nets during the automated layout of the clock. A disadvantage of this approach is that these minimum-width lines are very susceptible to variations in the etch rate of the metal lines, as well as to mask misalignment or local spot defects. Therefore, the effective interconnect impedance (and the delay) of these long thin clock nets can vary greatly from wafer to wafer as the line widths vary [2].

To make these clock nets less sensitive to process variations, an automated layout algorithm is used that widens the clock nets rather than lengthening them while equalizing the line delays [9]. These nets are therefore less sensitive to both under- and over-etching during the metal patterning process. Widening a clock line decreases its interconnect resistance and increases its interconnect capacitance. It is interesting to note that increasing the line width of branches closer to the root of the RC tree has a greater effect on the clock path delay than increasing the widths closer to the leaf nodes (the clocked registers). Decreasing the resistance at the source by increasing the line width affects the total path delay more significantly than decreasing the resistance at the leaf node, since the source resistance sees more capacitance than a resistance near the leaf. Therefore, changes in line width close to the clock source tend to affect the clock skew far more than changes at the leaves. One approach to making the clock lines more tolerant of process variations is to make the clock interconnect lines widest near the clock source and thinner as the leaf nodes are approached. One of the primary advantages of this approach is that it separates the automatic layout of the clock nets from the clock skew reduction process. Thus, local layout techniques, such as widening the clock nets, can be used to make the overall circuit less sensitive to variations in process parameters [2].
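The argument that resistance near the source sees more capacitance can be illustrated with a first-order Elmore delay estimate. The sketch below is not the layout algorithm of [9]. It assumes a balanced binary RC tree, unit resistance and capacitance per segment, segment capacitance lumped at the far end of each segment, and a hypothetical helper name.

```python
def elmore_contributions(widths, r=1.0, c=1.0, c_load=1.0):
    """Per-level Elmore delay terms for a balanced binary RC clock tree.

    widths[k] is the width multiplier of every segment at level k, with
    level 0 attached to the clock source. Widening a segment by w is
    assumed to scale its resistance by 1/w and its capacitance by w.
    """
    terms = [0.0] * len(widths)
    cap_below = c_load                    # register load beyond a leaf segment
    for level in reversed(range(len(widths))):
        seg_r = r / widths[level]
        seg_c = c * widths[level]
        node_cap = seg_c + cap_below      # capacitance charged through seg_r
        terms[level] = seg_r * node_cap
        cap_below = 2 * node_cap          # parent feeds two identical subtrees
    return terms


uniform = elmore_contributions([1, 1, 1])
source_to_leaf_delay = sum(uniform)

print("delay term per level (source first):", uniform)
print("total source-to-leaf delay:", source_to_leaf_delay)
```

With uniform widths the term belonging to the segment at the source is the largest, because that resistance charges the capacitance of the whole tree. The same fractional change in resistance therefore shifts the path delay most when it occurs near the source.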

Future Clock Distribution Methods [6]

The rapidly decreasing transistor sizes and the increasing switching speed of the devices necessitate more efficient designs for clock distribution networks. Some of the interesting techniques currently under research include clock distribution networks using de-skew buffers [5], wireless clock distribution using RF/microwave technologies [4] and 3-D ICs [7].

De-Skew Buffers

Clock skew can also be reduced or adjusted locally using delay locked loops (DLL) or phase locked loops (PLL). Such approaches are described in [5] and [6], which employ a ring tuning architecture with a DLL and with phase detectors, respectively. The DLL with ring tuning architecture described in [5] uses an UP/DOWN detector to measure the skew between the two clocks and uses tunable buffers with temperature-coded delays to adjust the clock delay according to the measured skew. This type of de-skew buffer can be used with H-tree clock distribution networks. Another method uses a single phase detector to adjust the clock skew in the H-tree network. De-skewing is essentially done by detecting the phase difference and using tunable capacitors to match the delays of the signals. These methods are, however, used to fine-tune the local clock signals in the final stages of processing.
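The detect-and-adjust idea can be sketched as a simple loop. This is a behavioural illustration, not the circuit of [5]: the function names, the 5 ps delay step, the 32 delay codes and the stop condition are all assumptions. The detector only reports the direction of the error, and the tunable buffer adds one delay step per iteration.

```python
def up_down_detector(t_ref_ps: int, t_local_ps: int) -> int:
    """+1 if the local clock leads the reference (add delay), -1 if it lags."""
    if t_local_ps < t_ref_ps:
        return 1
    if t_local_ps > t_ref_ps:
        return -1
    return 0


def deskew(t_ref_ps: int, t_local_ps: int, step_ps: int = 5,
           num_codes: int = 32):
    """Step the buffer delay code until the residual skew is under one step.

    t_local_ps is the local clock arrival with the buffer at code 0.
    Returns (final_code, residual_skew_ps).
    """
    code = 0
    while True:
        tuned = t_local_ps + code * step_ps
        residual = t_ref_ps - tuned
        if abs(residual) < step_ps:
            break                          # locked to within one delay step
        next_code = code + up_down_detector(t_ref_ps, tuned)
        if not 0 <= next_code < num_codes:
            break                          # outside the buffer's tuning range
        code = next_code
    return code, residual


print(deskew(1000, 930))    # local clock 70 ps early
print(deskew(1000, 933))    # residual smaller than one step remains
print(deskew(1000, 1020))   # local clock late: cannot be fixed by adding delay
```

In this model the achievable residual skew is bounded by the delay step, and a clock that already lags the reference cannot be corrected, which is consistent with using such buffers only for local fine tuning.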

Wireless Clock Distribution Networks

Wireless clock distribution networks are more promising than the other potential distribution networks. These networks transmit microwave or RF signals locally across the chip using transmitting antennas placed at several points across the chip. Once received, the signals are amplified locally using receivers and low noise amplifiers. In this type of clock network, the global clock runs at 8 times the local clock frequency and is down-converted using frequency dividers [8]. A major drawback put forth in [4] is the signal-to-noise ratio of the signal. This is of great concern because these devices operate among digital devices, which are usually noisy.
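The down-conversion step can be illustrated with a behavioural divide-by-8 model. The sketch assumes the divider is a 3-bit counter whose most significant bit is used as the local clock. The text does not specify the divider circuit, so this structure and the function name are assumptions.

```python
def divide_by_8(global_edges: int) -> list:
    """Local clock level after each rising edge of the global clock.

    Modelled as a 3-bit counter; its most significant bit completes one
    full cycle for every eight edges of the global clock.
    """
    count = 0
    levels = []
    for _ in range(global_edges):
        count = (count + 1) % 8           # 3-bit counter wraps at 8
        levels.append((count >> 2) & 1)   # take the most significant bit
    return levels


levels = divide_by_8(16)
rising_edges = sum(1 for prev, cur in zip(levels, levels[1:])
                   if prev == 0 and cur == 1)

print(levels)
print("local rising edges in 16 global cycles:", rising_edges)
```

Sixteen cycles of the global clock produce two cycles of the local clock, each with a 50% duty cycle.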


A brief case study of clock distribution networks was performed. Clock skew analysis and clock distribution networks are an important part of designing large-scale integrated circuits with sequential elements, since they ensure the correct operation of the device. The rapid decrease in transistor sizes and the increase in clock rate make clock delay a serious concern. Proper design of the clock distribution network can reduce these delays.