Module 53 min

Clock Tree Synthesis (CTS)

Key CTS Concepts - From Scratch

Pro Tip

⏰ Start Here - The Clock Distribution Problem: Your chip has one clock source (e.g. a PLL output) but potentially millions of flip-flops (FFs) that all need that clock signal. You cannot connect one wire from the source to every FF: a single net driving 1,000,000 FFs would have enormous capacitance → extremely slow transitions → the clock would barely toggle. In addition, on a very long wire the signal reaches FFs in different corners of the die at different times → clock skew: some FFs see the clock edge nanoseconds after others, which breaks timing. CTS solves both problems by building a tree of buffers that fans out the clock progressively, like a branching river delta, so that every FF gets a clean, fast clock edge at approximately the same time.

Pro Tip

What Is a Clock Buffer and Why Is It Needed? A clock buffer is a standard cell with one input (the incoming clock) and one output (a buffered copy of the clock). Its job: take a weak, slightly degraded clock signal and reproduce it as a clean, strong signal that can drive many downstream loads. Without buffers: one net from the PLL driving 100,000 FFs would have a total capacitance of ~50pF → the clock would have a rise time of roughly 5ns → the "edge" would be a slow ramp instead of a sharp transition → setup and hold requirements cannot be met → the chip fails. Clock buffers are inserted at every level of the tree so that each one drives only a limited load, which keeps clock transition times under ~100ps at every FF.
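The ~50pF and ~5ns figures above can be reproduced with a rough single-pole RC estimate. This is only a sketch: the 0.5fF clock-pin capacitance, the 45Ω source resistance, the 1kΩ buffer output resistance, and the 32-FF buffer load are illustrative assumptions chosen to match the numbers in the text, not values from any library, and wire capacitance is ignored.

```python
# Back-of-envelope RC estimate of clock transition time.
# All constants below are illustrative assumptions, not library data.

PIN_CAP_FF = 0.5          # assumed clock-pin capacitance per FF, in fF
SOURCE_RES_OHM = 45.0     # assumed output resistance of the clock source
BUFFER_RES_OHM = 1000.0   # assumed output resistance of one clock buffer


def total_clock_load_pf(num_ffs: int, pin_cap_ff: float) -> float:
    """Total clock-pin capacitance in pF (wire capacitance ignored)."""
    return num_ffs * pin_cap_ff * 1e-3  # fF -> pF


def rise_time_ns(driver_res_ohm: float, load_pf: float) -> float:
    """10%-90% rise time of a single-pole RC stage: t_r ~= 2.2 * R * C."""
    return 2.2 * driver_res_ohm * load_pf * 1e-3  # ohm * pF = ps, ps -> ns


# Unbuffered: one net drives all 100,000 FFs.
unbuffered_load = total_clock_load_pf(100_000, PIN_CAP_FF)
unbuffered_rise = rise_time_ns(SOURCE_RES_OHM, unbuffered_load)

# Buffered: a leaf buffer drives only 32 FFs (assumed fanout).
leaf_load = total_clock_load_pf(32, PIN_CAP_FF)
leaf_rise = rise_time_ns(BUFFER_RES_OHM, leaf_load)

print(f"unbuffered: {unbuffered_load:.1f} pF, rise ~{unbuffered_rise:.2f} ns")
print(f"leaf buffer: {leaf_load * 1000:.1f} fF, rise ~{leaf_rise * 1000:.1f} ps")
```

Under these assumptions the unbuffered net lands near 5ns, while a single leaf buffer with a small load stays well under the ~100ps target. Real tools use characterized library tables and extracted parasitics rather than a lumped RC model.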

What Is a Clock Tree?

A tree of clock buffers starting at the clock source and branching out to all flip-flop clock pins. Each level buffers the signal and drives the next level. A typical design might have 4–8 levels of buffering. The root drives 2–4 branches, each branch drives more sub-branches, eventually reaching individual FFs. The tree ensures signal integrity (clean transitions) and balances arrival times.
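To see why a handful of levels is enough, here is a minimal sketch that counts the buffer levels needed to reach a given number of FFs. It assumes an idealized tree in which every buffer has the same fanout; real clock trees mix fanouts per level, and the function name and fanout values are hypothetical.

```python
def levels_needed(num_sinks: int, fanout: int) -> int:
    """Smallest number of levels such that fanout**levels >= num_sinks.

    Assumes a uniform fanout at every level (an idealization).
    Integer arithmetic avoids floating-point log rounding issues.
    """
    if num_sinks < 1:
        raise ValueError("num_sinks must be at least 1")
    if fanout < 2:
        raise ValueError("fanout must be at least 2")
    levels = 0
    reach = 1  # sinks reachable with the current number of levels
    while reach < num_sinks:
        reach *= fanout
        levels += 1
    return levels


if __name__ == "__main__":
    for fanout in (8, 16):
        n = levels_needed(1_000_000, fanout)
        print(f"1,000,000 FFs, fanout {fanout}: {n} levels")
```

With an assumed uniform fanout of 8 or 16, one million FFs need 7 or 5 levels respectively, which is in line with the 4–8 levels quoted above.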

Clock Skew

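The difference in clock arrival time between any two flip-flops. If FF_A receives the clock edge at 1.00ns and FF_B at 1.25ns, the skew is 0.25ns. Skew matters because it shifts timing margins: positive skew (the capture FF's clock arrives later than the launch FF's) relaxes setup but tightens hold; negative skew (the capture FF's clock arrives earlier) tightens setup but relaxes hold. Local skew is measured between FFs that share a timing path; global skew is measured between any two FFs on the same clock. CTS target: local skew < 50ps, global skew < 200ps.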
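The effect of skew on setup and hold can be checked with the standard single-cycle slack equations. The sketch below reuses the FF_A/FF_B arrival times from the text, with FF_A launching and FF_B capturing; the clock period, clock-to-Q, path delays, and setup/hold times are made-up illustrative numbers.

```python
def setup_slack(period, launch_arrival, capture_arrival,
                clk_to_q, max_path_delay, t_setup):
    """Setup slack in ns: data must arrive t_setup before the NEXT capture edge."""
    skew = capture_arrival - launch_arrival  # positive: capture clock is later
    return period + skew - (clk_to_q + max_path_delay + t_setup)


def hold_slack(launch_arrival, capture_arrival,
               clk_to_q, min_path_delay, t_hold):
    """Hold slack in ns: data must stay stable t_hold after the SAME capture edge."""
    skew = capture_arrival - launch_arrival
    return clk_to_q + min_path_delay - t_hold - skew


# Illustrative (assumed) cell and path numbers, all in ns.
PERIOD, CLK_TO_Q, T_SETUP, T_HOLD = 2.0, 0.10, 0.05, 0.03
MAX_PATH, MIN_PATH = 1.60, 0.10

# FF_A launches at 1.00 ns, FF_B captures at 1.25 ns: +0.25 ns skew.
setup_skewed = setup_slack(PERIOD, 1.00, 1.25, CLK_TO_Q, MAX_PATH, T_SETUP)
hold_skewed = hold_slack(1.00, 1.25, CLK_TO_Q, MIN_PATH, T_HOLD)

# Same path with perfectly balanced clocks (zero skew).
setup_ideal = setup_slack(PERIOD, 1.00, 1.00, CLK_TO_Q, MAX_PATH, T_SETUP)
hold_ideal = hold_slack(1.00, 1.00, CLK_TO_Q, MIN_PATH, T_HOLD)

print(f"setup slack: ideal {setup_ideal:+.2f} ns, skewed {setup_skewed:+.2f} ns")
print(f"hold  slack: ideal {hold_ideal:+.2f} ns, skewed {hold_skewed:+.2f} ns")
```

With these numbers the positive skew adds 0.25ns of setup margin but turns a passing hold check into a violation, which is the trade-off described above.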

Clock Insertion Delay (Latency)

The time from the clock source to a flip-flop's clock pin, through all the buffers and wires of the clock tree. Typical values: 0.3–1.5ns depending on design size and node. Latency by itself does not break timing; it is the difference in latency between FFs (skew) that causes timing issues.
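As a small illustration of how latency and skew relate, the sketch below sums assumed per-stage delays (buffer plus wire) along the clock path to each FF and takes the spread as the global skew. The FF names and stage delays are hypothetical, picked so that FF_A and FF_B match the 1.00ns and 1.25ns arrivals used in the skew example.

```python
from typing import Dict, List


def insertion_delay(stage_delays_ns: List[float]) -> float:
    """Clock latency to one sink: sum of buffer+wire delays along its path."""
    return sum(stage_delays_ns)


def global_skew(latencies_ns: Dict[str, float]) -> float:
    """Largest difference in latency between any two sinks."""
    return max(latencies_ns.values()) - min(latencies_ns.values())


# Hypothetical per-stage delays (ns) from clock root to each FF clock pin.
paths = {
    "FF_A": [0.20, 0.25, 0.30, 0.25],
    "FF_B": [0.20, 0.25, 0.35, 0.45],
    "FF_C": [0.20, 0.30, 0.30, 0.30],
}

latencies = {ff: insertion_delay(stages) for ff, stages in paths.items()}
skew = global_skew(latencies)

# Adding the same extra delay to every path raises latency but not skew.
shifted = {ff: lat + 0.50 for ff, lat in latencies.items()}
skew_shifted = global_skew(shifted)

for ff, lat in latencies.items():
    print(f"{ff}: latency {lat:.2f} ns")
print(f"global skew: {skew:.2f} ns (after +0.50 ns on all paths: {skew_shifted:.2f} ns)")
```

The last step shows the point made above: a common shift in latency leaves skew unchanged.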

Clock Uncertainty

A timing margin added to clock edges to account for: Jitter (cycle-to-cycle variation from PLL noise, typically 50–100ps), Skew (modeled as uncertainty pre-CTS), and extra guardband. Applied in SDC as set_clock_uncertainty. Post-CTS, skew is captured by propagated clock latencies, so only jitter+guardband remain.
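The pre-CTS versus post-CTS budget can be sketched as a simple sum. The jitter, guardband, and estimated-skew values below are illustrative assumptions, `core_clk` is a hypothetical clock name, and the generated constraint assumes the SDC time unit is ns.

```python
def clock_uncertainty_ps(jitter_ps: float, guardband_ps: float,
                         est_skew_ps: float, clocks_propagated: bool) -> float:
    """Uncertainty budget in ps.

    Pre-CTS (ideal clocks): skew is not yet real, so an estimate is added.
    Post-CTS (propagated clocks): skew is in the latencies, so it is dropped.
    """
    margin = jitter_ps + guardband_ps
    if not clocks_propagated:
        margin += est_skew_ps
    return margin


def sdc_line(clock_name: str, uncertainty_ps: float) -> str:
    """Format a set_clock_uncertainty constraint, converting ps to ns."""
    return f"set_clock_uncertainty {uncertainty_ps / 1000:.3f} [get_clocks {clock_name}]"


# Assumed budget: 75 ps jitter, 25 ps guardband, 100 ps estimated skew.
pre_cts = clock_uncertainty_ps(75, 25, 100, clocks_propagated=False)
post_cts = clock_uncertainty_ps(75, 25, 100, clocks_propagated=True)

print(sdc_line("core_clk", pre_cts))   # constraint used before CTS
print(sdc_line("core_clk", post_cts))  # constraint used after CTS
```

In practice separate setup and hold uncertainties are often specified; this sketch uses a single value to keep the arithmetic visible.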

Skew: Δ in clock arrival between FFs. Target: <50ps (local), <200ps (global)

Latency: clock insertion delay. Source → FF clock pin delay

Uncertainty: jitter + skew margin. Applied as timing margin in STA

Key CTS Commands

| Command | Tool | Purpose |
| --- | --- | --- |
| ccopt_design | Innovus | Run CTS with concurrent optimization |
| set_ccopt_property | Innovus | Set CTS target skew, latency targets |
| clock_opt | ICC2 | Run clock tree optimization |
| set_clock_tree_options | ICC2 | Configure CTS parameters |
| report_clock_tree | Both | Report skew, latency, buffer count |