Clock Tree Synthesis (CTS)
Key CTS Concepts - From Scratch
Start Here - The Clock Distribution Problem
Your chip has one clock source (e.g. a PLL output) but potentially millions of flip-flops (FFs) that all need that clock signal. You cannot connect one wire from the source to every FF: a single net driving 1,000,000 FFs would have an enormous capacitance, so transitions would be extremely slow and the clock would barely toggle. In addition, on a very long wire the signal takes different amounts of time to reach FFs at different corners of the die. This is clock skew: some FFs see the clock edge well after others, which breaks timing. CTS solves this by building a tree of buffers that fans out the clock progressively, like a branching river delta, so that every FF gets a clean, fast clock edge at approximately the same time.
What Is a Clock Buffer and Why Is It Needed?
A clock buffer is a standard cell with one input (the incoming clock) and one output (a buffered copy of the clock). Its job is to take a weak, slightly degraded clock signal and reproduce it as a clean, strong signal that can drive many downstream loads. Without buffers, one wire from the PLL driving 100,000 FFs would see a total capacitance of roughly 50pF, and the clock would have a rise time of about 5ns. The "edge" would be a slow ramp instead of a sharp transition, setup and hold requirements could not be met, and the chip would fail. CTS therefore inserts clock buffers at each level of the tree, wherever the load or wire length would otherwise degrade the edge, to keep clock transition times under ~100ps at every FF.
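The rise-time figures above can be sanity-checked with a first-order RC estimate. This is a minimal sketch, not a sign-off calculation: it treats the net as a single lumped RC stage, and the 45 Ω driver resistance and the 1pF per-buffer load are assumed values chosen only to land near the numbers quoted in the text.

```python
import math

def rise_time_10_90(r_ohm: float, c_farad: float) -> float:
    """10%-90% rise time of a single-pole RC stage: ln(9) * R * C."""
    return math.log(9.0) * r_ohm * c_farad

# Assumed effective driver resistance (hypothetical, not from a real library).
R_DRIVER_OHM = 45.0

# One driver loaded with the whole ~50pF clock network.
unbuffered_s = rise_time_10_90(R_DRIVER_OHM, 50e-12)

# The same driver strength loaded with an assumed ~1pF local subtree.
buffered_s = rise_time_10_90(R_DRIVER_OHM, 1e-12)

print(f"unbuffered rise time: {unbuffered_s * 1e9:.2f} ns")
print(f"buffered rise time:   {buffered_s * 1e12:.0f} ps")
```

Because rise time scales linearly with the capacitance each driver sees, splitting the load across many buffers is what brings the transition from nanoseconds down to the ~100ps range.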
What Is a Clock Tree?
A tree of clock buffers starting at the clock source and branching out to all flip-flop clock pins. Each level buffers the signal and drives the next level. A typical design might have 4–8 levels of buffering. The root drives 2–4 branches, each branch drives more sub-branches, eventually reaching individual FFs. The tree ensures signal integrity (clean transitions) and balances arrival times.
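To get a feel for how the number of levels relates to the number of sinks, the tree can be sized bottom-up from a maximum fanout per buffer. This is an idealized sketch: the fanout limit of 16 is an assumption, and real CTS also accounts for wire load, placement, and transition limits, so actual trees will differ.

```python
def buffers_per_level(num_sinks: int, max_fanout: int) -> list[int]:
    """Return buffer counts per level, root first, for an idealized tree.

    Each buffer drives at most `max_fanout` loads (sinks or other buffers).
    """
    if num_sinks < 1 or max_fanout < 2:
        raise ValueError("need num_sinks >= 1 and max_fanout >= 2")
    counts = []
    loads = num_sinks
    while loads > 1:
        loads = -(-loads // max_fanout)  # ceiling division
        counts.append(loads)
    return counts[::-1]

# Hypothetical example: 100,000 flip-flops, assumed fanout limit of 16.
levels = buffers_per_level(100_000, 16)
print("buffers per level (root first):", levels)
print("levels:", len(levels), "total buffers:", sum(levels))
```

Under these assumptions the 100,000-sink example needs 5 levels, which falls inside the 4-8 level range mentioned above.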
Clock Skew
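The difference in clock arrival time between any two flip-flops. If FF_A receives the clock edge at 1.00ns and FF_B at 1.25ns, the skew is 0.25ns. Skew matters because it shifts the timing budget of every path between the two FFs: positive skew (the capturing FF's clock arrives later than the launching FF's) relaxes setup but tightens hold; negative skew (the capturing FF's clock arrives earlier) tightens setup but relaxes hold. Typical CTS targets: local skew < 50ps, global skew < 200ps.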
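The effect of skew on setup and hold can be made concrete with the standard single-cycle slack equations. The sketch below reuses the FF_A/FF_B arrival times from the example; the clock period, clock-to-Q delay, data path delays, and setup/hold times are hypothetical values picked for illustration. All times are in picoseconds.

```python
def setup_slack(period, launch_arr, capture_arr, clk_to_q, data_max, t_setup):
    """Setup slack: the capture edge one period later must beat the slowest data."""
    skew = capture_arr - launch_arr  # positive when the capture clock is later
    return period + skew - (clk_to_q + data_max + t_setup)

def hold_slack(launch_arr, capture_arr, clk_to_q, data_min, t_hold):
    """Hold slack: the fastest data must not overtake the same capture edge."""
    skew = capture_arr - launch_arr
    return (clk_to_q + data_min) - (t_hold + skew)

# Assumed path parameters in ps (hypothetical, not from a real library).
PERIOD, CLK_TO_Q, DATA_MAX, DATA_MIN, T_SETUP, T_HOLD = 1000, 100, 900, 150, 50, 30
FF_A, FF_B = 1000, 1250  # clock arrival times from the example above

# Path FF_A -> FF_B: positive skew of +250ps.
pos_setup = setup_slack(PERIOD, FF_A, FF_B, CLK_TO_Q, DATA_MAX, T_SETUP)
pos_hold = hold_slack(FF_A, FF_B, CLK_TO_Q, DATA_MIN, T_HOLD)

# Path FF_B -> FF_A: negative skew of -250ps.
neg_setup = setup_slack(PERIOD, FF_B, FF_A, CLK_TO_Q, DATA_MAX, T_SETUP)
neg_hold = hold_slack(FF_B, FF_A, CLK_TO_Q, DATA_MIN, T_HOLD)

print("positive skew: setup", pos_setup, "ps, hold", pos_hold, "ps")
print("negative skew: setup", neg_setup, "ps, hold", neg_hold, "ps")
```

With these assumed numbers, the same 250ps of skew rescues setup but breaks hold in one direction, and does the opposite in the other.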
Clock Insertion Delay (Latency)
The time from the clock source to a flip-flop's clock pin, through all the buffers and wires of the clock tree. Typical values: 0.3–1.5ns depending on design size and node. Latency by itself does not cause setup or hold violations between FFs on the same clock; it is the difference in latency between FFs (skew) that causes timing issues.
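Latency and skew are related by simple arithmetic: each sink's latency is the sum of the buffer and wire delays on its path, and global skew is the spread between the largest and smallest latency. The sketch below uses made-up stage delays in picoseconds for three hypothetical sinks.

```python
# Stage delays (buffers and wires) from clock source to each sink, in ps.
# All values are hypothetical and only illustrate the bookkeeping.
paths_ps = {
    "FF_A": [120, 35, 110, 40, 95],
    "FF_B": [120, 35, 110, 60, 105],
    "FF_C": [120, 50, 115, 45, 90],
}

# Insertion delay (latency) per sink is the sum of its stage delays.
latency_ps = {sink: sum(stages) for sink, stages in paths_ps.items()}

# Global skew is the difference between the latest and earliest arrival.
global_skew_ps = max(latency_ps.values()) - min(latency_ps.values())

for sink, lat in latency_ps.items():
    print(f"{sink}: latency {lat} ps")
print(f"global skew: {global_skew_ps} ps")
```

All three sinks here have roughly 400ps of latency, yet only the 30ps spread between them counts as skew.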
Clock Uncertainty
A timing margin added to clock edges to account for: Jitter (cycle-to-cycle variation from PLL noise, typically 50–100ps), Skew (modeled as uncertainty pre-CTS), and extra guardband. Applied in SDC as set_clock_uncertainty. Post-CTS, skew is captured by propagated clock latencies, so only jitter+guardband remain.
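The pre-CTS versus post-CTS budget can be written out as a small calculation. This is a sketch under stated assumptions: the jitter, guardband, and skew-estimate numbers are illustrative, the clock name `core_clk` is hypothetical, and the generated constraint assumes the SDC time unit is nanoseconds.

```python
def clock_uncertainty_ps(jitter_ps: int, guardband_ps: int, skew_estimate_ps: int = 0) -> int:
    """Sum the uncertainty components; leave skew_estimate_ps at 0 after CTS."""
    return jitter_ps + guardband_ps + skew_estimate_ps

def sdc_uncertainty(clock_name: str, uncertainty_ps: int) -> str:
    """Format a set_clock_uncertainty constraint, assuming SDC time units of ns."""
    return f"set_clock_uncertainty {uncertainty_ps / 1000:.3f} [get_clocks {clock_name}]"

# Illustrative budget in ps (assumed values).
JITTER, GUARDBAND, SKEW_ESTIMATE = 75, 25, 100

# Pre-CTS: the clock is ideal, so estimated skew is folded into uncertainty.
pre_cts = sdc_uncertainty("core_clk", clock_uncertainty_ps(JITTER, GUARDBAND, SKEW_ESTIMATE))

# Post-CTS: real skew comes from propagated latencies, so only jitter + guardband remain.
post_cts = sdc_uncertainty("core_clk", clock_uncertainty_ps(JITTER, GUARDBAND))

print(pre_cts)
print(post_cts)
```

Keeping the components separate makes it straightforward to drop the skew term once the clock is propagated, rather than carrying the pessimistic pre-CTS margin through to sign-off.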
| Term | Meaning | Notes |
|---|---|---|
| Skew | Δ in clock arrival between FFs | Target: <50ps (local), <200ps (global) |
| Latency | Clock insertion delay | Source → FF clock pin delay |
| Uncertainty | Jitter + skew margin | Applied as timing margin in STA |
Key CTS Commands
| Command | Tool | Purpose |
|---|---|---|
| ccopt_design | Innovus | Run CTS with concurrent optimization |
| set_ccopt_property | Innovus | Set CTS properties such as target skew and target insertion delay |
| clock_opt | ICC2 | Run clock tree synthesis, clock routing, and post-CTS optimization |
| set_clock_tree_options | ICC2 | Configure CTS parameters |
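| report_ccopt_clock_trees, report_ccopt_skew_groups | Innovus | Report clock tree structure, buffer count, skew, and latency |
| report_clock_qor | ICC2 | Report skew, latency, buffer count |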