STA Interview Questions (Top 35)

1. What is static timing analysis and how does it differ from dynamic simulation?

STA exhaustively analyzes all timing paths in a design mathematically without requiring simulation vectors. It checks whether data can propagate from every startpoint to every endpoint within timing constraints.

Differences from dynamic simulation:

• STA covers every constrained path; simulation covers only the paths its vectors exercise

• STA is fast (minutes); full simulation can take days

• STA cannot find functional bugs; simulation can

• STA is deterministic given constraints; simulation depends on input vectors

• STA uses pre-characterized library timing models; timing simulation applies back-annotated delays (SDF) at gate level, or detailed transistor behavior in SPICE

2. What is setup time and hold time?

Setup time (T_su): The minimum time BEFORE the active clock edge that data must be stable at the FF input (D pin). If data changes within this window, the FF may fail to capture correctly → metastability.

Hold time (T_h): The minimum time AFTER the active clock edge that data must remain stable. If data changes within this window, the FF may capture the new value instead of the intended value.

Both are characteristics of the flip-flop cell from the technology library, measured at specific operating conditions. They represent fundamental timing requirements of the storage element.

3. How is setup slack calculated?

Setup slack = Required Arrival Time − Actual Arrival Time

Required Arrival Time: = Clock period + Capture clock latency − Clock uncertainty (setup) − Setup time of FF

Actual Arrival Time: = Launch clock edge time + Launch clock latency + CK→Q delay + Combinational delay

Slack ≥ 0: Setup MET (timing passes)

Slack < 0: Setup VIOLATED (must fix)

Example: Required = 4.8ns, Arrival = 3.5ns → Slack = +1.3ns (MET with 1.3ns margin)
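The two formulas above can be sketched as a small calculation. This is only an illustration: the function name, parameter names, and the individual numbers are assumptions chosen so that the totals reproduce the 4.8ns / 3.5ns example. Values are in picoseconds to keep the arithmetic exact.

```python
def setup_slack(period, capture_latency, setup_uncertainty, setup_time,
                launch_edge, launch_latency, ck_to_q, comb_delay):
    """Return (required, arrival, slack) in ps for a single-cycle setup check."""
    # Required time: next capture edge, shifted by latency, minus margins.
    required = period + capture_latency - setup_uncertainty - setup_time
    # Arrival time: launch edge plus everything the data passes through.
    arrival = launch_edge + launch_latency + ck_to_q + comb_delay
    return required, arrival, required - arrival


# Assumed breakdown (ps) that adds up to the example above.
required, arrival, slack = setup_slack(
    period=5000, capture_latency=100, setup_uncertainty=200, setup_time=100,
    launch_edge=0, launch_latency=100, ck_to_q=150, comb_delay=3250,
)
status = "MET" if slack >= 0 else "VIOLATED"
print(f"required={required} ps, arrival={arrival} ps, slack={slack} ps ({status})")
```

Reducing the period or adding combinational delay in this sketch shows directly how the slack shrinks toward a violation.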

4. Why do we need to fix both setup AND hold violations?

Both represent different failure modes that cause the flip-flop to capture the wrong data:

Setup violation: Data arrives too late → FF is asked to capture before data is stable → metastability → wrong Q output (random). Functional failure at speed.

Hold violation: Data changes too soon after clock → FF captures new value when old value was expected → wrong Q output. A hold violation is particularly dangerous because it causes failure at ALL frequencies - it's not a speed problem, it's a structural problem that causes failure even at low frequency.

Both checks must show zero violations at sign-off in every MMMC scenario.

5. What is clock-to-Q delay (CK-to-Q)?

CK-to-Q (propagation delay) is the time from when the clock edge arrives at the FF's clock pin to when the output Q settles to its new logic value. It is a library cell characteristic and contributes to the data path delay in timing analysis:

Data arrival time = T_clk_source + T_launch_clk_latency + T_CKtoQ + T_combo_logic

Typical values: 50–200ps depending on cell drive strength and load. Larger, faster cells have smaller CK-to-Q. It also depends on output load capacitance (higher load → longer CK-to-Q).

6. What is metastability? How do synchronizers help?

Metastability occurs when a flip-flop's input violates setup or hold time. The FF output neither fully resolves to 0 nor 1 - it remains at an intermediate voltage. Given enough time (mean time to resolve), it will eventually resolve to a valid logic level, but the resolution time is unpredictable - it can be arbitrarily long, causing downstream logic to see incorrect values.

Synchronizers (2-FF chains) help by providing extra time for the FF output to resolve before being used. The probability of metastability causing failure decreases exponentially with the resolution time given. Mean Time Between Failures (MTBF) increases exponentially with the number of synchronizer stages.
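The exponential dependence can be illustrated with the commonly used first-order model MTBF = exp(t_r / τ) / (T_w · f_clk · f_data). The sketch below assumes that model; the parameter values (τ, T_w, frequencies) are made-up placeholders, not data for any real library cell.

```python
import math


def synchronizer_mtbf(t_resolve, tau, t_window, f_clk, f_data):
    """First-order MTBF estimate in seconds.

    t_resolve: time allowed for the metastable node to settle (s)
    tau:       resolution time constant of the flip-flop (s)
    t_window:  metastability capture window of the flip-flop (s)
    f_clk:     destination clock frequency (Hz)
    f_data:    rate of asynchronous data transitions (Hz)
    """
    return math.exp(t_resolve / tau) / (t_window * f_clk * f_data)


# Assumed values: 500 MHz destination clock, 50 MHz data toggle rate.
common = dict(tau=20e-12, t_window=50e-12, f_clk=500e6, f_data=50e6)
one_stage = synchronizer_mtbf(t_resolve=1.5e-9, **common)
# An extra synchronizer stage adds roughly one more clock period (2 ns here).
two_stage = synchronizer_mtbf(t_resolve=3.5e-9, **common)
print(f"MTBF grows by a factor of {two_stage / one_stage:.2e}")
```

The added stage multiplies MTBF by exp(T_clk / τ), which is why one more flip-flop is usually far more effective than any attempt to reduce the data toggle rate.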

7. What is OCV (On-Chip Variation) and why does it matter?

OCV is the spatial variation in process, voltage, and temperature across a single die. Two identical cells at different locations on the same chip may have different delays due to process gradient, local power supply variations, and thermal gradients.

OCV matters because STA corner analysis (SS, TT, FF) assumes the whole chip is at one corner. In reality, the launch path might be slow while the capture path is fast (or vice versa), creating additional timing margin loss. OCV derating adds guardband by making launch paths pessimistically slow and capture paths pessimistically fast (or vice versa for hold).

8. What is AOCV? How does it differ from flat OCV derating?

Flat OCV: Apply the same derate factor (e.g., 5%) to every cell, regardless of path length. This is very pessimistic for long paths - a path with 50 cells has much more statistical averaging than one with 3 cells.

AOCV (Advanced OCV): Applies a smaller derating factor to longer paths (more cells) because statistical averaging reduces the probability of all cells simultaneously being at the worst case. Shorter paths get higher derating. This reduces over-pessimism in long paths, recovering timing margin and avoiding unnecessary ECO effort. The derate table is a function of path depth (cell count).
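A depth-indexed derate table can be sketched as a simple lookup. The table values below are hypothetical (real AOCV tables come from the foundry and are usually indexed by distance as well as depth), and the step lookup is a simplification of whatever interpolation a given tool performs.

```python
import bisect

# Hypothetical late-derate table: (path depth in cells, derate factor).
AOCV_LATE = [(1, 1.12), (2, 1.10), (5, 1.07), (10, 1.05), (20, 1.04), (50, 1.03)]


def aocv_derate(depth, table=AOCV_LATE):
    """Return the derate of the largest tabulated depth <= depth (clamped)."""
    depths = [d for d, _ in table]
    index = bisect.bisect_right(depths, depth) - 1
    return table[max(index, 0)][1]


FLAT_DERATE = 1.12          # assumed flat OCV derate for comparison
NOMINAL_CELL_DELAY = 50.0   # ps, assumed identical for every cell

for depth in (3, 20, 50):
    nominal = depth * NOMINAL_CELL_DELAY
    flat = nominal * FLAT_DERATE
    aocv = nominal * aocv_derate(depth)
    print(f"depth {depth:2d}: flat {flat:7.1f} ps, AOCV {aocv:7.1f} ps")
```

The gap between the flat and AOCV numbers widens with depth, which is the pessimism that AOCV recovers on long paths.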

9. What is MMMC analysis? Name four typical corners.

MMMC (Multi-Mode Multi-Corner) runs STA simultaneously across all operating modes and PVT corners to ensure the design meets timing in every scenario:

• func_slow: SS, 0.9V, 125°C - Setup check for functional mode

• func_fast: FF, 1.1V, -40°C - Hold check for functional mode

• scan_slow: SS, 0.9V, 25°C - Scan shift timing

• hold_extreme: FF, 1.2V, -55°C - Worst-case hold

All must pass simultaneously. One tool run covers all corners efficiently.

10. What is clock uncertainty and what components does it model?

Clock uncertainty is a timing margin applied to clock edges to account for:

• Jitter (period jitter): Cycle-to-cycle variation in clock period from the PLL/crystal

• Skew: Spatial variation in clock arrival times (pre-CTS only; post-CTS uses actual propagated latencies)

• Uncertainty margin: Additional guardband for modeling limitations

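Setup uncertainty reduces the setup required time (data must arrive earlier). Hold uncertainty increases the hold required time (data must stay stable longer), so each makes its own check more pessimistic. Each value is applied with its own command, e.g.: set_clock_uncertainty -setup 0.15 [get_clocks clk] and set_clock_uncertainty -hold 0.05 [get_clocks clk] (clk is a placeholder clock name).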

11. What is clock reconvergence pessimism removal (CRPR)?

When the launch and capture flip-flops share a portion of their clock path (common clock path), the STA tool would otherwise apply OCV derating to the shared segment twice - once making it slow (for launch) and once making it fast (for capture). This is physically impossible: the shared wire has one actual delay.

CRPR removes this double pessimism by identifying the common portion of the clock path and applying derating only to the diverging portions. This can recover significant timing margin (50–200ps) especially in designs with long shared clock networks.
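As a rough illustration, the pessimism on the shared segment is the difference between its late-derated and early-derated delay. The sketch below assumes flat derates and a made-up common-path delay; the function name is hypothetical and not a tool command.

```python
def crpr_credit(common_path_delay, late_derate, early_derate):
    """Pessimism removed for the shared clock segment (same unit as the delay)."""
    return common_path_delay * (late_derate - early_derate)


# Assumed values: 1000 ps of shared clock tree, +/-5% flat OCV derate.
credit = crpr_credit(1000.0, late_derate=1.05, early_derate=0.95)
print(f"CRPR credit: {credit:.1f} ps")  # about 100 ps is added back to slack
```

The credit scales with the length of the common path, so designs whose launch and capture clocks diverge late in the tree benefit most.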

12. What is a timing arc?

A timing arc is a delay or constraint relationship between two pins. Most arcs are defined in the library between an input pin and an output pin of a cell and describe how long a transition at the input takes to propagate to the output; net arcs connect a driver pin to a load pin. Types:

• Cell arc: Input→Output delay within a cell (e.g., A→Y in AND2)

• Net arc: Wire delay from cell output to next cell input (RC delay)

• Setup/hold arc: Constraint arcs on FF data vs clock pins

• Clock arc: CK→Q propagation arc of a flip-flop

STA tools traverse all arcs to compute path delays.

13. What is the purpose of read_parasitics / read_spef in PrimeTime?

read_parasitics -format spef filename.spef loads the extracted wire RC parasitics from the post-layout extraction tool (StarRC, QRC). Without this, PrimeTime uses ideal wires or estimated loading (from SDC set_load), which is inaccurate.

After loading SPEF, wire delays are computed from actual metal resistance and capacitance (R×C delay), giving accurate net delays. Sign-off timing MUST use SPEF parasitics. The SPEF file must match the design netlist exactly (same net names). Mismatches cause warnings and incorrect timing.

14. What is hold analysis and why is it corner-reversed from setup?

Hold analysis checks whether data changes too quickly after a clock edge. Hold slack = Data arrival time − (Capture clock latency + Hold time).

Hold violations occur when the DATA PATH is too fast (short logic path) and the CLOCK arrives late at the capture FF.

Therefore, hold analysis uses the FAST corner (FF process, high voltage, low temperature) which makes data paths fast and can make hold more critical. This is the reverse of setup analysis which uses the SLOW corner. That's why MMMC must check setup at slow corner AND hold at fast corner simultaneously.
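The hold slack formula can be sketched the same way as a setup calculation. The numbers are assumptions in picoseconds, and the optional hold_uncertainty term is an addition to the formula above that defaults to zero.

```python
def hold_slack(launch_latency, ck_to_q, comb_delay_min,
               capture_latency, hold_time, hold_uncertainty=0):
    """Return hold slack in ps; launch and capture use the same clock edge."""
    arrival = launch_latency + ck_to_q + comb_delay_min
    required = capture_latency + hold_time + hold_uncertainty
    return arrival - required


# Assumed fast-corner values: short data path, capture clock arrives late.
before = hold_slack(launch_latency=100, ck_to_q=80, comb_delay_min=30,
                    capture_latency=180, hold_time=40)
# Same path after inserting a (hypothetical) 60 ps delay buffer.
after = hold_slack(launch_latency=100, ck_to_q=80, comb_delay_min=90,
                   capture_latency=180, hold_time=40)
print(f"hold slack before fix: {before} ps, after fix: {after} ps")
```

Note that the clock period does not appear anywhere in the calculation, which is why slowing the clock cannot cure a hold violation.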

15. What does check_timing check and what warnings indicate?

check_timing validates constraint completeness and reports:

• Unconstrained endpoints: FF data pins or output ports that no constrained timing path reaches - they are not analyzed by STA

• No-clock FFs: Registers with no associated clock definition

• Partial path constraints: Input_delay covers only -max but not -min (or vice versa)

• Loop detection: Combinational loops

• Multiple clocks: Endpoints with multiple clock paths (may need set_false_path or set_clock_groups)

All warnings should be investigated - unconstrained paths are a sign-off risk.

16. What is path-based analysis (PBA) vs graph-based analysis (GBA)?

GBA (Graph-Based Analysis): Standard STA mode. Each cell's arrival time is calculated once using the worst-case input transition and output load from all converging paths. Very fast but pessimistic - assumes the worst condition at every cell simultaneously, even if physically impossible.

PBA (Path-Based Analysis): Re-analyzes specific critical paths using the actual input transition experienced by each cell on that specific path. More accurate, less pessimistic - removes false worst-case combinations. Much slower (only applied to a subset of near-critical paths). Used to "rescue" paths that look violated in GBA but actually pass when analyzed properly.

17. What is input transition and output load in cell timing models?

Cell delay is characterized as a 2D lookup table indexed by:

Input transition time (slew): How fast the input signal switches (rise/fall). A slower input → longer cell propagation delay.

Output load capacitance: Total capacitance the cell drives (input caps of fanout cells + wire cap). Higher load → longer output transition and higher cell delay.

The 2D table is NLDM (Non-Linear Delay Model). STA tools interpolate within the table to compute accurate delays for the specific transition and load seen at each cell in the design.
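The table lookup can be sketched as bilinear interpolation between the four surrounding table points. The table values below are invented for illustration, and real tools may use a different interpolation or extrapolation scheme.

```python
import bisect

# Hypothetical NLDM table for one timing arc.
SLEWS = [0.01, 0.05, 0.20]      # input transition index (ns)
LOADS = [0.001, 0.010, 0.050]   # output load index (pF)
DELAY = [                       # DELAY[slew_index][load_index] in ns
    [0.020, 0.045, 0.150],
    [0.030, 0.055, 0.160],
    [0.060, 0.085, 0.190],
]


def _lower_index(axis, x):
    """Index of the table segment containing x, clamped to the table range."""
    i = bisect.bisect_right(axis, x) - 1
    return min(max(i, 0), len(axis) - 2)


def nldm_delay(slew, load):
    """Bilinear interpolation; points outside the table extrapolate linearly."""
    i = _lower_index(SLEWS, slew)
    j = _lower_index(LOADS, load)
    ts = (slew - SLEWS[i]) / (SLEWS[i + 1] - SLEWS[i])
    tl = (load - LOADS[j]) / (LOADS[j + 1] - LOADS[j])
    low = DELAY[i][j] + tl * (DELAY[i][j + 1] - DELAY[i][j])
    high = DELAY[i + 1][j] + tl * (DELAY[i + 1][j + 1] - DELAY[i + 1][j])
    return low + ts * (high - low)


print(f"delay at slew=0.03 ns, load=0.0055 pF: {nldm_delay(0.03, 0.0055):.4f} ns")
```

Because both the cell delay and the output slew are looked up this way, an error in one cell's input transition propagates into every downstream lookup.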

18. What causes a max-capacitance violation and how is it fixed?

A max-capacitance violation occurs when the capacitive load on a cell's output exceeds the maximum capacitance limit specified in the technology library for that cell. This causes:

• Output transition (slew) becoming too slow

• Downstream cell delays increasing

• Possible functional failure if slew is extremely slow

Fixes:

• Insert buffers to split the high-fanout net

• Upsize the driving cell to a higher drive strength

• Reduce wire length (physical proximity of sinks)

Max-cap violations show up as design-rule violations (often labeled DRC or DRV) in STA reports; these are electrical rule checks, not physical layout DRC.

19. What is the difference between setup uncertainty and hold uncertainty?

Setup uncertainty: Applied to reduce the timing window available for data to meet setup. It tightens setup (makes it harder to meet). Typically 100–150ps for pre-CTS, reduced post-CTS.

Hold uncertainty: Applied to increase the minimum data arrival time required to meet hold. It tightens hold (makes hold harder to meet). Typically 50ps.

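The asymmetry exists because the two checks span different clock edges. Setup is checked between a launch edge and the NEXT capture edge, so period jitter directly shortens the available time. Hold is checked between launch and capture on the SAME source edge, so source jitter largely cancels and only skew and local variation remain. Pre-CTS uses larger uncertainty because it must also cover estimated skew; post-CTS switches to propagated clocks with only jitter and margin remaining.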

20. What is back-annotated timing? When is it used?

Back-annotated timing (post-layout STA) uses actual extracted RC parasitics (SPEF) from the physical layout to compute wire delays. Contrast with pre-layout timing which uses estimated loads.

Used: After routing is complete, for final sign-off. The parasitics capture the resistance and capacitance of every metal wire and via, which is why post-layout timing correlates with silicon far better than pre-layout estimates (a figure of about 5% is commonly cited).

Back-annotation reveals new violations not seen pre-route (because estimated wires underestimated actual wire capacitance). These violations require post-route ECO fixes with minimal netlist perturbation.

21. What is a timing exception and why must it be carefully applied?

A timing exception modifies how STA analyzes a specific path: set_false_path, set_multicycle_path, set_max_delay, set_min_delay.

They must be carefully applied because:

• Over-generous false paths hide real timing violations

• Wrong multicycle path settings (missing hold correction) create hold violations

• Incorrectly specified endpoints leave real functional violations unchecked

• Timing exceptions survive synthesis to PD to sign-off - errors propagate through the entire flow

All exceptions must be documented and reviewed. Functional paths must never be marked false.
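The "missing hold correction" case can be made concrete by computing which clock edges the checks use. The sketch below assumes launch and capture on the same clock and the default SDC behavior in which the hold check sits one cycle before the setup capture edge; the function name is hypothetical.

```python
def multicycle_edges(period, setup_mult=1, hold_mult=0):
    """Return (setup_capture_edge, hold_check_edge) relative to the launch edge.

    setup_mult: value given to set_multicycle_path -setup
    hold_mult:  value given to set_multicycle_path -hold (0 if not specified)
    """
    setup_edge = setup_mult * period
    # By default the hold check trails the setup capture edge by one cycle;
    # each unit of hold_mult moves it one further cycle back.
    hold_edge = (setup_mult - 1 - hold_mult) * period
    return setup_edge, hold_edge


T = 10  # ns, assumed clock period
print(multicycle_edges(T))                              # single-cycle default
print(multicycle_edges(T, setup_mult=2))                # hold check moved to 10 ns
print(multicycle_edges(T, setup_mult=2, hold_mult=1))   # hold check back at 0 ns
```

With only -setup 2, the data would have to be held for a full extra cycle, which usually produces hold violations or a large number of unnecessary delay buffers; adding -hold 1 restores the same-edge hold check.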

22. What is the difference between max-delay and false path?

set_false_path: Completely removes the path from timing analysis. The tool ignores it entirely - no timing report, no optimization. For paths that are genuinely never timing-critical in any operating scenario.

set_max_delay: Still analyzes the path for timing, but uses the specified delay as the timing constraint instead of the default (clock period). For paths that need to meet a specific delay that's different from the clock period (e.g., async paths that must complete within 10ns regardless of clock).

Key difference: set_false_path means "never check this." set_max_delay means "check this, but use this constraint."

23. What is a violation cascade and how do you prioritize fixes?

A violation cascade occurs when fixing one timing violation makes another one worse. For example, upsizing a cell to fix setup on path A may load a net and degrade setup on path B.

Prioritization strategy:

• Fix the WNS (worst) path first - largest magnitude violation

• Use ECO minimize-impact mode (minimize cell moves)

• Iterate in small batches (fix 20 paths, re-analyze, fix next 20)

• Monitor TNS trend - decreasing TNS = making progress

• Separate setup and hold fixes (hold buffer insertion can slow setup)
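Tracking WNS and TNS between batches can be sketched with a few lines. The slack lists are invented endpoint slacks in picoseconds, and the function name is an assumption rather than a tool command.

```python
def timing_summary(slacks):
    """Return (WNS, TNS, number of violating endpoints) for endpoint slacks."""
    violating = [s for s in slacks if s < 0]
    wns = min(slacks)       # worst slack (positive when everything passes)
    tns = sum(violating)    # total negative slack
    return wns, tns, len(violating)


# Assumed endpoint slacks (ps) before and after one ECO batch.
before = [-120, 35, -40, 10, -5]
after = [-20, 30, -40, 5, -5]

print("before:", timing_summary(before))
print("after: ", timing_summary(after))
```

In this made-up batch the worst path improved and TNS fell, but two previously comfortable endpoints lost margin, which is the kind of side effect the small-batch strategy is meant to catch early.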

24. How does temperature inversion affect timing at advanced nodes?

At mature nodes (130nm+): Higher temperature → slower transistors (mobility decreases). Standard worst-case timing = high temp.

At advanced nodes (<65nm): Temperature inversion can occur at low Vdd. Threshold voltage rises as temperature falls, and when Vdd is close to Vth the lost gate overdrive (Vdd − Vth) outweighs the mobility gain, so transistors can be SLOWER at low temperature than at high temperature. This means the traditional slow corner (SS, 125°C) may no longer be worst-case timing; SS at -40°C may be worse.

Impact: Need to check timing at multiple temperature points. Some foundries provide separate library corners for this. Ignoring temperature inversion at advanced nodes can lead to post-silicon timing failures.

25. What is signal integrity (SI) in STA context?

In STA, Signal Integrity (SI) analysis accounts for crosstalk-induced delay changes:

SI delta delay: Coupling from aggressor wires causes victim wire delay to increase or decrease. STA includes SI analysis in sign-off by computing the worst-case delay considering all possible aggressor switching combinations.

SI noise analysis: Checks if crosstalk-induced voltage glitches on quiet nets can cause logic errors. The noise immunity of the receiving cell must exceed the peak noise voltage.

SI analysis requires layout parasitics including coupling capacitance (SPEF with coupling) - simple ground capacitance models are insufficient for SI-accurate timing.

26. What are recovery and removal times for asynchronous pins?

For asynchronous control pins (async reset, preset, clear) of flip-flops:

Recovery time: Minimum time the async signal must be deasserted BEFORE the active clock edge. Analogous to setup time - if async reset is released too close to the clock, the FF may not properly respond to the clock. STA tools check it automatically from the library's recovery arcs, provided the reset path itself is constrained.

Removal time: Minimum time the async signal must remain asserted AFTER the active clock edge. Analogous to hold time. These are library-characterized values that must be checked if async resets are used in a synchronous design.

27. What is the difference between input/output delay -max and -min?

set_input_delay -max: Latest time data can arrive at the port relative to clock. Used for setup analysis of the first internal register that captures this input.

set_input_delay -min: Earliest time data arrives. Used for hold analysis (ensures data doesn't arrive so early that it violates hold at the capturing FF).

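set_output_delay -max: Maximum external delay after the port (board/wire delay plus the downstream receiver's setup time). It defines how early before the next clock edge the data must be valid at the output, and is used for setup analysis.

set_output_delay -min: Minimum external delay after the port (minimum external path delay minus the downstream receiver's hold time). It defines how long the old data must remain at the output, and is used for hold analysis.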

All four values (-max/-min for input/output) must be specified for complete I/O timing coverage.

28. What is POCV (Parametric OCV)?

POCV (Parametric/Statistical OCV) replaces flat or AOCV derating with a statistical model. Each cell delay is modeled as a Gaussian distribution with a mean and standard deviation taken from library variation characterization (e.g., LVF data).

STA computes the statistical distribution of path delay: the sum of independent Gaussian cell delays is itself Gaussian, with the means adding linearly and the standard deviations adding in quadrature (root-sum-square). Slack is then expressed at a chosen sigma level - e.g., "path meets timing at 3σ".

Benefits: Most accurate OCV model, removes pessimism from flat/AOCV derating. Used in advanced (<7nm) nodes where OCV is very significant. Requires POCV characterization data from the foundry library.
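The pessimism reduction can be sketched numerically. The example assumes independent cell variations and uses invented per-cell means and sigmas; it is an illustration of root-sum-square accumulation, not of any tool's exact POCV algorithm.

```python
import math


def pocv_path_delay(stages, n_sigma=3.0):
    """Return (mean, sigma, late_delay) for a path of (mean, sigma) stages.

    Assumes the stage delays vary independently, so variances add.
    """
    mean = sum(m for m, _ in stages)
    sigma = math.sqrt(sum(s * s for _, s in stages))
    return mean, sigma, mean + n_sigma * sigma


# Assumed path: 16 identical cells, each 100 ps mean with 5 ps sigma.
path = [(100.0, 5.0)] * 16
mean, sigma, late = pocv_path_delay(path)

# For comparison: every cell pushed to its own 3-sigma value simultaneously.
per_cell_worst = sum(m + 3.0 * s for m, s in path)

print(f"POCV late delay: {late:.1f} ps, per-cell worst case: {per_cell_worst:.1f} ps")
```

The path sigma grows with the square root of the stage count while the per-cell worst case grows linearly, so the saving increases with path depth.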

29. How do you handle paths between asynchronous clock domains in STA?

Paths between asynchronous clock domains (clocks with no fixed phase relationship) cannot be meaningfully analyzed by standard STA - the arrival time of data relative to the capture clock is unbounded.

Proper handling:

• set_clock_groups -asynchronous: Tells the STA tool to not analyze paths between these clock domains. The crossing is handled by synchronizers in the design.

• CDC analysis (separate tool: Mentor CDC, Cadence JasperGold): Verifies correct synchronization structures are present

• The synchronizer itself is analyzed with appropriate timing constraints

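Failure to apply set_clock_groups for async clocks makes the tool time these crossings as if the clocks were synchronous, creating false setup and hold violations with meaningless, pessimistic slack values.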

30. What is the ECO flow in PrimeTime and how is it used?

PT-ECO is PrimeTime's automated ECO (Engineering Change Order) capability for fixing timing violations post-route:

Flow:

• PT analyzes sign-off netlist with SPEF, finds violations

• fix_eco_timing -type setup and fix_eco_timing -type hold generate cell changes (upsize/insert buffers)

• Changes written to eco_changes.tcl

• Innovus/ICC2 reads changes, places/routes ECO cells

• RC re-extracted, PT re-runs analysis

• Iterate until clean

PT-ECO minimizes cell perturbation to preserve DRC cleanliness of the post-route database.

31. What is Clock Domain Crossing (CDC) and why is it dangerous?

CDC occurs when a signal driven by flip-flops in clock domain A is sampled by flip-flops in clock domain B, where A and B are asynchronous (no fixed phase relationship).

Why dangerous: The receiving FF's setup/hold requirements may be violated at random times depending on the phase relationship of the two clocks at the moment of crossing. This causes metastability - the FF output remains at an intermediate voltage and resolves to 0 or 1 unpredictably, after an unpredictable delay.

Why STA alone can't catch it: With set_clock_groups -asynchronous, STA ignores these paths entirely. You need a dedicated CDC verification tool (JasperGold CDC, Questa CDC, SpyGlass) to verify synchronizers are correct.

Consequences if missed: Random data corruption in silicon - impossible to reproduce, intermittent failures, very hard to debug post-silicon.

32. What synchronization structures are used for CDC and when do you choose each?

• 2-FF Synchronizer: For single control bits, flags, enable signals. Two back-to-back FFs in destination domain. First FF may go metastable but has one full clock cycle to resolve before second FF samples it. MTBF increases exponentially with resolution time.

• 3-FF Synchronizer: Same as 2-FF but with extra resolution time - for safety-critical designs or very fast destination clocks where 1 cycle may be insufficient resolution time.

• Async FIFO: For multi-bit data buses (data payload, pixel streams). Read/write pointers encoded in Gray code so only 1 bit changes per increment - safe to synchronize with 2-FF. Never cross multi-bit buses directly with just a 2-FF synchronizer.

• Handshake (req/ack): For slow control signals where latency is acceptable. Request crosses to destination, Acknowledge comes back. Guarantees data is only sampled after confirmation.

• Gray Code Counter: For pointer values that must cross domains - only 1 bit changes per count, so metastability affects at most ±1 count (appears as empty/full, not data corruption).
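The single-bit-change property of Gray code is easy to check in software. The sketch below uses the standard binary-reflected Gray code; the function names are illustrative.

```python
def bin_to_gray(n):
    """Convert a binary count to binary-reflected Gray code."""
    return n ^ (n >> 1)


def gray_to_bin(g):
    """Convert Gray code back to binary by XOR-ing all right shifts."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


WIDTH = 4
codes = [bin_to_gray(i) for i in range(1 << WIDTH)]
for i, code in enumerate(codes):
    nxt = codes[(i + 1) % len(codes)]       # includes the wrap-around step
    changed = bin(code ^ nxt).count("1")    # number of bits that toggle
    print(f"{i:2d}: {code:0{WIDTH}b} -> {nxt:0{WIDTH}b} ({changed} bit changes)")
```

Because exactly one bit toggles per increment (including wrap-around for power-of-two depths), a 2-FF synchronizer on each bit can only ever observe the old or the new pointer value.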

33. How do you handle CDC in STA? What is set_clock_groups?

In STA, CDC paths between asynchronous domains cannot be meaningfully constrained with a clock period - the phase relationship is unknown.

set_clock_groups -asynchronous -group {clk_a} -group {clk_b}: Tells the STA tool to completely ignore timing paths between clk_a and clk_b. No setup or hold check is performed on these paths. This is correct because the synchronizer handles the crossing - you separately verify the synchronizer with CDC tools.

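vs set_false_path: set_false_path is directional (-from A -to B covers only A→B, so a second command is needed for B→A) and leaves the two clocks related for crosstalk analysis. set_clock_groups -asynchronous is bidirectional, also tells SI analysis that the two domains have no fixed timing relationship, and is the preferred method for truly asynchronous clocks.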

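vs set_max_delay -datapath_only: Used when the CDC path has a latency requirement - e.g., a register crossing must complete within 100ns. The -datapath_only option (Vivado syntax; PrimeTime's closest equivalent is -ignore_clock_latency) excludes clock latency and skew from the check, so only the data path delay is constrained.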

Important: Using set_clock_groups does NOT mean the design is CDC-correct - it only tells STA to stop analyzing. CDC tool sign-off is still required.

34. What is latch borrowing (time borrowing)? How does it differ from flip-flop pipelines?

Latch borrowing allows a pipeline stage to use more than its nominal share of the clock period, borrowing from the time budget of the following stage.

How it works: A level-sensitive latch is transparent during the clock high phase (T/2). If logic in stage N takes longer than T/2 but less than T, the data can still pass through the latch while it is still transparent - effectively borrowing time from stage N+1. The next stage then has less time available.

Key difference from FF pipelines:

• FF: Strictly 1 clock period per stage - no slack redistribution

• Latch: Slack can flow between stages - imbalanced pipelines can still work

• FF: Edge-triggered, easier for STA and DFT

• Latch: Level-triggered, more complex STA (hold checks are different), harder scan insertion

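Max borrow = the transparent window (T/2 for a 50% duty-cycle clock) minus the latch setup time; tools may reduce it further by clock uncertainty or a user limit (set_max_time_borrow). If the data needs to borrow more than this, it arrives after the latch has closed and the path fails the latch setup check.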
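A minimal sketch of the borrowing arithmetic, assuming a latch that opens at open_edge and closes at close_edge. The setup_time argument is an optional refinement that shrinks the usable window; all names and numbers (in ps) are illustrative.

```python
def latch_borrow(arrival, open_edge, close_edge, setup_time=0):
    """Return (time_borrowed, max_borrow, passes) for one latch endpoint."""
    # Data arriving before the latch opens borrows nothing.
    borrowed = max(0, arrival - open_edge)
    # Usable part of the transparent window.
    max_borrow = (close_edge - open_edge) - setup_time
    return borrowed, max_borrow, borrowed <= max_borrow


# Assumed 2000 ps clock: latch transparent from 0 to 1000 ps, 50 ps setup.
for arrival in (-100, 300, 980):
    borrowed, limit, ok = latch_borrow(arrival, open_edge=0,
                                       close_edge=1000, setup_time=50)
    print(f"arrival {arrival:5d} ps: borrowed {borrowed:4d} of {limit} ps, "
          f"{'OK' if ok else 'VIOLATION'}")
```

Whatever is borrowed here is subtracted from the time available to the next stage, so a chain of borrowing stages must eventually be repaid by a stage with spare time.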

35. How is latch borrowing reported in STA tools? What are the timing checks for latches?

In PrimeTime and Tempus, latch paths show additional information in report_timing:

"Time Borrowed": The amount of time the path borrowed from the next stage through the transparent latch. A positive value means borrowing occurred. Zero means the data arrived before the latch opened, so nothing was borrowed.

Latch timing checks:

• Setup check: Data must arrive before the latch CLOSES (end of transparent phase), not before the clock edge. Required time = latch closing edge + clock latency − uncertainty − setup time of latch.

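• Hold check: Data must remain stable for the hold time after the latch CLOSING edge. This is trickier than FF hold - the risk is that new data launched by the upstream element races through before the capturing latch has closed on the previous data.

• Max borrow check: Tools report a violation if the required borrow exceeds the maximum allowed (the transparent window minus setup time, or a user limit set with set_max_time_borrow).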

TCL: report_timing -pba_mode exhaustive is often needed for accurate latch analysis since GBA can be overly pessimistic for latch paths.

DFT consideration: Latches require special scan insertion (scan-enable must be added to convert latch to scannable element). Many flows prefer FFs to avoid this complexity.
