Retiming: Moving Registers to Balance Logic
Relocating flops across logic to shorten the critical path
Ordinary synthesis optimizes the combinational logic between two fixed flops. Retiming goes one level deeper: it moves the flops themselves across the logic to balance the delay between stages, without changing what the circuit computes. A pipeline often has one heavy stage and one light stage. The clock period is set by the heavy stage, so the light stage wastes slack. Retiming takes logic from the heavy side and pushes it across a register into the light side so that both stages end up closer to the same delay. The total logic and the number of registers on every input-to-output path (and around every feedback loop) are preserved, so the behavior after the same number of clocks is identical.
| Quantity | Before retiming | After retiming |
|---|---|---|
| Stage 1 logic delay | 1.8 ns | 1.2 ns |
| Stage 2 logic delay | 0.6 ns | 1.2 ns |
| Achievable period | 1.8 ns + clk-to-Q + setup | 1.2 ns + clk-to-Q + setup |
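
The arithmetic behind the table can be sketched in a few lines. The stage delays are the ones in the table; the clock-to-Q and setup values are assumed placeholders chosen only for illustration, and `min_period` is a hypothetical helper name.

```python
# Illustrative numbers only: stage delays come from the table above,
# while T_CLK_TO_Q and T_SETUP are assumed values, not from a real library.
T_CLK_TO_Q = 0.10  # ns, assumed
T_SETUP = 0.05     # ns, assumed


def min_period(stage_delays_ns):
    """The slowest stage plus register overhead sets the clock period."""
    return max(stage_delays_ns) + T_CLK_TO_Q + T_SETUP


before = [1.8, 0.6]  # heavy stage, light stage
after = [1.2, 1.2]   # balanced by retiming

# Retiming redistributes logic between stages; it does not remove any.
assert abs(sum(before) - sum(after)) < 1e-9

period_before = min_period(before)
period_after = min_period(after)

print(f"before: {period_before:.2f} ns ({1000 / period_before:.0f} MHz)")
print(f"after:  {period_after:.2f} ns ({1000 / period_after:.0f} MHz)")
```

The total logic delay is 2.4 ns in both columns; only the worst stage changes, and the worst stage is what the clock period tracks.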
Backward and forward retiming
- Backward retiming pulls a register back toward the inputs, moving logic from before the flop to after it.
- Forward retiming pushes a register forward toward the outputs, moving logic from after the flop to before it.
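- A legal move must respect every fanin and fanout: to move a register forward across a gate, there must be a register on every input of that gate for the move to absorb. The mirror-image rule applies in reverse: to move a register backward across a gate, there must be a register on every fanout of that gate.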
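
The legality rule can be illustrated with a small model in which every wire carries a count of the registers sitting on it, in the style of the classic edge-weight formulation of retiming. This is a minimal sketch, not how any synthesis tool implements it; the graph and the names `forward_move`, `backward_move`, and `path_regs` are hypothetical.

```python
# Each edge (src, dst) maps to the number of registers on that wire.

def forward_move(regs, gate):
    """Take one register off every input of `gate`, add one to every output."""
    fanin = [e for e in regs if e[1] == gate]
    fanout = [e for e in regs if e[0] == gate]
    if not fanin or any(regs[e] < 1 for e in fanin):
        return None  # illegal: some input has no register to absorb
    new = dict(regs)
    for e in fanin:
        new[e] -= 1
    for e in fanout:
        new[e] += 1
    return new


def backward_move(regs, gate):
    """Take one register off every output of `gate`, add one to every input."""
    fanin = [e for e in regs if e[1] == gate]
    fanout = [e for e in regs if e[0] == gate]
    if not fanout or any(regs[e] < 1 for e in fanout):
        return None  # illegal: some fanout has no register to absorb
    new = dict(regs)
    for e in fanout:
        new[e] -= 1
    for e in fanin:
        new[e] += 1
    return new


def path_regs(regs, path):
    """Count the registers along a path given as a list of node names."""
    return sum(regs[(a, b)] for a, b in zip(path, path[1:]))


# Inputs a and b feed gate g, g feeds gate h, h drives the output.
regs = {("a", "g"): 1, ("b", "g"): 1, ("g", "h"): 0, ("h", "out"): 1}

moved = forward_move(regs, "g")       # legal: both inputs of g are registered
blocked = forward_move(regs, "h")     # illegal: the g->h wire has no register
restored = backward_move(moved, "g")  # undoes the forward move

path = ["a", "g", "h", "out"]
assert path_regs(regs, path) == path_regs(moved, path)
```

In this example the forward move replaces two input registers with a single register on the output of `g`, so the flop count drops while the register count along each path stays the same.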
How you invoke it
In Design Compiler, retiming is run with the optimize_registers command (or by passing -retime to compile_ultra); in Genus, it is enabled by setting the retime attribute on the design before mapping. The registers being moved must not be protected by other constraints, such as a dont_touch, or the tool will leave them alone.
```tcl
# Design Compiler: retiming as part of high-effort compile
compile_ultra -retime
# Or run the dedicated retiming pass on specific designs
optimize_registers -designs my_block -check_design
# Genus equivalent: turn on retiming, then map
set_db design:my_block .retime true
syn_generic
syn_map
```

The strongest use of retiming is right after adding a fresh pipeline stage: drop the registers in roughly the right place, then let retiming slide them to the true balance point. In an interview, stress that retiming preserves cycle latency and function; it only redistributes where the logic sits between clock edges.
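
The claim that latency and function are preserved can be checked on a toy model. The sketch below simulates a hypothetical datapath (add two inputs, then multiply by 3) cycle by cycle, first with registers on both adder inputs and then with those registers moved forward across the adder. The function names and the datapath are invented for illustration, and all registers are assumed to reset to 0.

```python
def run_before(samples):
    """Registers on both adder inputs; add and scale feed the output register."""
    ra = rb = rout = 0
    outputs = []
    for a, b in samples:
        outputs.append(rout)
        # All registers update together on the clock edge.
        ra, rb, rout = a, b, (ra + rb) * 3
    return outputs


def run_after(samples):
    """Input registers moved forward across the adder and merged into one."""
    rsum = rout = 0
    outputs = []
    for a, b in samples:
        outputs.append(rout)
        # The adder now sits before the register, the multiplier after it.
        rsum, rout = a + b, rsum * 3
    return outputs


samples = [(1, 2), (3, 4), (5, 6), (7, 8), (0, 1)]
assert run_before(samples) == run_after(samples)
```

Both versions take two clocks from input to output; the second one simply splits the add and the multiply across the two stages. The match from cycle 0 depends on the merged register resetting to the sum of the original reset values, which is 0 here. A simulation like this only illustrates the idea and is not a substitute for formal equivalence checking.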
Retiming can move or merge flops, which renames them and can break anything that depends on flop names: scan chains, formal mappings, and waveform debug. Run it before scan insertion, check that reset and clock-gating relationships survive the move, and always close the loop with formal equivalence checking, because the netlist registers no longer map one-to-one to the RTL registers.