Fast Adders: Carry Lookahead and Beyond
How generate and propagate beat the slow ripple carry
A ripple-carry adder chains N full adders, and each stage cannot finish until the carry from the stage below arrives. The carry has to travel through every bit, so the worst-case delay grows linearly with the width. For a 64-bit adder that is too slow, so designers compute the carries a different way.
Generate and propagate
For each bit position define two signals. A bit generates a carry when both inputs are 1 (g = a AND b), because that produces a carry no matter what comes in. A bit propagates an incoming carry when exactly one input is 1 (p = a XOR b), because then the carry-out simply equals the carry-in. From these, every carry can be written directly in terms of the inputs and the very first carry-in.
// 4-bit carry lookahead, c0 is the carry-in
assign g = a & b; // generate, 4 bits
assign p = a ^ b; // propagate, 4 bits
assign c1 = g[0] | (p[0] & c0);
assign c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0);
assign c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
| (p[2] & p[1] & p[0] & c0);
assign sum = p ^ {c3, c2, c1, c0};Scaling: blocks, trees and skip
All carries in one block settle in roughly constant time, but the AND-OR fan-in grows with block width, so a full lookahead across 64 bits would need impractically wide gates. Real wide adders therefore combine ideas. Block carry lookahead groups bits into 4-bit blocks, computes a block-level generate and propagate, then looks ahead between blocks, giving roughly logarithmic delay. Parallel-prefix adders such as Kogge-Stone and Brent-Kung arrange the same generate-propagate combination as a tree of prefix operators, trading area and wiring for the shortest depth. Carry-skip and carry-select adders take a middle path: cheaper than full lookahead but much faster than ripple.
| Adder | Delay (worst case) | Area | Note |
|---|---|---|---|
| Ripple carry | Order N | Smallest | Carry walks every bit |
| Carry skip | Roughly sqrt N | Small | Skips known-propagate blocks |
| Carry select | Roughly sqrt N | Larger | Computes both carry cases, then picks |
| Carry lookahead | Roughly log N | Large | Flat AND-OR carries per block |
| Kogge-Stone | Roughly log N | Largest | Prefix tree, heavy wiring |
in an interview, anchor the delays: ripple is order N, lookahead and prefix adders are order log N, and skip or select sit around sqrt N. Then add the tradeoff: faster carries always cost more gates and wiring, so the right adder depends on the width and the timing budget.
do not claim lookahead is constant time at any width. The AND-OR fan-in and gate loading grow with block size, so beyond about 4 bits per block you must go hierarchical or use a prefix tree. Constant time holds only inside one small block.