Multipliers and Barrel Shifters
Turning multiply and shift into fast parallel hardware
The array multiplier
Multiplication and variable shifting look simple in software but become real datapath structures that decide the area and critical path of any ALU or DSP block. An unsigned multiply is just shifted partial products added up: for each bit of the multiplier you AND it with the whole multiplicand to form one partial product, shift it left by the bit position, then sum all of them. A pure array multiplier wires N rows of AND gates into a grid of adders. It is regular and easy to lay out, but its delay grows with N because the partial-product sum still ripples.
// Behavioral N-bit unsigned multiply; synthesis maps it
// to a partial-product array plus a fast adder tree.
module mult #(parameter N = 8)
(input [N-1:0] a, b,
output [2*N-1:0] p);
integer i;
reg [2*N-1:0] acc;
reg [2*N-1:0] a_ext;
always @* begin
acc = 0;
a_ext = {{N{1'b0}}, a}; // widen to 2N so high bits survive
for (i = 0; i < N; i = i + 1)
if (b[i]) acc = acc + (a_ext << i); // shifted partial product
// 'acc' accumulates all partial products
end
assign p = acc;
endmoduleTwo tricks dominate fast multipliers. A Wallace or Dadda tree compresses the column of partial products using 3-to-2 carry-save adders, reducing the bits to add in logarithmic depth, and only the final two rows go through one fast carry-propagate adder. Booth encoding cuts the number of partial products roughly in half by recoding the multiplier into signed digits, which also makes signed multiplication natural. Production multipliers combine radix-4 Booth encoding with a Wallace tree.
The barrel shifter
A barrel shifter shifts or rotates a word by any amount in a single cycle, with no clocked loop. It is built from a row of multiplexers per shift stage, where stage k either passes data straight through or shifts it by 2 to the power k. Selecting stages by the bits of the shift amount lets you reach any distance in log N mux layers. This is exactly why shift amounts are encoded in binary: each bit controls one stage.
// 8-bit logical left barrel shifter, 3 mux stages
module barrel_lsh
(input [7:0] din,
input [2:0] sh, // shift amount 0..7
output [7:0] dout);
wire [7:0] s0, s1;
assign s0 = sh[0] ? {din[6:0], 1'b0} : din; // shift by 1
assign s1 = sh[1] ? {s0[5:0], 2'b0} : s0; // shift by 2
assign dout = sh[2] ? {s1[3:0], 4'b0} : s1; // shift by 4
endmoduleif asked why a barrel shifter beats a simple shift register, say the shift register needs up to N clock cycles while the barrel shifter does any shift combinationally in log N mux stages. For multipliers, the headline answer is Booth to reduce partial products and a Wallace tree to add them in logarithmic depth.
signed values change everything. An array multiplier built for unsigned inputs gives wrong results on twos-complement numbers unless you sign-extend the partial products or use Booth encoding. Likewise, an arithmetic right shift must replicate the sign bit, not feed in zeros like a logical shift.