Faster multiplication
The neuron in lab 3 problem 3.1 uses our sequential multiplier from lab 2 problem 2.1; this takes (up to) 8 clock cycles to compute the product of two 8-bit numbers. This is fine if a new set of inputs comes no more than once every 8 clock cycles. However, if, for example, we would like our "dit" length to be 1 clock cycle, a stream of 'E' characters would result in a new set of inputs every 4 clock cycles. We will explore options for designing a multiplier with faster throughput.
In the following problems, follow the rules for single-clock logic: you may use any combinational logic blocks along with any number of D-registers all driven by the same clock, but no other memory elements or feedback. Also, recall that according to these rules, all outputs to sequential blocks must be generated directly as outputs of D-registers.
1.1 Combinational multiplier
1.1(a). Recall our implementation of a combinational multiplier using AND gates and Full Adder blocks. Assuming a t_{PD} of 2 ns per AND gate, 3 ns per FA block, and 3 ns for a D-register, along with a setup time of 6 ns for the D-register, what clock period would we need to be able to use our 8-bit multiplier in a pipelined circuit? What is the latency and throughput of this multiplier?
1.1(b). Design a k-pipeline for this combinational multiplier to increase the throughput. Draw the resulting block diagram with lines indicating where the pipeline registers should go. (Hint: see the pre-lecture video for Week 9 Wednesday)
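As a way of organizing the timing arithmetic in 1.1(a), the following is a minimal sketch of the usual register-to-register constraint, t_{CLK} >= t_{PD,reg} + t_{PD,logic} + t_{setup}. The function names and the example path (1 AND gate followed by 4 FA blocks) are hypothetical and are not taken from the lab's multiplier; the actual critical path has to be read off your own block diagram.

```python
# Delays given in problem 1.1(a), in nanoseconds.
T_AND = 2.0     # propagation delay of one AND gate
T_FA = 3.0      # propagation delay of one Full Adder block
T_REG = 3.0     # propagation delay of a D-register
T_SETUP = 6.0   # setup time of a D-register


def min_clock_period(n_and: int, n_fa: int) -> float:
    """Smallest clock period (ns) for a register-to-register path that
    passes through n_and AND gates and n_fa Full Adders in series."""
    logic_delay = n_and * T_AND + n_fa * T_FA
    return T_REG + logic_delay + T_SETUP


def throughput_per_ns(t_clk: float) -> float:
    """Results per nanosecond when one result is produced every clock cycle."""
    return 1.0 / t_clk


# Hypothetical path, NOT the answer to 1.1(a): 1 AND gate, then 4 Full Adders.
example_period = min_clock_period(n_and=1, n_fa=4)
print(example_period)  # 3 + (2 + 4*3) + 6 = 23.0
```

Substituting the AND and FA counts along the longest path of the 8-bit multiplier gives the clock period asked for; the same function applies per stage once pipeline registers are added.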
1.1(c). What is the resulting latency and throughput of the pipelined combinational multiplier?
1.2 Parallel multipliers
1.2(a). Design a distribute module according to the rules for single-clock logic that takes in a one-bit flag (indicating new or reset) that pulses high for one clock cycle (e.g. when a new input is ready), and alternately pulses one of two one-bit outputs new_{1}, new_{2}.
That is, the first time the input new pulses high, new_{1} will get pulsed high, but new_{2} must remain low. The very next time new pulses high, new_{2} will get pulsed high, but new_{1} must remain low. The outputs continue to alternate in that fashion.
Draw a block diagram of this module using any single-clock logic combination of wires, CMOS gates, muxes, and/or D-registers.
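To check a candidate design against the alternating behavior described above, a small cycle-by-cycle model can help. The sketch below is one possible behavioral model, not the intended solution: it assumes a one-bit select register `sel` that starts at 0 and two output registers, so that new_{1} and new_{2} come directly from D-registers. The class and attribute names are made up for illustration, and the reset behavior is not modeled.

```python
class Distribute:
    """Cycle-accurate model: three D-registers on one clock, no other state."""

    def __init__(self):
        self.sel = 0    # 0: the next pulse goes to new1, 1: it goes to new2
        self.new1 = 0   # registered output new_1
        self.new2 = 0   # registered output new_2

    def clock(self, new: int) -> tuple[int, int]:
        """Apply one rising clock edge with input `new`; return (new1, new2)
        as seen at the register outputs after that edge."""
        # Combinational logic: next values depend only on current values.
        next_new1 = new & (1 - self.sel)
        next_new2 = new & self.sel
        next_sel = self.sel ^ new   # toggle only when a pulse arrives
        # All registers load on the same clock edge.
        self.sel, self.new1, self.new2 = next_sel, next_new1, next_new2
        return self.new1, self.new2


d = Distribute()
pulses = [1, 0, 0, 1, 1, 0, 1]
outputs = [d.clock(p) for p in pulses]
print(outputs)
# [(1, 0), (0, 0), (0, 0), (0, 1), (1, 0), (0, 0), (0, 1)]
```

Because the outputs are registered, each output pulse appears one clock cycle after the corresponding input pulse; that extra cycle counts toward the latency asked about in 1.2(c).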
1.2(b). Recall that the multiplier block produces a one-bit output ready that is high when its complete product has been generated; assume the multiplier has been designed so that ready is high for exactly one clock cycle (exactly 8 cycles after its new input pulses). Design a join block that interleaves alternating outputs of two parallel multipliers onto a single output with a single one-bit ready output.
Draw a block diagram of this module using any single-clock logic combination of wires, CMOS gates, muxes, and/or D-registers.
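The join behavior can be prototyped the same way before drawing it. This sketch is one possible model under stated assumptions, not the intended solution: the two ready pulses never arrive in the same cycle (which holds if the multipliers are fed by an alternating distribute module), and the output holds its previous product when neither multiplier is ready. The names `Join`, `ready1`, `product1`, and so on are hypothetical.

```python
class Join:
    """Cycle-accurate model: a mux feeding a product register and a ready register."""

    def __init__(self):
        self.ready = 0     # registered one-bit ready output
        self.product = 0   # registered product output

    def clock(self, ready1: int, product1: int,
              ready2: int, product2: int) -> tuple[int, int]:
        """Apply one rising clock edge; return (ready, product) after the edge."""
        # Assumption: ready1 and ready2 are never high in the same cycle.
        next_ready = ready1 | ready2
        if ready1:
            next_product = product1       # mux selects multiplier 1
        elif ready2:
            next_product = product2       # mux selects multiplier 2
        else:
            next_product = self.product   # hold the last product
        self.ready, self.product = next_ready, next_product
        return self.ready, self.product


j = Join()
trace = [
    j.clock(0, 0, 0, 0),
    j.clock(1, 42, 0, 0),   # multiplier 1 finishes with product 42
    j.clock(0, 0, 1, 56),   # multiplier 2 finishes with product 56
    j.clock(0, 0, 0, 0),
]
print(trace)
# [(0, 0), (1, 42), (1, 56), (0, 56)]
```

Holding the product when neither input is ready is a design choice made here for readability of the trace; a design that lets the product output take any value while ready is low would also satisfy the problem statement.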
1.2(c). Use the above distribute and join modules with two parallel multiplier blocks to create the fast parallel multiplier. Draw the resulting block diagram. What is the minimum latency (in terms of the clock period t_{CLK}) and maximum throughput (in units of 1/t_{CLK}) of this system?
1.2(d). How does the area of this circuit (in terms of number of transistors) compare to the pipelined combinational multiplier above?