An arithmetic unit is implemented in two ways A 4stage pipel

An arithmetic unit is implemented in two ways:

A 4-stage pipeline design, named P, where the four stages S1, S2, S3, and S4 have combinational delays of 1 unit, 1.5 units, 3.5 units, and 2 units, respectively. All operands are fed into S1 and the output is provided by S4.

A single-cycle design, named U.

(a) What is the best-case speedup of design P over design U? Under what conditions will this speedup be achieved?

(b) The delay of a pipeline stage corresponds to the path with the largest delay. For stage S3, only 20% of all operations are "slow" and excite this worst-case delay path, while the remaining 30% and 50% are "medium" and "fast" and have maximum delays of 1.7 units and 1.3 units, respectively. Suppose we alter the pipeline to a design P', which still has four stages as in P, but where the "fast" operations in S3 complete in one cycle and the "medium" and "slow" operations in S3 require two and three cycles, respectively. Note that while a "medium" or "slow" operation is being executed, stage S3 is busy and the pipeline is stalled appropriately until S3 becomes available. Over a large number of operations where the mix of "fast", "medium", and "slow" operations follows the 50%/30%/20% distribution, what is the best-case speedup of design P' over U?

(c) Now consider the case where the delays of the stages in the original design P are balanced so that every stage has the same delay and, unlike (b), each stage requires exactly one cycle. Call this design P". What is the best-case speedup of design P" over U? Under what conditions will this speedup be achieved?

Solution

Pipelining cannot decrease the processing time required for a single task. Its advantage is that it increases the throughput of the system when processing a stream of tasks.
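Applied to design P from the question, this distinction can be sketched numerically. The following is a minimal sketch, not a definitive answer: it assumes that the single-cycle design U needs the sum of the four stage delays per operation, that pipeline latches add no overhead, and that operations enter back to back with no stalls. The function names are hypothetical.

```python
# Stage delays of design P (S1..S4), taken from the question, in delay units.
STAGE_DELAYS = [1.0, 1.5, 3.5, 2.0]

# Assumption: U needs the sum of the stage delays for each operation,
# and pipeline latches add no overhead.
U_DELAY = sum(STAGE_DELAYS)   # 8.0 units per operation
P_CYCLE = max(STAGE_DELAYS)   # 3.5 units, set by the slowest stage S3


def pipelined_time(n_ops, cycle=P_CYCLE, n_stages=len(STAGE_DELAYS)):
    """Total time for n_ops back-to-back operations with no stalls."""
    return (n_stages + n_ops - 1) * cycle


def speedup_over_u(n_ops):
    """Speedup of P over U for a stream of n_ops operations."""
    return (n_ops * U_DELAY) / pipelined_time(n_ops)


if __name__ == "__main__":
    # A single operation is slower in P than in U (speedup below 1);
    # a long stream approaches the limit U_DELAY / P_CYCLE.
    for n in (1, 10, 1000, 1_000_000):
        print(n, round(speedup_over_u(n), 4))
    print("limit:", round(U_DELAY / P_CYCLE, 4))
```

Under these assumptions, one operation takes 14 units in P against 8 units in U, while the speedup for a long, stall-free stream approaches 8 / 3.5, or about 2.29. That limit is the best case asked for in part (a).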

Applying too many pipelined functions can increase latency; that is, the time required for a single task to propagate through the full pipe is prolonged. A pipelined system may also require more resources (buffers, circuits, processing units, memory, etc.) if the reuse of resources across different stages is restricted.
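For the stalled design P' and the balanced design P" in the question, throughput and latency can be estimated as follows. This is a rough sketch under several assumptions that the question does not spell out: U needs the sum of the stage delays (8 units); the P' clock is the shortest cycle that fits S1, S2, and S4 in one cycle and each S3 class in its stated number of cycles; S3 is the only bottleneck; and balancing spreads the same total delay evenly over the four stages. All function names are hypothetical.

```python
# Stage delays of design P (S1..S4), taken from the question, in delay units.
STAGE_DELAYS = [1.0, 1.5, 3.5, 2.0]

# Assumption: the single-cycle design U needs the sum of the stage delays.
U_DELAY = sum(STAGE_DELAYS)  # 8.0

# S3 operation classes in P': (fraction of operations, delay, cycles in S3).
S3_MIX = [
    (0.5, 1.3, 1),  # fast
    (0.3, 1.7, 2),  # medium
    (0.2, 3.5, 3),  # slow
]


def p_prime_cycle():
    """Shortest clock that fits S1, S2, S4 in one cycle and every S3 class
    in its allotted number of cycles (an assumed clocking rule)."""
    one_cycle_stages = [STAGE_DELAYS[0], STAGE_DELAYS[1], STAGE_DELAYS[3]]
    s3_per_cycle = [delay / cycles for _, delay, cycles in S3_MIX]
    return max(one_cycle_stages + s3_per_cycle)


def p_prime_speedup():
    """Steady-state speedup of P' over U when S3 is the bottleneck."""
    avg_s3_cycles = sum(frac * cycles for frac, _, cycles in S3_MIX)
    time_per_op = avg_s3_cycles * p_prime_cycle()
    return U_DELAY / time_per_op


def balanced_cycle():
    """Assumed stage delay of P": the total delay split evenly over 4 stages."""
    return U_DELAY / len(STAGE_DELAYS)


def fill_latency(cycle):
    """Time for one operation to pass through all four stages."""
    return len(STAGE_DELAYS) * cycle


if __name__ == "__main__":
    print("P' cycle:", p_prime_cycle())
    print("P' speedup over U:", round(p_prime_speedup(), 4))
    print("P\" speedup over U:", U_DELAY / balanced_cycle())
    print("latency of P:", fill_latency(max(STAGE_DELAYS)))
    print("latency of P\":", fill_latency(balanced_cycle()))
```

With these assumptions, S3 is occupied for 1.7 cycles of 2 units per operation on average, giving a speedup of about 2.35 for P'. The balanced design reaches a speedup of 4 with a stall-free stream, and its single-operation latency (8 units) matches U, whereas the unbalanced P needs 14 units.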

Comparison with parallel approaches
Another technique to improve efficiency through concurrency is parallel processing. The core difference is that parallel techniques usually duplicate function units and distribute multiple input tasks among them at once. They can therefore complete more tasks per unit time, but may incur higher resource costs.

For the previous example, the parallel technique duplicates each function unit into another two. Accordingly, all the tasks can be operated upon simultaneously by the duplicated function units with the same function. The time to complete these three tasks is reduced to three slots.
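The slot counts behind this comparison can be sketched as below. The example referred to is not included in this text, so the sketch assumes three tasks that each pass through a chain of three function units, with every function taking one time slot. The function names are hypothetical.

```python
import math


def sequential_slots(n_tasks, n_functions):
    """One chain of function units, one task at a time, no overlap."""
    return n_tasks * n_functions


def pipelined_slots(n_tasks, n_functions):
    """One chain of function units, a new task entering every slot."""
    return n_functions + n_tasks - 1


def parallel_slots(n_tasks, n_functions, n_copies):
    """n_copies duplicated chains, each processing whole tasks in turn."""
    batches = math.ceil(n_tasks / n_copies)
    return batches * n_functions


def function_units(n_functions, n_copies=1):
    """Hardware cost measured as the number of function units."""
    return n_functions * n_copies


if __name__ == "__main__":
    tasks, functions, copies = 3, 3, 3  # assumed example sizes
    print("sequential:", sequential_slots(tasks, functions))
    print("pipelined: ", pipelined_slots(tasks, functions))
    print("parallel:  ", parallel_slots(tasks, functions, copies))
    print("units, pipelined:", function_units(functions))
    print("units, parallel: ", function_units(functions, copies))
```

For these assumed sizes, the parallel version finishes in three slots against five for the pipeline, but it uses nine function units where the pipeline uses three, which is the resource cost mentioned above.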

Pipelining in FIR filters
Consider a 3-tap FIR filter:[1]

$$y(n) = a\,x(n) + b\,x(n-1) + c\,x(n-2)$$

which is shown in the following figure.

Assume the calculation time is $T_M$ for multiplication units and $T_A$ for add units. The critical path, representing the minimum time required to process a new sample, is limited by one multiplication and two add function units. Therefore, the sample period is given by

$$T_{\text{sample}} \geq T_M + 2T_A$$
[Figure: Pipelined FIR filters.png]
However, such a structure may not be suitable for designs that require high speed. To reduce the sampling period, we can introduce extra pipelining registers along the critical data path. The structure is then partitioned into two stages, and the data produced in the first stage is stored in the introduced registers, delaying it by one clock before the second stage. The data in the first three clocks is recorded in the following table. Under such a pipelined structure, the sample period is reduced to

$$T_{\text{sample}} \geq T_M + T_A.$$
[Figure: Pipelined FIR filters2.png]
[Table image: Pipelined FIR filters table.png]
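The effect of the extra registers can be checked with a small simulation. This is a sketch under assumptions: the filter is taken to be y(n) = a x(n) + b x(n-1) + c x(n-2) with coefficient names a, b, c chosen for illustration, and the registers are assumed to sit after the first adder, so that stage 1 contains one multiplication and one addition and stage 2 contains the final addition. The register placement in the original figure is not recoverable from this text, and the timing values in the usage example are made up.

```python
def fir_direct(x, a, b, c):
    """Unpipelined 3-tap FIR: y(n) = a*x(n) + b*x(n-1) + c*x(n-2)."""
    y = []
    for n in range(len(x)):
        x1 = x[n - 1] if n >= 1 else 0
        x2 = x[n - 2] if n >= 2 else 0
        y.append(a * x[n] + b * x1 + c * x2)
    return y


def fir_pipelined(x, a, b, c):
    """Two-stage FIR with pipeline registers between the stages."""
    reg_p = 0  # register holding a*x(n) + b*x(n-1) from the previous clock
    reg_q = 0  # register holding c*x(n-2) from the previous clock
    out = []
    for n in range(len(x)):
        # Stage 2: add the values latched on the previous clock.
        out.append(reg_p + reg_q)
        # Stage 1: multiply, do the first addition, then latch the results.
        x1 = x[n - 1] if n >= 1 else 0
        x2 = x[n - 2] if n >= 2 else 0
        reg_p = a * x[n] + b * x1
        reg_q = c * x2
    return out


def sample_period_direct(t_m, t_a):
    """Critical path of the unpipelined structure."""
    return t_m + 2 * t_a


def sample_period_pipelined(t_m, t_a):
    """Critical path is the slower of stage 1 and stage 2."""
    return max(t_m + t_a, t_a)


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    print(fir_direct(x, 2, 3, 5))
    print(fir_pipelined(x, 2, 3, 5))
    # Hypothetical timings: T_M = 10, T_A = 2.
    print(sample_period_direct(10, 2), sample_period_pipelined(10, 2))
```

The pipelined output is the direct output delayed by one clock, so the filter response is unchanged apart from one sample of added latency, while the minimum sample period drops from T_M + 2T_A to T_M + T_A.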
Pipelining in 1st-order IIR filters
By combining look-ahead techniques and pipelining,[2] we can increase the sample rate of the target design. Look-ahead pipelining adds canceling poles and zeros to the transfer function such that the coefficients of the following terms in the denominator of the transfer function are zero:

$$\{z^{-1}, \ldots, z^{-(M-1)}\}$$
Then, the output sample y(n) can be computed in terms of the inputs and the output sample y(n - M), so that there are M delay elements in the critical loop. These elements are then used to pipeline the critical loop by M stages, so that the sample rate can be increased by a factor of M.

Consider the 1st-order IIR filter transfer function

$$H(z) = \frac{1}{1 - a z^{-1}}$$
The output y(n) can be computed in terms of the input u(n) and the previous output:

$$y(n) = a\,y(n-1) + u(n)$$

In a straightforward structure for such a function, the sample rate of this recursive filter is restricted by the calculation time of one multiply-add operation.

To pipeline such a design, we observe that H has a pole at

$$z = a$$

Therefore, in a 3-stage pipelined equivalent stable filter, the transfer function can be derived by adding poles and zeros at

$$z = a\,e^{\pm j 2\pi/3}$$
and is given by

$$H(z) = \frac{1 + a z^{-1} + a^{2} z^{-2}}{1 - a^{3} z^{-3}}$$
Therefore, the corresponding sample rate can be increased by a factor of 3.
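The equivalence of the original recursion and its look-ahead form can be checked numerically. The sketch below assumes the recursion y(n) = a y(n-1) + u(n) with zero initial conditions, and compares it with the M-step look-ahead form y(n) = a^M y(n-M) + u(n) + a u(n-1) + ... + a^(M-1) u(n-M+1) for M = 3. The function names are hypothetical, and exact rational arithmetic is used only so that the two outputs can be compared for equality.

```python
from fractions import Fraction


def iir_direct(u, a):
    """First-order recursion y(n) = a*y(n-1) + u(n), zero initial state."""
    y = []
    prev = 0
    for sample in u:
        prev = a * prev + sample
        y.append(prev)
    return y


def iir_lookahead(u, a, m=3):
    """Look-ahead form: the feedback term uses y(n-m) instead of y(n-1)."""
    y = []
    for n in range(len(u)):
        # Feed-forward part: u(n) + a*u(n-1) + ... + a^(m-1)*u(n-m+1).
        acc = sum(a ** k * u[n - k] for k in range(m) if n - k >= 0)
        # Feedback part: a^m * y(n-m), available m samples earlier.
        if n >= m:
            acc += a ** m * y[n - m]
        y.append(acc)
    return y


if __name__ == "__main__":
    a = Fraction(1, 2)           # example pole inside the unit circle
    u = [1, 0, 0, 0, 2, 3, 0, 1]
    print(iir_direct(u, a) == iir_lookahead(u, a, m=3))
```

Because the feedback in the look-ahead form reaches back three samples, the loop contains three delay elements that can serve as pipeline registers, which is what allows the sample rate to rise by a factor of 3. The price is the extra feed-forward multiplications and additions in the numerator.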
