in CO and Architecture edited by
20,418 views
35 votes

A 4-stage pipeline has stage delays of $150$, $120$, $160$ and $140$ nanoseconds, respectively. Registers that are used between the stages have a delay of $5$ nanoseconds each. Assuming a constant clocking rate, the total time taken to process $1000$ data items on this pipeline will be:

  1. $\text{120.4 microseconds}$

  2. $\text{160.5 microseconds}$

  3. $\text{165.5 microseconds}$

  4. $\text{590.0 microseconds}$


4 Comments

In a pipelined processor, a clock signal is applied to each segment (a synchronised clock: the same signal reaches every segment at the same time). If the clock is positive-edge triggered, each segment starts its operation when the positive edge occurs. Since different segments take different times to complete their suboperations, the clock cycle must be chosen so that the data has reached the input of every segment before the next edge.

Here we cannot use a clock cycle time of less than 165 ns, because before 165 ns the data from segment 3 has not yet reached segment 4. It takes (160+5) ns for it to reach the input of segment 4, so take a clock cycle time of 165 ns. The 1st instruction then takes 165*4 ns and each of the remaining 999 instructions takes 165 ns. So the total time is 165*4 + 165*999 = 165495 ns.
5
5
$(k+n-1)\,t_p = (4+1000-1)\times 165 = 165{,}495$ ns $\approx 165.5$ microseconds, where $t_p = 160+5 = 165$ ns.
0
0
Is it the case that “if a constant clocking rate is mentioned, we take the maximum of all the stage delays and then calculate the run time; otherwise we compute the delay of each instruction by adding the individual stage delays”?
0
0

3 Answers

63 votes
Best answer
Pipelining requires all stages to be synchronized, meaning we have to make the delay of every stage equal to the maximum pipeline stage delay, which here is $160$ ns. We also have to add the intermediate register delay, which here is $5$ ns, making the clock period $165$ ns.

Time for execution of the first instruction $= 165 \times 4 = 660$ ns.

Now, in every $165$ ns, an instruction can be completed. So,

Total time for $1000$ instructions $= 660 + 999 \times 165 = 165{,}495$ ns $= 165.495$ microseconds

Correct Answer: $C$
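
The calculation above can be checked with a short Python sketch. The function name `pipeline_total_time_ns` and its parameters are hypothetical, chosen only for illustration. It assumes, as this answer does, one common clock whose period is the slowest stage delay plus the register delay.

```python
def pipeline_total_time_ns(stage_delays_ns, register_delay_ns, n_items):
    """Total time for n_items on a pipeline driven by one common clock."""
    k = len(stage_delays_ns)
    # Assumed clock period: slowest stage plus the inter-stage register delay.
    clock_ns = max(stage_delays_ns) + register_delay_ns
    # The first item needs k cycles; every later item completes one cycle
    # after the previous one, giving (k + n - 1) cycles in total.
    return (k + n_items - 1) * clock_ns


total_ns = pipeline_total_time_ns([150, 120, 160, 140], 5, 1000)
print(total_ns, "ns")  # 165495 ns, i.e. about 165.5 microseconds
```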

15 Comments

sir,

" Time for execution of the first instruction = (160+5) * 3 + 160 = 655 ns (5 ns for intermediate registers which is not needed for the final stage). "

the first instruction cycle is 165*4=660
0
0
@Arjun SIr..

One doubt: it is not mentioned here that the pipeline is synchronous. Then why are we considering every stage as taking 165 ns?
4
4

The question mentions it: "Assuming constant clocking rate".

Even otherwise, unless stated differently, we should assume this.

12
12
Okk..thank you
0
0

^@Shubham, 

If the question had said "Registers that are used at the end of every stage have a delay of 5 ns each", then the time needed for the first instruction would be 165*4 = 660 ns.

But here it is given that "Registers that are used between the stages have a delay of 5 nanoseconds each".

Am I right @Arjun sir .. ?

11
11
$n=1000$, $k=4$, $t_p=160+5=165$. Putting in the formula:

$(k+(n-1))\,t_p = (4+999)\times 165 = 165{,}495$ ns.
But we have to remove the 5 ns of the 4th stage for the first instruction. The final answer is 165.490 microseconds.
1
1
I don't think we need to subtract the 5ns for the first instruction because the calculation of the clock time period includes both maximum delay of a stage and delay of the register.

And even if we do not consider including the register delay in the calculation of the time period, it is for the last instruction whose register delay (of the last stage) would not be counted, not for the first instruction.
0
0
Every instruction following the first instruction is coming out after every 165ns, which includes the last instruction too.

We need to subtract the 5ns because there are 3 buffers between the 4 stages. There is no buffer after the 4th stage, and hence first instruction won't be needing the 5ns.
1
1
@Arjun sir, what is the reason for taking the maximum of the stage delays? My view is that an item cannot enter the next stage before completing the previous one, so we should consider each stage delay independently. Please explain.
0
0
@Arjun Sir,

as we don't have a register after the 4th stage, why are we taking 165 ns as the cycle time?

I think it should be 655 + 999*160.
1
1
Actually we have to consider the maximum delay and pipeline works in synchronous mode only. So, the first cycle time I calculated was wrong -- just fixed it.
2
2

@gatecse sir, are you sure that we have to consider $4\times 165=660$ and not $3\times 165+160=655.$ Because in questions where latches were involved I have solved using the latter case and not the former one because of the previous explanation given here.

Even I have made a comment here on this question solving using the latter case : https://gateoverflow.in/118719/gate2017-1-50?show=329342#c329342

Even I got a doubt because of these two different cases. I posted here : https://csedoubts.gateoverflow.in/21999/self-doubt-py-co-%26-architecture

2
2

@KUSHAGRA गुप्ता , any confirmation about your doubt?

I also believe this would be the correct way, as the registers are placed between stages: 3×165+160=655.
Also, when should we take the maximum of all the delays, and when should we consider the delay of each individual stage, while calculating the execution time?
Please help. @adad20

0
0

What if the maximum stage delay is at the last stage, i.e., the 160 ns delay is at the last stage instead of the second-to-last? Would we still add the register delay to 160? I ask because the question says “Registers that are used ‘between’ the stages”, not after the last stage.

In a test series solution it was not added, but I think it should be. Please clarify.

0
0

@ankit3009 Individual delays are considered when we are calculating EMAT for non-pipelined structure.

2
2
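
The comments above propose three different ways of counting, and a minimal Python sketch can put them side by side. The variable names and the helper `closest_option` are hypothetical. The sketch only evaluates the arithmetic quoted in the thread and does not settle which convention an examiner intends.

```python
STAGES_NS = [150, 120, 160, 140]
REGISTER_NS = 5
N_ITEMS = 1000
K = len(STAGES_NS)

slowest_ns = max(STAGES_NS)          # 160 ns
clock_ns = slowest_ns + REGISTER_NS  # 165 ns

# Convention 1 (accepted answer): every stage takes a full 165 ns clock.
full_clock_ns = K * clock_ns + (N_ITEMS - 1) * clock_ns
# Convention 2 (comment): no register after stage 4 for the first item.
first_item_short_ns = ((K - 1) * clock_ns + slowest_ns) + (N_ITEMS - 1) * clock_ns
# Convention 3 (comment): 655 ns for the first item, then 160 ns per item.
no_register_ns = ((K - 1) * clock_ns + slowest_ns) + (N_ITEMS - 1) * slowest_ns

OPTIONS_US = {1: 120.4, 2: 160.5, 3: 165.5, 4: 590.0}


def closest_option(total_ns):
    """Return the option number whose value is nearest to total_ns."""
    return min(OPTIONS_US, key=lambda o: abs(OPTIONS_US[o] - total_ns / 1000))


for label, total in [("full clock", full_clock_ns),
                     ("first item short", first_item_short_ns),
                     ("no register in steady state", no_register_ns)]:
    print(label, total, "ns -> option", closest_option(total))
```

The first two conventions differ by only 5 ns and both land on option 3. The third gives 160,495 ns, which matches option 2, so the choice of cycle time is what actually decides the answer.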
5 votes
The first instruction passes through all four stages (4 cycles), and each of the remaining 999 instructions completes one clock cycle after the previous one.

TT (total time) = 1 x (cycles for the first instruction) x (duration of each cycle) + 999 x (cycles per later instruction) x (duration of each cycle)

TT = 1 x 4 x (160+5) + 999 x 1 x 165 ns

TT = 165,495 ns

TT = 165.495 microseconds

Here the cycle time = max(150, 120, 160, 140) + register delay = 165 ns.
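
The same count can be reproduced by stepping a pipeline one clock at a time instead of using the closed-form expression. This is a hedged Python sketch: the function `simulate_pipeline_ns` is a hypothetical helper, and it assumes every stage advances on a common 165 ns clock.

```python
def simulate_pipeline_ns(stage_delays_ns, register_delay_ns, n_items):
    """Step a synchronous pipeline cycle by cycle and return the elapsed time."""
    k = len(stage_delays_ns)
    clock_ns = max(stage_delays_ns) + register_delay_ns
    stages = [None] * k  # stages[s] holds the item currently in stage s
    next_item = 0
    completed = 0
    cycles = 0
    while completed < n_items:
        # On each clock edge every item moves one stage forward.
        for s in range(k - 1, 0, -1):
            stages[s] = stages[s - 1]
        # Feed a new item into stage 0 while any remain.
        if next_item < n_items:
            stages[0] = next_item
            next_item += 1
        else:
            stages[0] = None
        cycles += 1
        # An item occupying the last stage finishes at the end of this cycle.
        if stages[k - 1] is not None:
            completed += 1
    return cycles * clock_ns


print(simulate_pipeline_ns([150, 120, 160, 140], 5, 1000))  # 165495
```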
2 votes
Delay of each register between stages is 5 ns.
Total stage delay in the pipeline = 150 + 120 + 160 + 140 = 570 ns
Total delay for one data item = 570 + 5*3 (note that there are 3 intermediate registers)
                              = 585 ns
For 1000 data items, the first item takes 585 ns to complete, and each of the remaining
999 items takes the maximum stage delay, 160 ns, plus the 5 ns register delay.

Total delay = 585 + 999*165 ns = 165,420 ns, which is closest to 165.5 microseconds among the options.

1 comment

Wrong
0
0