GATE CSE 2005 | Question: 68


Kathleen asked in CO and Architecture Sep 22, 2014 edited Aug 1, 2020 by KUSHAGRA गुप्ता

46,033 views

A $5$ stage pipelined CPU has the following sequence of stages:

IF – instruction fetch from instruction memory
RD – Instruction decode and register read
EX – Execute: ALU operation for data and address computation
MA – Data memory access – for write access, the register read at RD state is used.
WB – Register write back

Consider the following sequence of instructions:

$I_1$: $L$ $R0$, $loc1$; $R0 \Leftarrow M[loc1]$
$I_2$: $A$ $R0$, $R0$; $R0 \Leftarrow R0 +R0$
$I_3$: $S$ $R2$, $R0$; $R2 \Leftarrow R2-R0$

Let each stage take one clock cycle.

What is the number of clock cycles taken to complete the above sequence of instructions starting from the fetch of $I_1$?

A. $8$
B. $10$
C. $12$
D. $15$


Best answer

Answer is option A.

Without data forwarding:

13 clock cycles when the WB and RD stages do not overlap:

$$\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|c|c|} \hline
 & \textbf{T1} & \textbf{T2} & \textbf{T3} & \textbf{T4} & \textbf{T5} & \textbf{T6} & \textbf{T7} & \textbf{T8} & \textbf{T9} & \textbf{T10} & \textbf{T11} & \textbf{T12} & \textbf{T13} \\ \hline
I_1 & \text{IF} & \text{RD} & \text{EX} & \text{MA} & \text{WB} & & & & & & & & \\ \hline
I_2 & & \text{IF} & & & & \text{RD} & \text{EX} & \text{MA} & \text{WB} & & & & \\ \hline
I_3 & & & & & & \text{IF} & & & & \text{RD} & \text{EX} & \text{MA} & \text{WB} \\ \hline
\end{array}$$

Here the WB and RD stages operate in non-overlapping mode: a dependent instruction can run RD only in the cycle after the producer's WB.

11 clock cycles when the WB and RD stages overlap:

$$\begin{array}{|c|c|c|c|c|c|c|c|c|c|c|c|} \hline
 & \textbf{T1} & \textbf{T2} & \textbf{T3} & \textbf{T4} & \textbf{T5} & \textbf{T6} & \textbf{T7} & \textbf{T8} & \textbf{T9} & \textbf{T10} & \textbf{T11} \\ \hline
I_1 & \text{IF} & \text{RD} & \text{EX} & \text{MA} & \text{WB} & & & & & & \\ \hline
I_2 & & \text{IF} & & & \text{RD} & \text{EX} & \text{MA} & \text{WB} & & & \\ \hline
I_3 & & & & & \text{IF} & & & \text{RD} & \text{EX} & \text{MA} & \text{WB} \\ \hline
\end{array}$$

Split-phase access between WB and RD means:

The WB stage writes the register on the rising edge of the clock, and the RD stage reads it on the falling edge of the same cycle.
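The two schedules without forwarding can be reproduced with a small scheduler. This is only a sketch under stated assumptions: the function name `schedule_without_forwarding`, the `split_phase` flag, and the register encoding of the program are made up for illustration, each stage holds one instruction at a time, and a dependent instruction waits in front of RD until the producer's WB.

```python
STAGES = ["IF", "RD", "EX", "MA", "WB"]

# (destination register, source registers) for I1, I2, I3 from the question.
PROGRAM = [("R0", []), ("R0", ["R0"]), ("R2", ["R2", "R0"])]


def schedule_without_forwarding(program, split_phase):
    """Return, per instruction, the clock cycle in which each stage runs."""
    timeline = []
    for i, (_, sources) in enumerate(program):
        cycles = {}
        for s, stage in enumerate(STAGES):
            # Earliest cycle: one after this instruction's previous stage.
            earliest = cycles[STAGES[s - 1]] + 1 if s > 0 else 1
            if i > 0:
                prev = timeline[i - 1]
                if s + 1 < len(STAGES):
                    # A stage frees up when the previous instruction moves on.
                    earliest = max(earliest, prev[STAGES[s + 1]])
                else:
                    earliest = max(earliest, prev[stage] + 1)
            if stage == "RD":
                # RAW hazard: wait for the WB of every earlier producer.
                for j in range(i):
                    if program[j][0] in sources:
                        wb = timeline[j]["WB"]
                        earliest = max(earliest, wb if split_phase else wb + 1)
            cycles[stage] = earliest
        timeline.append(cycles)
    return timeline


for split_phase in (False, True):
    timeline = schedule_without_forwarding(PROGRAM, split_phase)
    print("split phase:", split_phase, "total cycles:", timeline[-1]["WB"])
```

With `split_phase=False` the last WB lands in T13, and with `split_phase=True` in T11, matching the two tables.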

The question mentions:

for write access, the register read at RD state is used.

This means that when a STORE writes a value to memory, it uses the register value read in the RD stage (no operand forwarding for STORE instructions).

Note

As with any question in any subject, unless otherwise stated we assume the best case. So let the stages overlap unless the question says otherwise. This applies only to WB/RD.

Why is there a stall for I2 in T3 and T4?
RD is instruction decode and register read. If RD of I2 ran in T3, the value loaded from memory would not yet be in R0, so the correct operand would not be available. I2 has to wait until I1 writes the loaded value back to R0.
Why do WB of I1 and RD of I2 operate in the same clock cycle?
If the question says nothing to the contrary, this is assumed by default. WB writes the register in the first half of the cycle and RD reads it in the second half, so the two stages can overlap.
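As a rough cross-check of the two counts without forwarding, assume the usual $k + n - 1$ cycles for $n$ instructions on a $k$-stage pipeline, plus stall cycles (the symbols $k$ and $n$ are introduced here only for illustration). Each of the two RAW dependencies (I1 to I2, I2 to I3) costs 3 stalls when WB and RD do not overlap and 2 stalls when they do:

$$\begin{aligned} \text{non-overlapping:} \quad & (5 + 3 - 1) + 2 \times 3 = 13 \\ \text{overlapping (split-phase):} \quad & (5 + 3 - 1) + 2 \times 2 = 11 \end{aligned}$$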

With data forwarding

(This should be the case here, since the question rules out operand forwarding only for the register value that a STORE instruction writes to memory.)

8 clock cycles

1. Why is there a stall for I2 in T4?
Data is forwarded from MA of I1 to EX of I2. The MA stage of I1 must complete before the correct data is available.
2. Why is RD of I2 in T3? Won't it fetch incorrect data if it runs before the operand is forwarded from MA of I1?
Yes, RD of I2 will fetch incorrect data in T3. That does not matter, because operand forwarding supplies the correct value to EX.
3. Why can't RD of I2 be placed in T4?
It can, but there is no benefit. Pipelining is a technique to reduce execution time, so there is no reason to add an extra stall. There is also a further problem, discussed in the next point: consider what would happen had we created a stall at T3.
4. Why can't RD of I3 be placed at T4?
This cannot be done. I3 cannot use the RD stage until the previous instruction I2 has moved on to its next stage (EX), because I2's data is still held in the stage buffers.
5. Can an operand be forwarded within the same clock cycle?
No, the producing clock cycle must complete before the data is forwarded, unless the split-phase technique is used.
6. Can't there be forwarding from the EX stage (T3) of I1 to the EX stage (T4) of I2?
This is not possible. I1 is a memory read, so its data is available only after the memory access and cannot be forwarded from EX of I1.
7. In some cases data is forwarded from MA and in others from EX. Why?
Data is forwarded as soon as it is ready, which depends solely on the type of instruction.
8. When to use split-phase?
We can use split-phase when the data is readily available, as between WB and RD, and also when operands are forwarded from EX to ID, but not from EX to EX. Split-phase access is not possible between EX and EX because the instruction's execution may not finish in the first phase. (This is not stated in any standard resource; it is Arjun Suresh's view, based on practical implementation and on how previous GATE questions have been framed.)

[Usually the question states that there is operand forwarding from stage A to stage B, e.g. https://gateoverflow.in/8218/gate2015-2_44]

Split-phase can be used even without operand forwarding, because the two are unrelated.

References

http://web.cs.iastate.edu/~prabhu/Tutorial/PIPELINE/forward.html


Arjun · Answer 1 · 2015-01-17T09:41:13+0000

For write access, the register read at the RD stage is used: this means a STORE instruction cannot receive its operand by forwarding and must use the value read in the RD stage. So we can assume data forwarding is possible for all other instructions.

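$$\begin{array}{|c|c|c|c|c|c|c|c|c|} \hline
 & \textbf{T1} & \textbf{T2} & \textbf{T3} & \textbf{T4} & \textbf{T5} & \textbf{T6} & \textbf{T7} & \textbf{T8} \\ \hline
I_1 & \text{IF} & \text{RD} & \text{EX} & \text{MA} & \text{WB} & & & \\ \hline
I_2 & & \text{IF} & \text{RD} & & \text{EX} & \text{MA} & \text{WB} & \\ \hline
I_3 & & & \text{IF} & & \text{RD} & \text{EX} & \text{MA} & \text{WB} \\ \hline
\end{array}$$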
MA -> EX forwarding done between I1 and I2
EX -> EX forwarding done between I2 and I3

Hence, answer will be 8.

http://www.cs.iastate.edu/~prabhu/Tutorial/PIPELINE/forward.html
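The 8-cycle schedule with forwarding can be checked with a small scheduler. This is a sketch under assumptions, not a model of any real CPU: the names `schedule_with_forwarding` and `READY_AFTER` are hypothetical, a load's result is taken to be forwardable after its MA stage and an ALU result after its EX stage, and a consumer's EX must start in a later cycle than the one in which the value became ready.

```python
STAGES = ["IF", "RD", "EX", "MA", "WB"]

# Stage at whose end each kind of instruction has its result ready to forward.
READY_AFTER = {"load": "MA", "alu": "EX"}

# (kind, destination register, source registers) for I1, I2, I3.
PROGRAM = [
    ("load", "R0", []),
    ("alu", "R0", ["R0"]),
    ("alu", "R2", ["R2", "R0"]),
]


def schedule_with_forwarding(program):
    """Return, per instruction, the clock cycle in which each stage runs."""
    timeline = []
    for i, (_, _, sources) in enumerate(program):
        cycles = {}
        for s, stage in enumerate(STAGES):
            # Earliest cycle: one after this instruction's previous stage.
            earliest = cycles[STAGES[s - 1]] + 1 if s > 0 else 1
            if i > 0:
                prev = timeline[i - 1]
                if s + 1 < len(STAGES):
                    # A stage frees up when the previous instruction moves on.
                    earliest = max(earliest, prev[STAGES[s + 1]])
                else:
                    earliest = max(earliest, prev[stage] + 1)
            if stage == "EX":
                # RAW hazard: EX waits until the forwarded value is ready.
                for j in range(i):
                    kind, dest, _ = program[j]
                    if dest in sources:
                        ready = timeline[j][READY_AFTER[kind]]
                        earliest = max(earliest, ready + 1)
            cycles[stage] = earliest
        timeline.append(cycles)
    return timeline


timeline = schedule_with_forwarding(PROGRAM)
for name, cycles in zip(["I1", "I2", "I3"], timeline):
    print(name, cycles)
print("total cycles:", timeline[-1]["WB"])
```

Under these assumptions I2 stalls once before EX (waiting for MA of I1), I3 is held one cycle in IF behind it, and the last WB falls in T8.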

amarVashishth · Answer 2 · 2015-12-25T06:52:26+0000


by GateAspirant999

commented Jun 18, 2017

I feel Hamacher's book says something different about point 6 in the answer

6. Cant there be a forwarding from EX stage(T3) of I1 to EX stage(T4) of I2 ?
This is not possible . See what is happening in I1 . It is Memory Read .So data will be available in register after memory read only .So data cannot be forwarded from EX of I1.

The 6th edition of Hamacher's book, Section 6.4 (Data Dependencies), gives the following instructions:

Add R2, R3, #100
Subtract R9, R2, #30

Without operand forwarding, the solution is given as follows:

With operand forwarding, it is given as follows:

The author says the ALU's output can be fed back to its input to achieve this.

Based on this, I feel it should be possible for EX of I2 to execute in T4. What am I missing here?

by blackcloud

commented Jun 23, 2019

I1: R0 <= M[loc1]

Let's break it down stage by stage.

Since the program runs on a pipelined processor, the PC received the address of this instruction while a previous instruction, say I0, was still executing.

Now it is I1's turn.

First, in the IF cycle, the instruction is fetched from the memory location pointed to by the PC.

Now, hypothetically, imagine the instruction is 0-101-11-10 (say, 8 bits).

In the RD phase it is decoded like this:

0 means direct addressing, i.e. loc1 = 10

101 means a load operation

11 means register R0

10 is the address of loc1 (we still don't know what data is in loc1)

No register needs to be read for this instruction, as you can see.

In the EX phase, had the address been indirect, indexed, or relative, the effective address would have been computed.

We still don't know what data is in loc1.

In the MA phase the actual load happens:

the data at memory location loc1 (i.e. 10) is fetched for R0.

So only after the MA phase do we have the correct value for R0.

That is why the operand is forwarded from MA and not from EX.

by blackcloud

commented Jun 23, 2019

There is a lot going on here.

Without data forwarding and without split phase, it is simply 13 clock cycles (fully understandable).

Without data forwarding but with split phase, it is 11 clock cycles.

Why?

Split phase means doing two different things in the two halves of one cycle: the first half is used to write and the second half to read.

We could do both in the first half, since we have separate hardware lines for reading and writing (suppose), but that would lead to reading a wrong or old value.

With split phase we use PIPO shift registers as buffers between two successive stages. This lets the processor perform a write and a read in one clock cycle.

Now coming to operand/data forwarding (8 clock cycles):

We have extra hardware, mostly comparators, to check for RAW dependencies.

The domain of an instruction is the set of registers on which the operation is performed.

The range of an instruction is the register to which the output is written.

So we can compare

range(instruction i-1) and domain(instruction i).

By default, assume this hardware is absent when operand forwarding is not used, and present when it is.

This is why, even though with operand forwarding we read the registers in T3, it does not matter: the extra circuitry checks for RAW at the same time, and since a RAW dependency exists, it inserts 1 stall cycle.

We forward dummy signals (all 0) to the next execute phase.

That is why, for I2, the EX phase does nothing in T4 after RD.
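The range/domain comparison described in this comment can be sketched as a set intersection between adjacent instructions. The function name `raw_hazards` and the tuple encoding of the program are assumptions made for illustration; real hazard-detection hardware compares register numbers with comparators rather than Python sets.

```python
# Each instruction: (name, range = registers written, domain = registers read).
PROGRAM = [
    ("I1", {"R0"}, set()),          # R0 <= M[loc1]
    ("I2", {"R0"}, {"R0"}),         # R0 <= R0 + R0
    ("I3", {"R2"}, {"R2", "R0"}),   # R2 <= R2 - R0
]


def raw_hazards(program):
    """Compare range(instruction i-1) with domain(instruction i)."""
    hazards = []
    for (prev_name, prev_range, _), (name, _, domain) in zip(program, program[1:]):
        shared = prev_range & domain
        if shared:
            # The later instruction reads a register the earlier one writes.
            hazards.append((prev_name, name, shared))
    return hazards


print(raw_hazards(PROGRAM))
```

For the sequence in the question this reports a RAW dependency on R0 between I1 and I2 and another on R0 between I2 and I3.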

AkshayBatheja1996 · Answer 3 · 2019-12-03T14:36:16+0000

Everywhere the explanation for this question is wrong. The correct explanation is:

OPERAND FORWARDING:

(RAW):

1. For LOAD instructions, data forwarding from EX fails: the operand becomes available only in the MA stage of the instruction. Here: I1 (MA) and I2 (RD).

2. For ALU-type instructions, the operand is available in the EX stage of the instruction. Here: I2 (EX) and I3 (RD).

This is the correct way to approach such questions.

Statement 1 above is the limitation of operand forwarding, because of which it cannot resolve all such dependencies without a stall.
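The two statements above reduce to simple arithmetic. A minimal sketch, assuming every dependency is between adjacent instructions so that stalls simply add up; the names `READY_STAGE`, `stalls_after` and `total_cycles` are hypothetical.

```python
STAGES = ["IF", "RD", "EX", "MA", "WB"]

# Stage in which each kind of producer has its result available to forward.
READY_STAGE = {"load": "MA", "alu": "EX"}


def stalls_after(producer_kind):
    """Stalls an immediately following dependent instruction needs before EX."""
    return STAGES.index(READY_STAGE[producer_kind]) - STAGES.index("EX")


def total_cycles(n_instructions, producer_kinds):
    """Ideal pipeline time (k + n - 1) plus the stalls of each dependency."""
    ideal = len(STAGES) + n_instructions - 1
    return ideal + sum(stalls_after(kind) for kind in producer_kinds)


# I1 (load) feeds I2, and I2 (ALU) feeds I3.
print(total_cycles(3, ["load", "alu"]))
```

The load costs one stall and the ALU instruction none, which gives 7 + 1 + 0 = 8 cycles for the sequence in the question.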
