There is a stall in the cycle where the first instruction (the load) is in its MA stage, because the ADD operation can be done successfully only if we have the values of both r2 and r0. Here we have no issue with r2, since it is being used for the first time, but r0 will be available only after the MA stage of the load instruction.
So even with operand forwarding enabled, the value of r0 is forwarded from the buffer that sits just after the MA stage, and it reaches the ADD instruction only in the cycle after the MA stage of the load instruction completes. That is why we cannot write the EX of the ADD instruction directly under the MA of the load instruction: forwarding happens only from the buffer after the MA stage.
In short, the EX stage of ADD is possible only once we have the value of r0, which we get after the MA stage of the load instruction. So 1 stall is necessary.
Had there been no operand forwarding, there would be 2 stalls instead of 1. Execution is still possible only once we have the values of both r0 and r2, and in that case the value of r0 is obtained only after the WB (writeback) stage of the load instruction.
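The counting above can be checked with a small sketch. It assumes a 5-stage pipeline (IF, ID, EX, MA, WB) with one cycle per stage, the load issued in cycle 1 and the ADD in cycle 2, and it follows the model used in this answer: the EX of ADD can start in the cycle right after the stage that makes r0 available. The function name `stalls_for_load_use` is made up for illustration.

```python
STAGES = ["IF", "ID", "EX", "MA", "WB"]


def stalls_for_load_use(forwarding: bool) -> int:
    """Count stall cycles for an ADD that uses r0 right after a load into r0.

    Assumption (taken from the explanation above, not a general rule):
    r0 is usable in the cycle after MA with forwarding, and in the cycle
    after WB without forwarding.
    """
    # The load issues in cycle 1, so IF=1, ID=2, EX=3, MA=4, WB=5.
    load_cycle = {stage: i + 1 for i, stage in enumerate(STAGES)}

    # The ADD issues one cycle later, so with no hazard its EX would fall
    # one cycle after the load's EX, i.e. in the same cycle as the load's MA.
    add_ex_ideal = load_cycle["EX"] + 1

    # Stage of the load after which r0 becomes available to the ADD.
    ready_after = "MA" if forwarding else "WB"

    # EX of ADD cannot start before the cycle following that stage.
    add_ex_actual = max(add_ex_ideal, load_cycle[ready_after] + 1)
    return add_ex_actual - add_ex_ideal


print("with forwarding:   ", stalls_for_load_use(True))
print("without forwarding:", stalls_for_load_use(False))
```

With forwarding the EX of ADD moves from cycle 4 to cycle 5, and without forwarding it moves to cycle 6. A pipeline whose register file cannot be written and read in the way assumed here would give a different count for the no-forwarding case.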