You have a single hidden-layer neural network for a binary classification task. The input is \(X \in \mathbb{R}^{n \times m}\), the output is \(\hat{y} \in \mathbb{R}^{1 \times m}\), and the true labels are \(y \in \mathbb{R}^{1 \times m}\). The forward propagation equations are: \begin{align*} z^{[1]} & = W^{[1]}X + b^{[1]} \\ a^{[1]} & = \sigma(z^{[1]}) \\ \hat{y} & = a^{[1]} \\ J & = -\frac{1}{m} \sum_{i=1}^{m} \left( y^{(i)} \log\hat{y}^{(i)} + (1 - y^{(i)}) \log(1 - \hat{y}^{(i)}) \right) \end{align*} Write the expression for \(\frac{\partial J}{\partial W^{[1]}}\) as a matrix product of two terms.

A) $\frac{\partial J}{\partial W^{[1]}} = X \cdot (\hat{y} - y)^T$

B) $\frac{\partial J}{\partial W^{[1]}} = (\hat{y} - y) \cdot X^T$

C) $\frac{\partial J}{\partial W^{[1]}} = X^T \cdot (\hat{y} - y)$

D) $\frac{\partial J}{\partial W^{[1]}} = (\hat{y} - y) \cdot \sigma'(z^{[1]}) \cdot X^T$

1 Answer

The answer is B), not C). Because the output is a sigmoid fed into the cross-entropy loss, the factor \(\sigma'(z^{[1]}) = \hat{y}(1-\hat{y})\) cancels in the chain rule: \[\frac{\partial J}{\partial z^{[1]}} = \frac{1}{m}(\hat{y} - y).\] Back-propagating one step further to the weights gives \[\frac{\partial J}{\partial W^{[1]}} = \frac{1}{m}(\hat{y} - y) \cdot X^T,\] which is option B) up to the constant \(\frac{1}{m}\). A shape check confirms it: \((1 \times m) \cdot (m \times n) = 1 \times n\), the shape of \(W^{[1]}\). Option C) is not even a valid product, since \(X^T\) is \(m \times n\) and \((\hat{y} - y)\) is \(1 \times m\); option D) double-counts \(\sigma'(z^{[1]})\), which is already absorbed into \((\hat{y} - y)\).
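
If you want to convince yourself numerically, here is a minimal NumPy sketch (my own toy data and helper names, not part of the question) that compares the analytic gradient \(\frac{1}{m}(\hat{y} - y) \cdot X^T\) against a central finite-difference estimate:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss(W, b, X, y):
    # Forward pass: y_hat = sigma(W X + b), then mean cross-entropy.
    # np.mean over the (1, m) array already supplies the 1/m factor.
    y_hat = sigmoid(W @ X + b)
    return -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

rng = np.random.default_rng(0)
n, m = 4, 10                                # n features, m examples (toy sizes)
X = rng.standard_normal((n, m))
y = (rng.random((1, m)) > 0.5).astype(float)
W = rng.standard_normal((1, n))
b = rng.standard_normal((1, 1))

# Analytic gradient: the sigmoid derivative cancels against cross-entropy.
y_hat = sigmoid(W @ X + b)
dW = (y_hat - y) @ X.T / m                  # shape (1, n), same as W

# Finite-difference check, one weight entry at a time.
eps = 1e-6
dW_num = np.zeros_like(W)
for j in range(n):
    Wp, Wm = W.copy(), W.copy()
    Wp[0, j] += eps
    Wm[0, j] -= eps
    dW_num[0, j] = (loss(Wp, b, X, y) - loss(Wm, b, X, y)) / (2 * eps)

print(np.max(np.abs(dW - dW_num)))          # maximum discrepancy
```

The printed discrepancy should be tiny (around 1e-9 or smaller), confirming that \((\hat{y} - y) \cdot X^T\) is the correct matrix product, while the products in options A) and C) come out with the wrong shape or are undefined.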

1 comment

Correct. There is nothing hard in this question if somebody has gone through the mathematical derivation of the network, but I don't think the DA paper would have this type of question.
