Let $f: \mathbb{R} \rightarrow \mathbb{R}$ be the function $f(x)=\frac{1}{1+e^{-x}}$.

The value of the derivative of $f$ at $x$ where $f(x)=0.4$ is $\_\_\_\_\_\_\_$ (rounded off to two decimal places).

Note: $\mathbb{R}$ denotes the set of real numbers.

1 Answer

Here, $f(z)=\frac{1}{1+e^{-z}}$

$f'(z) =\frac{e^{-z}}{(1+e^{-z})(1+e^{-z})}=\frac{1}{(1+e^{-z})}\frac{1+e^{-z}-1}{(1+e^{-z})}$

$f'(z)=\frac{1}{(1+e^{-z})} \left(1-\frac{1}{1+e^{-z}}\right)$

$f'(z)=f(z)(1-f(z))$

Say at some $z_0$, $f(z_0)=0.4$. Then

$f'(z_0)=f(z_0)(1-f(z_0))=0.4 \times 0.6= 0.24$
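
A quick numerical check of this value (a minimal Python sketch; the point with $f(x_0)=0.4$ is $x_0 = \ln(0.4/0.6)$):

```python
import math

def f(x):
    # Logistic (sigmoid) function f(x) = 1 / (1 + e^(-x))
    return 1.0 / (1.0 + math.exp(-x))

# The point x0 with f(x0) = 0.4: solving 1/(1+e^(-x0)) = 0.4 gives x0 = ln(0.4/0.6)
x0 = math.log(0.4 / 0.6)

# Derivative via the identity f'(x) = f(x) * (1 - f(x))
analytic = f(x0) * (1 - f(x0))

# Central-difference numerical derivative as an independent check
h = 1e-6
numeric = (f(x0 + h) - f(x0 - h)) / (2 * h)

print(round(analytic, 2), round(numeric, 2))  # both give 0.24
```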

$\textbf{Note:}$

$(1)$ This function $f(z)$ appears in most textbooks because it is a very well-known sigmoid activation function called the logistic function. The logistic function is non-linear and is used in settings such as logistic regression and artificial neural networks.

$(2)$ The way we have written the derivative, $f'(z)=f(z)(1-f(z))$, has a special purpose. For large $|z|$, $f'(z) \rightarrow 0$, so gradient descent produces very slow weight updates; this is the vanishing gradient problem encountered while minimizing the loss function. ReLU (Rectified Linear Unit) is resistant to this problem.
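
To see this saturation numerically, here is a small sketch using the identity above:

```python
import math

def f(z):
    return 1.0 / (1.0 + math.exp(-z))

# f'(z) = f(z)(1 - f(z)) peaks at z = 0 and shrinks rapidly as |z| grows,
# so a saturated sigmoid passes back almost no gradient.
for z in [0, 2, 5, 10]:
    s = f(z)
    print(z, s * (1 - s))
# 0  -> 0.25
# 2  -> ~0.105
# 5  -> ~0.0066
# 10 -> ~4.5e-05
```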

$(3)$ To remove the $f'(z)$ factor from the picture, we use the chain rule and write the loss function in a different way, which is called the Binary Cross Entropy Loss in the context of logistic regression.
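
As a sketch of why the $f'(z)$ factor drops out: for a single example with label $y \in \{0,1\}$ and prediction $\hat{y}=f(z)$, the binary cross entropy loss is

$L = -\left[\, y \ln f(z) + (1-y)\ln\big(1-f(z)\big) \,\right]$

and by the chain rule

$\dfrac{\partial L}{\partial z} = \dfrac{\partial L}{\partial f}\, f'(z) = \dfrac{f(z)-y}{f(z)\big(1-f(z)\big)} \cdot f(z)\big(1-f(z)\big) = f(z)-y,$

so the gradient does not vanish merely because the sigmoid saturates.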
