Yeah. But I was thinking: what about degrees of freedom (DOF)? Take a feature D1 that takes V values (say 1 to V), with K classes. For k=1 we need to estimate P(v=1|k=1), P(v=2|k=1), … and so on up to P(v=V|k=1). That's V values for each of the K classes, so V*K for the single feature D1, and K*V*D for D features. For the prior probabilities of the classes we need P(k=1), P(k=2), … up to P(k=K), which is K values in total, bringing the answer to K*V*D + K. But for feature D1, the values P(v=1|k=1), P(v=2|k=1), … P(v=V|k=1) sum to 1, so we only need (V-1) of them: the last value is 1 minus the sum of all the others, which costs a single DOF. So for one feature we need K*(V-1), for all features K*D*(V-1), and similarly only K-1 values for the priors. So I thought the answer might be K*D*(V-1) + (K-1) instead of K*V*D + K. But the options in the paper have K*V*D + K. Sorry if I'm wrong, and I'm hoping someone can help me before the exam. Best of luck to y'all.
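
To sanity-check the two counts, here's a quick Python sketch. The function names are my own, and it assumes every one of the D features takes exactly V values, as in the setup above. It just evaluates both formulas and shows how far apart they are.

```python
def table_entries(K: int, V: int, D: int) -> int:
    """Count every stored probability: K*V*D conditionals plus K priors."""
    return K * V * D + K


def free_parameters(K: int, V: int, D: int) -> int:
    """Count independent parameters after the sum-to-1 constraints.

    Each of the K*D conditional distributions loses one DOF (V-1 free values),
    and the prior over K classes loses one DOF (K-1 free values).
    """
    return K * D * (V - 1) + (K - 1)


if __name__ == "__main__":
    K, V, D = 3, 4, 5  # arbitrary illustrative sizes, not from the exam paper
    entries = table_entries(K, V, D)
    free = free_parameters(K, V, D)
    print("table entries:  ", entries)
    print("free parameters:", free)
    # The gap is one constraint per (class, feature) pair plus one for the prior.
    print("difference:     ", entries - free, "= K*D + 1 =", K * D + 1)
```

The difference is always K*D + 1, i.e. one sum-to-1 constraint for each class/feature pair plus one for the prior, so which answer is "right" comes down to whether the question wants stored probabilities or independent parameters.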