Dec 3, 2018
Hi Thierry,
Sorry about the long delay in replying. If I understand correctly, your question is how to measure KL divergence when q(xi) = 0 for some xi. One solution I can think of is applying “Laplace smoothing” to your P and Q values, which avoids the division by zero. Here’s an interesting read about this problem that I found: https://www.reddit.com/r/MachineLearning/comments/2wb8y0/how_to_calculate_kullbackleibner_divergence_when/
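To make the idea concrete, here is a minimal sketch of what I mean. It assumes your P and Q come from raw counts over the same set of outcomes; the function names, the example counts, and the pseudo-count alpha = 1 are all just illustrative choices of mine, not something from your setup.

```python
import math

def laplace_smooth(counts, alpha=1.0):
    # Add a pseudo-count alpha to every bin, then normalize,
    # so that no outcome ends up with probability exactly zero.
    total = sum(counts) + alpha * len(counts)
    return [(c + alpha) / total for c in counts]

def kl_divergence(p, q):
    # D_KL(P || Q) = sum_i p_i * log(p_i / q_i), in nats.
    # Terms with p_i == 0 contribute nothing and are skipped.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical counts; the second bin of Q is empty,
# so the unsmoothed q(xi) would be 0 there.
p_counts = [3, 2, 5]
q_counts = [4, 0, 6]

p = laplace_smooth(p_counts)
q = laplace_smooth(q_counts)
kl = kl_divergence(p, q)
print(kl)
```

Keep in mind that smoothing changes the distributions slightly, so the result depends on the alpha you pick. A smaller alpha stays closer to the raw frequencies but makes the divergence larger wherever Q had an empty bin.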
Hope this helps!