I have a reference distribution R from which I am sampling to create distributions of different sample sizes. These sampled distributions have the same dimensions as R but different numbers of data points. When I calculate the KL divergence of a sampled distribution P against R, the result depends on the sample size: when the sample size is large, the KL divergence is near 0, and when it is small, the KL divergence is large. So the nonzero KL values come purely from the limited sample size. How can I eliminate this bias and get a corrected KL divergence? Alternatively, I would be fine with keeping the uncorrected KL but quantifying its uncertainty due to the small sample size.
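Here is a minimal sketch of what I am doing. I work with discrete/binned distributions; the bin count, the Dirichlet-generated R, and the sample sizes below are placeholders for my actual data:

```python
import numpy as np
from scipy.stats import entropy  # entropy(pk, qk) = sum(pk * log(pk / qk))

rng = np.random.default_rng(0)

# Illustrative reference distribution R over k bins (placeholder for my real R).
k = 20
R = rng.dirichlet(np.ones(k))

def kl_from_sample(sample, R):
    """Plug-in KL(P_hat || R) from an i.i.d. sample of bin indices."""
    counts = np.bincount(sample, minlength=len(R))
    P_hat = counts / counts.sum()
    # scipy treats 0 * log(0 / r) as 0, so empty bins contribute nothing.
    return entropy(P_hat, R)

for n in (50, 500, 5000, 50000):
    sample = rng.choice(len(R), size=n, p=R)
    print(f"n = {n:6d}   KL(P || R) = {kl_from_sample(sample, R):.4f}")
# Even though every sample is drawn from R itself, small n gives KL well
# above 0; the estimate only approaches 0 as n grows.
```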
I have tried the Miller–Madow correction, but I am not sure how to carry it over from entropy to KL divergence; my attempt is sketched below.
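Since KL(P || R) = -H(P) + H(P, R), and the cross-entropy term H(P, R) is linear in the estimated P (so it should be unbiased when R is known exactly, as it is here), I assumed the whole small-sample bias sits in the entropy term and applied Miller–Madow to it alone. That amounts to subtracting (m - 1) / (2N) from the plug-in KL, where m is the number of occupied bins and N is the sample size, but I am not sure this adaptation is valid:

```python
import numpy as np
from scipy.stats import entropy

def kl_miller_madow(sample, R):
    """Plug-in KL(P_hat || R) with my attempted Miller-Madow correction.

    Miller-Madow corrects the plug-in entropy estimate:
        H_mm = H_plugin + (m - 1) / (2 * N),
    where m is the number of occupied bins and N is the sample size.
    Writing KL(P || R) = -H(P) + H(P, R) with R known exactly, I apply
    the correction to the entropy term only, i.e. I subtract
    (m - 1) / (2 * N) from the plug-in KL.
    """
    counts = np.bincount(sample, minlength=len(R))
    N = counts.sum()
    m = np.count_nonzero(counts)
    P_hat = counts / N
    return entropy(P_hat, R) - (m - 1) / (2 * N)

# Comparing the plug-in and corrected estimates at a small sample size:
rng = np.random.default_rng(0)
R = rng.dirichlet(np.ones(20))
sample = rng.choice(len(R), size=50, p=R)
counts = np.bincount(sample, minlength=len(R))
print("plug-in:  ", entropy(counts / counts.sum(), R))
print("corrected:", kl_miller_madow(sample, R))
```

Is this the right way to apply Miller–Madow to KL, or is there a better-established correction (or uncertainty estimate) for this situation?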