GSoC’21 Blog 4

Rishabh Sanjay
3 min read · Aug 7, 2021

Over the last two weeks, I worked on the API design for the ECDF plots and fixed a few nitpicks in the dot plot implementation.

Week 7

This week was mainly about fixing bugs and making documentation changes in the dot plot PR so that it could finally be merged. After my mentors and I went through the code multiple times and made all the necessary changes, the dot plot PR was merged. I also discussed the API design for the ECDF plots and the performance of some functions that would be used in the ECDF computation.

Week 8

This week, after a lot of discussion, I opened a PR for the initial prototype of ECDF plots.

I still need to make a lot of changes, add tests and documentation, and write the code for the Bokeh backend. If time permits, I will also look into Numba to accelerate some of the computation.

Let’s see how this plot works.

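import arviz as az
from scipy.stats import norm

# Draw 1000 values from a standard normal and keep the same distribution as a reference
sample = norm(0, 1).rvs(1000)
distribution = norm(0, 1)

# Plot the ECDF of the sample on its own
az.plot_ecdf(sample)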

The above script plots the ECDF of the sample. The following calls additionally compare the sample against a reference distribution:

az.plot_ecdf(sample, distribution=distribution.cdf, confidence_bands=True)
az.plot_ecdf(sample, distribution=distribution.cdf, confidence_bands=True, difference=True)
az.plot_ecdf(sample, distribution=distribution.cdf, confidence_bands=True, difference=False, pit=True)
az.plot_ecdf(sample, distribution=distribution.cdf, confidence_bands=True, difference=True, pit=True)
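To make the difference and pit options a bit more concrete, here is a minimal NumPy/SciPy sketch of the quantities I understand them to display. It is not the code from the PR: the ecdf helper, the evaluation grid, and the fixed seed are my own assumptions for illustration, and the confidence bands are left out.

```python
import numpy as np
from scipy.stats import norm


def ecdf(sample, eval_points):
    """Fraction of sample values less than or equal to each evaluation point.

    Hypothetical helper for illustration, not part of ArviZ.
    """
    sorted_sample = np.sort(sample)
    counts = np.searchsorted(sorted_sample, eval_points, side="right")
    return counts / len(sorted_sample)


rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
sample = rng.normal(0, 1, size=1000)
distribution = norm(0, 1)

# Plain ECDF, evaluated on a grid that spans the sample
eval_points = np.linspace(sample.min(), sample.max(), 200)
ecdf_values = ecdf(sample, eval_points)

# "difference": ECDF minus the reference CDF, which stays near zero for a good match
ecdf_difference = ecdf_values - distribution.cdf(eval_points)

# "pit": push the sample through the reference CDF; if the sample really comes
# from that distribution, the transformed values are roughly uniform on [0, 1]
pit_values = distribution.cdf(sample)
pit_points = np.linspace(0, 1, 200)
pit_ecdf = ecdf(pit_values, pit_points)

# "difference" together with "pit": compare against the uniform CDF, F(u) = u
pit_difference = pit_ecdf - pit_points
```

Plotting ecdf_difference against eval_points, or pit_difference against pit_points, gives a curve that hovers around zero when the sample matches the reference distribution.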

Similar graphs are produced when comparing to values2 (a second set of values) instead of a distribution. Also, if we provide a distribution or values2 drawn from a different distribution, the plots show that the sample does not belong to that distribution.

For example:

sample = norm(0, 1).rvs(1000)
distribution = norm(0, 2)
az.plot_ecdf(sample, distribution=distribution.cdf, confidence_bands=True, difference=True)
az.plot_ecdf(sample, distribution=distribution.cdf, confidence_bands=True, difference=True, pit=True)
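The mismatch can also be put into a single number. As a rough sketch, and as a swapped-in technique rather than anything plot_ecdf does, SciPy's Kolmogorov-Smirnov test reports the largest absolute gap between the ECDF and a reference CDF. The seed and variable names here are assumptions for illustration.

```python
import numpy as np
from scipy.stats import kstest, norm

rng = np.random.default_rng(0)  # fixed seed so the sketch is reproducible
sample = rng.normal(0, 1, size=1000)

# The KS statistic is the largest absolute gap between the ECDF and the reference CDF
matched = kstest(sample, norm(0, 1).cdf)     # same distribution as the sample
mismatched = kstest(sample, norm(0, 2).cdf)  # wrong scale, as in the example above

print(f"norm(0, 1): max |ECDF - CDF| = {matched.statistic:.3f}")
print(f"norm(0, 2): max |ECDF - CDF| = {mismatched.statistic:.3f}")
```

The gap should come out clearly larger for norm(0, 2), which is the same departure that the difference plots show leaving the confidence bands.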

There are many more arguments you can use to play with the plot, and I might add some more.

Let’s meet again in 2 weeks. Have a nice day! 😄

Rishabh Sanjay

Maths and Computing at IIT Kanpur, Deep Learning | NLP | Bayesian Statistics enthusiast