Measuring association: from variance to information

Interactive demo (mean, variance, covariance, correlation, entropy, KL divergence, mutual information, total correlation)

Key idea
If knowing one variable reduces our uncertainty about the other, the variables share information.
Surprise:
$$s(x)=-\log_2 p(x)$$
Entropy:
$$H(X)=-\sum_x p(x)\log_2 p(x)$$
KL divergence:
$$D_{KL}(P\|Q)=\sum_x p(x)\log_2\frac{p(x)}{q(x)}$$
Mutual information:
$$I(X;Y)=D_{KL}(P_{XY}\,\|\,P_X P_Y)$$
Total correlation (two variables):
$$TC(X,Y)=H(X)+H(Y)-H(X,Y)=I(X;Y)$$
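The definitions above can be checked numerically. This is a minimal sketch in pure Python on a hypothetical 2×2 joint distribution (the values are illustrative, not from the demo); it computes entropy, the KL divergence against the independence model, and confirms that the total-correlation form equals the KL form of mutual information.

```python
import math

# Hypothetical 2x2 joint distribution P(X, Y), chosen for illustration.
P = [[0.4, 0.1],
     [0.1, 0.4]]

px = [sum(row) for row in P]                              # marginal P(X)
py = [sum(P[i][j] for i in range(2)) for j in range(2)]   # marginal P(Y)

def entropy(dist):
    """H = -sum p log2 p, skipping zero-probability cells."""
    return -sum(p * math.log2(p) for p in dist if p > 0)

H_X = entropy(px)
H_Y = entropy(py)
H_XY = entropy([p for row in P for p in row])

# KL divergence from the independence model Q(X, Y) = P(X) P(Y)
kl = sum(P[i][j] * math.log2(P[i][j] / (px[i] * py[j]))
         for i in range(2) for j in range(2) if P[i][j] > 0)

# Total-correlation form of mutual information
mi = H_X + H_Y - H_XY

assert abs(mi - kl) < 1e-12   # the two definitions of I(X; Y) agree
```

Both routes give roughly 0.278 bits here: knowing X removes about a quarter of a bit of uncertainty about Y.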
Note: with quantile binning, the marginals are (approximately) uniform by construction, so $P(X)$ and $P(Y)$ will look flat even for strongly non-Gaussian data.
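The flat-marginals effect is easy to reproduce. A small sketch with NumPy (bin count and sample size are arbitrary choices, not the demo's settings): bin a strongly non-Gaussian sample by its own quantiles and the per-bin mass comes out nearly uniform.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.exponential(size=10_000)   # strongly non-Gaussian (heavy right tail)

n_bins = 8
# Quantile edges place (approximately) the same number of points in each bin.
edges = np.quantile(x, np.linspace(0, 1, n_bins + 1))
bins = np.digitize(x, edges[1:-1])          # bin index 0 .. n_bins-1 per point

counts = np.bincount(bins, minlength=n_bins)
marginal = counts / counts.sum()
# Every entry of `marginal` is close to 1/8: P(X) looks flat by construction.
```

Equal-width bins on the same sample would instead pile most of the mass into the first bin or two, which is exactly the distortion quantile binning trades away.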

Scatter plot (axes locked to a square aspect ratio)
Stacked panels: the observed joint $P(X,Y)$ and the independence model $Q(X,Y)=P(X)P(Y)$, on the same colour scale. Cell values are probabilities rounded to two decimals.
Marginal: X
Marginal: Y
Observed joint $P(X,Y)$ (binned)
Independence model $Q(X,Y)=P(X)P(Y)$
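The independence model shown in the second panel is just the outer product of the two marginals. A sketch with NumPy on a hypothetical 3×3 binned joint (illustrative values), printed as probabilities with two decimals in the same spirit as the demo:

```python
import numpy as np

# Hypothetical observed joint P(X, Y) on a 3x3 grid, for illustration.
P = np.array([[0.20, 0.05, 0.05],
              [0.05, 0.10, 0.05],
              [0.05, 0.05, 0.40]])

px = P.sum(axis=1)      # marginal P(X): sum over Y
py = P.sum(axis=0)      # marginal P(Y): sum over X
Q = np.outer(px, py)    # independence model Q(X, Y) = P(X) P(Y)

# Render both grids as probabilities with two decimals.
for label, grid in (("P", P), ("Q", Q)):
    for row in grid:
        print(label, " ".join(f"{v:.2f}" for v in row))
```

Wherever $P$ exceeds $Q$, the two variables co-occur more often than independence predicts; summing $p\log_2(p/q)$ over the cells turns that cell-by-cell excess into the mutual information.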