Information Theory

Kullback-Leibler Divergence

  • Measures how much one probability distribution diverges from another (a measure of dissimilarity, not similarity)
    • E.g., applied to the joint probability density function vs. the product of the individual (marginal) density functions: if the two are the same, the divergence is 0 and the variables are independent
  • Also see Statistical Rethinking >> Chapter 7
  • Example: Measuring Segregation (link)
    \[ L_u = \sum_{g=1}^G p_{g|u} \log \frac{p_{g|u}}{p_g} \]
    • \(p_{g|u}\) is the proportion of a racial group, \(g\), in a neighborhood, \(u\)
    • \(p_g\) is the overall proportion of that racial group in the metropolitan area
    • This sums the scores across all racial groups in neighborhood \(u\) (a quick numeric sketch follows this list)
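
A minimal numeric sketch of \(L_u\). The neighborhood and metro proportions are made-up values for illustration; `scipy.stats.entropy(p, q)` returns the same KL divergence, so it serves as a cross-check:

```python
import numpy as np
from scipy.stats import entropy

# Hypothetical racial-group proportions (made-up numbers for illustration)
p_g_given_u = np.array([0.70, 0.20, 0.10])  # composition of one neighborhood u
p_g = np.array([0.40, 0.35, 0.25])          # composition of the whole metro area

# L_u = sum_g p_{g|u} * log(p_{g|u} / p_g) -- the KL divergence of the
# neighborhood's composition from the metro-wide composition (in nats)
L_u = np.sum(p_g_given_u * np.log(p_g_given_u / p_g))

# Cross-check: scipy's entropy(p, q) computes the same KL divergence
assert np.isclose(L_u, entropy(p_g_given_u, p_g))
print(L_u)  # 0 only if the neighborhood exactly mirrors the metro composition
```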

Mutual Information

  • Measures how dependent two random variables are on one another
    • Accounts for linear and non-linear dependence
  • If the mutual information is 0, the variables are independent; otherwise there is some dependence.
  • Example: Measuring Segregation
    \[ M = \sum_{u=1}^U p_u L_u \]
    • \(L_u\) : See example in Kullback-Leibler Divergence section
    • \(p_u\) : Described as the “size of the neighborhood”
      • Most likely the neighborhood’s share of the metropolitan population, so the weights sum to 1 and \(M\) is a true mutual information; using raw counts instead would only rescale \(M\) by the metro’s total population
    • This sums the scores across all neighborhoods in a metropolitan area (see the sketch after this list)
      • So the neighborhood scores are weighted by neighborhood population share and summed into an overall metropolitan score
      • The maximum attainable value of \(L_u\) depends on the smallest overall racial proportion in that metropolitan area (see article), so unless those proportions match, you can’t compare metropolitan areas with this number. You can, however, use it to see how a metro’s (or neighborhood’s) diversity has changed over time.
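
A minimal sketch of the metro-level score, again with made-up proportions. It weights each neighborhood’s \(L_u\) by its population share \(p_u\), and cross-checks that \(M\) equals the mutual information between the “neighborhood” and “group” variables:

```python
import numpy as np

# Hypothetical metro: rows = neighborhoods, columns = racial groups.
# Entries are joint proportions p(u, g) of the metro population (sum to 1).
p_ug = np.array([
    [0.28, 0.08, 0.04],   # neighborhood 1
    [0.06, 0.21, 0.03],   # neighborhood 2
    [0.06, 0.06, 0.18],   # neighborhood 3
])

p_u = p_ug.sum(axis=1)             # neighborhood population shares
p_g = p_ug.sum(axis=0)             # metro-wide group proportions
p_g_given_u = p_ug / p_u[:, None]  # each neighborhood's composition

# Neighborhood scores L_u and the metro score M = sum_u p_u * L_u
L_u = np.sum(p_g_given_u * np.log(p_g_given_u / p_g), axis=1)
M = np.sum(p_u * L_u)

# Same quantity written as mutual information between neighborhood and group:
# I(U; G) = sum_{u,g} p(u,g) * log( p(u,g) / (p(u) * p(g)) )
I_ug = np.sum(p_ug * np.log(p_ug / np.outer(p_u, p_g)))
assert np.isclose(M, I_ug)
print(M)  # 0 only if every neighborhood mirrors the metro-wide composition
```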