How to Read the News like a Bayesian

One of the somewhat odd morning rituals I've developed is what I would call Bayesian media analysis. More plainly: skeptically reading the news. I wake up in the morning and read the news, not so much to find out "what's happening" as to find out what I'm being told is happening. I enjoy reading articles to see if anything stands out as inconsistent with the main story, and seeing how I might glean important information even in a noisy media environment.

As I touched on in the last post about warming weather in NJ, one of the reasons that I advocate learning probability and statistics is that it is essential to understanding our everyday world. This is a major reason why I wrote Bayesian Statistics the Fun Way, to give these skills to a broader audience of people.

This morning I came across an article that is another great example of why this is so important.

The article was from NPR titled:

"The ‘Big Sort’ again: Americans are fleeing to places where political views match their own"

You can learn a lot more from the news if you read it like a Bayesian!

Let's take a look at how even the simplest bit of probability and a touch of Bayes' Theorem can help us understand what the news is really telling us.

Reading the News like a Bayesian

People often ask me about what sophisticated tools I might use for day-to-day statistics work, and I often tell them that the most important tool is not Stan or PyMC, or even JAX (which are all fantastic): it's thinking about the world in terms of Bayes' theorem.

Let's take a quick look at the form of this we'll be using to read this article:

$$P(\text{Hypothesis}|\text{Data}) \propto P(\text{Hypothesis}) \times P(\text{Data}|\text{Hypothesis})$$

Or more generally in Bayesian terms:

$$\text{Posterior Belief}\propto \text{Prior Beliefs} \times \text{Likelihood}$$

Or in plain English:

My beliefs given the evidence shown are proportional to how strongly I held those beliefs to begin with and how well the evidence presented is explained by those beliefs.
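To make the proportionality concrete, here is a minimal sketch in Python. The function name `posterior` and every number in it are assumptions made up for illustration; nothing here comes from the article itself. It multiplies each prior by its likelihood and then rescales so the beliefs sum to 1.

```python
def posterior(priors, likelihoods):
    """Return normalized beliefs: prior * likelihood, rescaled to sum to 1."""
    unnormalized = {h: priors[h] * likelihoods[h] for h in priors}
    total = sum(unnormalized.values())
    return {h: weight / total for h, weight in unnormalized.items()}

# Made-up numbers, for illustration only.
priors = {"H": 0.6, "not_H": 0.4}        # how strongly I held each belief
likelihoods = {"H": 0.2, "not_H": 0.4}   # how well each belief explains the data

beliefs = posterior(priors, likelihoods)
print(beliefs)
```

Even though the prior leans towards H in this toy example, the data is explained better by the alternative, so the posterior ends up leaning away from H.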

The Posterior - What the News is Trying to tell me

Let's start with the Posterior: \(P(\text{Hypothesis}|\text{Data})\). This can be understood as "the conclusion that the article wants me to draw (hypothesis) after I have read the points made in the article (data)".

The hypothesis is usually found in the headline or first paragraph. In this specific case it's "Americans are fleeing to places where political views match their own". This article is also part of a larger context discussing the political division in America, claiming here that it is so bad that people are moving solely to be around people that agree with them politically.

The data is going to be presented throughout the article. It will combine narrative arguments (stories about families moving) and, most importantly, actual data. We'll discuss this in a bit as we read through the article.

At the end of reading the article the author/publisher hopes we will adjust our beliefs to include the hypothesis presented. But if we don't read as Bayesians we might not be reading skeptically enough. There's a certain irony here because there is a very good question about how big a role the media itself plays in the political division in America. That is, this article might be playing a part in creating the very problem it’s warning against.

The Prior - What I already believe about the News

Perhaps the single most important piece of information to consider when reading the news (or any media) is your own prior beliefs: \(P(\text{Hypothesis})\). In the case of news this is really a combination of your prior belief in the source and your prior belief in the hypothesis. For example, some people will have their prior so aligned with a source such as FOX News that they will throw out the rest of Bayes' theorem and immediately accept any hypothesis given.

In full disclosure, while I don't listen to NPR much anymore (today I'm a bigger fan of America's greatest Free Form Radio: WFMU), at one point in my life it was my default radio station. Ideally we shouldn't let the source influence our belief in the hypothesis, but it's important to account for all of our priors.

As for my prior in this particular hypothesis, if I'm honest with myself, I do have a weak belief that this may be happening.

$$P(\text{hypothesis}) = \text{weakly leaning towards true}$$

Anecdotally, I feel like I've heard stories about such things happening, and I certainly feel like the political climate in the US has become more intense. This means that without even reading the article, or seeing the data at all, I'm inclined to accept the hypothesis.

But I'm not absolutely certain either. It will still take a fair bit of convincing for me to strongly accept the conclusion of the article.

We can see here the problem of strong prior beliefs in both the source and the hypothesis when reading the news. Whether it's FOX News or NPR if you have a very strong prior in both the source and the hypothesis of the article, you aren't really gaining much information from reading it. A slight bit of flimsy evidence may still leave you with firm convictions.
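A small sketch shows how little a very strong prior moves. The helper `posterior_probability` and all of the numbers are hypothetical, chosen only to illustrate the point: the same piece of evidence, explained twice as well by the negation of the hypothesis, is given to a reader with a moderate prior and to one with a near-certain prior.

```python
def posterior_probability(prior, p_data_given_h, p_data_given_not_h):
    """Bayes' theorem for a hypothesis H and its negation."""
    numerator = prior * p_data_given_h
    return numerator / (numerator + (1 - prior) * p_data_given_not_h)

# Hypothetical evidence that "not H" explains twice as well as H does.
p_data_given_h, p_data_given_not_h = 0.2, 0.4

moderate = posterior_probability(0.60, p_data_given_h, p_data_given_not_h)
dogmatic = posterior_probability(0.99, p_data_given_h, p_data_given_not_h)

print(f"moderate prior 0.60 -> posterior {moderate:.2f}")
print(f"strong prior   0.99 -> posterior {dogmatic:.2f}")
```

Under these assumed numbers the moderate reader drops below even odds, while the near-certain reader is still about 98% convinced after seeing evidence that points the other way.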

Likelihood - Reading the article and assessing its arguments

While the prior may be the most important thing to assess before reading an article, the likelihood is the most important thing to focus on while reading.

The likelihood, \(P(\text{data}|\text{hypothesis})\), asks us to invert our thinking a bit. This means we need to read the article and ask ourselves:

if what the article claims is true, how well does that claim explain the data we are being shown?

This is not how we usually read news articles. Typically we see them as a series of observations in support of a hypothesis, but we should really be checking how well these observations are explained by what the article is claiming is true.

This article focuses a lot on stories, which are nice, but don't do much to convince me of a hypothesis. It concentrates specifically on Texas, telling the story of one family moving to Austin because it is more blue, but spends more time on:

"Trump followers are flocking to red Texas in search of the promised land."

In many ways this is the real hypothesis of the article, since it provides little evidence in support of the more general claim. Even more specifically, the article argues that people are moving from California, a “blue” voting state with a very large conservative population, to Texas, where they feel more supported and comfortable. Sounds reasonable, but then the article presents us with the statistic that more than 1 in 10 new Texans arrives from California.

The actual evidence supporting the hypothesis of the article is fairly sparse.

Wow, more than 1 in 10 people moving to Texas are from California! That sounds like a lot, especially since there are 50 states. Naively, our mental math might say that the probability of a new Texan coming from California, if we just randomly selected a non-Texas state, is about \(\frac{1}{50}\), which means that Californians are moving to Texas at 5 times the rate we would expect. If that were the case, the argument that people are rushing out of California to Texas for political reasons would explain this unexpected observation pretty well.

However, this doesn't take into account the populations of these states. A quick visit to Wikipedia gives us the three facts we need to work out the rate at which we should expect people to arrive from California based on population alone:

  • US Population: 330,759,736

  • California Population: 39,538,223

  • Texas Population: 29,145,505

This means that the proportion of the US excluding Texas that lives in California is

$$\frac{\text{Pop}_\text{CA}}{\text{Pop}_\text{US} - \text{Pop}_\text{TX}} = \frac{39,538,223}{330,759,736 - 29,145,505} \approx 0.13$$

Which is "more than one of every 10". That is, if people just moved to Texas in proportion to the number of people in the originating state I would expect this fact presented in the article to be true. It doesn’t tell me anything particularly convincing about the hypothesis presented.
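The two baselines can be compared in a few lines of Python, using the population figures quoted above. The variable names are my own, and `reported_share = 0.10` is an assumption that stands in for the article's "more than 1 in 10", so treat it as a lower bound rather than an exact figure.

```python
US_POP = 330_759_736
CA_POP = 39_538_223
TX_POP = 29_145_505

# Naive baseline: every state is treated as an equally likely origin.
naive_share = 1 / 50

# Population baseline: California's share of everyone living outside Texas.
population_share = CA_POP / (US_POP - TX_POP)

# Assumed stand-in for "more than 1 in 10" (a lower bound).
reported_share = 0.10

print(f"naive baseline:      {naive_share:.3f}")
print(f"population baseline: {population_share:.3f}")
print(f"reported vs naive:      {reported_share / naive_share:.1f}x")
print(f"reported vs population: {reported_share / population_share:.2f}x")
```

Against the naive baseline the reported figure looks like a large excess, while against the population baseline it is about what you would expect.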

The next fact we get in this same paragraph is that Florida is the second-largest source of new Texans.

I’m not even sure this fact supports the hypothesis!

This is odd, because I'm not sure it supports their argument in the first place: Florida was a “red” state in the last two elections, and I've also heard anecdotes of Trump supporters moving to Florida. So even if this fact were surprising, I'm not sure it would help the hypothesis. It also isn't surprising, because the largest states by population in the US are:

  1. California - 39,538,223

  2. Texas - 29,145,505

  3. Florida - 21,538,187

  4. New York - 20,201,249

  5. Pennsylvania - 13,002,700

Ignoring Texas itself, Florida is the second most populous state, so it being the second highest source of new residents to Texas, behind the most populous state, seems to support the argument that perhaps people move to Texas in proportion to the number of people in the original state.
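As a rough sketch of what "moving in proportion to population" would predict, the snippet below ranks the listed states (excluding Texas) by their share of the non-Texas US population. It uses only the figures quoted in this post; the names `populations` and `expected_share` are just illustrative.

```python
US_POP = 330_759_736

populations = {
    "California": 39_538_223,
    "Texas": 29_145_505,
    "Florida": 21_538_187,
    "New York": 20_201_249,
    "Pennsylvania": 13_002_700,
}

# Everyone in the US who could move *to* Texas lives outside of it.
non_texas_pop = US_POP - populations["Texas"]

# Share of new Texans each state would send if moves tracked population alone.
expected_share = {
    state: pop / non_texas_pop
    for state, pop in populations.items()
    if state != "Texas"
}

ranking = sorted(expected_share, key=expected_share.get, reverse=True)
for state in ranking:
    print(f"{state:<13} {expected_share[state]:.3f}")
```

Under this population-only assumption, California and Florida come out first and second, which matches the two facts the article reports.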

Alternate hypothesis and the likelihood ratio

What is emerging here as we read the article is the idea of an alternative hypothesis: that the mix of states people migrate to Texas from is largely determined by the share of the population living in those states.

We'll call our initial hypothesis \(H\) and, for simplicity, consider this new hypothesis as the negation of \(H\), \(\bar{H}\). That is, nothing particularly interesting is happening here. This leads us to another, very useful formulation of Bayes' Theorem in terms of odds and likelihood ratio.

$$O(H|D) = O(H) \frac{P(D|H)}{P(D|\bar{H})}$$

Where \(O\) represents the odds in favor of our hypothesis.

The likelihood ratio can be considered how much better one hypothesis explains the data than another:

$$\frac{P(D|H)}{P(D|\bar{H})}$$

In this case, if this number were 2 it would mean "the argument for political migration explains the data twice as well as the argument that nothing interesting is happening", whereas this value being \(\frac{1}{2}\) means the opposite.

For our example it's hard to put an exact number to these, but that doesn’t mean we can’t still make use of Bayesian reasoning. If the migration data mentioned is exactly in line with what you would expect if people moved solely based on the distribution of the US population, then \(\bar{H}\) is the better explanation, so this ratio should be less than 1. To be clear, if \(H\) were true, I would expect more people to be migrating from California than their population alone would explain.
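Even without exact numbers, it helps to see how the odds form behaves. The sketch below uses made-up values: prior odds of 1.5 (a probability of 0.6, standing in for "weakly leaning towards true") and the two likelihood ratios mentioned above, 2 and \(\frac{1}{2}\). The function names are hypothetical helpers, not part of any library.

```python
def update_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' theorem: O(H|D) = O(H) * P(D|H) / P(D|not H)."""
    return prior_odds * likelihood_ratio

def odds_to_probability(odds):
    """Convert odds in favor of H into a probability."""
    return odds / (1 + odds)

prior_odds = 1.5  # assumed: corresponds to a prior probability of 0.6

for likelihood_ratio in (2.0, 0.5):
    posterior_odds = update_odds(prior_odds, likelihood_ratio)
    probability = odds_to_probability(posterior_odds)
    print(f"LR = {likelihood_ratio}: odds {posterior_odds:.2f}, "
          f"probability {probability:.2f}")
```

A ratio above 1 pushes the odds towards \(H\), and a ratio below 1 pushes them towards \(\bar{H}\), which is all the qualitative reading in this post relies on.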

Note: Interestingly enough, if you go to the source data for the article you will find that New York and Pennsylvania are not the third and fourth most common sources of new Texans. So this does hint that our alternative hypothesis might not be a great explanation either.

Assuming the hypothesis - the rest of the article

If you're not convinced of the article's hypothesis by this evidence, which you probably shouldn't be, you can stop reading the article at this point. A common pattern in news articles like this is to keep adding more hypotheses that assume the primary one is true.

Next section of the article assumes the hypothesis

This next section is only interesting if you assume the hypothesis is correct.

You can save a lot of reading and speculation (and anxiety) by understanding how your posterior probability of the article's primary hypothesis is influenced by your reading. There are plenty of citations throughout the rest of the article, but they are only useful if you assume the primary hypothesis is established.

Updating my prior and reading more news

The important part of the process is that this posterior becomes my new prior. Right now, I am less sure that politically motivated migrations are happening than I was before. However the net result is that I'm more unsure in general, not that I am sure the hypothesis is wrong.

When I see another article about this topic I will go in with a more neutral prior. If presented with strong evidence in favor of this hypothesis I might leave much more convinced, but if presented with strong evidence against this hypothesis I'm more likely to be swayed that way than I would have been prior to reading this article.
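Here is a hedged sketch of that idea, with every number assumed for illustration: 0.6 stands in for my old, weakly positive prior, 0.5 for the more neutral prior I now hold, and likelihood ratios of 4 and 0.25 stand in for "strong evidence" for and against the hypothesis in some future article.

```python
def update(prior_probability, likelihood_ratio):
    """Update a probability using the odds form of Bayes' theorem."""
    prior_odds = prior_probability / (1 - prior_probability)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

old_prior = 0.6  # assumed: my belief before reading this article
new_prior = 0.5  # assumed: my more neutral belief after reading it

strong_for = 4.0       # hypothetical future evidence favoring H
strong_against = 0.25  # hypothetical future evidence favoring not H

for label, prior in (("old prior", old_prior), ("new prior", new_prior)):
    print(f"{label}: evidence for -> {update(prior, strong_for):.2f}, "
          f"evidence against -> {update(prior, strong_against):.2f}")
```

With these assumed values, the same strong evidence against the hypothesis leaves me at a lower posterior when I start from the neutral prior than when I start from the old one, while strong evidence in favor can still leave me quite convinced.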

An important part of Bayesian reasoning is that our beliefs are never fixed. By being conscious of how we process information, even in a mostly qualitative way, we can be more mindful of the media we consume and more accurately assess how our beliefs change with each new piece of information we receive. This is also essential for navigating a media landscape that is very much interested in manipulating your beliefs. While I have a tendency to prefer NPR to, say, FOX News, it is vital that I don't let this prior dominate my reasoning. By carefully following these guidelines it's possible to extract information from even incredibly noisy media sources and be less easily manipulated by media agendas.
