Close to Home: Too many predictions, not enough data

We sometimes try to do things with numbers that simply can’t be reliably predicted.

The views and opinions expressed in this commentary are those of the author and don’t necessarily reflect The Press Democrat editorial board’s perspective. The opinion and news sections operate separately and independently of one another.

What if we lived in a time where we spent more time trying to predict outcomes than we did debating the merits of what we’re predicting?

Unfortunately, we’re already there. Our economic markets seem more influenced by whether a company’s earnings met analysts’ expectations than how good those earnings were.

Richard Hertz

Baseball has become so addicted to analytics that the new normal lets dudes with laptops make on-field decisions best left to managers and coaches who’ve lived in the game and long looked for every edge, statistical or otherwise. (Full disclosure: I worked with big league coaches and players for 13 seasons identifying predictive data for in-game use.)

Perhaps the most impactful overuse of data lies in how our election analysis is overwhelmed with statistics that often mislead or mean nothing.

We sometimes try to do things with numbers that simply can’t be reliably predicted. Poll results and other political analytics are often projected with a level of certainty that overstates their likelihood and doesn’t adequately consider other possible outcomes.

In 2022, for the third consecutive election, many experts missed or underestimated key outcomes that didn’t fit the consensus political narrative.

The need for content to fill the 24/7 news cycle helps proliferate stories that many times overstate the impact of daily news events.

We constantly hear that this or that bit of news will be a game changer, but in reality, few actually result in a big swing. Public opinion tends to move slowly for a mosaic of reasons rather than all at once for just one.

A good deal of our political analysis rests upon shaky statistical grounds. Let’s see why by taking a look at when data becomes stable and thus more reliable.

Most public opinion polls are based on samples of 500-2,000 interviews. Online polls sometimes have larger samples since their costs for conducting additional interviews are minimal.
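To put rough numbers on those sample sizes, here is a minimal sketch of the textbook margin-of-error formula. It assumes a simple random sample, a 50-50 split (the worst case) and a 95% confidence level. Real polls use weighting and other adjustments that usually widen these figures, so treat the output as a floor rather than a description of any particular poll. The function name is made up for illustration.

```python
import math

def margin_of_error(sample_size, share=0.5, z=1.96):
    """Approximate 95% margin of error for a share measured in a
    simple random sample (an idealized assumption, not a real poll)."""
    return z * math.sqrt(share * (1 - share) / sample_size)

# Typical public-poll sample sizes mentioned in the text.
for n in (500, 1000, 2000):
    print(f"{n:>5} interviews: about +/- {margin_of_error(n) * 100:.1f} points")
# Prints 4.4, 3.1 and 2.2 points respectively.
```

Quadrupling the sample from 500 to 2,000 only cuts the margin in half, which is one reason most pollsters stop in that range.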

Back when telephone surveys truly were the gold standard of polling, their results took longer to take firm shape than those of online research, which usually settles within a few hours because most respondents complete the surveys right away.

After conducting 100 or so phone interviews, you might think you see a trend, but most such trends vanish once a few hundred interviews are completed and the data hardens. Think of data like Jell-O that’s very liquid coming off the stove but firms up as it cools.
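The hardening effect can be imitated with a small simulation. This is only a sketch under invented assumptions: every respondent answers independently, the true level of support is exactly 50%, and the seed is arbitrary. No real survey data is involved, and the function name is hypothetical.

```python
import random

def running_share(true_share, n_interviews, seed=0):
    """Simulate interviews one at a time and record the observed
    share of 'yes' answers after each completed interview."""
    rng = random.Random(seed)
    yes_count = 0
    shares = []
    for completed in range(1, n_interviews + 1):
        if rng.random() < true_share:
            yes_count += 1
        shares.append(yes_count / completed)
    return shares

# Hypothetical poll: true support is 50%, 1,000 interviews in total.
shares = running_share(0.50, 1000)
for completed in (50, 100, 300, 600, 1000):
    print(f"after {completed:>4} interviews: {shares[completed - 1] * 100:.1f}%")
```

Rerunning with different seeds typically shows the early readings wandering several points from 50% before settling, which is the kind of apparent trend that tends to evaporate.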

Now let’s look at one of the metrics many analysts used to predict an outcome that largely failed to occur in the 2022 general election: the history of the president’s party often losing many seats in the first congressional elections after the president takes office.

But often is not always and every election is a unique event based on circumstances at that time. Historic analysis can be informative but only if it contains enough relevant data to work with.

As this is now the 118th Congress and off-year (non-presidential) congressional elections take place every four years, there have been not quite 60 of them since the nation’s founding. This is a really small sample to draw meaningful conclusions from.

That data is further marginalized as much of it is more than a century old. The nation’s issues and workings change over time. The pre- and post-internet worlds are particularly different when it comes to politics.
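The arithmetic behind those two points can be checked in a few lines. This sketch assumes the first off-year congressional election was held in 1790, the most recent in 2022, and that "more than a century old" is measured from 2023, when the 118th Congress convened. Those constants are assumptions made for the illustration.

```python
# Assumed bounds for the illustration.
FIRST_MIDTERM = 1790
LATEST_MIDTERM = 2022
REFERENCE_YEAR = 2023

# Off-year congressional elections fall every four years.
midterms = list(range(FIRST_MIDTERM, LATEST_MIDTERM + 1, 4))
older_than_century = [year for year in midterms if REFERENCE_YEAR - year > 100]

print(len(midterms))            # 59 elections in all
print(len(older_than_century))  # 34 of them are more than 100 years old
```

Under those assumptions, only 25 of the 59 elections fall within the past century, a far smaller sample than even a modest opinion poll.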

Pundits need to speak more sotto voce, even if their analysis covers a long period and displays certain tendencies. When there isn’t enough good data to reliably analyze, not much else matters.

Finally, it’s important to put the use of poll data in the context of the times we live in. It’s no secret that although we’re far more connected today, it’s much harder to conduct accurate polls, largely because many potential respondents refuse to take them.

How polls are conducted and by whom definitely matters. The proliferation of partisan polling is now impacting poll averages that used to help offset misleading data from a particular survey.

In 2022, there were significant cutbacks in the number of polls conducted by independent news organizations. However, these polls were largely quite accurate compared with the hooey produced by some partisan polls that seemed designed more to influence the political narrative than objectively measure public opinion.

The continuing obsession with horse race polling and which party benefits from every grain of news comes at the cost of less serious debate of important issues.

News readership and ratings are, at least in part, a function of relevancy. The more election coverage focuses on issues that people care about and that affect their lives, the more they’ll pay attention to it and the better the public will be served.

Richard Hertz, a longtime pollster, teaches American politics at Sonoma State University. He lives in Bodega Bay.

You can send letters to the editor to letters@pressdemocrat.com.
