Do 7 percent of Americans actually think that chocolate milk comes from brown cows?

It’s probably a bogus stat — but the press coverage is striking all on its own.

Peter Licari, PhD
14 min read · Jun 19, 2017
A brown cow and her calf busily not producing chocolate milk (Wikimedia Commons)

Recently, the press went wild over a survey finding claiming that 7% of Americans believed that chocolate milk came from brown cows. That surprising figure paled in comparison to another finding from the same survey: that 48% of Americans couldn’t even hazard a guess. It appears to be a pretty shocking result and a damning indictment of the country’s lack of general knowledge. But the press’ coverage of this finding has been oddly…similar.

That may seem like an odd charge to levy. “Of course the reporting on the story is ‘similar.’ They got it from the same survey! That’s like saying it’s weird that a class all wrote similar points in their essays when they were assigned the same book!”

Fair point. Except, and this is important, we actually, honestly have very little idea about the truth behind that 7% figure. It’s more akin to the class claiming to have reported on George R. R. Martin’s A Dream of Spring. They definitely got their information from a shared source, but since the actual book hasn’t come out yet, the information it supposedly contains is more than just a tad dubious. Actually, it’s more like the class reporting on a blog’s preemptive speculation about A Dream of Spring derived through nothing but Ouija boards and divination because, as we’ll soon see, it’s probably bogus.

The Source of the Claim:

The Innovation Center for US Dairy wanted to kick off their Milk Awareness Campaign. According to information from The Huffington Post, they approached Edelman Intelligence to conduct a survey investigating Americans’ knowledge about food and farm practices. This survey found that 7% of respondents believed that chocolate milk came from a brown cow and that a full 48% simply didn’t know. The finding was originally reported on in Food & Wine and lay dormant for about two weeks before the Washington Post resurrected it and set the agenda. It was also the Post that established most of the “facts” that came to dominate the coverage.

You can imagine my surprise when I came across these findings. It’s not that I was surprised at the idea that Americans are misinformed, under-informed, or just generally at sub-optimal levels of informed-ness. There’s a robust literature on political knowledge discussing how little we know as citizens and a complementary literature in science communication lamenting how little people seem to understand about basic concepts and phenomena. But this was a particularly gripping example; I thought that if I could get access to the original data, I might be able to find some interesting correlations and practice my visualization skills. So I looked.

And I looked.

And I looked.

There wasn’t even an official press release outlining the statistics or a cross-tab of the results on the Center’s official website. There was an ironically uninformative infographic, but the website didn’t offer any kind of explication of the results. It just referred people to the Food and Wine article. That article, by the way, sent people right on back to the website — so unless I channeled my inner Sisyphus, I wasn’t going to get any new meaning from following the endless recursion.

My digging didn’t uncover any reasons to believe that this statistic was true. But it did dredge up a whole host of reasons to doubt it.

Reasons to be Skeptical:

There are three reasons that I believe this statistic is, frankly, hogwash. (That’s not even counting the common rejoinder that people were simply joking and/or trolling. It’s a fair possibility, albeit an untestable one). First is the representativeness of the sampling frame (who received the questions); second is the implementation of the survey instrument (how people received the questions); third, and most damningly, is the wording of the question (what people saw when they answered).

The Who: Despite what The Washington Post reported, there is no evidence that the survey questioned a nationally representative sample of US adults. According to The Huffington Post, Innovation Center spokeswoman Lisa McComb responded to questions of representativeness by indicating that “responses came from all 50 states, and the regional response breakdown was fairly even.” To be clear, that is not sufficient to determine if a sample is representative. I could have easily gotten you 1,000 people with responses from every state and a solid regional breakdown that unanimously said they were voting for Gary Johnson. A representative survey means it mirrors the population on all relevant metrics. We’re missing income, proximity to a farm, education, age, and a whole host of other things. And until we account for those, we won’t be able to generalize to the rest of America. So, yeah, probably best that you ignore the back-of-the-envelope calculations that claimed that “millions” believed this. With that kind of sampling frame it’d literally be just as accurate for me to claim that no one’s actually read Pride and Prejudice after polling a class of kindergartners.
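To make that concrete, here’s a toy simulation in R. Every number in it is invented for illustration and has nothing to do with the actual survey; the point is only that a sample can be perfectly “even” across regions and still be badly unrepresentative on a dimension that actually drives the answers (say, whether respondents live anywhere near a farm).

```r
set.seed(42)

# Toy population: four regions, each a 50/50 mix of rural and urban residents.
# Rural residents answer a farm-knowledge question correctly 90% of the time,
# urban residents 50% of the time. All of these numbers are made up.
pop <- expand.grid(region = c("Northeast", "Midwest", "South", "West"),
                   rural  = c(TRUE, FALSE))
pop$p_correct <- ifelse(pop$rural, 0.90, 0.50)
mean(pop$p_correct)  # true population rate: 0.70

# Now draw a "sample" of 1,000 that is perfectly balanced by region
# (250 per region) but happens to be 90% urban -- the kind of skew an
# opt-in online panel can produce.
is_rural <- rbinom(1000, 1, 0.10) == 1
answers  <- rbinom(1000, 1, ifelse(is_rural, 0.90, 0.50))
mean(answers)  # roughly 0.54: regionally "even," yet far from the true 0.70
```

The regional breakdown in that sketch is spotless, and the estimate is still way off, because the sample is skewed on something that matters.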

The How: The survey was administered to its 1,000 respondents over the Internet. That isn’t a problem in and of itself. There are a number of different firms that actually do generate representative samples from the internet. It’s just that Edelman Intelligence is not recognized as being one of those firms. To be fair, it’s not ubiquitously seen as being horrible either — but unless it outsources its online surveying, there are going to be some questions of validity. But the problems of “how” aren’t limited to generalizability. The survey could also have led people to answer the question as they did if it was structured in a way that instilled doubt. People will respond in surprising ways if they feel like they don’t actually know as much as they previously thought, something this survey could have inadvertently exploited. Without knowing the full battery of questions and the order in which they appeared, this could well be at play.

The What: We ultimately don’t know if the survey was representative or if it was administered properly. There simply is not enough information about the survey out there to say one way or the other. But we do know the question that people saw. And, frankly, it was pretty awful as far as surveys go. As per the President of the National Dairy Council via NPR, the question was literally something like “Where does chocolate milk come from?” with the following answers:

1) Black-and-White Cows

2) Brown Cows

3) Don’t Know

It’s immediately and abundantly clear that this is the worst possible question to gauge people’s dairy literacy. Perhaps ever. Of all time. For the record, dairy cows in the US come in about seven varieties and a multitude of colors. That means that this question doesn’t measure how little someone knows. In fact, it’s the opposite. Since there’s no “The color doesn’t matter” option, the best guess is for people to select “Don’t know.” Hence why nearly half the sample gave that answer. The remaining 45% that, apparently, said black-and-white didn’t have any good options either. In fact, since the number one dairy cow is the prototypical black-and-white Holstein, they were answering as best they could. I mean, what else are you supposed to say?! It’s not like there was an “other” option!

So how did the result from a (presumably) non-representative survey, built around a horrendously worded question, convince the press that it was legitimate? How did the coverage swell to this overwhelming consensus that people didn’t know where their milk came from, despite not having access to an iota of useful data? That’s an interesting story in its own right and, as we’ll see, it comes down to who got to set the agenda.

Setting the Frames and Agenda

Remember earlier when I said that the Post acted as the trendsetter, establishing the “facts” that permeated the coverage? Well, it actually set most of the frames too — either through its own coverage or by directing future sources to the Food and Wine article. The way it articulated these “findings,” and the facts and tone it chose to surround them with, were replicated throughout the coverage.

I performed a quick-and-dirty content analysis on 11 articles that discussed the findings: the articles from The Washington Post and The Huffington Post, pieces by Vox, Breitbart, NPR, I Fucking Love Science, Today, CNN International, Business Insider, and NY Daily News, and the original article from Food and Wine. I tracked when each article was posted to be able to see who influenced whom.

There were two predominant frames to these articles. The first was a frame of Intellectual Superiority. This often entails parading around a hypothetical member of that purported 7% of Americans so that the writer can mock their apparent lack of intelligence in a way that reinforces the writer’s own. It was first employed by Food and Wine and was also found in The Huffington Post, IFL Science, Today, NY Daily News, and CNN International:

“First off, 48% of respondents said that they aren’t sure where chocolate milk comes from. Um, guys, it comes from cows — and not just the brown kind. Still, 7% of people — and remember, this survey talked to actual, grown-up adults — still think that chocolate milk only comes from brown cows.” — Food and Wine

“Seriously, let’s humor the 7% for a second here — if milk color is directly dependent on the color of the cow it comes from, why wouldn’t regular milk have scattered black spots? But let’s move on.” — CNN International

“If this is true across the nation generally, that would be an astounding 154,272,000 potential voters who aren’t confident enough to guess ‘cows?’.” — IFL Science

“It’s possible that some people in the survey of 1,000 were joking. But whatever the size of the group of lactose-uneducated, it’s yet another example of American idiocy.” — NY Daily News

The ultimate point of this frame is to identify and ostracize an other, establishing them as lesser than the writer and their audience. “These people aren’t like you or I. They’re stupid on the most basic of levels. But we’re informed enough that their idiocy seems quaint and hysterical.”

The second, meanwhile, was a frame of rueful separation. This was established by the Washington Post’s coverage and consists of acknowledging the apparent gap in people’s knowledge but then taking this to mean something more profound: specifically, that people don’t know where their milk comes from because the conveniences of the modern post-industrial age have cleaved us from even this most rudimentary of understandings.

“For decades, observers in agriculture, nutrition and education have griped that many Americans are basically agriculturally illiterate. They don’t know where food is grown, how it gets to stores — or even, in the case of chocolate milk, what’s in it.” — The Washington Post

“Jokes aside, she says it shows how disconnected we are from where our food comes from…Less than 2 percent of people in America live on a farm now. So people out there have a very high level of interest but a very low level of understanding.” — NPR

“Snarky critics aside, agricultural experts say the survey’s results show a growing trend where Americans are “agriculturally illiterate,” meaning they do not know the process of how food goes from the farm to the kitchen table or how other food items are made.” — Breitbart

“The way most agricultural products today are marketed to shoppers doesn’t help. In the supermarket, meat, produce, and dairy products are often packaged in plastic wrap, cartons, and boxes. They don’t look much like the original plant or animal.” — Business Insider

“We are so ‘advanced’ that in many ways we’ve regressed,” these sources want to say. (Specifically, “these sources” being Vox, Breitbart, NPR, and Business Insider). “There are people out there who don’t know something this basic and the fault of that lies not, perhaps, with them but with where society itself is going.”

It’s not necessarily surprising that the press rallied behind similar frames in discussing this survey. It’s a story that seems to write itself after all. I was initially interested in how they were able to write these stories at all considering the lack of available information. But as I reviewed these articles again, with their numerous factual inaccuracies and logical leaps, I came to realize that it was because there was no original data available that they were able to write what they wrote. In lieu of something credible, something objective, they were able to take a core notion and construct a story around it.

See, frames help us make sense of the tsunami of information on a topic. They are part shared cognitive structure and part story-telling device. Some take this to mean that a sufficiently strong media presence can contort a story so violently that it can justify just about anything. This isn’t true. Studies show that there is actually a limit to how far things can be framed. Indeed, this limitation is part and parcel of being rooted in at least a sufficiently large shared reality, if not an objective one. That’s not to say that things with a multiplicity of moral threads can’t be spun into a variety of different forms. Complex issues such as gun violence and abortion can be talked about through several frames. Nor is that to suggest that issues won’t be shaved down to a single, oversimplified frame, because that happens too. It’s a 24-hour news cycle demanding easily digestible “thought-nuggets.” It doesn’t just happen; it happens all the time. But it is easier to deconstruct a prior frame and offer a new one when there’s enough objective evidence available to mount a metaphorical counter-assault. Without something new, though, the extant frames largely stand.

By virtue of being the first to broach a topic with a dearth of decent information, The Washington Post and Food and Wine were able to set the frames that everyone else would come to use. And, subsequently, to facilitate the appearance of a consensus in the absence of actual information.

But that consensus was not only engendered by the same framing. It was also, crucially, a product of the words with which those frames were erected and expressed.

“All around me are familiar faces”:

To be candid, if 11 students came to me with these news articles as assignments they’d all flunk for plagiarism. The frames were far from the only things they shared; they also employed incredibly similar syntax, often making the same jokes or emphasizing the same points using remarkably similar wording. And while no piece totally plagiarized any of the others, there was definitely a convergence in their coverage.

“Neither strawberry milk nor chocolate milk is made from cows with bloody udders…” — Vice

“Given those numbers, don’t be surprised if fans of strawberry milk are hunting down the elusive pink cow.” — NY Daily News

“Where do these people think strawberry milk comes from?” — The Huffington Post

“But it did make me wonder what they thought about strawberry milk. Did they think there were some pink cows out there?” — NPR

“We are hoping they will conduct a follow-up study to look into whether Americans think blue cheese comes from blue cow” — IFL Science

“[Chocolate milk is] just normal cow juice with some cocoa mixed in.” — Vice

“[C]hocolate milk is just regular milk that has been mixed with chocolate syrup or cocoa powder.” — The Huffington Post

“So again, for the record, chocolate milk is white milk mixed with chocolate.” — NPR

“Actually, chocolate milk gets its flavor and color from cocoa beans” — Food and Wine

“Chocolate milk — or any flavored milk for that matter — is white cow’s milk with added flavoring and sweeteners.” — Today and IFL Science, quoting the Center for US Dairy Website

To best demonstrate this point, I used R’s RNewsflow and TM packages to do some rudimentary machine text analysis. Exciting, right?! Qualitative and quantitative content analysis in the same piece by the same researcher?! That’s right. For I am nothing if not an epistemologically well-rounded egghead.

The TM package makes it possible to aggregate all of the text collected from the articles, combine it into a single corpus, and clean the individual documents by removing stop words and punctuation and lemmatizing the words (so that “running,” “runs,” and “run” are all treated as the same word). I also used it to create a Term-Document Matrix, which lists out how frequently each word was used in every article. I finally used the RNewsflow package to weight the words by the size of the individual articles and to measure their pairwise cosine similarity. The maths behind it are a tad extraneous for the purposes of this post, but that measure allows me to determine how similar each article is to every other. The closer the number is to |1|, the stronger the relationship. (For the nerdier among us: The measure is roughly comparable to Pearson’s r. Roughly.) Finally, I was able to use the metadata from earlier and create a network that examined how these stories influenced each other. But I have to warn you — I am much better at mapping geographical data than I am at displaying network data. I’ll be learning how to use Gephi so that it doesn’t look as awful as what you are about to see. But I acknowledge that future promises are little comfort in the now. So, brace yourselves:
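For those who want to follow along, here’s a minimal sketch of that pipeline. The article texts below are placeholders (I obviously can’t reprint the full pieces here), and I compute the pairwise cosine similarities directly in base R rather than through RNewsflow’s own functions, but the logic is the same: clean, count, normalize, cross-multiply.

```r
library(tm)  # stemming via stemDocument also requires the SnowballC package

# Placeholder texts standing in for the 11 scraped articles.
article_text   <- c("Chocolate milk comes from cows, and not just the brown kind.",
                    "Seven percent of Americans reportedly think brown cows make chocolate milk.",
                    "Survey says many Americans do not know where chocolate milk comes from.")
article_source <- c("Food and Wine", "Washington Post", "NPR")

# Build a corpus and clean it: lowercase, strip punctuation, numbers,
# stop words, and extra whitespace, then stem the remaining words.
corpus <- VCorpus(VectorSource(article_text))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, stemDocument)
corpus <- tm_map(corpus, stripWhitespace)

# Term-document matrix: one row per word stem, one column per article.
tdm <- as.matrix(TermDocumentMatrix(corpus))

# Pairwise cosine similarity between articles: normalize each column to
# unit length, then take cross-products. Values run from 0 (nothing in
# common) to 1 (identical word profiles).
col_norms <- sqrt(colSums(tdm^2))
cos_sim   <- crossprod(sweep(tdm, 2, col_norms, "/"))
dimnames(cos_sim) <- list(article_source, article_source)
round(cos_sim, 2)
```

As I understand it, RNewsflow layers the publication-time metadata on top of similarity scores like these, which is what lets you trace who appears to have borrowed from whom.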

You might want to click “zoom” and cleanse your eyes with bleach.

Although that’s certainly a cluster (pun intended), you can see some really interesting relationships if you don’t mind ruining your eyes. (My favorite, for instance, is that the articles employing the intellectual superiority frame tend to be more similar to the Food and Wine article, while those more similar to the Washington Post’s tend to deploy the disconnect frame). But there are two big findings that I want to highlight. First, the average similarity score is over .4, with several clocking in at over .5. That doesn’t sound like a lot, but it’s a big deal. It’s roughly analogous to saying they shared 40% of the same content. Second, they all shared amongst each other; the later sources took from those that came before, erecting a shared narrative out of virtually nothing. One that is, ultimately, divorced from reality.
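If you want a feel for what a score in that neighborhood looks like, here’s a throwaway example with invented counts (not the real article data): two tiny “documents” that overlap on some word stems and diverge on others.

```r
# Invented word-count vectors over the same five stems.
doc1 <- c(chocol = 2, milk = 3, cow = 1, brown = 1, survey = 0)
doc2 <- c(chocol = 1, milk = 1, cow = 2, brown = 0, survey = 3)

# Cosine similarity: dot product divided by the product of the vector lengths.
sum(doc1 * doc2) / (sqrt(sum(doc1^2)) * sqrt(sum(doc2^2)))  # 7/15, about 0.47
```

Two articles landing around .4 or .5 aren’t word-for-word copies of each other, but they are leaning on a conspicuously similar vocabulary.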

The Take Away:

There’s another unspoken “frame” at play; effectively omnipresent but incredibly difficult for this kind of computational text analysis to pick up. Well, it’s less of a frame and more of a tone, but I needed a way to shoehorn in a segue to this last point. That tone is one of certainty. Whether it manifested in the haughty arrogance of the “look at how dumb some Americans are (and how superior you should feel in contrast)” frame or the stilted contrition of the “oh woe, oh woe — look at how far modern living has taken us away from our roots” frame, all of the articles conveyed a sense of absolute certitude. Here was a survey, these were the results, and we are unquestioningly presenting them as the facts of life. The Huffington Post was the only one of the 11 to suggest that the potential for error or obfuscation even existed. Not that it did any substantive good. Two of the sources appeared to use information gleaned from that piece to bolster the appearance of generalizability in the survey’s sample. As in, they somehow spun skepticism into credulity. Even The Post didn’t seem too wedded to the whole idea of questioning the legitimacy of the findings, as the article quickly shrugged off the veneer of responsible skepticism to embrace snark in its stead.

This certainty, to be clear, is totally unearned. There was no access to the original source material; these stories merely represent a consensus constructed from networks of people ripping off each other’s words. Considering that they’re all ultimately discussing the gaps in the average American’s knowledge, I find that fact to be more than just a tad ironic. One shouldn’t chuck pebbles at others’ milk if their own cups are made of glass.

Peter R. Licari is a Graduate Student in Political Science at the University of Florida specializing in American Politics, Political Behavior, and Political Methodology. The opinions expressed are his own. He can also be found on YouTube and on Twitter. What little spare time remains is dedicated to long-distance running, video games with his ever-patient fiancee, and to oddly productive one-sided conversations with his cat, Asia.
