The article below is my all-time favorite piece of analytics-related writing. It’s entirely non-technical, but I assign it as required reading in my machine learning class. I recommend it to people interested in entering a data-related career. I have reanimated many lagging conversations by spouting pieces of it (yes, I am a real dinner party showstopper). One morning in 2012 it turned my routine selection of breakfast cereal into a mid-level existential crisis. Hell, the article even includes weight-loss tips. There’s something in there for everyone!
It’s "How Companies Learn Your Secrets," by Charles Duhigg, from the New York Times Magazine.
Fundamentally, it’s an inspirational and cautionary tale about the power of data, told colorfully and accessibly.
Duhigg’s work, including this article and several best-selling books, focuses on behavioral science: Why do people do what they do, and how can that behavior be influenced? Here, he tells a story of the interaction between data analysis and behavioral science, using a specific example in which Target mined purchase history data to manipulate their marketing and influence the shopping behaviors of pregnant women.
The analytics part of the story goes like this: Target was (is) able to use purchase histories and sophisticated predictive algorithms to identify pregnant women early in their pregnancies, even when these women didn’t make purchases of obvious indicators like pregnancy tests or maternity clothes. This is both very cool and very creepy.
The behavioral science part goes like this: Target manipulated the advertising sent to these moms-to-be, using baby-related coupons to entice them to do more of their shopping at Target. The most fascinating part of the article is the evolution of this manipulation. Baby-centric advertising freaked out women who understandably felt violated by Target’s invasion of their privacy. But when sent the same advertising, camouflaged by the inclusion of non-baby items like lawnmowers or wine glasses, the women happily used the baby-related coupons because they didn’t even realize they were being targeted. Mask the creepiness, and the system churns right along to the benefit of both customer and corporation. It’s no longer creepy…it’s meta-creepy!
I love and share this article for three reasons:
The ability to predict a pregnancy from product purchases unrelated to maternity is just cool (taken independently of how that information was then used). It gives a flavor of the possible. Why stop at pregnancy? What about depression? Skin cancer? Erectile dysfunction? Divorce? Target may not be in a position to predict these: its key weapon in the pregnancy scenario was its baby registry, which provided a list of known moms-to-be to train prediction algorithms on. But it’s entirely possible that, say, Amazon could do some of this.
I’m not suggesting that any retailer would or should identify individual customers in any of these groups, especially not for the sake of clumsy marketing. But what might data scientists find if they looked for patterns in the past purchases of specific groups of people? Who knows what kind of early symptoms they might uncover, expressed through purchases of lip balm, sleep aids, fuzzy socks, compression packs…who knows what else. In fact, the most promising part about work like this is that nobody has to have a specific hypothesis about what the “what else” might be. Under the right analytical lens, the correlations fall out of the data on their own.
Amazon in particular is in a unique position to do something along these lines, with its massive breadth of products and media, large and loyal customer base, and the sheer scale of the transactions they record daily. They’ve got teams upon teams of data scientists mining this data to improve their own sales and operations. Where are the data scientists who are mining this data for the common good, or at least for the common interest? Where’s the Amazon medicine group working with medical researchers to combine medical data with purchase data for real power?
Where’s the “Cocktail Conversations by Amazon” blog where their data scientists share interesting tidbits that they’ve discovered? I’m sure their scientists stumble all the time on facts that are of general interest but not competitively advantageous, even if they’re not necessarily looking for them. If they put someone directly on the task of hunting down buzzworthy discoveries and sharing them—a Nate Silver of Amazon—they’d have PR and recruiting gold. I’d read that blog!
From the article, quoting a Target executive:
"And we found out that as long as a pregnant woman thinks she hasn’t been spied on, she’ll use the coupons. She just assumes that everyone else on her block got the same mailer for diapers and cribs. As long as we don’t spook her, it works."
The wording here betrays a somewhat dismissive and belittling attitude toward the Target customers. As someone whose career centers on data, this part of the article always reminds me to think again about the consequences of what I’m doing. Data science reduces individuals to lists of “features”: your purchases, or symptoms, or clicks on a website. Data is by nature impersonal, and that’s OK. The reminder here is that the data science is being applied, presumably, to affect the people behind the features in some concrete way: to sway their shopping behavior, to diagnose an affliction, or to optimize their web experience. It’s important for the data scientists doing the analytics to understand and be comfortable with the way their work touches real lives.
I read this article in February 2012 when it was first published. Back then I ate cereal for breakfast a few times a week, usually Cheerios. The morning I read this, I’d started my day with a bowl of Life cereal instead. I don’t know why the box of Life was even in the cupboard; it was not something I usually bought. Maybe it had been on sale, or maybe it was left over from houseguests. It really didn’t matter, until I got to this part of the article:
"The study found that when someone marries, he or she is more likely to start buying a new type of coffee. When a couple move into a new house, they’re more apt to purchase a different kind of cereal. When they divorce, there’s an increased chance they’ll start buying different brands of beer."
Here I experienced the sinking heart, then the momentary freeze, of a minor shock. My husband and I had purchased and moved into our first home in January 2012, just five weeks prior to my reading those sentences. The Life cereal in my cupboard was not on sale, and it was not left over from houseguests. More precisely, either or both of those things may have been true, but they didn’t really explain why my day started with Life instead of Cheerios.
The truth is, I’m not the free-willed, rational being that I’d like to believe I am. I’m part-robot, just like the rest of us. This article provided a powerful and timely lesson that what I consider to be my own choices, made from experience and educated reasoning, are to some degree just pre-programmed shortcuts. As such, they are open to the influence of habit, subconscious rewards, and marketing. What a disappointment!
Nevertheless, this disappointment of a human is still captivated by the power of data. I am so fond of this article because it conveys both the potential and the pitfalls of analytics in a way that’s colorful and accessible to both industry veterans and newbies. Perhaps that’s a predictable response for someone in my field. Before you judge me, though, read through the article for yourself and see what you discover about how your own behavior is predictable to those with the right data.