Social Media: the Future of Diagnosis?

A study suggests that your Facebook posts can be used to determine what diseases or conditions you have with reasonable accuracy.

A new study appearing in PLOS One suggests that your Facebook posts can be used to diagnose a variety of diseases.

Researchers from the University of Pennsylvania found 999 brave souls willing to share their entire Facebook history – amounting to 950,000 Facebook status updates and 20 million words, which is equivalent to about 15 copies of Proust’s Remembrance of Things Past, although with slightly less sardonic wit and slightly more emojis.

Stop trying to make “social mediome” happen. It’s not gonna happen.

From this data they derived what they call a “social mediome” – we’ll see if that catches on – a set of 700 variables that reflected the 500 most-common word pairs seen, and 200 common word-cluster “topics”.

Each of the 999 individuals thus had a 700-variable fingerprint that represented all that they put out there into the ether of the social network.
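To make the idea of a word-pair fingerprint concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the function names, the tokenization rule, the use of relative frequencies, and the toy posts are all assumptions made for illustration, and the 200 word-cluster "topics" are left out entirely.

```python
from collections import Counter
import re


def word_pairs(text):
    """Return adjacent word pairs from one post, lowercased."""
    words = re.findall(r"[a-z']+", text.lower())
    return list(zip(words, words[1:]))


def build_vocabulary(posts_by_user, top_k):
    """Pick the top_k most common word pairs across all users' posts."""
    counts = Counter()
    for posts in posts_by_user.values():
        for post in posts:
            counts.update(word_pairs(post))
    return [pair for pair, _ in counts.most_common(top_k)]


def fingerprint(posts, vocabulary):
    """Relative frequency of each vocabulary pair in one user's posts."""
    counts = Counter()
    for post in posts:
        counts.update(word_pairs(post))
    total = sum(counts.values()) or 1  # avoid dividing by zero
    return [counts[pair] / total for pair in vocabulary]


# Toy data, invented for illustration only (the study used 500 pairs).
posts_by_user = {
    "user_a": ["so tired today", "so tired of this"],
    "user_b": ["pray for me", "so tired today"],
}
vocabulary = build_vocabulary(posts_by_user, top_k=2)
vectors = {user: fingerprint(posts, vocabulary)
           for user, posts in posts_by_user.items()}
```

Each user ends up with one number per vocabulary entry. Scale `top_k` up to 500 and add 200 topic scores, and you have something shaped like the 700-variable fingerprint described above.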

Linking to the electronic medical record, the researchers asked whether they could use that fingerprint to predict the presence of 21 conditions like diabetes, psychosis, and pregnancy.

Ironically, the best social media site for diagnosis is Club Penguin.

And, for basically all of them, they could, with varying degrees of accuracy. Pregnancy, in fact, was the easiest to predict, while the presence of coagulopathy was the hardest. This is probably for the best.

The authors compared Facebook’s predictive ability with that of a combination of three demographic factors (an individual’s age, race, and sex), finding that Facebook-based prediction was significantly superior to demographic-based prediction for 10 of the 21 conditions.
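One common way to compare two predictors like this is the area under the ROC curve (AUC): the chance that a randomly chosen person with the condition gets a higher risk score than a randomly chosen person without it. The sketch below assumes that metric purely for illustration. The scores and labels are invented, and no significance test is included.

```python
def auc(labels, scores):
    """Chance that a random positive case outscores a random negative one."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Invented example: 1 = has the condition, 0 = does not.
has_condition = [1, 1, 0, 0, 0]
demographic_scores = [0.6, 0.3, 0.5, 0.2, 0.4]  # hypothetical model output
language_scores = [0.8, 0.6, 0.5, 0.1, 0.3]     # hypothetical model output

demographic_auc = auc(has_condition, demographic_scores)
language_auc = auc(has_condition, language_scores)
```

An AUC of 0.5 is a coin flip and 1.0 is perfect ranking. In this toy data the language-based scores rank every affected person above every unaffected one, while the demographic scores do not.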

Depression word cloud.

Most of the word clusters had good face validity. People who were depressed, for example, were more likely to have posts with words like “hurt”, “feelings”, and “care”.

Not all the clusters made so much sense. One strong predictor of the presence of diabetes was a word cluster with words like “god”, “pray”, and “lord”, suggesting these postings are capturing some data that simple demographics do not.

Diabetes word cloud.

Where does this all go? Well, the implication is that someday, by sharing your online data with your doctor, the doctor may be able to identify you as at risk for a condition that you didn’t even know about. Of course, this study doesn’t really go there. There is no information as to the timing of posts versus the diagnosis – it’s one thing to post about diabetes when you know you have diabetes, another thing altogether to predict FUTURE diabetes from current Facebook posts.

And of course, comparing only to demographic information is a bit of a straw man. Docs have a lot more info about patients than just their age, sex, and race. Still, the data in your social media history may reveal aspects of health that we don’t capture well otherwise. But whether you are willing to share that side of you with your health professional may say more about you than all those 20 million words ever could.