How India’s Favorite TV Show Uses Big Data
Every Sunday morning, millions of people in India tune in to watch Bollywood star Aamir Khan host one of the country’s highest-rated television shows, Satyamev Jayate. But unlike so many popular programs, Satyamev Jayate doesn’t involve a singing competition or a collection of volatile strangers living under the same roof. It’s a documentary program tackling some of the country’s most sensitive topics, and it has the whole country, indeed the whole world, talking. To funnel millions of messages a week into something valuable, the show’s producers have turned to big data.
Aside from Khan’s star power, the show is so popular because of the types of issues it tackles: female feticide, caste discrimination, dowry deaths, child abuse and medical practice among them. According to one of the show’s producers, the amount of engagement and the number of responses from viewers are “completely unprecedented.” Here’s a sample of what we’re talking about, just 13 episodes into the show’s existence:
- 400 million viewers on Indian television and across the world on YouTube.
- More than 1.2 billion people have connected with Satyamev Jayate across its website, Facebook, Twitter, YouTube and mobile devices.
- More than 8 million people have contributed a total of more than 14 million responses to the show’s content via Facebook, web comments, text-message votes and a telephone hotline. More than 100,000 new people respond each week.
The responses take all sorts of forms, from votes on a weekly poll question to long, heartfelt letters explaining a viewer’s experience with an issue or how the show has changed their thinking on it. And although 95 percent of responses come from India, the show has received them from 5,000 locations in 165 countries, including as far away as northern Canada and Alaska. The show’s topics regularly rank among the top trends on Twitter shortly after each episode airs.
The messages are parsed through an automated analysis system developed by Persistent Systems, an Indian IT consultancy.
About a day and a half before each show, Satyamev Jayate’s production company tells Persistent what the issue will be, and the two groups come up with a taxonomy that will help the system sort through messages based on the topics that will be brought up during Sunday’s show. But it’s not by any means the definitive list. As activity ramps up on Twitter while the show airs (tweet rates are highest during commercials and immediately after the show ends, by the way), the team gets a sense of what topics are resonating with viewers and what themes they can expect in the nearly a million responses that will follow.
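To make the taxonomy idea concrete, here is a minimal sketch of keyword-based topic tagging in Python. It is not Persistent's system; the topic names, the keywords and the `tag_message` function are all hypothetical, chosen only to show how a pre-agreed list of topics might be matched against incoming messages.

```python
import re

# Hypothetical taxonomy for one episode: topic -> keywords.
# These entries are illustrative assumptions, not the show's real list.
TAXONOMY = {
    "dowry": {"dowry", "dahej"},
    "legal": {"police", "court", "law"},
    "education": {"school", "college", "padhai"},
}

def tag_message(text, taxonomy=TAXONOMY):
    """Return the sorted topics whose keywords appear in the message."""
    # Lowercase and split into alphabetic tokens so matching ignores case
    # and punctuation.
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(topic for topic, keywords in taxonomy.items()
                  if words & keywords)

tags = tag_message("Meri behen ka dahej case court mein hai")
```

Because the taxonomy is plain data, new topics that surface on Twitter during the broadcast could be added to it without touching the matching code.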
When the responses actually do start pouring in after lunch, they hit a system designed by Persistent to automatically tag them and score them based on interest level and sentiment. So, as Mukund Deshpande, head of business intelligence and analytics at Persistent, told me, a long message with an interesting story will be marked as higher quality, while a short, congratulatory note will be scored lower. Because so many viewers write in “Hinglish,” a combination of Hindi and English, an off-the-shelf system wouldn’t have been as accurate for processing these messages.
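The scoring Deshpande describes can be illustrated with a toy example. The Python sketch below is an assumption-laden stand-in, not the actual system: it treats message length as a rough proxy for interest and uses tiny hand-made word lists, mixing English and romanized Hindi, for sentiment. The word lists, the 100-word cap and the `score_message` function are all hypothetical.

```python
import re

# Hypothetical sentiment word lists mixing English and romanized Hindi.
# A real system would need far larger, curated vocabularies.
POSITIVE = {"great", "congratulations", "thanks", "accha", "badhai"}
NEGATIVE = {"sad", "angry", "pain", "dukh", "galat"}

def score_message(text):
    """Score a message on interest (0 to 1) and sentiment (-1 to 1)."""
    words = re.findall(r"[a-z]+", text.lower())
    # Assumption: longer messages are more likely to contain a story,
    # so interest grows with word count and is capped at 100 words.
    interest = min(len(words) / 100, 1.0)
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    # Sentiment is the balance of positive and negative words; 0.0 if none.
    sentiment = 0.0 if total == 0 else (pos - neg) / total
    return {"interest": interest, "sentiment": sentiment}

short_note = score_message("Great show, badhai!")
long_story = score_message(
    "Mere gaon mein yeh galat hua aur mujhe bahut dukh hai " * 10
)
```

Under these assumptions the short congratulatory note gets a low interest score and the long story a high one, which mirrors the ranking described above. Word lists like these are also where handling Hinglish would have to be built in by hand.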
Image: Satyamev Data via Gigaom.