it was such an unsecured database that the Comparitech researchers, led by Bob Diachenko, discovered on August 1, leaving the personal profile data of nearly 235 million Instagram, TikTok and YouTube users up for grabs.
The data was spread across several datasets; the most significant being two coming in at just under 100 million each and containing profile records apparently scraped from Instagram. The third-largest was a dataset of some 42 million TikTok users, followed by just under 4 million YouTube user profiles.
Comparitech says that, based on the samples it collected, one in five records contained either a telephone number or email address. Every record also included at least some, sometimes all, the following information:
- Profile name
- Full real name
- Profile photo
- Account description
Statistics about follower engagement, including:
- Number of followers
- Engagement rate
- Follower growth rate
- Audience gender
- Audience age
- Audience location
- Last post timestamp
“The information would probably be most valuable to spammers and cybercriminals running phishing campaigns,” Paul Bischoff, Comparitech editor, says. “Even though the data is publicly accessible, the fact that it was leaked in aggregate as a well-structured database makes it much more valuable than each profile would be in isolation,” Bischoff adds. Indeed, Bischoff told me that it would be easy for a bot to use the database to post targeted spam comments on any Instagram profile matching criteria such as gender, age or number of followers.
Tracing the source of the leaked data
So, where did all this data originate? The researchers suggest that the evidence, including dataset names, pointed to a company called Deep Social. However, Deep Social was banned by both Facebook and Instagram in 2018 after scraping user profile data. The company was wound down sometime after this.
A Facebook company spokesperson told me that “scraping people’s information from Instagram is a clear violation of our policies. We revoked Deep Social’s access to our platform in June 2018 and sent a legal notice prohibiting any further data collection.”
Once the researchers found the database and the clues to its origin, “we sent an alert to Deep Social, assuming the data belonged to them,” Bischoff says. The administrators of Deep Social then forwarded the disclosure to a Hong Kong-registered social media influencer data-marketing company called Social Data. “Social Data shut down the database about three hours after our initial email,” Bischoff says.