AI models leak secret data too easily

A paper released on arXiv last week by a team of researchers from the University of California, Berkeley, National University of Singapore, and Google Brain reveals just how vulnerable deep learning is to information leakage.

The researchers labelled the problem “unintended memorization” and explained it happens if miscreants can access to the model’s code and apply a variety of search algorithms. That’s not an unrealistic scenario considering the code for many models are available online. And it means that text messages, location histories, emails or medical data can be leaked.

Nicholas Carlini, first author of the paper and a PhD student at UC Berkeley, told The Register, that the team “don’t really know why neural networks memorize these secrets right now”.

“At least in part, it is a direct response to the fact that we train neural networks by repeatedly showing them the same training inputs over and over and asking them to remember these facts. At the end of training, a model might have seen any given input ten or twenty times, or even a hundred, for some models.

“This allows them to know how to perfectly label the training data – because they’ve seen it so much – but don’t know how to perfectly label other data. What we exploit to reveal these secrets is the fact that models are much more confident on data they’ve seen before,” he explained.
Secrets worth stealing are the easiest to nab

In the paper, the researchers showed how easy it is to steal secrets such as social security and credit card numbers, which can be easily identified from neural network’s training data.

They used the example of an email dataset comprising several hundred thousand emails from different senders containing sensitive information. This was split into different senders who have sent at least one secret piece of data and used to train a two-layer long short-term memory (LSTM) network to generate the next the sequence of characters.
The chances of sensitive data becoming available are also raised when the miscreant knows the general format of the secret. Credit card numbers, phone numbers and social security numbers all follow the same template with a limited number of digits – a property the researchers call “low entropy”.
Luckily, there are ways to get around the problem. The researchers recommend developers use “differential privacy algorithms” to train models. Companies like Apple and Google already employ these methods when dealing with customer data.

Private information is scrambled and randomised so that it is difficult to reproduce it. Dawn Song, co-author of the paper and a professor in the department of electrical engineering and computer sciences at UC Berkeley, told us the following:

Source: Boffins baffled as AI training leaks secrets to canny thieves • The Register