Yeah, that Zoom app you’re trusting with work chatter? It lives with ‘vampires feeding on the blood of human data’

As the global coronavirus pandemic pushes the popularity of videoconferencing app Zoom to new heights, one web veteran has sounded the alarm over its “creepily chummy” relationship with tracking-based advertisers.

Doc Searls, co-author of the influential internet marketing book The Cluetrain Manifesto last century, today warned [cached] Zoom not only has the right to extract data from its users and their meetings, it can work with Google and other ad networks to turn this personal information into targeted ads that follow them across the web.

This personal info includes, and is not limited to, names, addresses and any other identifying data, job titles and employers, Facebook profiles, and device specifications. Crucially, it also includes “the content contained in cloud recordings, and instant messages, files, whiteboards … shared while using the service.”

Searls said reports outlining how Zoom was collecting and sharing user data with advertisers, marketers, and other companies prompted him to pore over the software maker’s privacy policy to see how it processes calls, messages, and transcripts.

And he concluded: “Zoom is in the advertising business, and in the worst end of it: the one that lives off harvested personal data.

“What makes this extra creepy is that Zoom is in a position to gather plenty of personal data, some of it very intimate (for example with a shrink talking to a patient) without anyone in the conversation knowing about it. (Unless, of course, they see an ad somewhere that looks like it was informed by a private conversation on Zoom.)”

The privacy policy, as of March 18, lumps together a lot of different types of personal information, from contact details to meeting contents, and says this info may be used, one way or another, to personalize web ads to suit your interests.

“Zoom does use certain standard advertising tools which require personal data,” the fine-print states. “We use these tools to help us improve your advertising experience (such as serving advertisements on our behalf across the internet, serving personalized ads on our website, and providing analytics services) … For example, Google may use this data to improve its advertising services for all companies who use their services.”

Searls, a former Harvard Berkman Fellow, said netizens are likely unaware their information could be harvested from their Zoom accounts and video conferences for advertising and tracking across the internet: “A person whose personal data is being shed on Zoom doesn’t know that’s happening because Zoom doesn’t tell them. There’s no red light, like the one you see when a session is being recorded.

“Nobody goes to Zoom for an ‘advertising experience,’ personalized or not. And nobody wants ads aimed at their eyeballs elsewhere on the ‘net by third parties using personal information leaked out through Zoom.”

Speaking of Zoom…

Zoom’s iOS app sent analytics data to Facebook even if you didn’t use Facebook, due to the application’s use of the social network’s Graph API, Vice discovered. The privacy policy stated the software collects profile information when a Facebook account is used to sign into Zoom, though it didn’t say anything about what happens if you don’t use Facebook. Zoom has since corrected its code to not send analytics in these circumstances.

It should go without saying but don’t share your Zoom meeting ID and password in public, such as on social media, as miscreants will spot it, hijack it, and bomb it with garbage. And don’t forget to set a strong password, too. Zoom had to beef up its meeting security after Check Point found a bunch of weaknesses, such as the fact it was easy to guess or brute-force meeting IDs.

Source: Yeah, that Zoom app you’re trusting with work chatter? It lives with ‘vampires feeding on the blood of human data’ • The Register

Android apps are transmitting the list of other apps you’ve ever installed to marketing people

At this point we’re all familiar with apps of all sorts tracking our every move and sharing that info with pretty much every third party imaginable. But it actually may not be as simple as tracking where you go and what you do in an app: It turns out that these apps might be dropping details about the other programs you’ve installed on your phone, too.

This news comes courtesy of a new paper out from a team of European researchers who found that some of the most popular apps in the Google Play store were bundled with certain bits of software that pull details of any apps that were ever downloaded onto a person’s phone.

Before you immediately chuck your Android device out the window in some combination of fear and disgust, we need to clarify a few things. First, these bits of software—called IAMs, or “installed application methods”—have some decent uses. A photography app might need to check the surrounding environment to make sure your phone actually has a camera. And if another app glitches out in the presence of an on-phone camera, knowing the environment—and the reason for that glitch—can help a developer work out which part of the app to tinker with to keep that from happening in the future.

Because these IAM-specific calls are technically for debugging purposes, they generally don’t need to request permissions the way an app usually would when, say, asking for your location. Android has actually gotten better about clamping down on that form of invasive tracking after struggling with it for years, with Google recently announcing that Android 11 will formally require devs to apply for location permissions before Google grants that access.
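For a concrete sense of what an IAM call exposes, the sketch below pulls the same app inventory off a connected handset using adb's "pm list packages" command, which needs no special permission; on-device code gets the equivalent from PackageManager methods such as getInstalledPackages(). It assumes adb is installed and a device with USB debugging enabled is attached.

```python
# Illustration only: list every installed package on an attached Android
# device via adb. "pm list packages" needs no special permission, which is
# the same property IAM calls exploit on-device.
import subprocess

result = subprocess.run(
    ["adb", "shell", "pm", "list", "packages"],
    capture_output=True, text=True, check=True,
)
packages = [line.removeprefix("package:") for line in result.stdout.splitlines()]
print(f"{len(packages)} packages visible, e.g.:")
for name in packages[:10]:
    print(" ", name)
```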

But at the same time, surveying the apps on a given phone can go the invasive route very easily: The apps we download can tip developers off about our incomes, our sexualities, and some of our deepest fears.

The research team found that, of the roughly 4,200 commercial apps it surveyed making these IAM calls, almost half were strictly grabbing details on the surrounding apps. For context, most other calls—which were for monitoring details about the app like available updates, or the current app version—together made up less than one percent of all calls they observed.

There are a few reasons for the prevalence of this errant app-sniffing behavior, but for the most part it boils down to one thing: money. A lot of these IAMs come from apps that are on-boarding software from adtech companies offering developers an easy way to make quick cash off their free product. That’s probably why the lion’s share—more than 83%—of these calls were being made on behalf of third-party code that the dev onboarded for their commercially available app, rather than code that was baked into that app by design.

And for the most part, these third parties are—as you might have suspected—companies that specialize in targeted advertising. Looking over the top 20 libraries that pull some kind of data via IAMs, some of the top contenders, like ironSource or AppNext, are in the business of getting the right ads in front of the right player at the right time, offering the developer the right price for their effort.

And because app developers—like most people in the publishing space—are often hard-up for cash, they’ll onboard these money-making tools without asking how they make that money in the first place. This kind of daisy-chaining is the same reason we see trackers of every shape and size running across every site in the modern ecosystem, at times without the people actually behind the site having any idea.

Source: Android Apps May Be Snooping on You More Than You Realize

Ring corporate surveillance doorbells Continues To Insist Its Cameras Reduce Crime, But Crime Data Doesn’t Back Those Claims Up

Despite evidence to the contrary, Amazon’s Ring is still insisting it’s the best thing people can put on their front doors — an IoT camera with PD hookups that will magically reduce crime in their neighborhoods simply by being a mute witness to criminal acts.

Boasting over 1,000 law enforcement partnerships, Ring talks a good game about crime reduction, but its products haven’t proven to be any better than those offered by competitors — cameras that don’t come with law enforcement strings attached.

Last month, Cyrus Farivar undid a bit of Ring’s PR song-and-dance by using public records requests and conversations with law enforcement agencies to show any claim Ring makes about crime reduction probably (and in some cases definitely) can’t be linked to the presence of Ring’s doorbell cameras.

CNET has done the same thing and come to the same conclusion: the deployment of Ring cameras rarely results in any notable change in property crime rates. That runs contrary to the talking points deployed by Dave Limp — Amazon’s hardware chief — who “believes” adding Rings to neighborhoods makes neighborhoods safer. Limp needs to keep hedging.

CNET obtained property-crime statistics from three of Ring’s earliest police partners, examining the monthly theft rates from the 12 months before those partners signed up to work with the company, and the 12 months after the relationships began, and found minimal impact from the technology.

The data shows that crime continued to fluctuate, and analysts said that while many factors affect crime rates, such as demographics, median income and weather, Ring’s technology likely wasn’t one of them.

Worse for Ring — which has used its partnerships with law enforcement agencies to corner the market for doorbell cameras — law enforcement agencies are saying the same thing: Ring isn’t having any measurable impact on crime.

“In 2019, we saw a 6% decrease in property crime,” said Kevin Warych, police patrol commander in Green Bay, Wisconsin, but he noted, “there’s no causation with the Ring partnership.”

[…]

“I can’t put numbers on it specifically, if it works or if it doesn’t reduce crime,” [Aurora PD public information officer Paris] Lewbel said.

But maybe it doesn’t really matter to Ring if law enforcement agencies believe the crime reduction sales pitch. What ultimately matters is that end users might. After all, these cameras are installed on homes, not police departments. As long as potential customers believe crime in their area (or at least on their front doorstep) will be reduced by the presence of a camera, Ring can continue to increase market share.

But the spin is, at best, inaccurate. Crime rates in cities where Ring has partnered with law enforcement agencies continue to fluctuate. Meanwhile, Ring fortuitously began its mass deployment during a time of historically low crime rates, which have dropped steadily for more than 20 years. Hitting the market when things are good and keep getting better makes for pretty good PR, especially when company reps are willing to convert correlation to causation to sell devices.

Source: Ring Continues To Insist Its Cameras Reduce Crime, But Crime Data Doesn’t Back Those Claims Up | Techdirt

HP printers try to send loads of data back to HP about your devices and what you print

NB: you can block the HP software’s outgoing communication on public networks using Windows Defender Firewall, following HP’s own instructions (HP).

In short: open Windows Defender Firewall, choose “Allow an app or feature through Windows Defender Firewall”, search for HP, and deselect the Public zone for each HP entry.
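If you’d rather script it, here is a minimal sketch of the same idea using netsh from Python. It must run from an elevated (administrator) prompt, and the HP executable path below is an assumption; substitute whatever HP binaries are actually installed on your machine.

```python
# Add an outbound block rule for an HP program on the Public firewall profile.
# Run as administrator. Paths are assumptions; adjust to your install.
import subprocess

HP_PROGRAMS = [
    r"C:\Program Files\HP\HP Smart\HPSmart.exe",  # hypothetical example path
]

for program in HP_PROGRAMS:
    subprocess.run(
        [
            "netsh", "advfirewall", "firewall", "add", "rule",
            "name=Block HP outbound (public)",
            "dir=out", "action=block", "profile=public",
            "enabled=yes", f"program={program}",
        ],
        check=True,
    )
```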

At first the setup process was so simple that even a computer programmer could do it. But then, after I had finished removing pieces of cardboard and blue tape from the various drawers of the machine, I noticed that the final step required the downloading of an app of some sort onto a phone or computer. This set off my crapware detector.

It’s possible that I was being too cynical. I suppose that it was theoretically possible that the app could have been a thoughtfully-constructed wizard, which did nothing more than gently guide non-technical users through the sometimes-harrowing process of installing and testing printer drivers. It was at least conceivable that it could then quietly uninstall itself, satisfied with a simple job well done.

Of course, in reality it was a way to try and get people to sign up for expensive ink subscriptions and/or hand over their email addresses, plus something even more nefarious that we’ll talk about shortly (there were also some instructions for how to download a printer driver tacked onto the end). This was a shame, but not unexpected. I’m sure that the HP ink department is saddled with aggressive sales quotas, and no doubt the only way to hit them is to ruthlessly exploit people who don’t know that third-party cartridges are just as good as HP’s and are much cheaper. Fortunately, the careful user can still emerge unscathed from this phase of the setup process by gingerly navigating the UI patterns that presumably do fool some people who aren’t paying attention.

But it is only then, once the user has found the combination of “Next” and “Cancel” buttons that lead out of the swamp of hard sells and bad deals, that they are confronted with their biggest test: the “Data Collection Notice & Settings”.

In summary, HP wants its printer to collect all kinds of data that a reasonable person would never expect it to. This includes metadata about your devices, as well as information about all the documents that you print, including timestamps, number of pages, and the application doing the printing (HP state that they do stop short of looking at the contents of your documents). From the HP privacy policy, linked to from the setup program:

Product Usage Data – We collect product usage data such as pages printed, print mode, media used, ink or toner brand, file type printed (.pdf, .jpg, etc.), application used for printing (Word, Excel, Adobe Photoshop, etc.), file size, time stamp, and usage and status of other printer supplies. We do not scan or collect the content of any file or information that might be displayed by an application.

Device Data – We collect information about your computer, printer and/or device such as operating system, firmware, amount of memory, region, language, time zone, model number, first start date, age of device, device manufacture date, browser version, device manufacturer, connection port, warranty status, unique device identifiers, advertising identifiers and additional technical information that varies by product.

HP wants to use the data they collect for a wide range of purposes, the most eyebrow-raising of which is for serving advertising. Note the last column in this “Privacy Matrix”, which states that “Product Usage Data” and “Device Data” (amongst many other types of data) are collected and shared with “service providers” for purposes of advertising.

HP delicately balances short-term profits with reasonable-man-ethics by only half-obscuring the checkboxes and language in this part of the setup.

At this point everything has become clear – the job of this setup app is not only to sell expensive ink subscriptions; it’s also to collect what apparently passes for informed consent in a court of law. I clicked the boxes to indicate “Jesus Christ no, obviously not, why would anyone ever knowingly consent to that”, and then spent 5 minutes Googling how to make sure that this setting was disabled. My research suggests that it’s controlled by an item in the settings menu of the printer itself labelled “Store anonymous usage information”. However, I don’t think any reasonable person would think that the meaning of “Store anonymous usage information” includes “send analytics data back to HP’s servers so that it can be used for targeted advertising”, so either HP is being deliberately coy or there’s another option that disables sending your data that I haven’t found yet.

I bet there’s also a vigorous debate to be had over whether HP’s definition of “anonymous” is the same as mine.


I imagine that a user’s data is exfiltrated back to HP by the printer itself, rather than any client-side software. Once HP has a user’s data then I don’t know what they do with it. Maybe if they can see that you are printing documents from Photoshop then they can send you spam for photo paper? I also don’t know anything about how much a user’s data is worth. My guess is that it’s depressingly little. I’d almost prefer it if HP was snatching highly valuable information that was worth making a high-risk, high-reward play for. But I can’t help but feel like they’re just grabbing whatever data is lying around because they might as well, it might be worth a few cents, and they (correctly) don’t anticipate any real risk to their reputation and bottom line from doing so.


Source: HP printers try to send data back to HP about your devices and what you print | Robert Heaton

Private By Design: Free and Private Voice Assistants

Science fiction has whetted our imagination for helpful voice assistants. Whether it’s JARVIS from Iron Man, KITT from Knight Rider, or Computer from Star Trek, many of us harbor a desire for a voice assistant to manage the minutiae of our daily lives. Speech recognition and voice technologies have advanced rapidly in recent years, particularly with the adoption of Siri, Alexa, and Google Home.

However, many in the maker community are concerned — rightly — about the privacy implications of using commercial solutions. Just how much data do you give away every time you speak with a proprietary voice assistant? Just what are they storing in the cloud? What free, private, and open source options are available? Is it possible to have a voice stack that doesn’t share data across the internet?

Yes, it is. In this article, I’ll walk you through the options.

WHAT’S IN A VOICE STACK?

Some voice assistants offer a whole stack of software, but you may prefer to pick and choose which layers to use.

» WAKE WORD SPOTTER — This layer is constantly listening until it hears the wake word or hot word, at which point it will activate the speech-to-text layer. “Alexa,” “Jarvis,” and “OK Google” are wake words you may know.

» SPEECH TO TEXT (STT) — Also called automatic speech recognition (ASR). Once activated by the wake word, the job of the STT layer is just that: to recognize what you’re saying and turn it into written form. Your spoken phrase is called an utterance.

» INTENT PARSER — Also called natural language processing (NLP) or natural language understanding (NLU). The job of this layer is to take the text from STT and determine what action you would like to take. It often does this by recognizing entities — such as a time, date, or object — in the utterance.

» SKILL — Once the intent parser has determined what you’d like to do, an application or handler is triggered. This is usually called a skill or application. The computer may also create a reply in human-readable language, using natural language generation (NLG).

» TEXT TO SPEECH — Once the skill has completed its task, the voice assistant may acknowledge or respond using a synthesized voice.

Some layers work on device, meaning they don’t need an internet connection. These are a good option for those concerned about privacy, because they don’t share your data across the internet. Others do require an internet connection because they offload processing to cloud servers; these can be more of a privacy risk.
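To make the layering concrete, here is a minimal sketch of how the five layers chain together in one loop. Every layer is a stub; in a real project each would be backed by one of the tools covered below (PocketSphinx for the wake word, DeepSpeech for STT, Padatious for intents, eSpeak for TTS, and so on).

```python
# A toy voice-assistant loop with every layer stubbed out.
from datetime import datetime

def wait_for_wake_word() -> None:
    input("(wake word stub) press Enter to simulate 'hey computer'... ")

def speech_to_text() -> str:
    return input("(STT stub) type your utterance: ")

def parse_intent(utterance: str) -> dict:
    # A real intent parser returns an intent name plus recognized entities.
    if "time" in utterance.lower():
        return {"intent": "query.time"}
    return {"intent": "unknown"}

def run_skill(intent: dict) -> str:
    if intent["intent"] == "query.time":
        return f"It is {datetime.now():%H:%M}."
    return "Sorry, I didn't catch that."

def text_to_speech(reply: str) -> None:
    print("(TTS stub):", reply)

while True:
    wait_for_wake_word()
    text_to_speech(run_skill(parse_intent(speech_to_text())))
```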

Before you pick a voice stack for your project you’ll need to ask key questions such as:

• What’s the interface of the software like — how easy is it to install and configure, and what support is available?

• What sort of assurances do you have around the software? How accurate is it? Does it recognize your accent well? Is it well tested? Does it make the right decisions about your intended actions?

• What sort of context, or use case, do you have? Do you want your data going across the internet or being stored on cloud servers? Is your hardware constrained in terms of memory or CPU? Do you need to support languages other than English?

ALL-IN-ONE VOICE SOLUTIONS

If you’re looking for an easy option to start with, you might want to try an all-in-one voice solution. These products often package other software together in a way that’s easy to install. They’ll get your DIY voice project up and running the fastest.

Jasper is designed from the ground up for makers and is intended to run on a Raspberry Pi. It’s a great first step for integrating voice into your projects. With Jasper, you choose which software components you want to use and write your own skills, and it’s possible to configure it so that it doesn’t need an internet connection to function.

Rhasspy also uses a modular framework and can be run without an internet connection. It’s designed to run under Docker and has integrations for NodeRED and for Home Assistant, a popular open source home automation software.

Mycroft is modular too, but by default it requires an internet connection. Skills in Mycroft are easy to develop and are written in Python 3; existing skills include integrations with Home Assistant and Mozilla WebThings. Mycroft also builds open-source hardware voice assistants similar to Amazon Echo and Google Home. And it has a distribution called Picroft specifically for the Raspberry Pi 3B and above.

Almond is a privacy-preserving voice assistant from Stanford that’s available as a web app, for Android, or for the GNOME Linux desktop. Almond is very new on the scene, but already has an integration with Home Assistant. It also has options that allow it to run on the command line, so it could be installed on a Raspberry Pi (with some effort).

The languages supported by all-in-one voice solutions are dependent on what software options are selected, but by default they use English. Other languages require additional configuration.

WAKE WORD SPOTTERS

PocketSphinx is a great option for wake word spotting. It’s available for Linux, Mac, and Windows, as well as Android and iOS; however, installation can be involved. PocketSphinx works on-device by recognizing phonemes, which are the smallest units of sound that make up a word.

For example, hello and world each have four phonemes:

hello H EH L OW

world W ER L D
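Here is what phoneme-based keyphrase spotting looks like with PocketSphinx’s Python bindings, adapted from the project’s documented keyphrase example; the threshold needs tuning for your particular phrase and microphone.

```python
# Minimal wake word spotter using PocketSphinx keyphrase search
# (pip install pocketsphinx). The threshold needs tuning per phrase.
from pocketsphinx import LiveSpeech

speech = LiveSpeech(lm=False, keyphrase="hey computer", kws_threshold=1e-20)
for phrase in speech:
    print("Wake word heard:", phrase.segments(detailed=True))
    # hand control to the speech-to-text layer here
```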

The downside of PocketSphinx is that its core developers appear to have moved on to a for-profit company, so it’s not clear how long PocketSphinx or its parent CMU Sphinx will be around.

Precise by Mycroft.AI uses a recurrent neural network to learn what are and are not wake words. You can train your own wake words with Precise, but it does take a lot of training to get accurate results.

Snowboy lets makers train their own wake word for free, using Kitt.AI’s (proprietary) training service, and also comes with several pre-trained models and wrappers for several programming languages, including Python and Go. Once you’ve trained your wake word, you no longer need an internet connection. It’s an easier option for beginners than Precise or PocketSphinx, and its very small CPU footprint makes it ideal for embedded electronics. Kitt.AI was acquired by Chinese giant Baidu in 2017, although to date it appears to remain its own entity.

Porcupine from Picovoice is designed specifically for embedded applications. It comes in two variants: a complete model with higher accuracy, and a compressed model with slightly lower accuracy but a much smaller CPU and memory footprint. It provides examples for integration with several common programming languages. Ada, the voice assistant recently released by Home Assistant, uses Porcupine under the hood.

SPEECH TO TEXT

Kaldi has for years been the go-to open source speech-to-text engine. Models are available for several languages, including Mandarin. It works on-device but is notoriously difficult to set up and not recommended for beginners. You can use Kaldi to train your own speech-to-text model if you have spoken phrases and recordings, for example in another language. Researchers at the Australian Centre for the Dynamics of Language have recently developed Elpis, a wrapper for Kaldi that makes transcription to text a lot easier. It’s aimed at linguists who need to transcribe lots of recordings.

CMU Sphinx, like its child PocketSphinx, is based on phoneme recognition, works on-device, and is complex for beginners.

DeepSpeech, part of Mozilla’s Common Voice project, is another major player in the open source space that’s been gaining momentum. DeepSpeech comes with a pre-trained English model but can be trained on other data sets — this requires a compatible GPU. Trained models can be exported using TensorFlow Lite for inference, and it’s been tested on a RasPi 4, where it comfortably performs real-time transcriptions. Again, it’s complex for beginners.
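As a rough sketch of how little code a basic DeepSpeech transcription takes (constructor arguments have shifted between releases, so check your installed version; the model filename here is a placeholder):

```python
# Transcribe a 16 kHz, 16-bit mono WAV file with a pre-trained DeepSpeech
# model (pip install deepspeech). Model filename is a placeholder.
import wave
import numpy as np
from deepspeech import Model

model = Model("deepspeech-english-model.pbmm")  # placeholder filename

with wave.open("utterance.wav", "rb") as wav:
    audio = np.frombuffer(wav.readframes(wav.getnframes()), dtype=np.int16)

print(model.stt(audio))
```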

INTENT PARSING AND ENTITY RECOGNITION

There are two general approaches to intent parsing and entity recognition: neural networks and slot matching. The neural network is trained on a set of phrases, and can usually match an utterance that “sounds like” an intent that should trigger an action. In the slot matching approach, your utterance needs to closely match a set of predefined “slots,” such as “play the song [songname] using [streaming service].” If you say “play Blur,” the utterance won’t match the intent.

Padatious is Mycroft’s new intent parser, which uses a neural network. They also developed Adapt, which uses the slot matching approach.
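Here is a minimal sketch of the slot syntax as Padatious consumes it. Because Padatious matches with a neural network rather than rigid slots, an utterance like “play Blur” can still score against the intent even though the {service} entity is missing; the intent and entity names are made up for illustration.

```python
# Train a tiny Padatious intent and parse an utterance
# (pip install padatious). Intent and entity names are illustrative.
from padatious import IntentContainer

container = IntentContainer("intent_cache")
container.add_intent("play.music", [
    "play the song {song} using {service}",
    "play {song}",
])
container.train()

match = container.calc_intent("play Blur")
print(match.name, match.conf, match.matches)
```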

For those who use Python and want to dig a little deeper into the structure of language, the Natural Language Toolkit is a powerful tool: it can do part-of-speech tagging and named-entity recognition — for example, recognizing the names of places.

Rasa is a set of tools for conversational applications, such as chatbots, and includes a robust intent parser. Rasa makes predictions about intent based on the entire context of a conversation. Rasa also has a training tool called Rasa X, which helps you train the conversational agent to your particular context. Rasa X comes in both an open source community edition and a licensed enterprise edition.

Picovoice also has Rhino, which comes with pre-trained intent parsing models for free. However, customization of models — for specific contexts like medical or industrial applications — requires a commercial license.

TEXT TO SPEECH

Just like speech-to-text models need to be “trained” for a particular language or dialect, so too do text-to-speech models. However, text to speech is usually trained on a single voice, such as “British Male” or “American Female.”

eSpeak is perhaps the best-known open source text-to-speech engine. It supports over 100 languages and accents, although the quality of the voice varies between languages. eSpeak supports the Speech Synthesis Markup Language format, which can be used to add inflection and emphasis to spoken language. It is available for Linux, Windows, Mac, and Android systems, and it works on-device, so it can be used without an internet connection, making it ideal for maker projects.
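Since eSpeak is a command-line tool, wiring it into a project can be a one-line subprocess call; the voice and speed flags below are just example settings, and -m turns on interpretation of SSML-style markup.

```python
# Speak a line with eSpeak, entirely on-device. -v picks the voice,
# -s sets words per minute, -m enables SSML-style markup.
import subprocess

subprocess.run(
    ["espeak", "-v", "en-gb", "-s", "150", "-m",
     "Hello. <emphasis>This</emphasis> runs without the cloud."],
    check=True,
)
```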

Festival is now quite dated, and needs to be compiled from source for Linux, but does have around 15 American English voices available. It works on-device. It’s mentioned here out of respect; for over a decade it was considered the premier open source text-to-speech engine.

Mimic2 is a Tacotron fork from Mycroft AI, who have also released the Mimic Recording Studio to allow you to build your own text-to-speech voices. Getting a high-quality voice requires up to 100 hours of “clean” speech, and Mimic2 is too large to work on-device, so you need to host it on your own server or connect your device to the Mycroft Mimic2 server. Currently it has only a pre-trained voice for American English.

Mycroft’s earlier Mimic TTS can work on-device, even on a Raspberry Pi, and is another good candidate for maker projects. It’s a fork of CMU Flite.

Mary Text to Speech supports several, mainly European, languages and has tools for synthesizing new voices. It runs on Java, so it can be complex to install.

So, that’s a map of the current landscape in open source voice assistants and software layers. You can compare all these layers in the chart accompanying the original article. Whatever your voice project, you’re likely to find something here that will do the job well — and will keep your voice and your data private from Big Tech.

WHAT’S NEXT FOR OPEN SOURCE VOICE?

As machine learning and natural language processing continue to advance rapidly, we’ve seen the decline of the major open source voice tools. CMU Sphinx, Festival, and eSpeak have become outdated as their supporters have adopted other tools, or maintainers have gone into private industry and startups.

We’re going to see more software that’s free for personal use but requires a commercial license for enterprise, as Rasa and Picovoice do today. And it’s understandable; dealing with voice in an era of machine learning is data intensive, a poor fit for the open source model of volunteer development. Instead, companies are driven to commercialize by monetizing a centralized “platform as a service.”

Another trajectory this might take is some form of value exchange. Training all those neural networks and machine learning models — for STT, intent parsing, and TTS — takes vast volumes of data. More companies may provide software on an open source basis and in return ask users to donate voice samples to improve the data sets. Mozilla’s Common Voice follows this model.

Another trend is voice moving on-device. The newer, machine-learning-driven speech tools originally were too computationally intensive to run on low-end hardware like the Raspberry Pi. But with DeepSpeech now running on a RasPi 4, it’s only a matter of time before the newer TTS tools can too.

We’re also seeing a stronger focus on personalization, with the ability to customize both speech-to-text and text-to-speech software.

WHAT WE STILL NEED

What’s lacking across all these open source tools are user-friendly interfaces to capture recordings and train models. Open source products must continue to improve their UIs to attract both developer and user communities; failure to do so will see more widespread adoption of proprietary and “freemium” tools.

As always in emerging technologies, standards remain elusive. For example, skills have to be rewritten for different voice assistants. Device manufacturers, particularly for smart home appliances, won’t want to develop and maintain integrations for multiple assistants; much of this will fall to an already-stretched open source community until mechanisms for interoperability are found. Mozilla’s WebThings ecosystem may plug the interoperability gap if it can garner enough developer support.

Regardless, the burden rests with the open source community to find ways to connect to proprietary systems, because there’s no incentive for manufacturers to do the converse.

The future of open source rests in your hands! Experiment and provide feedback, issues, pull requests, data, ideas, and bugs. With your help, open source can continue to have a strong voice.


Source: Private By Design: Free and Private Voice Assistants

Pervasive digital locational surveillance of citizens deployed in COVID-19 fight

Pervasive surveillance through digital technologies is the business model of Facebook and Google. And now governments are considering the web giants’ tools to track COVID-19 carriers for the public good.

Among democracies, Israel appears to have gone first: prime minister Benjamin Netanyahu has announced “emergency regulations that will enable the use of digital means in the war on Corona. These means will greatly assist us in locating patients and thereby stop the spread of the virus.”

Speaking elsewhere, Netanyahu said the digital tools are those used by Israeli security agency Shin Bet to observe terrorists. Netanyahu said the tools mean the government “will be able to see who they [people infected with the virus] were with, what happened before and after [they became infected].”

Strict oversight and a thirty-day limit on the use of the tools is promised. But the tools’ use was announced as a fait accompli before Israel’s Parliament or the relevant committee could properly authorise their use. And that during a time of caretaker government!

The idea of using tech to spy on COVID-carriers may now be catching.

The Washington Post has reported that the White House has held talks with Google and Facebook about how the data they hold could contribute to analysis of the virus’ spread. Both companies already share some anonymised location data with researchers. The Post suggested anonymised location data could be used by government agencies to understand how people are behaving.

Thailand recently added a COVID-19-screening form to the Airports of Thailand app. While the feature is a digital replica of a paper registration form offered to incoming travellers, the app asks for location permission and tries to turn on Bluetooth every time it is activated. The Register has asked the app’s developers to explain the permissions it seeks, but has not received a reply in 48 hours.

Computer Emergency Response Team in Farsi chief incident response officer Nariman Gharib has claimed that the Iranian government’s COVID-diagnosis app tracks its users.

China has admitted it’s using whatever it wants to track its people – the genie has been out of the bottle there for years.

If other nations follow suit, will it be possible to put the genie back in?

Probably not: plenty of us give away our location data to exercise-tracking apps for the sheer fun of it, and government agencies gleefully hoover up what they call “open source intelligence”.

Source: Pervasive digital surveillance of citizens deployed in COVID-19 fight, with rules that send genie back to bottle • The Register

Brave Browser Delivers on Promise, Files GDPR Complaint Against Google

Earlier today, March 16, Brave filed a formal complaint against Google with the lead General Data Protection Regulation (GDPR) enforcer in Europe.

In a February Cointelegraph interview, Dr. Johnny Ryan, Brave’s chief policy and industry relations officer, explained that Google is abusing its power by sharing user data collected by dozens of its distinct services, creating a “free for all” data warehouse. According to Ryan, this was a clear violation of the GDPR.

Aggravated with the situation and the lack of enforcement against the giant, Ryan promised to take Google to court if things don’t change for the better.

Complaint against Google

Now, the complaint is with the Irish Data Protection Commission. It accuses Google of violating Article 5(1)b of the GDPR. Dublin is Google’s European headquarters and, as Dr. Ryan explained to Cointelegraph, the Commission “is responsible for regulating Google’s data protection across the European Economic Area”.

Article 5(1)b of the GDPR requires that data be “collected for specified, explicit and legitimate purposes and not further processed in a manner that is incompatible with those purposes”. According to Dr. Ryan:

“Enforcement of Brave’s GDPR ‘purpose limitation’ complaint against Google would be tantamount to a functional separation, giving everyone the power to decide what parts of Google they chose to reward with their data.”

Google is a “black box”

Dr. Ryan has spent six months trying to elicit a response from Google to a basic question: “What do you do with my data?” to no avail.

Alongside the complaint, Brave released a study called “Inside the Black Box”, which:

“Examines a diverse set of documents written for Google’s business clients, technology partners, developers, lawmakers, and users. It reveals that Google collects personal data from integrations with websites, apps, and operating systems, for hundreds of ill-defined processing purposes.”

Brave does not need regulators to compete with Google

Cointelegraph asked Dr. Ryan how Google’s treatment of user data frustrates Brave as a competitor, to which Dr. Ryan replied:

“The question is not relevant. Brave does not —  as far as I am aware — have direct frustrations with Google. Brave is growing nicely by being a particularly fast, excellent, and private browser. (It doesn’t need regulators to help it grow.)”

A recent privacy study indicated that Brave protects user privacy much better than Google Chrome or any other major browser.

In addition to filing a formal complaint with the Irish Data Protection Commission, Brave has reportedly written to the European Commission, German Bundeskartellamt, UK Competition & Markets Authority, and French Autorité de la concurrence.

If none of these regulatory bodies take action against Google, Brave has suggested that it may take the tech giant to court itself.

Source: Brave Browser Delivers on Promise, Files GDPR Complaint Against Google

Data of millions of eBay and Amazon shoppers exposed by VAT-analysing third party

Researchers have discovered another big database containing millions of European customer records left unsecured on Amazon Web Services (AWS) for anyone to find using a search engine.

A total of eight million records were involved, collected via marketplace and payment system APIs belonging to companies including Amazon, eBay, Shopify, PayPal, and Stripe.

Discovered by Comparitech’s noted breach hunter Bob Diachenko, the AWS instance containing the MongoDB database became visible on 3 February and remained indexable by search engines for five days.

Data in the records included names, shipping addresses, email addresses, phone numbers, items purchased, payments, order IDs, links to Stripe and Shopify invoices, and partially redacted credit cards.

Also included were thousands of Amazon Marketplace Web Services (MWS) queries, an MWS authentication token, and an AWS access key ID.

Because a single customer might generate multiple records, Comparitech wasn’t able to estimate how many customers might be affected.

About half of the customers whose records were leaked are from the UK; as far as we can tell, most if not all of the rest are from elsewhere in Europe.

How did this happen?

According to Comparitech, the unnamed company involved was a third party conducting cross-border value-added tax (VAT) analysis.

That is, a company none of the affected customers would have heard of or have any relationship with:

This exposure exemplifies how, when handing over personal and payment details to a company online, that info often passes through the hands of various third parties contracted to process, organize, and analyze it. Rarely are such tasks handled solely in house.

The exposed MWS queries, authentication token, and access key could be used to query the MWS API, Comparitech said, potentially allowing an attacker to request records from sales databases. For that reason, it recommended that the companies involved immediately change their passwords and keys.

Banjo, the company using AI to spy on all of Utah through its cameras, used a secret company and fake apps to scrape social media

Banjo, an artificial intelligence firm that works with police, used a shadow company to create an array of Android and iOS apps that looked innocuous but were specifically designed to secretly scrape social media, Motherboard has learned.

The news signifies an abuse of data by a government contractor, with Banjo going far beyond what companies which scrape social networks usually do. Banjo created a secret company named Pink Unicorn Labs, according to three former Banjo employees, with two of them adding that the company developed the apps. This was done to avoid detection by social networks, two of the former employees said.

Three of the apps created by Pink Unicorn Labs were called “One Direction Fan App,” “EDM Fan App,” and “Formula Racing App.” Motherboard found these three apps on archive sites and downloaded and analyzed them, as did an independent expert. The apps—which appear to have been originally compiled in 2015 and were on the Play Store until 2016, according to Google—outwardly had no connection to Banjo, but an analysis of their code indicates connections to the company. This aspect of Banjo’s operation has some similarities with the Cambridge Analytica scandal, with multiple sources comparing the two incidents.

“Banjo was doing exactly the same thing but more nefariously, arguably,” a former Banjo employee said, referring to how seemingly unrelated apps were helping to feed the activities of the company’s main business.

[…]

Last year Banjo signed a $20.7 million contract with Utah that granted the company access to the state’s traffic, CCTV, and public safety cameras. Banjo promises to combine that input with a range of other data such as satellites and social media posts to create a system that it claims alerts law enforcement of crimes or events in real-time.

“We essentially do most of what Palantir does, we just do it live,” Banjo’s top lobbyist Bryan Smith previously told police chiefs and 911 dispatch officials when pitching the company’s services.

[…]

Motherboard found the apps developed by Pink Unicorn Labs included code mentioning signing into Facebook, Twitter, Instagram, Russian social media app VK, FourSquare, Google Plus, and Chinese social network Sina Weibo.

[…]

One of the former employees said they saw one of the apps when it was still working and it had a high number of logins.

“It was all major social media platforms,” they added. The particular versions of the apps Motherboard obtained, when opened, asked a user to sign-in with Instagram.

Business records for Pink Unicorn Labs show the company was originally incorporated by Banjo CEO Damien Patton. Banjo employees worked directly on Pink Unicorn Labs projects from Banjo’s offices, several of the former employees said, though they added that Patton made it clear in recent years that Banjo needed to wind down Pink Unicorn Labs’ work and not be linked to the firm.

“There was something about Pink Unicorn that was important for Damien to distance himself from,” another former employee told Motherboard.

[…]

Some similar companies, like Dataminr, have permission from social media sites to use large amounts of data; Twitter, which owns a stake in Dataminr, gives the firm exclusive access to its so-called “fire hose” of public posts.

Banjo did not have that sort of data access. So it created Pink Unicorn Labs, which one former employee described as a “shadow company,” that developed apps to harvest social media data.

“They were shitty little apps that took advantage of some of the data that we had but the catch was that they had a ton of OAuth providers,” one of the former employees said. OAuth providers are methods for signing into apps or websites via another service, such as Facebook’s “Facebook Connect,” Twitter’s “Sign In With Twitter,” or Google’s “Google Sign-In.” These providers mean a user doesn’t have to create a new account for each site or app they want to use, and can instead log in via their already established social media identity.
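For context, this is roughly what the standard OAuth2 authorization-code flow looks like from an app’s side, sketched with the requests-oauthlib library and placeholder endpoints and credentials. The point is that the app ends up holding a bearer token for the user’s account, and only policy, not the protocol itself, stops a malicious app from storing that token and reusing it, which is what Banjo’s apps are alleged to have done.

```python
# Generic OAuth2 authorization-code flow (pip install requests-oauthlib).
# All URLs and credentials are placeholders, not any real provider's.
from requests_oauthlib import OAuth2Session

AUTHORIZE_URL = "https://provider.example/oauth/authorize"  # placeholder
TOKEN_URL = "https://provider.example/oauth/token"          # placeholder

session = OAuth2Session(
    client_id="YOUR_CLIENT_ID",
    redirect_uri="https://app.example/callback",
    scope=["public_profile"],
)
auth_url, state = session.authorization_url(AUTHORIZE_URL)
print("Sign in at:", auth_url)

callback = input("Paste the full callback URL after signing in: ")
token = session.fetch_token(
    TOKEN_URL, client_secret="YOUR_CLIENT_SECRET",
    authorization_response=callback,
)
# The app now holds the user's access token. A well-behaved app uses it
# once for sign-in; nothing technical stops it being saved and replayed.
print("Got token:", token["access_token"][:8] + "...")
```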

But once users logged into the innocent looking apps via a social network OAuth provider, Banjo saved the login credentials, according to two former employees and an expert analysis of the apps performed by Kasra Rahjerdi, who has been an Android developer since the original Android project was launched. Banjo then scraped social media content, those two former employees added. The app also contained nonstandard code written by Pink Unicorn Labs: “The biggest red flag for me is that all the code related to grabbing Facebook friends, photos, location history, etc. is directly from their own codebase,” Rahjerdi said.

[…]

“Banjo was secretly farming peoples’ user tokens via these shadow apps,” one of the former employees said. “That was the entire point and plan,” they added when asked if the apps were specifically designed to steal users’ login tokens.

[…]

The apps request a wide range of permissions, such as access to location data, the ability to create accounts and set passwords, and find accounts on the device.

Multiple sources said Banjo tried to keep Pink Unicorn Labs a secret, but Motherboard found several links between the two. An analysis of the Android apps revealed all three had code that contained web links to Banjo’s website; each app contained a set of identical data that appeared to be pulled from social network sites, including repeatedly the Twitter profile of Jennifer Peck, who works for Banjo and is also married to Banjo’s Patton. In registration records for the two companies, both Banjo and Pink Unicorn Labs shared the same address in Redwood, California; and Patton is listed as the creator of Pink Unicorn Labs in that firm’s own public records.

Source: Surveillance Firm Banjo Used a Secret Company and Fake Apps to Scrape Social Media – VICE

Whisper App Exposes Entire History of Chat Logs, personal details and location

Whisper, the anonymous messaging app beloved by teens and tweens the world over, has a problem: it’s not as anonymous as we’d thought. The platform is only the latest that brands itself as private by design while leaking sensitive user data into the open, according to a damning Washington Post report out earlier today. According to the sleuths that uncovered the leak, “anonymous” posts on the platform—which tackle everything from closeted homosexuality, to domestic abuse, to unwanted pregnancies—could easily be tied to the original poster.

As is often the case, the culprit was a leaky bucket that housed the platform’s entire posting history since the app first came onto the scene in 2012. And because this app has historically courted a ton of teens, a lot of this data can get really unsavory, really fast. The Post describes being able to pull a search for users that listed their age as fifteen and getting more than a million results in return, which included not only their posts, but any identifying information they gave the platform, like age, ethnicity, gender, and the groups they were a part of—including groups that are centered around delicate topics like sexual assault.

Whisper told the Post that it shut down the leak once contacted—a point that Gizmodo independently confirmed. Still, the company has yet to crack down on its less-than-satisfying policies surrounding location data. In 2014, Whisper was caught sharing this data with federal researchers as part of research on personnel stationed at military bases. In the years since, a lot of this data remains up for grabs: while some law enforcement officials might have legitimate reasons to get their hands on it, Gizmodo’s own analysis found multiple targeted advertising partners scooping up user location data as recently as this afternoon.

Source: Whisper App Exposes Entire History of Chat Logs: Report

Utah has given all its camera feeds to an AI, turning the state into a surveillance panopticon

The state of Utah has given an artificial intelligence company real-time access to state traffic cameras, CCTV and “public safety” cameras, 911 emergency systems, location data for state-owned vehicles, and other sensitive data.

The company, called Banjo, says that it’s combining this data with information collected from social media, satellites, and other apps, and claims its algorithms “detect anomalies” in the real world.

The lofty goal of Banjo’s system is to alert law enforcement of crimes as they happen. It claims it does this while somehow stripping all personal data from the system, allowing it to help cops without putting anyone’s privacy at risk. As with other algorithmic crime systems, there is little public oversight or information about how, exactly, the system determines what is worth alerting cops to.

Source: This Small Company Is Turning Utah Into a Surveillance Panopticon – VICE

Clearview AI: We Are ‘Working to Acquire All U.S. Mugshots’ From Past 15 Years

Clearview AI worked to build a national database of every mug shot taken in the United States during the past 15 years, according to an email obtained by OneZero through a public records request.

The email, sent by a representative for Clearview AI in August 2019, was in response to an inquiry from the Green Bay Police Department in Wisconsin, which had asked if there was a way to upload its own mug shots to Clearview AI’s app.

“We are… working to acquire all U.S. mugshots nationally from the last 15 years, so once we have that integrated in a few months’ time it might just be superfluous anyway,” wrote the Clearview AI employee, whose name was redacted.

Clearview AI is best known for scraping the public internet, including social media, for billions of images to power its facial recognition app, which was first reported on by the New York Times. Some of those images are pulled from online repositories of mug shots, like Rapsheets.org and Arrests.org, according to other emails obtained by OneZero. Acquiring a national mug shot database would make Clearview AI an even more powerful tool for police departments, which would be able to easily match a photograph of an individual against their criminal history.

Clearview AI did not immediately respond to a request for comment from OneZero. It is unclear whether the company ultimately succeeded in acquiring such a database.

Source: Clearview AI: We Are ‘Working to Acquire All U.S. Mugshots’ From Past 15 Years

Clearview AI let celebs and investors use its facial recognition app for fun

Creepy facial recognition firm Clearview AI—which claims to have built an extensive database from billions of photos scraped from the public web—allowed the rich and powerful to use its app as a personal plaything and spy tool, according to reporting from the New York Times on Thursday.

Clearview and its founder, Hoan Ton-That, claim that the database is only supposed to be used by law enforcement and “select security professionals” in the course of investigations. Prior reports from the Times revealed that hundreds of law enforcement agencies, including the Department of Justice and Immigration and Customs Enforcement, had used Clearview’s biometric tools, which is alarming enough, given the total lack of any U.S. laws regulating how face recognition can be used and its proven potential in mass surveillance of anyone from minorities to political targets. Clearview also pitched itself and its tools to white supremacist Paul Nehlen, then a candidate for Congress, saying it could provide “unconventional databases” for “extreme opposition research.”

But the Times has now found that Clearview’s app was “freely used in the wild by the company’s investors, clients and friends” in situations ranging from showing off at parties to, in the case of billionaire Gristedes founder John Catsimatidis, correctly identifying a man his daughter was on a date with. More alarmingly, Catsimatidis launched a trial run of Clearview’s potential as a surveillance tool at his chain of grocery stores.

Catsimatidis told the Times that a Gristedes in Manhattan had used Clearview to screen for “shoplifters or people who had held up other stores,” adding, “People were stealing our Häagen-Dazs. It was a big problem.” That dovetails with other reporting by BuzzFeed that found Clearview is developing security cameras designed to work with its face recognition tools and that corporations including Kohl’s, Macy’s, and the NBA had tested it.

Source: Clearview AI Let Celebs, Investors Use Facial Recognition App

DuckDuckGo Made a List of Jerks Tracking You Online

DuckDuckGo, a privacy-focused tech company, today launched something called Tracker Radar—an open-source, automatically generated and continually updated list that currently contains more than 5,000 domains that more than 1,700 companies use to track people online.

The idea behind Tracker Radar, first reported by CNET, is to share the data DuckDuckGo has collected to create a better set of tracker blockers. DuckDuckGo says that the majority of existing tracker data falls into two types: block lists and in-browser tracker identification. The issue is the former relies on crowd-sourcing and manual maintenance. The latter is difficult to scale and also can be potentially abused due to the fact it’s generating a list based on your actual browsing habits. Tracker Radar supposedly gets around some of these issues by looking at the most common cross-site trackers and including a host of information about their behavior, things like prevalence, fingerprinting, cookies, and privacy policies, among other considerations.

This can be weedsy, especially if the particulars of adtech make your eyeballs roll out of their sockets. The gist is, that creepy feeling you get when you see ads on social media for that product you googled the other day? All that is powered by the types of hidden trackers DuckDuckGo is trying to block. On top of shopping data, these trackers can also glean your search history, location data, along with a number of other metrics. That can then be used to infer data like age, ethnicity, and gender to create a profile that then gets shared with other companies looking to profit off you without your explicit consent.

As for how people can actually take advantage of it, it’s a little more roundabout. The average joe mostly benefits by using… DuckDuckGo’s browser mobile apps for iOS and Android, or desktop browser extensions for Chrome, Firefox, and Safari.

As for developers, DuckDuckGo is encouraging them to create their own tracker block lists. The company is also suggesting researchers use Tracker Radar to help them study online tracking. You can find the data set in the duckduckgo/tracker-radar repository on GitHub.
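For a feel for what’s in the data set, here is a minimal sketch that reads one domain file from a local clone of the duckduckgo/tracker-radar repository. The field names follow the repo’s documentation, but treat the exact path layout and schema as subject to change.

```python
# Read one Tracker Radar domain record from a local clone of
# https://github.com/duckduckgo/tracker-radar. Path layout may vary.
import json
from pathlib import Path

path = Path("tracker-radar/domains/doubleclick.net.json")
record = json.loads(path.read_text())

print("domain:        ", record.get("domain"))
print("owner:         ", record.get("owner", {}).get("name"))
print("prevalence:    ", record.get("prevalence"))      # share of sites seen on
print("fingerprinting:", record.get("fingerprinting"))  # 0 (none) to 3 (excessive)
print("categories:    ", record.get("categories"))
```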

Source: DuckDuckGo Made a List of Jerks Tracking You Online

After blowing $100m to snoop on Americans’ phone call logs for four years, what did the NSA get? Just one lead

The controversial surveillance program that gave the NSA access to the phone call records of millions of Americans has cost US taxpayers $100m – and resulted in just one useful lead over four years.

That’s the upshot of a report [PDF] from the US government’s freshly revived Privacy and Civil Liberties Oversight Board (PCLOB). The panel dug into the super-snoops’ so-called Section 215 program, which is due to be renewed next month.

Those findings reflect concerns expressed by lawmakers back in November when, at a Congressional hearing, the NSA was unable to give a single example of how the spy program had been useful in the fight against terrorism. At the time, Senator Dianne Feinstein (D-CA) stated bluntly: “If you can’t give us any indication of specific value, there is no reason for us to reauthorize it.”

That value appears to have been, in total, 15 intelligence reports at an overall cost of $100m between 2015 and 2019. Of the 15 reports that mentioned what the PCLOB now calls the “call detail records (CDR) program,” just two of them provided “unique information.” In other words, for the other 13 reports, use of the program reinforced what Uncle Sam’s g-men already knew. In 2018 alone, the government collected more than 434 million records covering 19 million different phone numbers.

What of those two reports? According to the PCLOB overview: “Based on one report, FBI vetted an individual, but, after vetting, determined that no further action was warranted. The second report provided unique information about a telephone number, previously known to US authorities, which led to the opening of a foreign intelligence investigation.”

Source: After blowing $100m to snoop on Americans’ phone call logs for four years, what did the NSA get? Just one lead • The Register

Facebook’s privacy tools are riddled with missing data

Facebook wants you to think it’s consistently increasing transparency about how the company stores and uses your data. But the company still isn’t revealing everything to its users, according to an investigation by Privacy International.

The obvious holes in Facebook’s privacy data exports paint a picture of a company that aims to placate users’ concerns without actually doing anything to change its practices.

Data lists are incomplete — The most pressing issue with Facebook’s downloadable privacy data is that it’s incomplete. Privacy International’s investigation tested the “Ads and Business” section on Facebook’s “Download Your Information” page, which purports to tell users which advertisers have been targeting them with ads.

The investigation found that the list of advertisers actually changes over time, seemingly at random. This essentially makes it impossible for users to develop a full understanding of which advertisers are using their data. In this sense, Facebook’s claims of transparency are inaccurate and misleading.

‘Off-Facebook’ data is misleading — Facebook’s most recent act of “transparency” is its “Off-Facebook Activity” tool, which allows users to “see and control the data that other apps and websites share with Facebook.” But the reports generated by this tool offer extremely limited detail. Some data is marked with a cryptic “CUSTOM” label, while even the best-labeled data gives no context surrounding the reason it’s included in the list.

Nothing to see here — Facebook’s supposed attempts at increased transparency do very little to actually help users understand what the company is doing with their personal data. These tools come off as nothing more than a ploy to take pressure off the company. Meanwhile, the company continues to quietly pay off massive lawsuits over actual user privacy issues.

Facebook doesn’t care about your privacy — it cares about making money. Users would do well to remember that.

Source: Report: Facebook’s privacy tools are riddled with missing data

US Gov wants to spy on all drones all the time: they must be constantly connected to the internet to give Feds real-time location data

Drone enthusiasts are up in arms over rules proposed by the US Federal Aviation Administration (FAA) that would require their flying gizmos to provide real-time location data to the government via an internet connection.

The requirement, for drones weighing 0.55lb (0.25kg) or more, would ground an estimated 80 per cent of the gadgets in the United States, and many would never fly again because they can’t be retrofitted with the necessary equipment, say drone owners. Those who bought new, compliant drones would also need a monthly data plan for their flying machines: something likely to cost $35 or more a month, given extortionate US mobile rates.

There are also the additional costs of running the new drone-location databases, which the FAA expects private companies to operate but which don’t exist yet, and which drone owners would have to pay for through subscriptions. All of this is prohibitively expensive for little real benefit, they argue.

If a device loses internet connectivity while flying and can’t send its real-time info, it must land. It may be possible to pair a drone control unit with, say, a smartphone or a gateway with fixed-line internet connectivity, so that the drone can relay its data to the Feds via these nodes. However, that’s not much use if you’re out in the middle of nowhere, or if you wander into a wireless not-spot.

Nearly 35,000 public comments have been received by the FAA, with the comment period closing later today. The vast majority are critical, and most make the same broad point: the rules are too strict, too costly, and unnecessary.

The world’s largest drone maker, DJI, is among those fighting the rule change, unsurprisingly enough. The manufacturer argues that while it agrees that every drone should have its own unique ID, the FAA proposal is “complex, expensive and intrusive.”

The proposal would also undermine the industry’s own remote ID solution, which doesn’t require a real-time data connection but instead broadcasts ID information over the same radio signals used to control drones. DJI also flags the privacy implications of the FAA’s approach: people would be able to track months of someone’s previous drone usage.

Source: Drones must be constantly connected to the internet to give Feds real-time location data – new US govt proposal • The Register

Project Svalbard, Have I Been Pwned will not be sold after all

This is going to be a lengthy blog post so let me use this opening paragraph as a summary of where Project Svalbard is at: Have I Been Pwned is no longer being sold and I will continue running it independently. After 11 months of a very intensive process culminating in many months of exclusivity with a party I believed would ultimately be the purchaser of the service, unexpected changes to their business model made the deal infeasible. It wasn’t something I could have seen coming nor was it anything to do with HIBP itself, but it introduced a range of new and insurmountable barriers. So that’s the tl;dr, let me now share as much as I can about what’s been happening since April 2019 and how the service will operate in the future.

Source: Troy Hunt: Project Svalbard, Have I Been Pwned and its Ongoing Independence

Ring to change doorbell privacy settings after study showed it shared personal information with Facebook and Google

Ring, the Amazon-owned maker of smart-home doorbells and web-enabled security cameras, is changing its privacy settings two weeks after a study showed the company shares customers’ personal information with Facebook, Google and other parties without users’ consent.

The change will let Ring users block the company from sharing most, but not all, of their data. A company spokesperson said people will be able to opt out of those sharing agreements “where applicable.” The spokesperson declined to clarify what “where applicable” might mean.

Ring will announce and start rolling out the opt-out feature soon, the spokesperson told CBS MoneyWatch.

Source: Ring to change privacy settings after study showed it shared personal information with Facebook and Google – CBS News

Facebook Cuts Off Some Mobile Tracking Ad Data With Advertising Partners; should have done this long, long ago

Facebook is tightening its rules around the raw, device-level ad campaign measurement data it shares with an elite group of advertising technology partners.

As first spotted by AdAge, the company recently tweaked the terms of service that apply to its “advanced mobile measurement partner” program, which advertisers tap into to track the performance of their ads on Facebook. Those mobile measurement partners (MMPs) were, until now, free to share the raw data they accessed from Facebook with advertisers. These metrics drilled down to the individual device level, which advertisers could then reportedly connect to any device IDs they might already have on tap.

Facebook reportedly began notifying affected partners on February 5, and all advertising partners must agree to the updated terms of the program before April 22.

While Facebook didn’t deliver the device IDs themselves, passing granular insights like the way a given consumer shops or browses the web—and then giving an advertiser free rein to link that data to, well, just about anyone—smacks hard of something that could easily turn Cambridge Analytica-y if the wrong actors got their hands on the data. As AdAge put it:

The program had safeguards that bound advertisers to act responsibly, but there were always concerns that advertisers could misuse the data, according to people familiar with the program. Facebook says that it did not uncover any wrongdoing on the part of advertisers when it decided to update the measurement program. However, the program under its older configuration came with clear risks, according to marketing partners.
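To see why that rings alarm bells, consider how trivially device-level rows link up with records an advertiser already holds. Here’s a toy sketch; every field name is hypothetical, since neither Facebook’s measurement export format nor any advertiser’s customer table is public.

```python
# Sketch of why "anonymous" device-level ad data is one join away from identity.
# All field names are invented for illustration.
import pandas as pd

# Device-level campaign rows a measurement partner might pass along:
# no names, but every row is keyed to a stable device identifier.
mmp_rows = pd.DataFrame({
    "device_id": ["idfa-111", "idfa-222", "idfa-333"],
    "ad_clicked": ["spring_sale", "spring_sale", "retarget_01"],
    "event": ["purchase", "app_install", "purchase"],
})

# A device-ID-to-customer table an advertiser might already hold,
# collected through its own app, logins, or a loyalty program.
crm = pd.DataFrame({
    "device_id": ["idfa-111", "idfa-333"],
    "customer": ["alice@example.com", "bob@example.com"],
})

# One inner join, and the device-level rows are fully identified.
print(mmp_rows.merge(crm, on="device_id"))
```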

Source: Facebook Cuts Off Some Ad Data With Advertising Partners

Apple has blocked Clearview AI’s iPhone app for violating its rules

An iPhone app built by controversial facial recognition startup Clearview AI has been blocked by Apple, effectively banning the app from use.

Apple confirmed to TechCrunch that the startup “violated” the terms of its enterprise developer program.

The app lets users, whom the company claims are exclusively law enforcement officers, use their phone camera or upload a photo to search Clearview’s database of 3 billion images. But BuzzFeed News revealed that the company’s client list also includes many private-sector customers, including Macy’s, Walmart, and Wells Fargo.

Clearview AI has been at the center of a media and legal storm since its public debut in The New York Times last month. The company scrapes public photos from social media sites, drawing ire from the big tech giants that claim Clearview AI misused their services. But it has also gained attention from hackers: on Wednesday, Clearview AI confirmed a data breach in which its client list was stolen.

Source: Apple has blocked Clearview AI’s iPhone app for violating its rules | TechCrunch

Clearview AI, Creepy Facial Recognition Company That Stole Your Pictures from Social Media, Says Entire Client List Was Stolen by Hackers

A facial-recognition company that contracts with powerful law-enforcement agencies just reported that an intruder stole its entire client list, according to a notification the company sent to its customers.

In the notification, which The Daily Beast reviewed, the startup Clearview AI disclosed to its customers that an intruder “gained unauthorized access” to its list of customers, to the number of user accounts those customers had set up, and to the number of searches its customers have conducted. The notification said the company’s servers were not breached and that there was “no compromise of Clearview’s systems or network.” The company also said it fixed the vulnerability and that the intruder did not obtain any law-enforcement agencies’ search histories.

Source: Clearview AI, Facial Recognition Company That Works With Law Enforcement, Says Entire Client List Was Stolen

Your car records a lot of things you don’t know about – including you.

Tesla chief executive Elon Musk calls this function Sentry Mode. I also call it Chaperone Mode and Snitch Mode. I’ve been writing recently about how we don’t drive cars, we drive computers. But this experience opened my eyes.

I love that my car recorded a hit-and-run on my behalf. Yet I’m scared we’re not ready for the ways cameras pointed inside and outside vehicles will change the open road — just like the cameras we’re adding to doorbells are changing our neighborhoods.

It’s not just crashes that will be different. Once governments, companies and parents get their hands on car video, it could become evidence, an insurance liability and even a form of control. Just imagine how it will change teenage romance. It could be the end of the idea that cars are private spaces to peace out and get away — an American symbol of independence.

“You are not alone in your car anymore,” says Andrew Guthrie Ferguson, a visiting professor at the American University Washington College of Law and the author of “The Rise of Big Data Policing.”

The moment my car was struck, it sent an alert to my phone and the car speakers began blaring ghoulish classical music, a touch of Musk’s famous bravado. The car saved four videos of the incident, each from a different angle, to a memory stick I installed near the cup holder. (Sentry Mode is an opt-in feature.) You can watch my car lurch when the bus strikes it, spot the ID number on the bus and see its driver’s face passing by moments before.

This isn’t just a Tesla phenomenon. Since 2016, some Cadillacs have let you store recordings from four outward-facing cameras, both as the car is moving and when it’s parked. Chevrolet offers a so-called Valet Mode to record potentially naughty parking attendants; on Corvettes, it bills the camera feature as a “baby monitor for your baby.”

Now there are even face-monitoring cameras in certain General Motors, BMW and Volvo vehicles to make sure you’re not drowsy, drunk or distracted. Most keep a running log of where you’re looking.

Your older car’s camera may not be saving hours of footage, but chances are it keeps at least a few seconds of camera, speed, steering and other data on a hidden “black box” that activates in a crash. And I’m pretty sure your next car would make even 007 jealous; I’ve already seen automakers brag about adding 16 cameras and sensors to 2020 models.

The benefits of this technology are clear. The video clips from my car made a pretty compelling case for the city to pay for my repairs without even getting my insurance involved. Lots of Tesla owners proudly share crazy footage on YouTube. It’s been successfully used to put criminals behind bars.

But it’s not just the bad guys my car records. I’ve got clips of countless people’s behinds scooching by in tight parking lots, because Sentry Mode activates any time something gets close. It’s also recording my family: With another function called Dash Cam that records the road, Tesla has saved hours and hours of my travels — the good driving and the not-so-good alike.

We’ve been down this road before with connected cameras. Amazon’s Ring doorbells and Nest cams also seemed like a good idea, until hackers, stalkers and police tried to get their hands on the video feed. (Amazon founder and chief executive Jeff Bezos owns The Washington Post.) Applied to a car, the questions multiply: Can you just peer in on your teen driver — or spouse? Do I have to share my footage with the authorities? Should my car be allowed to kick me off the road if it thinks I’m sleepy? How long until insurance companies offer “discounts” for direct video access? And is any of this actually making cars safer or less expensive to own?

Your data can and will be used against you. Can we do anything to make our cars remain private spaces?

[…]

Design choices may well determine our future privacy. It’s important to remember: automakers can change how their cameras work with as little as a software update. Sentry Mode arrived out of thin air last year on cars made as early as 2017.

We can learn from smart doorbells and home security devices where surveillance goes wrong.

The problems start with what gets recorded. Home security cameras have so normalized surveillance that they let people watch and listen in on family and neighbors. Today, Tesla’s Sentry Mode and Dash Cam only record video, not audio. The cars have microphones inside, but right now they seem to just be used for voice commands and other car functions — avoiding eavesdropping on potentially intimate car conversations.

Tesla also hasn’t activated a potentially invasive source of video: a camera pointed inside the car, right next to the rearview mirror. But, again, it’s not entirely clear why. CEO Musk tweeted it’s there for use in a future ride-sharing program, implying it isn’t used in other ways. Already some Tesla owners are champing at the bit to have it activated for Sentry Mode, to see, for example, what a burglar is stealing. I could imagine others demanding live access for a “teen driving” mode.

(Tesla has shied away from perhaps the most sensible use for that inner camera: activating it to monitor whether drivers are paying attention while using its Autopilot driver assistance system, something GM does with its so-called SuperCruise system.)

In other ways, Tesla is already recording gobs of footage. Living in a dense city, my Sentry Mode starts recording between five and seven times per day, capturing lots of people, the vast majority of whom are not committing any crime. (This actually drains the car’s precious battery; some owners estimate it sips about a mile’s worth of the car’s 322-mile potential range for every hour it runs.) Same with the Dash Cam that runs while I’m on the road: it’s recording not just my driving but all the other cars and people on the road, too.

The recordings stick around on a memory card until you delete them or the card fills up, at which point new clips write over the oldest footage.

[…]

Chevrolet potentially ran afoul of eavesdropping laws when it debuted Valet Mode in 2015, because it recorded audio inside the cabin without disclosure. (Chevrolet has since cut the audio and added a warning message to the infotainment system.) When it’s on, Tesla’s Sentry Mode displays a warning on the large dashboard screen: a pulsing version of the red circle some might remember from the evil HAL 9000 computer in “2001: A Space Odyssey.”

My biggest concern is who can access all that video footage. Connected security cameras let anybody with your password peer in from afar, through an app or the Web.

[…]

Teslas, like most new cars, come with their own independent cellular connections. And Tesla, by default, uploads clips from its customer cars’ external cameras. A privacy control in the car menus says Tesla uses the footage “to learn how to recognize things like lane lines, street signs and traffic light positions.”

[…]

Video from security cameras is already routine in criminal prosecutions. In the case of Ring cameras, the police can make a request of a homeowner, who is free to say no. But courts have also issued warrants to Amazon to hand over the video data it stores on its computers, and it had to comply.

It’s an open question whether police could just seize the video recordings saved on a drive in your car, says Ferguson, the law professor.

“They could probably go through a judge and get a probable cause warrant, if they believe there was a crime,” he says. “It’s a barrier, but is not that high of a barrier. Your car is going to snitch on you.”

Source: My car was in a hit-and-run. Then I learned it recorded the whole thing.

Google users in UK to lose EU data protection, get US non-protection

The shift, prompted by Britain’s exit from the EU, will leave the sensitive personal information of tens of millions with less protection and within easier reach of British law enforcement.

The change was described to Reuters by three people familiar with its plans. Google intends to require its British users to acknowledge new terms of service including the new jurisdiction.

Ireland, where Google and other U.S. tech companies have their European headquarters, is staying in the EU, which has one of the world’s most aggressive data protection rules, the General Data Protection Regulation.

Google has decided to move its British users out of Irish jurisdiction because it is unclear whether Britain will follow GDPR or adopt other rules that could affect the handling of user data, the people said.

If British Google users have their data kept in Ireland, it would be more difficult for British authorities to recover it in criminal investigations.

The recent CLOUD Act in the United States, however, is expected to make it easier for British authorities to obtain data from U.S. companies. Britain and the United States are also on track to negotiate a broader trade agreement.

Beyond that, the United States has among the weakest privacy protections of any major economy, with no broad law despite years of advocacy by consumer protection groups.

A Google spokesman declined to comment ahead of a public announcement.

Source: Exclusive: Google users in UK to lose EU data protection – sources – Reuters

Firm Tracking Purchase, Transaction Histories of Millions Not Really Anonymizing Them

The nation’s largest financial data broker, Yodlee, holds extensive and supposedly anonymized banking and credit card transaction histories on millions of Americans. Internal documents obtained by Motherboard, however, appear to indicate that Yodlee clients could potentially de-anonymize those records by simply downloading a giant text file and poking around in it for a while.

According to Motherboard, the 2019 document explains how Yodlee obtains transaction data from partners like banks and credit card companies and what data is collected. That includes a unique identifier associated with the bank or credit card holder, amounts of transactions, dates of sale, which business the transaction was processed at, and bits of metadata, Motherboard wrote; it also includes data relating to purchases involving multiple retailers, such as a restaurant order through a delivery app. The document states that Yodlee is giving clients access to this data in the form of a large text file rather than a Yodlee-run interface.

The document also shows how Yodlee performs “data cleaning” on that text file, which means obfuscating patterns like account numbers, phone numbers, and SSNs by redacting them with the letters “XXX,” Motherboard wrote. It also scrubs some payroll and financial transfer data, as well as the names of the banking and credit card companies involved.

But this process leaves the unique identifiers, which are shared across every entry associated with a particular account, intact. Research has repeatedly shown that re-identifying individuals in supposedly anonymized data can be a trivial undertaking, even when no explicit identifier is shared across records.

Experts told Motherboard that anyone with malicious intent would just need to verify a purchase was made by a specific individual and they might gain access to all other transactions using the same identifier.

With location and time data on just three to four purchases, an “attacker can unmask the person with a very high probability,” Rutgers University associate professor Vivek Singh told the site. “With this unmasking, the attacker would have access to all the other transactions made by that individual.”

Imperial College London assistant professor Yves-Alexandre de Montjoye, who worked with Singh on a 2015 study that identified shoppers from metadata, wrote to Motherboard that this process appeared to leave the data only “pseudonymized,” and that “someone with access to the dataset and some information about you, e.g. shops you’ve been buying from and when, might be able to identify you.”
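The attack the researchers describe is simple enough to sketch. Assume a “cleaned” dump of (pseudonymous ID, merchant, date) rows; the real Yodlee layout isn’t public, so this record format is invented. An attacker who knows three or four purchases the target made just looks for the ID that matches all of them:

```python
# Sketch of re-identification via a handful of known purchases.
# The record layout is invented; Yodlee's real format is not public.
from collections import defaultdict

transactions = [
    ("user-8843", "Joe's Deli",  "2019-06-02"),
    ("user-8843", "Shell #1192", "2019-06-03"),
    ("user-8843", "CVS #4410",   "2019-06-07"),
    ("user-1290", "Joe's Deli",  "2019-06-02"),
    ("user-1290", "Target #88",  "2019-06-05"),
]

# What an attacker might learn from the outside world (a receipt, a
# social media post): three or four (merchant, date) observations.
known_points = {("Joe's Deli", "2019-06-02"),
                ("Shell #1192", "2019-06-03"),
                ("CVS #4410", "2019-06-07")}

# Group each pseudonymous ID's transactions, then keep the IDs whose
# history contains every known point.
seen = defaultdict(set)
for uid, merchant, date in transactions:
    seen[uid].add((merchant, date))

matches = [uid for uid, points in seen.items() if known_points <= points]
print(matches)  # ['user-8843'] -- one hit, unmasking all its other transactions
```

With observations that sparse yet distinctive, the match is usually unique, which is exactly the point Singh and de Montjoye make above.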

Yodlee and its owner, Envestnet, are facing serious heat from Congress. Democratic Senators Ron Wyden and Sherrod Brown, as well as Representative Anna Eshoo, recently sent a letter to the Federal Trade Commission asking it to investigate whether the sale of this kind of financial data violates federal law.

“Envestnet claims that consumers’ privacy is protected because it anonymizes their personal financial data,” the congresspeople wrote. “But for years researchers have been able to re-identify the individuals to whom the purportedly anonymized data belongs with just three or four pieces of information.”

Source: Report: Firm Tracking Purchase, Transaction Histories of Millions Maybe Not Really Anonymizing Them

It’s very hard to get anonymization right.