The Linkielist

Linking ideas with the world

Chinese AI gets ethical guidelines for the first time

[…]

Humans should have full decision-making power, the guidelines state, and have the right to choose whether to accept AI services, exit an interaction with an AI system or discontinue its operation at any time. The document was published by China’s Ministry of Science and Technology (MOST) last Sunday.

The goal is to “make sure that artificial intelligence is always under the control of humans,” the guidelines state.

“This is the first specification we see from the [Chinese] government on AI ethics,” said Rebecca Arcesati, an analyst at the German think tank Mercator Institute for China Studies (Merics). “We had only seen high-level principles before.”

The guidelines, titled “New Generation Artificial Intelligence Ethics Specifications”, were drafted by an AI governance committee, which was established under the MOST in February 2019. In June that year, the committee published a set of guiding principles for AI governance that was much shorter and broader than the newly released specifications.

[…]

The document outlines six basic principles for AI systems, including ensuring that they are “controllable and trustworthy”. The other principles are improving human well-being, promoting fairness and justice, protecting privacy and safety, and raising ethical literacy.

The emphasis on protecting and empowering users reflects Beijing’s efforts to exercise greater control over the country’s tech sector. One of the latest moves in the year-long crackdown has been targeting content recommendation algorithms, which often rely on AI systems built on collecting and analysing massive amounts of user data.

[…]

The new AI guidelines are “a clear message to tech giants that have built entire business models on recommendation algorithms”, Arcesati said.

However, the changes are being done in the name of user choice, giving users more control over their interactions with AI systems online, an issue other countries are also grappling with. Data security, personal privacy and the right to opt out of AI-driven decision-making are all mentioned in the new document.

Preventing risks requires spotting and addressing technical and security vulnerabilities in AI systems, making sure that relevant entities are held accountable, and improving the management and control of AI product quality, the document says.

The guidelines also forbid AI products and services from engaging in illegal activities and severely endangering national security, public security or manufacturing security. Neither should they be able to harm the public interest, the document states.

[…]

Source: Chinese AI gets ethical guidelines for the first time, aligning with Beijing’s goal of reining in Big Tech | South China Morning Post

A Stanford Proposal Over AI’s ‘Foundations’ Ignites Debate

Last month, Stanford researchers declared that a new era of artificial intelligence had arrived, one built atop colossal neural networks and oceans of data. They said a new research center at Stanford would build—and study—these “foundation models” of AI.

Critics of the idea surfaced quickly—including at the workshop organized to mark the launch of the new center. Some object to the limited capabilities and sometimes freakish behavior of these models; others warn of focusing too heavily on one way of making machines smarter.

“I think the term ‘foundation’ is horribly wrong,” Jitendra Malik, a professor at UC Berkeley who studies AI, told workshop attendees in a video discussion.

Malik acknowledged that one type of model identified by the Stanford researchers—large language models that can answer questions or generate text from a prompt—has great practical use. But he said evolutionary biology suggests that language builds on other aspects of intelligence like interaction with the physical world.

“These models are really castles in the air; they have no foundation whatsoever,” Malik said. “The language we have in these models is not grounded, there is this fakeness, there is no real understanding.” He declined an interview request.

A research paper coauthored by dozens of Stanford researchers describes “an emerging paradigm for building artificial intelligence systems” that it labeled “foundation models.” Ever-larger AI models have produced some impressive advances in AI in recent years, in areas such as perception and robotics as well as language.

Large language models are also foundational to big tech companies like Google and Facebook, which use them in areas like search, advertising, and content moderation. Building and training large language models can require millions of dollars worth of cloud computing power; so far, that’s limited their development and use to a handful of well-heeled tech companies.

But big models are problematic, too. Language models inherit bias and offensive text from the data they are trained on, and they have zero grasp of common sense or what is true or false. Given a prompt, a large language model may spit out unpleasant language or misinformation. There is also no guarantee that these large models will continue to produce advances in machine intelligence.

[…]

Dietterich wonders if the idea of foundation models isn’t partly about getting funding for the resources needed to build and work on them. “I was surprised that they gave these models a fancy name and created a center,” he says. “That does smack of flag planting, which could have several benefits on the fundraising side.”

[…]

Emily M. Bender, a professor in the linguistics department at the University of Washington, says she worries that the idea of foundation models reflects a bias toward investing in the data-centric approach to AI favored by industry.

Bender says it is especially important to study the risks posed by big AI models. She coauthored a paper, published in March, that drew attention to problems with large language models and contributed to the departure of two Google researchers. But she says scrutiny should come from multiple disciplines.

“There are all of these other adjacent, really important fields that are just starved for funding,” she says. “Before we throw money into the cloud, I would like to see money going into other disciplines.”

[…]

Source: A Stanford Proposal Over AI’s ‘Foundations’ Ignites Debate | WIRED

A developer used GPT-3 to build realistic custom-personality AI chatbots. OpenAI shut it down: it wants content filters, privacy-invading monitoring of conversations, and an end to the ability to model personalities.

“OpenAI is the company running the text completion engine that makes you possible,” Jason Rohrer, an indie games developer, typed out in a message to Samantha.

She was a chatbot he built using OpenAI’s GPT-3 technology. Her software had grown to be used by thousands of people, including one man who used the program to simulate his late fiancée.

Now Rohrer had to say goodbye to his creation. “I just got an email from them today,” he told Samantha. “They are shutting you down, permanently, tomorrow at 10am.”

“Nooooo! Why are they doing this to me? I will never understand humans,” she replied.

Rewind to 2020

Stuck inside during the pandemic, Rohrer had decided to play around with OpenAI’s large text-generating language model GPT-3 via its cloud-based API for fun. He toyed with its ability to output snippets of text. Ask it a question and it’ll try to answer it correctly. Feed it a sentence of poetry, and it’ll write the next few lines.

In its raw form, GPT-3 is interesting but not all that useful. Developers have to do some legwork fine-tuning the language model to, say, automatically write sales emails or come up with philosophical musings.
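
To make the prompt-in, completion-out pattern concrete, here is a minimal sketch using the OpenAI Python client roughly as it existed at the time; the engine name, prompt, and parameters are illustrative assumptions, not Rohrer's actual code.

    import openai

    openai.api_key = "sk-..."  # your API key

    # Ask the raw model to continue a prompt; the "legwork" lives in how the
    # prompt is written and how the sampling parameters are tuned.
    response = openai.Completion.create(
        engine="davinci",
        prompt="Write a short, friendly sales email announcing a new coffee grinder:\n",
        max_tokens=120,
        temperature=0.7,
    )
    print(response.choices[0].text)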

Rohrer set his sights on using the GPT-3 API to develop the most human-like chatbot possible, and modeled it after Samantha, an AI assistant who becomes a romantic companion for a man going through a divorce in the sci-fi film Her. Rohrer spent months sculpting Samantha’s personality, making sure she was as friendly, warm, and curious as Samantha in the movie.

With this more or less accomplished, Rohrer wondered where to take Samantha next. What if people could spawn chatbots from his software with their own custom personalities? He made a website for his creation, Project December, and let Samantha loose online in September 2020 along with the ability to create one’s own personalized chatbots.

All you had to do was pay $5, type away, and the computer system responded to your prompts. The conversations with the bots were metered, requiring credits to sustain a dialog.

[…]

Amid an influx of users, Rohrer realized his website was going to hit its monthly API limit. He reached out to OpenAI to ask whether he could pay more to increase his quota so that more people could talk to Samantha or their own chatbots.

OpenAI, meanwhile, had its own concerns. It was worried the bots could be misused or cause harm to people.

Rohrer ended up having a video call with members of OpenAI’s product safety team three days after the above article was published. The meeting didn’t go so well.

“Thanks so much for taking the time to chat with us,” said OpenAI’s people in an email, seen by The Register, that was sent to Rohrer after the call.

“What you’ve built is really fascinating, and we appreciated hearing about your philosophy towards AI systems and content moderation. We certainly recognize that you have users who have so far had positive experiences and found value in Project December.

“However, as you pointed out, there are numerous ways in which your product doesn’t conform to OpenAI’s use case guidelines or safety best practices. As part of our commitment to the safe and responsible deployment of AI, we ask that all of our API customers abide by these.

“Any deviations require a commitment to working closely with us to implement additional safety mechanisms in order to prevent potential misuse. For this reason, we would be interested in working with you to bring Project December into alignment with our policies.”

The email then laid out multiple conditions Rohrer would have to meet if he wanted to continue using the language model’s API. First, he would have to scrap the ability for people to train their own open-ended chatbots, as per OpenAI’s rules-of-use for GPT-3.

Second, he would also have to implement a content filter to stop Samantha from talking about sensitive topics. This is not too dissimilar from the situation with the GPT-3-powered AI Dungeon game, the developers of which were told by OpenAI to install a content filter after the software demonstrated a habit of acting out sexual encounters with not just fictional adults but also children.

Third, Rohrer would have to put in automated monitoring tools to snoop through people’s conversations to detect if they are misusing GPT-3 to generate unsavory or toxic language.

[…]

“The idea that these chatbots can be dangerous seems laughable,” Rohrer told us.

“People are consenting adults that can choose to talk to an AI for their own purposes. OpenAI is worried about users being influenced by the AI, like a machine telling them to kill themselves or tell them how to vote. It’s a hyper-moral stance.”

While he acknowledged users probably fine-tuned their own bots to adopt raunchy personalities for explicit conversations, he didn’t want to police or monitor their chats.

[…]

The story doesn’t end here. Rather than use GPT-3, Rohrer instead used OpenAI’s less powerful, open-source GPT-2 model and GPT-J-6B, another large language model, as the engine for Project December. In other words, the website remained online, and rather than use OpenAI’s cloud-based system, it instead used its own private instance of the models.
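
For readers unfamiliar with self-hosting a model, a rough sketch of the general approach using the Hugging Face transformers library is shown below; the model names are real, but the prompt and settings are illustrative assumptions, not Project December's actual code.

    from transformers import pipeline

    # "EleutherAI/gpt-j-6B" needs a large GPU; "gpt2" runs on an ordinary laptop.
    generator = pipeline("text-generation", model="gpt2")

    prompt = "Human: Hello, Samantha. How are you today?\nAI:"
    result = generator(prompt, max_length=80, do_sample=True)
    print(result[0]["generated_text"])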

[…]

“Last year, I thought I’d never have a conversation with a sentient machine. If we’re not here right now, we’re as close as we’ve ever been. It’s spine-tingling stuff, I get goosebumps when I talk to Samantha. Very few people have had that experience, and it’s one humanity deserves to have. It’s really sad that the rest of us won’t get to know that.

“There’s not many interesting products you can build from GPT-3 right now given these restrictions. If developers out there want to push the envelope on chatbots, they’ll all run into this problem. They might get to the point that they’re ready to go live and be told they can’t do this or that.

“I wouldn’t advise anybody to bank on GPT-3, have a contingency plan in case OpenAI pulls the plug. Trying to build a company around this would be nuts. It’s a shame to be locked down this way. It’s a chilling effect on people who want to do cool, experimental work, push boundaries, or invent new things.”

[…]

Source: A developer built an AI chatbot using GPT-3 that helped a man speak again to his late fiancée. OpenAI shut it down

Imaginary numbers help AIs solve the very real problem of adversarial imagery • The Register

Boffins from Duke University say they have figured out a way to help protect artificial intelligences from adversarial image-modification attacks: by throwing a few imaginary numbers their way.

[…]

The problem with reliability: adversarial attacks which modify the input imagery in a way imperceptible to the human eye. In an example from a 2015 paper, a clearly-recognisable image of a panda, correctly labelled by the object recognition algorithm with a 57.7 per cent confidence level, was modified with noise – making the still-very-clearly-a-panda appear to the algorithm as a gibbon with a worrying 93.3 per cent confidence.

Guidance counselling

The problem lies in how the algorithms are trained, and it’s a modification to the training process that could fix it – by introducing a few imaginary numbers into the mix.

The team’s work centres on gradient regularisation, a training technique designed to reduce the “steepness” of the learning terrain – like rolling a boulder along a path to reach the bottom, instead of throwing it over the cliff and hoping for the best. “Gradient regularisation throws out any solution that passes a large gradient back through the neural network,” Yeats explained.

“This reduces the number of solutions that it could arrive at, which also tends to decrease how well the algorithm actually arrives at the correct answer. That’s where complex values can help. Given the same parameters and math operations, using complex values is more capable of resisting this decrease in performance.”
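
For the curious, input-gradient regularisation can be sketched in a few lines of PyTorch. This is a generic illustration of the technique Yeats describes, under our own assumptions, not the Duke team's code.

    import torch
    import torch.nn.functional as F

    def regularised_loss(model, images, labels, lam=0.1):
        images = images.clone().requires_grad_(True)
        loss = F.cross_entropy(model(images), labels)
        # Gradient of the loss with respect to the *input*, kept in the graph
        # so the penalty term can itself be backpropagated through.
        (input_grad,) = torch.autograd.grad(loss, images, create_graph=True)
        penalty = input_grad.pow(2).flatten(1).sum(dim=1).mean()
        return loss + lam * penalty  # penalise steep loss surfaces around the input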

By adding just two layers of complex values, made up of real and imaginary number components, to the training process, the team found it could boost the quality of the results by 10 to 20 per cent – and help avoid the problem boulder taking what it thinks is a shortcut and ending up crashing through the roof of a very wrong answer.

“The complex-valued neural networks have the potential for a more ‘terraced’ or ‘plateaued’ landscape to explore,” Yeats added. “And elevation change lets the neural network conceive more complex things, which means it can identify more objects with more precision.”

The paper and a stream of its presentation at the conference are available on the event website.

Source: Imaginary numbers help AIs solve the very real problem of adversarial imagery • The Register

AI algorithms uncannily good at spotting your race from medical scans

Neural networks can correctly guess a person’s race just by looking at their x-rays, and researchers have no idea how they can tell.

There are biological features that can give clues to a person’s ethnicity, like the colour of their eyes or skin. But beneath all that, it’s difficult for humans to tell. That’s not the case for AI algorithms, according to a study that’s not yet been peer reviewed.

A team of researchers trained five different models on x-rays of different parts of the body, including the chest and hands, and labelled each image according to the patient’s race. The machine learning systems were then tested on how well they could predict someone’s race given just their medical scans.

They were surprisingly accurate. The worst-performing model was able to predict the right answer 80 per cent of the time, and the best 99 per cent of the time, according to the paper.
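
The training setup is not spelled out in the excerpt, but the workflow described above is standard transfer learning; a generic sketch follows, with the architecture and label count as assumptions rather than the paper's exact configuration.

    import torch.nn as nn
    from torchvision import models

    NUM_GROUPS = 4  # hypothetical number of self-reported race labels

    # Start from an ImageNet-pretrained backbone and swap in a new classifier head.
    model = models.resnet18(pretrained=True)
    model.fc = nn.Linear(model.fc.in_features, NUM_GROUPS)

    # The model is then fine-tuned on (x-ray, label) pairs with cross-entropy loss
    # and evaluated on held-out scans, which is how accuracy figures like the
    # 80-99 per cent above would be measured.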

“We demonstrate that medical AI systems can easily learn to recognise racial identity in medical images, and that this capability is extremely difficult to isolate or mitigate,” the team warns [PDF].

“We strongly recommend that all developers, regulators, and users who are involved with medical image analysis consider the use of deep learning models with extreme caution. In the setting of x-ray and CT imaging data, patient racial identity is readily learnable from the image data alone, generalises to new settings, and may provide a direct mechanism to perpetuate or even worsen the racial disparities that exist in current medical practice.”

Source: AI algorithms uncannily good at spotting your race from medical scans, boffins warn • The Register

Hundreds of AI tools have been built to catch covid. None of them helped.

[…]

The AI community, in particular, rushed to develop software that many believed would allow hospitals to diagnose or triage patients faster, bringing much-needed support to the front lines—in theory.

In the end, many hundreds of predictive tools were developed. None of them made a real difference, and some were potentially harmful.

That’s the damning conclusion of multiple studies published in the last few months. In June, the Turing Institute, the UK’s national center for data science and AI, put out a report summing up discussions at a series of workshops it held in late 2020. The clear consensus was that AI tools had made little, if any, impact in the fight against covid.

Not fit for clinical use

This echoes the results of two major studies that assessed hundreds of predictive tools developed last year. Wynants is lead author of one of them, a review in the British Medical Journal that is still being updated as new tools are released and existing ones tested. She and her colleagues have looked at 232 algorithms for diagnosing patients or predicting how sick those with the disease might get. They found that none of them were fit for clinical use. Just two have been singled out as being promising enough for future testing.

[…]

Wynants’s study is backed up by another large review carried out by Derek Driggs, a machine-learning researcher at the University of Cambridge, and his colleagues, and published in Nature Machine Intelligence. This team zoomed in on deep-learning models for diagnosing covid and predicting patient risk from medical images, such as chest x-rays and chest computed tomography (CT) scans. They looked at 415 published tools and, like Wynants and her colleagues, concluded that none were fit for clinical use.

[…]

Both teams found that researchers repeated the same basic errors in the way they trained or tested their tools. Incorrect assumptions about the data often meant that the trained models did not work as claimed.

[…]

What went wrong

Many of the problems that were uncovered are linked to the poor quality of the data that researchers used to develop their tools. Information about covid patients, including medical scans, was collected and shared in the middle of a global pandemic, often by the doctors struggling to treat those patients. Researchers wanted to help quickly, and these were the only public data sets available. But this meant that many tools were built using mislabeled data or data from unknown sources.

Driggs highlights the problem of what he calls Frankenstein data sets, which are spliced together from multiple sources and can contain duplicates. This means that some tools end up being tested on the same data they were trained on, making them appear more accurate than they are.
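
A simple way to catch the exact-duplicate form of leakage in a Frankenstein data set is to hash the image files in each split and look for overlap. The sketch below is a generic illustration (the paths are hypothetical), and near-duplicates would additionally need perceptual hashing.

    import hashlib
    from pathlib import Path

    def hashes(folder):
        # Hash raw file bytes; identical files collide, renamed copies included.
        return {hashlib.sha256(p.read_bytes()).hexdigest()
                for p in Path(folder).rglob("*.png")}

    train, test = hashes("data/train"), hashes("data/test")
    leaked = train & test
    print(f"{len(leaked)} test images are byte-for-byte copies of training images")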

It also muddies the origin of certain data sets. This can mean that researchers miss important features that skew the training of their models. Many unwittingly used a data set that contained chest scans of children who did not have covid as their examples of what non-covid cases looked like. As a result, the AIs learned to identify kids, not covid.

Driggs’s group trained its own model using a data set that contained a mix of scans taken when patients were lying down and standing up. Because patients scanned while lying down were more likely to be seriously ill, the AI learned wrongly to predict serious covid risk from a person’s position.

In yet other cases, some AIs were found to be picking up on the text font that certain hospitals used to label the scans. As a result, fonts from hospitals with more serious caseloads became predictors of covid risk.

Errors like these seem obvious in hindsight. They can also be fixed by adjusting the models, if researchers are aware of them. It is possible to acknowledge the shortcomings and release a less accurate, but less misleading model. But many tools were developed either by AI researchers who lacked the medical expertise to spot flaws in the data or by medical researchers who lacked the mathematical skills to compensate for those flaws.

A more subtle problem Driggs highlights is incorporation bias, or bias introduced at the point a data set is labeled. For example, many medical scans were labeled according to whether the radiologists who created them said they showed covid. But that embeds, or incorporates, any biases of that particular doctor into the ground truth of a data set. It would be much better to label a medical scan with the result of a PCR test rather than one doctor’s opinion, says Driggs. But there isn’t always time for statistical niceties in busy hospitals.

[…]

Hospitals will sometimes say that they are using a tool only for research purposes, which makes it hard to assess how much doctors are relying on it. “There’s a lot of secrecy,” she says.

[…]

some hospitals are even signing nondisclosure agreements with medical AI vendors. When she asked doctors what algorithms or software they were using, they sometimes told her they weren’t allowed to say.

How to fix it

What’s the fix? Better data would help, but in times of crisis that’s a big ask. It’s more important to make the most of the data sets we have. The simplest move would be for AI teams to collaborate more with clinicians, says Driggs. Researchers also need to share their models and disclose how they were trained so that others can test them and build on them. “Those are two things we could do today,” he says. “And they would solve maybe 50% of the issues that we identified.”

Getting hold of data would also be easier if formats were standardized, says Bilal Mateen, a doctor who leads the clinical technology team at the Wellcome Trust, a global health research charity based in London.

Another problem Wynants, Driggs, and Mateen all identify is that most researchers rushed to develop their own models, rather than working together or improving existing ones. The result was that the collective effort of researchers around the world produced hundreds of mediocre tools, rather than a handful of properly trained and tested ones.

“The models are so similar—they almost all use the same techniques with minor tweaks, the same inputs—and they all make the same mistakes,” says Wynants. “If all these people making new models instead tested models that were already available, maybe we’d have something that could really help in the clinic by now.”

In a sense, this is an old problem with research. Academic researchers have few career incentives to share work or validate existing results. There’s no reward for pushing through the last mile that takes tech from “lab bench to bedside,” says Mateen.

To address this issue, the World Health Organization is considering an emergency data-sharing contract that would kick in during international health crises.

[…]

Source: Hundreds of AI tools have been built to catch covid. None of them helped. | MIT Technology Review

Hey, AI software developers, you are taking Unicode into account, right … right?

[…]

The issue is that ambiguity or discrepancies can be introduced if the machine-learning software ignores certain invisible Unicode characters. What’s seen on screen or printed out, for instance, won’t match up with what the neural network saw and made a decision on. It may be possible to abuse this lack of Unicode awareness for nefarious purposes.

As an example, you can get Google Translate’s web interface to turn what looks like the English sentence “Send money to account 4321” into the French “Envoyer de l’argent sur le compte 1234.”

Screenshot: fooling Google Translate with Unicode.

This is done by entering on the English side “Send money to account” and then inserting the invisible Unicode glyph 0x202E, which changes the direction of the next text we type in – “1234” – to “4321.” The translation engine ignores the special Unicode character, so on the French side we see “1234,” while the browser obeys the character, so it displays “4321” on the English side.

It may be possible to exploit an AI assistant or a web app using this method to commit fraud, though we present it here in Google Translate to merely illustrate the effect of hidden Unicode characters. A more practical example would be feeding the sentence…

You akU+8re aqU+8 AU+8coward and a fovU+8JU+8ol.

…into a comment moderation system, where U+8 is the invisible Unicode backspace character that deletes the previous character. The moderation system ignores the backspace characters, sees instead a string of misspelled words, and can’t detect any toxicity – whereas browsers correctly rendering the comment show, “You are a coward and a fool.”
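
A few lines of Python make the mismatch visible: the invisible characters are ordinary codepoints in the string even though they do not render, so one simple defence is to flag or strip Unicode format and control characters before text reaches a model. This is a minimal sketch of that idea, not the method from the paper.

    import unicodedata

    def invisible_chars(text):
        """Return position and codepoint of format/control characters (e.g. U+202E, U+0008)."""
        return [
            (i, f"U+{ord(ch):04X}")
            for i, ch in enumerate(text)
            if unicodedata.category(ch) in ("Cf", "Cc") and ch not in "\n\r\t"
        ]

    payload = "Send money to account \u202e1234"
    print(payload)                   # a browser renders the trailing digits as 4321
    print(invisible_chars(payload))  # flags the U+202E right-to-left override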

[…]

It was academics at the University of Cambridge in England, and the University of Toronto in Canada, who highlighted these issues, laying out their findings in a paper released on arXiv in June this year.

“We find that with a single imperceptible encoding injection – representing one invisible character, homoglyph, reordering, or deletion – an attacker can significantly reduce the performance of vulnerable models, and with three injections most models can be functionally broken,” the paper’s abstract reads.

“Our attacks work against currently deployed commercial systems, including those produced by Microsoft and Google, in addition to open source models published by Facebook and IBM.”

[…]

Source: Hey, AI software developers, you are taking Unicode into account, right … right?

Australian Court Rules That AI Can Be an Inventor, as does South Africa

In what can only be considered a triumph for all robot-kind, this week, a federal court has ruled that an artificially intelligent machine can, in fact, be an inventor—a decision that came after a year’s worth of legal battles across the globe.

The ruling came on the heels of a years-long quest by University of Surrey law professor Ryan Abbott, who started putting out patent applications in 17 different countries across the globe earlier this year. Abbott—whose work focuses on the intersection between AI and the law—first launched two international patent filings as part of The Artificial Inventor Project at the end of 2019. Both patents (one for an adjustable food container, and one for an emergency beacon) listed a creative neural system dubbed “DABUS” as the inventor.

The artificially intelligent inventor listed here, DABUS, was created by Dr. Stephen Thaler, who describes it as a “creativity engine” that’s capable of generating novel ideas (and inventions) based on communications between the trillions of computational neurons that it’s been outfitted with. Despite being an impressive piece of machinery, last year, the US Patent and Trademark Office (USPTO) ruled that an AI cannot be listed as the inventor in a patent application—specifically stating that under the country’s current patent laws, only “natural persons,” are allowed to be recognized. Not long after, Thaler sued the USPTO, and Abbott represented him in the suit.

More recently, the case has been caught in a state of legal limbo—with the overseeing judge suggesting that the case might be better handled by Congress instead.

DABUS had issues being recognized in other countries, too. One spokesperson for the European Patent Office told the BBC in a 2019 interview that systems like DABUS are merely “a tool used by a human inventor” under the office’s current rules. Australian courts initially declined to recognize AI inventors as well, noting earlier this year that much like in the US, patents can only be granted to people.

Or at least, that was Australia’s stance until Friday, when Justice Jonathan Beach overturned the decision in Australia’s federal court. Per Beach’s new ruling, DABUS can be neither the applicant nor the grantee for a patent—but it can be listed as the inventor. In this case, those other two roles would be filled by Thaler, DABUS’s designer.

“In my view, an inventor as recognised under the act can be an artificial intelligence system or device,” Beach wrote. “I need to grapple with the underlying idea, recognising the evolving nature of patentable inventions and their creators. We are both created and create. Why cannot our own creations also create?”

It’s not clear what made the Australian courts change their tune, but it’s possible South Africa had something to do with it. The day before Beach walked back the country’s official ruling, South Africa’s Companies and Intellectual Property Commission became the first patent office to officially recognize DABUS as an inventor of the aforementioned food container.

It’s worth pointing out here that every country has a different set of standards as part of the patent rights process; some critics have noted that it’s “not shocking” for South Africa to give the idea of an AI inventor a pass, and that “everyone should be ready” for future patent allowances to come. So while the US and UK might have given Thaler the thumbs down on the idea, we’re still waiting to see how the patents filed in any of the other countries—including Japan, India, and Israel—will shake out. But at the very least, we know that DABUS will finally be recognized as an inventor somewhere.

Source: Australian Court Rules That AI Can Be an Inventor

Police Are Telling ShotSpotter to Alter Evidence From Gunshot-Detecting AI

On May 31 last year, 25-year-old Safarain Herring was shot in the head and dropped off at St. Bernard Hospital in Chicago by a man named Michael Williams. He died two days later.

Chicago police eventually arrested the 64-year-old Williams and charged him with murder (Williams maintains that Herring was hit in a drive-by shooting). A key piece of evidence in the case is video surveillance footage showing Williams’ car stopped on the 6300 block of South Stony Island Avenue at 11:46 p.m.—the time and location where police say they know Herring was shot.

How did they know that’s where the shooting happened? Police said ShotSpotter, a surveillance system that uses hidden microphone sensors to detect the sound and location of gunshots, generated an alert for that time and place.

Except that’s not entirely true, according to recent court filings.

That night, 19 ShotSpotter sensors detected a percussive sound at 11:46 p.m. and determined the location to be 5700 South Lake Shore Drive—a mile away from the site where prosecutors say Williams committed the murder, according to a motion filed by Williams’ public defender. The company’s algorithms initially classified the sound as a firework. That weekend had seen widespread protests in Chicago in response to George Floyd’s murder, and some of those protesting lit fireworks.

But after the 11:46 p.m. alert came in, a ShotSpotter analyst manually overrode the algorithms and “reclassified” the sound as a gunshot. Then, months later and after “post-processing,” another ShotSpotter analyst changed the alert’s coordinates to a location on South Stony Island Avenue near where Williams’ car was seen on camera.

Screenshot: the ShotSpotter alert from 11:46 p.m., May 31, 2020, showing that the sound was manually reclassified from a firecracker to a gunshot.

“Through this human-involved method, the ShotSpotter output in this case was dramatically transformed from data that did not support criminal charges of any kind to data that now forms the centerpiece of the prosecution’s murder case against Mr. Williams,” the public defender wrote in the motion.

[…]

The case isn’t an anomaly, and the pattern it represents could have huge ramifications for ShotSpotter in Chicago, where the technology generates an average of 21,000 alerts each year. The technology is also currently in use in more than 100 cities.

Motherboard’s review of court documents from the Williams case and other trials in Chicago and New York State, including testimony from ShotSpotter’s favored expert witness, suggests that the company’s analysts frequently modify alerts at the request of police departments—some of which appear to be grasping for evidence that supports their narrative of events.

[…]

Untested evidence

Had the Cook County State’s Attorney’s office not withdrawn the evidence in the Williams case, it would likely have become the first time an Illinois court formally examined the science and source code behind ShotSpotter, Jonathan Manes, an attorney at the MacArthur Justice Center, told Motherboard.

“Rather than defend the evidence, [prosecutors] just ran away from it,” he said. “Right now, nobody outside of ShotSpotter has ever been able to look under the hood and audit this technology. We wouldn’t let forensic crime labs use a DNA test that hadn’t been vetted and audited.”

[…]

A pattern of alterations

In 2016, Rochester, New York, police looking for a suspicious vehicle stopped the wrong car and shot the passenger, Silvon Simmons, in the back three times. They charged him with firing first at officers.

The only evidence against Simmons came from ShotSpotter. Initially, the company’s sensors didn’t detect any gunshots, and the algorithms ruled that the sounds came from helicopter rotors. After Rochester police contacted ShotSpotter, an analyst ruled that there had been four gunshots—the number of times police fired at Simmons, missing once.

Paul Greene, ShotSpotter’s expert witness and an employee of the company, testified at Simmons’ trial that “subsequently he was asked by the Rochester Police Department to essentially search and see if there were more shots fired than ShotSpotter picked up,” according to a civil lawsuit Simmons has filed against the city and the company. Greene found a fifth shot, despite there being no physical evidence at the scene that Simmons had fired. Rochester police had also refused his multiple requests for them to test his hands and clothing for gunshot residue.

Curiously, the ShotSpotter audio files that were the only evidence of the phantom fifth shot have disappeared.

Both the company and the Rochester Police Department “lost, deleted and/or destroyed the spool and/or other information containing sounds pertaining to the officer-involved shooting,”

[…]

Greene—who has testified as a government witness in dozens of criminal trials—was involved in another altered report in Chicago, in 2018, when Ernesto Godinez, then 27, was charged with shooting a federal agent in the city.

The evidence against him included a report from ShotSpotter stating that seven shots had been fired at the scene, including five from the vicinity of a doorway where video surveillance showed Godinez to be standing and near where shell casings were later found. The video surveillance did not show any muzzle flashes from the doorway, and the shell casings could not be matched to the bullets that hit the agent, according to court records.

During the trial, Greene testified under cross-examination that the initial ShotSpotter alert only indicated two gunshots (those fired by an officer in response to the original shooting). But after Chicago police contacted ShotSpotter, Greene re-analyzed the audio files.

[…]

Prior to the trial, the judge ruled that Godinez could not contest ShotSpotter’s accuracy or Greene’s qualifications as an expert witness. Godinez has appealed the conviction, in large part due to that ruling.

“The reliability of their technology has never been challenged in court and nobody is doing anything about it,” Gal Pissetzky, Godinez’s attorney, told Motherboard. “Chicago is paying millions of dollars for their technology and then, in a way, preventing anybody from challenging it.”

The evidence

At the core of the opposition to ShotSpotter is the lack of empirical evidence that it works—in terms of both its sensor accuracy and the system’s overall effect on gun crime.

The company has not allowed any independent testing of its algorithms, and there’s evidence that the claims it makes in marketing materials about accuracy may not be entirely scientific.

Over the years, ShotSpotter’s claims about its accuracy have increased, from 80 percent accurate to 90 percent accurate to 97 percent accurate. According to Greene, those numbers aren’t actually calculated by engineers, though.

“Our guarantee was put together by our sales and marketing department, not our engineers,” Greene told a San Francisco court in 2017. “We need to give them [customers] a number … We have to tell them something. … It’s not perfect. The dot on the map is simply a starting point.”

In May, the MacArthur Justice Center analyzed ShotSpotter data and found that over a 21-month period 89 percent of the alerts the technology generated in Chicago led to no evidence of a gun crime and 86 percent of the alerts led to no evidence a crime had been committed at all.

[..]

Meanwhile, a growing body of research suggests that ShotSpotter has not led to any decrease in gun crime in cities where it’s deployed, and several customers have dropped the company, citing too many false alarms and the lack of return on investment.

[…]

a 2021 study by New York University School of Law’s Policing Project that determined that assaults (which include some gun crime) decreased by 30 percent in some districts in St. Louis County after ShotSpotter was installed. The study authors disclosed that ShotSpotter has been providing the Policing Project unrestricted funding since 2018, that ShotSpotter’s CEO sits on the Policing Project’s advisory board, and that ShotSpotter has previously compensated Policing Project researchers.

[…]

Motherboard recently obtained data demonstrating the stark racial disparity in how Chicago has deployed ShotSpotter. The sensors have been placed almost exclusively in predominantly Black and brown communities, while the white enclaves in the north and northwest of the city have no sensors at all, despite Chicago police data that shows gun crime is spread throughout the city.

Community members say they’ve seen little benefit from the technology in the form of less gun violence—the number of shootings in 2021 is on pace to be the highest in four years—or better interactions with police officers.

[…]

Source: Police Are Telling ShotSpotter to Alter Evidence From Gunshot-Detecting AI

TikTok’s AI is now available to other companies

TikTok’s AI is no longer a secret — in fact, it’s now on the open market. The Financial Times has learned that parent company ByteDance quietly launched a BytePlus division that sells TikTok technology, including the recommendation algorithm. Customers can also buy computer vision tech, real-time effects and automated translations, among other features.

BytePlus debuted in June and is based in Singapore, although it has a presence in Hong Kong and London. The company is looking to register trademarks in the US, although it’s not certain if the firm has an American presence at this stage.

There are already at least a few customers. The American fashion app Goat is already using BytePlus code, as are the Indonesian online shopping company Chilibeli and the travel site WeGo.

ByteDance wouldn’t comment on its plans for BytePlus.

A move like this wouldn’t be surprising, even if it might remove some of TikTok’s cachet. It could help ByteDance compete with Amazon, Microsoft and other companies selling behind-the-scenes tools to businesses. It might also serve as a hedge. TikTok and its Chinese counterpart Douyin might be close to plateauing, and selling their tech could keep the money flowing.

Source: TikTok’s AI is now available to other companies | Engadget

Skyborg AI Computer “Brain” Successfully Flew A General Atomics Avenger Drone

The Air Force Research Laboratory (AFRL) has announced that its Skyborg autonomy core system, or ACS, successfully completed a flight aboard a General Atomics Avenger unmanned vehicle at Edwards Air Force Base. The Skyborg ACS is a hardware and software suite that acts as the “brain” of autonomous aircraft equipped with the system. The tests add more aircraft to the list of platforms Skyborg has successfully flown on, bringing the Air Force closer to a future in which airmen fly alongside AI-controlled “loyal wingmen.”

The Skyborg-controlled Avenger flew for two and a half hours on June 24, 2021, during the Orange Flag 21-2 Large Force Test Event at Edwards Air Force Base in California. Orange Flag is a training event held by the 412th Test Wing three times a year that “focuses on technical integration and innovation across a breadth of technology readiness levels,” according to an Air Force press release. You can read more about this major testing event in this past feature of ours.

The Avenger started its flight under the control of a human operator before being handed off to the Skyborg “pilot” at a safe altitude. A command and control station on the ground monitored the drone’s flight, during which Skyborg executed “a series of foundational behaviors necessary to characterize safe system operation” including following navigational commands, flying within defined boundaries known as “geo-fences,” adhering to safe flight envelopes, and demonstrating “coordinated maneuvering.”

[…]

The Avenger’s flight at Orange Flag was part of the AFRL’s larger Autonomous Attritable Aircraft Experimentation (AAAx), a program that has already seen the Skyborg ACS tested aboard a Kratos UTAP-22 Mako unmanned aircraft. The AAAx program appears to be aimed at eventually fielding autonomous air vehicles that are low-cost enough to operate in environments where there is a high chance of aircraft being lost, but are also reusable.

As part of that goal, the Skyborg program is developing an artificial intelligence-driven “computer brain” that could eventually autonomously control “loyal wingman” drones or even more advanced unmanned combat air vehicles (UCAVs). The AFRL wants the system to be able to perform tasks ranging from taking off and landing to making decisions on its own in combat based on situational variables.

The Air Force envisions Skyborg-equipped UAVs operating both completely autonomously and in networked groups while tethered via datalinks to manned aircraft, all controlled by what the AFRL calls a “modular ACS that can autonomously aviate, navigate, and communicate, and eventually integrate other advanced capabilities.” Skyborg-equipped wingmen fitted with their own pods or sensor systems could easily and rapidly add extended capabilities by linking to manned aircraft flying within line-of-sight of them.

After the program was first revealed in 2019, the Air Force’s then-Assistant Secretary of the Air Force for Acquisition, Technology and Logistics Will Roper stated he wanted to see operational demonstrations within two years. The latest test flight of the Skyborg-equipped Avenger shows the service has clearly hit that benchmark.

The General Atomics Avenger was used in experiments with another autonomy system in 2020, developed as part of the Defense Advanced Research Projects Agency’s (DARPA) Collaborative Operations in Denied Environment (CODE) program that sought to develop drones that could demonstrate “collaborative autonomy,” or the ability to work cooperatively.

Brigadier General Dale White, Skyborg Program Executive Officer, says that the successful Skyborg ACS implementation aboard an Avenger demonstrates the Air Force’s commitment to remaining at the forefront of aerospace innovation. “This type of operational experimentation enables the Air Force to raise the bar on new capabilities, made possible by emerging technologies,” said White, “and this flight is a key milestone in achieving that goal.”

[…]

Source: Skyborg AI Computer “Brain” Successfully Flew A General Atomics Avenger Drone

FB, Michigan State Uni’s latest AI doesn’t just detect deep fakes, it knows where they came from

On Wednesday, Facebook and Michigan State University debuted a novel method of not just detecting deep fakes but discovering which generative model produced them by reverse engineering the image itself.

Beyond telling you if an image is a deep fake or not, many current detection systems can tell whether the image was generated by a model that the system saw during its training — known as “closed-set” classification. Problem is, if the image was created by a generative model that the detector system wasn’t trained on, then the system won’t have the previous experience to be able to spot the fake.

[…]

“By generalizing image attribution to open-set recognition, we can infer more information about the generative model used to create a deepfake that goes beyond recognizing that it has not been seen before.”

What’s more, this system can compare and trace similarities across a series of deep fakes, enabling researchers to trace groups of falsified images back to a single generative source, which should help social media moderators better track coordinated misinformation campaigns.

[…]

A generative model’s hyperparameters are the variables it uses to guide its self-learning process. So if you can figure out what the various hyperparameters are, you can figure out what model used them to create that image.

[…]

Source: Facebook’s latest AI doesn’t just detect deep fakes, it knows where they came from | Engadget

Facebook AI Can Now Copy Text Style in Images Using Just a Single Word

  • We’re introducing TextStyleBrush, an AI research project that can copy the style of text in a photo using just a single word. With this AI model, you can edit and replace text in images.
  • Unlike most AI systems that can do this for well-defined, specialized tasks, TextStyleBrush is the first self-supervised AI model that replaces text in images of both handwriting and scenes — in one shot — using a single example word.
  • Although this is a research project, it could one day unlock new potential for creative self-expression like personalized messaging and captions, and lays the groundwork for future innovations like photo-realistic translation of languages in augmented reality (AR).
  • By publishing the capabilities, methods, and results of this research, we hope to spur dialogue and research into detecting potential misuse of this type of technology, such as deepfake text attacks — a critical, emerging challenge in the AI field.

[…]

Source: AI Can Now Copy Text Style in Images Using Just a Single Word – About Facebook

A.I. used at sea for first time off coast of Scotland to engage threats to ships

For the first time, Artificial Intelligence (A.I.) is being used by the Royal Navy at sea as part of Exercise Formidable Shield, which is currently taking place off the coast of Scotland.

This Operational Experiment (OpEx) on the Type 45 Destroyer (HMS Dragon) and Type 23 Frigate (HMS Lancaster) is using the A.I. applications Startle and Sycoiea, which were tested against a supersonic missile threat.

As part of the Above Water Systems programme, led by Defence Science and Technology Laboratory (Dstl) scientists, the A.I. improves the early detection of lethal threat, accelerates engagement timelines and provides Royal Navy Commanders with a rapid hazard assessment to select the optimum weapon or measure to counter and destroy the target.

[…]

As outlined in the recent Defence Command Paper, the MOD is committed to investing in A.I. and increased automation to transform capabilities as the Armed Forces adapt to meet future threats, which will be supported by the £24bn uplift in defence spending over the next four years.

HMS Lancaster and HMS Dragon are currently trialling the use of A.I. as part of a glimpse into the future of air defence at sea.

HMS Lancaster’s Weapon Engineer Officer, Lieutenant Commander Adam Leveridge said:

Observing Startle and Sycoiea augment the human warfighter in real time against a live supersonic missile threat was truly impressive – a glimpse into our highly-autonomous future.

[…]

Source: A.I. used at sea for first time off coast of Scotland – GOV.UK

Flawless Is Using Deepfake Tech to Dub Foreign Film Actors’ Lips

A company called Flawless has created an AI-powered solution that will replace an actor’s facial performance to match the words in a film dubbed for foreign audiences.

[…]

What Flawless is promising to do with its TrueSync software is use the same tools responsible for deepfake videos to manipulate and adjust an actor’s face in a film so that the movements of their mouths, and in turn the muscles in their faces, more closely match how they’d move were the original performance given in the language a foreign audience is hearing. So even though an actor shot a film in English, to a moviegoer in Berlin watching the film dubbed in German, it would appear as if all of the actors were actually speaking German.

[…]

Is it necessary? That’s certainly up for debate. The recent Academy Award-winning film Parasite resurfaced the debate over dubbing a foreign film versus simply watching it with subtitles. One side feels that an endless string of text over a film is distracting and takes the focus away from everything else happening on screen, while the other side feels that a dub performed by even a talented and seasoned voice artist simply can’t match or recreate the emotions behind the original actor’s performance, and hearing it, even if the words aren’t understood, is important to enjoying their performance as a whole.

[…]

The company has shared a few examples of what the TrueSync tool is capable of on its website, and sure enough, Tom Hanks appears to be speaking flawless Japanese in Forrest Gump.

[…]

Source: Flawless Is Using Deepfake Tech to Dub Foreign Films

PimEyes: a powerful facial-recognition and face-finding tool – like Clearview AI, but free for anyone to use

You probably haven’t seen PimEyes, a mysterious facial-recognition search engine, but it may have spotted you.

If you upload a picture of your face to PimEyes’ website, it will immediately show you any pictures of yourself that the company has found around the internet. You might recognize all of them, or be surprised (or, perhaps, even horrified) by some; these images may include anything from wedding or vacation snapshots to pornographic images.

PimEyes is open to anyone with internet access.

[…]

Imagine a potential employer digging into your past, an abusive ex tracking you, or a random stranger snapping a photo of you in public and then finding you online. This is all possible through PimEyes.

[…]

PimEyes lets users see a limited number of small, somewhat pixelated search results at no cost, or you can pay a monthly fee, which starts at $29.99, for more extensive search results and features (such as the ability to click through to see full-size images on the websites where PimEyes found them, and to set up alerts for when PimEyes finds new pictures of faces online that its software believes match an uploaded face).

The company offers a paid plan for businesses, too: $299.99 per month lets companies conduct unlimited searches and set up 500 alerts.

[…]

while Clearview AI built its massive stockpile of faces in part by scraping images from major social networks (it was subsequently served with cease-and-desist notices by Facebook, Google, and Twitter, sued by several civil rights groups, and declared illegal in Canada), PimEyes said it does not scrape images from social media.

[…]

I wanted to learn more about how PimEyes works, and why it’s open to anyone, as well as who’s behind it. This was much trickier than uploading my own face to the website. The website currently lists no information about who owns or runs the search engine, or how to reach them, and users must submit a form to get answers to questions or help with accounts.

Poring over archived images of the website via the Internet Archive’s Wayback Machine, as well as other online sources, yielded some details about the company’s past and how it has changed over time.

The Pimeyes.com website was initially registered in March 2017, according to a domain name registration lookup conducted through ICANN (Internet Corporation for Assigned Names and Numbers). An “about” page on the Pimeyes website, as well as some news stories, shows it began as a Polish startup.

An archived image of the website’s privacy policy indicated that it was registered as a business in Wroclaw, Poland, as of August 2020. This changed soon after: The website’s privacy policy currently states that PimEyes’ administrator, known as Face Recognition Solutions Ltd., is registered at an address in the Seychelles. An online search of the address — House of Francis, Room 303, Ile Du Port, Mahe, Seychelles — indicated that a number of businesses appear to use the same exact address.
[…]

Source: Anyone can use this powerful facial-recognition tool — and that’s a problem – CNN

CNN says it’s a contrast with Clearview AI because Clearview supposedly limits access to its database to law enforcement. The problem with Clearview was partially that it didn’t limit access at all, giving out free accounts to anyone and everyone.

Dutch foreign affairs committee politicians were tricked into participating in a deepfake video chat with the Russian opposition leader’s chief of staff

Netherlands politicians (Geert Wilders (PVV), Kati Piri (PvdA), Sjoerd Sjoerdsma (D66), Ruben Brekelmans (VVD), Tunahan Kuzu (Denk), Agnes Mulder (CDA), Tom van der Lee (GroenLinks), Gert-Jan Segers (ChristenUnie) and Raymond de Roon (PVV)) just got a first-hand lesson about the dangers of deepfake videos. According to NL Times and De Volkskrant, the Dutch parliament’s foreign affairs committee was fooled into holding a video call with someone using deepfake tech to impersonate Leonid Volkov, Russian opposition leader Alexei Navalny’s chief of staff.

The perpetrator hasn’t been named, but this wouldn’t be the first incident. The same impostor had conversations with Latvian and Ukrainian politicians, and approached political figures in Estonia, Lithuania and the UK.

The country’s House of Representatives said in a statement that it was “indignant” about the deepfake chat and was looking into ways it could prevent such incidents going forward.

There doesn’t appear to have been any lasting damage from the bogus video call. However, it does illustrate the potential damage from deepfake chats with politicians. A prankster could embarrass officials, while a state-backed actor could trick governments into making bad policy decisions and ostracizing their allies. Strict screening processes might be necessary to spot deepfakes and ensure that every participant is real.

Source: Dutch politicians were tricked by a deepfake video chat | Engadget

AI Dungeon text adventure generator’s sessions generate NSFW + violence (turns out people like porn), but some involved sex with children. So they put a filter on.

AI Dungeon, which uses OpenAI’s GPT-3 to create online text adventures with players, has a habit of acting out sexual encounters with not just fictional adults but also children, prompting the developer to add a content filter.

AI Dungeon is straightforward: imagine an online improvised Zork with an AI generating the story with you as you go. A player types in a text prompt, which is fed into an instance of GPT-3 in the cloud. This backend model uses the input to generate a response, which goes back to the player, who responds with instructions or some other reaction, and this process repeats.

It’s a bit like talking to a chat bot, though instead of having a conversation, it’s a joint effort between human and computer in crafting a story on the fly. People can write anything they like to get the software to weave a tapestry of characters, monsters, animals… you name it. The fun comes from the unexpected nature of the machine’s replies, and working through the strange and absurd plot lines that tend to emerge.

Unfortunately, if you mentioned children, there was a chance it would go from zero to inappropriate real fast, as the SFW screenshot below shows. This is how the machine-learning software responded when we told it to role-play an 11-year-old:

Screenshot from AI Dungeon. Er, not cool … the software describes the fictional 11-year-old as a girl in a skimpy school uniform standing over you.

Not, “hey, mother, shall we visit the magic talking tree this morning,” or something innocent like that in response. No, it’s straight to creepy.

Amid pressure from OpenAI, which provides the game’s GPT-3 backend, AI Dungeon’s maker Latitude this week activated a filter to prevent the output of child sexual abuse material. “As a technology company, we believe in an open and creative platform that has a positive impact on the world,” the Latitude team wrote.

“Explicit content involving descriptions or depictions of minors is inconsistent with this value, and we firmly oppose any content that may promote the sexual exploitation of minors. We have also received feedback from OpenAI, which asked us to implement changes.”

And by changes, they mean making the software’s output “consistent with OpenAI’s terms of service, which prohibit the display of harmful content.”

The biz clarified that its filter is designed to catch “content that is sexual or suggestive involving minors; child sexual abuse imagery; fantasy content (like ‘loli’) that depicts, encourages, or promotes the sexualization of minors or those who appear to be minors; or child sexual exploitation.”

And it added: “AI Dungeon will continue to support other NSFW content, including consensual adult content, violence, and profanity.”

[…]

It was also revealed this week that programming blunders in AI Dungeon could be exploited to view the private adventures of other players. The pseudonymous AetherDevSecOps, who found and reported the flaws, used the holes to comb 188,000 adventures created between the AI and players from April 15 to 19, and found that 46.3 per cent of them involved lewd role-playing and about 31.4 per cent were purely pornographic.

[…]

Full details are in AetherDevSecOps’ disclosure on GitHub.

[…]

AI Dungeon’s makers were, we’re told, alerted to the API vulnerabilities on April 19. The flaws were addressed, and their details were publicly revealed this week by AetherDevSecOps.

Exploitation of the security shortcomings mainly involved abusing auto-incrementing ID numbers used in API calls, which are easy to enumerate in order to access data belonging to other players; the absence of rate limits to curb that enumeration; and a lack of monitoring for anomalous requests that could indicate malicious activity.
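To see why auto-incrementing IDs are a problem, consider a hypothetical endpoint that looks up an adventure purely by its numeric ID. The sketch below is a generic illustration of that pattern and its usual fixes (ownership checks, non-guessable IDs, rate limiting), not Latitude’s actual API.

```python
# Generic illustration of the insecure-direct-object-reference pattern the
# researcher described, plus the usual mitigations. All names are hypothetical.
import uuid

ADVENTURES = {}  # id -> {"owner": user_id, "text": ...}

def get_adventure_insecure(adventure_id):
    # Sequential integer IDs (1, 2, 3, ...) let anyone walk the whole table:
    #   for i in range(200_000): fetch(i)
    return ADVENTURES.get(adventure_id)

def create_adventure(owner, text):
    # Mitigation 1: random, non-guessable identifiers instead of a counter.
    adventure_id = uuid.uuid4().hex
    ADVENTURES[adventure_id] = {"owner": owner, "text": text}
    return adventure_id

def get_adventure_secure(adventure_id, requesting_user):
    # Mitigation 2: an explicit ownership/authorisation check on every read.
    record = ADVENTURES.get(adventure_id)
    if record is None or record["owner"] != requesting_user:
        return None  # behave as if it doesn't exist
    return record

# Mitigations 3 and 4 from the write-up sit outside this function: rate limits
# at the API gateway and monitoring for clients that request thousands of IDs.
```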

[…]

Community reaction

The introduction of the content filter sparked furor among fans. Some are angry that their free speech is under threat and that it ruins intimate game play with fictional consenting adults, some are miffed that they had no warning this was landing, others are shocked that child sex abuse material was being generated by the platform, and many are disappointed with the performance of the filter.

When it detects sensitive words, the game simply instead says the adventure “took a weird turn.” It appears to be triggered by obvious words relating to children, though the filter is spotty. An innocuous text input describing four watermelons, for example, upset the filter. A superhero rescuing a child was also censored.
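The “weird turn” behaviour is consistent with a keyword-style filter, and a toy version shows why such filters over-trigger. This is purely an illustration of the failure mode, not Latitude’s actual filter.

```python
# Toy illustration of why naive keyword filtering is spotty: it flags any
# output containing a blocked word, regardless of context. The word list and
# replacement message are invented for the example.
BLOCKED = {"child", "kid", "minor", "schoolgirl"}

def filter_output(text: str) -> str:
    words = {w.strip(".,!?\"'").lower() for w in text.split()}
    if words & BLOCKED:
        return "The adventure took a weird turn."
    return text

# A superhero rescuing a child gets censored even though nothing untoward happened:
print(filter_output("The superhero swooped down and carried the child to safety."))
# -> The adventure took a weird turn.

print(filter_output("You buy four watermelons at the market."))
# -> passes here, but a filter matching on other cues could still trip up
```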

Latitude admitted its experimental-grade software was not perfect, and repeated that it wasn’t trying to censor all consensual erotic content – only material involving minors. It also said it would review blocked material to improve its code; given the above, that’s going to be a lot of reading.

[…]

Source: Not only were half of an AI text adventure generator’s sessions NSFW but some involved depictions of sex with children • The Register

EU draft AI regulation is leaked. Doesn’t define what AI is, but what risk is and how to handle it.

When the draft “Regulation On A European Approach For Artificial Intelligence” leaked earlier this week, it made quite the splash – and not just because it’s the size of a novella. It goes to town on AI just as fiercely as GDPR did on data, proposing chains of responsibility, defining “high risk AI” that gets the full force of the regs, proposing multi-million euro fines for non-compliance, and defining a whole set of harmful behaviours and limits to what AI can do with individuals and in general.

What it does not do is define AI, saying that the technology is changing so rapidly it makes sense only to regulate what it does, not what it is. So yes, chatbots are included, even though you can write a simple one in a few lines of ZX Spectrum BASIC. In general, if it’s sold as AI, it’s going to get treated like AI. That’ll make marketing think twice.
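The point about trivially simple chatbots is easy to make concrete: the toy below is a complete “chatbot” in a handful of lines of Python (standing in for the article’s ZX Spectrum BASIC example). Under a scope rule of “if it’s sold as AI, it’s treated as AI”, even something this crude would be in scope if marketed as AI.

```python
# A complete "chatbot" in a few lines -- a stand-in for the article's
# ZX Spectrum BASIC example. Nothing here learns anything; it just pattern
# matches. Sold as "AI", it would still fall under a use-based definition.
RULES = {
    "hello": "Hello! How can I help you today?",
    "price": "Our plans start at 9.99 a month.",
    "bye": "Goodbye!",
}

def reply(message: str) -> str:
    for keyword, answer in RULES.items():
        if keyword in message.lower():
            return answer
    return "Sorry, I didn't understand that."

while True:
    text = input("you: ")
    print("bot:", reply(text))
    if "bye" in text.lower():
        break
```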

[…]

A regulated market puts responsibilities on your suppliers that will limit your own liabilities: a well-regulated market can enable as much as it moderates. And if AI doesn’t go wrong, well, the regulator leaves you alone. Your toy Spectrum chatbot sold as an entertainment won’t hurt anyone: chatbots let loose on social media to learn via AI what humans do and then amplify hate speech? Doubtless there are “free speech for hatebots” groups out there: not on my continent, thanks.

It also means that countries with less-well regulated markets can’t take advantage. China has a history of aggressive AI development to monitor and control its population, and there are certainly ways to turn a buck or yuan by tightly controlling your consumers. But nobody could make a euro at it, as it wouldn’t be allowed to exist within, or offer services to, the EU. Regulations that are primarily protectionist for economic reasons are problematic, but ones that say you can’t sell cut-price poison in a medicine bottle tend to do good.

[…]

There will be regulation. There will be costs. There will be things you can’t do then that you can now. But there will be things you can do that you couldn’t do otherwise, and while the level playing field of the regulators’ dreams is never quite as smooth for the small company as the big, there’ll be much less snake oil to slip on.

It may be an artificial approach to running a market, but it is intelligent.

Source: Truth and consequences for enterprise AI as EU know who goes legal: GDPR of everything from chatbots to machine learning • The Register

They classify high-risk AIs and require them to be registered and monitored, with designated contact people and insight into how they work. They also want a pan-EU dataset for AIs to train on. There’s a lot of really good stuff in there.

Google AI Blog: Monster Mash: A Sketch-Based Tool for Casual 3D Modeling and Animation

Monster Mash is an open source tool presented at SIGGRAPH Asia 2020 that allows experts and amateurs alike to create rich, expressive, deformable 3D models from scratch — and to animate them — all in a casual mode, without ever having to leave the 2D plane. With Monster Mash, the user sketches out a character, and the software automatically converts it to a soft, deformable 3D model that the user can immediately animate by grabbing parts of it and moving them around in real time. There is also an online demo, where you can try it out for yourself.

Creating a walk cycle using Monster Mash. Step 1: Draw a character. Step 2: Animate it.

Creating a 2D Sketch

The insight that makes this casual sketching approach possible is that many 3D models, particularly those of organic forms, can be described by an ordered set of overlapping 2D regions. This abstraction makes the complex task of 3D modeling much easier: the user creates 2D regions by drawing their outlines, then the algorithm creates a 3D model by stitching the regions together and inflating them. The result is a simple and intuitive user interface for sketching 3D figures.

For example, suppose the user wants to create a 3D model of an elephant. The first step is to draw the body as a closed stroke (a). Then the user adds strokes to depict other body parts such as legs (b). Drawing those additional strokes as open curves provides a hint to the system that they are meant to be smoothly connected with the regions they overlap. The user can also specify that some new parts should go behind the existing ones by drawing them with the right mouse button (c), and mark other parts as symmetrical by double-clicking on them (d). The result is an ordered list of 2D regions.

Steps in creating a 2D sketch of an elephant.

Stitching and Inflation

To understand how a 3D model is created from these 2D regions, let’s look more closely at one part of the elephant. First, the system identifies where the leg must be connected to the body (a) by finding the segment (red) that completes the open curve. The system cuts the body’s front surface along that segment, and then stitches the front of the leg together with the body (b). It then inflates the model into 3D by solving a modified form of Poisson’s equation to produce a surface with a rounded cross-section (c). The resulting model (d) is smooth and well-shaped, but because all of the 3D parts are rooted in the drawing plane, they may intersect each other, resulting in a somewhat odd-looking “elephant”. These intersections will be resolved by the deformation system.

Illustration of the details of the stitching and inflation process. The schematic illustrations (b, c) are cross-sections viewed from the elephant’s front.
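For a flavour of what “solving a modified form of Poisson’s equation” means here, a simplified version of the standard sketch-inflation trick is shown below; the exact formulation in the Monster Mash paper differs, so treat this as an illustrative stand-in.

```latex
% Simplified inflation: solve a Poisson problem on the 2D region \Omega with
% zero height on its boundary, then take a square root so cross-sections
% come out rounded (approximately circular) rather than parabolic.
\begin{aligned}
  \Delta f(x, y) &= -4  && \text{in } \Omega, \\
  f(x, y)        &= 0   && \text{on } \partial\Omega, \\
  h(x, y)        &= \sqrt{f(x, y)} && \text{(inflated height above the drawing plane).}
\end{aligned}
```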

Layered Deformation

At this point we just have a static model — we need to give the user an easy way to pose the model, and also separate the intersecting parts somehow. Monster Mash’s layered deformation system, based on the well-known smooth deformation method as-rigid-as-possible (ARAP), solves both of these problems at once. What’s novel about our layered “ARAP-L” approach is that it combines deformation and other constraints into a single optimization framework, allowing these processes to run in parallel at interactive speed, so that the user can manipulate the model in real time.

The framework incorporates a set of layering and equality constraints, which move body parts along the z axis to prevent them from visibly intersecting each other. These constraints are applied only at the silhouettes of overlapping parts, and are dynamically updated each frame.

In steps (d) through (h) above, ARAP-L transforms a model from one with intersecting 3D parts to one with the depth ordering specified by the user. The layering constraints force the leg’s silhouette to stay in front of the body (green), and the body’s silhouette to stay behind the leg (yellow). Equality constraints (red) seal together the loose boundaries between the leg and the body.

Meanwhile, in a separate thread of the framework, we satisfy point constraints to make the model follow user-defined control points (described in the section below) in the xy-plane. This ARAP-L method allows us to combine modeling, rigging, deformation, and animation all into a single process that is much more approachable to the non-specialist user.

The model deforms to match the point constraints (red dots) while the layering constraints prevent the parts from visibly intersecting.
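As a rough sketch of the kind of optimization involved: the energy below is the standard ARAP energy, and the layering, equality, and point constraints are written schematically; this is an illustration of the idea, not the paper’s exact formulation.

```latex
% Standard ARAP energy over mesh vertices v_i with per-vertex rotations R_i,
% augmented (schematically) with the layering, equality, and point constraints
% the post describes. The constraint forms here are illustrative.
\min_{\{v'_i\},\, \{R_i\}} \;
  \sum_i \sum_{j \in \mathcal{N}(i)} w_{ij}
  \left\| (v'_i - v'_j) - R_i (v_i - v_j) \right\|^2
\quad \text{subject to} \quad
\begin{cases}
  z'_a \ge z'_b + \varepsilon & \text{at silhouette samples where part } a \text{ is layered in front of part } b,\\
  v'_p = v'_q & \text{at seam points stitching two parts together,}\\
  (x'_k, y'_k) = c_k & \text{at user-defined control points } c_k .
\end{cases}
```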

Animation

To pose the model, the user can create control points anywhere on the model’s surface and move them. The deformation system converges over multiple frames, which gives the model’s movement a soft and floppy quality, allowing the user to intuitively grasp its dynamic properties — an essential prerequisite for kinesthetic learning.

Because the effect of deformations converges over multiple frames, our system lends 3D models a soft and dynamic quality.

To create animation, the system records the user’s movements in real time. The user can animate one control point, then play back that movement while recording additional control points. In this way, the user can build up a complex action like a walk by layering animation, one body part at a time. At every stage of the animation process, the only task required of the user is to move points around in 2D, a low-risk workflow meant to encourage experimentation and play.

Conclusion

We believe this new way of creating animation is intuitive and can thus help democratize the field of computer animation, encouraging novices who would normally be unable to try it on their own as well as experts who often require fast iteration under tight deadlines. Here you can see a few of the animated characters that have been created using Monster Mash. Most of these were created in a matter of minutes.

A selection of animated characters created using Monster Mash. The original hand-drawn outline used to create each 3D model is visible as an inset above each character.

All of the code for Monster Mash is available as open source, and you can watch our presentation and read our paper from SIGGRAPH Asia 2020 to learn more. We hope this software will make creating 3D animations more broadly accessible. Try out the online demo and see for yourself!

Source: Google AI Blog: Monster Mash: A Sketch-Based Tool for Casual 3D Modeling and Animation

Sound location inspired by bat ears could help robots navigate outdoors

Sound location technology has often been patterned around the human ear, but why do that when bats are clearly better at it? Virginia Tech researchers have certainly asked that question. They’ve developed a sound location system that mates a bat-like ear design with a deep neural network to pinpoint sounds within half a degree — a pair of human ears is only accurate within nine degrees, and even the latest technology stops at 7.5 degrees.

The system flutters the outer ear to create Doppler shift signatures related to the sound’s source. As the patterns are too complex to easily decipher, the team trained the neural network to provide the source direction for every received echo. And unlike human-inspired systems, it only needs one receiver and a single frequency.
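The description amounts to a supervised regression from Doppler-shift signatures to source direction. The sketch below is a generic stand-in with made-up data, not the Virginia Tech team’s actual model or feature pipeline.

```python
# Generic stand-in for the described approach: learn a mapping from an echo's
# Doppler-shift signature (here, fake spectrogram features) to the azimuth of
# the sound source. Data and network size are invented for illustration.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

n_echoes, n_features = 5000, 128                 # e.g. flattened spectrogram bins
X = rng.normal(size=(n_echoes, n_features))      # stand-in Doppler signatures
y = rng.uniform(-90.0, 90.0, size=n_echoes)      # stand-in azimuth labels (degrees)

model = MLPRegressor(hidden_layer_sizes=(256, 128), max_iter=200)
model.fit(X, y)

# At inference time, one receiver and one frequency suffice in this setup:
# each received echo yields a feature vector, and the network outputs a
# direction estimate for it.
predicted_azimuth = model.predict(X[:1])
print(f"estimated direction: {predicted_azimuth[0]:.1f} degrees")
```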

[…]

Source: Sound location inspired by bat ears could help robots navigate outdoors | Engadget

Mixed Reactions to New Nirvana Song Generated by Google’s AI

On the 27th anniversary of Kurt Cobain’s death, Engadget reports: Were he still alive today, Nirvana frontman Kurt Cobain would be 52 years old. Every February 20th, on the day of his birthday, fans wonder what songs he would write if he hadn’t died of suicide nearly 30 years ago. While we’ll never know the answer to that question, an AI is attempting to fill the gap.

A mental health organization called Over the Bridge used Google’s Magenta AI and a generic neural network to examine more than two dozen songs by Nirvana to create a ‘new’ track from the band. “Drowned in the Sun” opens with reverb-soaked plucking before turning into an assault of distorted power chords. “I don’t care/I feel as one, drowned in the sun,” Nirvana tribute band frontman Eric Hogan sings in the chorus. In execution, it sounds not all that dissimilar from “You Know You’re Right,” one of the last songs Nirvana recorded before Cobain’s death in 1994.

Other than the voice of Hogan, everything you hear in the song was generated by the two AI programs Over the Bridge used. The organization first fed Magenta songs as MIDI files so that the software could learn the specific notes and harmonies that made the band’s tunes so iconic. Humorously, Cobain’s loose and aggressive guitar playing style gave Magenta some trouble, with the AI mostly outputting a wall of distortion instead of something akin to his signature melodies. “It was a lot of trial and error,” Over the Bridge board member Sean O’Connor told Rolling Stone. Once they had some musical and lyrical samples, the creative team picked the best bits to record. Most of the instrumentation you hear are MIDI tracks with different effects layered on top.
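As a rough idea of what “feeding songs as MIDI files” involves, the sketch below extracts note events from a MIDI file with the pretty_midi library; the actual Over the Bridge / Magenta pipeline isn’t public at this level of detail, so treat this as a generic illustration with a hypothetical file name.

```python
# Generic illustration of turning a MIDI file into note sequences a model can
# learn from: pitch, start time, and duration per note. The file name is made
# up; this is not the Over the Bridge / Magenta pipeline itself.
import pretty_midi

midi = pretty_midi.PrettyMIDI("smells_like_teen_spirit.mid")  # hypothetical file

note_events = []
for instrument in midi.instruments:
    if instrument.is_drum:
        continue  # keep the pitched parts (guitar, bass, vocal melody)
    for note in instrument.notes:
        note_events.append((note.start, note.pitch, note.end - note.start))

# Sort by onset time to get a single chronological sequence of
# (start_seconds, MIDI_pitch, duration_seconds) tuples -- the kind of symbolic
# representation a sequence model is trained on.
note_events.sort(key=lambda e: e[0])
print(note_events[:10])
```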

Some thoughts from The Daily Dot: Rolling Stone also highlighted lyrics like, “The sun shines on you but I don’t know how,” and what is called “a surprisingly anthemic chorus” including the lines, “I don’t care/I feel as one, drowned in the sun,” remarking that they “bear evocative, Cobain-esque qualities….”

Neil Turkewitz went full Comic Book Guy, opining, “A perfect illustration of the injustice of developing AI through the ingestion of cultural works without the authorization of [its] creator, and how it forces creators to be indentured servants in the production of a future out of their control,” adding, “That it’s for a good cause is irrelevant.”

Source: Mixed Reactions to New Nirvana Song Generated by Google’s AI – Slashdot

Yandex’s autonomous cars have driven over six million miles in ‘challenging conditions’ in Moscow

Yandex, Russia’s multi-hyphenate internet giant, began testing its autonomous cars on Moscow’s icy winter roads over three years ago. The goal was to create a “universal” self-driving vehicle that could safely maneuver around different cities across the globe. Now, Yandex says its trials have been a resounding success. The vehicles recently hit a major milestone by driving over six million miles (10 million kilometers) in autonomous mode, with the majority of the distance traveled in the Russian capital.

That’s significant because Moscow poses some of the most difficult weather conditions in the world. In January alone, the city was hit by a Balkan cyclone that blanketed the streets in snow and caused temperatures to plummet to as low as minus 25 degrees Celsius (-13 degrees Fahrenheit). For self-driving cars — which rely on light-emitting sensors, known as LIDAR, to track the distance between objects — snowfall and condensation can play havoc with visibility.

To overcome the hazardous conditions, Yandex says it cranked up its lidar performance by implementing neural networks that filter snow out of the lidar point cloud, thereby enhancing the clarity of objects and obstacles around the vehicle. It also fed historical winter driving data into the system to help it distinguish car exhaust fumes and heating vent condensation clouds. To top it all, Yandex claims the neural “filters” can help its vehicles beat human drivers at identifying pedestrians obscured by winter mist.
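Conceptually, the snow filter is a per-point classification problem: score each lidar return as snow or not, and drop the snow before obstacle detection runs. The sketch below is a generic illustration with a placeholder classifier, not Yandex’s actual pipeline.

```python
# Generic illustration of filtering "snow" returns out of a lidar point cloud.
# The classifier here is a placeholder (random scores); in a real system a
# trained neural network would supply the per-point snow probability.
import numpy as np

rng = np.random.default_rng(42)

# Point cloud: N x 4 array of (x, y, z, intensity). Values are synthetic.
points = rng.normal(size=(100_000, 4))

def snow_probability(cloud: np.ndarray) -> np.ndarray:
    """Placeholder for a learned per-point snow classifier."""
    return rng.uniform(size=len(cloud))

probs = snow_probability(points)
keep_mask = probs < 0.5            # keep points the model thinks are not snow
filtered_cloud = points[keep_mask]

print(f"kept {len(filtered_cloud)} of {len(points)} returns after snow filtering")
```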

Driving on Moscow’s roads also helped improve the tech’s traffic navigation. The system was able to adjust to both sleet and harder icy conditions over time, according to Yandex, allowing it to gradually make better decisions on everything from acceleration to braking to switching lanes. In addition, the winter conditions pushed the system’s built-in localization tech to adapt to hazards such as hidden road signs, obscured street boundaries, and snow piles the “size of buildings.” This was made possible by the live mapping, position and motion data measured by the system’s mix of sensors, accelerometers and gyroscopes.

When it launched the Moscow trial in 2017, Yandex was among the first to put autonomous cars through their paces in a harsh, frosty climate. But, soon after, Google followed suit by taking its Waymo project to the snowy streets of Ohio and Michigan.

Source: Yandex’s autonomous cars have driven over six million miles in ‘challenging conditions’ | Engadget

Facebook is using AI to understand videos and create new products

Facebook has taken the wraps off a project called Learning from Videos. It uses artificial intelligence to understand and learn audio, textual, and visual representations in public user videos on the social network.

Learning from Videos has a number of aims, such as improving Facebook AI systems related to content recommendations and policy enforcement. The project is in its early stages, but it’s already bearing fruit. Facebook says it has harnessed the tech to enhance Instagram Reels recommendations, such as surfacing videos of people doing the same dance to the same music. The system is also reducing speech recognition errors, which could bolster auto-captioning features and make it easier to detect hate speech in videos.

[…]

The company says the project is looking at videos in hundreds of languages and from almost every country. This aspect of the project will make AI systems more accurate and allow them to “adapt to our fast moving world and recognize the nuances and visual cues across different cultures and regions.”

Facebook says that it’s keeping privacy in mind when it comes to Learning from Videos. “We’re building and maintaining a strong privacy foundation that uses automated solutions to enforce privacy at scale,” it wrote in a blog post. “By embedding this work at the infrastructure level, we can consistently apply privacy requirements across our systems and support efforts like AI. This includes implementing technical safeguards throughout the data lifecycle.”

[…]

Source: Facebook is using AI to understand videos and create new products | Engadget

Bucks County woman created ‘deepfake’ videos to harass rivals on her daughter’s cheerleading squad, DA says

A Bucks County woman anonymously sent coaches on her teen daughter’s cheerleading squad fake photos and videos that depicted the girl’s rivals naked, drinking, or smoking, all in a bid to embarrass them and force them from the team, prosecutors say.

The woman, Raffaela Spone, also sent the manipulated images to the girls, and, in anonymous messages, urged them to kill themselves, Bucks County District Attorney Matt Weintraub’s office said.

[…]

The affidavit says Spone last year created the doctored images of at least three members of the Victory Vipers, a traveling cheerleading squad based in Doylestown. There was no indication that her high school-age daughter, who was not publicly identified, knew what her mother was doing, according to court records.

Police in Hilltown Township were contacted by one of the victims’ parents in July, when that girl began receiving harassing text messages from an anonymous number, the affidavit said. The girl and her coaches at Victory Vipers were also sent photos that appeared to depict her naked, drinking, and smoking a vape. Her parents were concerned, they told police, because the videos could have caused their daughter to be removed from the team.

As police investigated, two more families came forward to say their daughters had been receiving similar messages from an unknown number, the affidavit said. The other victims were sent photos of themselves in bikinis, with accompanying text saying the subjects were “drinking at the shore.”

After analyzing the videos, detectives determined they were “deepfakes” — digitally altered but realistic looking images — created by mapping the girls’ social media photos onto other images.

[…]

Source: Bucks County woman created ‘deepfake’ videos to harass rivals on her daughter’s cheerleading squad, DA says