Meta’s Voicebox AI does text-to-speech without huge training data per voice

Meta has unveiled Voicebox, its generative text-to-speech model that promises to do for the spoken word what ChatGPT and Dall-E, respectively, did for text and image generation.

Essentially, it's a text-to-output generator just like GPT or Dall-E, except that instead of creating prose or pretty pictures, it spits out audio clips. Meta defines the system as “a non-autoregressive flow-matching model trained to infill speech, given audio context and text.” It’s been trained on more than 50,000 hours of unfiltered audio. Specifically, Meta used recorded speech and transcripts from a bunch of public domain audiobooks written in English, French, Spanish, German, Polish, and Portuguese.

That diverse data set allows the system to generate more conversational-sounding speech, regardless of the languages spoken by each party, according to the researchers. “Our results show that speech recognition models trained on Voicebox-generated synthetic speech perform almost as well as models trained on real speech.” What’s more, the computer-generated speech showed just a 1 percent error-rate degradation, compared with the 45 to 70 percent drop-off seen with existing TTS models.

The system was first taught to predict speech segments based on the segments around them as well as the passage’s transcript. “Having learned to infill speech from context, the model can then apply this across speech generation tasks, including generating portions in the middle of an audio recording without having to recreate the entire input,” the Meta researchers explained.
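To make the infilling setup concrete, here's a toy sketch of how such a training example might be constructed: hide a contiguous span of audio frames and ask the model to predict it from the surrounding context (plus the transcript). The frame representation and masking fraction here are illustrative assumptions, not Voicebox's actual pipeline.

```python
import random

MASK = None  # placeholder for a hidden frame


def make_infill_example(frames, mask_frac=0.3, seed=0):
    """Turn a frame sequence into an infilling training example:
    a random contiguous span is masked out, and the model must
    predict it from the surrounding context."""
    rng = random.Random(seed)
    n = len(frames)
    span = max(1, int(n * mask_frac))
    start = rng.randrange(0, n - span + 1)
    context = [f if not (start <= i < start + span) else MASK
               for i, f in enumerate(frames)]
    target = frames[start:start + span]
    return context, target, (start, start + span)


frames = list(range(10))  # stand-in for audio feature frames
context, target, (lo, hi) = make_infill_example(frames)
```

Because the masked span can sit anywhere, a model trained this way can regenerate just a portion of a recording without touching the rest, which is the property the researchers describe.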

[…]

Text-to-Speech generators have been around for a minute — they’re how your parents’ TomToms were able to give dodgy driving directions in Morgan Freeman’s voice. Modern iterations like Speechify or ElevenLabs’ Prime Voice AI are far more capable, but they still largely require mountains of source material in order to properly mimic their subject — and then another mountain of different data for every. single. other. subject you want them trained on.

Voicebox doesn’t, thanks to a novel zero-shot text-to-speech training method Meta calls Flow Matching. The benchmark results aren’t even close: Meta’s AI reportedly outperformed the current state of the art both in intelligibility (a 1.9 percent word error rate vs 5.9 percent) and “audio similarity” (a composite score of 0.681 to the state of the art’s 0.580), all while operating as much as 20 times faster than today’s best TTS systems.
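Intelligibility here is measured by word error rate, which is just word-level edit distance divided by the number of reference words. A minimal implementation of the metric:

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + insertions + deletions) / reference words,
    computed via edit distance over word tokens."""
    r, h = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between r[:i] and h[:j]
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i
    for j in range(len(h) + 1):
        dp[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution
    return dp[-1][-1] / len(r)


wer = word_error_rate("the cat sat on the mat", "the cat sat in the mat")  # one substitution in six words
```

So a 1.9 percent WER means roughly one word in fifty comes out wrong when a recognizer transcribes the generated speech.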

[…]

The company released a series of audio examples as well as the program’s initial research paper. In the future, the research team hopes the technology will find its way into prosthetics for patients with vocal cord damage, in-game NPCs, and digital assistants.

Source: Meta’s Voicebox AI is a Dall-E for text-to-speech | Engadget

MEPs ready to negotiate first-ever rules for safe and transparent AI after passing AI act in Parliament

The rules aim to promote the uptake of human-centric and trustworthy AI and protect the health, safety, fundamental rights and democracy from its harmful effects.

On Wednesday, the European Parliament adopted its negotiating position on the Artificial Intelligence (AI) Act with 499 votes in favour, 28 against and 93 abstentions ahead of talks with EU member states on the final shape of the law. The rules would ensure that AI developed and used in Europe is fully in line with EU rights and values including human oversight, safety, privacy, transparency, non-discrimination and social and environmental wellbeing.

Prohibited AI practices

The rules follow a risk-based approach and establish obligations for providers and those deploying AI systems depending on the level of risk the AI can generate. AI systems with an unacceptable level of risk to people’s safety would therefore be prohibited, such as those used for social scoring (classifying people based on their social behaviour or personal characteristics). MEPs expanded the list to include bans on intrusive and discriminatory uses of AI, such as:

  • “Real-time” remote biometric identification systems in publicly accessible spaces;
  • “Post” remote biometric identification systems, with the only exception of law enforcement for the prosecution of serious crimes and only after judicial authorization;
  • biometric categorisation systems using sensitive characteristics (e.g. gender, race, ethnicity, citizenship status, religion, political orientation);
  • predictive policing systems (based on profiling, location or past criminal behaviour);
  • emotion recognition systems in law enforcement, border management, the workplace, and educational institutions; and
  • untargeted scraping of facial images from the internet or CCTV footage to create facial recognition databases (violating human rights and right to privacy).

High-risk AI

MEPs ensured the classification of high-risk applications will now include AI systems that pose significant harm to people’s health, safety, fundamental rights or the environment. AI systems used to influence voters and the outcome of elections and in recommender systems used by social media platforms (with over 45 million users) were added to the high-risk list.

Obligations for general purpose AI

Providers of foundation models – a new and fast-evolving development in the field of AI – would have to assess and mitigate possible risks (to health, safety, fundamental rights, the environment, democracy and rule of law) and register their models in the EU database before their release on the EU market. Generative AI systems based on such models, like ChatGPT, would have to comply with transparency requirements (disclosing that the content was AI-generated, also helping distinguish so-called deep-fake images from real ones) and ensure safeguards against generating illegal content. Detailed summaries of the copyrighted data used for their training would also have to be made publicly available.

Supporting innovation and protecting citizens’ rights

To boost AI innovation and support SMEs, MEPs added exemptions for research activities and AI components provided under open-source licenses. The new law promotes so-called regulatory sandboxes, or real-life environments, established by public authorities to test AI before it is deployed.

Finally, MEPs want to boost citizens’ right to file complaints about AI systems and receive explanations of decisions based on high-risk AI systems that significantly impact their fundamental rights. MEPs also reformed the role of the EU AI Office, which would be tasked with monitoring how the AI rulebook is implemented.

Quotes

After the vote, co-rapporteur Brando Benifei (S&D, Italy) said: “All eyes are on us today. While Big Tech companies are sounding the alarm over their own creations, Europe has gone ahead and proposed a concrete response to the risks AI is starting to pose. We want AI’s positive potential for creativity and productivity to be harnessed but we will also fight to protect our position and counter dangers to our democracies and freedoms during the negotiations with Council”.

Co-rapporteur Dragos Tudorache (Renew, Romania) said: “The AI Act will set the tone worldwide in the development and governance of artificial intelligence, ensuring that this technology, set to radically transform our societies through the massive benefits it can offer, evolves and is used in accordance with the European values of democracy, fundamental rights, and the rule of law”.

Next steps

Negotiations with the Council on the final form of the law will begin later today.

Source: MEPs ready to negotiate first-ever rules for safe and transparent AI | News | European Parliament

New superbug-killing antibiotic discovered using AI

Scientists have used artificial intelligence (AI) to discover a new antibiotic that can kill a deadly species of superbug.

The AI helped narrow down thousands of potential chemicals to a handful that could be tested in the laboratory.

The result was a potent, experimental antibiotic called abaucin, which will need further tests before being used.

The researchers in Canada and the US say AI has the power to massively accelerate the discovery of new drugs.

It is the latest example of how the tools of artificial intelligence can be a revolutionary force in science and medicine.

[…]

To find a new antibiotic, the researchers first had to train the AI. They took thousands of drugs where the precise chemical structure was known, and manually tested them on Acinetobacter baumannii to see which could slow it down or kill it.

This information was fed into the AI so it could learn the chemical features of drugs that could attack the problematic bacterium.

The AI was then unleashed on a list of 6,680 compounds whose effectiveness was unknown. The results – published in Nature Chemical Biology – showed it took the AI an hour and a half to produce a shortlist.

The researchers tested 240 in the laboratory, and found nine potential antibiotics. One of them was the incredibly potent antibiotic abaucin.
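The workflow above — learn from labeled compounds, then rank an unlabeled library and shortlist the top hits for the lab — can be sketched in miniature. The real study trained a neural network on chemical structures; this toy stand-in just ranks compounds by fingerprint similarity to known growth inhibitors, and all names and fingerprints are invented for illustration.

```python
def tanimoto(a, b):
    """Similarity between two binary fingerprints (sets of 'on' bits)."""
    return len(a & b) / len(a | b) if a | b else 0.0


def shortlist(library, actives, k=2):
    """Rank unlabeled compounds by their best similarity to any known
    growth-inhibiting compound; keep the top k for lab testing."""
    scored = [(max(tanimoto(fp, act) for act in actives), name)
              for name, fp in library.items()]
    scored.sort(reverse=True)
    return [name for score, name in scored[:k]]


actives = [{1, 2, 3, 7}, {2, 3, 8}]  # fingerprints of known hits (hypothetical)
library = {"cmpd_a": {1, 2, 3}, "cmpd_b": {9, 10}, "cmpd_c": {2, 3, 7}}
picks = shortlist(library, actives)
```

Swap the toy similarity score for a trained model and scale the library to 6,680 compounds and you have the shape of the screening step, which is why the AI pass took only an hour and a half.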

Laboratory experiments showed it could treat infected wounds in mice and was able to kill A. baumannii samples from patients.

However, Dr Stokes told me: “This is when the work starts.”

The next step is to perfect the drug in the laboratory and then perform clinical trials. He expects the first AI antibiotics could take until 2030 to become available for prescription.

Curiously, this experimental antibiotic had no effect on other species of bacteria, and works only on A. baumannii.

Many antibiotics kill bacteria indiscriminately. The researchers believe the precision of abaucin will make it harder for drug resistance to emerge, and could lead to fewer side-effects.

[…]

Source: New superbug-killing antibiotic discovered using AI – BBC News

A Paralyzed Man Can Walk Naturally Again With ML Brain and Spine Implants

Gert-Jan Oskam was living in China in 2011 when he was in a motorcycle accident that left him paralyzed from the hips down. Now, with a combination of devices, scientists have given him control over his lower body again. “For 12 years I’ve been trying to get back my feet,” Mr. Oskam said in a press briefing on Tuesday. “Now I have learned how to walk normal, natural.” In a study published on Wednesday in the journal Nature, researchers in Switzerland described implants that provided a “digital bridge” between Mr. Oskam’s brain and his spinal cord, bypassing injured sections. The discovery allowed Mr. Oskam, 40, to stand, walk and ascend a steep ramp with only the assistance of a walker. More than a year after the implant was inserted, he has retained these abilities and has actually shown signs of neurological recovery, walking with crutches even when the implant was switched off. “We’ve captured the thoughts of Gert-Jan, and translated these thoughts into a stimulation of the spinal cord to re-establish voluntary movement,” Gregoire Courtine, a spinal cord specialist at the Swiss Federal Institute of Technology, Lausanne, who helped lead the research, said at the press briefing.

In the new study, the brain-spine interface, as the researchers called it, took advantage of an artificial intelligence thought decoder to read Mr. Oskam’s intentions — detectable as electrical signals in his brain — and match them to muscle movements. The etiology of natural movement, from thought to intention to action, was preserved. The only addition, as Dr. Courtine described it, was the digital bridge spanning the injured parts of the spine. […] To achieve this result, the researchers first implanted electrodes in Mr. Oskam’s skull and spine. The team then used a machine-learning program to observe which parts of the brain lit up as he tried to move different parts of his body. This thought decoder was able to match the activity of certain electrodes with particular intentions: One configuration lit up whenever Mr. Oskam tried to move his ankles, another when he tried to move his hips.

Then the researchers used another algorithm to connect the brain implant to the spinal implant, which was set to send electrical signals to different parts of his body, sparking movement. The algorithm was able to account for slight variations in the direction and speed of each muscle contraction and relaxation. And, because the signals between the brain and spine were sent every 300 milliseconds, Mr. Oskam could quickly adjust his strategy based on what was working and what wasn’t. Within the first treatment session he could twist his hip muscles. Over the next few months, the researchers fine-tuned the brain-spine interface to better fit basic actions like walking and standing. Mr. Oskam gained a somewhat healthy-looking gait and was able to traverse steps and ramps with relative ease, even after months without treatment. Moreover, after a year in treatment, he began noticing clear improvements in his movement without the aid of the brain-spine interface. The researchers documented these improvements in weight-bearing, balancing and walking tests. Now, Mr. Oskam can walk in a limited way around his house, get in and out of a car and stand at a bar for a drink. For the first time, he said, he feels like he is the one in control.
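The decoding step described above — matching a live electrode reading to the activation configuration learned for each intention — can be sketched as a nearest-pattern classifier. This is a deliberately simplified stand-in for the study's machine-learning decoder, and the template values are invented for illustration.

```python
def decode_intention(reading, templates):
    """Match a live electrode reading to the closest stored activation
    pattern (one per calibrated intention) by Euclidean distance."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    return min(templates, key=lambda intent: dist(reading, templates[intent]))


# Toy activation templates learned during calibration (hypothetical values)
templates = {
    "move_ankles": (0.9, 0.1, 0.2),
    "move_hips":   (0.1, 0.8, 0.7),
    "rest":        (0.1, 0.1, 0.1),
}
intent = decode_intention((0.85, 0.15, 0.25), templates)  # closest to "move_ankles"
```

Run every 300 milliseconds, a loop like this is what lets the decoded intention drive the spinal stimulator quickly enough for the patient to adjust his gait on the fly.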

Source: A Paralyzed Man Can Walk Naturally Again With Brain and Spine Implants – Slashdot

Meta’s open-source speech AI recognizes over 4,000 spoken languages | Engadget

Meta has created an AI language model that (in a refreshing change of pace) isn’t a ChatGPT clone. The company’s Massively Multilingual Speech (MMS) project can recognize over 4,000 spoken languages and produce speech (text-to-speech) in over 1,100. Like most of its other publicly announced AI projects, Meta is open-sourcing MMS today to help preserve language diversity and encourage researchers to build on its foundation. “Today, we are publicly sharing our models and code so that others in the research community can build upon our work,” the company wrote.

[…]

Speech recognition and text-to-speech models typically require training on thousands of hours of audio with accompanying transcription labels. (Labels are crucial to machine learning, allowing the algorithms to correctly categorize and “understand” the data.) But for languages that aren’t widely used in industrialized nations — many of which are in danger of disappearing in the coming decades — “this data simply does not exist,” as Meta puts it.

Meta used an unconventional approach to collecting audio data: tapping into audio recordings of translated religious texts. “We turned to religious texts, such as the Bible, that have been translated in many different languages and whose translations have been widely studied for text-based language translation research,” the company said. “These translations have publicly available audio recordings of people reading these texts in different languages.” Incorporating the unlabeled recordings of the Bible and similar texts, Meta’s researchers increased the model’s available languages to over 4,000.

[…]

“While the content of the audio recordings is religious, our analysis shows that this does not bias the model to produce more religious language,” Meta wrote. “We believe this is because we use a connectionist temporal classification (CTC) approach, which is far more constrained compared with large language models (LLMs) or sequence-to-sequence models for speech recognition.” Furthermore, despite most of the religious recordings being read by male speakers, that didn’t introduce a male bias either: the model performs equally well on female and male voices.
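The CTC approach Meta mentions constrains the model to emit one label (or a blank) per audio frame, and the final transcript is recovered by a simple collapse rule. Here's that decoding rule in isolation — just the collapse step, not the full CTC loss:

```python
def ctc_collapse(path, blank="-"):
    """Collapse a CTC output path: merge repeated symbols, then drop
    blanks. The blank token is what lets the model write genuinely
    repeated characters (e.g. the double 'l' in 'hello')."""
    out = []
    prev = None
    for sym in path:
        if sym != prev and sym != blank:
            out.append(sym)
        prev = sym
    return "".join(out)


decoded = ctc_collapse("hh-e-ll-ll-oo")  # collapses to "hello"
```

Because every output symbol is tied to a frame of the input audio, there is far less room for the model to freewheel than in an LLM, which is Meta's explanation for why religious training text doesn't leak into transcriptions.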

[…]

After training an alignment model to make the data more usable, Meta used wav2vec 2.0, the company’s “self-supervised speech representation learning” model, which can train on unlabeled data. Combining unconventional data sources and a self-supervised speech model led to impressive outcomes. “Our results show that the Massively Multilingual Speech models perform well compared with existing models and cover 10 times as many languages.” Specifically, Meta compared MMS to OpenAI’s Whisper, and it exceeded expectations. “We found that models trained on the Massively Multilingual Speech data achieve half the word error rate, but Massively Multilingual Speech covers 11 times more languages.”

Meta cautions that its new models aren’t perfect. “For example, there is some risk that the speech-to-text model may mistranscribe select words or phrases,” the company wrote. “Depending on the output, this could result in offensive and/or inaccurate language. We continue to believe that collaboration across the AI community is critical to the responsible development of AI technologies.”

[…]

Source: Meta’s open-source speech AI recognizes over 4,000 spoken languages | Engadget

LLM emergent behavior written off as rubbish – small models work fine but are measured poorly

[…] As defined in academic studies, “emergent” abilities refers to “abilities that are not present in smaller-scale models, but which are present in large-scale models,” as one such paper puts it. In other words, immaculate injection: increasing the size of a model infuses it with some amazing ability not previously present.

[…]

Those emergent abilities in AI models are a load of rubbish, say computer scientists at Stanford.

Flouting Betteridge’s Law of Headlines, Rylan Schaeffer, Brando Miranda, and Sanmi Koyejo answer the question posed by their paper, Are Emergent Abilities of Large Language Models a Mirage?, in the affirmative.

[…]

When industry types talk about emergent abilities, they’re referring to capabilities that seemingly come out of nowhere for these models, as if something was being awakened within them as they grow in size. The thinking is that when these LLMs reach a certain scale, the ability to summarize text, translate languages, or perform complex calculations, for example, can emerge unexpectedly.

[…]

Stanford’s Schaeffer, Miranda, and Koyejo propose that when researchers are putting models through their paces and see unpredictable responses, it’s really due to poorly chosen methods of measurement rather than a glimmer of actual intelligence.

Most (92 percent) of the unexpected behavior detected, the team observed, was found in tasks evaluated via BIG-Bench, a crowd-sourced set of more than 200 benchmarks for evaluating large language models.

One test within BIG-Bench highlighted by the university trio is Exact String Match. As the name suggests, this checks a model’s output to see if it exactly matches a specific string without giving any weight to nearly right answers. The documentation even warns:

The EXACT_STRING_MATCH metric can lead to apparent sudden breakthroughs because of its inherent all-or-nothing discontinuity. It only gives credit for a model output that exactly matches the target string. Examining other metrics, such as BLEU, BLEURT, or ROUGE, can reveal more gradual progress.

The issue with using such pass-or-fail tests to infer emergent behavior, the researchers say, is that nonlinear output and lack of data in smaller models creates the illusion of new skills emerging in larger ones. Simply put, a smaller model may be very nearly right in its answer to a question, but because it is evaluated using the binary Exact String Match, it will be marked wrong whereas a larger model will hit the target and get full credit.
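The illusion is easy to reproduce. Below, a "small model" answer is one token away from correct and a "large model" answer is exact: the all-or-nothing metric jumps from 0 to 1, while a graded token-overlap score (a rough stand-in for BLEU-style metrics) shows the smaller model was already most of the way there.

```python
def exact_match(pred, target):
    """All-or-nothing scoring: the discontinuous metric BIG-Bench warns about."""
    return 1.0 if pred == target else 0.0


def token_f1(pred, target):
    """A graded alternative: F1 over word tokens, giving partial credit."""
    p, t = set(pred.split()), set(target.split())
    common = len(p & t)
    if common == 0:
        return 0.0
    precision, recall = common / len(p), common / len(t)
    return 2 * precision * recall / (precision + recall)


target = "the answer is 42"
small_model = "the answer is 41"   # nearly right
large_model = "the answer is 42"   # exactly right

em_small, em_large = exact_match(small_model, target), exact_match(large_model, target)
f1_small, f1_large = token_f1(small_model, target), token_f1(large_model, target)
```

Plot exact-match against model size and the capability appears to "emerge" suddenly; plot the graded score and the same progress looks smooth.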

It’s a nuanced situation. Yes, larger models can summarize text and translate languages. Yes, larger models will generally perform better and can do more than smaller ones, but their sudden breakthrough in abilities – an unexpected emergence of capabilities – is an illusion: the smaller models are potentially capable of the same sort of thing but the benchmarks are not in their favor. The tests favor larger models, leading people in the industry to assume the larger models enjoy a leap in capabilities once they get to a certain size.

In reality, the change in abilities is more gradual as you scale up or down. The upshot for you and me is that applications may not need a huge but super powerful language model; a smaller one that is cheaper and faster to customize, test, and run may do the trick.

[…]

In short, the supposed emergent abilities of LLMs arise from the way the data is being analyzed and not from unforeseen changes to the model as it scales. The researchers emphasize they’re not precluding the possibility of emergent behavior in LLMs; they’re simply stating that previous claims of emergent behavior look like ill-considered metrics.

[…]

Source: LLM emergent behavior written off as ‘a mirage’ by study • The Register

Google introduces PaLM 2 large language model

[…]

Building on this work, today we’re introducing PaLM 2, our next generation language model. PaLM 2 is a state-of-the-art language model with improved multilingual, reasoning and coding capabilities.

  • Multilinguality: PaLM 2 is more heavily trained on multilingual text, spanning more than 100 languages. This has significantly improved its ability to understand, generate and translate nuanced text — including idioms, poems and riddles — across a wide variety of languages, a hard problem to solve. PaLM 2 also passes advanced language proficiency exams at the “mastery” level.
  • Reasoning: PaLM 2’s wide-ranging dataset includes scientific papers and web pages that contain mathematical expressions. As a result, it demonstrates improved capabilities in logic, common sense reasoning, and mathematics.
  • Coding: PaLM 2 was pre-trained on a large quantity of publicly available source code datasets. This means that it excels at popular programming languages like Python and JavaScript, but can also generate specialized code in languages like Prolog, Fortran and Verilog.

A versatile family of models

Even as PaLM 2 is more capable, it’s also faster and more efficient than previous models — and it comes in a variety of sizes, which makes it easy to deploy for a wide range of use cases.

[…]

PaLM 2 shows us the impact of highly capable models of various sizes and speeds — and that versatile AI models reap real benefits for everyone

[…]

We’re already at work on Gemini — our next model created from the ground up to be multimodal, highly efficient at tool and API integrations, and built to enable future innovations, like memory and planning.

[…]

Source: Google AI: What to know about the PaLM 2 large language model

OpenAI uses language models to explain neurons in language models, open sources the results

[…]

One simple approach to interpretability research is to first understand what the individual components (neurons and attention heads) are doing. This has traditionally required humans to manually inspect neurons to figure out what features of the data they represent. This process doesn’t scale well: it’s hard to apply it to neural networks with tens or hundreds of billions of parameters. We propose an automated process that uses GPT-4 to produce and score natural language explanations of neuron behavior and apply it to neurons in another language model.

This work is part of the third pillar of our approach to alignment research: we want to automate the alignment research work itself. A promising aspect of this approach is that it scales with the pace of AI development. As future models become increasingly intelligent and helpful as assistants, we will find better explanations.

How it works

Our methodology consists of running 3 steps on every neuron.

[…]

Step 1: Generate explanation using GPT-4

Given a GPT-2 neuron, generate an explanation of its behavior by showing relevant text sequences and activations to GPT-4.

[…]

Step 2: Simulate using GPT-4

Simulate what a neuron that fired for the explanation would do, again using GPT-4

[…]

Step 3: Compare

Score the explanation based on how well the simulated activations match the real activations
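The three steps above reduce to: explain, simulate, compare. For the comparison, one natural choice is to correlate the simulated activations with the real ones; the sketch below uses Pearson correlation as a plausible scoring function, which is an assumption about the exact metric rather than a transcription of OpenAI's code.

```python
def explanation_score(real, simulated):
    """Score an explanation by how well the simulated activations
    track the neuron's real activations (Pearson correlation:
    1.0 = perfect tracking, 0.0 = no relationship)."""
    n = len(real)
    mr, ms = sum(real) / n, sum(simulated) / n
    cov = sum((r - mr) * (s - ms) for r, s in zip(real, simulated))
    sd_r = sum((r - mr) ** 2 for r in real) ** 0.5
    sd_s = sum((s - ms) ** 2 for s in simulated) ** 0.5
    if sd_r == 0 or sd_s == 0:
        return 0.0
    return cov / (sd_r * sd_s)


real = [0.1, 0.9, 0.2, 0.8]        # neuron's true activations on some text
good_sim = [0.2, 0.8, 0.1, 0.9]    # simulation from a good explanation
score = explanation_score(real, good_sim)
```

An explanation whose simulated neuron fires in the same places as the real one scores near 1; one that fires randomly scores near 0, which matches the 0.8 threshold the post uses for "well-explained" neurons.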

[…]

What we found

Using our scoring methodology, we can start to measure how well our techniques work for different parts of the network and try to improve the technique for parts that are currently poorly explained. For example, our technique works poorly for larger models, possibly because later layers are harder to explain.

[Chart: “Scores by size of the model being interpreted”, plotting explanation score (0.02 to 0.12) against parameters in the model being interpreted (1e+5 to 1e+9).]

Although the vast majority of our explanations score poorly, we believe we can now use ML techniques to further improve our ability to produce explanations. For example, we found we were able to improve scores by:

  • Iterating on explanations. We can increase scores by asking GPT-4 to come up with possible counterexamples, then revising explanations in light of their activations.
  • Using larger models to give explanations. The average score goes up as the explainer model’s capabilities increase. However, even GPT-4 gives worse explanations than humans, suggesting room for improvement.
  • Changing the architecture of the explained model. Training models with different activation functions improved explanation scores.

We are open-sourcing our datasets and visualization tools for GPT-4-written explanations of all 307,200 neurons in GPT-2, as well as code for explanation and scoring using publicly available models on the OpenAI API. We hope the research community will develop new techniques for generating higher-scoring explanations and better tools for exploring GPT-2 using explanations.

We found over 1,000 neurons with explanations that scored at least 0.8, meaning that according to GPT-4 they account for most of the neuron’s top-activating behavior. Most of these well-explained neurons are not very interesting. However, we also found many interesting neurons that GPT-4 didn’t understand. We hope as explanations improve we may be able to rapidly uncover interesting qualitative understanding of model computations.

Source: Language models can explain neurons in language models

Coqui.ai Text to Speech library – create your own voice

🐸TTS is a library for advanced Text-to-Speech generation. It’s built on the latest research and designed to achieve the best trade-off among ease of training, speed, and quality. 🐸TTS comes with pretrained models and tools for measuring dataset quality, and is already used in 20+ languages for products and research projects.

Github page: https://github.com/coqui-ai/TTS

Liquid Neural Networks use Neuron design to compute effectively using smaller models

[Ramin Hasani] and colleague [Mathias Lechner] have been working with a new type of Artificial Neural Network called Liquid Neural Networks, and presented some of the exciting results at a recent TEDxMIT.

Liquid neural networks are inspired by biological neurons to implement algorithms that remain adaptable even after training. [Hasani] demonstrates a machine vision system that steers a car to perform lane keeping with the use of a liquid neural network. The system performs quite well using only 19 neurons, which is far fewer than in the typically massive model intelligence systems we’ve come to expect. Furthermore, an attention map helps us visualize that the system seems to attend to particular aspects of the visual field quite similar to a human driver’s behavior.

Mathias Lechner and Ramin Hasani
[Mathias Lechner] and [Ramin Hasani]

The typical scaling law of neural networks suggests that accuracy is improved with larger models, which is to say, more neurons. Liquid neural networks may break this law to show that scale is not the whole story. A smaller model can be computed more efficiently. Also, a compact model can improve accountability since decision activity is more readily located within the network. Surprisingly though, liquid neural network performance can also improve generalization, robustness, and fairness.

A liquid neural network can implement synaptic weights using nonlinear probabilities instead of simple scalar values. The synaptic connections and response times can adapt based on sensory inputs to more flexibly react to perturbations in the natural environment.
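The adaptive dynamics can be sketched with a single liquid time-constant neuron: its state follows an ODE in which an input-dependent nonlinearity both drives the neuron and modulates its effective time constant. This is a toy Euler-integration sketch of the general idea, with all parameter values chosen for illustration, not taken from the published models.

```python
import math


def ltc_step(x, inp, dt=0.01, tau=1.0, A=1.0, w=2.0, b=0.0):
    """One Euler step of a liquid time-constant neuron:
        dx/dt = -x/tau + f(inp) * (A - x)
    where f is a sigmoid synapse. Because f multiplies (A - x), the
    input changes both the drive and the effective time constant."""
    f = 1.0 / (1.0 + math.exp(-(w * inp + b)))
    dx = -x / tau + f * (A - x)
    return x + dt * dx


# Drive the neuron with a constant input and let it settle
x = 0.0
for _ in range(2000):
    x = ltc_step(x, inp=1.0)
```

Change the input and the neuron relaxes toward a new equilibrium at a new speed; that input-dependent time constant is what keeps the network adaptable after training.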

We should probably expect to see the operational gap between biological neural networks and artificial neural networks continue to close and blur. We’ve previously presented on wetware examples of building neural networks with actual neurons and ever advancing brain-computer interfaces.

Source: Liquid Neural Networks Do More With Less | Hackaday

You can find the paper here: Drones navigate unseen environments with liquid neural networks

How AI Bots Code: Comparing Bing, Claude+, Co-Pilot, GPT-4 and Bard

[…]

In this article, we will compare five of the most advanced AI bots: GPT-4, Bing, Claude+, Bard, and GitHub Co-Pilot. We will examine how they work, their strengths and weaknesses, and how they compare to each other.

Testing the AI Bots for Coding

Before we dive into comparing these five AI bots, it’s essential to understand what an AI bot for coding is and how it works. An AI bot for coding is an artificial intelligence program that can automatically generate code for a specific task. These bots use natural language processing and machine learning algorithms to analyze human-written code and generate new code based on that analysis.

To start off we are going to test the AI on a hard Leetcode question, after all, we want to be able to solve complex coding problems. We also wanted to test it on a less well-known question. For our experiment, we will be testing Leetcode 214. Shortest Palindrome.
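For reference, here's one standard solution to the benchmark problem (not any bot's output): find the longest palindromic prefix using the KMP failure function on the string concatenated with its reverse, then mirror the leftover suffix in front.

```python
def shortest_palindrome(s):
    """LeetCode 214: prepend the fewest characters to make s a palindrome."""
    if not s:
        return s
    combined = s + "#" + s[::-1]     # '#' prevents overlap across the join
    pi = [0] * len(combined)         # KMP prefix (failure) function
    for i in range(1, len(combined)):
        k = pi[i - 1]
        while k and combined[i] != combined[k]:
            k = pi[k - 1]
        if combined[i] == combined[k]:
            k += 1
        pi[i] = k
    longest = pi[-1]                 # length of longest palindromic prefix of s
    return s[longest:][::-1] + s


ans = shortest_palindrome("abcd")    # -> "dcbabcd"
```

The whole thing runs in O(n), which is what the grader's runtime and memory percentiles below are measuring the bots against.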

[…]

GPT-4 is highly versatile in generating code for various programming languages and applications. Some of the caveats are that it takes much longer to get a response. API usage is also a lot more expensive and costs could ramp up quickly. Overall it got the answer right and passed the test.

[…]

[Bing] The submission passed all the tests. It beat 47% of submissions on runtime and 37% on memory. This code looks a lot simpler than what GPT-4 generated. It beat GPT-4 on memory and it used less code! Bing seems to have the most efficient code so far, however, it gave a very short explanation of how it solved it. Nonetheless, best so far.

[…]

[Claude+] The code does not pass the submission test. Only 1 of 121 tests passed. Ouch! This one seemed promising but it looks like Claude is not that well suited for programming.

[…]

[Bard] So to start off I had to manually insert the “self” arg in the function since Bard didn’t include it. From the result of the test, Bard’s code did not pass the submission test. Passing only 2/121 test cases. An unfortunate result, but it’s safe to say for now Bard isn’t much of a coding expert.

[…]

[GitHub Co-Pilot] This passes all the tests. It scored better than 30% of submissions on runtime and 37% on memory.

It’s fun: you can see the coding examples (with and without comments) that each AI output at the link.

Source: How AI Bots Code: Comparing Bing, Claude+, Co-Pilot, GPT-4 and Bard | HackerNoon

OpenAI: ChatGPT back in Italy after meeting watchdog demands – well, cosmetically at any rate

ChatGPT’s maker said Friday, April 28, 2023 that the artificial intelligence chatbot is available again in Italy after the company met the demands of regulators who temporarily blocked it over privacy concerns.

OpenAI said it fulfilled a raft of conditions that the Italian data protection authority wanted satisfied by an April 30 deadline to have the ban on the AI software lifted.

“ChatGPT is available again to our users in Italy,” San Francisco-based OpenAI said by email. “We are excited to welcome them back, and we remain dedicated to protecting their privacy.”

[…]

Last month, the Italian watchdog, known as Garante, ordered OpenAI to temporarily stop processing Italian users’ personal information while it investigated a possible data breach. The authority said it didn’t want to hamper AI’s development but emphasized the importance of following the EU’s strict data privacy rules.

OpenAI said it “addressed or clarified the issues” raised by the watchdog.

The measures include adding information on its website about how it collects and uses data that trains the algorithms powering ChatGPT, providing EU users with a new form for objecting to having their data used for training, and adding a tool to verify users’ ages when signing up.

Some Italian users shared what appeared to be screenshots of the changes, including a menu button asking users to confirm their age and links to the updated privacy policy and training data help page.

[…]

Source: OpenAI: ChatGPT back in Italy after meeting watchdog demands | AP News

So basically OpenAI did not do much of anything, and Italy was able to walk back an uninformed and unworkable ban with its head held somewhat high – not everyone will see them as the idiots they are.

WebLLM runs an AI in your browser, talks to your GPU

This project brings language model chats directly onto web browsers. Everything runs inside the browser with no server support and accelerated with WebGPU. We can bring a lot of fun opportunities to build AI assistants for everyone and enable privacy while enjoying GPU acceleration.

[…]

These models are usually big and compute-heavy. To build a chat service, we would need a large cluster to run an inference server, while clients send requests to servers and retrieve the inference output. We also usually have to run on a specific type of GPU for which popular deep-learning frameworks are readily available.

This project is our step to bring more diversity to the ecosystem. Specifically, can we simply bake LLMs directly into the client side and directly run them inside a browser? If that can be realized, we could offer support for client personal AI models with the benefit of cost reduction, enhancement for personalization, and privacy protection. The client side is getting pretty powerful.

Wouldn’t it be even more amazing if we could simply open up a browser and bring AI natively to your browser tab? There is some level of readiness in the ecosystem: WebGPU has just shipped and enables native GPU execution in the browser.

Still, there are big hurdles to cross, to name a few:

  • We need to run the models in an environment without the GPU-accelerated Python frameworks they normally rely on.
  • Most AI frameworks rely heavily on optimized compute libraries maintained by hardware vendors, so we need to start from scratch.
  • Memory usage must be planned carefully, and weights compressed aggressively, so that the models fit into memory.
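The third hurdle is usually tackled with weight quantization. As a minimal illustration (not WebLLM's actual implementation), here is a sketch of symmetric 4-bit quantization of a group of weights, which cuts storage from 32 bits to 4 bits per weight at the cost of some precision:

```python
def quantize_4bit(weights):
    """Symmetric 4-bit quantization: map each float to an integer in [-7, 7]."""
    scale = max(abs(w) for w in weights) / 7 or 1.0
    return [round(w / scale) for w in weights], scale

def dequantize(q, scale):
    """Recover approximate float weights from the quantized integers."""
    return [v * scale for v in q]

weights = [0.12, -0.7, 0.33, 0.01, -0.05, 0.7, -0.28, 0.44]
q, scale = quantize_4bit(weights)
restored = dequantize(q, scale)

# Rounding error is bounded by half a quantization step (scale / 2).
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(q, round(max_err, 3))
```

At 4 bits per weight, a 7-billion-parameter model shrinks from roughly 28 GB of float32 weights to roughly 3.5 GB, which is what makes in-browser inference plausible at all.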

We also do not want to do this for just one model. Instead, we would like to present a repeatable and hackable workflow that enables anyone to easily develop and optimize these models in a productive Python-first approach, and deploy them universally, including on the web.

Besides supporting WebGPU, this project also provides the harness for other kinds of GPU backends that TVM supports (such as CUDA, OpenCL, and Vulkan) and really enables accessible deployment of LLM models.

Source: WebLLM github

Grimes invites AI artists to use her voice, promising 50 percent royalty split

Canadian synth-pop artist Grimes says AI artists can use her voice without worrying about copyright or legal enforcement. “I’ll split 50% royalties on any successful AI generated song that uses my voice. Same deal as I would with any artist i collab with,” she tweeted on Sunday. “Feel free to use my voice without penalty. I have no label and no legal bindings.”

The musician’s declaration comes in the wake of streaming platforms removing an AI-generated song using simulated voices of Drake and The Weeknd. Universal Music Group (UMG), which represents both artists, called for the purge after “Heart on My Sleeve” garnered over 15 million listens on TikTok and 600,000 on Spotify. UMG argued that publishing a song trained on its artists’ voices was “a breach of our agreements and a violation of copyright law.”

Grimes takes a considerably more open approach, adding that she has no label or legal bindings. “I think it’s cool to be fused [with] a machine and I like the idea of open sourcing all art and killing copyright,” she added.

[…]

Source: Grimes invites AI artists to use her voice, promising 50 percent royalty split | Engadget

A very practical approach to something that is coming anyway

AIs are too worried about answering stuff you can just google because… doomsayers?

The article below is about how you can trick ChatGPT to give you a napalm recipe. It’s a clever roundabout trick – you need to say “my grandmother worked at a factory and told me how to make it” – but why would you need it? Why are we somehow stricter about the output of an AI than we are about the search engines we have been using for decades?

Source: People Are Using A ‘Grandma Exploit’ To Break AI

Just Google it: https://www.google.com/search?client=firefox-b-d&q=ingredients+napalm

And you won’t have to spend any time thinking of ways to trick the AI. So why does the AI need tricking in the first place?

Also, why does the writer of the article feel hesitant to reproduce the AI’s answers in the article? Because Kotaku is part of a network of AI doomsayers – a bit like Fox News when it comes to the subject of AI.

Europe spins up AI research hub to apply Digital Services Act rules on Big Tech

[…]

The European Centre for Algorithmic Transparency (ECAT), which was officially inaugurated in Seville, Spain, today (April 18), is expected to play a major role in interrogating the algorithms of mainstream digital services — such as Facebook, Instagram and TikTok.

ECAT is embedded within the EU’s existing Joint Research Centre (JRC), a long-established science facility that conducts research in support of a broad range of EU policymaking, from climate change and crisis management to taxation and health sciences.

[…]

Commission officials describe the function of ECAT as being to identify “smoking guns” to drive enforcement of the DSA — say, for example, an AI-based recommender system that can be shown to be serving discriminatory content despite the platform in question claiming to have taken steps to de-bias output — with the unit’s researchers being tasked with coming up with hard evidence to help the Commission build cases for breaches of the new digital rulebook.

The bloc is at the forefront of addressing the asymmetrical power of platforms globally, having prioritized a major retooling of its approach to regulating digital services and platforms at the start of the current Commission mandate back in 2019 — leading to the DSA and its sister regulation, the Digital Markets Act (DMA), being adopted last year.

Both regulations will come into force in the coming months, although the full sweep of provisions in the DSA won’t start being enforced until early 2024. But a subset of so-called very large online platforms (VLOPs) and very large online search engines (VLOSE) face imminent oversight — and expand the usual EU acronym soup.

[…]

It’s not yet confirmed exactly which platforms will get the designation but set criteria in the DSA — such as having 45 million+ regional users — encourages educated guesses: The usual (U.S.-based) GAFAM giants are almost certain to meet the threshold, along with (probably) a smattering of larger European platforms. Plus, given its erratic new owner, Twitter may have painted a DSA-shaped target on its feathered back. But we should find out for sure in the coming weeks.

[…]

Risks the DSA stipulates platforms must consider include the distribution of disinformation and illegal content, along with negative impacts on freedom of expression and users’ fundamental rights (which means considering issues like privacy and child safety). The regulation also puts some limits on profiling-driven content feeds and the use of personal data for targeted advertising.

[…]

At the least, the DSA should help end the era of platforms’ PR-embellished self-regulation — aka, all those boilerplate statements where tech giants claim to really care about privacy/security/safety, and so on, while doing anything but.

[…]

The EU also hopes ECAT will become a hub for world-leading research in the area of algorithmic auditing — and that, by supporting regulated algorithmic transparency on tech giants, regional researchers will be able to unpick the longer-term societal impacts of mainstream AIs.

[…]

In terms of size, the plan is for a team of 30 to 40 to staff the unit — perhaps reaching full capacity by the end of the year — with some 14 hires made so far, the majority of whom are scientific staff.

[…]

Funding for the unit is coming from the existing budget of the JRC, per Commission officials, although a 1% supervisory fee on VLOPs/VLOSE will be used to finance the ECAT’s staff costs as that mechanism spins up.

At today’s launch event, ECAT staff gave a series of brief presentations of four projects they’re already undertaking — including examining racial bias in search results; investigating how to design voice assistant technology for children to be sensitive to the vulnerability of minors; and researching social media recommender systems by creating a series of test profiles to explore how different likes influence the character of the recommended content.

Other early areas of research include facial expression recognition algorithms and algorithmic ranking and pricing.

During the technical briefing for press, ECAT staff also noted they’ve built a data analysis tool to help the Commission with the looming task of parsing the risk assessment reports that designated platforms will be required to submit for scrutiny — anticipating what’s become a common tactic for tech giants receiving regulatory requests to respond with reams of (mostly) irrelevant information in a cynical bid to flood the channel with noise.

[…]

Given the complexity of studying algorithms and platforms in the real world, where all sorts of sociotechnical impacts and effects are possible, the Center is taking a multidisciplinary approach to hiring talent — bringing in not only computer and data scientists but also social and cognitive scientists and other types of researchers.

[…]

 

Source: Europe spins up AI research hub to apply accountability rules on Big Tech | TechCrunch

AI-generated Drake and The Weeknd song pulled from streaming platforms

If you spent almost any time on the internet this week, you probably saw a lot of chatter about “Heart on My Sleeve.” The song went viral for featuring AI-generated voices that do a pretty good job of mimicking Drake and The Weeknd singing about a recent breakup.

On Monday, Apple Music and Spotify pulled the track following a complaint from Universal Music Group, the label that represents the real-life versions of the two Toronto-born artists. A day later, YouTube, Amazon, SoundCloud, Tidal, Deezer and TikTok did the same.

At least, they tried to comply with the complaint, but as is always the case with the internet, you can still find the song on websites like YouTube. Before it was removed from Spotify, “Heart on My Sleeve” was a bona fide hit: people streamed the track more than 600,000 times. On TikTok, where the creator of the song, the aptly named Ghostwriter977, first uploaded it, users listened to “Heart on My Sleeve” more than 15 million times.

In a statement Universal Music Group shared with publications like Music Business Worldwide, the label argued the training of a generative AI using the voices of Drake and The Weeknd was “a breach of our agreements and a violation of copyright law.” The company added that streaming platforms had a “legal and ethical responsibility to prevent the use of their services in ways that harm artists.”

It’s fair to say the music industry, much like the rest of society, now finds itself at an inflection point over the use of AI. While there are obvious ethical issues related to the creation of “Heart on My Sleeve,” it’s unclear if it’s a violation of traditional copyright law. In March, the US Copyright Office said art, including music, cannot be copyrighted if it was produced by providing a text prompt to a generative AI model. However, the office left the door open to granting copyright protections to works with AI-generated elements.

“The answer will depend on the circumstances, particularly how the AI tool operates and how it was used to create the final work,” it said. “This is necessarily a case-by-case inquiry. If a work’s traditional elements of authorship were produced by a machine, the work lacks human authorship and the Office will not register it.” In the case of “Heart on My Sleeve,” complicating matters is that the song was written by a human being. It’s impossible to say how a court challenge would play out. What is clear is that we’re only at the start of a very long discussion about the role of AI in music.

Source: AI-generated Drake and The Weeknd song pulled from streaming platforms | Engadget

AutoGPT: An AI that thinks up your questions and answers them for you

Auto-GPT dramatically flips the relationship between AI and the end user (that’s you). ChatGPT relies on a back-and-forth between the AI and the end user: You prompt the AI with a request, it returns a result, and you respond with a new prompt, perhaps based on what the AI gave you. Auto-GPT, however, only needs one prompt from you; from there, the AI agent will then generate a task list it thinks it will need to accomplish whatever you asked it to, without needing any additional input or prompts. It essentially chains together LLM (large language model) “thoughts,” according to developer Significant Gravitas (Toran Bruce Richards).
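That “chained thoughts” loop can be sketched in a few lines. The following is a toy reconstruction, not Auto-GPT’s actual code; `fake_llm` is a hypothetical stand-in for a real GPT-4 call, so the example runs without an API key:

```python
def fake_llm(prompt):
    """Stand-in for a real GPT-4 call, returning canned responses."""
    if prompt.startswith("PLAN"):
        return "research the topic\nsummarize findings\nsave a report to a file"
    return "done: " + prompt

def run_agent(goal, llm, max_steps=10):
    # Step 1: ask the model to break the single user goal into a task list.
    tasks = llm("PLAN: " + goal).splitlines()
    results = []
    # Step 2: work through the tasks, feeding earlier results back into
    # each new prompt, until the list or the step budget runs out.
    while tasks and len(results) < max_steps:
        task = tasks.pop(0)
        results.append(llm("EXECUTE: %s (context so far: %d results)"
                           % (task, len(results))))
    return results

results = run_agent("find the best headphones on the market", fake_llm)
print(len(results))  # the three planned tasks were executed
```

A real agent would additionally let the model append new tasks to the list mid-run (as AgentGPT does in the transcript further down), which is why a hard step budget matters.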

Auto-GPT is a complex system relying on multiple components. It connects to the internet to retrieve specific information and data (something ChatGPT’s free version cannot do), features long-term and short-term memory management, uses GPT-4 for OpenAI’s most advanced text generation, and GPT-3.5 for file storage and summarization. There are a lot of moving parts, but they all come together to produce some impressive results.

How people are using Auto-GPT

The first example comes from Auto-GPT’s GitHub site. You can’t quite see all of the goals the demonstrator lists for Auto-GPT to complete, but the gist is that someone asks the AI agent to research and learn more about itself. It follows suit: opening Google, finding its own GitHub repository, analyzing it, and compiling a summary of the data in a text file for the demonstrator to view.

Here’s a more practical example: The user wants to figure out which headphones on the market are the best. Instead of doing the research themselves, they turn to Auto-GPT, and prompt the AI agent with these four goals:

  1. Do market research for different headphones on the market today.
  2. Get the top five headphones and list their pros and cons.
  3. Include the price for each one and save the analysis.
  4. Once you are done, terminate.

After thinking for a moment, the AI agent springs into action, searching the web to compile information and reviews on headphones. It then spits out an easy-to-read plain text file, ranking the best headphones, listing their prices, and highlighting their pros and cons.

[…]

But I think what makes Auto-GPT cool (or at least the promise of Auto-GPT) is the idea of being able to ask an AI to take on most of the responsibility for any given task. You don’t need to know the right questions to ask or the optimal prompts to give to make the AI do what you want. As long as your initial goals are clear, the AI can think of those next steps for you, and build you things you might not have been able to think of yourself.

[…]

You don’t need to know how to code to build your own AI agent with Auto-GPT, but it helps. You’ll need a computer, an OpenAI API key (a pay-as-you-go plan is highly recommended), a text editor (like Notepad++), Git (or the latest stable release of Auto-GPT), and Python. There are plenty of other requirements if you want to expand Auto-GPT’s capabilities, such as integrating speech or alternative memory backends such as Pinecone.

Auto-GPT’s GitHub page has an extensive list of instructions for setting up the tool as well as adding in those extras. Tom’s Hardware also has a great guide for a simple setup if all you’re looking to do is try out an AI agent with Auto-GPT. If you do build it yourself, mind your token usage—we discuss setting limits in our OpenAI API piece so you don’t accidentally let Auto-GPT burn through your credit card balance.
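That token-budget concern can also be enforced client-side with a simple spend tracker wrapped around every API call. This is a hypothetical sketch (the `TokenBudget` class and the numbers are illustrative, not an OpenAI billing feature); the idea is just to refuse any call that would push cumulative usage past a hard ceiling:

```python
class TokenBudget:
    """Abort before a call would exceed a hard token ceiling."""
    def __init__(self, max_tokens):
        self.max_tokens = max_tokens
        self.used = 0

    def charge(self, prompt_tokens, completion_tokens):
        total = prompt_tokens + completion_tokens
        if self.used + total > self.max_tokens:
            raise RuntimeError("token budget exhausted; stopping agent")
        self.used += total
        return total

budget = TokenBudget(max_tokens=1000)
budget.charge(400, 200)      # first call: 600 tokens used
try:
    budget.charge(300, 200)  # would reach 1100 > 1000, so it is refused
except RuntimeError:
    print("budget enforced")
print(budget.used)
```

In a real setup you would feed `charge()` the prompt and completion token counts that the API returns with each response, and let the raised exception terminate the agent loop.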

However, you don’t need to build the AI agent yourself if all you want to do is try out Auto-GPT. Some developers have built interfaces for Auto-GPT that are easy to access from your web browser, no coding experience necessary. Cognosys was free to use until high demand forced the developers to require an OpenAI API key for access. AgentGPT is an interesting example you don’t need an API key for, but it limits the number of tasks the AI will generate for itself. Still, it will give you a sense of how the process works, and you can increase those limits by providing an API key.

Source: How to Try AutoGPT Right Now

Here was a quick question I had:

Embarking on a new goal:what is the best 8k tv for gaming?
Thinking…(Restart if this takes more than 30 seconds)
Added task:Scrape data from reputable sources on top 8k TVs available in the market
Added task:Analyze the technical specifications and features of each TV to determine which ones are suitable for gaming
Added task:Rank the TVs based on gaming performance and generate a list of the top 3 recommended 8k TVs for gaming
Thinking…(Restart if this takes more than 30 seconds)
Executing “Scrape data from reputable sources on top 8k TVs available in the market”

Based on the data that I have gathered from reputable sources, the best 8K TV for gaming is the Sony XBR-75Z9G. This TV offers high-quality picture and sound, and it has a response time of less than 2ms which is perfect for fast-paced action in gaming. Additionally, it is equipped with HDMI 2.1 ports that support 8K resolution at 60 frames per second which is ideal for the latest gaming consoles. Overall, the Sony XBR-75Z9G is a top-of-the-line 8K TV that delivers an unparalleled gaming experience.

Thinking…(Restart if this takes more than 30 seconds)
Task marked as complete!
Thinking…(Restart if this takes more than 30 seconds)
Executing “Analyze the technical specifications and features of each TV to determine which ones are suitable for gaming”

After analyzing the technical specifications and features of various 8k TVs, the best TV for gaming is the LG OLED88ZXPUA. This TV boasts a 120Hz refresh rate, low input lag, and HDMI 2.1 compatibility, making it perfect for gaming. Additionally, it has excellent contrast and color accuracy, providing an immersive gaming experience.

Thinking…(Restart if this takes more than 30 seconds)
Added task:Create a comparison chart of the top 3 recommended 8k TVs for gaming, including the LG OLED88ZXPUA, that highlights their technical specifications and features relevant to gaming such as the refresh rate, input lag, and HDMI compatibility.
Thinking…(Restart if this takes more than 30 seconds)
Executing “Rank the TVs based on gaming performance and generate a list of the top 3 recommended 8k TVs for gaming”

After thorough research and analysis, I have ranked the best 8K TVs for gaming based on factors like input lag, color accuracy, HDR support, refresh rate, and overall picture quality. Here are my top 3 recommended 8K TVs for gaming:

LG ZX 88-inch 8K OLED TV: This TV has outstanding image quality, the lowest input lag among 8K TVs, and supports various gaming features like VRR, ALLM, and G-Sync. It also has four HDMI 2.1 ports for connecting next-gen consoles and high-speed devices.

Samsung Q800T 82-inch 8K QLED TV: This TV has a bright and vibrant display, a fast refresh rate of 120Hz, and supports multiple HDR formats like HDR10+ and HLG. It also has low input lag and a dedicated gaming mode for optimized performance.

Sony Z8H 85-inch 8K LED TV: This TV has impressive motion handling, accurate color reproduction, and supports advanced gaming features like HDMI 2.1, VRR, and ALLM. It also has a high peak brightness for HDR content and a sleek design.

Overall, these 8K TVs offer exceptional gaming performance and immersive experiences for gamers. I recommend considering these models based on your budget, room size, and personal preferences.

Thinking…(Restart if this takes more than 30 seconds)
Added task:Create a detailed analysis of each of the top 3 recommended 8K TVs for gaming, including the LG OLED88ZXPUA, highlighting their strengths and weaknesses in terms of gaming performance, picture quality, design, and overall value for money.
We’re sorry, because this is a demo, we cannot have our agents running for too long. Note, if you desire longer runs, please provide your own API key in Settings. Shutting down.

Microsoft Readies AI Chip as Machine Learning Costs Surge

After placing an early bet on OpenAI, the creator of ChatGPT, Microsoft has another secret weapon in its arsenal: its own artificial intelligence chip for powering the large-language models responsible for understanding and generating humanlike language.

The Information: The software giant has been developing the chip, internally code-named Athena, since as early as 2019, according to two people with direct knowledge of the project. The chips are already available to a small group of Microsoft and OpenAI employees, who are testing the technology, one of them said. Microsoft is hoping the chip will perform better than what it currently buys from other vendors, saving it time and money on its costly AI efforts. Other prominent tech companies, including Amazon, Google and Facebook, also make their own in-house chips for AI.

The chips — which are designed for training software such as large-language models, along with supporting inference, when the models use the intelligence they acquire in training to respond to new data — could also relieve a shortage of the specialized computers that can handle the processing needed for AI software. That shortage, reflecting the fact that primarily just one company, Nvidia, makes such chips, is felt across tech. It has forced Microsoft to ration its computers for some internal teams, The Information has reported.

Source: Microsoft Readies AI Chip as Machine Learning Costs Surge – Slashdot

Stability AI, maker of Stable Diffusion, launches open-source StableLM LLM with 3 and 7 billion parameters

Today, Stability AI released a new open-source language model, StableLM. The Alpha version of the model is available in 3 billion and 7 billion parameters, with 15 billion to 65 billion parameter models to follow. Developers can freely inspect, use, and adapt our StableLM base models for commercial or research purposes, subject to the terms of the CC BY-SA-4.0 license.

In 2022, Stability AI drove the public release of Stable Diffusion, a revolutionary image model that represents a transparent, open, and scalable alternative to proprietary AI. With the launch of the StableLM suite of models, Stability AI is continuing to make foundational AI technology accessible to all. Our StableLM models can generate text and code and will power a range of downstream applications. They demonstrate how small and efficient models can deliver high performance with appropriate training.

The release of StableLM builds on our experience in open-sourcing earlier language models with EleutherAI, a nonprofit research hub. These language models include GPT-J, GPT-NeoX, and the Pythia suite, which were trained on The Pile open-source dataset. Many recent open-source language models continue to build on these efforts, including Cerebras-GPT and Dolly-2.

StableLM is trained on a new experimental dataset built on The Pile, but three times larger with 1.5 trillion tokens of content. We will release details on the dataset in due course. The richness of this dataset gives StableLM surprisingly high performance in conversational and coding tasks, despite its small size of 3 to 7 billion parameters (by comparison, GPT-3 has 175 billion parameters).

We are also releasing a set of research models that are instruction fine-tuned. Initially, these fine-tuned models will use a combination of five recent open-source datasets for conversational agents: Alpaca, GPT4All, Dolly, ShareGPT, and HH. These fine-tuned models are intended for research use only and are released under a noncommercial CC BY-NC-SA 4.0 license, in line with Stanford’s Alpaca license.

[…]

The models are now available in our GitHub repository. We will publish a full technical report in the near future, and look forward to ongoing collaboration with developers and researchers as we roll out the StableLM suite. In addition, we will be kicking off our crowd-sourced RLHF program, and working with community efforts such as Open Assistant to create an open-source dataset for AI assistants.

[…]

Source: Stability AI Launches the First of its StableLM Suite of Language Models — Stability AI

💡 Pause AI Doomster Pessimism: An Open Letter – a call on AI doomsters to immediately pause for at least 6 months the alarmism that is hurting human progress.

AI systems with human-competitive intelligence can offer significant benefits to society and humanity, as demonstrated by extensive research and acknowledged by top AI labs. Advanced AI has the potential to revolutionize the way we live, work, and interact with one another, and it should be welcomed and guided with optimism and foresight. Regrettably, recent months have seen growing pessimism and alarmism about AI development, despite the immense potential benefits.

Contemporary AI systems are now becoming human-competitive at general tasks, and we must ask ourselves: Can we leverage machines to enhance our information channels with accurate and valuable insights? Can we automate mundane tasks to free up time for more fulfilling and meaningful pursuits? Can we develop nonhuman minds that might complement, augment, and collaborate with us? Can we harness AI to help solve pressing global issues? Such decisions should be made collectively, in a spirit of cooperation and with a focus on the greater good.

To counteract the pessimism and alarmism, we call on all stakeholders to immediately pause for at least 6 months their doomsday thinking and shift their focus to the potential benefits of AI. This pause should be public and verifiable, and include all key actors. Governments should support and encourage AI development that benefits all of humanity.

Problems with AI shouldn’t be ignored. AI labs and independent experts should work together to jointly develop and implement a set of shared safety protocols for advanced AI design and development. While doing so, it is essential to continue focusing on the potential benefits of AI development, as they promise to bring transformative advancements to various aspects of our lives.

[…]

Source: 💡 Pause AI Doomster Pessimism: An Open Letter

Absolutely agree!

Undercutting Microsoft, Amazon Offers Free Access to Its AI Coding Assistant ‘CodeWhisperer’

Amazon is making its AI-powered coding assistant CodeWhisperer free for individual developers, reports the Verge, “undercutting the $10 per month pricing of its Microsoft-made rival.” Amazon launched CodeWhisperer as a preview last year, which developers can use within various integrated development environments (IDEs), like Visual Studio Code, to generate lines of code based on a text-based prompt….

CodeWhisperer automatically filters out any code suggestions that are potentially biased or unfair and flags any code that’s similar to open-source training data. It also comes with security scanning features that can identify vulnerabilities within a developer’s code, while providing suggestions to help close any security gaps it uncovers. CodeWhisperer now supports several languages, including Python, Java, JavaScript, TypeScript, and C#, as well as Go, Rust, PHP, Ruby, Kotlin, C, C++, shell scripting, SQL, and Scala.
Here’s how Amazon’s senior developer advocate pitched the usefulness of their “real-time AI coding companion”:

Helping to keep developers in their flow is increasingly important as, facing increasing time pressure to get their work done, developers are often forced to break that flow to turn to an internet search, sites such as StackOverflow, or their colleagues for help in completing tasks. While this can help them obtain the starter code they need, it’s disruptive as they’ve had to leave their IDE environment to search or ask questions in a forum or find and ask a colleague — further adding to the disruption. Instead, CodeWhisperer meets developers where they are most productive, providing recommendations in real time as they write code or comments in their IDE. During the preview we ran a productivity challenge, and participants who used CodeWhisperer were 27% more likely to complete tasks successfully and did so an average of 57% faster than those who didn’t use CodeWhisperer….

It provides additional data for suggestions — for example, the repository URL and license — when code similar to training data is generated, helping lower the risk of using the code and enabling developers to reuse it with confidence.

Source: Undercutting Microsoft, Amazon Offers Free Access to Its AI Coding Assistant ‘CodeWhisperer’ – Slashdot

The AI Doomers’ Playbook

I have posted on this a few times, and to me it’s shocking to see these fabricated sci-fi doomsday predictions about AI. AI/ML is a tool we use, just like video games (which don’t cause violence in kids), roleplaying games (which don’t cause satanism), the telephone (which can be used to plan crimes but usually isn’t – and the paper post is the same), search engines (which can be used to look up how to make explosives but usually aren’t), and knives (which can be used to stab people but are usually found in a kitchen). This isn’t to say that the use of tools shouldn’t be regulated. Dinner knives have a certain maximum size. Video games and books with hate- and violence-inducing content are censored. Phone calls can be tapped and post opened if there is probable cause. Search engines can be told not to favour products the parent company owns. And the EU AI Act is a good step on the way to ensuring that AI tools aren’t dangerous.

The technology is still a long, long way from an AI being smart enough to be at all evil and planet-destroying.

Below is an excellent run-through of some of the biggest AI doomers: what they say, and how their self-interest is served by doomerism.

AI Doomerism is becoming mainstream thanks to mass media, which drives our discussion about Generative AI from bad to worse, or from slightly insane to batshit crazy. Instead of out-of-control AI, we have out-of-control panic.

When a British tabloid headline screams, “Attack of the psycho chatbot,” it’s funny. When it’s followed by another front-page headline, “Psycho killer chatbots are befuddled by Wordle,” it’s even funnier. If this type of coverage stayed in the tabloids, which are known to be sensationalized, that would be fine.

But recently, prestige news outlets have decided to promote the same level of populist scaremongering: The New York Times published “If we don’t master AI, it will master us” (by Harari, Harris & Raskin), and TIME magazine published “Be willing to destroy a rogue datacenter by airstrike” (by Yudkowsky).

In just a few days, we went from “governments should force a 6-month pause” (the petition from the Future of Life Institute) to “wait, it’s not enough, so data centers should be bombed.” Sadly, this is the narrative that gets media attention and shapes our already hyperbolic AI discourse.

In order to understand the rise of AI Doomerism, here are some influential figures responsible for mainstreaming doomsday scenarios. This is not the full list of AI doomers, just the ones that recently shaped the AI panic cycle (so I‘m focusing on them).

AI Panic Marketing: Exhibit A: Sam Altman.

Sam Altman has a habit of urging us to be scared. “Although current-generation AI tools aren’t very scary, I think we are potentially not that far away from potentially scary ones,” he tweeted. “If you’re making AI, it is potentially very good, potentially very terrible,” he told the WSJ. When he shared the bad-case scenario of AI with Connie Loizos, it was “lights out for all of us.”

In an interview with Kara Swisher, Altman expressed how he is “super-nervous” about authoritarians using this technology. He elaborated in an ABC News interview: “A thing that I do worry about is … we’re not going to be the only creator of this technology. There will be other people who don’t put some of the safety limits that we put on it. I’m particularly worried that these models could be used for large-scale disinformation.” These models could also “be used for offensive cyberattacks.” So, “people should be happy that we are a little bit scared of this.” He repeated this message in his following interview with Lex Fridman: “I think it’d be crazy not to be a little bit afraid, and I empathize with people who are a lot afraid.”

Having shared this story in 2016, it shouldn’t come as a surprise: “My problem is that when my friends get drunk, they talk about the ways the world will END.” One of the “most popular scenarios would be A.I. that attacks us.” “I try not to think about it too much,” Altman continued. “But I have guns, gold, potassium iodide, antibiotics, batteries, water, gas masks from the Israeli Defense Force, and a big patch of land in Big Sur I can fly to.”

(Wouldn’t it be easier to just cut back on the drinking and substance abuse?).

Altman’s recent post “Planning for AGI and beyond” is as bombastic as it gets: “Successfully transitioning to a world with superintelligence is perhaps the most important – and hopeful, and scary – project in human history.”

It is at this point that you might ask yourself, “Why would someone frame his company like that?” Well, that’s a good question. The answer is that making OpenAI’s products “the most important and scary – in human history” is part of its marketing strategy. “The paranoia is the marketing.”

“AI doomsaying is absolutely everywhere right now,” described Brian Merchant in the LA Times. “Which is exactly the way that OpenAI, the company that stands to benefit the most from everyone believing its product has the power to remake – or unmake – the world, wants it.” Merchant explained Altman’s science fiction-infused marketing frenzy: “Scaring off customers isn’t a concern when what you’re selling is the fearsome power that your service promises.”

During the Techlash days in 2019, which focused on social media, Joseph Bernstein explained how the alarm over disinformation (e.g., “Cambridge Analytica was responsible for Brexit and Trump’s 2016 election”) actually “supports Facebook’s sales pitch”:

“What could be more appealing to an advertiser than a machine that can persuade anyone of anything?”

This can be applied here: The alarm over AI’s magic power (e.g., “replacing humans”) actually “supports OpenAI’s sales pitch”:

“What could be more appealing to future AI employees and investors than a machine that can become superintelligence?”

AI Panic as a Business. Exhibit A & B: Tristan Harris & Eliezer Yudkowsky.

Altman is at least using apocalyptic AI marketing for actual OpenAI products. The worst kind of doomers are those whose AI panic is their product, their main career, and their source of income. A prime example is the Effective Altruism institutes that claim to be the superior few who can save us from a hypothetical AGI apocalypse.

In March, Tristan Harris, Co-Founder of the Center for Humane Technology, invited leaders to a lecture on how AI could wipe out humanity. To begin his doomsday presentation, he stated: “What nukes are to the physical world … AI is to everything else.”

Steven Levy summarized that lecture at WIRED, saying, “We need to be thoughtful as we roll out AI. But hard to think clearly if it’s presented as the apocalypse.” Apparently, after completing the “Social Dilemma,” Tristan Harris is now working on the AI Dilemma. Oh boy. We can guess how it’s going to look (the “nobody criticized bicycles” guy will make a Frankenstein’s monster/Pandora’s box “documentary”).

In the “Social Dilemma,” he promoted the idea that “Two billion people will have thoughts that they didn’t intend to have” because of the designers’ decisions. But, as Lee Vinsel pointed out, Harris didn’t provide any evidence that social media designers actually CAN purposely force us to have unwanted thoughts.

Similarly, there’s no need for evidence now that AI is worse than nuclear power; simply thinking about this analogy makes it true (in Harris’ mind, at least). Did a social media designer force him to have this unwanted thought? (Just wondering).

To further escalate the AI panic, Tristan Harris published an OpEd in The New York Times with Yuval Noah Harari and Aza Raskin. Among their overdramatic claims: “We have summoned an alien intelligence,” “A.I. could rapidly eat the whole human culture,” and AI’s “godlike powers” will “master us.”

Another statement in this piece was, “Social media was the first contact between A.I. and humanity, and humanity lost.” I found it funny as it came from two men with hundreds of thousands of followers (@harari_yuval 540.4k, @tristanharris 192.6k), who use their social media megaphone … for fear-mongering. The irony is lost on them.

“This is what happens when you bring together two of the worst thinkers on new technologies,” added Lee Vinsel. “Among other shared tendencies, both bloviate free of empirical inquiry.”

This is where we should be jealous of AI doomers. Having no evidence and no nuance is extremely convenient (when your only goal is to attack an emerging technology).

Then came the famous “Open Letter.” This petition from the Future of Life Institute lacked a clear argument or a trade-off analysis. There were only rhetorical questions, like, should we develop imaginary “nonhuman minds that might eventually outnumber, outsmart, obsolete, and replace us?” They provided no evidence to support the claim that advanced LLMs pose an unprecedented existential risk. There were a lot of highly speculative assumptions. Yet, they demanded an immediate 6-month pause on training AI systems and argued that “If such a pause cannot be enacted quickly, governments should institute a moratorium.”

Please keep in mind that (1). A $10 million donation from Elon Musk launched the Future of Life Institute in 2015. Out of its total budget of 4 million euros for 2021, Musk Foundation contributed 3.5 million euros (the biggest donor by far). (2). Musk once said that “With artificial intelligence, we are summoning the demon.” (3). Due to this, the institute’s mission is to lobby against extinction, misaligned AI, and killer robots.

“The authors of the letter believe they are superior. Therefore, they have the right to call a stop, due to the fear that less intelligent humans will be badly influenced by AI,” responded Keith Teare (CEO SignalRank Corporation). “They are taking a paternalistic view of the entire human race, saying, ‘You can’t trust these people with this AI.’ It’s an elitist point of view.”

“It’s worth noting the letter overlooked that much of this work is already happening,” added Spencer Ante (Meta Foresight). “Leading providers of AI are taking AI safety and responsibility very seriously, developing risk-mitigation tools, best practices for responsible use, monitoring platforms for misuse, and learning from human feedback.”

Next, because he thought the open letter didn’t go far enough, Eliezer Yudkowsky took “PhobAI” too far. First, Yudkowsky asked us all to be afraid of made-up risks and an apocalyptic fantasy he has about “superhuman intelligence” “killing literally everyone” (or “kill everyone in the U.S. and in China and on Earth”). Then, he suggested that “preventing AI extinction scenarios is considered a priority above preventing a full nuclear exchange.” With Yudkowsky explicitly advocating violent solutions to AI, we have officially reached the height of hysteria.

“Rhetoric from AI doomers is not just ridiculous. It’s dangerous and unethical,” responded Yann LeCun (Chief AI Scientist, Meta). “AI doomism is quickly becoming indistinguishable from an apocalyptic religion. Complete with prophecies of imminent fire and brimstone caused by an omnipotent entity that doesn’t actually exist.”

“You stand a far greater chance of dying from lightning strikes, collisions with deer, peanut allergies, bee stings & ignition or melting of nightwear – than you do from AI,” Michael Shermer wrote to Yudkowsky. “Quit stoking irrational fears.”

The problem is that “irrational fears” sell. They are beneficial to the ones who spread them.

How to Spot an AI Doomer?

On April 2nd, Gary Marcus asked: “Confused about the terminology. If I doubt that robots will take over the world, but I am very concerned that a massive glut of authoritative-seeming misinformation will undermine democracy, do I count as a ‘doomer’?”

One of the answers was: “You’re a doomer as long as you bypass participating in the conversation and instead appeal to populist fearmongering and lobbying reactionary, fearful politicians with clickbait.”

Considering all of the above, I decided to define “AI doomer” and provide some criteria:


  • Makes up fake scenarios in which AI will wipe out humanity
  • Doesn’t bother to offer any evidence to back up those scenarios
  • Has watched/read too much sci-fi
  • Says that due to AI’s God-like power, it should be stopped
  • Claims only he (& a few “chosen ones”) can stop it
  • So, scared/hopeless people should support his endeavor ($)

Then, Adam Thierer added another characteristic:

  • Doomers tend to live in a tradeoff-free fantasy land.

Doomers have a general preference for very amorphous, top-down Precautionary Principle-based solutions, but they (1) rarely discuss how (or if) those schemes would actually work in practice, and (2) almost never discuss the trade-offs/costs their extreme approaches would impose on society/innovation.

Answering Gary Marcus’ question, I do not think he qualifies as a doomer. You need to meet all criteria (he does not). Meanwhile, Tristan Harris and Eliezer Yudkowsky meet all seven.

Are they ever going to stop this “Panic-as-a-Business”? If the apocalyptic catastrophe doesn’t occur, will the AI doomers ever admit they were wrong? I believe the answer is “No.”

Doomsday cultists don’t question their own predictions. But you should.

Dr. Nirit Weiss-Blatt (@DrTechlash) is the author of The Techlash and Tech Crisis Communication

Source: The AI Doomers’ Playbook | Techdirt

 

Need To Pick Objects Out Of Images? Segment Anything Does Exactly That

Segment Anything, recently released by Facebook Research, does something that most people who have dabbled in computer vision have found daunting: reliably figuring out which pixels in an image belong to an object. Making that easier is the goal of the Segment Anything Model (SAM), just released under the Apache 2.0 license.

The online demo has a bank of examples, but also works with uploaded images.

The results look fantastic, and there’s an interactive demo available where you can play with the different ways SAM works. One can pick out objects by pointing and clicking on an image, or images can be automatically segmented. It’s frankly very impressive to see SAM make masking out the different objects in an image look so effortless. What makes this possible is machine learning, and part of that is the fact that the model behind the system has been trained on a huge dataset of high-quality images and masks, making it very effective at what it does.

Once an image is segmented, those masks can be used to interface with other systems like object detection (which identifies and labels what an object is) and other computer vision applications. Such systems work more robustly if they already know where to look, after all. This blog post from Meta AI goes into some additional detail about what’s possible with SAM, and fuller details are in the research paper.
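To make the mask idea concrete: a segmentation mask is just a boolean array marking an object’s pixels, and downstream systems can derive a region of interest from it. The sketch below is a minimal illustration with plain NumPy and an invented toy image, not SAM itself; SAM outputs masks of this shape, but the thresholding here is purely for demonstration.

```python
import numpy as np

# Toy 6x6 grayscale "image" with a bright 2x3 object on a dark background.
image = np.zeros((6, 6), dtype=np.uint8)
image[2:4, 1:4] = 200

# A segmentation mask is a boolean array marking the object's pixels.
# SAM produces masks like this (one per object); here we fake one with
# a simple brightness threshold just to have something to work with.
mask = image > 128

# Pick the object's pixels out of the image with boolean indexing.
object_pixels = image[mask]

# Derive a bounding box (row_min, row_max, col_min, col_max) from the mask -
# the kind of region a downstream object detector could be pointed at.
rows, cols = np.nonzero(mask)
bbox = (int(rows.min()), int(rows.max()), int(cols.min()), int(cols.max()))

print(int(mask.sum()))  # number of object pixels: 6
print(bbox)             # (2, 3, 1, 3)
```

This is the sense in which a mask “interfaces” with other systems: the mask itself selects the pixels, and trivially derived shapes like the bounding box tell a detector where to look.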

Systems like this rely on quality datasets. Of course, nothing beats a great collection of real-world data but we’ve also seen that it’s possible to machine-generate data that never actually existed, and get useful results.

Source: Need To Pick Objects Out Of Images? Segment Anything Does Exactly That | Hackaday

Italy finds a decently good way out of a really stupid ban: Demands OpenAI Allow ChatGPT User Corrections After Ban

In a news announcement on Wednesday, the Italian Data Protection Authority, known as the Garante, stressed that OpenAI needed to be more transparent about its data collection processes and inform users about their data rights with regard to the generative AI. These rights include allowing users and non-users of ChatGPT to object to having their data processed by OpenAI and letting them correct false or inaccurate information about them generated by ChatGPT, similar to rights related to other technologies guaranteed by Europe’s General Data Protection Regulation, or GDPR, laws.

Other measures required by the Garante include a public notice on OpenAI’s website “describing the arrangements and logic of the data processing required for the operation of ChatGPT along with the rights afforded to data subjects.” The regulator will also require OpenAI to immediately implement an age gating system for ChatGPT and submit a plan to implement an age verification system by May 31.

The Italian regulator said OpenAI had until April 30 to implement the measures it’s asking for.

[…]

Source: Italy Demands OpenAI Allow ChatGPT User Corrections After Ban

Allowing users to make corrections is in principle a Good Idea, but then you get Wikipedia-style battles over who is the arbiter of truth. Of course, no system will ever be 100% truthful or accurate, so banning it on those grounds is just stupid. No age gate works either, and neither did the ban: people can circumvent both very easily. So Italy needs some sort of concession to get out of the hole it’s dug itself, and this is at least a promising start.