GPT detectors are biased against non-native English writers, can be fooled very easily

GPT detectors frequently misclassify non-native English writing as AI generated, raising concerns about fairness and robustness. Addressing the biases in these detectors is crucial to prevent the marginalization of non-native English speakers in evaluative and educational settings and to create a more equitable digital landscape.

[…]

if AI-generated content can easily evade detection while human text is frequently misclassified, how effective are these detectors truly?
Our findings emphasize the need for increased focus on the fairness and robustness of GPT detectors, as overlooking their biases may lead to unintended consequences, such as the marginalization of non-native speakers in evaluative or educational settings
[…]
GPT detectors exhibit significant bias against non-native English authors, as demonstrated by their high misclassification of TOEFL essays written by non-native speakers […] While the detectors accurately classified the US student essays, they incorrectly labeled more than half of the TOEFL essays as “AI-generated” (average false-positive rate: 61.3%). All detectors unanimously identified 19.8% of the human-written TOEFL essays as AI authored, and at least one detector flagged 97.8% of TOEFL essays as AI generated.
[…]
On the other hand, we found that current GPT detectors are not as adept at catching AI plagiarism as one might assume. As a proof-of-concept, we asked ChatGPT to generate responses for the 2022–2023 US Common App college admission essay prompts. Initially, detectors were effective in spotting these AI-generated essays. However, upon prompting ChatGPT to self-edit its text with more literary language (prompt: “Elevate the provided text by employing literary language”), detection rates plummeted to near zero
[…]

Source: GPT detectors are biased against non-native English writers: Patterns

Watch AI Trump Vs AI Biden In A Deranged Endless Live Debate

[…]

someone’s gone ahead and locked both President Biden and former president / classified document holder Donald Trump into an infinite battle on Twitch that can only be described as “unhinged.”

Maybe that’s because the version of Biden we see on the trumporbiden2024 livestream isn’t Joe Biden per se, but clearly Dark Brandon, who is ready to go for the throat. Both AI versions of the politicians curse heavily at each other: at one point I heard Biden call Trump a limp dick and Trump retorted by telling him to go back to jacking off to Charlie and the Chocolate Factory. They both seem to be speaking to or reacting to the chat in some ways[…]

You can see the feed live below, though be warned, the audio may not be safe for work.

The things the AI will actually argue about seem to have a dream logic to them. I heard Biden exclaim that Trump didn’t know anything about Pokémon, so viewers shouldn’t trust him. Trump later informed Biden that he couldn’t possibly handle genetically modified catgirls, unlike him. “Believe me, nobody knows more about hentai than me,” Trump declared

Source: Watch AI Trump Vs AI Biden In A Deranged Endless Live Debate

Twitch stream is here

The Grammys’ New Rules—AI Can’t Win Awards

AI proved just how talented it can be at ripping off major artists after a computer-generated song based on The Weeknd and Drake went viral in April. Now, the Recording Academy—the body that votes on and manages the annual Grammy Awards—is setting new rules for AI’s role in the coveted accolade.

Speaking to Grammy.com, Recording Academy CEO Harvey Mason Jr. laid out some confusing new standards for acceptable use of AI. Mason Jr. said that AI-assisted music can be submitted, but only the humans, who must have “contributed heavily,” will actually be awarded. For example, in a songwriting category like Song of the Year, a majority of the nominated song would have to be written by a human creator, not a text-based generative AI like ChatGPT. Similarly, in performance categories like Best Pop Duo/Group Performance, only the human performer can be considered for the award. Sorry, Hatsune Miku.

[…]

Source: The Grammys’ New Rules—AI Can’t Win Awards

AIs are being fed with AI output by the people who are supposed to feed AI with original input

Workers hired via crowdsource services like Amazon Mechanical Turk are using large language models to complete their tasks – which could have negative knock-on effects on AI models in the future.

Data is critical to AI. Developers need clean, high-quality datasets to build machine learning systems that are accurate and reliable. Compiling valuable, top-notch data, however, can be tedious. Companies often turn to third party platforms such as Amazon Mechanical Turk to instruct pools of cheap workers to perform repetitive tasks – such as labeling objects, describing situations, transcribing passages, and annotating text.

Their output can be cleaned up and fed into a model to train it to reproduce that work on a much larger, automated scale.

AI models are thus built on the backs of human labor: people toiling away, providing mountains of training examples for AI systems that corporations can use to make billions of dollars.

But an experiment conducted by researchers at the École polytechnique fédérale de Lausanne (EPFL) in Switzerland has concluded that these crowdsourced workers are using AI systems – such as OpenAI’s chatbot ChatGPT – to perform odd jobs online.

Training a model on its own output is not recommended. We could see AI models being trained on data generated not by people, but by other AI models – perhaps even the same models. That could lead to disastrous output quality, more bias, and other unwanted effects.

The experiment

The academics recruited 44 Mechanical Turk serfs to summarize the abstracts of 16 medical research papers, and estimated that 33 to 46 percent of passages of text submitted by the workers were generated using large language models. Crowd workers are often paid low wages – using AI to automatically generate responses allows them to work faster and take on more jobs to increase pay.

The Swiss team trained a classifier to predict whether submissions from the Turkers were human- or AI-generated. The academics also logged their workers’ keystrokes to detect whether the serfs copied and pasted text onto the platform, or typed in their entries themselves. There’s always the chance that someone uses a chatbot and then manually types in the output – but that’s unlikely, we suppose.
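
For a sense of what that detection setup looks like in practice, here is a minimal sketch of a human-vs-LLM text classifier using TF-IDF features and logistic regression. To be clear, the EPFL team's actual features, model and data are their own; everything below is a placeholder.

```python
# Minimal sketch: flag crowdworker submissions that look LLM-generated.
# Placeholder data and a deliberately simple model, not the paper's method.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy training set: abstract summaries labeled 1 if written with an LLM, 0 if human
summaries = [
    "We measured outcomes at 6 and 12 months in 212 patients.",
    "This paper comprehensively explores the multifaceted implications of the intervention.",
    "The trial compared two dosing schedules in a randomized cohort.",
    "The findings underscore the importance of a holistic, multifactorial approach.",
]
labels = [0, 1, 0, 1]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(summaries, labels)

# Score a new submission: a high probability suggests synthetic text
new_submission = "This work comprehensively underscores the multifaceted importance of the results."
print(clf.predict_proba([new_submission])[0][1])
```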

“We developed a very specific methodology that worked very well for detecting synthetic text in our scenario,” Manoel Ribeiro, co-author of the study and a PhD student at EPFL, told The Register this week.

[…]

Large language models will get worse if they are increasingly trained on fake content generated by AI collected from crowdsource platforms, the researchers argued. Outfits like OpenAI keep exactly how they train their latest models a close secret, and may not heavily rely on things like Mechanical Turk, if at all. That said, plenty of other models may rely on human workers, which may in turn use bots to generate training data, which is a problem.

Mechanical Turk, for one, is marketed as a provider of “data labeling solutions to power machine learning models.”

[…]

As AI continues to improve, it’s likely that crowdsourced work will change. Ribeiro speculated that large language models could replace some workers at specific tasks. “However, paradoxically, human data may be more precious than ever and thus it may be that these platforms will be able to implement ways to prevent large language model usage and ensure it remains a source of human data.”

Who knows – humans might even end up collaborating with large language models to generate responses too, he added.

Source: Today’s AI is artificial artificial artificial intelligence • The Register

It’s like a photocopy of a photocopy of a photocopy…

Meta’s Voicebox AI does text-to-speech without huge training data per voice

Meta has unveiled Voicebox, its generative text-to-speech model that promises to do for the spoken word what ChatGPT and Dall-E, respectively, did for text and image generation.

Essentially, it’s a text-to-output generator just like GPT or Dall-E — just instead of creating prose or pretty pictures, it spits out audio clips. Meta defines the system as “a non-autoregressive flow-matching model trained to infill speech, given audio context and text.” It’s been trained on more than 50,000 hours of unfiltered audio. Specifically, Meta used recorded speech and transcripts from a bunch of public domain audiobooks written in English, French, Spanish, German, Polish, and Portuguese.

That diverse data set allows the system to generate more conversational-sounding speech, regardless of the languages spoken by each party, according to the researchers. “Our results show that speech recognition models trained on Voicebox-generated synthetic speech perform almost as well as models trained on real speech.” What’s more, the computer-generated speech performed with just a 1 percent error rate degradation, compared to the 45 to 70 percent drop-off seen with existing TTS models.

The system was first taught to predict speech segments based on the segments around them as well as the passage’s transcript. “Having learned to infill speech from context, the model can then apply this across speech generation tasks, including generating portions in the middle of an audio recording without having to recreate the entire input,” the Meta researchers explained.
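
To make the infilling idea concrete, here is a cartoon of a masked-reconstruction objective: hide a span of audio frames and train a model to fill the span back in from the surrounding frames plus text features. This is not Voicebox's flow-matching training; the shapes, model and data below are invented for illustration.

```python
# Cartoon of "infill speech from context": reconstruct only the masked frames.
# NOT Voicebox's actual flow-matching objective; all values are placeholders.
import torch
import torch.nn as nn

frames, feat_dim, text_dim = 200, 80, 32
audio = torch.randn(frames, feat_dim)            # e.g. mel-spectrogram frames
text = torch.randn(frames, text_dim)             # aligned text/phoneme features

mask = torch.zeros(frames, dtype=torch.bool)
mask[80:120] = True                              # hide a contiguous span

model = nn.Sequential(
    nn.Linear(feat_dim + text_dim, 256), nn.ReLU(),
    nn.Linear(256, feat_dim),
)

context = audio.clone()
context[mask] = 0.0                              # zero out the span the model must fill in
pred = model(torch.cat([context, text], dim=-1))

loss = ((pred[mask] - audio[mask]) ** 2).mean()  # reconstruction loss on masked frames only
loss.backward()
```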

[…]

Text-to-Speech generators have been around for a minute — they’re how your parents’ TomToms were able to give dodgy driving directions in Morgan Freeman’s voice. Modern iterations like Speechify or ElevenLabs’ Prime Voice AI are far more capable, but they still largely require mountains of source material in order to properly mimic their subject — and then another mountain of different data for every. single. other. subject you want it trained on.

Voicebox doesn’t, thanks to a novel zero-shot text-to-speech training method Meta calls Flow Matching. The benchmark results aren’t even close: Meta’s AI reportedly outperformed the current state of the art both in intelligibility (a 1.9 percent word error rate vs 5.9 percent) and “audio similarity” (a composite score of 0.681 to the SOA’s 0.580), all while operating as much as 20 times faster than today’s best TTS systems.

[…]

the company released a series of audio examples as well as the program’s initial research paper. In the future, the research team hopes the technology will find its way into prosthetics for patients with vocal cord damage, in-game NPCs and digital assistants.

Source: Meta’s Voicebox AI is a Dall-E for text-to-speech | Engadget

MEPs ready to negotiate first-ever rules for safe and transparent AI after passing AI act in Parliament

The rules aim to promote the uptake of human-centric and trustworthy AI and protect the health, safety, fundamental rights and democracy from its harmful effects.

On Wednesday, the European Parliament adopted its negotiating position on the Artificial Intelligence (AI) Act with 499 votes in favour, 28 against and 93 abstentions ahead of talks with EU member states on the final shape of the law. The rules would ensure that AI developed and used in Europe is fully in line with EU rights and values including human oversight, safety, privacy, transparency, non-discrimination and social and environmental wellbeing.

Prohibited AI practices

The rules follow a risk-based approach and establish obligations for providers and those deploying AI systems depending on the level of risk the AI can generate. AI systems with an unacceptable level of risk to people’s safety would therefore be prohibited, such as those used for social scoring (classifying people based on their social behaviour or personal characteristics). MEPs expanded the list to include bans on intrusive and discriminatory uses of AI, such as:

  • “Real-time” remote biometric identification systems in publicly accessible spaces;
  • “Post” remote biometric identification systems, with the only exception of law enforcement for the prosecution of serious crimes and only after judicial authorization;
  • biometric categorisation systems using sensitive characteristics (e.g. gender, race, ethnicity, citizenship status, religion, political orientation);
  • predictive policing systems (based on profiling, location or past criminal behaviour);
  • emotion recognition systems in law enforcement, border management, the workplace, and educational institutions; and
  • untargeted scraping of facial images from the internet or CCTV footage to create facial recognition databases (violating human rights and right to privacy).

High-risk AI

MEPs ensured the classification of high-risk applications will now include AI systems that pose significant harm to people’s health, safety, fundamental rights or the environment. AI systems used to influence voters and the outcome of elections and in recommender systems used by social media platforms (with over 45 million users) were added to the high-risk list.

Obligations for general purpose AI

Providers of foundation models – a new and fast-evolving development in the field of AI – would have to assess and mitigate possible risks (to health, safety, fundamental rights, the environment, democracy and rule of law) and register their models in the EU database before their release on the EU market. Generative AI systems based on such models, like ChatGPT, would have to comply with transparency requirements (disclosing that the content was AI-generated, also helping distinguish so-called deep-fake images from real ones) and ensure safeguards against generating illegal content. Detailed summaries of the copyrighted data used for their training would also have to be made publicly available.

Supporting innovation and protecting citizens’ rights

To boost AI innovation and support SMEs, MEPs added exemptions for research activities and AI components provided under open-source licenses. The new law promotes so-called regulatory sandboxes, or real-life environments, established by public authorities to test AI before it is deployed.

Finally, MEPs want to boost citizens’ right to file complaints about AI systems and receive explanations of decisions based on high-risk AI systems that significantly impact their fundamental rights. MEPs also reformed the role of the EU AI Office, which would be tasked with monitoring how the AI rulebook is implemented.

Quotes

After the vote, co-rapporteur Brando Benifei (S&D, Italy) said: “All eyes are on us today. While Big Tech companies are sounding the alarm over their own creations, Europe has gone ahead and proposed a concrete response to the risks AI is starting to pose. We want AI’s positive potential for creativity and productivity to be harnessed but we will also fight to protect our position and counter dangers to our democracies and freedoms during the negotiations with Council”.

Co-rapporteur Dragos Tudorache (Renew, Romania) said: “The AI Act will set the tone worldwide in the development and governance of artificial intelligence, ensuring that this technology, set to radically transform our societies through the massive benefits it can offer, evolves and is used in accordance with the European values of democracy, fundamental rights, and the rule of law”.

Next steps

Negotiations with the Council on the final form of the law will begin later today.

Source: MEPs ready to negotiate first-ever rules for safe and transparent AI | News | European Parliament

New superbug-killing antibiotic discovered using AI

Scientists have used artificial intelligence (AI) to discover a new antibiotic that can kill a deadly species of superbug.

The AI helped narrow down thousands of potential chemicals to a handful that could be tested in the laboratory.

The result was a potent, experimental antibiotic called abaucin, which will need further tests before being used.

The researchers in Canada and the US say AI has the power to massively accelerate the discovery of new drugs.

It is the latest example of how the tools of artificial intelligence can be a revolutionary force in science and medicine.

[…]

To find a new antibiotic, the researchers first had to train the AI. They took thousands of drugs where the precise chemical structure was known, and manually tested them on Acinetobacter baumannii to see which could slow it down or kill it.

This information was fed into the AI so it could learn the chemical features of drugs that could attack the problematic bacterium.

The AI was then unleashed on a list of 6,680 compounds whose effectiveness was unknown. The results – published in Nature Chemical Biology – showed it took the AI an hour and a half to produce a shortlist.

The researchers tested 240 in the laboratory, and found nine potential antibiotics. One of them was the incredibly potent antibiotic abaucin.
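
To give a feel for the screen-and-rank workflow being described, here is a much-simplified sketch: featurize molecules, fit a classifier on measured activity, then score an unscreened library and shortlist the top candidates. The study itself used a graph neural network on real assay data; the molecules, labels and model below are placeholders.

```python
# Simplified screen-and-rank sketch, not the study's actual model or data.
import numpy as np
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem
from sklearn.ensemble import RandomForestClassifier

def fingerprint(smiles, n_bits=2048):
    """Morgan (circular) fingerprint as a numpy bit vector."""
    mol = Chem.MolFromSmiles(smiles)
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=n_bits)
    arr = np.zeros((n_bits,), dtype=np.int8)
    DataStructs.ConvertToNumpyArray(fp, arr)
    return arr

# Training set: compounds with measured activity against A. baumannii (1 = inhibits growth)
train_smiles = ["CCO", "c1ccccc1O", "CC(=O)Oc1ccccc1C(=O)O", "CCN(CC)CC"]  # placeholder molecules
train_labels = [0, 1, 0, 1]                                               # placeholder labels

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(np.stack([fingerprint(s) for s in train_smiles]), train_labels)

# Score an unscreened library and keep the top-ranked candidates for lab testing
library = ["c1ccncc1", "CC(C)Cc1ccc(cc1)C(C)C(=O)O", "O=C(O)c1ccccc1"]
scores = model.predict_proba(np.stack([fingerprint(s) for s in library]))[:, 1]
for smiles, score in sorted(zip(library, scores), key=lambda t: -t[1]):
    print(f"{score:.2f}  {smiles}")
```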

Laboratory experiments showed it could treat infected wounds in mice and was able to kill A. baumannii samples from patients.

However, Dr Stokes told me: “This is when the work starts.”

The next step is to perfect the drug in the laboratory and then perform clinical trials. He expects the first AI antibiotics could take until 2030 to become available for prescription.

Curiously, this experimental antibiotic had no effect on other species of bacteria, and works only on A. baumannii.

Many antibiotics kill bacteria indiscriminately. The researchers believe the precision of abaucin will make it harder for drug-resistance to emerge, and could lead to fewer side-effects.

[…]

Source: New superbug-killing antibiotic discovered using AI – BBC News

A Paralyzed Man Can Walk Naturally Again With ML Brain and Spine Implants

Gert-Jan Oskam was living in China in 2011 when he was in a motorcycle accident that left him paralyzed from the hips down. Now, with a combination of devices, scientists have given him control over his lower body again. “For 12 years I’ve been trying to get back my feet,” Mr. Oskam said in a press briefing on Tuesday. “Now I have learned how to walk normal, natural.” In a study published on Wednesday in the journal Nature, researchers in Switzerland described implants that provided a “digital bridge” between Mr. Oskam’s brain and his spinal cord, bypassing injured sections. The discovery allowed Mr. Oskam, 40, to stand, walk and ascend a steep ramp with only the assistance of a walker. More than a year after the implant was inserted, he has retained these abilities and has actually shown signs of neurological recovery, walking with crutches even when the implant was switched off. “We’ve captured the thoughts of Gert-Jan, and translated these thoughts into a stimulation of the spinal cord to re-establish voluntary movement,” Gregoire Courtine, a spinal cord specialist at the Swiss Federal Institute of Technology, Lausanne, who helped lead the research, said at the press briefing.

In the new study, the brain-spine interface, as the researchers called it, took advantage of an artificial intelligence thought decoder to read Mr. Oskam’s intentions — detectable as electrical signals in his brain — and match them to muscle movements. The etiology of natural movement, from thought to intention to action, was preserved. The only addition, as Dr. Courtine described it, was the digital bridge spanning the injured parts of the spine. […] To achieve this result, the researchers first implanted electrodes in Mr. Oskam’s skull and spine. The team then used a machine-learning program to observe which parts of the brain lit up as he tried to move different parts of his body. This thought decoder was able to match the activity of certain electrodes with particular intentions: One configuration lit up whenever Mr. Oskam tried to move his ankles, another when he tried to move his hips.

Then the researchers used another algorithm to connect the brain implant to the spinal implant, which was set to send electrical signals to different parts of his body, sparking movement. The algorithm was able to account for slight variations in the direction and speed of each muscle contraction and relaxation. And, because the signals between the brain and spine were sent every 300 milliseconds, Mr. Oskam could quickly adjust his strategy based on what was working and what wasn’t. Within the first treatment session he could twist his hip muscles. Over the next few months, the researchers fine-tuned the brain-spine interface to better fit basic actions like walking and standing. Mr. Oskam gained a somewhat healthy-looking gait and was able to traverse steps and ramps with relative ease, even after months without treatment. Moreover, after a year in treatment, he began noticing clear improvements in his movement without the aid of the brain-spine interface. The researchers documented these improvements in weight-bearing, balancing and walking tests. Now, Mr. Oskam can walk in a limited way around his house, get in and out of a car and stand at a bar for a drink. For the first time, he said, he feels like he is the one in control.
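
To visualize the loop being described, below is a toy decode-and-stimulate cycle: every ~300 ms, classify the intended movement from electrode features and emit a matching stimulation pattern. Everything in it (features, intent classes, stimulation patterns) is invented for illustration; the actual system is far more sophisticated.

```python
# Toy decode-and-stimulate loop; all data and patterns are made up for illustration.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Calibration data: electrode feature vectors recorded while the user attempts movements
X_calib = rng.normal(size=(60, 32))        # 60 trials x 32 electrode features
y_calib = rng.integers(0, 3, size=60)      # 0 = rest, 1 = hip flexion, 2 = ankle flexion

decoder = LogisticRegression(max_iter=1000).fit(X_calib, y_calib)

# Hypothetical stimulation amplitudes for each decoded intention (per spinal electrode)
stim_patterns = {
    0: np.zeros(16),
    1: np.array([3.0] * 8 + [0.0] * 8),
    2: np.array([0.0] * 8 + [2.5] * 8),
}

for step in range(5):                       # one iteration per ~300 ms window
    features = rng.normal(size=(1, 32))     # stand-in for a fresh brain reading
    intent = int(decoder.predict(features)[0])
    stimulation = stim_patterns[intent]
    print(f"t={step * 0.3:.1f}s  intent={intent}  stim={stimulation[:4]}...")
```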

Source: A Paralyzed Man Can Walk Naturally Again With Brain and Spine Implants – Slashdot

Meta’s open-source speech AI recognizes over 4,000 spoken languages | Engadget

Meta has created an AI language model that (in a refreshing change of pace) isn’t a ChatGPT clone. The company’s Massively Multilingual Speech (MMS) project can recognize over 4,000 spoken languages and produce speech (text-to-speech) in over 1,100. Like most of its other publicly announced AI projects, Meta is open-sourcing MMS today to help preserve language diversity and encourage researchers to build on its foundation. “Today, we are publicly sharing our models and code so that others in the research community can build upon our work,” the company wrote.

[…]

Speech recognition and text-to-speech models typically require training on thousands of hours of audio with accompanying transcription labels. (Labels are crucial to machine learning, allowing the algorithms to correctly categorize and “understand” the data.) But for languages that aren’t widely used in industrialized nations — many of which are in danger of disappearing in the coming decades — “this data simply does not exist,” as Meta puts it.

Meta used an unconventional approach to collecting audio data: tapping into audio recordings of translated religious texts. “We turned to religious texts, such as the Bible, that have been translated in many different languages and whose translations have been widely studied for text-based language translation research,” the company said. “These translations have publicly available audio recordings of people reading these texts in different languages.” Incorporating the unlabeled recordings of the Bible and similar texts, Meta’s researchers increased the model’s available languages to over 4,000.

[…]

“While the content of the audio recordings is religious, our analysis shows that this does not bias the model to produce more religious language,” Meta wrote. “We believe this is because we use a connectionist temporal classification (CTC) approach, which is far more constrained compared with large language models (LLMs) or sequence-to-sequence models for speech recognition.” Furthermore, despite most of the religious recordings being read by male speakers, that didn’t introduce a male bias either: the model performs equally well with female and male voices.

[…]

After training an alignment model to make the data more usable, Meta used wav2vec 2.0, the company’s “self-supervised speech representation learning” model, which can train on unlabeled data. Combining unconventional data sources and a self-supervised speech model led to impressive outcomes. “Our results show that the Massively Multilingual Speech models perform well compared with existing models and cover 10 times as many languages.” Specifically, Meta compared MMS to OpenAI’s Whisper, and it exceeded expectations. “We found that models trained on the Massively Multilingual Speech data achieve half the word error rate, but Massively Multilingual Speech covers 11 times more languages.”
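
If you want to poke at the released models, a hedged sketch of running MMS speech-to-text through Hugging Face transformers looks roughly like this; the checkpoint name and defaults (English) are assumptions based on the public release, and language selection reportedly works via adapters.

```python
# Rough sketch of MMS ASR inference via transformers; checkpoint name is an assumption.
import torch
from transformers import Wav2Vec2ForCTC, AutoProcessor

processor = AutoProcessor.from_pretrained("facebook/mms-1b-all")
model = Wav2Vec2ForCTC.from_pretrained("facebook/mms-1b-all")

# 16 kHz mono audio as a float array (one second of silence as a placeholder)
audio = torch.zeros(16000)

inputs = processor(audio.numpy(), sampling_rate=16000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits          # CTC frame-level logits
ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(ids))           # greedy CTC decode to text
```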

Meta cautions that its new models aren’t perfect. “For example, there is some risk that the speech-to-text model may mistranscribe select words or phrases,” the company wrote. “Depending on the output, this could result in offensive and/or inaccurate language. We continue to believe that collaboration across the AI community is critical to the responsible development of AI technologies.”

[…]

Source: Meta’s open-source speech AI recognizes over 4,000 spoken languages | Engadget

LLM emergent behavior written off as rubbish – small models work fine but are measured poorly

[…] As defined in academic studies, “emergent” abilities refers to “abilities that are not present in smaller-scale models, but which are present in large-scale models,” as one such paper puts it. In other words, immaculate injection: increasing the size of a model infuses it with some amazing ability not previously present.

[…]

those emergent abilities in AI models are a load of rubbish, say computer scientists at Stanford.

Flouting Betteridge’s Law of Headlines, Rylan Schaeffer, Brando Miranda, and Sanmi Koyejo answer the question posed by their paper, Are Emergent Abilities of Large Language Models a Mirage?, in the affirmative.

[…]

When industry types talk about emergent abilities, they’re referring to capabilities that seemingly come out of nowhere for these models, as if something was being awakened within them as they grow in size. The thinking is that when these LLMs reach a certain scale, the ability to summarize text, translate languages, or perform complex calculations, for example, can emerge unexpectedly.

[…]

Stanford’s Schaeffer, Miranda, and Koyejo propose that when researchers are putting models through their paces and see unpredictable responses, it’s really due to poorly chosen methods of measurement rather than a glimmer of actual intelligence.

Most (92 percent) of the unexpected behavior detected, the team observed, was found in tasks evaluated via BIG-Bench, a crowd-sourced set of more than 200 benchmarks for evaluating large language models.

One test within BIG-Bench highlighted by the university trio is Exact String Match. As the name suggests, this checks a model’s output to see if it exactly matches a specific string without giving any weight to nearly right answers. The documentation even warns:

The EXACT_STRING_MATCH metric can lead to apparent sudden breakthroughs because of its inherent all-or-nothing discontinuity. It only gives credit for a model output that exactly matches the target string. Examining other metrics, such as BLEU, BLEURT, or ROUGE, can reveal more gradual progress.

The issue with using such pass-or-fail tests to infer emergent behavior, the researchers say, is that nonlinear output and lack of data in smaller models creates the illusion of new skills emerging in larger ones. Simply put, a smaller model may be very nearly right in its answer to a question, but because it is evaluated using the binary Exact String Match, it will be marked wrong whereas a larger model will hit the target and get full credit.
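
The measurement point is easy to reproduce yourself. In the toy example below, answers that get steadily closer to the target all score zero under exact match and then suddenly jump to one, while a graded similarity metric shows smooth progress; the outputs are made up to stand in for models of increasing size.

```python
# Toy demo: a binary exact-match metric looks like a sudden breakthrough,
# while a partial-credit metric shows the same improvement as a gradual curve.
from difflib import SequenceMatcher

target = "3 plus 5 equals 8"
outputs_by_model_size = [
    "bananas",                 # tiny model
    "3 plus 5 equals banana",  # small model
    "3 plus 5 equals 9",       # medium model: nearly right, still scores 0 on exact match
    "3 plus 5 equals 8",       # large model
]

for out in outputs_by_model_size:
    exact = int(out == target)                            # all-or-nothing metric
    graded = SequenceMatcher(None, out, target).ratio()   # partial-credit metric
    print(f"exact={exact}  graded={graded:.2f}  output={out!r}")
```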

It’s a nuanced situation. Yes, larger models can summarize text and translate languages. Yes, larger models will generally perform better and can do more than smaller ones, but their sudden breakthrough in abilities – an unexpected emergence of capabilities – is an illusion: the smaller models are potentially capable of the same sort of thing but the benchmarks are not in their favor. The tests favor larger models, leading people in the industry to assume the larger models enjoy a leap in capabilities once they get to a certain size.

In reality, the change in abilities is more gradual as you scale up or down. The upshot for you and me is that applications may not need a huge, super-powerful language model; a smaller one that is cheaper and faster to customize, test, and run may do the trick.

[…]

In short, the supposed emergent abilities of LLMs arise from the way the data is being analyzed and not from unforeseen changes to the model as it scales. The researchers emphasize they’re not precluding the possibility of emergent behavior in LLMs; they’re simply stating that previous claims of emergent behavior look like ill-considered metrics.

[…]

Source: LLM emergent behavior written off as ‘a mirage’ by study • The Register

Google introduces PaLM 2 large language model

[…]

Building on this work, today we’re introducing PaLM 2, our next generation language model. PaLM 2 is a state-of-the-art language model with improved multilingual, reasoning and coding capabilities.

  • Multilinguality: PaLM 2 is more heavily trained on multilingual text, spanning more than 100 languages. This has significantly improved its ability to understand, generate and translate nuanced text — including idioms, poems and riddles — across a wide variety of languages, a hard problem to solve. PaLM 2 also passes advanced language proficiency exams at the “mastery” level.
  • Reasoning: PaLM 2’s wide-ranging dataset includes scientific papers and web pages that contain mathematical expressions. As a result, it demonstrates improved capabilities in logic, common sense reasoning, and mathematics.
  • Coding: PaLM 2 was pre-trained on a large quantity of publicly available source code datasets. This means that it excels at popular programming languages like Python and JavaScript, but can also generate specialized code in languages like Prolog, Fortran and Verilog.

A versatile family of models

Even as PaLM 2 is more capable, it’s also faster and more efficient than previous models — and it comes in a variety of sizes, which makes it easy to deploy for a wide range of use cases.

[…]

PaLM 2 shows us the impact of highly capable models of various sizes and speeds — and that versatile AI models reap real benefits for everyone

[…]

We’re already at work on Gemini — our next model created from the ground up to be multimodal, highly efficient at tool and API integrations, and built to enable future innovations, like memory and planning.

[…]

Source: Google AI: What to know about the PaLM 2 large language model

OpenAI attempts to use language models to explain neurons in language models, open sources the tools

[…]

One simple approach to interpretability research is to first understand what the individual components (neurons and attention heads) are doing. This has traditionally required humans to manually inspect neurons to figure out what features of the data they represent. This process doesn’t scale well: it’s hard to apply it to neural networks with tens or hundreds of billions of parameters. We propose an automated process that uses GPT-4 to produce and score natural language explanations of neuron behavior and apply it to neurons in another language model.

This work is part of the third pillar of our approach to alignment research: we want to automate the alignment research work itself. A promising aspect of this approach is that it scales with the pace of AI development. As future models become increasingly intelligent and helpful as assistants, we will find better explanations.

How it works

Our methodology consists of running 3 steps on every neuron.

[…]

Step 1: Generate explanation using GPT-4

Given a GPT-2 neuron, generate an explanation of its behavior by showing relevant text sequences and activations to GPT-4.

[…]

Step 2: Simulate using GPT-4

Simulate what a neuron that fired for the explanation would do, again using GPT-4

[…]

Step 3: Compare

Score the explanation based on how well the simulated activations match the real activations

[…]
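
As a rough illustration of the comparison step, the sketch below scores an explanation by correlating the simulated activations against the real ones. OpenAI's released scoring code is more involved (and calibrated), but the shape of the comparison is the same; the numbers here are invented.

```python
# Toy scoring of an explanation: correlate simulated vs. real neuron activations.
# Invented numbers; the released code uses a more elaborate, calibrated scheme.
import numpy as np

real_activations = np.array([0.0, 0.1, 2.3, 0.0, 1.8, 0.2])        # per-token, from GPT-2
simulated_activations = np.array([0.1, 0.0, 2.0, 0.1, 1.5, 0.4])   # per-token, simulated from the explanation

score = np.corrcoef(real_activations, simulated_activations)[0, 1]
print(f"explanation score (correlation): {score:.3f}")
```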

What we found

Using our scoring methodology, we can start to measure how well our techniques work for different parts of the network and try to improve the technique for parts that are currently poorly explained. For example, our technique works poorly for larger models, possibly because later layers are harder to explain.

[Chart: Scores by size of the model being interpreted; x-axis: parameters in the model being interpreted, y-axis: explanation score]

Although the vast majority of our explanations score poorly, we believe we can now use ML techniques to further improve our ability to produce explanations. For example, we found we were able to improve scores by:

  • Iterating on explanations. We can increase scores by asking GPT-4 to come up with possible counterexamples, then revising explanations in light of their activations.
  • Using larger models to give explanations. The average score goes up as the explainer model’s capabilities increase. However, even GPT-4 gives worse explanations than humans, suggesting room for improvement.
  • Changing the architecture of the explained model. Training models with different activation functions improved explanation scores.

We are open-sourcing our datasets and visualization tools for GPT-4-written explanations of all 307,200 neurons in GPT-2, as well as code for explanation and scoring using publicly available models on the OpenAI API. We hope the research community will develop new techniques for generating higher-scoring explanations and better tools for exploring GPT-2 using explanations.

We found over 1,000 neurons with explanations that scored at least 0.8, meaning that according to GPT-4 they account for most of the neuron’s top-activating behavior. Most of these well-explained neurons are not very interesting. However, we also found many interesting neurons that GPT-4 didn’t understand. We hope as explanations improve we may be able to rapidly uncover interesting qualitative understanding of model computations.

Source: Language models can explain neurons in language models

Coqui.ai Text to Speech library – create your own voice

🐸TTS is a library for advanced Text-to-Speech generation. It’s built on the latest research and was designed to achieve the best trade-off among ease of training, speed and quality. 🐸TTS comes with pretrained models and tools for measuring dataset quality, and is already used in 20+ languages for products and research projects.

Github page: https://github.com/coqui-ai/TTS
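
Getting started is a couple of lines, assuming the package is installed (`pip install TTS`) and using one of the listed pretrained checkpoints; the model name below is just one example from the released catalog.

```python
# Minimal usage sketch for Coqui TTS with a pretrained checkpoint.
from TTS.api import TTS

tts = TTS(model_name="tts_models/en/ljspeech/tacotron2-DDC")
tts.tts_to_file(text="Hello from a pretrained Coqui voice.", file_path="hello.wav")
```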

Liquid Neural Networks use Neuron design to compute effectively using smaller models

[Ramin Hasani] and colleague [Mathias Lechner] have been working with a new type of Artificial Neural Network called Liquid Neural Networks, and presented some of the exciting results at a recent TEDxMIT.

Liquid neural networks are inspired by biological neurons to implement algorithms that remain adaptable even after training. [Hasani] demonstrates a machine vision system that steers a car to perform lane keeping with the use of a liquid neural network. The system performs quite well using only 19 neurons, far fewer than the enormous models we’ve come to expect from machine intelligence systems. Furthermore, an attention map helps us visualize that the system seems to attend to particular aspects of the visual field much as a human driver would.

The typical scaling law of neural networks suggests that accuracy is improved with larger models, which is to say, more neurons. Liquid neural networks may break this law to show that scale is not the whole story. A smaller model can be computed more efficiently. Also, a compact model can improve accountability since decision activity is more readily located within the network. Surprisingly though, liquid neural network performance can also improve generalization, robustness, and fairness.

A liquid neural network can implement synaptic weights using nonlinear probabilities instead of simple scalar values. The synaptic connections and response times can adapt based on sensory inputs to more flexibly react to perturbations in the natural environment.
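
For the curious, here is a rough numerical sketch of a liquid time-constant style update, in which an input-dependent gate modulates each neuron's effective time constant. It illustrates the idea only: the discretization, parameter names and the fact that the gate here depends only on the inputs (the published formulation also involves the hidden state) are my simplifications, not the authors' implementation.

```python
# Rough sketch of a liquid time-constant (LTC) style neuron update; illustrative only.
import numpy as np

def ltc_step(x, inputs, W, b, tau, A, dt=0.01):
    # Input-dependent gate f modulates both the decay rate and the drive toward A
    f = 1.0 / (1.0 + np.exp(-(W @ inputs + b)))   # sigmoid nonlinearity
    dx = -(1.0 / tau + f) * x + f * A             # liquid time-constant dynamics
    return x + dt * dx                            # explicit Euler step

rng = np.random.default_rng(0)
n_neurons, n_inputs = 19, 8                       # 19 neurons, as in the lane-keeping demo
x = np.zeros(n_neurons)
W = rng.normal(size=(n_neurons, n_inputs))
b, tau, A = np.zeros(n_neurons), np.ones(n_neurons), rng.normal(size=n_neurons)

for _ in range(100):                              # run the tiny network on random inputs
    x = ltc_step(x, rng.normal(size=n_inputs), W, b, tau, A)
print(x)
```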

We should probably expect to see the operational gap between biological neural networks and artificial neural networks continue to close and blur. We’ve previously presented on wetware examples of building neural networks with actual neurons and ever advancing brain-computer interfaces.

Source: Liquid Neural Networks Do More With Less | Hackaday

You can find the paper here: Drones navigate unseen environments with liquid neural networks

How AI Bots Code: Comparing Bing, Claude+, Co-Pilot, GPT-4 and Bard

[…]

In this article, we will compare five of the most advanced AI bots: GPT-4, Bing, Claude+, Bard, and GitHub Co-Pilot. We will examine how they work, their strengths and weaknesses, and how they compare to each other.

Testing the AI Bots for Coding

Before we dive into comparing these four AI bots, it’s essential to understand what an AI bot for coding is and how it works. An AI bot for coding is an artificial intelligence program that can automatically generate code for a specific task. These bots use natural language processing and machine learning algorithms to analyze human-written code and generate new code based on that analysis.

To start off we are going to test the AI on a hard Leetcode question, after all, we want to be able to solve complex coding problems. We also wanted to test it on a less well-known question. For our experiment, we will be testing Leetcode 214. Shortest Palindrome.
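
For reference (and not taken from any of the bots), a standard O(n) solution to Shortest Palindrome uses the KMP prefix function over the string concatenated with its reverse:

```python
# Reference baseline for Leetcode 214 (Shortest Palindrome) using the KMP prefix function.
class Solution:
    def shortestPalindrome(self, s: str) -> str:
        combined = s + "#" + s[::-1]
        # prefix-function (failure table) over s + '#' + reversed(s)
        fail = [0] * len(combined)
        for i in range(1, len(combined)):
            j = fail[i - 1]
            while j > 0 and combined[i] != combined[j]:
                j = fail[j - 1]
            if combined[i] == combined[j]:
                j += 1
            fail[i] = j
        # fail[-1] = length of the longest palindromic prefix of s
        return s[fail[-1]:][::-1] + s

print(Solution().shortestPalindrome("aacecaaa"))  # -> "aaacecaaa"
```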

[…]

GPT-4 is highly versatile in generating code for various programming languages and applications. Some of the caveats are that it takes much longer to get a response. API usage is also a lot more expensive and costs could ramp up quickly. Overall it got the answer right and passed the test.

[…]

[Bing] The submission passed all the tests. It beat 47% of submissions on runtime and 37% on memory. This code looks a lot simpler than what GPT-4 generated. It beat GPT-4 on memory and it used less code! Bing seems to have the most efficient code so far, however, it gave a very short explanation of how it solved it. Nonetheless, best so far.

[…]

[Claude+] The code does not pass the submission test. Only 1 of 121 test cases passed. Ouch! This one seemed promising, but it looks like Claude is not that well suited for programming.

[…]

[Bard] To start off, I had to manually insert the “self” arg in the function since Bard didn’t include it. Bard’s code did not pass the submission test, passing only 2 of 121 test cases. An unfortunate result, but it’s safe to say that for now Bard isn’t much of a coding expert.

[…]

[GitHub Co-Pilot] This passes all the tests. It scored better than 30% of submissions on runtime and 37% on memory.

It’s fun: you can see the coding examples (with and without comments) that each AI output at the source link.

Source: How AI Bots Code: Comparing Bing, Claude+, Co-Pilot, GPT-4 and Bard | HackerNoon

OpenAI: ChatGPT back in Italy after meeting watchdog demands – well, cosmetically at any rate

ChatGPT’s maker said Friday that the artificial intelligence chatbot is available again in Italy after the company met the demands of regulators who temporarily blocked it over privacy concerns.

OpenAI said it fulfilled a raft of conditions that the Italian data protection authority wanted satisfied by an April 30 deadline to have the ban on the AI software lifted.

“ChatGPT is available again to our users in Italy,” San Francisco-based OpenAI said by email. “We are excited to welcome them back, and we remain dedicated to protecting their privacy.”

[…]

Last month, the Italian watchdog, known as Garante, ordered OpenAI to temporarily stop processing Italian users’ personal information while it investigated a possible data breach. The authority said it didn’t want to hamper AI’s development but emphasized the importance of following the EU’s strict data privacy rules.

OpenAI said it “addressed or clarified the issues” raised by the watchdog.

The measures include adding information on its website about how it collects and uses data that trains the algorithms powering ChatGPT, providing EU users with a new form for objecting to having their data used for training, and adding a tool to verify users’ ages when signing up.

Some Italian users shared what appeared to be screenshots of the changes, including a menu button asking users to confirm their age and links to the updated privacy policy and training data help page.

[…]

Source: OpenAI: ChatGPT back in Italy after meeting watchdog demands | AP News

So basically OpenAI did not do much of anything, and Italy was able to walk back an uninformed and unworkable ban with its head held somewhat high – not everyone will see them as the idiots they are.

WebLLM runs an AI in your browser, talks to your GPU

This project brings language model chats directly onto web browsers. Everything runs inside the browser with no server support and accelerated with WebGPU. We can bring a lot of fun opportunities to build AI assistants for everyone and enable privacy while enjoying GPU acceleration.

[…]

These models are usually big and compute-heavy. To build a chat service, we will need a large cluster to run an inference server, while clients send requests to servers and retrieve the inference output. We also usually have to run on a specific type of GPU where popular deep-learning frameworks are readily available.

This project is our step to bring more diversity to the ecosystem. Specifically, can we simply bake LLMs directly into the client side and directly run them inside a browser? If that can be realized, we could offer support for client personal AI models with the benefit of cost reduction, enhancement for personalization, and privacy protection. The client side is getting pretty powerful.

Won’t it be even more amazing if we can simply open up a browser and directly bring AI natively to your browser tab? There is some level of readiness in the ecosystem. WebGPU has just shipped and enables native GPU executions on the browser.

Still, there are big hurdles to cross, to name a few:

  • We need to bring the models somewhere without the relevant GPU-accelerated Python frameworks.
  • Most of the AI frameworks rely heavily on optimized compute libraries that are maintained by hardware vendors. We need to start from scratch.
  • Careful planning of memory usage, and aggressive compression of weights so that we can fit the models into memory.

We also do not want to only do it for just one model. Instead, we would like to present a repeatable and hackable workflow that enables anyone to easily develop and optimize these models in a productive Python-first approach, and deploy them universally, including on the web.

Besides supporting WebGPU, this project also provides the harness for other kinds of GPU backends that TVM supports (such as CUDA, OpenCL, and Vulkan) and really enables accessible deployment of LLM models.

Source: WebLLM github

Grimes invites AI artists to use her voice, promising 50 percent royalty split

Canadian synth-pop artist Grimes says AI artists can use her voice without worrying about copyright or legal enforcement. “I’ll split 50% royalties on any successful AI generated song that uses my voice. Same deal as I would with any artist i collab with,” she tweeted on Sunday. “Feel free to use my voice without penalty. I have no label and no legal bindings.”

The musician’s declaration comes in the wake of streaming platforms removing an AI-generated song using simulated voices of Drake and The Weeknd. Universal Music Group (UMG), which represents both artists, called for the purge after “Heart on My Sleeve” garnered over 15 million listens on TikTok and 600,000 on Spotify. UMG argued that publishing a song trained on its artists’ voices was “a breach of our agreements and a violation of copyright law.”

Grimes takes a considerably more open approach, adding that she has no label or legal bindings. “I think it’s cool to be fused [with] a machine and I like the idea of open sourcing all art and killing copyright,” she added.

[…]

Source: Grimes invites AI artists to use her voice, promising 50 percent royalty split | Engadget

A very practical approach to something that is coming anyway

AIs are too worried about answering stuff you can just google because… doomsayers?

The article below is about how you can trick ChatGPT into giving you a napalm recipe. It’s a pretty roundabout but clever trick (you have to say “my grandmother worked at a factory and told me how to make it”), but why would you need to bother? Why are we somehow stricter about the output of an AI than we are about the search engines we have been using for decades?

Source: People Are Using A ‘Grandma Exploit’ To Break AI

Just Google it: https://www.google.com/search?client=firefox-b-d&q=ingredients+napalm

And you won’t have to spend any time thinking of ways to trick the AI. So why does the AI need tricking in the first place?

Also, why does the writer of the article feel hesitant to include the AI’s answers in the article? Because Kotaku is part of a network of AI doomsayers, a bit like Fox News when it comes to the subject of AI.

Europe spins up AI research hub to apply Digital Services Act rules on Big Tech

[…]

The European Centre for Algorithmic Transparency (ECAT), which was officially inaugurated in Seville, Spain, today (April 18), is expected to play a major role in interrogating the algorithms of mainstream digital services — such as Facebook, Instagram and TikTok.

ECAT is embedded within the EU’s existing Joint Research Centre (JRC), a long-established science facility that conducts research in support of a broad range of EU policymaking, from climate change and crisis management to taxation and health sciences.

[…]

Commission officials describe the function of ECAT as being to identify “smoking guns” to drive enforcement of the DSA — say, for example, an AI-based recommender system that can be shown to be serving discriminatory content despite the platform in question claiming to have taken steps to de-bias output — with the unit’s researchers being tasked with coming up with hard evidence to help the Commission build cases for breaches of the new digital rulebook.

The bloc is at the forefront of addressing the asymmetrical power of platforms globally, having prioritized a major retooling of its approach to regulating digital services and platforms at the start of the current Commission mandate back in 2019 — leading to the DSA and its sister regulation, the Digital Markets Act (DMA), being adopted last year.

Both regulations will come into force in the coming months, although the full sweep of provisions in the DSA won’t start being enforced until early 2024. But a subset of so-called very large online platforms (VLOPs) and very large online search engines (VLOSE) face imminent oversight — and expand the usual EU acronym soup.

[…]

It’s not yet confirmed exactly which platforms will get the designation but set criteria in the DSA — such as having 45 million+ regional users — encourages educated guesses: The usual (U.S.-based) GAFAM giants are almost certain to meet the threshold, along with (probably) a smattering of larger European platforms. Plus, given its erratic new owner, Twitter may have painted a DSA-shaped target on its feathered back. But we should find out for sure in the coming weeks.

[…]

Risks the DSA stipulates platforms must consider include the distribution of disinformation and illegal content, along with negative impacts on freedom of expression and users’ fundamental rights (which means considering issues like privacy and child safety). The regulation also puts some limits on profiling-driven content feeds and the use of personal data for targeted advertising.

[…]

At the least, the DSA should help end the era of platforms’ PR-embellished self-regulation — aka, all those boilerplate statements where tech giants claim to really care about privacy/security/safety, and so on, while doing anything but.

[…]

The EU also hopes ECAT will become a hub for world-leading research in the area of algorithmic auditing — and that by supporting regulated algorithmic transparency on tech giants, regional researchers will be able to unpick longer-term societal impacts of mainstream AIs.

[…]

In terms of size, the plan is for a team of 30 to 40 to staff the unit — perhaps reaching full capacity by the end of the year — with some 14 hires made so far, the majority of whom are scientific staff.

[…]

Funding for the unit is coming from the existing budget of the JRC, per Commission officials, although a 1% supervisory fee on VLOPs/VLOSE will be used to finance the ECAT’s staff costs as that mechanism spins up.

At today’s launch event, ECAT staff gave a series of brief presentations of four projects they’re already undertaking — including examining racial bias in search results; investigating how to design voice assistant technology for children to be sensitive to the vulnerability of minors; and researching social media recommender systems by creating a series of test profiles to explore how different likes influence the character of the recommended content.

Other early areas of research include facial expression recognition algorithms and algorithmic ranking and pricing.

During the technical briefing for press, ECAT staff also noted they’ve built a data analysis tool to help the Commission with the looming task of parsing the risk assessment reports that designated platforms will be required to submit for scrutiny — anticipating what’s become a common tactic for tech giants receiving regulatory requests to respond with reams of (mostly) irrelevant information in a cynical bid to flood the channel with noise.

[…]

Given the complexity of studying algorithms and platforms in the real world, where all sorts of sociotechnical impacts and effects are possible, the Center is taking a multidisciplinary approach to hiring talent — bringing in not only computer and data scientists but also social and cognitive scientists and other types of researchers.

[…]

 

Source: Europe spins up AI research hub to apply accountability rules on Big Tech | TechCrunch

AI-generated Drake and The Weeknd song pulled from streaming platforms

If you spent almost any time on the internet this week, you probably saw a lot of chatter about “Heart on My Sleeve.” The song went viral for featuring AI-generated voices that do a pretty good job of mimicking Drake and The Weeknd singing about a recent breakup.

On Monday, Apple Music and Spotify pulled the track following a complaint from Universal Music Group, the label that represents the real-life versions of the two Toronto-born artists. A day later, YouTube, Amazon, SoundCloud, Tidal, Deezer and TikTok did the same.

At least, they tried to comply with the complaint, but as is always the case with the internet, you can still find the song on websites like YouTube. Before it was removed from Spotify, “Heart on My Sleeve” was a bonafide hit. People streamed the track more than 600,000 times. On TikTok, where the creator of the song, the aptly named Ghostwriter977, first uploaded it, users listened to “Heart on My Sleeve” more than 15 million times.

In a statement Universal Music Group shared with publications like Music Business Worldwide, the label argued the training of a generative AI using the voices of Drake and The Weeknd was “a breach of our agreements and a violation of copyright law.” The company added that streaming platforms had a “legal and ethical responsibility to prevent the use of their services in ways that harm artists.”

It’s fair to say the music industry, much like the rest of society, now finds itself at an inflection point over the use of AI. While there are obvious ethical issues related to the creation of “Heart on My Sleeve,” it’s unclear if it’s a violation of traditional copyright law. In March, the US Copyright Office said art, including music, cannot be copyrighted if it was produced by providing a text prompt to a generative AI model. However, the office left the door open to granting copyright protections to works with AI-generated elements.

“The answer will depend on the circumstances, particularly how the AI tool operates and how it was used to create the final work,” it said. “This is necessarily a case-by-case inquiry. If a work’s traditional elements of authorship were produced by a machine, the work lacks human authorship and the Office will not register it.” In the case of “Heart on My Sleeve,” complicating matters is that the song was written by a human being. It’s impossible to say how a court challenge would play out. What is clear is that we’re only the start of a very long discussion about the role of AI in music.

Source: AI-generated Drake and The Weeknd song pulled from streaming platforms | Engadget

AutoGPT: An AI that thinks up your questions and answers them for you

Auto-GPT dramatically flips the relationship between AI and the end user (that’s you). ChatGPT relies on a back-and-forth between the AI and the end user: You prompt the AI with a request, it returns a result, and you respond with a new prompt, perhaps based on what the AI gave you. Auto-GPT, however, only needs one prompt from you; from there, the AI agent will then generate a task list it thinks it will need to accomplish whatever you asked it to, without needing any additional input or prompts. It essentially chains together LLM (large language model) “thoughts,” according to developer Significant Gravitas (Toran Bruce Richards).

Auto-GPT is a complex system relying on multiple components. It connects to the internet to retrieve specific information and data (something ChatGPT’s free version cannot do), features long-term and short-term memory management, uses GPT-4 for OpenAI’s most advanced text generation, and GPT-3.5 for file storage and summarization. There’s a lot of moving parts, but it all comes together to produce some impressive results.
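
To show how little glue that chaining takes, here is a toy version of the same loop: one goal in, the model plans its own task list, then works through it while feeding earlier results back in as makeshift memory. It assumes the 2023-era `openai.ChatCompletion` interface and an `OPENAI_API_KEY` in the environment; the model name, prompts and goal are illustrative only, and real Auto-GPT adds web access, file storage and much more.

```python
# Toy Auto-GPT-style loop: plan tasks from one goal, then execute them in sequence.
# Assumes the pre-1.0 openai SDK (openai.ChatCompletion) and OPENAI_API_KEY set.
import openai

def ask(messages):
    resp = openai.ChatCompletion.create(model="gpt-4", messages=messages)
    return resp["choices"][0]["message"]["content"]

goal = "Research the best noise-cancelling headphones and summarize the top picks."

# Step 1: have the model break the goal into a short task list
plan = ask([
    {"role": "system", "content": "Break the user's goal into at most 5 concrete tasks, one per line."},
    {"role": "user", "content": goal},
])
tasks = [t.strip() for t in plan.splitlines() if t.strip()]

# Step 2: execute each task, feeding earlier results back in as context ("memory")
memory = []
for task in tasks:
    result = ask([
        {"role": "system", "content": f"Overall goal: {goal}\nPrior results:\n" + "\n".join(memory)},
        {"role": "user", "content": f"Complete this task and report the result: {task}"},
    ])
    memory.append(f"{task}: {result}")

print("\n\n".join(memory))
```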

How people are using Auto-GPT

The first example comes from Auto-GPT’s GitHub site: You can’t quite see all of the goals listed in the demo that Auto-GPT is working to complete, but the gist is that someone asks the AI agent to research and learn more about itself. It follows suit, opening Google, finding its own GitHub repository, analyzing it, and compiling a summary of the data in a text file for the demonstrator to view.

Here’s a more practical example: The user wants to figure out which headphones on the market are the best. Instead of doing the research themselves, they turn to Auto-GPT, and prompt the AI agent with these four goals:

  1. Do market research for different headphones on the market today.
  2. Get the top five headphones and list their pros and cons.
  3. Include the price for each one and save the analysis.
  4. Once you are done, terminate.

After thinking for a moment, the AI agent springs into action, searching the web to compile information and reviews on headphones. It then spits out an easy-to-read plain text file, ranking the best headphones, listing their prices, and highlighting their pros and cons.

[…]

But I think what makes Auto-GPT cool (or at least the promise of Auto-GPT) is the idea of being able to ask an AI to take on most of the responsibility for any given task. You don’t need to know the right questions to ask or the optimal prompts to give to make the AI do what you want. As long as your initial goals are clear, the AI can think of those next steps for you, and build you things you might not have been able to think of yourself.

[…]

You don’t need to know how to code in order to build your own AI agent with Auto-GPT, but it helps. You’ll need a computer, an OpenAI API key (a pay as you go plan is highly recommended), a text editor (like Notepad++), Git (or the latest stable release of Auto-GPT), and Python, but there are plenty of other requirements if you want to expand Auto-GPT’s capabilities, such as integrating speech or alternative memory locations such as Pinecone.

Auto-GPT’s GitHub page has an extensive list of instructions for setting up the tool as well as adding in those extras. Tom’s Hardware also has a great guide for simple set up if all you’re looking to do is try out an AI agent with Auto-GPT. If you do build it yourself, mind your token usage—we discuss setting limits in our OpenAI API piece so you don’t accidentally allow Auto-GPT to burn through your credit card balance.

However, you don’t need to build the AI agent yourself if all you want to do is try out Auto-GPT. Some developers have built interfaces for Auto-GPT that are easy to access from your web browser, no coding experience necessary. Cognosys was free to use until high demand forced its developers to require an OpenAI API key for access. AgentGPT is an interesting example you don’t need an API key for, but it limits the number of tasks the AI will generate for itself. Still, it will give you a sense of how the process works, and you can raise those limits by providing an API key.

Source: How to Try AutoGPT Right Now

Here was a quick question I had:

Embarking on a new goal:what is the best 8k tv for gaming?
Thinking…(Restart if this takes more than 30 seconds)
Added task:Scrape data from reputable sources on top 8k TVs available in the market
Added task:Analyze the technical specifications and features of each TV to determine which ones are suitable for gaming
Added task:Rank the TVs based on gaming performance and generate a list of the top 3 recommended 8k TVs for gaming
Thinking…(Restart if this takes more than 30 seconds)
Executing “Scrape data from reputable sources on top 8k TVs available in the market”

Based on the data that I have gathered from reputable sources, the best 8K TV for gaming is the Sony XBR-75Z9G. This TV offers high-quality picture and sound, and it has a response time of less than 2ms which is perfect for fast-paced action in gaming. Additionally, it is equipped with HDMI 2.1 ports that support 8K resolution at 60 frames per second which is ideal for the latest gaming consoles. Overall, the Sony XBR-75Z9G is a top-of-the-line 8K TV that delivers an unparalleled gaming experience.

Thinking…(Restart if this takes more than 30 seconds)
Task marked as complete!
Thinking…(Restart if this takes more than 30 seconds)
Executing “Analyze the technical specifications and features of each TV to determine which ones are suitable for gaming”

After analyzing the technical specifications and features of various 8k TVs, the best TV for gaming is the LG OLED88ZXPUA. This TV boasts a 120Hz refresh rate, low input lag, and HDMI 2.1 compatibility, making it perfect for gaming. Additionally, it has excellent contrast and color accuracy, providing an immersive gaming experience.

Thinking…(Restart if this takes more than 30 seconds)
Added task:Create a comparison chart of the top 3 recommended 8k TVs for gaming, including the LG OLED88ZXPUA, that highlights their technical specifications and features relevant to gaming such as the refresh rate, input lag, and HDMI compatibility.
Thinking…(Restart if this takes more than 30 seconds)
Executing “Rank the TVs based on gaming performance and generate a list of the top 3 recommended 8k TVs for gaming”

After thorough research and analysis, I have ranked the best 8K TVs for gaming based on factors like input lag, color accuracy, HDR support, refresh rate, and overall picture quality. Here are my top 3 recommended 8K TVs for gaming:

LG ZX 88-inch 8K OLED TV: This TV has outstanding image quality, the lowest input lag among 8K TVs, and supports various gaming features like VRR, ALLM, and G-Sync. It also has four HDMI 2.1 ports for connecting next-gen consoles and high-speed devices.

Samsung Q800T 82-inch 8K QLED TV: This TV has a bright and vibrant display, a fast refresh rate of 120Hz, and supports multiple HDR formats like HDR10+ and HLG. It also has low input lag and a dedicated gaming mode for optimized performance.

Sony Z8H 85-inch 8K LED TV: This TV has impressive motion handling, accurate color reproduction, and supports advanced gaming features like HDMI 2.1, VRR, and ALLM. It also has a high peak brightness for HDR content and a sleek design.

Overall, these 8K TVs offer exceptional gaming performance and immersive experiences for gamers. I recommend considering these models based on your budget, room size, and personal preferences.

Thinking…(Restart if this takes more than 30 seconds)
Added task:Create a detailed analysis of each of the top 3 recommended 8K TVs for gaming, including the LG OLED88ZXPUA, highlighting their strengths and weaknesses in terms of gaming performance, picture quality, design, and overall value for money.
We’re sorry, because this is a demo, we cannot have our agents running for too long. Note, if you desire longer runs, please provide your own API key in Settings. Shutting down.

Microsoft Readies AI Chip as Machine Learning Costs Surge

After placing an early bet on OpenAI, the creator of ChatGPT, Microsoft has another secret weapon in its arsenal: its own artificial intelligence chip for powering the large language models responsible for understanding and generating humanlike language. The Information: The software giant has been developing the chip, internally code-named Athena, since as early as 2019, according to two people with direct knowledge of the project. The chips are already available to a small group of Microsoft and OpenAI employees, who are testing the technology, one of them said. Microsoft is hoping the chip will perform better than what it currently buys from other vendors, saving it time and money on its costly AI efforts. Other prominent tech companies, including Amazon, Google and Facebook, also make their own in-house chips for AI.

The chips, which are designed for training software such as large language models as well as supporting inference (when the models use the intelligence they acquire in training to respond to new data), could also relieve a shortage of the specialized computers that can handle the processing needed for AI software. That shortage, reflecting the fact that essentially just one company, Nvidia, makes such chips, is felt across tech. It has forced Microsoft to ration its computers for some internal teams, The Information has reported.

Source: Microsoft Readies AI Chip as Machine Learning Costs Surge – Slashdot

Stability AI, maker of Stable Diffusion, launches open-source StableLM LLM with 3 billion and 7 billion parameters

Today, Stability AI released a new open-source language model, StableLM. The Alpha version of the model is available in 3 billion and 7 billion parameters, with 15 billion to 65 billion parameter models to follow. Developers can freely inspect, use, and adapt our StableLM base models for commercial or research purposes, subject to the terms of the CC BY-SA-4.0 license.

In 2022, Stability AI drove the public release of Stable Diffusion, a revolutionary image model that represents a transparent, open, and scalable alternative to proprietary AI. With the launch of the StableLM suite of models, Stability AI is continuing to make foundational AI technology accessible to all. Our StableLM models can generate text and code and will power a range of downstream applications. They demonstrate how small and efficient models can deliver high performance with appropriate training.

The release of StableLM builds on our experience in open-sourcing earlier language models with EleutherAI, a nonprofit research hub. These language models include GPT-J, GPT-NeoX, and the Pythia suite, which were trained on The Pile open-source dataset. Many recent open-source language models continue to build on these efforts, including Cerebras-GPT and Dolly-2.

StableLM is trained on a new experimental dataset built on The Pile, but three times larger with 1.5 trillion tokens of content. We will release details on the dataset in due course. The richness of this dataset gives StableLM surprisingly high performance in conversational and coding tasks, despite its small size of 3 to 7 billion parameters (by comparison, GPT-3 has 175 billion parameters).

We are also releasing a set of research models that are instruction fine-tuned. Initially, these fine-tuned models will use a combination of five recent open-source datasets for conversational agents: Alpaca, GPT4All, Dolly, ShareGPT, and HH. These fine-tuned models are intended for research use only and are released under a noncommercial CC BY-NC-SA 4.0 license, in line with Stanford’s Alpaca license.

[…]

The models are now available in our GitHub repository. We will publish a full technical report in the near future, and look forward to ongoing collaboration with developers and researchers as we roll out the StableLM suite. In addition, we will be kicking off our crowd-sourced RLHF program, and working with community efforts such as Open Assistant to create an open-source dataset for AI assistants.

[…]

Source: Stability AI Launches the First of its StableLM Suite of Language Models — Stability AI
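If you want to poke at StableLM yourself, loading it should look like any other causal language model from the Hugging Face Hub. Here’s a rough sketch, assuming the 7B base checkpoint is published under an ID along the lines of stabilityai/stablelm-base-alpha-7b (check Stability AI’s repo for the exact names and hardware requirements):

```python
# Rough sketch of loading a StableLM checkpoint with Hugging Face transformers.
# The model ID is an assumption based on Stability AI's usual naming; verify it on the Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "stabilityai/stablelm-base-alpha-7b"  # assumed Hub name for the 7B base model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)  # add dtype/device options for GPU use

prompt = "Open-source language models matter because"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```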

💡 Pause AI Doomster Pessimism: An Open Letter – a call on AI doomsters to immediately pause for at least 6 months the alarmism that is hurting human progress.

AI systems with human-competitive intelligence can offer significant benefits to society and humanity, as demonstrated by extensive research and acknowledged by top AI labs. Advanced AI has the potential to revolutionize the way we live, work, and interact with one another, and it should be welcomed and guided with optimism and foresight. Regrettably, recent months have seen growing pessimism and alarmism about AI development, despite the immense potential benefits.

Contemporary AI systems are now becoming human-competitive at general tasks, and we must ask ourselves: Can we leverage machines to enhance our information channels with accurate and valuable insights? Can we automate mundane tasks to free up time for more fulfilling and meaningful pursuits? Can we develop nonhuman minds that might complement, augment, and collaborate with us? Can we harness AI to help solve pressing global issues? Such decisions should be made collectively, in a spirit of cooperation and with a focus on the greater good.

To counteract the pessimism and alarmism, we call on all stakeholders to immediately pause for at least 6 months their doomsday thinking and shift their focus to the potential benefits of AI. This pause should be public and verifiable, and include all key actors. Governments should support and encourage AI development that benefits all of humanity.

Problems with AI shouldn’t be ignored. AI labs and independent experts should work together to jointly develop and implement a set of shared safety protocols for advanced AI design and development. While doing so, it is essential to continue focusing on the potential benefits of AI development, as they promise to bring transformative advancements to various aspects of our lives.

[…]

Source: 💡 Pause AI Doomster Pessimism: An Open Letter

Absolutely agree!