CodeWhisperer automatically filters out any code suggestions that are potentially biased or unfair and flags any code that’s similar to open-source training data. It also comes with security scanning features that can identify vulnerabilities within a developer’s code, while providing suggestions to help close any security gaps it uncovers. CodeWhisperer now supports several languages, including Python, Java, JavaScript, TypeScript, and C#, including Go, Rust, PHP, Ruby, Kotlin, C, C++, Shell scripting, SQL, and Scala.
Here’s how Amazon’s senior developer advocate pitched the usefulness of their “real-time AI coding companion”: Helping to keep developers in their flow is increasingly important as, facing increasing time pressure to get their work done, developers are often forced to break that flow to turn to an internet search, sites such as StackOverflow, or their colleagues for help in completing tasks. While this can help them obtain the starter code they need, it’s disruptive as they’ve had to leave their IDE environment to search or ask questions in a forum or find and ask a colleague — further adding to the disruption. Instead, CodeWhisperer meets developers where they are most productive, providing recommendations in real time as they write code or comments in their IDE. During the preview we ran a productivity challenge, and participants who used CodeWhisperer were 27% more likely to complete tasks successfully and did so an average of 57% faster than those who didn’t use CodeWhisperer….
It provides additional data for suggestions — for example, the repository URL and license — when code similar to training data is generated, helping lower the risk of using the code and enabling developers to reuse it with confidence.
I have posted on this a few times and to me it’s shocking to see these fabricated sci-fi doomsday predictions about AI. AI / ML is a tool which we use, just like video games (that don’t cause violence in kids), roleplaying games (which don’t cause satanism), a telephone (which yes, can be used in planning crimes but most usually isn’t – and the paper post is the same), search engines (which can be used to search up how to make explosives but most usually aren’t), knives (which can be used to stab people but are most usually found in a food setting). This isn’t to say that the use of tools shouldn’t be regulated. Dinner knives have a certain maximum size. Video games and books with hate and violence inducing content are censored. Phone calls can be tapped and post opened if there is probable cause. Search engines can be told not to favour products the parent company owns. And the EU AI act is a good step on the way to ensuring that AI tools aren’t dangerous.
The technology is still a long long way off from an AI being smart enough to be at all evil and planet destroying.
Below is an excellent run through of some of the biggest AI doomerists and what they mean, how their self interest is served by being doomerist.
AI Doomerism is becoming mainstream thanks to mass media, which drives our discussion about Generative AI from bad to worse, or from slightly insane to batshit crazy. Instead of out-of-control AI, we have out-of-control panic.
When a British tabloid headline screams, “Attack of the psycho chatbot,” it’s funny. When it’s followed by another front-page headline, “Psycho killer chatbots are befuddled by Wordle,” it’s even funnier. If this type of coverage stayed in the tabloids, which are known to be sensationalized, that was fine.
In just a few days, we went from “governments should force a 6-month pause” (the petition from the Future of Life Institute) to “wait, it’s not enough, so data centers should be bombed.” Sadly, this is the narrative that gets media attention and shapes our already hyperbolic AI discourse.
In order to understand the rise of AI Doomerism, here are some influential figures responsible for mainstreaming doomsday scenarios. This is not the full list of AI doomers, just the ones that recently shaped the AI panic cycle (so I‘m focusing on them).
AI Panic Marketing: Exhibit A: Sam Altman.
Sam Altman has a habit of urging us to be scared. “Although current-generation AI tools aren’t very scary, I think we are potentially not that far away from potentially scary ones,” he tweeted. “If you’re making AI, it is potentially very good, potentially very terrible,” he told the WSJ. When he shared the bad-case scenario of AI with Connie Loizo, it was ”lights out for all of us.”
In an interview with Kara Swisher, Altman expressed how he is “super-nervous” about authoritarians using this technology.” He elaborated in an ABC News interview: “A thing that I do worry about is … we’re not going to be the only creator of this technology. There will be other people who don’t put some of the safety limits that we put on it. I’m particularly worried that these models could be used for large-scale disinformation.” These models could also “be used for offensive cyberattacks.” So, “people should be happy that we are a little bit scared of this.” He repeated this message in his following interview with Lex Fridman: “I think it’d be crazy not to be a little bit afraid, and I empathize with people who are a lot afraid.”
Having shared this story in 2016, it shouldn’t come as a surprise: “My problem is that when my friends get drunk, they talk about the ways the world will END.” One of the “most popular scenarios would be A.I. that attacks us.” “I try not to think about it too much,” Altman continued. “But I have guns, gold, potassium iodide, antibiotics, batteries, water, gas masks from the Israeli Defense Force, and a big patch of land in Big Sur I can fly to.”
(Wouldn’t it be easier to just cut back on the drinking and substance abuse?).
Altman’s recent post “Planning for AGI and beyond” is as bombastic as it gets: “Successfully transitioning to a world with superintelligence is perhaps the most important – and hopeful, and scary – project in human history.”
It is at this point that you might ask yourself, “Why would someone frame his company like that?” Well, that’s a good question. The answer is that making OpenAI’s products “the most important and scary – in human history” is part of its marketing strategy. “The paranoia is the marketing.”
“AI doomsaying is absolutely everywhere right now,” described Brian Merchant in the LA Times. “Which is exactly the way that OpenAI, the company that stands to benefit the most from everyone believing its product has the power to remake – or unmake – the world, wants it.” Merchant explained Altman’s science fiction-infused marketing frenzy: “Scaring off customers isn’t a concern when what you’re selling is the fearsome power that your service promises.”
During the Techlash days in 2019, which focused on social media, Joseph Bernstein explained how the alarm over disinformation (e.g., “Cambridge Analytica was responsible for Brexit and Trump’s 2016 election”) actually “supports Facebook’s sales pitch”:
This can be applied here: The alarm over AI’s magic power (e.g., “replacing humans”) actually “supports OpenAI’s sales pitch”:
“What could be more appealing to future AI employees and investors than a machine that can become superintelligence?”
AI Panic as a Business. Exhibit A & B: Tristan Harris & Eliezer Yudkowsky.
Altman is at least using apocalyptic AI marketing for actual OpenAI products. The worst kind of doomers is those whose AI panic is their product, their main career, and their source of income. A prime example is the Effective Altruism institutes that claim to be the superior few who can save us from a hypothetical AGI apocalypse.
In March, Tristan Harris, Co-Founder of the Center for Humane Technology, invited leaders to a lecture on how AI could wipe out humanity. To begin his doomsday presentation, he stated: “What nukes are to the physical world … AI is to everything else.”
In the “Social Dilemma,” he promoted the idea that “Two billion people will have thoughts that they didn’t intend to have” because of the designers’ decisions. But, as Lee Visel pointed out, Harris didn’t provide any evidence that social media designers actually CAN purposely force us to have unwanted thoughts.
Similarly, there’s no need for evidence now that AI is worse than nuclear power; simply thinking about this analogy makes it true (in Harris’ mind, at least). Did a social media designer force him to have this unwanted thought? (Just wondering).
To further escalate the AI panic, Tristan Harris published an OpEd in The New York Times with Yuval Noah Harari and Aza Raskin. Among their overdramatic claims: “We have summoned an alien intelligence,” “A.I. could rapidly eat the whole human culture,” and AI’s “godlike powers” will “master us.”
Another statement in this piece was, “Social media was the first contact between A.I. and humanity, and humanity lost.” I found it funny as it came from two men with hundreds of thousands of followers (@harari_yuval 540.4k, @tristanharris 192.6k), who use their social media megaphone … for fear-mongering. The irony is lost on them.
“This is what happens when you bring together two of the worst thinkers on new technologies,” added Lee Vinsel. “Among other shared tendencies, both bloviate free of empirical inquiry.”
This is where we should be jealous of AI doomers. Having no evidence and no nuance is extremely convenient (when your only goal is to attack an emerging technology).
Then came the famous “Open Letter.” This petition from the Future of Life Institute lacked a clear argument or a trade-off analysis. There were only rhetorical questions, like, should we develop imaginary “nonhuman minds that might eventually outnumber, outsmart, obsolete, and replace us?“ They provided no evidence to support the claim that advanced LLMs pose an unprecedented existential risk. There were a lot of highly speculative assumptions. Yet, they demanded an immediate 6-month pause on training AI systems and argued that “If such a pause cannot be enacted quickly, governments should institute a moratorium.”
Please keep in mind that (1). A $10 million donation from Elon Musk launched the Future of Life Institute in 2015. Out of its total budget of 4 million euros for 2021, Musk Foundation contributed 3.5 million euros (the biggest donor by far). (2). Musk once said that “With artificial intelligence, we are summoning the demon.” (3). Due to this, the institute’s mission is to lobby against extinction, misaligned AI, and killer robots.
“The authors of the letter believe they are superior. Therefore, they have the right to call a stop, due to the fear that less intelligent humans will be badly influenced by AI,” responded Keith Teare (CEO SignalRank Corporation). “They are taking a paternalistic view of the entire human race, saying, ‘You can’t trust these people with this AI.’ It’s an elitist point of view.”
Spencer Ante (Meta Foresight). “Leading providers of AI are taking AI safety and responsibility very seriously, developing risk-mitigation tools, best practices for responsible use, monitoring platforms for misuse, and learning from human feedback.”
Next, because he thought the open letter didn’t go far enough, Eliezer Yudkowsky took “PhobAI” too far. First, Yudkowsky asked us all to be afraid of made-up risks and an apocalyptic fantasy he has about “superhuman intelligence” “killing literally everyone” (or “kill everyone in the U.S. and in China and on Earth”). Then, he suggested that “preventing AI extinction scenarios is considered a priority above preventing a full nuclear exchange.” By explicitly advocating violent solutions to AI, we have officially reached the height of hysteria.
“Rhetoric from AI doomers is not just ridiculous. It’s dangerous and unethical,” responded Yann Lecun (Chief AI Scientist, Meta). “AI doomism is quickly becoming indistinguishable from an apocalyptic religion. Complete with prophecies of imminent fire and brimstone caused by an omnipotent entity that doesn’t actually exist.”
“You stand a far greater chance of dying from lightning strikes, collisions with deer, peanut allergies, bee stings & ignition or melting of nightwear – than you do from AI,” Michael Shermer wrote to Yudkowsky. “Quit stoking irrational fears.”
The problem is that “irrational fears” sell. They are beneficial to the ones who spread them.
How to Spot an AI Doomer?
On April 2nd, Gary Marcus asked: “Confused about the terminology. If I doubt that robots will take over the world, but I am very concerned that a massive glut of authoritative-seeming misinformation will undermine democracy, do I count as a “doomer”?
One of the answers was: “You’re a doomer as long as you bypass participating in the conversation and instead appeal to populist fearmongering and lobbying reactionary, fearful politicians with clickbait.”
Considering all of the above, I decided to define “AI doomer” and provide some criteria:
Doomers tend to live in a tradeoff-free fantasy land.
Doomers have a general preference for very amorphous, top-down Precautionary Principle-based solutions, but they (1) rarely discuss how (or if) those schemes would actually work in practice, and (2) almost never discuss the trade-offs/costs their extreme approaches would impose on society/innovation.
Answering Gary Marcus’ question, I do not think he qualifies as a doomer. You need to meet all criteria (he does not). Meanwhile, Tristan Harris and Eliezer Yudkowsky meet all seven.
Are they ever going to stop this “Panic-as-a-Business”? If the apocalyptic catastrophe doesn’t occur, will the AI doomers ever admit they were wrong? I believe the answer is “No.”
Segment Anything, recently released by Facebook Research, does something that most people who have dabbled in computer vision have found daunting: reliably figure out which pixels in an image belong to an object. Making that easier is the goal of the Segment Anything Model (SAM), just released under the Apache 2.0 license.
The online demo has a bank of examples, but also works with uploaded images.
The results look fantastic, and there’s an interactive demo available where you can play with the different ways SAM works. One can pick out objects by pointing and clicking on an image, or images can be automatically segmented. It’s frankly very impressive to see SAM make masking out the different objects in an image look so effortless. What makes this possible is machine learning, and part of that is the fact that the model behind the system has been trained on a huge dataset of high-quality images and masks, making it very effective at what it does.
Once an image is segmented, those masks can be used to interface with other systems like object detection (which identifies and labels what an object is) and other computer vision applications. Such system work more robustly if they already know where to look, after all. This blog post from Meta AI goes into some additional detail about what’s possible with SAM, and fuller details are in the research paper.
Systems like this rely on quality datasets. Of course, nothing beats a great collection of real-world data but we’ve also seen that it’s possible to machine-generate data that never actually existed, and get useful results.
In a news announcement on Wednesday, the Italian Data Protection Authority, known as the Garante, stressed that OpenAI needed to be more transparent about its data collection processes and inform users about their data rights with regards to the generative AI. These rights include allowing users and non-users of ChatGPT to object to having their data processed by OpenAI and letting them correct false or inaccurate information about them generated by ChatGPT, similar to rights related to other technologies guaranteed by Europe’s General Data Protection Regulation, or GDPR, laws.
Other measures required by the Garante include a public notice on OpenAI’s website “describing the arrangements and logic of the data processing required for the operation of ChatGPT along with the rights afforded to data subjects.” The regulator will also require OpenAI to immediately implement an age gating system for ChatGPT and submit a plan to implement an age verification system by May 31.
The Italian regulator said OpenAI had until April 30 to implement the measures it’s asking for.
Allowing users to correct is in principle a Good Idea, but then you get Wikipedia types of battles on who is the arbiter of truth. Of course, no one system will ever be 100% truthful or accurate, so banning it for this is just stupid. No age gate keeper works either and neither did the ban – people can circumvent these very very easily. So Italy needs some sort of concession to get out of the hole it’s dug itself and this is at least a promising start.
The 2019 release of the first image of a black hole was hailed as a significant scientific achievement. But truth be told, it was a bit blurry – or, as one astrophysicist involved in the effort called it, a “fuzzy orange donut.”
Scientists on Thursday unveiled a new and improved image of this black hole – a behemoth at the center of a nearby galaxy – mining the same data used for the earlier one but improving its resolution by employing image reconstruction algorithms to fill in gaps in the original telescope observations.
[…]
The ring of light – that is, the material being sucked into the voracious object – seen in the new image is about half the width of how it looked in the previous picture. There is also a larger “brightness depression” at the center – basically the donut hole – caused by light and other matter disappearing into the black hole.
The image remains somewhat blurry due to the limitations of the data underpinning it – not quite ready for a Hollywood sci-fi blockbuster, but an advance from the 2019 version.
This supermassive black hole resides in a galaxy called Messier 87, or M87, about 54 million light-years from Earth. A light year is the distance light travels in a year, 5.9 trillion miles (9.5 trillion km). This galaxy, with a mass 6.5 billion times that of our sun, is larger and more luminous than our Milky Way.
[…]
Lia Medeiros of the Institute for Advanced Study in Princeton, New Jersey, lead author of the research published in the Astrophysical Journal Letters.
The study’s four authors are members of the Event Horizon Telescope (EHT) project, the international collaboration begun in 2012 with the goal of directly observing a black hole’s immediate environment. A black hole’s event horizon is the point beyond which anything – stars, planets, gas, dust and all forms of electromagnetic radiation – gets swallowed into oblivion.
Medeiros said she and her colleagues plan to use the same technique to improve upon the image of the only other black hole ever pictured – released last year showing the one inhabiting the Milky Way’s center, called Sagittarius A*, or Sgr A*.
The M87 black hole image stems from data collected by seven radio telescopes at five locations on Earth that essentially create a planet-sized observational dish.
“The EHT is a very sparse array of telescopes. This is something we cannot do anything about because we need to put our telescopes on the tops of mountains and these mountains are few and far apart from each other. Most of the Earth is covered by oceans,” said Georgia Tech astrophysicist and study co-author Dimitrios Psaltis.
“As a result, our telescope array has a lot of ‘holes’ and we need to rely on algorithms that allow us to fill in the missing data,” Psaltis added. “The image we report in the new paper is the most accurate representation of the black hole image that we can obtain with our globe-wide telescope.”
The machine-learning technique they used is called PRIMO, short for “principal-component interferometric modeling.”
“This is the first time we have used machine learning to fill in the gaps where we don’t have data,” Medeiros said. “We use a large data set of high-fidelity simulations as a training set, and find an image that is consistent with the data and also is broadly consistent with our theoretical expectations. The fact that the previous EHT results robustly demonstrated that the image is a ring allows us to assume so in our analysis.”
Universal Music Group has told streaming platforms, including Spotify and Apple, to block artificial intelligence services from scraping melodies and lyrics from their copyrighted songs, according to emails viewed by the Financial Times. From the report: UMG, which controls about a third of the global music market, has become increasingly concerned about AI bots using their songs to train themselves to churn out music that sounds like popular artists. AI-generated songs have been popping up on streaming services and UMG has been sending takedown requests “left and right,” said a person familiar with the matter. The company is asking streaming companies to cut off access to their music catalogue for developers using it to train AI technology. “We will not hesitate to take steps to protect our rights and those of our artists,” UMG wrote to online platforms in March, in emails viewed by the FT. “This next generation of technology poses significant issues,” said a person close to the situation. “Much of [generative AI] is trained on popular music. You could say: compose a song that has the lyrics to be like Taylor Swift, but the vocals to be in the style of Bruno Mars, but I want the theme to be more Harry Styles. The output you get is due to the fact the AI has been trained on those artists’ intellectual property.”
Basically they don’t want AI’s listening to their music as an inspiration for them to make music. Which is exactly what humans do. So I’m very curious what legal basis would accept their takedowns.
Today, the Department of Commerce’s National Telecommunications and Information Administration (NTIA) launched a request for comment (RFC) to advance its efforts to ensure artificial intelligence (AI) systems work as claimed – and without causing harm. The insights gathered through this RFC will inform the Biden Administration’s ongoing work to ensure a cohesive and comprehensive federal government approach to AI-related risks and opportunities.
[…]
NTIA’s “AI Accountability Policy Request for Comment” seeks feedback on what policies can support the development of AI audits, assessments, certifications and other mechanisms to create earned trust in AI systems that they work as claimed. Much as financial audits create trust in the accuracy of a business’ financial statements, so for AI, such mechanisms can help provide assurance that an AI system is trustworthy in that it does what it is intended to do without adverse consequences.
[…]
President Biden has been clear that when it comes to AI, we must both support responsible innovation and ensure appropriate guardrails to protect Americans’ rights and safety. The White House Office of Science and Technology Policy’s Blueprint for an AI Bill of Rights provides an important framework to guide the design, development, and deployment of AI and other automated systems. The National Institute of Standards and Technology’s (NIST) AI Risk Management Framework serves as a voluntary tool that organizations can use to manage risks posed by AI systems.
Comments will be due 60 days from publication of the RFC in the Federal Register.
combined Python and a hefty dose of of AI for a fascinating proof of concept: self-healing Python scripts. He shows things working in a video, embedded below the break, but we’ll also describe what happens right here.
The demo Python script is a simple calculator that works from the command line, and [BioBootloader] introduces a few bugs to it. He misspells a variable used as a return value, and deletes the subtract_numbers(a, b) function entirely. Running this script by itself simply crashes, but using Wolverine on it has a very different outcome.In a short time, error messages are analyzed, changes proposed, those same changes applied, and the script re-run.
Wolverine is a wrapper that runs the buggy script, captures any error messages, then sends those errors to GPT-4 to ask it what it thinks went wrong with the code. In the demo, GPT-4 correctly identifies the two bugs (even though only one of them directly led to the crash) but that’s not all! Wolverine actually applies the proposed changes to the buggy script, and re-runs it. This time around there is still an error… because GPT-4’s previous changes included an out of scope return statement. No problem, because Wolverine once again consults with GPT-4, creates and formats a change, applies it, and re-runs the modified script. This time the script runs successfully and Wolverine’s work is done.
LLMs (Large Language Models) like GPT-4 are “programmed” in natural language, and these instructions are referred to as prompts. A large chunk of what Wolverine does is thanks to a carefully-written prompt, and you can read it here to gain some insight into the process. Don’t forget to watch the video demonstration just below if you want to see it all in action.
a novel approach to the problem of scraping web content in a structured way without needing to write the kind of page-specific code web scrapers usually have to deal with. How? Just enlist the help of a natural language AI. Scrapeghost relies on OpenAI’s GPT API to parse a web page’s content, pull out and classify any salient bits, and format it in a useful way.
What makes Scrapeghost different is how data gets organized. For example, when instantiating scrapeghost one defines the data one wishes to extract. For example:
The kicker is that this format is entirely up to you! The GPT models are very, very good at processing natural language, and scrapeghost uses GPT to process the scraped data and find (using the example above) whatever looks like a name, district, party, photo, and office address and format it exactly as requested.
It’s an experimental tool and you’ll need an API key from OpenAI to use it, but it has useful features and is certainly a novel approach. There’s a tutorial and even a command-line interface, so check it out.
Italy’s privacy watchdog said Friday it had blocked the controversial robot ChatGPT, saying the artificial intelligence app did not respect user data and could not verify users’ age.
The decision “with immediate effect” will result in “the temporary limitation of the processing of Italian user data vis-a-vis OpenAI”, the Italian Data Protection Authority said.
The agency has launched an investigation.
[…]
The watchdog said that on March 20, the app experienced a data breach involving user conversations and payment information.
It said there was no legal basis to justify “the mass collection and storage of personal data for the purpose of ‘training’ the algorithms underlying the operation of the platform”.
It also said that since there was no way to verify the age of users, the app “exposes minors to absolutely unsuitable answers compared to their degree of development and awareness.”
It said the company had 20 days to respond how it would address the watchdog’s concerns, under penalty of a 20-million-euro ($21.7-million) fine, or up to 4 percent of annual revenues.
I am pretty sure none of the search engines verify age and store user data (ok duckduckgo is an exception) or give answers that may “expose” the little snowflake “minors to absolutely unsuitable answers compared to their degree of development and awareness.”
There is a race on to catch up to OpenAI and people are obviously losing, so crushing OpenAI is the way to go.
A public challenge could put a temporary stop to the deployment of ChatGPT and similar AI systems. The nonprofit research organization Center for AI and Digital Policy (CAIDP) has filed a complaint with the Federal Trade Commission (FTC) alleging that OpenAI is violating the FTC Act through its releases of large language AI models like GPT-4. That model is “biased, deceptive” and threatens both privacy and public safety, CAIDP claims. Likewise, it supposedly fails to meet Commission guidelines calling for AI to be transparent, fair and easy to explain.
The Center wants the FTC to investigate OpenAI and suspend future releases of large language models until they meet the agency’s guidelines. The researchers want OpenAI to require independent reviews of GPT products and services before they launch. CAIDP also hopes the FTC will create an incident reporting system and formal standards for AI generators.
We’ve asked OpenAI for comment. The FTC has declined to comment. CAIDP president Marc Rotenberg was among those who signed an open letter demanding that OpenAI and other AI researchers pause work for six months to give time for ethics discussions. OpenAI founder Elon Musk also signed the letter.
Critics of ChatGPT, Google Bard and similar models have warned of problematic output, including inaccurate statements, hate speech and bias. Users also can’t repeat results, CAIDP says. The Center points out that OpenAI itself warns AI can “reinforce” ideas whether or not they’re true. While upgrades like GPT-4 are more reliable, there’s a concern people may rely on the AI without double-checking its content.
There’s no guarantee the FTC will act on the complaint. If it does set requirements, though, the move would affect development across the AI industry. Companies would have to wait for assessments, and might face more repercussions if their models fail to meet the Commission’s standards. While this might improve accountability, it could also slow the currently rapid pace of AI development.
Every LLN AI being released right now is pretty clear that it’s not a single source of truth, that mistakes will be made and that you need to check the output yourself. The signing of the letter to stop AI development smacks of people who are so far behind in the race wanting to quietly catch up until the moratorium is lifted and this action sounds a lot like this organisation being in someone’s pocket.
Several months ago, Socket, which makes a freemium security scanner for JavaScript and Python projects, connected OpenAI’s ChatGPT model (and more recently its GPT-4 model) to its internal threat feed.
The results, according to CEO Feross Aboukhadijeh, were surprisingly good. “It worked way better than expected,” he told The Register in an email. “Now I’m sitting on a couple hundred vulnerabilities and malware packages and we’re rushing to report them as quick as we can.”
Socket’s scanner was designed to detect supply chain attacks. Available as a GitHub app or a command line tool, it scans JavaScript and Python projects in an effort to determine whether any of the many packages that may have been imported from the npm or PyPI registries contain malicious code.
Aboukhadijeh said Socket has confirmed 227 vulnerabilities, all using ChatGPT. The vulnerabilities fall into different categories and don’t share common characteristics.
The Register was provided with numerous examples of published packages that exhibited malicious behavior or unsafe practices, including: information exfiltration, SQL injection, hardcoded credentials, potential privilege escalation, and backdoors.
We were asked not to share several examples as they have yet to be removed, but here’s one that has already been dealt with.
mathjs-min“Socket reported this to npm and it has been removed,” said Aboukhadijeh. “This was a pretty nasty one.”
AI analysis: “The script contains a discord token grabber function which is a serious security risk. It steals user tokens and sends them to an external server. This is malicious behavior.”
“There are some interesting effects as well, such as things that a human might be persuaded of but the AI is marking as a risk,” Aboukhadijeh added.
“These decisions are somewhat subjective, but the AI is not dissuaded by comments claiming that a dangerous piece of code is not malicious in nature. The AI even includes a humorous comment indicating that it doesn’t trust the inline comment.”
Example trello-enterprise
AI analysis: “The script collects information like hostname, username, home directory, and current working directory and sends it to a remote server. While the author claims it is for bug bounty purposes, this behavior can still pose a privacy risk. The script also contains a blocking operation that can cause performance issues or unresponsiveness.”
Aboukhadijeh explained that the software packages at these registries are vast and it’s difficult to craft rules that thoroughly plumb the nuances of every file, script, and bit of configuration data. Rules tend to be fragile and often produce too much detail or miss things a savvy human reviewer would catch.
Applying human analysis to the entire corpus of a package registry (~1.3 million for npm and ~450,000 for PyPI) just isn’t feasible, but machine learning models can pick up some of the slack by helping human reviewers focus on the more dubious code modules.
“Socket is analyzing every npm and PyPI package with AI-based source code analysis using ChatGPT,” said Aboukhadijeh.
“When it finds something problematic in a package, we flag it for review and ask ChatGPT to briefly explain its findings. Like all AI-based tooling, this may produce some false positives, and we are not enabling this as a blocking issue until we gather more feedback on the feature.”
Aboukhadijeh provided The Register with a sample report from its ChatGPT helper that identifies risky, though not conclusively malicious behavior. In this instance, the machine learning model offered this assessment, “This script collects sensitive information about the user’s system, including username, hostname, DNS servers, and package information, and sends it to an external server.”
Screenshot of ChatGPT report for Socket security scanner – Click to enlarge
What a ChatGPT-based Socket advisory looks like … Click to enlarge
According to Aboukhadijeh, Socket was designed to help developers make informed decisions about risk in a way that doesn’t interfere with their work. So raising the alarm about every install script – a common attack vector – can create too much noise. Analysis of these scripts using a large language model dials the alarm bell down and helps developers recognize real problems. And these models are becoming more capable.
Earlier today, more than 1,100 artificial intelligence experts, industry leaders and researchers signed a petition calling on AI developers to stop training models more powerful than OpenAI’s ChatGPT-4 for at least six months. Among those who refrained from signing it was Eliezer Yudkowsky, a decision theorist from the U.S. and lead researcher at the Machine Intelligence Research Institute. He’s been working on aligning Artificial General Intelligence since 2001 and is widely regarded as a founder of the field.
“This 6-month moratorium would be better than no moratorium,” writes Yudkowsky in an opinion piece for Time Magazine. “I refrained from signing because I think the letter is understating the seriousness of the situation and asking for too little to solve it.” Yudkowsky cranks up the rhetoric to 100, writing: “If somebody builds a too-powerful AI, under present conditions, I expect that every single member of the human species and all biological life on Earth dies shortly thereafter.” Here’s an excerpt from his piece: The key issue is not “human-competitive” intelligence (as the open letter puts it); it’s what happens after AI gets to smarter-than-human intelligence. Key thresholds there may not be obvious, we definitely can’t calculate in advance what happens when, and it currently seems imaginable that a research lab would cross critical lines without noticing. […] It’s not that you can’t, in principle, survive creating something much smarter than you; it’s that it would require precision and preparation and new scientific insights, and probably not having AI systems composed of giant inscrutable arrays of fractional numbers. […]
It took more than 60 years between when the notion of Artificial Intelligence was first proposed and studied, and for us to reach today’s capabilities. Solving safety of superhuman intelligence — not perfect safety, safety in the sense of “not killing literally everyone” — could very reasonably take at least half that long. And the thing about trying this with superhuman intelligence is that if you get that wrong on the first try, you do not get to learn from your mistakes, because you are dead. Humanity does not learn from the mistake and dust itself off and try again, as in other challenges we’ve overcome in our history, because we are all gone.
Trying to get anything right on the first really critical try is an extraordinary ask, in science and in engineering. We are not coming in with anything like the approach that would be required to do it successfully. If we held anything in the nascent field of Artificial General Intelligence to the lesser standards of engineering rigor that apply to a bridge meant to carry a couple of thousand cars, the entire field would be shut down tomorrow. We are not prepared. We are not on course to be prepared in any reasonable time window. There is no plan. Progress in AI capabilities is running vastly, vastly ahead of progress in AI alignment or even progress in understanding what the hell is going on inside those systems. If we actually do this, we are all going to die. You can read the full letter signed by AI leaders here.
with Microsoft’s unveiling of the new Security Copilot AI at its inaugural Microsoft Secure event. The automated enterprise-grade security system is powered by OpenAI’s GPT-4, runs on the Azure infrastructure and promises admins the ability “to move at the speed and scale of AI.”
Security Copilot is similar to the large language model (LLM) that drives the Bing Copilot feature, but with a training geared heavily towards network security rather than general conversational knowledge and web search optimization. […]
“Just since the pandemic, we’ve seen an incredible proliferation [in corporate hacking incidents],”Jakkal told Bloomberg. For example, “it takes one hour and 12 minutes on average for an attacker to get full access to your inbox once a user has clicked on a phishing link. It used to be months or weeks for someone to get access.”
[…]
Jakkal anticipates these new capabilities enabling Copilot-assisted admins to respond within minutes to emerging security threats, rather than days or weeks after the exploit is discovered. Being a brand new, untested AI system, Security Copilot is not meant to operate fully autonomously, a human admin needs to remain in the loop. “This is going to be a learning system,” she said. “It’s also a paradigm shift: Now humans become the verifiers, and AI is giving us the data.”
To more fully protect the sensitive trade secrets and internal business documents Security Copilot is designed to protect, Microsoft has also committed to never use its customers data to train future Copilot iterations. Users will also be able to dictate their privacy settings and decide how much of their data (or the insights gleaned from it) will be shared. The company has not revealed if, or when, such security features will become available for individual users as well.
This is a plugin for ChatGPT that enables semantic search and retrieval of personal or organizational documents. It allows users to obtain the most relevant document snippets from their data sources, such as files, notes, or emails, by asking questions or expressing needs in natural language. Enterprises can make their internal documents available to their employees through ChatGPT using this plugin.
[…]
Users can refine their search results by using metadata filters by source, date, author, or other criteria. The plugin can be hosted on any cloud platform that supports Docker containers, such as Fly.io, Heroku or Azure Container Apps. To keep the vector database updated with the latest documents, the plugin can process and store documents from various data sources continuously
[…]
A notable feature of the Retrieval Plugin is its capacity to provide ChatGPT with memory. By utilizing the plugin’s upsert endpoint, ChatGPT can save snippets from the conversation to the vector database for later reference (only when prompted to do so by the user). This functionality contributes to a more context-aware chat experience by allowing ChatGPT to remember and retrieve information from previous conversations. Learn how to configure the Retrieval Plugin with memory here.
Apple has quietly acquired a Mountain View-based startup, WaveOne, that was developing AI algorithms for compressing video.
Apple wouldn’t confirm the sale when asked for comment. But WaveOne’s website was shut down around January, and severalformeremployees, including one of WaveOne’s co-founders, now work within Apple’s various machine learning groups.
WaveOne’s former head of sales and business development, Bob Stankosh, announced the sale in a LinkedIn post published a month ago.
“After almost two years at WaveOne, last week we finalized the sale of the company to Apple,” Stankosh wrote. “We started our journey at WaveOne, realizing that machine learning and deep learning video technology could potentially change the world. Apple saw this potential and took the opportunity to add it to their technology portfolio.”
[…]
WaveOne’s main innovation was a “content-aware” video compression and decompression algorithm that could run on the AI accelerators built into many phones and an increasing number of PCs. Leveraging AI-powered scene and object detection, the startup’s technology could essentially “understand” a video frame, allowing it to, for example, prioritize faces at the expense of other elements within a scene to save bandwidth.
WaveOne also claimed that its video compression tech was robust to sudden disruptions in connectivity. That is to say, it could make a “best guess” based on whatever bits it had available, so when bandwidth was suddenly restricted, the video wouldn’t freeze; it’d just show less detail for the duration.
WaveOne claimed its approach, which was hardware-agnostic, could reduce the size of video files by as much as half, with better gains in more complex scenes.
[…]
Even minor improvements in video compression could save on bandwidth costs, or enable services like Apple TV+ to deliver higher resolutions and framerates depending on the type of content being streamed.
YouTube’s already doing this. Last year, Alphabet’s DeepMind adapted a machine learning algorithm originally developed to play board games to the problem of compressing YouTube videos, leading to a 4% reduction in the amount of data the video-sharing service needs to stream to users.
No lights. No camera. All action.Realistically and consistently synthesize new videos. Either by applying the composition and style of an image or text prompt to the structure of a source video (Video to Video). Or, using nothing but words (Text to Video). It’s like filming something new, without filming anything at all.
[…] Introduced last summer after a year-long technical trial, Copilot offers coding suggestions, though not always good ones, to developers using GitHub with supported text editors and IDEs, like Visual Studio Code.
As of last month, according to GitHub, Copilot had a hand in 46 percent of the code being created on Microsoft’s cloud repo depot and had helped developers program up to 55 percent faster.
On Wednesday, Copilot – an AI “pair programmer”, as GitHub puts it – will be ready to converse with developers ChatGPT-style in either Visual Studio Code or Visual Studio. Prompt-and-response conversations take place in an IDE sidebar chat window, as opposed to the autocompletion responses that get generated from comment-based queries in a source file.
“Copilot chat is not just a chat window,” said Dohmke. “It recognizes what code a developer has typed, what error messages are shown, and it’s deeply embedded into the IDE.”
A developer thus can highlight, say, a regex in a source file and invite Copilot to explain what the obtuse pattern matching expression does. Copilot can also be asked to generate tests, to analyze and debug, to propose a fix, or to attempt a custom task. The model can even add comments that explain source code and can clean files up like a linter.
More interesting still, Copilot can be addressed by voice. Using spoken prompts, the assistive software can produce (or reproduce) code and run it on demand. It’s a worthy accessibility option at least.
[…]
When making a pull request under the watchful eye of AI, developers can expect to find GitHub’s model will fill out tags that serve to provide additional information about what’s going on. It then falls to developers to accept or revise the suggestions.
[…]
What’s more, Copilot’s ambit has been extended to documentation. Starting with documentation for React, Azure Docs, and MDN, developers can pose questions and get AI-generated answers through a chat interface. In time, according to Dohmke, the ability to interact with documentation via a chat interface will be extended to any organization’s repositories and internal documentation.
[…]
GitHub has even helped Copilot colonize the command line, with GitHub Copilot CLI. If you’ve ever forgotten an obscure command line incantation or command flag, Copilot has you covered
We show how a malicious learner can plant an undetectable backdoor into a classifier. On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation. Importantly, without the appropriate “backdoor key,” the mechanism is hidden and cannot be detected by any computationally-bounded observer. We demonstrate two frameworks for planting undetectable backdoors, with incomparable guarantees.•First, we show how to plant a backdoor in any model, using digital signature schemes. The construction guarantees that given query access to the original model and the backdoored version, it is computationally infeasible to find even a single input where they differ. This property implies that the backdoored model has generalization error comparable with the original model. Moreover, even if the distinguisher can request backdoored inputs of its choice, they cannot backdoor a new input—a property we call non-replicability.•Second, we demonstrate how to insert undetectable backdoors in models trained using the Random Fourier Features (RFF) learning paradigm (Rahimi, Recht; NeurIPS 2007). In this construction, undetectability holds against powerful white-box distinguishers: given a complete description of the network and the training data, no efficient distinguisher can guess whether the model is “clean” or contains a backdoor.
[…]
Our construction of undetectable backdoors also sheds light on the related issue of robustness to adversarial examples. In particular, by constructing undetectable backdoor for an “adversarially-robust” learning algorithm, we can produce a classifier that is indistinguishable from a robust classifier, but where every input has an adversarial example! In this way, the existence of undetectable backdoors represent a significant theoretical roadblock to certifying adversarial robustness.
Soon, an Amazon corporate lawyer chimed in. She warned employees not to provide ChatGPT with “any Amazon confidential information (including Amazon code you are working on),” according to a screenshot of the message seen by Insider.
The attorney, a senior corporate counsel at Amazon, suggested employees follow the company’s existing conflict of interest and confidentiality policies because there have been “instances” of ChatGPT responses looking similar to internal Amazon data.
“This is important because your inputs may be used as training data for a further iteration of ChatGPT, and we wouldn’t want its output to include or resemble our confidential information (and I’ve already seen instances where its output closely matches existing material),” the lawyer wrote.
[…]
“OpenAI is far from transparent about how they use the data, but if it’s being folded into training data, I would expect corporations to wonder: After a few months of widespread use of ChatGPT, will it become possible to extract private corporate information with cleverly crafted prompts?” said Emily Bender, who teaches computational linguistics at University of Washington.
[…]
some Amazonians are already using the AI tool as a software “coding assistant” by asking it to improve internal lines of code, according to Slack messages seen by Insider.
[…]
For Amazon employees, data privacy seems to be the least of their concerns. They said using the chatbot at work has led to “10x in productivity,” and many expressed a desire to join internal teams developing similar services.
[…] As games grow bigger in scope, writers are facing the ratcheting challenge of keeping NPCs individually interesting and realistic. How do you keep each interaction with them – especially if there are hundreds of them – distinct? This is where Ghostwriter, an in-house AI tool created by Ubisoft’s R&D department, La Forge, comes in.
Ghostwriter isn’t replacing the video game writer, but instead, alleviating one of the video game writer’s most laborious tasks: writing barks. Ghostwriter effectively generates first drafts of barks – phrases or sounds made by NPCs during a triggered event – which gives scriptwriters more time to polish the narrative elsewhere. Ben Swanson, R&D Scientist at La Forge Montreal, is the creator of Ghostwriter, and remembers the early seeds of it ahead of his presentation of the tech at GDC this year.
[…]
Ghostwriter is the result of conversations with narrative designers who revealed a challenge, one that Ben identified could be solved with an AI tool. Crowd chatter and barks are central features of player immersion in games – NPCs speaking to each other, enemy dialogue during combat, or an exchange triggered when entering an area all provide a more realistic world experience and make the player feel like the game around them exists outside of their actions. However, both require time and creative effort from scriptwriters that could be spent on other core plot items. Ghostwriter frees up that time, but still allows the scriptwriters a degree of creative control.
“Rather than writing first draft versions themselves, Ghostwriter lets scriptwriters select and polish the samples generated,” Ben explains. This way, the tech is a tool used by the teams to support them in their creative journey, with every interaction and feedback originating from the members who use it.
As a summary of its process, scriptwriters first create a character and a type of interaction or utterance they would like to generate. Ghostwriter then proposes a select number of variations which the scriptwriter can then choose and edit freely to fit their needs. This process uses pairwise comparison as a method of evaluation and improvement. This means that, for each variation generated, Ghostwriter provides two choices which will be compared and chosen by the scriptwriter. Once one is selected, the tool learns from the preferred choice and, after thousands of selections made by humans, it becomes more effective and accurate.
[…]
The team’s ambition is to give this AI power to narrative designers, who will be able to eventually create their own AI system themselves, tailored to their own design needs. To do this, they created a user-friendly back-end tool website called Ernestine, which allows anyone to create their own machine learning models used in Ghostwriter. Their hope is that teams consider Ghostwriter before they start their narrative process and create their models with a vision in mind, effectively making the tech an integral part of the production pipeline.
Last month, Roblox outlined its vision for AI-assisted content creation, imagining a future where Generative AI could help users create code, 3D models and more with little more than text prompts. Now, it’s taking its first steps toward allowing “every user on Roblox to be a creator” by launching its first AI tools: Code Assist and Material Generator, both in beta.
Although neither tool is anywhere close to generating a playable Roblox experience from a text description, Head of Roblox Studio Stef Corazzatold an audience at GDC 2023 that they can “help automate basic coding tasks so you can focus on creative work.” For now, that means being able to generate useful code snippets and object textures based on short prompts. Roblox’s announcement for the tools offers a few examples, generating realistic textures for a “bright red rock canyon” and “stained glass,” or producing several lines of functional code that will that make certain objects change color and self-destruct after a player interacts with them.
The technology, developed by Pittsburgh, Pennsylvania startup Abridge, aims to reduce workloads for clinicians and improve care for patients. Shivdev Rao, the company’s CEO and a cardiologist, told The Register doctors can spend hours writing up notes from their previous patient sessions outside their usual work schedules.
“That really adds up over time, and I think it has contributed in large part to this public health crisis that we have right now around doctors and nurses burning out and leaving the profession.” Clinicians will often have to transcribe audio recordings or recall conversations from memory when writing their notes, she added.
[…]
Abridge’s software automatically generates summaries of medical conversations using AI and natural language processing algorithms. In a short demo, The Register pretended to be a mock patient talking to Rao about suffering from shortness of breath, diabetes, and drinking three bottles of wine every week. Abridge’s software was able to note down things like symptoms, medicines recommended by the doctor, and actions the clinician should follow up on in future appointments.
The code works by listening out for keywords and classifying important information. “If I said take Metoprolol twice, an entity would be Metoprolol, and then twice a day would be an attribute. And if I said by mouth, that’s another attribute. And we could do the same thing with the wine example. Wine would be an entity, and an attribute would be three bottles, and other attribute every night.”
“We’re creating a structured data dataset; [the software is] classifying everything that I said and you said into different categories of the conversation. But then once it’s classified all the information, the last piece is generative.”
At this point, Rao explained Abridge uses a transformer-based model to generate a document piecing together the classified information into short sentences under various subsections describing a patient’s previous history of illness, future plans or actions to take.
[…]
Physicians can edit the notes further, whilst patients can access them in an app. Rao likened Abridge’s technology to a copilot, and was keen to emphasize that doctors remain in charge, and should check and edit the generated notes if necessary. Both patients and doctors also have access to recordings of their meetings, and can click on specific keywords to have the software play back parts of the audio when the specific word was uttered during their conversation.
“We’re going all the way from the summary we put in front of users and we’re tracing it back to the ground truth of the conversation. And so if I have a conversation, and I couldn’t recall something happening, I can always double-check that this wasn’t a hallucination. There are models in between that are making sure to not expose something that was not discussed.”
Microsoft on Tuesday announced that it is using an advanced version of Open AI’s DALL-E image generator to power its own Bing and Edge browser. Like DALL-E before it, the newly announced Bing Image Creator will generate a set of images for users based on a line of written text. The addition of image content in Bing further entrenches its early lead against competitors in Big Tech’s rapidly evolving race for AI dominance. Google announced it opened access to its Bard chatbot the same day, nearly a month after Microsoft added ChatGPT to Bing.
“By typing in a description of an image, providing additional context like location or activity, and choosing an art style, Image Creator will generate an image from your own imagination,” Microsoft head of consumer marketing Yusuf Mehdi said in a statement. “It’s like your creative copilot.”
For the Edge browser, Microsoft says its new Image creator will appear as a new icon in the Edge sidebar