Integrating OpenAI’s ChatGPT and GPT-4: Socket’s story with code vulnerability scanning (it works very well)

Several months ago, Socket, which makes a freemium security scanner for JavaScript and Python projects, connected OpenAI’s ChatGPT model (and more recently its GPT-4 model) to its internal threat feed.

The results, according to CEO Feross Aboukhadijeh, were surprisingly good. “It worked way better than expected,” he told The Register in an email. “Now I’m sitting on a couple hundred vulnerabilities and malware packages and we’re rushing to report them as quick as we can.”

Socket’s scanner was designed to detect supply chain attacks. Available as a GitHub app or a command line tool, it scans JavaScript and Python projects in an effort to determine whether any of the many packages that may have been imported from the npm or PyPI registries contain malicious code.

Aboukhadijeh said Socket has confirmed 227 vulnerabilities, all identified using ChatGPT. The vulnerabilities fall into different categories and don’t share common characteristics.

The Register was provided with numerous examples of published packages that exhibited malicious behavior or unsafe practices, including: information exfiltration, SQL injection, hardcoded credentials, potential privilege escalation, and backdoors.

We were asked not to share several examples as they have yet to be removed, but here’s one that has already been dealt with.

  1. mathjs-min: “Socket reported this to npm and it has been removed,” said Aboukhadijeh. “This was a pretty nasty one.”
    1. AI analysis: “The script contains a discord token grabber function which is a serious security risk. It steals user tokens and sends them to an external server. This is malicious behavior.”
    2. https://socket.dev/npm/package/mathjs-min/files/11.7.2/lib/cjs/plain/number/arithmetic.js#L28

“There are some interesting effects as well, such as things that a human might be persuaded of but the AI is marking as a risk,” Aboukhadijeh added.

“These decisions are somewhat subjective, but the AI is not dissuaded by comments claiming that a dangerous piece of code is not malicious in nature. The AI even includes a humorous comment indicating that it doesn’t trust the inline comment.”

  1. Example: trello-enterprises
    1. AI analysis: “The script collects information like hostname, username, home directory, and current working directory and sends it to a remote server. While the author claims it is for bug bounty purposes, this behavior can still pose a privacy risk. The script also contains a blocking operation that can cause performance issues or unresponsiveness.”
    2. https://socket.dev/npm/package/trello-enterprises/files/1000.1000.1000/a.js

Aboukhadijeh explained that the number of software packages in these registries is vast, and that it’s difficult to craft rules that thoroughly plumb the nuances of every file, script, and bit of configuration data. Rules tend to be fragile and often produce too much detail or miss things a savvy human reviewer would catch.
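To illustrate why hand-written rules tend to be fragile, here is a minimal sketch of a naive pattern-based check. The rule, the sample snippets, and the function name `naive_rule_flags` are hypothetical and are not taken from Socket's scanner; they only show how a single pattern can both over-report and under-report.

```python
import re

# Hypothetical rule (an assumption for illustration): flag any source that
# mentions eval( or child_process.
SUSPICIOUS = re.compile(r"\beval\(|child_process")


def naive_rule_flags(source: str) -> bool:
    """Return True when the naive pattern matches anywhere in the source."""
    return SUSPICIOUS.search(source) is not None


# A harmless comment trips the rule (false positive).
benign = "// never call eval() on user input\nmodule.exports = x => x + 1;"

# The same dangerous call, lightly obfuscated, slips past it (false negative).
obfuscated = "const f = globalThis['ev' + 'al']; f(payload);"

print(naive_rule_flags(benign))      # True
print(naive_rule_flags(obfuscated))  # False
```

A human reviewer would reach the opposite verdict on both snippets, which is the gap the article describes.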

Applying human analysis to the entire corpus of a package registry (~1.3 million for npm and ~450,000 for PyPI) just isn’t feasible, but machine learning models can pick up some of the slack by helping human reviewers focus on the more dubious code modules.

“Socket is analyzing every npm and PyPI package with AI-based source code analysis using ChatGPT,” said Aboukhadijeh.

“When it finds something problematic in a package, we flag it for review and ask ChatGPT to briefly explain its findings. Like all AI-based tooling, this may produce some false positives, and we are not enabling this as a blocking issue until we gather more feedback on the feature.”
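The flag-then-explain workflow in that quote might be sketched roughly as follows. This is an assumption-laden outline, not Socket's implementation: the `Analyzer` interface, the `Advisory` record, the `triage` function, and the `fake_analyzer` stand-in are all hypothetical, and no real model API is called.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical model interface: takes source text and returns a short
# explanation when something looks problematic, or None otherwise.
Analyzer = Callable[[str], Optional[str]]


@dataclass
class Advisory:
    package: str
    path: str
    explanation: str
    blocking: bool = False  # advisory only; findings do not block installs


def triage(package: str, files: dict[str, str], analyze: Analyzer) -> list[Advisory]:
    """Run the analyzer over each file and queue findings for human review."""
    advisories = []
    for path, source in files.items():
        explanation = analyze(source)
        if explanation is not None:
            advisories.append(Advisory(package, path, explanation))
    return advisories


# Stand-in analyzer so the sketch runs without any network call.
def fake_analyzer(source: str) -> Optional[str]:
    if "os.hostname()" in source and "https.request" in source:
        return "Collects the hostname and sends it to a remote server."
    return None


files = {
    "index.js": "module.exports = (a, b) => a + b;",
    "a.js": "const h = os.hostname(); https.request(url).end(h);",
}
results = triage("example-package", files, fake_analyzer)
for adv in results:
    print(adv.path, "->", adv.explanation)
```

Keeping `blocking` false by default reflects the cautious rollout described in the quote: findings go to a review queue, where false positives can be weeded out, rather than failing a build.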

Aboukhadijeh provided The Register with a sample report from Socket’s ChatGPT helper that identifies risky, though not conclusively malicious, behavior. In this instance, the machine learning model offered this assessment: “This script collects sensitive information about the user’s system, including username, hostname, DNS servers, and package information, and sends it to an external server.”

Screenshot of ChatGPT report for Socket security scanner

Screenshot: what a ChatGPT-based Socket advisory looks like

According to Aboukhadijeh, Socket was designed to help developers make informed decisions about risk in a way that doesn’t interfere with their work. So raising the alarm about every install script – a common attack vector – can create too much noise. Analysis of these scripts using a large language model dials the alarm bell down and helps developers recognize real problems. And these models are becoming more capable.
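As a rough illustration of the install-script problem, the sketch below pulls only the install-time lifecycle scripts out of a `package.json`, which is the kind of narrow input one might hand to a model instead of alerting on every script. The function name `install_scripts` and the sample manifest are hypothetical; the three hook names are npm's standard install lifecycle scripts.

```python
import json

# npm runs these lifecycle scripts automatically during `npm install`.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_scripts(package_json_text: str) -> dict[str, str]:
    """Return only the install-time lifecycle scripts declared in a package.json."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    return {name: cmd for name, cmd in scripts.items() if name in INSTALL_HOOKS}


# Hypothetical manifest used only for this example.
manifest = json.dumps({
    "name": "example-package",
    "version": "1.0.0",
    "scripts": {
        "test": "node test.js",
        "postinstall": "node a.js",
    },
})
print(install_scripts(manifest))  # {'postinstall': 'node a.js'}
```

Flagging every package that returns a non-empty result here would be noisy, since many legitimate packages use install hooks; the point of the model-based step is to judge what the referenced script actually does.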

[…]

Source: Integrating OpenAI’s ChatGPT and GPT-4: Socket’s story • The Register

Robin Edgar

Organisational Structures | Technology and Science | Military, IT and Lifestyle consultancy | Social, Broadcast & Cross Media | Flying aircraft

 robin@edgarbv.com  https://www.edgarbv.com