Cerebras’ wafer-size chip is 10,000 times faster than a GPU

Cerebras Systems and the federal Department of Energy’s National Energy Technology Laboratory today announced that the company’s CS-1 system is more than 10,000 times faster than a graphics processing unit (GPU).

On a practical level, this means AI neural networks that previously took months to train can now train in minutes on the Cerebras system.

Cerebras makes the world’s largest computer chip, the WSE. Chipmakers normally slice a wafer from a 12-inch-diameter ingot of silicon to process in a chip factory. Once processed, the wafer is sliced into hundreds of separate chips that can be used in electronic hardware.

But Cerebras, started by SeaMicro founder Andrew Feldman, takes that wafer and makes a single, massive chip out of it. Each piece of the chip, dubbed a core, is interconnected in a sophisticated way to other cores. The interconnections are designed to keep all the cores functioning at high speeds so the transistors can work together as one.

Cerebras’s CS-1 system uses the WSE wafer-size chip, which has 1.2 trillion transistors, the basic on-off electronic switches that are the building blocks of silicon chips. Intel’s first 4004 processor in 1971 had 2,300 transistors, and the Nvidia A100 80GB chip, announced yesterday, has 54 billion transistors.

Feldman said in an interview with VentureBeat that the CS-1 was also 200 times faster than the Joule Supercomputer, which is No. 82 on a list of the top 500 supercomputers in the world.

“It shows record-shattering performance,” Feldman said. “It also shows that wafer scale technology has applications beyond AI.”

Above: The Cerebras WSE has 1.2 trillion transistors compared to Nvidia’s largest GPU, the A100 at 54.2 billion transistors.

These are fruits of the radical approach Los Altos, California-based Cerebras has taken, creating a silicon wafer with 400,000 AI cores on it instead of slicing that wafer into individual chips. The unusual design makes it a lot easier to accomplish tasks because the processor and memory are closer to each other and have lots of bandwidth to connect them, Feldman said. The question of how widely applicable the approach is to different computing tasks remains.

A paper based on the results of Cerebras’ work with the federal lab said the CS-1 can deliver performance that is unattainable with any number of central processing units (CPUs) and GPUs, which are both commonly used in supercomputers. (Nvidia’s GPUs are used in 70% of the top supercomputers now). Feldman added that this is true “no matter how large that supercomputer is.”

Source: Cerebras’ wafer-size chip is 10,000 times faster than a GPU | VentureBeat