GPU Performance Comparison Shows Surprising Variability

Think one GPU is much like another? Think again. It turns out that there’s surprising variability in the performance delivered by chips of the same model. That can make getting your money’s worth by renting time on a GPU from a cloud provider a real roll of the dice, according to research from the College of William & Mary, Jefferson Lab, and Silicon Data.

“It’s called the silicon lottery,” says Carmen Li, founder and CEO of Silicon Data, which tracks GPU rental prices and benchmarks cloud-computing performance.

The silicon lottery’s existence has been known since at least 2022, when researchers at the University of Wisconsin tied it to variations in the performance of GPU-dependent supercomputers. Li and her colleagues figured that the effect would be even more pronounced for AI cloud customers.

Performance varies for GPU models in the cloud

So they ran 6,800 instances of the index firm’s benchmark test on 3,500 randomly selected GPUs operated by 11 cloud-computing providers. The 3,500 GPUs comprised 11 models of Nvidia GPU, the most advanced being the Nvidia H200 SXM. (The team wasn’t just picking on Nvidia; the GPU giant makes up most of the rental cloud market.)

The benchmark, called SiliconMark, is meant to provide a snapshot of a GPU’s ability to run large language models, or LLMs. It tests 16-bit floating-point computing performance, measured in trillions of operations per second, and a GPU’s internal-memory bandwidth, measured in gigabytes per second. The results showed that the computing performance varied for all models, but for the 259 H100 PCIe GPUs it differed by as much as 34.5 percent, and the memory bandwidth of the 253 H200 SXM GPUs varied by as much as 38 percent.
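To make a figure like “differed by as much as 34.5 percent” concrete, here is a minimal Python sketch of one way such a spread could be computed. The researchers’ exact formula is not given here, so the definition used (the gap between the best and worst result, relative to the best) is an assumption, and the throughput numbers are invented for illustration.

```python
def relative_spread(measurements):
    """Gap between best and worst result, as a percentage of the best.

    Assumed definition for illustration; the study's formula is not stated.
    """
    if not measurements:
        raise ValueError("need at least one measurement")
    worst, best = min(measurements), max(measurements)
    if best <= 0:
        raise ValueError("measurements must be positive")
    return 100.0 * (best - worst) / best


# Hypothetical 16-bit throughput readings for four chips of one model,
# in trillions of operations per second (not real benchmark data).
hypothetical_tops = [700.0, 655.0, 610.0, 560.0]
print(f"spread: {relative_spread(hypothetical_tops):.1f} percent")
```

Under this definition, the slowest chip in the made-up sample delivers a fifth less throughput than the fastest one, even though all four carry the same model name.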

Chart comparing GPU internal memory bandwidth by model, from Tesla T4 to H200 SXM.

SOURCE: SILICON DATA

Differences in how the GPU is cooled, how cloud operators configure their computers, and how much use the chip has seen can all contribute to differences in performance of otherwise identical chips. But Silicon Data’s analysis showed that the real culprit was differences in the chips themselves, likely due to manufacturing issues.

Such randomness has real dollars-and-cents consequences, the researchers argue, because there’s a chance that a pricier, more advanced GPU won’t deliver better performance than an older model chip.

So what ought to GPU renters do? “Probably the most sensible strategy is to benchmark the precise rental they obtain,” says Jason Cornick, head of infrastructure at Silicon Knowledge. “Working a benchmark device [such as SiliconMark] permits them to check their particular occasion’s efficiency towards a broader corpus of knowledge.”
