Nvidia CEO Jensen Huang speaks during a press conference at MGM during CES 2018 in Las Vegas on January 7, 2018.
Mandel Ngan | AFP | Getty Images
Software capable of writing passages of text or drawing images that resemble human creations has sparked a gold rush in the technology industry.
Companies like Microsoft and Google are racing to integrate cutting-edge AI into their search engines, while billion-dollar rivals such as OpenAI and Stability AI are rushing to release their software to the public.
Powering many of these applications is a roughly $10,000 chip that has become one of the AI industry’s most critical tools: the Nvidia A100.
The A100 has become the “workhorse” for AI professionals right now, says Nathan Benaich, an investor who publishes a newsletter and report covering the AI industry, including a partial list of supercomputers using A100s. Nvidia holds 95% of the market for graphics processors that can be used for machine learning, according to New Street Research.
The A100 is ideally suited to the kind of machine learning models that power tools like ChatGPT, Bing AI or Stable Diffusion. It can perform many simple calculations simultaneously, which is important for training and using neural network models.
The technology behind the A100 was originally used to render sophisticated 3D graphics in games. It’s often called a graphics processor or GPU, but these days Nvidia’s A100 is configured and targeted at machine learning tasks and running in data centers, not shiny gaming PCs.
Large companies or startups working on software such as chatbots and image generators require hundreds or thousands of Nvidia’s chips, and either buy them outright or secure access to the computers from a cloud provider.
Hundreds of GPUs are needed to train artificial intelligence models, such as large language models. The chips must be powerful enough to quickly process terabytes of data to recognize patterns. After that, GPUs like the A100 are also needed for “inference” or using the model to generate text, make predictions, or identify objects inside photos.
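To see why GPUs fit this workload, note that the central operation in both training and inference is a matrix multiply, in which every output value is an independent multiply-and-add that a GPU’s thousands of cores can compute at the same time. The plain-Python sketch below is only an illustration of that math, not how a real model runs — production systems execute it on GPU hardware across billions of values:

```python
# Illustrative only: the core math a GPU parallelizes during training/inference.
# Each entry of the output is an independent dot product, so a GPU can compute
# all of them simultaneously instead of one at a time as Python does here.

def matmul(a, b):
    """Multiply matrix a (m x n) by matrix b (n x p), element by element."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

x = [[1.0, 2.0], [3.0, 4.0]]             # a batch of 2 inputs, 2 features each
w = [[0.5, 0.0, 1.0], [0.0, 0.5, 1.0]]   # layer weights: 2 inputs -> 3 outputs

y = matmul(x, w)                         # every entry of y is independent work
print(y)  # [[0.5, 1.0, 3.0], [1.5, 2.0, 7.0]]
```

A neural network stacks millions of these operations, which is why a chip built to run many simple calculations at once outperforms a general-purpose processor for this job.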
This means that AI companies must have access to many A100s. Some entrepreneurs in the space even see the number of A100s they have access to as a sign of progress.
“A year ago we had 32 A100s,” Emad Mostaque, CEO of Stability AI, wrote on Twitter in January. “Dream big and stack up moar GPU kids. Brrr.” Stability AI is the company that helped develop Stable Diffusion, an image generator that attracted attention last fall and is said to have a valuation of more than $1 billion.
Stability AI now has access to more than 5,400 A100 GPUs, according to one estimate from the State of AI report, which tracks which companies and universities have the largest collections of A100 GPUs, although it doesn’t include cloud providers, which don’t publish their numbers.
Nvidia is getting on the AI bandwagon
Nvidia stands to benefit from the AI hype cycle. In Wednesday’s fiscal fourth-quarter earnings report, overall sales were down 21%, but investors pushed the stock up about 14% on Thursday, mainly because the company’s AI chip business, reported as data center sales, rose 11% to more than $3.6 billion in sales during the quarter, showing continued growth.
Nvidia shares are up 65% so far in 2023, outpacing the S&P 500 and other semiconductor stocks.
Nvidia CEO Jensen Huang kept talking about AI in a call with analysts Wednesday, suggesting the recent boom in artificial intelligence is central to the company’s strategy.
“The activity around the AI infrastructure we’ve built, and the activity around inferencing using Hopper and Ampere to inference large language models, has just exploded over the last 60 days,” Huang said. “There’s no doubt that whatever our view of this year as we enter the year has been pretty dramatically changed as a result of the last 60, 90 days.”
Ampere is Nvidia’s code name for the A100 generation of chips. Hopper is the code name for the new generation, including the H100, which recently started shipping.
More computers needed
Nvidia A100 processor
Compared with other kinds of software, such as serving a web page, which uses processing power occasionally in bursts of microseconds, machine learning tasks can take up the whole computer’s processing power, sometimes for hours or days.
This means that companies that end up with a successful AI product often need to acquire more GPUs to handle peak periods or improve their models.
These GPUs are not cheap. In addition to a single A100 on a card that can be slotted into an existing server, many data centers use a system that includes eight A100 GPUs working together.
This system, Nvidia’s DGX A100, has a suggested price of nearly $200,000, although it comes with the chips needed. On Wednesday, Nvidia said it would sell cloud access to DGX systems directly, which will likely reduce the entry cost for tinkerers and researchers.
It’s easy to see how the cost of the A100s can add up.
For example, an estimate from New Street Research found that the OpenAI-based ChatGPT model inside Bing’s search could require eight GPUs to deliver a response to a question in less than one second.
At this rate, Microsoft would need over 20,000 8-GPU servers just to deploy the model in Bing for everyone, suggesting Microsoft’s feature could cost $4 billion in infrastructure spending.
“If you’re from Microsoft, and you want to scale that, at the scale of Bing, that’s maybe $4 billion. If you want to scale that at the scale of Google, which serves 8 or 9 billion queries every day, you actually need to spend $80 billion on DGXs,” said Antoine Chkaiban, technology analyst at New Street Research. “The numbers we got are huge. But they just reflect the fact that every user who adopts such a large language model requires a massive supercomputer while they’re using it.”
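The analyst’s figures can be sanity-checked with simple arithmetic. This sketch uses only numbers quoted in this article (the roughly $200,000 per-system price comes from the DGX A100 pricing mentioned above); it is an illustration, not New Street Research’s actual model:

```python
# Back-of-the-envelope check of the New Street Research estimates above.
# All inputs are figures from the article; this is not the analyst's model.
servers_for_bing = 20_000   # 8-GPU servers needed to serve Bing for everyone
price_per_dgx = 200_000     # suggested price of one 8-GPU DGX A100 system, USD

bing_cost = servers_for_bing * price_per_dgx
print(f"Bing-scale deployment: ${bing_cost / 1e9:.0f} billion")      # $4 billion

# Google serves roughly 20x Bing's query volume per the analyst's comparison
google_cost = bing_cost * 20
print(f"Google-scale deployment: ${google_cost / 1e9:.0f} billion")  # $80 billion
```

The $4 billion and $80 billion figures in the quote fall straight out of multiplying server count by system price.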
The latest version of Stable Diffusion, an image generator, was trained on 256 A100 GPUs, or 32 machines with eight A100s each, according to information posted online by Stability AI, totaling 200,000 compute hours.
At market price, training the model alone costs $600,000, Stability AI CEO Mostaque said on Twitter, suggesting in an exchange of tweets the price was exceptionally cheap compared to its rivals. This does not include the cost of “inference” or model deployment.
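Stability AI’s numbers imply a per-GPU-hour market rate and a rough wall-clock training time. The quick arithmetic below is a back-of-the-envelope check derived only from the figures above, not an official cost breakdown from the company:

```python
# Sanity check of the Stable Diffusion training figures cited above.
# The hourly rate is implied by the article's numbers, not a quoted price.
gpus = 256             # A100 GPUs (32 machines x 8 GPUs each)
gpu_hours = 200_000    # total compute hours reported by Stability AI
total_cost = 600_000   # training cost at market price, USD

implied_rate = total_cost / gpu_hours
print(f"Implied market rate: ${implied_rate:.2f} per A100-hour")  # $3.00

wall_clock_days = gpu_hours / gpus / 24
print(f"Wall-clock training time: ~{wall_clock_days:.0f} days")   # ~33 days
```

An implied rate of about $3 per A100-hour is in the same range as on-demand cloud GPU pricing at the time, which is consistent with Mostaque’s claim that the figure reflects market rates.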
Nvidia CEO Huang said in an interview with CNBC’s Katie Tarasov that the company’s products are actually inexpensive for the amount of compute these types of models require.
“We took what would otherwise be a $1 billion data center running processors, and scaled it down to a $100 million data center,” Huang said. “Now $100 million, when you put that in the cloud and shared by 100 companies, that’s next to nothing.”
Huang said Nvidia’s GPUs allow startups to train models at a much lower cost than using a traditional computer processor.
“Now you could build something like a big language model, like a GPT, for something like 10, 20 million dollars,” Huang said. “It’s really, really affordable.”
Nvidia isn’t the only company making GPUs for AI purposes. AMD and Intel have competing graphics processors, and big cloud companies like Google and Amazon develop and deploy their own chips specifically designed for AI workloads.
Still, “AI hardware remains strongly consolidated to NVIDIA,” according to the State of AI compute report. In December, more than 21,000 open-source AI papers said they used Nvidia chips.
Most researchers included in the State of AI Compute Index used the V100, Nvidia’s chip that came out in 2017, but the A100 grew fast in 2022 to become Nvidia’s third most-used chip, just behind a consumer graphics chip costing $1,500 or less that was originally intended for gaming.
The A100 also has the distinction of being one of the few chips to be subject to export controls for national defense reasons. Last fall, Nvidia said in a filing with the SEC that the US government had imposed a licensing requirement prohibiting the export of the A100 and H100 to China, Hong Kong and Russia.
“The U.S. government has indicated that the new licensing requirement will address the risk that the covered products may be used in, or diverted to, a ‘military end use’ or ‘military end user’ in China and Russia,” Nvidia said in its filing. Nvidia previously said it adapted some of its chips for the Chinese market to comply with U.S. export restrictions.
The fiercest competition for the A100 might be its successor. The A100 was first introduced in 2020, an eternity ago in chip cycles. The H100, introduced in 2022, is starting to be produced in volume — in fact, Nvidia recorded more revenue from H100 chips in the quarter ending in January than from the A100, the company said Wednesday, although the H100 is more expensive per unit.
The H100, according to Nvidia, is the first of its data center GPUs to be optimized for transformers, an increasingly important technique used by many of the latest and greatest AI applications. Nvidia said Wednesday that it wants to make AI training more than 1 million percent faster. That could mean that, eventually, AI companies wouldn’t need as many Nvidia chips.