Advanced Micro Devices CEO Lisa Su on Tuesday in San Francisco unveiled a chip that could be a centerpiece of the company’s strategy for artificial intelligence computing, touting its enormous memory and data throughput for generative AI tasks such as large language models.
The Instinct MI300X, as the part is known, is a follow-on to the previously announced MI300A. The chip is actually a combination of multiple “chiplets,” individual chips that are joined together in a single package by shared memory and networking links.
Also: 5 ways to explore the use of generative AI at work
Su, onstage before an invite-only audience at the Fairmont Hotel in downtown San Francisco, referred to the part as a “generative AI accelerator,” and said the GPU chiplets it contains, a family called CDNA 3, are “designed specifically for AI and HPC [high-performance computing] workloads.”
The MI300X is a “GPU-only” version of the part. The MI300A is a combination of three Zen 4 CPU chiplets with multiple GPU chiplets. In the MI300X, those CPUs are swapped out for two additional CDNA 3 GPU chiplets.
Also: Nvidia unveils new kind of Ethernet for AI, Grace Hopper ‘Superchip’ in full production
The MI300X raises the transistor count from the MI300A’s 146 billion to 153 billion, and the shared DRAM memory is boosted from 128 gigabytes to 192 gigabytes.
The memory bandwidth is boosted from 800 gigabytes per second to 5.2 terabytes per second.
“Our use of chiplets in this product is very, very strategic,” said Su, because of the ability to mix and match different kinds of compute, swapping CPUs out for GPUs.
Su said the MI300X offers 2.4 times the memory density of Nvidia’s H100 “Hopper” GPU, and 1.6 times the memory bandwidth.
“Generative AI, large language models have changed the landscape,” said Su. “The need for more compute is growing exponentially, whether you’re talking about training or about inference.”
To demonstrate the need for powerful computing, Su showed the part working on what she said is the most popular large language model at the moment, the open-source Falcon-40B. Language models require more compute as they are built with larger and larger numbers of what are called neural network “parameters”; Falcon-40B comprises 40 billion of them.
Also: GPT-3.5 vs GPT-4: Is ChatGPT Plus worth its subscription fee?
The MI300X, she said, is the first chip powerful enough to run a neural network of that size entirely in memory, rather than having to shuttle data back and forth to and from external memory.
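A rough sketch of the arithmetic behind that claim, assuming the common convention of 16-bit (two-byte) floating-point weights, which is an illustrative assumption rather than a figure from AMD’s presentation:

```python
# Back-of-the-envelope footprint of Falcon-40B's weights, assuming
# fp16/bf16 storage at 2 bytes per parameter (an assumption, not an
# AMD-published figure).
FALCON_40B_PARAMS = 40e9
BYTES_PER_PARAM_FP16 = 2

weights_gb = FALCON_40B_PARAMS * BYTES_PER_PARAM_FP16 / 1e9  # 80 GB
print(f"Falcon-40B fp16 weights: ~{weights_gb:.0f} GB")

# 80 GB of weights fits within the MI300X's 192 GB of HBM3, leaving
# headroom for activations and the inference KV cache.
MI300X_MEMORY_GB = 192
print(f"headroom on one MI300X: ~{MI300X_MEMORY_GB - weights_gb:.0f} GB")
```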
Su demonstrated the MI300X composing a poem about San Francisco using Falcon-40B.
“A single MI300X can run models up to approximately 80 billion parameters” in memory, she said.
“When you compare MI300X to the competition, MI300X offers 2.4 times more memory and 1.6 times more memory bandwidth, and with all of that additional memory capacity, we actually have an advantage for large language models because we can run larger models directly in memory.”
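Those ratios can be sanity-checked against Nvidia’s published specs for the H100 SXM part (80 GB of HBM3 at roughly 3.35 TB/s); the H100 figures below are those public numbers, not from AMD’s presentation:

```python
# Quick check of the 2.4x memory and 1.6x bandwidth claims versus the
# H100 SXM (80 GB HBM3, ~3.35 TB/s, per Nvidia's public spec sheet).
MI300X_MEMORY_GB, MI300X_BW_TBS = 192, 5.2
H100_MEMORY_GB, H100_BW_TBS = 80, 3.35

print(f"memory ratio:    {MI300X_MEMORY_GB / H100_MEMORY_GB:.1f}x")  # 2.4x
print(f"bandwidth ratio: {MI300X_BW_TBS / H100_BW_TBS:.1f}x")        # ~1.6x
```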
Also: How ChatGPT can rewrite and improve your existing code
Being able to run the entire model in memory, said Su, means that “for the largest models, that actually reduces the number of GPUs you need, significantly speeding up the performance, especially for inference, as well as reducing the total cost of ownership.”
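A minimal sketch of why larger per-GPU memory cuts the GPU count, using a hypothetical 175-billion-parameter model at fp16; the model size, byte count, and simplifications are illustrative assumptions, not figures from the presentation:

```python
import math

# Minimum GPUs needed just to hold a model's fp16 weights, ignoring
# activations, KV cache, and sharding overhead (deliberately simplified).
def min_gpus(params_billion: float, gpu_memory_gb: float) -> int:
    weights_gb = params_billion * 2  # 2 bytes per fp16 parameter
    return math.ceil(weights_gb / gpu_memory_gb)

# Hypothetical 175B-parameter model: 350 GB of fp16 weights.
print(min_gpus(175, 80))   # 5 GPUs at 80 GB each
print(min_gpus(175, 192))  # 2 GPUs at 192 GB each
```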
“I love this chip, by the way,” enthused Su. “We love this chip.”
“With MI300X, you can reduce the number of GPUs, and as model sizes keep growing, this will become even more important.”
“With more memory, more memory bandwidth, and fewer GPUs needed, we can run more inference jobs per GPU than you could before,” said Su. That will reduce the total cost of ownership for large language models, she said, making the technology more accessible.
Also: For AI’s ‘iPhone moment’, Nvidia unveils a large language model chip
To compete with Nvidia’s DGX systems, Su unveiled a family of AI computers, the “AMD Instinct Platform.” The first instance of it combines eight MI300X chips with 1.5 terabytes of HBM3 memory. The server conforms to the industry-standard Open Compute Platform spec.
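The 1.5-terabyte figure follows directly from the eight-GPU configuration, assuming each MI300X keeps its full 192 GB in the node:

```python
# Aggregate HBM3 across the eight-way Instinct Platform node.
GPUS_PER_NODE = 8
MI300X_MEMORY_GB = 192
print(f"node HBM3: {GPUS_PER_NODE * MI300X_MEMORY_GB / 1024:.1f} TB")  # 1.5 TB
```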
“For customers, they can use all of this AI compute capability in memory in an industry-standard platform that drops right into their existing infrastructure,” said Su.
Unlike the MI300X, which is solely a GPU, the existing MI300A goes up against Nvidia’s Grace Hopper combo chip, which pairs Nvidia’s Grace CPU with its Hopper GPU and which the company announced last month is in full production.
The MI300A is being built into the El Capitan supercomputer under construction at the Department of Energy’s Lawrence Livermore National Laboratory, Su noted.
Also: How to use ChatGPT to create an app
The MI300A is sampling to AMD customers today, and the MI300X will begin sampling to customers in the third quarter of this year, said Su. Both will be in volume production in the fourth quarter, she said.
You can watch a replay of the presentation on the website AMD has set up for the news.