Thursday, October 2, 2025
Earth-News

Among AI infrastructure hopefuls, Qualcomm has become an unlikely ally

May 19, 2024
in Technology

Analysis With its newly formed partnership with Arm server processor designer Ampere Computing, Qualcomm is slowly establishing itself as AI infrastructure startups’ best friend.

Announced during Ampere’s annual strategy and roadmap update on Thursday, the duo promised a 2U machine that includes eight Qualcomm AI 100 Ultra accelerators for performing machine-learning inference and 192 Ampere CPU cores. “In a typically 12.5kW rack, this equates to hosting up to 56 AI accelerators with 1,344 computation cores, while eliminating the need for expensive liquid cooling,” Ampere beamed.
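
The rack-level figures in Ampere's statement follow from the per-server numbers. A quick check, assuming each 2U server carries eight accelerators and 192 cores as described above. The variable names are illustrative, and the per-server power budget is derived here rather than quoted by Ampere:

```python
# Per-server figures from the announcement.
ACCELERATORS_PER_SERVER = 8
CPU_CORES_PER_SERVER = 192
RACK_UNITS_PER_SERVER = 2

# Rack-level figures quoted by Ampere.
RACK_POWER_KW = 12.5
ACCELERATORS_PER_RACK = 56

servers_per_rack = ACCELERATORS_PER_RACK // ACCELERATORS_PER_SERVER
cores_per_rack = servers_per_rack * CPU_CORES_PER_SERVER
rack_units_used = servers_per_rack * RACK_UNITS_PER_SERVER

# Derived, not quoted: the average power budget available to each server.
power_budget_per_server_kw = RACK_POWER_KW / servers_per_rack

print(f"servers per rack: {servers_per_rack}")
print(f"CPU cores per rack: {cores_per_rack}")
print(f"rack units used: {rack_units_used}U")
print(f"power budget per server: {power_budget_per_server_kw:.2f} kW")
```

Seven servers account for both the 56 accelerators and the 1,344 cores in the quote.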

Ampere and its partner Oracle have gone to great lengths to demonstrate that running the large language models (LLMs) behind many popular chatbots is entirely possible on CPUs, provided you set your expectations appropriately. We’ve explored this concept at length, but in a nutshell, limited memory bandwidth means that CPUs are generally best suited to running smaller models between seven and eight billion parameters in size, and usually only at smaller batch sizes, which is to say fewer concurrent users.

This is where Qualcomm’s AI 100 accelerators come in, as their higher memory bandwidth allows them to handle inferencing on larger models or higher batch sizes. And remember that inferencing involves running operations over the whole model; if your LLM is 4GB, 8GB, or 32GB in size, that’s a lot of numbers to repeatedly crunch every time you want to generate the next part of a sentence or piece of source code from a prompt.

Why Qualcomm?

When it comes to AI chips for the datacenter, Qualcomm isn’t a name that tends to come up all that often.

Most of the focus falls on GPU giant Nvidia with the remaining attention split between Intel’s Gaudi and AMD’s Instinct product lines. Instead, most of the attention Qualcomm has garnered has centered around its AI smartphone and notebook strategy.

However, that’s not to say Qualcomm doesn’t have a presence in the datacenter. In fact, its AI 100 series accelerators have been around for years, with its most recent Ultra-series parts making their debut last fall.

The accelerator is a slim, single slot PCIe card aimed at inferencing on LLMs. At 150W the card’s power requirements are rather sedate compared to the 600W and 700W monsters from AMD and Nvidia that are so often in the headlines.

Despite its slim form factor and relatively low-power draw, Qualcomm claims a single AI 100 Ultra is capable of running 100 billion parameter models while a pair of them can be coupled to support GPT-3 scale models (175 billion parameters).

In terms of inference performance, the 64-core card pushes 870 TOPS at INT8 precision and is fueled by 128GB of LPDDR4x memory capable of 548GB/s of bandwidth.
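
To see how the parameter-count claims relate to the card's 128GB of memory, here is a rough sketch of weight footprint at different precisions. Qualcomm does not say which precision its claims assume, so the bit widths below are illustrative. The estimate covers weights only and ignores the KV cache and activations:

```python
def weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GB (1 GB = 1e9 bytes), weights only."""
    return params_billions * bits_per_weight / 8


CARD_MEMORY_GB = 128  # capacity of one AI 100 Ultra, per the article

# (model size in billions of parameters, number of cards) from Qualcomm's claims
for params, cards in [(100, 1), (175, 2)]:
    for bits in (16, 8, 4):
        size = weights_gb(params, bits)
        fits = size <= CARD_MEMORY_GB * cards
        print(f"{params}B at {bits}-bit: {size:.1f} GB, fits on {cards} card(s): {fits}")
```

Under these assumptions, neither model fits at 16-bit precision, while both fit once weights are stored at 8 bits or fewer.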

Memory bandwidth is a major factor for scaling AI inferencing to larger batch sizes.

Generating the first token, which with chatbots we experience as the delay between submitting a prompt and the first word of the response appearing, is often compute bound. However, beyond that, each subsequent word generated tends to be memory bound.

This is part of the reason that GPU vendors like AMD and Nvidia have been moving to larger banks of faster HBM3 and HBM3e memory. The two silicon slingers’ latest chips boast memory bandwidths in excess of 5TB/s, roughly ten times that of Qualcomm’s part.
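
A common rule of thumb for the memory-bound phase is that every generated token requires reading all of the model's weights once, so bandwidth divided by model size gives a ceiling on single-stream token rate. The sketch below applies that simplification to the figures in this article. It ignores the KV cache, batching, and compute limits, so real throughput will differ:

```python
def max_tokens_per_second(bandwidth_gb_s: float, model_gb: float) -> float:
    """Ceiling on single-stream decode rate if each token reads all weights once."""
    return bandwidth_gb_s / model_gb


AI100_ULTRA_GB_S = 548  # from the article
HBM_GPU_GB_S = 5000     # "in excess of 5TB/s", used here as a round figure

for model_gb in (4, 8, 32):
    slow = max_tokens_per_second(AI100_ULTRA_GB_S, model_gb)
    fast = max_tokens_per_second(HBM_GPU_GB_S, model_gb)
    print(f"{model_gb}GB model: up to {slow:.1f} tok/s at 548GB/s, {fast:.1f} tok/s at 5TB/s")
```

Serving a batch reuses each weight read across several requests, which is why bandwidth matters so much as batch sizes grow.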

To overcome some of these limitations, Qualcomm has leaned heavily on software optimizations, adopting technologies like speculative decoding and micro-scaling formats (MX).

If you’re not familiar, speculative decoding uses a small, lightweight model to generate the initial response and then uses a larger model to check and correct its accuracy. In theory, this combination can boost the throughput and efficiency of an AI app.
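
The draft-then-verify loop can be sketched with toy stand-ins for the two models. Everything here is hypothetical: `target_model` and `draft_model` are made-up deterministic rules, not real LLMs, and this greedy variant is a simplification that says nothing about Qualcomm's implementation. The point is that the output matches what the large model would have produced on its own:

```python
from typing import Callable, List

Model = Callable[[List[int]], int]  # maps a token sequence to the next token


def target_model(tokens: List[int]) -> int:
    """Stand-in for the large model: an arbitrary deterministic rule."""
    return (tokens[-1] * 3 + len(tokens)) % 11


def draft_model(tokens: List[int]) -> int:
    """Stand-in for the small model: wrong at every fourth position."""
    guess = target_model(tokens)  # toy shortcut; a real draft model is independent
    return (guess + 1) % 11 if len(tokens) % 4 == 0 else guess


def greedy_decode(prompt: List[int], model: Model, new_tokens: int) -> List[int]:
    tokens = list(prompt)
    for _ in range(new_tokens):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(prompt: List[int], draft: Model, target: Model,
                       new_tokens: int, lookahead: int = 3) -> List[int]:
    tokens = list(prompt)
    goal = len(prompt) + new_tokens
    while len(tokens) < goal:
        # 1. The draft model cheaply proposes a short run of tokens.
        proposal: List[int] = []
        for _ in range(lookahead):
            proposal.append(draft(tokens + proposal))
        # 2. The target model checks each proposed position. Real systems do
        #    this in one batched forward pass; a loop stands in for it here.
        for proposed in proposal:
            expected = target(tokens)
            tokens.append(expected)  # always keep the target's choice
            if expected != proposed or len(tokens) >= goal:
                break  # on a mismatch, the rest of the draft is discarded
    return tokens


prompt = [1, 2]
print(speculative_decode(prompt, draft_model, target_model, new_tokens=10))
```

The speedup in practice comes from step 2: when the draft is usually right, the large model confirms several tokens per pass instead of generating one.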

Formats like MX6 and MX4, meanwhile, aim to reduce the memory footprint of models. These formats are technically a form of quantization that compresses model weights to lower precision, reducing the memory capacity and bandwidth required.
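
The general idea behind block-scaled formats can be illustrated with a simplified sketch in which each block of values shares one power-of-two scale and the individual elements are stored as small integers. This is not the actual MX6 or MX4 encoding, which the article does not describe. The 32-element block size and 8-bit shared scale are assumptions chosen for illustration:

```python
import math
from typing import List, Tuple


def quantize_block(values: List[float], bits: int) -> Tuple[int, List[int]]:
    """Quantize one block to signed integers with a shared power-of-two scale."""
    qmax = 2 ** (bits - 1) - 1
    largest = max(abs(v) for v in values)
    if largest == 0:
        return 0, [0] * len(values)
    # Smallest exponent whose scale keeps the largest value within range.
    exponent = math.ceil(math.log2(largest / qmax))
    scale = 2.0 ** exponent
    ints = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return exponent, ints


def dequantize_block(exponent: int, ints: List[int]) -> List[float]:
    scale = 2.0 ** exponent
    return [q * scale for q in ints]


def bits_per_weight(element_bits: int, block_size: int, scale_bits: int = 8) -> float:
    """Average storage cost once the shared scale is spread across the block."""
    return element_bits + scale_bits / block_size


block = [0.5, -1.25, 3.0, 0.02]
exponent, ints = quantize_block(block, bits=4)
print(exponent, ints, dequantize_block(exponent, ints))
print(f"4-bit elements, 32 per block: {bits_per_weight(4, 32)} bits per weight")
```

At roughly 4.25 bits per weight, a model needs a little over a quarter of the memory and bandwidth it would at FP16, at the cost of the rounding error visible in the round trip above.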

Qualcomm claims that combining MX6 and speculative decoding can deliver a fourfold improvement in throughput over an FP16 baseline.

For Ampere, Qualcomm offers an alternative to Nvidia GPUs, which already work with its CPUs, for larger scale AI inferencing.

AI upstarts amp Qualcomm’s accelerators

Ampere isn’t the only one that’s teamed up with Qualcomm to address AI inferencing. There’s a missing piece to this puzzle that hasn’t been addressed: Training.

Waferscale AI startup Cerebras, another member of Ampere’s AI Platform Alliance, announced a collaboration with Qualcomm back in March alongside the launch of its WSE-3 chips and CS-3 systems.

Cerebras is unique among AI infrastructure vendors for numerous reasons, the most obvious being their chips are literally the size of dinner plates and now each boast 900,000 cores and 44GB of SRAM — and no, that’s not a typo.

As impressive as Cerebras’ waferscale chips may be, they’re designed for training models, not running them. This isn’t as big a headache as it might seem. Inferencing is a far less vendor-specific endeavor than training. This means that models trained on Cerebras’ CS-2 or CS-3 clusters can be deployed on any number of accelerators with minimal tuning.

The difference with Qualcomm is that the two are making an ecosystem play. As we covered at the time, Cerebras is working to train smaller, more accurate, and performant models that can take full advantage of Qualcomm’s software optimizations around speculative decoding, sparse inference, and MX quantization.

Building the ecosystem

Curiously, Qualcomm isn’t listed as a member of the AI Platform Alliance, at least not yet anyway. Having said that, the fact that Qualcomm’s AI 100 Ultra accelerators are already on the market may mean they’re just a stop gap while other smaller players within the alliance catch up.

And in this regard, the AI Platform Alliance has a number of members working on inference accelerators at various stages of commercialization. One of the more interesting we’ve come across is Furiosa — and yes, that is a Mad Max reference. The chip startup even has a computer vision accelerator codenamed Warboy, if there was any doubt.

Furiosa’s 2nd-gen accelerator, codenamed RNGD (pronounced Renegade because in the post-AI world, who needs vowels), is fabbed on a TSMC 5nm process and boasts up to 512 teraFLOPS of 8-bit performance or 1,024 TOPS at INT4. So, for workloads that can take advantage of lower 4-bit precision, the 150W chip has a modest advantage over Qualcomm’s AI 100.

The chip’s real bonus is 48GB of HBM3 memory which, while lower in capacity than Qualcomm’s part, boasts nearly three times more bandwidth at 1.5TB/s.
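
The "modest advantage" and "nearly three times" comparisons can be checked directly from the quoted specs. Note that the compute comparison follows the article in setting RNGD's INT4 figure against the AI 100 Ultra's INT8 figure, so it is not a like-for-like precision comparison. The dictionary keys are illustrative names:

```python
# Figures as quoted in the article; bandwidth in GB/s, memory in GB.
ai100_ultra = {"tops": 870, "precision": "INT8", "memory_gb": 128, "bandwidth_gb_s": 548}
rngd = {"tops": 1024, "precision": "INT4", "memory_gb": 48, "bandwidth_gb_s": 1500}

compute_ratio = rngd["tops"] / ai100_ultra["tops"]
bandwidth_ratio = rngd["bandwidth_gb_s"] / ai100_ultra["bandwidth_gb_s"]
capacity_ratio = rngd["memory_gb"] / ai100_ultra["memory_gb"]

print(f"compute (INT4 vs INT8): {compute_ratio:.2f}x")
print(f"memory bandwidth: {bandwidth_ratio:.2f}x")
print(f"memory capacity: {capacity_ratio:.2f}x")
```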

When we might see the RNGD in the wild remains to be seen. However, the key takeaway is that the AI Platform Alliance seems to exist so that individual startups can focus on tackling whatever aspect of the AI spectrum they’re best at and lean on the others for the rest, whether that’s through direct collaborations or standardization.

In the meantime, it seems Qualcomm has picked up a few new friends along the way.

Filling a gap

Ampere’s reliance on Qualcomm for larger models at higher batch sizes may be short lived, thanks to architectural improvements introduced in the Armv9 instruction set architecture.

As we previously reported, the custom cores the CPU vendor developed for its Ampere One family of processors utilized elements of both the older v8 and newer v9 architectures. As we understand it, the v9-A spec introduced Scalable Matrix Extension 2 (SME2) support aimed at accelerating the kinds of matrix mathematics common in machine learning workloads. However, for the moment, we’re told Ampere’s current chips are handling AI inferencing jobs using their twin 128-bit vector units.

It’s reasonable to believe future Arm-compatible chips from Ampere and others could make use of SME2. In fact, on the client side, Apple’s new M4 SoC is Armv9-compatible with SME2 acceleration baked into its cores, The Register has learned from trusted sources.

Qualcomm was actually one of the first to adopt Armv9, in some of its Snapdragon system-on-chips. However, the chip biz appears to be going back to Armv8, when using CPU designs from its Nuvia acquisition, a decision we have little doubt has become a point of contention with Arm. While Arm would like its customers to pick v9 with SME2 for CPU-based AI inference, Qualcomm is instead taking the line that v8 is fine with inference offloaded from the CPU to another processing unit.

In datacenter land, memory bandwidth will remain a bottleneck regardless of Armv9 or SME2. The introduction of speedier multiplexer combined rank (MCR) DIMMs should help, with 12 channel platforms capable of achieving 825GB/s of bandwidth.

As we’ve seen from Intel’s Xeon 6 demos, this bandwidth boost should allow models up to 70 billion parameters to run reasonably at 4-bit precision on a single CPU. ®

Copyright for syndicated content belongs to the linked Source : The Register – https://go.theregister.com/feed/www.theregister.com/2024/05/19/ai_ampere_qualcomm/
