Wednesday, August 13, 2025
Earth-News

What is AI inference at the edge, and why is it important for businesses?

July 22, 2024

A person standing in front of a rack of servers inside a data center

(Image credit: Shutterstock.com / Gorodenkoff)

AI inference at the edge refers to running trained machine learning (ML) models closer to end users than traditional cloud AI inference does. Edge inference shortens the response time of ML models, enabling real-time AI applications in industries such as gaming, healthcare, and retail.

What is AI inference at the edge?

Before we look at AI inference specifically at the edge, it’s worth understanding what AI inference is in general. In the AI/ML development lifecycle, inference is where a trained ML model performs tasks on new, previously unseen data, such as making predictions or generating content. AI inference happens when end users interact directly with an ML model embedded in an application. For example, when a user inputs a prompt to ChatGPT and gets a response back, the time when ChatGPT is “thinking” is when inference is occurring, and the output is the result of that inference.
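
As a minimal sketch of what inference means in practice, the snippet below applies fixed parameters to an input the model has not seen before. The weights, the bias, and the `predict` helper are hypothetical stand-ins for the output of an earlier training run; a production model would be far larger, but the principle is the same: no learning happens at this stage.

```python
import math

# Hypothetical parameters, standing in for the result of an earlier training run.
WEIGHTS = [0.8, -0.4]
BIAS = 0.1


def predict(features):
    """Run inference: apply the frozen parameters to one unseen input."""
    score = sum(w * x for w, x in zip(WEIGHTS, features)) + BIAS
    # The logistic function maps the raw score to a value between 0 and 1.
    return 1.0 / (1.0 + math.exp(-score))


new_input = [1.5, 2.0]  # data the model did not see during training
probability = predict(new_input)
print(f"predicted probability: {probability:.3f}")
```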

AI inference at the edge is a subset of AI inference whereby an ML model runs on a server close to end users; for example, in the same region or even the same city. This proximity reduces latency to milliseconds for faster model response, which is beneficial for real-time applications like image recognition, fraud detection, or gaming map generation.

How AI inference at the edge relates to edge AI

AI inference at the edge is a subset of edge AI. Edge AI involves processing data and running ML models closer to the data source rather than in the cloud. It spans the full range of edge computing for AI, from edge servers (the metro edge) to IoT devices and telecom base stations (the far edge). Edge AI also includes training at the edge, not just inference. In this article, we’ll focus on AI inference on edge servers.

How inference at the edge compares to cloud inference

With cloud AI inference, you run an ML model on a remote cloud server, and the user data is sent to and processed in the cloud. In this case, an end user may interact with the model from a different region, country, or even continent. As a result, cloud inference latency ranges from hundreds of milliseconds to seconds. This type of AI inference is suitable for applications that don’t require local data processing or low latency, such as ChatGPT, DALL-E, and other popular GenAI tools. Edge inference differs in two related ways:

  • Inference happens closer to the end user
  • Latency is lower

How AI inference at the edge works

AI inference at the edge relies on an IT infrastructure with two main architectural components: a low-latency network and servers powered by AI chips. If you need scalable AI inference that can handle load spikes, you also need a container orchestration service, such as Kubernetes; this runs on edge servers and enables your ML models to scale up and down quickly and automatically. Today, only a few providers have the infrastructure to offer global AI inference at the edge that meets these requirements.

Low-latency network: A provider offering AI inference at the edge should have a distributed network of edge points of presence (PoPs) where servers are located. The more edge PoPs there are, the shorter the network round-trip time, which means ML model responses reach end users faster. A provider should have dozens, or even hundreds, of PoPs worldwide and should offer smart routing, which sends each user request to the closest edge server so that the globally distributed network is used efficiently.
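
To illustrate the idea of routing a request to the closest edge server, here is a rough sketch that picks the nearest PoP by great-circle distance. The PoP names and coordinates are hypothetical, and geographic distance is only a proxy; a real smart-routing system would more likely rely on measured network latency and server load rather than coordinates alone.

```python
import math

# Hypothetical PoP locations as (latitude, longitude) in degrees.
POPS = {
    "frankfurt": (50.11, 8.68),
    "singapore": (1.35, 103.82),
    "new-york": (40.71, -74.01),
}

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def closest_pop(user_location, pops=POPS):
    """Return the name of the PoP geographically nearest to the user."""
    return min(pops, key=lambda name: haversine_km(user_location, pops[name]))


print(closest_pop((48.86, 2.35)))  # a user in Paris
```

With more PoPs in the table, the distance to the nearest one shrinks for more users, which is the intuition behind wanting dozens or hundreds of locations.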

Servers with AI accelerators: To reduce computation time, you need to run your ML model on a server or VM powered by an AI accelerator, such as an NVIDIA GPU. Some GPUs are designed specifically for AI inference. For example, one of the latest models, the NVIDIA L40S, has up to 5x faster inference performance than the A100 and H100 GPUs, which are primarily designed for training large ML models but are also used for inference. In our view, the NVIDIA L40S GPU is currently the best AI accelerator for performing AI inference.

Container orchestration: Deploying ML models in containers makes models scalable and portable. A provider can manage an underlying container orchestration tool on your behalf. In that setup, an ML engineer looking to integrate a model into an application would simply upload a container image with an ML model and get a ready-to-use ML model endpoint. When a load spike occurs, containers with your ML model will automatically scale up, and then scale back down when the load subsides.
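
The scale-up and scale-down behaviour can be sketched with the proportional rule that the Kubernetes Horizontal Pod Autoscaler documents: the desired replica count is the current count multiplied by the ratio of the observed metric to its target, rounded up. The function name, the load figures, and the replica bounds below are illustrative assumptions, and the sketch leaves out details of a real autoscaler such as tolerances and stabilization windows.

```python
import math


def desired_replicas(current_replicas, current_load, target_load,
                     min_replicas=1, max_replicas=10):
    """Proportional scaling: ceil(current * observed / target), clamped to bounds."""
    raw = math.ceil(current_replicas * current_load / target_load)
    return max(min_replicas, min(max_replicas, raw))


# Load spike: 2 containers each see 90 requests/s against a target of 60.
print(desired_replicas(2, 90, 60))

# Load subsides: 4 containers each see only 15 requests/s.
print(desired_replicas(4, 15, 60))
```

The upper bound matters for cost control: without it, a sustained spike would keep adding containers for as long as the load stays above the target.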

Key benefits of AI inference at the edge

AI inference at the edge offers three key benefits across industries or use cases: low latency, security and sovereignty, and cost efficiency.

Low latency 

The lower the network latency, the faster your model will respond. If a provider’s average network latency is under 50 ms, it’s appropriate for most apps requiring a near-instant response. By comparison, cloud latency can be as high as a few hundred milliseconds, depending on your location relative to the cloud server. That’s a noticeable difference for an end user, with cloud latency potentially leading to frustration as end users are left waiting for their AI responses.

Keep in mind that a low-latency network only accounts for the travel time of the data. A 50 ms network latency doesn’t mean users will get an AI output in 50 ms; you need to add the time that the ML model takes to perform inference. That ML model processing time depends on the model being used and may account for the majority of the total response time that end users experience. That’s all the more reason to ensure you’re using a low-latency network, so your users get the best possible response time while ML model developers continue to improve model inference speed.
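
A quick worked example makes the point concrete. The 50 ms edge figure comes from the text above; the 300 ms cloud figure and the 200 ms model processing time are assumed values chosen only for illustration.

```python
def response_time_ms(network_latency_ms, inference_ms):
    """Total time the user waits: data travel time plus model processing time."""
    return network_latency_ms + inference_ms


INFERENCE_MS = 200  # assumed model processing time, identical in both setups

edge = response_time_ms(50, INFERENCE_MS)    # edge network latency from the text
cloud = response_time_ms(300, INFERENCE_MS)  # assumed cloud network latency

print(f"edge:  {edge} ms total, network share {50 / edge:.0%}")
print(f"cloud: {cloud} ms total, network share {300 / cloud:.0%}")
```

Under these assumptions, the model takes the same time in both cases, so the whole difference in what the user experiences comes from the network.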

Security and sovereignty

Keeping data at the edge—meaning local to the user—simplifies compliance with local laws and regulations, such as GDPR and its equivalents in other countries. An edge inference provider should set up its inference infrastructure to adhere to local laws to ensure that you and your users are protected appropriately.

Edge inference also increases the confidentiality and privacy of your end users’ data because it’s processed locally rather than being sent to remote cloud servers. This reduces the attack surface and minimizes the risk of data exposure during transmission.

Cost efficiency

Typically, a provider charges only for the computational resources utilized by the ML model. This, along with carefully configured autoscaling and model execution schedules, can significantly reduce inference costs.

Who should use AI inference at the edge?

Here are some common scenarios where inference at the edge would be the optimal choice:

  • Low latency is critical for your application and users. A wide range of real-time applications, from facial recognition to trade analysis, require low latency. Edge inference provides the lowest latency inference option.
  • Your user base is spread across multiple geographical locations. In this case, you need to provide the same user experience, meaning the same low latency, to all of your users regardless of their location. This requires a globally distributed edge network.
  • You don’t want to deal with infrastructure maintenance. If supporting cloud and AI infrastructure isn’t part of your core business, it may be worth delegating these processes to an experienced, expert partner. You can then focus your resources on developing your application.
  • You want to keep your data local, for example, within the country where it’s generated. In this case, you need to perform AI inference as close to your end users as possible. A globally distributed edge network can meet this need, whereas the cloud is unlikely to offer the extent of distribution you require.

Which industries benefit from AI inference at the edge?

AI inference at the edge benefits any industry where AI/ML is used, but especially those developing real-time applications. In the technology sector, this includes generative AI applications, chatbots and virtual assistants, data augmentation, and AI tools for software engineers. In gaming, it includes AI content and map generation, real-time player analytics, and real-time AI bot customization and conversation. In retail, typical applications are smart grocery with self-checkout and merchandising, virtual try-on, and content generation, predictions, and recommendations.

In manufacturing, the benefits apply to real-time defect detection in production pipelines, VR/VX applications, and rapid-response feedback, while in the media and entertainment industry they apply to content analysis, real-time translation, and automated transcription. Another sector that develops real-time applications is automotive, particularly rapid response for autonomous vehicles, vehicle personalization, advanced driver assistance, and real-time traffic updates.

Conclusion

For organizations looking to deploy real-time applications, AI inference at the edge is an essential component of their infrastructure. It significantly reduces latency, ensuring ultra-fast response times. For end users, this means a seamless, more engaging experience, whether playing online games, using chatbots, or shopping online with a virtual try-on service. Enhanced data security means businesses can offer superior AI services while protecting user data. AI inference at the edge is a critical enabler of AI/ML production deployment at scale, driving AI/ML innovation and efficiency across numerous industries.

This article was produced as part of TechRadarPro’s Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro

Michele Taroni is Head of AI Product at Gcore.

Copyright for syndicated content belongs to the linked Source : TechRadar – https://www.techradar.com/pro/what-is-ai-inference-at-the-edge-and-why-is-it-important-for-businesses
