Tuesday, October 21, 2025
Earth-News
What is AI inference at the edge, and why is it important for businesses?

July 22, 2024
in Technology

A person standing in front of a rack of servers inside a data center

(Image credit: Shutterstock.com / Gorodenkoff)

AI inference at the edge refers to running trained machine learning (ML) models closer to end users when compared to traditional cloud AI inference. Edge inference accelerates the response time of ML models, enabling real-time AI applications in industries such as gaming, healthcare, and retail.

What is AI inference at the edge?

Before we look at AI inference specifically at the edge, it’s worth understanding what AI inference is in general. In the AI/ML development lifecycle, inference is where a trained ML model performs tasks on new, previously unseen data, such as making predictions or generating content. AI inference happens when end users interact directly with an ML model embedded in an application. For example, when a user inputs a prompt to ChatGPT and gets a response back, the time when ChatGPT is “thinking” is when inference is occurring, and the output is the result of that inference.

AI inference at the edge is a subset of AI inference whereby an ML model runs on a server close to end users; for example, in the same region or even the same city. This proximity reduces latency to milliseconds for faster model response, which is beneficial for real-time applications like image recognition, fraud detection, or gaming map generation.


How AI inference at the edge relates to edge AI

AI inference at the edge is a subset of edge AI. Edge AI involves processing data and running ML models closer to the data source rather than in the cloud. Edge AI includes everything related to edge AI computing, from edge servers (the metro edge) to IoT devices and telecom base stations (the far edge). Edge AI also includes training at the edge, not just inference. In this article, we’ll focus on AI inference on edge servers.

How inference at the edge compares to cloud inference

With cloud AI inference, you run an ML model on a remote cloud server, and user data is sent to the cloud and processed there. In this case, an end user may interact with the model from a different region, country, or even continent. As a result, cloud inference latency ranges from hundreds of milliseconds to seconds. This type of AI inference is suitable for applications that don’t require local data processing or low latency, such as ChatGPT, DALL-E, and other popular GenAI tools. Edge inference differs in two related ways:

  • Inference happens closer to the end user
  • Latency is lower

How AI inference at the edge works

AI inference at the edge relies on an IT infrastructure with two main architectural components: a low-latency network and servers powered by AI chips. If you need scalable AI inference that can handle load spikes, you also need a container orchestration service, such as Kubernetes; this runs on edge servers and enables your ML models to scale up and down quickly and automatically. Today, only a few providers have the infrastructure to offer global AI inference at the edge that meets these requirements.

Low-latency network: A provider offering AI inference at the edge should have a distributed network of edge points of presence (PoPs) where servers are located. The more edge PoPs, the quicker the network roundtrip time, which means ML model responses occur faster for end users. A provider should have dozens—or even hundreds—of PoPs worldwide and should offer smart routing, which routes a user request to the closest edge server to use the globally distributed network efficiently and effectively.
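As a rough illustration of smart routing, the sketch below picks the edge PoP geographically closest to a user. It is a simplification under stated assumptions: the PoP names and coordinates are hypothetical, and real providers typically route on measured network latency or anycast rather than on straight-line distance.

```python
import math

# Hypothetical PoP locations (name -> (latitude, longitude)); illustrative only.
POPS = {
    "frankfurt": (50.11, 8.68),
    "singapore": (1.35, 103.82),
    "ashburn": (39.04, -77.49),
}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_pop(user_location, pops=POPS):
    """Return the name of the PoP geographically closest to the user."""
    return min(pops, key=lambda name: haversine_km(user_location, pops[name]))


# A user in Paris would be routed to the Frankfurt PoP in this toy setup.
print(nearest_pop((48.86, 2.35)))
```

Adding more PoPs to the table shortens the worst-case distance for any user, which is the intuition behind the claim that more PoPs give quicker roundtrip times.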


Servers with AI accelerators: To reduce computation time, you need to run your ML model on a server or VM powered by an AI accelerator, such as an NVIDIA GPU. There are GPUs designed specifically for AI inference. For example, one of the latest models, the NVIDIA L40S, has up to 5x faster inference performance than the A100 and H100 GPUs, which are primarily designed for training large ML models but are also used for inference. The NVIDIA L40S GPU is currently the best AI accelerator for performing AI inference.

Container orchestration: Deploying ML models in containers makes models scalable and portable. A provider can manage an underlying container orchestration tool on your behalf. In that setup, an ML engineer looking to integrate a model into an application would simply upload a container image with an ML model and get a ready-to-use ML model endpoint. When a load spike occurs, containers with your ML model will automatically scale up, and then scale back down when the load subsides.
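The scale-up and scale-down behavior can be sketched with the replica formula documented for the Kubernetes Horizontal Pod Autoscaler: desired replicas = ceil(current replicas × current metric / target metric). The function below is a minimal illustration of that rule, not a provider's actual implementation. The function name, the default replica bounds, and the example numbers are assumptions, and the real autoscaler's tolerance band and stabilization window are omitted.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Replica count from the HPA-style ratio rule, clamped to [min, max].

    current_metric and target_metric share a unit, for example average
    requests per second per container. The bounds are hypothetical defaults.
    """
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))


# Load spike: 2 containers each see 90 req/s against a 60 req/s target.
print(desired_replicas(2, 90, 60))
# Load subsides: 4 containers each see 15 req/s against the same target.
print(desired_replicas(4, 15, 60))
```

The clamp matters in practice: the lower bound keeps the endpoint available when traffic is idle, and the upper bound caps spend during an unexpected spike.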

Key benefits of AI inference at the edge

AI inference at the edge offers three key benefits across industries or use cases: low latency, security and sovereignty, and cost efficiency.

Low latency 

The lower the network latency, the faster your model will respond. If a provider’s average network latency is under 50 ms, it’s appropriate for most apps requiring a near-instant response. By comparison, cloud latency can be as high as a few hundred milliseconds, depending on your location relative to the cloud server. That’s a noticeable difference for an end user, with cloud latency potentially leading to frustration as end users are left waiting for their AI responses.

Keep in mind that a low-latency network only accounts for the travel time of the data. A 50 ms network latency doesn’t mean users will get an AI output in 50 ms; you need to add the time that the ML model takes to perform inference. That ML model processing time is contingent on the model being used and may account for the majority of the processing time for end users. That’s all the more reason to ensure you’re using a low-latency network, so your users get the best possible response time while ML model developers continue to improve model inference speed.
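The arithmetic can be made concrete with a small sketch. The 200 ms inference time and the 300 ms cloud figure are hypothetical values chosen for illustration, and the network figure is assumed to cover the full round trip.

```python
def total_response_ms(network_ms, inference_ms):
    """User-perceived response time: network travel time plus model inference time."""
    return network_ms + inference_ms


def network_share(network_ms, inference_ms):
    """Fraction of the total response time spent on the network."""
    return network_ms / total_response_ms(network_ms, inference_ms)


INFERENCE_MS = 200  # hypothetical model processing time, identical in both setups

edge_total = total_response_ms(50, INFERENCE_MS)    # edge network: 50 ms
cloud_total = total_response_ms(300, INFERENCE_MS)  # cloud network: assumed 300 ms

print(edge_total, cloud_total)
print(network_share(50, INFERENCE_MS), network_share(300, INFERENCE_MS))
```

With these assumed numbers the same model answers in 250 ms at the edge versus 500 ms from the cloud, and the network accounts for a fifth of the edge total compared with three fifths of the cloud total.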

Security and sovereignty

Keeping data at the edge—meaning local to the user—simplifies compliance with local laws and regulations, such as GDPR and its equivalents in other countries. An edge inference provider should set up its inference infrastructure to adhere to local laws to ensure that you and your users are protected appropriately.

Edge inference also increases the confidentiality and privacy of your end users’ data because it’s processed locally rather than being sent to remote cloud servers. This reduces the attack surface and minimizes the risk of data exposure during transmission.

Cost efficiency

Typically, a provider charges only for the computational resources utilized by the ML model. This, along with carefully configured autoscaling and model execution schedules, can significantly reduce inference costs.

Who should use AI inference at the edge?

Here are some common scenarios where inference at the edge would be the optimal choice:

  • Low latency is critical for your application and users. A wide range of real-time applications, from facial recognition to trade analysis, require low latency. Edge inference provides the lowest latency inference option.
  • Your user base is spread across multiple geographical locations. In this case, you need to provide the same user experience, meaning the same low latency, to all of your users regardless of their location. This requires a globally distributed edge network.
  • You don’t want to deal with infrastructure maintenance. If supporting cloud and AI infrastructure isn’t part of your core business, it may be worth delegating these processes to an experienced, expert partner. You can then focus your resources on developing your application.
  • You want to keep your data local, for example, within the country where it’s generated. In this case, you need to perform AI inference as close to your end users as possible. A globally distributed edge network can meet this need, whereas the cloud is unlikely to offer the extent of distribution you require.

Which industries benefit from AI inference at the edge?

AI inference at the edge benefits any industry where AI/ML is used, but especially those developing real-time applications. In the technology sector, this includes generative AI applications, chatbots and virtual assistants, data augmentation, and AI tools for software engineers. In gaming, it covers AI content and map generation, real-time player analytics, and real-time AI bot customization and conversation. In retail, typical applications are smart grocery with self-checkout and merchandising, virtual try-on, and content generation, predictions, and recommendations.

In manufacturing, the benefits apply to real-time defect detection in production pipelines, VR/VX applications, and rapid response feedback, while in media and entertainment they apply to content analysis, real-time translation, and automated transcription. Automotive is another sector that develops real-time applications, particularly rapid response for autonomous vehicles, vehicle personalization, advanced driver assistance, and real-time traffic updates.

Conclusion

For organizations looking to deploy real-time applications, AI inference at the edge is an essential component of their infrastructure. It significantly reduces latency, ensuring ultra-fast response times. For end users, this means a seamless, more engaging experience, whether playing online games, using chatbots, or shopping online with a virtual try-on service. Enhanced data security means businesses can offer superior AI services while protecting user data. AI inference at the edge is a critical enabler of AI/ML production deployment at scale, driving AI/ML innovation and efficiency across numerous industries.


This article was produced as part of TechRadarPro’s Expert Insights channel where we feature the best and brightest minds in the technology industry today. The views expressed here are those of the author and are not necessarily those of TechRadarPro or Future plc. If you are interested in contributing find out more here: https://www.techradar.com/news/submit-your-story-to-techradar-pro

Michele Taroni is Head of AI Product at Gcore.

Copyright for syndicated content belongs to the linked Source : TechRadar – https://www.techradar.com/pro/what-is-ai-inference-at-the-edge-and-why-is-it-important-for-businesses


© 2023 earth-news.info
