Sunday, September 7, 2025
Earth-News

Google DeepMind unveils ‘superhuman’ AI system that excels in fact-checking, saving costs and improving accuracy

March 29, 2024
in Technology

A new study from Google’s DeepMind research unit has found that an artificial intelligence system can outperform human fact-checkers when evaluating the accuracy of information generated by large language models.

The paper, titled “Long-form factuality in large language models” and published on the pre-print server arXiv, introduces a method called Search-Augmented Factuality Evaluator (SAFE). SAFE uses a large language model to break down generated text into individual facts, and then uses Google Search results to determine the accuracy of each claim.

“SAFE utilizes an LLM to break down a long-form response into a set of individual facts and to evaluate the accuracy of each fact using a multi-step reasoning process comprising sending search queries to Google Search and determining whether a fact is supported by the search results,” the authors explained.
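
The flow the authors describe (split a response into individual facts, query a search engine for each one, then decide whether the results support it) can be sketched roughly as follows. This is a minimal illustration, not the released SAFE code: the names `split_into_facts`, `evaluate_response`, `toy_search`, and `toy_is_supported` are hypothetical, the sentence splitter stands in for the LLM, and a small in-memory index stands in for Google Search.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class FactRating:
    fact: str
    supported: bool


def split_into_facts(response: str) -> List[str]:
    """Stand-in for the LLM step: naively split the response on periods."""
    return [part.strip() for part in response.split(".") if part.strip()]


def evaluate_response(
    response: str,
    search: Callable[[str], List[str]],
    is_supported: Callable[[str, List[str]], bool],
) -> List[FactRating]:
    """Rate every individual fact in a response against search results."""
    ratings = []
    for fact in split_into_facts(response):
        results = search(fact)  # one query per fact in this simplified sketch
        ratings.append(FactRating(fact, is_supported(fact, results)))
    return ratings


# Hypothetical in-memory "search index" used only for this illustration.
TOY_INDEX: Dict[str, List[str]] = {
    "paris": ["Paris is the capital of France"],
    "berlin": ["Berlin is the capital of Germany"],
}


def toy_search(query: str) -> List[str]:
    """Return snippets whose index key appears in the query."""
    lowered = query.lower()
    return [s for key, snippets in TOY_INDEX.items() if key in lowered for s in snippets]


def toy_is_supported(fact: str, results: List[str]) -> bool:
    """Stand-in for the LLM judgment: the fact must appear in a snippet."""
    return any(fact.lower() in snippet.lower() for snippet in results)


ratings = evaluate_response(
    "Paris is the capital of France. Berlin is the capital of Spain.",
    toy_search,
    toy_is_supported,
)
for rating in ratings:
    print(rating.fact, "->", "supported" if rating.supported else "not supported")
```

In a real system the splitter and the support check would both be LLM calls, and the quoted description indicates that several search queries may be issued per fact; the sketch collapses that to a single lookup.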

‘Superhuman’ performance sparks debate

The researchers pitted SAFE against human annotators on a dataset of roughly 16,000 facts, finding that SAFE’s assessments matched the human ratings 72% of the time. Even more notably, in a sample of 100 disagreements between SAFE and the human raters, SAFE’s judgment was found to be correct in 76% of cases.
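
The two figures above measure different things: overall agreement with the human labels, and how often the automated rater turns out to be right when the two disagree. A small sketch of both calculations, using made-up labels and hypothetical function names (`agreement_rate`, `disagreement_win_rate`) rather than the study's data:

```python
from typing import Dict, List


def agreement_rate(auto_labels: List[bool], human_labels: List[bool]) -> float:
    """Fraction of facts on which the automated and human labels match."""
    if len(auto_labels) != len(human_labels) or not auto_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == h for a, h in zip(auto_labels, human_labels))
    return matches / len(auto_labels)


def disagreement_win_rate(
    auto_labels: List[bool],
    human_labels: List[bool],
    adjudicated: Dict[int, bool],
) -> float:
    """Among adjudicated disagreements, fraction where the automated label is correct.

    `adjudicated` maps a fact index to the label chosen on careful review.
    """
    reviewed = [
        i for i in adjudicated
        if auto_labels[i] != human_labels[i]  # only count real disagreements
    ]
    if not reviewed:
        raise ValueError("no adjudicated disagreements to score")
    wins = sum(auto_labels[i] == adjudicated[i] for i in reviewed)
    return wins / len(reviewed)


# Toy labels (True = "supported"); these are illustrative values only.
auto = [True, True, False, True, False]
human = [True, False, False, False, False]
review = {1: True, 3: False}  # hypothetical adjudication of the two disagreements

print(agreement_rate(auto, human))
print(disagreement_win_rate(auto, human, review))
```

Note that the second number depends entirely on who adjudicates the disagreements and how, which is part of why the quality of the human baseline matters for the "superhuman" claim.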

While the paper asserts that “LLM agents can achieve superhuman rating performance,” some experts are questioning what “superhuman” really means here.

On a quick read I can’t figure out much about the human subjects, but it looks like superhuman means better than an underpaid crowd worker, rather a true human fact checker? That makes the characterization misleading. (Like saying that 1985 chess software was superhuman).…

— Gary Marcus (@GaryMarcus) March 28, 2024

Gary Marcus, a well-known AI researcher and frequent critic of overhyped claims, suggested on Twitter that in this case, “superhuman” may simply mean “better than an underpaid crowd worker, rather a true human fact checker.”

“That makes the characterization misleading,” he said. “Like saying that 1985 chess software was superhuman.”

Marcus raises a valid point. To truly demonstrate superhuman performance, SAFE would need to be benchmarked against expert human fact-checkers, not just crowdsourced workers. The specific details of the human raters, such as their qualifications, compensation, and fact-checking process, are crucial for properly contextualizing the results.

Cost savings and benchmarking top models

One clear advantage of SAFE is cost: the researchers found that using the AI system was about 20 times cheaper than using human annotators. As the volume of text generated by language models continues to grow, an economical and scalable way to verify claims will become increasingly important.
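
To see how a roughly 20-fold cost gap plays out at scale, here is a back-of-the-envelope sketch. The per-response human cost below is a placeholder chosen for illustration, not a figure from the article; only the 20x ratio comes from the text above, and the function name `annotation_cost` is hypothetical.

```python
def annotation_cost(num_responses: int, cost_per_response: float) -> float:
    """Total cost of rating `num_responses` model responses."""
    if num_responses < 0 or cost_per_response < 0:
        raise ValueError("inputs must be non-negative")
    return num_responses * cost_per_response


COST_RATIO = 20                 # "about 20 times cheaper", per the article
HUMAN_COST_PER_RESPONSE = 4.00  # placeholder value, assumed for illustration
AUTO_COST_PER_RESPONSE = HUMAN_COST_PER_RESPONSE / COST_RATIO

num_responses = 10_000
human_total = annotation_cost(num_responses, HUMAN_COST_PER_RESPONSE)
auto_total = annotation_cost(num_responses, AUTO_COST_PER_RESPONSE)

print(f"human: {human_total:,.2f}")
print(f"automated: {auto_total:,.2f}")
print(f"difference: {human_total - auto_total:,.2f}")
```

Whatever the true per-response price, the ratio means the automated total stays at about 5% of the human total as the number of responses grows.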

The DeepMind team used SAFE to evaluate the factual accuracy of 13 top language models across 4 families (Gemini, GPT, Claude, and PaLM-2) on a new benchmark called LongFact. Their results indicate that larger models generally produced fewer factual errors. 

However, even the best-performing models generated a significant number of false claims. This underscores the risks of over-relying on language models that can fluently express inaccurate information. Automatic fact-checking tools like SAFE could play a key role in mitigating those risks.

Transparency and human baselines are crucial

While the SAFE code and LongFact dataset have been open-sourced on GitHub, allowing other researchers to scrutinize and build upon the work, more transparency is still needed around the human baselines used in the study. Understanding the specifics of the crowdworkers’ background and process is essential for assessing SAFE’s capabilities in proper context.

As the tech giants race to develop ever more powerful language models for applications ranging from search to virtual assistants, the ability to automatically fact-check the outputs of these systems could prove pivotal. Tools like SAFE represent an important step towards building a new layer of trust and accountability.

However, it’s crucial that the development of such consequential technologies happens in the open, with input from a broad range of stakeholders beyond the walls of any one company. Rigorous, transparent benchmarking against human experts — not just crowdworkers — will be essential to measure true progress. Only then can we gauge the real-world impact of automated fact-checking on the fight against misinformation.

Copyright for syndicated content belongs to the linked source: VentureBeat – https://venturebeat.com/ai/google-deepmind-unveils-superhuman-ai-system-that-excels-in-fact-checking-saving-costs-and-improving-accuracy/

Tags: DeepMind, Google, technology