Sunday, December 7, 2025
Earth-News

ChatGPT-3.5, Claude 3 kick pixelated butt in Street Fighter III tournament for LLMs

April 11, 2024
in Technology

Large language models (LLMs) can now be put to the test in the retro arcade video game Street Fighter III, and so far it seems some are better than others.

The Street Fighter III-based benchmark, termed LLM Colosseum, was created by four AI devs from Phospho and Quivr during the Mistral hackathon in San Francisco last month. The benchmark works by pitting two LLMs against each other in an actual game of Street Fighter III, keeping each updated on how close it is to victory, where the opposing LLM is, and what move it took. It then asks each model what it would like to do next and carries out that move.
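To make that loop concrete, here is a minimal Python sketch of a turn-based version of it. This is not the LLM Colosseum code: the move list, damage values, `Fighter` class and `random_policy` stand-in (used in place of a real LLM call) are all assumptions made up for illustration.

```python
import random
from dataclasses import dataclass

# Hypothetical move set; the real benchmark's moves are not listed in this article.
MOVES = ["move_closer", "move_away", "punch", "kick", "fireball"]


@dataclass
class Fighter:
    name: str
    health: int = 100
    position: int = 0
    last_move: str = "none"


def describe_state(me, opponent):
    """Build the text update a player receives before choosing a move."""
    return (
        f"Your health: {me.health}. Opponent health: {opponent.health}. "
        f"Opponent is {abs(me.position - opponent.position)} steps away. "
        f"Opponent's last move: {opponent.last_move}. "
        f"Choose one of: {', '.join(MOVES)}."
    )


def random_policy(prompt, rng):
    """Stand-in for an LLM call: ignores the prompt and picks a random move."""
    return rng.choice(MOVES)


def apply_move(actor, target, move):
    """Apply a valid move using made-up toy rules for range and damage."""
    actor.last_move = move
    distance = abs(actor.position - target.position)
    if move == "move_closer" and distance > 0:
        actor.position += 1 if target.position > actor.position else -1
    elif move == "move_away":
        actor.position += -1 if target.position > actor.position else 1
    elif move in ("punch", "kick") and distance <= 1:
        target.health -= 10
    elif move == "fireball" and distance > 1:
        target.health -= 5


def run_fight(policy_a, policy_b, seed=0, max_turns=500):
    """Alternate turns until one fighter is knocked out or turns run out."""
    rng = random.Random(seed)
    a = Fighter("A", position=0)
    b = Fighter("B", position=4)
    for _ in range(max_turns):
        for actor, target, policy in ((a, b, policy_a), (b, a, policy_b)):
            move = policy(describe_state(actor, target), rng)
            if move not in MOVES:
                continue  # an invalid reply simply wastes the turn
            apply_move(actor, target, move)
            if target.health <= 0:
                return actor.name
    return "draw"
```

Calling `run_fight(random_policy, random_policy)` plays one toy match. The real game runs in real time rather than in alternating turns, which is why response latency matters so much in the results below.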

According to the official leaderboard for LLM Colosseum, which is based on 342 fights between eight different LLMs, ChatGPT-3.5 Turbo is by far the winner, with an Elo rating of 1,776.11. That’s well ahead of several iterations of ChatGPT-4, which landed in the 1,400s to 1,500s.
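For readers unfamiliar with Elo ratings, the sketch below shows the standard Elo update for a single match. The article does not say how the LLM Colosseum leaderboard computes its numbers, so the K-factor of 32 and the 1,500 starting rating in the example are assumptions, not the project's settings.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return both new ratings; score_a is 1 for a win, 0.5 draw, 0 loss.

    k=32 is an assumed K-factor, chosen only for illustration.
    """
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two evenly rated models: the winner gains exactly what the loser drops.
new_a, new_b = update_elo(1500.0, 1500.0, score_a=1.0)
print(new_a, new_b)  # 1516.0 1484.0
```

Because points only move from loser to winner, a model has to keep beating higher-rated opponents to pull far ahead of the pack, as the leaderboard leader has done here.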

What makes an LLM good at Street Fighter III is a balance between key characteristics, said Nicolas Oulianov, one of the LLM Colosseum developers. “GPT-3.5 turbo has a good balance between speed and brains. GPT-4 is a larger model, thus way smarter, but much slower.”

The disparity between ChatGPT-3.5 and 4 in LLM Colosseum is an indication of what features are being prioritized in the latest LLMs, according to Oulianov. “Existing benchmarks focus too much on performance regardless of speed. If you’re an AI developer, you need custom evaluations to see if GPT-4 is the best model for your users,” he said. Even fractions of a second can count in fighting games, so taking any extra time can result in a quick loss.

A different experiment with LLM Colosseum was documented by Amazon Web Services developer Banjo Obayomi, running models off Amazon Bedrock. This tournament involved a dozen different models, though Claude clearly swept the competition by snagging first to fourth place, with Claude 3 Haiku scoring first place.

Obayomi also tracked the quirky behavior that the tested LLMs exhibited from time to time, including attempts to play invalid moves such as the devastating “hardest hitting combo of all.”

There were also instances where LLMs just refused to play anymore. The companies that create AI models tend to instill an anti-violence outlook in them, and the models will often refuse to answer any prompt they deem too violent. Claude 2.1 was particularly pacifistic, saying it couldn’t tolerate even fictional fighting.

Compared to actual human players, though, these chatbots aren’t exactly playing at a pro level. “I fought a few SF3 games against LLMs,” says Oulianov. “So far, I think LLMs only stand a chance to win in Street Fighter 3 against a 70- or a five-year-old.”

ChatGPT-4 similarly performed pretty poorly in Doom, another old-school game that requires quick thinking and fast movement.

But why test LLMs in a retro fighting game?

The idea of benchmarking LLMs in an old-school video game is funny and maybe that’s all the reason LLM Colosseum needs to exist, but it might be a little more than that. “Unlike other benchmarks you see in press releases, everyone played video games, and can get a feel of why it would be challenging for an LLM,” Oulianov said. “Large AI companies are gaming benchmarks to get pretty scores and show off.”

But he does note that “the Street Fighter benchmark is kind of the same, but way more entertaining.”

Beyond that, Oulianov said LLM Colosseum showcases how intelligent general-purpose LLMs already are. “What this project shows is the potential for LLMs to become so smart, so fast, and so versatile, that we can use them as ‘turnkey reasoning machines’ basically everywhere. The goal is to create machines able to not only reason with text, but also react to their environment and interact with other thinking machines.”

Oulianov also pointed out that there are already AI models out there that can play modern games at a professional level. DeepMind’s AlphaStar trashed StarCraft II pros back in 2018 and 2019, and OpenAI’s OpenAI Five model proved to be capable of beating world champions and cooperating effectively with human teammates.

Today’s chat-oriented LLMs aren’t anywhere near the level of purpose-made models (just try playing a game of chess against ChatGPT), but perhaps it won’t be that way forever. “With projects like this one, we show that this vision is closer to reality than science fiction,” Oulianov said. ®

Copyright for syndicated content belongs to the linked Source : The Register – https://go.theregister.com/feed/www.theregister.com/2024/04/11/chatgpt_claude_street_fighter_3/

Tags: ChatGPT, Claude, technology
Earth-News.info

The Earth News is an independent English-language daily website publishing news from around the world.

© 2023 earth-news.info
