Earth-News

Robotic Grasp of Language: Unlocking an Open-Ended World for Automation

November 17, 2023

Robot Open World Art Concept

MIT’s CSAIL introduced F3RM, a robotic system that combines visual and language features, allowing robots to grasp objects following open-ended instructions. This innovation, which supports task generalization from few examples, could significantly improve efficiency in warehouses and extend to various real-world applications, including domestic assistance.

By blending 2D images with foundation models to build 3D feature fields, a new MIT method helps robots understand and manipulate nearby objects with open-ended language prompts.

Imagine you’re visiting a friend abroad, and you look inside their fridge to see what would make for a great breakfast. Many of the items initially appear foreign to you, with each one encased in unfamiliar packaging and containers. Despite these visual distinctions, you begin to understand what each one is used for and pick them up as needed.

Inspired by humans’ ability to handle unfamiliar objects, a group from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) designed Feature Fields for Robotic Manipulation (F3RM), a system that blends 2D images with foundation model features into 3D scenes to help robots identify and grasp nearby items. F3RM can interpret open-ended language prompts from humans, making the method helpful in real-world environments that contain thousands of objects, like warehouses and households.

Robotic Adaptability and Task Generalization

F3RM offers robots the ability to interpret open-ended text prompts using natural language, helping the machines manipulate objects. As a result, the machines can understand less-specific requests from humans and still complete the desired task. For example, if a user asks the robot to “pick up a tall mug,” the robot can locate and grab the item that best fits that description.

Feature Fields for Robotic Manipulation (F3RM)

Feature Fields for Robotic Manipulation (F3RM) enables robots to interpret open-ended text prompts using natural language, helping the machines manipulate unfamiliar objects. The system’s 3D feature fields could be helpful in environments that contain thousands of objects, such as warehouses. Credit: Courtesy of the researchers

“Making robots that can actually generalize in the real world is incredibly hard,” says Ge Yang, postdoc at the National Science Foundation AI Institute for Artificial Intelligence and Fundamental Interactions and MIT CSAIL. “We really want to figure out how to do that, so with this project, we try to push for an aggressive level of generalization, from just three or four objects to anything we find in MIT’s Stata Center. We wanted to learn how to make robots as flexible as ourselves, since we can grasp and place objects even though we’ve never seen them before.”

Learning “What’s Where by Looking”

The method could assist robots with picking items in large fulfillment centers with inevitable clutter and unpredictability. In these warehouses, robots are often given a description of the inventory that they’re required to identify. The robots must match the text provided to an object, regardless of variations in packaging, so that customers’ orders are shipped correctly.

For example, the fulfillment centers of major online retailers can contain millions of items, many of which a robot will have never encountered before. To operate at such a scale, robots need to understand the geometry and semantics of different items, with some being in tight spaces. With F3RM’s advanced spatial and semantic perception abilities, a robot could become more effective at locating an object, placing it in a bin, and then sending it along for packaging. Ultimately, this would help factory workers ship customers’ orders more efficiently.

“One thing that often surprises people with F3RM is that the same system also works on a room and building scale, and can be used to build simulation environments for robot learning and large maps,” says Yang. “But before we scale up this work further, we want to first make this system work really fast. This way, we can use this type of representation for more dynamic robotic control tasks, hopefully in real-time, so that robots that handle more dynamic tasks can use it for perception.”

Application Across Environments

The MIT team notes that F3RM’s ability to understand different scenes could make it useful in urban and household environments. For example, the approach could help personalized robots identify and pick up specific items. The system aids robots in grasping their surroundings — both physically and perceptively.

“Visual perception was defined by David Marr as the problem of knowing ‘what is where by looking,’” says senior author Phillip Isola, MIT associate professor of electrical engineering and computer science and CSAIL principal investigator. “Recent foundation models have gotten really good at knowing what they are looking at; they can recognize thousands of object categories and provide detailed text descriptions of images. At the same time, radiance fields have gotten really good at representing where stuff is in a scene. The combination of these two approaches can create a representation of what is where in 3D, and what our work shows is that this combination is especially useful for robotic tasks, which require manipulating objects in 3D.”

Creating a “Digital Twin”

F3RM begins to understand its surroundings by taking pictures on a selfie stick. The mounted camera snaps 50 images at different poses, enabling it to build a neural radiance field (NeRF), a deep learning method that takes 2D images to construct a 3D scene. This collage of RGB photos creates a “digital twin” of its surroundings in the form of a 360-degree representation of what’s nearby.

In addition to a highly detailed neural radiance field, F3RM also builds a feature field to augment geometry with semantic information. The system uses CLIP, a vision foundation model trained on hundreds of millions of images to efficiently learn visual concepts. By reconstructing the 2D CLIP features for the images taken by the selfie stick, F3RM effectively lifts the 2D features into a 3D representation.
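The idea of lifting 2D features into 3D can be illustrated with a minimal sketch. It assumes the standard NeRF volume-rendering weights and a plain squared-error reconstruction loss; the article does not describe F3RM's actual network or loss, so the function names and the toy numbers below are hypothetical. The point is only that the same per-sample weights that blend colors along a camera ray can also blend feature vectors, and the blended feature can then be compared with the 2D CLIP feature at that pixel.

```python
import math

def compositing_weights(densities, deltas):
    """Standard NeRF-style weights for samples along one camera ray."""
    weights = []
    transmittance = 1.0  # fraction of light that reaches the current sample
    for sigma, delta in zip(densities, deltas):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weights.append(transmittance * alpha)
        transmittance *= 1.0 - alpha
    return weights

def render(weights, per_sample_vectors):
    """Blend per-sample vectors (colors or features) into one pixel value."""
    dim = len(per_sample_vectors[0])
    return [
        sum(w * v[k] for w, v in zip(weights, per_sample_vectors))
        for k in range(dim)
    ]

def reconstruction_loss(rendered, target):
    """Squared error between a rendered feature and its 2D target (assumed loss)."""
    return sum((r - t) ** 2 for r, t in zip(rendered, target))

# Toy ray: empty space first, then an opaque surface at the second sample.
densities = [0.0, 50.0, 50.0]
deltas = [1.0, 1.0, 1.0]
colors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
features = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]  # stand-ins for CLIP features

weights = compositing_weights(densities, deltas)
pixel_color = render(weights, colors)
pixel_feature = render(weights, features)
loss = reconstruction_loss(pixel_feature, [0.2, 0.8])
```

In this toy ray almost all of the weight falls on the first opaque sample, so the rendered pixel feature matches that sample's feature. Training a feature field would adjust the per-sample features until the rendered values match the 2D features across all captured images.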

Open-Ended Interaction

After receiving a few demonstrations, the robot applies what it knows about geometry and semantics to grasp objects it has never encountered before. Once a user submits a text query, the robot searches through the space of possible grasps to identify those most likely to succeed in picking up the object requested by the user. Each candidate grasp is scored on its relevance to the prompt, its similarity to the demonstrations the robot has been given, and whether it causes any collisions. The highest-scoring grasp is then chosen and executed.
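The selection step described above might be sketched as follows. The article names the three criteria but not how they are combined, so the weighted sum, the hard rejection of colliding grasps, and every name and number here are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class GraspCandidate:
    name: str
    language_relevance: float  # hypothetical score in [0, 1] for the text prompt
    demo_similarity: float     # hypothetical score in [0, 1] versus the demonstrations
    in_collision: bool         # True if the gripper would hit something

def score(candidate: GraspCandidate, w_lang: float = 1.0, w_demo: float = 1.0) -> float:
    """Assumed scoring rule: a weighted sum of the two similarity terms."""
    return w_lang * candidate.language_relevance + w_demo * candidate.demo_similarity

def best_grasp(candidates: List[GraspCandidate]) -> Optional[GraspCandidate]:
    """Drop colliding grasps, then return the highest-scoring one (or None)."""
    feasible = [c for c in candidates if not c.in_collision]
    if not feasible:
        return None
    return max(feasible, key=score)

candidates = [
    GraspCandidate("mug_rim", 0.9, 0.4, False),
    GraspCandidate("mug_handle", 0.8, 0.9, False),
    GraspCandidate("mug_through_table", 1.0, 1.0, True),
]
chosen = best_grasp(candidates)
```

Here the colliding grasp is discarded even though its other scores are the highest, and `mug_handle` wins with a combined score of 1.7 against 1.3 for `mug_rim`.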

To demonstrate the system’s ability to interpret open-ended requests from humans, the researchers prompted the robot to pick up Baymax, a character from Disney’s “Big Hero 6.” While F3RM had never been directly trained to pick up a toy of the cartoon superhero, the robot used its spatial awareness and vision-language features from the foundation models to decide which object to grasp and how to pick it up.

F3RM also enables users to specify which object they want the robot to handle at different levels of linguistic detail. For example, if there is a metal mug and a glass mug, the user can ask the robot for the “glass mug.” If the bot sees two glass mugs and one of them is filled with coffee and the other with juice, the user can ask for the “glass mug with coffee.” The foundation model features embedded within the feature field enable this level of open-ended understanding.
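One common way to match a text prompt against visual features is cosine similarity in a shared embedding space, which is how CLIP-style models are typically queried. The sketch below assumes that approach and uses hand-made four-dimensional vectors in place of real CLIP embeddings; the dimensions, object names, and values are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def pick_object(query_embedding, object_features):
    """Return the name of the object whose feature best matches the query."""
    return max(
        object_features,
        key=lambda name: cosine(query_embedding, object_features[name]),
    )

# Toy feature axes: [glass, metal, coffee, juice]. Real embeddings are learned.
object_features = {
    "metal_mug": [0.0, 1.0, 0.0, 0.0],
    "glass_mug_with_coffee": [1.0, 0.0, 1.0, 0.0],
    "glass_mug_with_juice": [1.0, 0.0, 0.0, 1.0],
}

metal_query = [0.0, 1.0, 0.0, 0.0]         # stands in for "metal mug"
glass_coffee_query = [1.0, 0.0, 1.0, 0.0]  # stands in for "glass mug with coffee"

metal_choice = pick_object(metal_query, object_features)
coffee_choice = pick_object(glass_coffee_query, object_features)
```

A more detailed prompt adds components to the query vector, which is what separates the two glass mugs in this toy setup.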

“If I showed a person how to pick up a mug by the lip, they could easily transfer that knowledge to pick up objects with similar geometries such as bowls, measuring beakers, or even rolls of tape. For robots, achieving this level of adaptability has been quite challenging,” says MIT PhD student, CSAIL affiliate, and co-lead author William Shen. “F3RM combines geometric understanding with semantics from foundation models trained on internet-scale data to enable this level of aggressive generalization from just a small number of demonstrations.”

Reference: “Distilled Feature Fields Enable Few-Shot Language-Guided Manipulation” by William Shen, Ge Yang, Alan Yu, Jansen Wong, Leslie Pack Kaelbling and Phillip Isola, 27 July 2023, Computer Science > Computer Vision and Pattern Recognition.
arXiv:2308.07931

Shen and Yang wrote the paper under the supervision of Isola, with MIT professor and CSAIL principal investigator Leslie Pack Kaelbling and undergraduate students Alan Yu and Jansen Wong as co-authors. The team was supported, in part, by Amazon.com Services, the National Science Foundation, the Air Force Office of Scientific Research, the Office of Naval Research’s Multidisciplinary University Initiative, the Army Research Office, the MIT-IBM Watson Lab, and the MIT Quest for Intelligence. Their work will be presented at the 2023 Conference on Robot Learning.

Copyright for syndicated content belongs to the linked Source : SciTechDaily – https://scitechdaily.com/robotic-grasp-of-language-unlocking-an-open-ended-world-for-automation/

Tags: Grasp, robotic, science
© 2023 earth-news.info
