Google DeepMind showcases robot navigation with Gemini AI
Google’s DeepMind Robotics team has achieved a significant breakthrough in robot navigation using its Gemini 1.5 Pro AI.
In a recent paper titled “Mobility VLA: Multimodal Instruction Navigation with Long-Context VLMs and Topological Graphs,” the team showcases how robots can now respond to complex commands and navigate office environments.
The project marks a leap forward in integrating natural language interactions with advanced AI capabilities.
Videos released by DeepMind demonstrate the robots responding to verbal prompts like “OK, Robot” followed by tasks such as guiding humans to specific locations within their 9,000-square-foot office space.
Before executing tasks, the robots are trained through Multimodal Instruction Navigation with demonstration Tours (MINT): they are physically guided around the office while key landmarks are verbally identified along the way.
The system’s hierarchical Vision-Language-Action (VLA) framework then builds on this tour, combining environmental perception with reasoning so the robots can interpret commands in context.
DeepMind reports an impressive success rate of approximately 90% across more than 50 interactions with employees.
This breakthrough highlights the potential of generative AI not only in robot navigation but also in enhancing human-robot interactions and expanding applications in office automation and beyond.