A Robot Helper
Understanding Verbal Instructions
to Provide Assistance
in a Lab Workspace

OBJECTIVE

We aim to build a robot that understands language instructions and can accomplish simple tasks in the lab workspace in order to help people.


REPRODUCTION

Using our own design to reproduce OK-Robot; a minimal pipeline sketch follows the component list

  • 3D Semantic Map
  • Open-Vocabulary Object Navigation
  • SOTA Grasp Generation
  • LLM-Driven Framework
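
The sketch below shows how these components could be wired together for a single pick-and-place request. It is a minimal sketch under our own assumptions: all class and method names (SemanticMap, Navigator, Grasper, pick_and_place) are illustrative placeholders, not the actual OK-Robot API.

from dataclasses import dataclass

@dataclass
class Pose:
    x: float
    y: float
    theta: float

class SemanticMap:
    """3D voxel map with per-voxel open-vocabulary features."""
    def query(self, text: str) -> Pose:
        """Return the pose of the region best matching the text query."""
        raise NotImplementedError

class Navigator:
    def go_to(self, target: Pose) -> None:
        """Plan a collision-free path and drive to the target pose."""
        raise NotImplementedError

class Grasper:
    def pick(self, object_name: str) -> None:
        """Segment the named object in view and execute a grasp."""
        raise NotImplementedError

def pick_and_place(obj: str, destination: str, smap: SemanticMap,
                   nav: Navigator, grasper: Grasper) -> None:
    # 1. Open-vocabulary lookup in the 3D semantic map.
    obj_pose = smap.query(obj)
    # 2. Navigate near the object, then grasp it.
    nav.go_to(obj_pose)
    grasper.pick(obj)
    # 3. Navigate to the destination surface and release there.
    nav.go_to(smap.query(destination))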

ENHANCEMENT

Add functionality to bridge the gap between the robot and real-life settings such as a lab environment

  • Dynamic Semantic Memory System
  • More Complex Tasks beyond Pick-and-Place
  • Improve Human-Robot Interaction
  • Achieve AI Alignment

METHODOLOGY

Various methods will be employed in two phases.


Methods in OK-Robot

  • VoxelMap
  • NavigationPlan
  • LangSAM
  • AnyGrasp
  • A heuristic algorithm for object placement (sketched below)
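
Below is a minimal sketch of one possible placement heuristic, assuming a tabletop height map as input: slide a fixed footprint over the surface and take the first window that is both flat and unoccupied. The function name and both thresholds are illustrative assumptions; the actual OK-Robot heuristic may differ.

import numpy as np

def find_placement(height_map: np.ndarray, footprint: int = 5,
                   flatness_tol: float = 0.01,
                   clearance: float = 0.02):
    """Return the (row, col) center of the first footprint-sized window
    that is nearly flat (height spread under flatness_tol meters) and
    sits at table level (nothing taller than clearance meters)."""
    h, w = height_map.shape
    for i in range(h - footprint + 1):
        for j in range(w - footprint + 1):
            patch = height_map[i:i + footprint, j:j + footprint]
            if (patch.max() - patch.min() < flatness_tol
                    and patch.max() < clearance):
                return (i + footprint // 2, j + footprint // 2)
    return None  # no free flat spot found

# Example: a 20x20 tabletop with an object occupying one corner.
table = np.zeros((20, 20))
table[0:10, 0:10] = 0.05  # existing object raises the height map
print(find_placement(table))  # first flat, empty window center: (2, 12)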

Methods for Enhancement

  • Building on VoxelMap, develop a method to partially, periodically, and incrementally update the semantic memory system (first sketch below)
  • Use an LLM with prompt engineering to enable task planning for complex instructions (second sketch below)
  • UI: Python with PyQt or Tkinter; OpenCV can be used for video handling
  • Auditory feedback: a text-to-speech model
  • Dialogue: an LLM with suitable prompt engineering to mimic a question raiser (third sketch below)
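
First sketch: the dynamic semantic memory idea, under the assumption that voxels are keyed by integer grid coordinates. Only currently observed voxels are overwritten, and entries unseen for too long are evicted, so the map updates partially and incrementally instead of being rebuilt. All names here are hypothetical, not an existing VoxelMap API.

import time
import numpy as np

class DynamicVoxelMemory:
    """Partial, periodic, incremental update on top of a VoxelMap-style
    store: revisited voxels are rewritten in place, stale ones age out."""

    def __init__(self, stale_after_s: float = 600.0):
        self.features: dict[tuple[int, int, int], np.ndarray] = {}
        self.last_seen: dict[tuple[int, int, int], float] = {}
        self.stale_after_s = stale_after_s

    def integrate(self, voxels, feats) -> None:
        """Partial update: overwrite only the voxels observed right now."""
        now = time.time()
        for v, f in zip(voxels, feats):
            self.features[v] = f
            self.last_seen[v] = now

    def evict_stale(self) -> None:
        """Periodic pass: drop voxels unseen for too long, so moved
        objects do not leave ghost entries in the memory."""
        now = time.time()
        stale = [v for v, t in self.last_seen.items()
                 if now - t > self.stale_after_s]
        for v in stale:
            del self.features[v]
            del self.last_seen[v]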
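
Second sketch: prompt-driven task planning. The primitive set, the prompt wording, and the call_llm stub are all assumptions for illustration; any LLM endpoint could stand behind it.

import json

PLANNER_PROMPT = """You are a lab-assistant robot planner.
Decompose the user's instruction into a JSON list of steps.
Allowed primitives: navigate(target), pick(object), place(surface), say(text).
Instruction: {instruction}
Answer with JSON only."""

def call_llm(prompt: str) -> str:
    """Placeholder for whichever LLM endpoint the project adopts."""
    raise NotImplementedError

def plan(instruction: str) -> list:
    """Turn a complex instruction into executable primitives, e.g.
    'put the beaker in the sink, then tell me' could yield
    [{"op": "pick", "arg": "beaker"}, {"op": "navigate", "arg": "sink"},
     {"op": "place", "arg": "sink"}, {"op": "say", "arg": "done"}]."""
    reply = call_llm(PLANNER_PROMPT.format(instruction=instruction))
    return json.loads(reply)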
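
Third sketch: a minimal text UI with auditory feedback, assuming Tkinter for the window and pyttsx3 as a stand-in text-to-speech backend; a canned clarifying question stands in for the LLM-driven question raiser, and OpenCV video handling is omitted for brevity.

import tkinter as tk
from tkinter import scrolledtext
import pyttsx3  # one possible offline TTS backend; an assumption, not a fixed choice

def on_send():
    user_text = entry.get()
    entry.delete(0, tk.END)
    # A real system would route user_text through the LLM planner; here a
    # canned clarifying question mimics the "question raiser" behavior.
    reply = f"Did you mean: '{user_text}'? Please confirm."
    log.insert(tk.END, f"You: {user_text}\nRobot: {reply}\n")
    engine.say(reply)  # auditory feedback
    engine.runAndWait()

root = tk.Tk()
root.title("Lab Robot Helper")
log = scrolledtext.ScrolledText(root, width=50, height=12)
log.pack()
entry = tk.Entry(root, width=50)
entry.pack()
tk.Button(root, text="Send", command=on_send).pack()
engine = pyttsx3.init()
root.mainloop()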