Adapting a robot agent to complex visual-language navigation tasks with large language models

Recent advancements in artificial intelligence have led to the development of Large Language Models (LLMs), such as ChatGPT, Copilot, and Gemini, which are capable of generating and comprehending human-like text. These models have shown promise in enhancing robot task planning through natural language instructions.

This project proposes to integrate LLMs with a robotic system to enhance its navigation and task-execution abilities. The robot agent, equipped with a 2D LiDAR, should be capable of completing the subgoals produced by the LLM in order to fulfill an instruction given in natural language.
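To illustrate the instruction-to-subgoal step described above, the following is a minimal Python sketch. It is not the project's actual implementation: the prompt wording, the JSON output format, and the query_llm callable are all illustrative assumptions standing in for whichever LLM interface the project adopts.

```python
import json

# Illustrative system prompt (an assumption, not the project's real prompt):
# ask the LLM to decompose an instruction into an ordered list of subgoals.
SYSTEM_PROMPT = (
    "You are a task planner for a mobile robot equipped with a 2D LiDAR. "
    "Decompose the user's instruction into an ordered JSON list of short "
    'subgoals, e.g. ["go to the kitchen", "stop next to the table"].'
)

def plan_subgoals(instruction: str, query_llm) -> list[str]:
    """Ask an LLM to break a natural-language instruction into subgoals.

    query_llm is any callable that sends (system_prompt, user_prompt) to a
    chat-style LLM and returns its text reply -- a hypothetical stand-in
    for a concrete API client.
    """
    reply = query_llm(SYSTEM_PROMPT, instruction)
    try:
        subgoals = json.loads(reply)
    except json.JSONDecodeError:
        # Fall back to one subgoal per line if the model ignores the format.
        subgoals = [line.strip() for line in reply.splitlines() if line.strip()]
    return subgoals
```

Each returned subgoal would then be handed to the robot's LiDAR-based navigation stack, which plans and executes motion until that subgoal is reached before requesting the next one.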

Supervisor:

Professor Pan, Jia

Member:

Yau Cheuk Nam Cyrus (3035949140)