Cooperative Simultaneous Localization and Mapping (C-SLAM) enables multiple agents to work together in mapping unknown environments while simultaneously estimating their own positions. This approach enhances robustness, scalability, and accuracy by sharing information between agents, reducing drift, and enabling collective exploration of larger areas. In this paper, we present Decentralized Visual Monocular SLAM (DVM-SLAM), the first open-source decentralized monocular C-SLAM system. By utilizing only low-cost and lightweight monocular vision sensors, our system is well suited for small robots and micro aerial vehicles (MAVs). DVM-SLAM's real-world applicability is validated on physical robots with a custom collision avoidance framework, showcasing its potential in real-time multi-agent autonomous navigation scenarios. We also demonstrate comparable accuracy to state-of-the-art centralized monocular C-SLAM systems.
In this section we provide additional material to complement the paper.
In this video, we deploy DVM-SLAM on the Cambridge RoboMaster platform with a custom collision avoidance framework and test the system in an intersection environment, where the two robots would normally collide. The agents are able to localize each other even when their views do not overlap and they cannot see each other, demonstrating that a shared map is being built. Across the four consecutive trials run in this environment, there were zero collisions between the two agents, and the distance between agents never dropped below the collision threshold of 0.55 meters. The RMS ATE of the system was 7.4 cm over the 50-meter-long trajectory.
DVM-SLAM performs incremental, asynchronous, and distributed pose graph optimization through a keyframe sharing method. This refines the map as the agents explore the environment, reducing drift and improving accuracy.
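To make the keyframe-sharing idea concrete, the sketch below illustrates, in simplified form, how an agent could fold keyframes received from peers into a local pose graph and re-optimize it incrementally. This is a conceptual illustration only, not the DVM-SLAM implementation: the 2D (SE(2)) pose parameterization, the `KeyframeMsg` and `PoseGraph` names, and the use of `scipy.optimize.least_squares` are all illustrative assumptions, whereas the real system operates on full 6-DoF keyframes and visual constraints.

```python
# Minimal conceptual sketch (NOT the DVM-SLAM code) of incremental pose graph
# optimization driven by shared keyframes. Poses are SE(2) vectors [x, y, theta];
# edges are relative-pose constraints between keyframes.
from dataclasses import dataclass, field
import numpy as np
from scipy.optimize import least_squares


def wrap(a):
    """Wrap an angle to (-pi, pi]."""
    return (a + np.pi) % (2 * np.pi) - np.pi


@dataclass
class KeyframeMsg:
    """What an agent might broadcast: a keyframe id, its current pose estimate,
    and a relative-pose measurement to a previously shared keyframe."""
    kf_id: int
    pose: np.ndarray              # [x, y, theta] in the shared frame
    parent_id: int | None         # keyframe the constraint is relative to
    rel_pose: np.ndarray | None   # measured [dx, dy, dtheta], parent -> this


@dataclass
class PoseGraph:
    poses: dict = field(default_factory=dict)   # kf_id -> [x, y, theta]
    edges: list = field(default_factory=list)   # (parent_id, kf_id, rel_pose)

    def insert_keyframe(self, msg: KeyframeMsg):
        """Incrementally add an external keyframe and its constraint."""
        self.poses.setdefault(msg.kf_id, msg.pose.copy())
        if msg.parent_id is not None and msg.parent_id in self.poses:
            self.edges.append((msg.parent_id, msg.kf_id, msg.rel_pose))

    def optimize(self, fixed_id: int):
        """Re-optimize all poses, softly pinning one keyframe as the gauge."""
        ids = sorted(self.poses)
        idx = {k: i for i, k in enumerate(ids)}
        x0 = np.concatenate([self.poses[k] for k in ids])
        anchor = self.poses[fixed_id].copy()

        def residuals(x):
            p = x.reshape(-1, 3)
            res = []
            for a, b, z in self.edges:
                pa, pb = p[idx[a]], p[idx[b]]
                c, s = np.cos(pa[2]), np.sin(pa[2])
                # predicted pose of keyframe b expressed in a's frame
                dx = c * (pb[0] - pa[0]) + s * (pb[1] - pa[1])
                dy = -s * (pb[0] - pa[0]) + c * (pb[1] - pa[1])
                dth = wrap(pb[2] - pa[2])
                res.extend([dx - z[0], dy - z[1], wrap(dth - z[2])])
            # gauge constraint: keep the anchor keyframe near its prior pose
            res.extend(10.0 * (p[idx[fixed_id]] - anchor))
            return np.array(res)

        sol = least_squares(residuals, x0).x.reshape(-1, 3)
        for k in ids:
            self.poses[k] = sol[idx[k]]
```

Because each received keyframe only appends a node and an edge before triggering a re-optimization, this mirrors the incremental and asynchronous character of the approach: no agent waits for a global synchronization step before refining its own map.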
This video demonstrates the distributed pose graph optimization process in action on the TUM-VI Rooms 01-03 sequences, with multiple agents mapping the environment simultaneously.
Visual overview of inserting external keyframes and map points into the local map. External keyframe (a) and initial local map (b) are combined to create our final local map (e). This is performed incrementally by each agent as additional external data is received, and enables agents to collaboratively contribute to and benefit from a shared, continuously improving map without the need for centralized coordination.
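As a rough illustration of the merging step depicted in the figure, the sketch below shows one plausible way an external keyframe and its map points could be fused into a local map: points that match existing local landmarks are reused, and the rest are added as new landmarks. The data structures (`MapPoint`, `Keyframe`, `LocalMap`), the matching thresholds, and the simple linear duplicate search are assumptions made for clarity, not the system's actual implementation.

```python
# Illustrative sketch (assumed structure, not the actual DVM-SLAM code) of
# merging an external keyframe and its map points into a local map.
from dataclasses import dataclass, field
import numpy as np


@dataclass
class MapPoint:
    uid: int                 # globally unique id (e.g. agent id + counter)
    position: np.ndarray     # 3D position in the shared world frame
    descriptor: np.ndarray   # appearance descriptor used for matching


@dataclass
class Keyframe:
    uid: int
    pose: np.ndarray                                        # 4x4 camera-to-world
    observations: list = field(default_factory=list)        # observed map point uids


@dataclass
class LocalMap:
    keyframes: dict = field(default_factory=dict)           # uid -> Keyframe
    points: dict = field(default_factory=dict)              # uid -> MapPoint

    def _find_duplicate(self, pt: MapPoint, radius=0.05, max_desc_dist=50.0):
        """Return the uid of a local point that likely corresponds to `pt`, if any."""
        for uid, local in self.points.items():
            if (np.linalg.norm(local.position - pt.position) < radius and
                    np.linalg.norm(local.descriptor - pt.descriptor) < max_desc_dist):
                return uid
        return None

    def insert_external(self, kf: Keyframe, pts: list):
        """Incrementally merge an external keyframe and its map points."""
        merged_obs = []
        for pt in pts:
            dup = self._find_duplicate(pt)
            if dup is None:
                self.points[pt.uid] = pt      # genuinely new landmark
                merged_obs.append(pt.uid)
            else:
                merged_obs.append(dup)        # reuse the existing local landmark
        kf.observations = merged_obs
        self.keyframes[kf.uid] = kf
```

In this toy version, deduplication against existing landmarks is what lets every agent both contribute to and benefit from the shared map while keeping its local map free of redundant points.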