Cooperative Simultaneous Localization and Mapping (C-SLAM) enables multiple agents to work together in mapping unknown environments while simultaneously estimating their own positions. This approach enhances robustness, scalability, and accuracy by sharing information between agents, reducing drift, and enabling collective exploration of larger areas. In this paper, we present Decentralized Visual Monocular SLAM (DVM-SLAM), the first open-source decentralized monocular C-SLAM system. By utilizing only low-cost and lightweight monocular vision sensors, our system is well suited for small robots and micro aerial vehicles (MAVs). DVM-SLAM's real-world applicability is validated on physical robots with a custom collision avoidance framework, showcasing its potential in real-time multi-agent autonomous navigation scenarios. We also demonstrate comparable accuracy to state-of-the-art centralized monocular C-SLAM systems.
In this section we provide additional material to complement the paper.
In this video, we deploy DVM-SLAM on the Cambridge RoboMaster platform with a custom collision avoidance framework and test the system in an intersection environment, where the two robots would normally collide. The agents are able to localize each other even when their views do not overlap and they cannot see each other, demonstrating that a shared map is being built. Across the four consecutive trials run in this environment, there were zero collisions between the two agents, and the distance between agents never dropped below the collision threshold of 0.55 meters. The RMS ATE of the system was 7.4 cm over the 50-meter-long trajectory.
DVM-SLAM performs incremental, asynchronous, and distributed pose graph optimization through a keyframe sharing method. This refines the map as the agents explore the environment, reducing drift and improving accuracy.
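To make the keyframe-sharing idea concrete, the sketch below illustrates, in simplified form, how an agent could fold keyframes received from peers into a local pose graph and re-optimize it incrementally. This is a conceptual illustration only, not the DVM-SLAM implementation: the 2D (SE(2)) pose parameterization, the `KeyframeMsg` and `PoseGraph` names, and the use of `scipy.optimize.least_squares` are all illustrative assumptions, whereas the real system operates on full 6-DoF keyframes and visual constraints.

```python
# Minimal conceptual sketch (NOT the DVM-SLAM code) of incremental pose graph
# optimization driven by shared keyframes. Poses are SE(2) vectors [x, y, theta];
# edges are relative-pose constraints between keyframes.
from dataclasses import dataclass, field
import numpy as np
from scipy.optimize import least_squares


def wrap(a):
    """Wrap an angle to (-pi, pi]."""
    return (a + np.pi) % (2 * np.pi) - np.pi


@dataclass
class KeyframeMsg:
    """What an agent might broadcast: a keyframe id, its current pose estimate,
    and a relative-pose measurement to a previously shared keyframe."""
    kf_id: int
    pose: np.ndarray              # [x, y, theta] in the shared frame
    parent_id: int | None         # keyframe the constraint is relative to
    rel_pose: np.ndarray | None   # measured [dx, dy, dtheta], parent -> this


@dataclass
class PoseGraph:
    poses: dict = field(default_factory=dict)   # kf_id -> [x, y, theta]
    edges: list = field(default_factory=list)   # (parent_id, kf_id, rel_pose)

    def insert_keyframe(self, msg: KeyframeMsg):
        """Incrementally add an external keyframe and its constraint."""
        self.poses.setdefault(msg.kf_id, msg.pose.copy())
        if msg.parent_id is not None and msg.parent_id in self.poses:
            self.edges.append((msg.parent_id, msg.kf_id, msg.rel_pose))

    def optimize(self, fixed_id: int):
        """Re-optimize all poses, softly pinning one keyframe as the gauge."""
        ids = sorted(self.poses)
        idx = {k: i for i, k in enumerate(ids)}
        x0 = np.concatenate([self.poses[k] for k in ids])
        anchor = self.poses[fixed_id].copy()

        def residuals(x):
            p = x.reshape(-1, 3)
            res = []
            for a, b, z in self.edges:
                pa, pb = p[idx[a]], p[idx[b]]
                c, s = np.cos(pa[2]), np.sin(pa[2])
                # predicted pose of keyframe b expressed in a's frame
                dx = c * (pb[0] - pa[0]) + s * (pb[1] - pa[1])
                dy = -s * (pb[0] - pa[0]) + c * (pb[1] - pa[1])
                dth = wrap(pb[2] - pa[2])
                res.extend([dx - z[0], dy - z[1], wrap(dth - z[2])])
            # gauge constraint: keep the anchor keyframe near its prior pose
            res.extend(10.0 * (p[idx[fixed_id]] - anchor))
            return np.array(res)

        sol = least_squares(residuals, x0).x.reshape(-1, 3)
        for k in ids:
            self.poses[k] = sol[idx[k]]
```

Because each received keyframe only appends a node and an edge before triggering a re-optimization, this mirrors the incremental and asynchronous character of the approach: no agent waits for a global synchronization step before refining its own map.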
This video demonstrates the distributed pose graph optimization process in action on the TUM-VI Rooms 01-03 sequences, with multiple agents mapping the environment simultaneously.
Visual overview of inserting external keyframes and map points into the local map. External keyframe (a) and initial local map (b) are combined to create our final local map (e). This is performed incrementally by each agent as additional external data is received, and enables agents to collaboratively contribute to and benefit from a shared, continuously improving map without the need for centralized coordination.
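As a rough illustration of the merging step depicted in the figure, the sketch below shows one plausible way an external keyframe and its map points could be fused into a local map: points that match existing local landmarks are reused, and the rest are added as new landmarks. The data structures (`MapPoint`, `Keyframe`, `LocalMap`), the matching thresholds, and the simple linear duplicate search are assumptions made for clarity, not the system's actual implementation.

```python
# Illustrative sketch (assumed structure, not the actual DVM-SLAM code) of
# merging an external keyframe and its map points into a local map.
from dataclasses import dataclass, field
import numpy as np


@dataclass
class MapPoint:
    uid: int                 # globally unique id (e.g. agent id + counter)
    position: np.ndarray     # 3D position in the shared world frame
    descriptor: np.ndarray   # appearance descriptor used for matching


@dataclass
class Keyframe:
    uid: int
    pose: np.ndarray                                        # 4x4 camera-to-world
    observations: list = field(default_factory=list)        # observed map point uids


@dataclass
class LocalMap:
    keyframes: dict = field(default_factory=dict)           # uid -> Keyframe
    points: dict = field(default_factory=dict)              # uid -> MapPoint

    def _find_duplicate(self, pt: MapPoint, radius=0.05, max_desc_dist=50.0):
        """Return the uid of a local point that likely corresponds to `pt`, if any."""
        for uid, local in self.points.items():
            if (np.linalg.norm(local.position - pt.position) < radius and
                    np.linalg.norm(local.descriptor - pt.descriptor) < max_desc_dist):
                return uid
        return None

    def insert_external(self, kf: Keyframe, pts: list):
        """Incrementally merge an external keyframe and its map points."""
        merged_obs = []
        for pt in pts:
            dup = self._find_duplicate(pt)
            if dup is None:
                self.points[pt.uid] = pt      # genuinely new landmark
                merged_obs.append(pt.uid)
            else:
                merged_obs.append(dup)        # reuse the existing local landmark
        kf.observations = merged_obs
        self.keyframes[kf.uid] = kf
```

In this toy version, deduplication against existing landmarks is what lets every agent both contribute to and benefit from the shared map while keeping its local map free of redundant points.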