MAGICIAN

Efficient Long-Term Planning with Imagined Gaussians for Active Mapping

Shiyao Li¹,² Antoine Guédon¹ Shizhe Chen² Vincent Lepetit¹

¹Institut Polytechnique de Paris ²Inria

CVPR 2026 · Oral

Paper Code Macarons++ Dataset

Videos

Fushimi Castle

Neuschwanstein Castle

Main Contributions

MAGICIAN, a long-term active mapping framework built on a 3D world model.
Fast coverage gain estimation for future viewpoints via Imagined Gaussians.
State-of-the-art performance across indoor and outdoor scenes, embodiments, and action spaces.

Architecture of MAGICIAN

Overview of the proposed MAGICIAN framework. At time t, we first predict the occupancy field and update the Imagined Gaussians. We can then efficiently estimate the coverage gain and apply beam search to plan candidate trajectories, selecting the one with the highest expected gain. The agent then executes the first actions of the best trajectory before repeating this process in the next planning loop. In this figure, lighter colors in the Imagined Gaussians indicate higher novelty, while darker colors correspond to previously observed areas. The first trajectory darkens the novelty field the most, representing the optimal path at time t.

Imagined Gaussians

Long-term planning involves evaluating numerous candidate cameras, which motivates the introduction of Imagined Gaussians.

Predicting the coverage gain of a given camera is non-trivial because it requires evaluating three key properties for each point in its field of view (FoV):

Coverage Gain:

\[ G(\mathbf{c}) = \int_{\partial \mathcal{E} \cap \mathcal{V}(\mathbf{c})} \sigma(\mathbf{x}) \cdot o(\mathbf{x}, \mathbf{c}) \cdot \gamma(\mathbf{x} \mid \mathbf{C}_t) \, d\mathbf{x} \]

\(\sigma\) (occupancy): Is there a solid surface?
\(o\) (occlusion): Is it visible from this camera?
\(\gamma\) (novelty): Is this an unobserved point?

Indeed, we need to predict the occupancy and visibility of every point, then integrate over the entire field of view.
Too complex, too slow...
Wait, doesn't this formula look strangely familiar? Like volumetric rendering?

Volumetric Rendering:

\[ I(\mathbf{p}) = \int_{0}^{+\infty} q(\mathbf{o} + s\mathbf{d}) \cdot T(s;\, \mathbf{o}, \mathbf{d}) \cdot f(\mathbf{o} + s\mathbf{d},\, \mathbf{d})\, ds \]

Aha! They share the same mathematical structure!
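Reading the two integrands term by term suggests the following correspondence. This is an interpretation of the formulas as written on this page, not a mapping quoted from the paper:

\[ \sigma(\mathbf{x}) \leftrightarrow q(\mathbf{o} + s\mathbf{d}), \qquad o(\mathbf{x}, \mathbf{c}) \leftrightarrow T(s;\, \mathbf{o}, \mathbf{d}), \qquad \gamma(\mathbf{x} \mid \mathbf{C}_t) \leftrightarrow f(\mathbf{o} + s\mathbf{d},\, \mathbf{d}) \]

Under this reading, occupancy plays the role of density, visibility the role of transmittance, and novelty the role of the rendered feature. The rendering integral covers a single ray through pixel \(\mathbf{p}\), so covering the whole field of view \(\mathcal{V}(\mathbf{c})\) would correspond to accumulating \(I(\mathbf{p})\) over all pixels.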

We therefore train only a lightweight occupancy model: it predicts an occupancy field, which we convert into Gaussians so that the 3D Gaussian Splatting (3DGS) renderer can compute the coverage gain quickly.
We call this representation Imagined Gaussians.

Occupancy Field

Imagined Gaussians

Rendered Novelty Map

Evaluating a camera pose then reduces to three steps: predict an occupancy field, convert it into Imagined Gaussians, and sum over the rendered pixels to obtain the total coverage gain.
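As a minimal sketch of the render-and-sum idea described above, the snippet below swaps the 3DGS renderer for plain front-to-back alpha compositing along each ray, written in NumPy. The function names `render_novelty` and `coverage_gain`, the `(num_rays, num_samples)` array layout, and the use of occupancy directly as per-sample opacity are assumptions made for illustration, not the paper's implementation.

```python
import numpy as np


def render_novelty(occupancy, novelty):
    """Alpha-composite novelty along each ray (one ray per pixel).

    Both inputs have shape (num_rays, num_samples), hold values in [0, 1],
    and are ordered front to back along each ray.
    """
    alpha = np.clip(np.asarray(occupancy, dtype=float), 0.0, 1.0)
    novelty = np.asarray(novelty, dtype=float)
    # Transmittance reaching sample i: product of (1 - alpha) over samples before i.
    passed = np.cumprod(1.0 - alpha, axis=1)
    transmittance = np.concatenate(
        [np.ones((alpha.shape[0], 1)), passed[:, :-1]], axis=1
    )
    # Standard volume-rendering weights: visible AND occupied.
    weights = transmittance * alpha
    return (weights * novelty).sum(axis=1)


def coverage_gain(occupancy, novelty):
    """Sum the rendered novelty over all pixels (hypothetical helper)."""
    return float(render_novelty(occupancy, novelty).sum())


# Ray 0: empty space, then a novel surface -> contributes 1.
# Ray 1: an already-observed surface hides a novel one -> contributes 0.
occ = np.array([[0.0, 1.0, 1.0], [1.0, 1.0, 0.0]])
nov = np.array([[1.0, 1.0, 1.0], [0.0, 1.0, 1.0]])
gain = coverage_gain(occ, nov)  # 1.0
```

The second ray shows why the transmittance term matters: the novel point behind the observed surface is occupied and unobserved, yet it adds nothing because it is occluded from this camera.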

❗ ~30x faster than the original \(G(\mathbf{c})\) estimation!

Imagined Gaussians at \(t_0\)

Imagined Gaussians at \(t_1 > t_0\)

Imagined Gaussians at \(t_2 > t_1\)

Evolution of Imagined Gaussians Compared with Ground Truth Mesh. The brighter the Gaussians, the higher their predicted occupancy. As exploration progresses (from left to right), our Imagined Gaussians increasingly align with the ground truth mesh, demonstrating improved environmental modeling.

Long-term Planning

We use beam search to efficiently explore the space of candidate trajectories and select the one with the highest expected coverage gain. The agent then executes the first \(N_f\) actions before replanning. For ablation studies on the planning phase, please refer to the main paper.
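The planning loop can be illustrated with a generic beam search on a toy 1D world. This is a sketch under stated assumptions: the `beam_search` signature, the `toy_step` and `toy_gain` helpers, the `novelty` table, and the variable `n_f` (standing in for \(N_f\)) are all hypothetical, and the gain here is a lookup table rather than a rendered coverage gain.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple


def beam_search(
    start: int,
    actions: Sequence[int],
    step: Callable[[int, int], int],
    gain: Callable[[int, FrozenSet[int]], float],
    horizon: int,
    beam_width: int,
) -> Tuple[List[int], float]:
    """Return the best action sequence found and its cumulative gain."""
    # Each beam entry: (cumulative gain, current state, actions so far, visited states).
    beam = [(0.0, start, [], frozenset([start]))]
    for _ in range(horizon):
        candidates = []
        for total, state, plan, seen in beam:
            for action in actions:
                nxt = step(state, action)
                candidates.append(
                    (total + gain(nxt, seen), nxt, plan + [action], seen | {nxt})
                )
        # Keep only the beam_width partial trajectories with the highest gain.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_width]
    best_total, _, best_plan, _ = beam[0]
    return best_plan, best_total


# Toy world: positions on a line; each position yields its novelty once.
novelty = {1: 2.0, -1: 1.0, -2: 5.0}


def toy_step(state: int, action: int) -> int:
    return state + action


def toy_gain(state: int, seen: FrozenSet[int]) -> float:
    return 0.0 if state in seen else novelty.get(state, 0.0)


plan, total = beam_search(0, (-1, 1), toy_step, toy_gain, horizon=2, beam_width=2)
n_f = 1
executed = plan[:n_f]  # execute only the first n_f actions, then replan

# With beam_width=1 the search degenerates to greedy, one-step-ahead planning.
greedy_plan, greedy_total = beam_search(
    0, (-1, 1), toy_step, toy_gain, horizon=2, beam_width=1
)
```

In this toy setting the wider beam finds the plan `[-1, -1]` with total gain 6.0, while the greedy variant takes the locally best first step to the right and ends with 2.0, which mirrors the motivation for planning over a longer horizon.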

Results

MAGICIAN achieves state-of-the-art performance across diverse robot platforms and both indoor and outdoor environments.

Comparison panels for three scenes (Pisa Cathedral, Barts, St. Sofia Church), each showing FisherRF, MACARONS, and MAGICIAN.

Video Comparisons

Previous state of the art

MAGICIAN

Our long-term planning is more efficient and explores the scene more comprehensively than short-sighted planning approaches.

BibTeX

@inproceedings{li2026magician,
  author = "Shiyao Li and Antoine Guédon and Shizhe Chen and Vincent Lepetit",
  title = {{MAGICIAN: Efficient Long-Term Planning with Imagined Gaussians for Active Mapping}},
  booktitle = {{Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)}},
  year = 2026
}