In your 3DGS setup, these packages form a pipeline:
video/images
→ ffmpeg
→ COLMAP
→ Nerfstudio
→ gsplat
→ 3D Gaussian Splatting result
cmake and ninja are supporting build tools used when some Python/CUDA packages need to compile native code.
1. ffmpeg
ffmpeg is a video and image processing tool.
We need it because your input may be a video recorded by a phone, camera, or robot. 3DGS training does not read a video file directly; it needs a set of individual image frames.
So ffmpeg is used to split the video into frames:
video.mp4 → image_0001.jpg, image_0002.jpg, image_0003.jpg, ...
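It can also resize videos/images.
Example (width 1280; -2 picks the height that keeps the aspect ratio and rounds it to an even number, which common H.264 encoders require):
ffmpeg -i input.mp4 -vf scale=1280:-2 output_720p.mp4
The output is 720p only when the input is 16:9.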
Why it matters for you:
It helps extract and resize frames before camera pose estimation and 3DGS training.
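As a minimal sketch of the frame-extraction step described above, the snippet below builds an ffmpeg command that samples a fixed number of frames per second. The function name `build_frame_extraction_cmd`, the `frames` directory, and the rate of 2 frames per second are illustrative assumptions, not values from your setup.

```python
from pathlib import Path


def build_frame_extraction_cmd(video, out_dir, fps=2):
    """Return an ffmpeg argument list that samples `fps` frames per second.

    Hypothetical helper for illustration; paths and fps are assumptions.
    """
    pattern = str(Path(out_dir) / "image_%04d.jpg")
    return [
        "ffmpeg",
        "-i", str(video),      # input video
        "-vf", f"fps={fps}",   # keep this many frames per second
        "-qscale:v", "2",      # high JPEG quality (lower value = better)
        pattern,               # numbered output files: image_0001.jpg, ...
    ]


cmd = build_frame_extraction_cmd("input.mp4", "frames", fps=2)
print(" ".join(cmd))
```

To actually run it, create the output directory first (ffmpeg does not create it) and pass the list to `subprocess.run(cmd, check=True)`.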
2. COLMAP
COLMAP is a Structure-from-Motion tool.
It estimates:
- camera poses,
- sparse 3D points,
- camera intrinsics if needed.
3DGS depends on this information.
For every training image, it needs to know the camera's position and orientation:
For image 0001, where was the camera?
For image 0002, where was the camera?
For image 0003, where was the camera?
COLMAP provides this information.
The general flow is:
images
→ feature extraction
→ feature matching
→ sparse reconstruction
→ camera poses + sparse point cloud
Why it matters for you:
Without camera poses, 3DGS cannot correctly connect different images into one 3D scene. The sparse 3D points are also commonly used as the starting positions for the Gaussians.
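The feature extraction → matching → sparse reconstruction flow above maps onto three COLMAP command-line steps. The sketch below only builds the commands; the helper name `build_colmap_cmds` and the directory layout (`database.db`, `sparse`) are assumptions for illustration. Exhaustive matching is used here for simplicity, and other matchers may suit video frames better.

```python
from pathlib import Path


def build_colmap_cmds(image_dir, work_dir):
    """Return the three COLMAP commands for a basic sparse reconstruction.

    Hypothetical helper; the file layout under work_dir is an assumption.
    """
    work = Path(work_dir)
    db = str(work / "database.db")
    sparse = str(work / "sparse")
    images = str(image_dir)
    return [
        # 1. Detect keypoints and descriptors in every image.
        ["colmap", "feature_extractor",
         "--database_path", db, "--image_path", images],
        # 2. Match features between image pairs.
        ["colmap", "exhaustive_matcher", "--database_path", db],
        # 3. Solve for camera poses and a sparse point cloud.
        ["colmap", "mapper",
         "--database_path", db, "--image_path", images,
         "--output_path", sparse],
    ]


for step in build_colmap_cmds("frames", "colmap_out"):
    print(" ".join(step))
```

In the Nerfstudio workflow you normally do not call these yourself, because ns-process-data runs COLMAP for you; the sketch is only meant to show what happens underneath.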
3. cmake
cmake is a build configuration tool.
Some 3DGS-related packages contain C++ or CUDA code. Python alone is not fast enough for some operations, especially GPU rasterization.
cmake helps prepare the build process for compiled code.
It answers questions such as:
Where is CUDA?
Where is the compiler?
Which source files should be compiled?
Which libraries should be linked?
Why it matters for you:
It is needed when installing or compiling packages that include C++/CUDA extensions.
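Before installing packages with C++/CUDA extensions, it can help to check that the build tools are visible on your PATH. The sketch below is one way to do that; the helper name `find_build_tools` is hypothetical, and `nvcc` (the CUDA compiler) is included on the assumption that you are building CUDA code.

```python
import shutil


def find_build_tools(names=("cmake", "ninja", "nvcc")):
    """Map each tool name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


for name, path in find_build_tools().items():
    print(f"{name}: {path or 'not found on PATH'}")
```

A missing entry does not always mean the install will fail, since some packages ship prebuilt binaries, but it is a quick first thing to check when a build error appears.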
4. ninja
ninja is a fast build system.
If cmake prepares the build instructions, ninja executes them quickly.
Simple comparison:
cmake = prepares the build plan
ninja = performs the compilation
Why it matters for you:
It speeds up compilation of CUDA/C++ extensions by building independent source files in parallel.
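The "plan, then build" split can be sketched as two commands: one cmake call that writes Ninja build files, and one call that runs the build. The helper name `build_native_cmds` and the `build` directory are assumptions; most Python packages trigger an equivalent sequence for you during installation.

```python
def build_native_cmds(source_dir=".", build_dir="build"):
    """Return (configure, compile) commands for a cmake + ninja build.

    Hypothetical helper; directory names are illustrative assumptions.
    """
    # Step 1: cmake inspects the system and writes Ninja build files.
    configure = ["cmake", "-S", source_dir, "-B", build_dir, "-G", "Ninja"]
    # Step 2: cmake asks the chosen generator (ninja) to compile everything.
    compile_cmd = ["cmake", "--build", build_dir]
    return configure, compile_cmd


configure, compile_cmd = build_native_cmds()
print(" ".join(configure))
print(" ".join(compile_cmd))
```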
5. Nerfstudio
Nerfstudio is the main framework we use.
It provides the complete training and viewing pipeline.
It gives you commands such as:
ns-process-data
ns-train
ns-viewer
ns-export
For your 3DGS test, we use:
ns-process-data video ...
to extract frames from the video and run COLMAP (for a folder of images, the subcommand is ns-process-data images ...), and:
ns-train splatfacto ...
to train 3D Gaussian Splatting.
Why it matters for you:
Nerfstudio is the user-friendly framework that connects data processing, camera pose estimation, 3DGS training, and visualization.
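Put together, the two commands above might look like the following sketch. The helper name `build_nerfstudio_cmds` and the paths `input.mp4` and `processed` are placeholders, not values from your machine.

```python
def build_nerfstudio_cmds(video, processed_dir):
    """Return (process, train) commands for a video-based Splatfacto run.

    Hypothetical helper; both paths are placeholders.
    """
    # Extract frames and run COLMAP, writing a Nerfstudio dataset.
    process = ["ns-process-data", "video",
               "--data", str(video), "--output-dir", str(processed_dir)]
    # Train Splatfacto on the processed dataset.
    train = ["ns-train", "splatfacto", "--data", str(processed_dir)]
    return process, train


process, train = build_nerfstudio_cmds("input.mp4", "processed")
print(" ".join(process))
print(" ".join(train))
```

The training step reads from the same directory the processing step writes to, which is why both commands share `processed_dir`.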
6. gsplat
gsplat is the CUDA-accelerated Gaussian Splatting backend.
3DGS needs to render many 3D Gaussians very fast. This operation is computationally heavy. gsplat provides efficient GPU code for this.
Nerfstudio’s splatfacto method uses gsplat for fast Gaussian rasterization.
Simple explanation:
Nerfstudio = training framework
Splatfacto = Nerfstudio’s 3DGS method
gsplat = fast CUDA renderer used by Splatfacto
Why it matters for you:
It makes 3DGS training and rendering fast enough to be practical.
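To give a feel for why rendering is heavy, here is a tiny pure-Python sketch of the blending done for a single pixel: the Gaussians covering it are sorted by depth and blended front to back. This is an illustration of the idea only, not gsplat's API; `composite_pixel` is a hypothetical name, and each `alpha` stands for the already-computed contribution of one Gaussian at that pixel.

```python
def composite_pixel(splats):
    """Blend (depth, rgb, alpha) contributions front to back for one pixel."""
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet blocked
    for _, rgb, alpha in sorted(splats, key=lambda s: s[0]):  # nearest first
        weight = transmittance * alpha
        color = [c + weight * ch for c, ch in zip(color, rgb)]
        transmittance *= 1.0 - alpha
    return color, transmittance


splats = [
    (2.0, (0.0, 0.0, 1.0), 0.5),  # blue Gaussian, farther away
    (1.0, (1.0, 0.0, 0.0), 0.5),  # red Gaussian, nearer
]
color, remaining = composite_pixel(splats)
print(color, remaining)  # [0.5, 0.0, 0.25] 0.25
```

A real scene repeats this for every pixel and a very large number of Gaussians on each training step, which is the work a CUDA backend parallelizes.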
7. How they work together
For a video input, the workflow is:
1. ffmpeg extracts frames from the video.
2. COLMAP estimates camera poses and sparse 3D points.
3. Nerfstudio organizes the data and starts training.
4. gsplat renders Gaussians efficiently during training.
5. cmake + ninja help compile the required CUDA/C++ components when needed.
For an image input, the workflow is similar, but ffmpeg may only be used for resizing:
images
→ COLMAP
→ Nerfstudio
→ gsplat
→ 3DGS result
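The difference between the video and image workflows comes down to which ns-process-data subcommand is used. The sketch below picks one from the input path; the helper name `build_process_cmd` and the list of video extensions are assumptions you would adjust to your own data.

```python
from pathlib import Path

# Assumed set of video extensions; extend as needed.
VIDEO_SUFFIXES = {".mp4", ".mov", ".avi"}


def build_process_cmd(data, output_dir):
    """Choose the 'video' or 'images' subcommand from the input path."""
    data = Path(data)
    kind = "video" if data.suffix.lower() in VIDEO_SUFFIXES else "images"
    return ["ns-process-data", kind,
            "--data", str(data), "--output-dir", str(output_dir)]


print(" ".join(build_process_cmd("walk.mp4", "processed")))
print(" ".join(build_process_cmd("photos", "processed")))
```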
8. Simple analogy
Think of building a 3DGS scene like making a robot map from a video.
| Package | Role | Simple analogy |
| ------------ | ---------------------- | -------------------------------- |
| ffmpeg | extracts frames | cuts video into photos |
| COLMAP | estimates camera poses | finds where each photo was taken |
| cmake | prepares compilation | prepares construction plan |
| ninja | compiles code | builds the machine parts |
| Nerfstudio | main framework | project manager |
| gsplat | fast Gaussian renderer | GPU engine |
9. One-sentence summary
ffmpeg prepares video frames, COLMAP estimates camera poses, Nerfstudio manages the 3DGS training pipeline, gsplat performs fast Gaussian rendering on the GPU, and cmake/ninja help compile the required C++/CUDA components.