- Ref.:
- Used Packages: ffmpeg, COLMAP, cmake, ninja, Nerfstudio, gsplat
- Example of creating a 360-degree outfit showcase using images
Below is a clean setup for your workstation:
Ubuntu 24.04
Dual RTX 5080, 16 GB VRAM each
NVIDIA driver 580.159.03
CUDA / nvcc 12.8
gcc/g++ 13.3
We will use Nerfstudio Splatfacto. Nerfstudio supports end-to-end image/video processing, and its ns-process-data command accepts both image and video inputs; the official documentation says COLMAP and FFmpeg should be installed for this workflow. Splatfacto is Nerfstudio’s 3D Gaussian Splatting method, where 3D Gaussians are projected onto 2D camera views for fast rendering.
3DGS Setup and Test Guide
1. Create the conda environment
conda create -n ns3dgs python=3.10 -y
conda activate ns3dgs
python -m pip install --upgrade pip setuptools wheel
Check:
python --version
which python
Expected:
Python 3.10.x
.../envs/ns3dgs/bin/python
2. Install PyTorch for CUDA 12.8
Because your workstation has CUDA 12.8, use the CUDA 12.8 PyTorch wheel:
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
Test GPU detection:
python - <<'PY'
import torch
print("torch version:", torch.__version__)
print("torch cuda:", torch.version.cuda)
print("cuda available:", torch.cuda.is_available())
print("gpu count:", torch.cuda.device_count())
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))
PY
Expected result:
cuda available: True
gpu count: 2
0 NVIDIA GeForce RTX 5080
1 NVIDIA GeForce RTX 5080
Do not continue until this test works.
3. Install Ubuntu dependencies
sudo apt update
sudo apt install -y git cmake ninja-build build-essential ffmpeg colmap
Check:
ffmpeg -version
colmap -h
cmake --version
ninja --version
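If you prefer one check instead of four separate commands, a small script can report which tools are missing from PATH. This is a minimal sketch: the function name find_missing_tools is made up for illustration, and the tool list simply mirrors the apt packages above (ninja-build installs the ninja binary).

```python
import shutil

# Binaries expected after the apt install step above.
REQUIRED_TOOLS = ["git", "cmake", "ninja", "ffmpeg", "colmap"]


def find_missing_tools(tools, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing_tools(REQUIRED_TOOLS)
    print("missing tools:", missing if missing else "none")
```

An empty result means every tool was found; it does not confirm that the installed versions work with Nerfstudio.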
4. Install Nerfstudio and gsplat
pip install nerfstudio
pip install gsplat
According to its PyPI page, Nerfstudio can be installed with pip install nerfstudio. The gsplat library provides CUDA-accelerated Gaussian rasterization with Python bindings.
Test installation:
ns-train --help
ns-process-data --help
python -c "import gsplat; print('gsplat OK')"
If gsplat fails, try this compiler workaround:
sudo apt install -y gcc-12 g++-12
export CC=/usr/bin/gcc-12
export CXX=/usr/bin/g++-12
pip uninstall -y gsplat
pip install --no-cache-dir gsplat
python -c "import gsplat; print('gsplat OK')"
5. Create a project folder
mkdir -p ~/vln_3dgs_project/raw_data/video_test/video
mkdir -p ~/vln_3dgs_project/raw_data/image_test/images
mkdir -p ~/vln_3dgs_project/processed_data
mkdir -p ~/vln_3dgs_project/outputs
Recommended first scene:
Good:
- lab desk
- robot workspace
- cabinet corner
- bookshelf
- toolbox area
Avoid:
- white corridor
- glass wall
- empty room
- fast camera motion
- many moving people
Part A: Test 3DGS using a video
A1. Prepare the video
Copy your video into the folder:
cp your_video.mp4 ~/vln_3dgs_project/raw_data/video_test/video/
Suggested video settings:
Length: 30–60 seconds
Resolution: 720p or 1080p
Motion: slow
Lighting: stable
People: no moving people
Camera motion: include translation, not only rotation
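If the video is too large, create a 720p version. Using -2 for the height keeps the aspect ratio and rounds the height to an even number, which the default H.264 encoder requires:
ffmpeg -i ~/vln_3dgs_project/raw_data/video_test/video/your_video.mp4 \
-vf scale=1280:-2 \
~/vln_3dgs_project/raw_data/video_test/video/your_video_720p.mp4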
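To preview the size a 1280-wide rescale will produce before running ffmpeg, the arithmetic can be sketched as below. The helper name scaled_even_size is hypothetical, and the rounding is an approximation of what ffmpeg's scale filter does when the height is given as -2 (keep the aspect ratio, land on an even number); ffmpeg's exact rounding may differ by a pixel or two.

```python
def scaled_even_size(width, height, target_width=1280):
    """Scale to target_width, keep the aspect ratio, force an even height."""
    exact_height = height * target_width / width
    # Round to the nearest multiple of 2; H.264 with 4:2:0 chroma needs even sizes.
    even_height = max(2, 2 * round(exact_height / 2))
    return target_width, even_height


if __name__ == "__main__":
    for source in [(1920, 1080), (3840, 2160), (4032, 3024)]:
        print(source, "->", scaled_even_size(*source))
```

A 16:9 source lands on 1280x720, while a 4:3 phone video comes out taller, so "720p" here really means "1280 pixels wide".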
A2. Process the video
Start with 300 frames:
conda activate ns3dgs
ns-process-data video \
--data ~/vln_3dgs_project/raw_data/video_test/video/your_video_720p.mp4 \
--output-dir ~/vln_3dgs_project/processed_data/video_test_300 \
--num-frames-target 300
This step will:
1. extract video frames,
2. run COLMAP,
3. estimate camera poses,
4. create transforms.json,
5. prepare the dataset for Splatfacto.
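To get a feel for how --num-frames-target relates to clip length, the arithmetic below assumes the extractor keeps every Nth frame with an integer stride. That stride rule is an assumption for illustration, not something this guide confirms, so the real frame count may differ from the estimate.

```python
def estimate_sampling(duration_s, fps, num_frames_target):
    """Estimate (total_frames, stride, kept_frames) under a uniform-stride assumption."""
    total_frames = int(duration_s * fps)
    # Assumed rule: keep every `stride`-th frame, never less than every frame.
    stride = max(1, total_frames // num_frames_target)
    kept_frames = -(-total_frames // stride)  # ceiling division
    return total_frames, stride, kept_frames


if __name__ == "__main__":
    for seconds in (30, 45, 60):
        print(seconds, "s at 30 fps ->", estimate_sampling(seconds, 30, 300))
```

Under this assumption a 30 to 60 second clip at 30 fps gives a stride of 3 to 6 frames, which is one reason slow camera motion matters: neighbouring kept frames still need to overlap heavily for COLMAP to match them.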
Check the result:
ls ~/vln_3dgs_project/processed_data/video_test_300
Expected:
images
transforms.json
colmap
If transforms.json is missing, COLMAP most likely failed to estimate the camera poses.
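Beyond checking that the file exists, it helps to see how many frames actually received a pose. The sketch below assumes transforms.json holds a top-level "frames" list with one entry per posed image, which is the usual Nerfstudio layout; summarize_transforms is a hypothetical helper name.

```python
import json
from pathlib import Path


def summarize_transforms(dataset_dir):
    """Return (file_exists, posed_frame_count) for <dataset_dir>/transforms.json."""
    path = Path(dataset_dir).expanduser() / "transforms.json"
    if not path.is_file():
        return False, 0
    with path.open() as f:
        data = json.load(f)
    return True, len(data.get("frames", []))


if __name__ == "__main__":
    exists, count = summarize_transforms(
        "~/vln_3dgs_project/processed_data/video_test_300"
    )
    print("transforms.json found:", exists, "| posed frames:", count)
```

If the count is far below the number of extracted images, COLMAP registered only part of the video, and the capture advice in the troubleshooting section applies.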
A3. Train Splatfacto from the video dataset
Use only one RTX 5080 first:
CUDA_VISIBLE_DEVICES=0 ns-train splatfacto \
--data ~/vln_3dgs_project/processed_data/video_test_300 \
--output-dir ~/vln_3dgs_project/outputs
During training, you should see a viewer link:
http://localhost:7007
Open it in your workstation browser.
If you are using SSH:
ssh -L 7007:localhost:7007 user@your_server_ip
Then open this on your local computer:
http://localhost:7007
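If the page does not load, a quick socket test tells you whether anything is listening on the forwarded port at all. This is a minimal sketch using only the standard library; port_is_open is a hypothetical helper, and 7007 is the port from the viewer link above.

```python
import socket


def port_is_open(host="localhost", port=7007, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("viewer port reachable:", port_is_open())
```

Run it on your local computer while the SSH tunnel is up. False usually means the tunnel is not running or training has not started the viewer yet.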
A4. If CUDA memory is not enough
Each RTX 5080 has 16 GB of VRAM. If training runs out of memory, reduce the number of frames:
ns-process-data video \
--data ~/vln_3dgs_project/raw_data/video_test/video/your_video_720p.mp4 \
--output-dir ~/vln_3dgs_project/processed_data/video_test_150 \
--num-frames-target 150
Then train:
CUDA_VISIBLE_DEVICES=0 ns-train splatfacto \
--data ~/vln_3dgs_project/processed_data/video_test_150 \
--output-dir ~/vln_3dgs_project/outputs
Part B: Test 3DGS using images
B1. Prepare image folder
Put your images here:
~/vln_3dgs_project/raw_data/image_test/images
Example:
ls ~/vln_3dgs_project/raw_data/image_test/images
Expected:
0001.jpg
0002.jpg
0003.jpg
...
Recommended first image dataset:
Number of images: 50–200
Resolution: 720p or 1080p
Scene: small indoor scene
Motion: different viewpoints around the scene
Lighting: stable
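A short script can confirm the folder holds a sensible number of images before you spend time on COLMAP. This is a sketch under assumptions: the extension list is a guess at common formats, and the 50 to 200 range is taken from the recommendation above.

```python
from pathlib import Path

# Assumed set of image extensions; extend it if your camera writes other formats.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def count_images(folder):
    """Count image files directly inside folder (no recursion)."""
    folder = Path(folder).expanduser()
    if not folder.is_dir():
        return 0
    return sum(
        1 for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


def in_recommended_range(count, low=50, high=200):
    """Check the count against the recommended first-dataset size."""
    return low <= count <= high


if __name__ == "__main__":
    n = count_images("~/vln_3dgs_project/raw_data/image_test/images")
    print("images:", n, "| within 50-200:", in_recommended_range(n))
```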
If your images are very high resolution, resize them:
mkdir -p ~/vln_3dgs_project/raw_data/image_test/images_720p
for img in ~/vln_3dgs_project/raw_data/image_test/images/*; do
  filename=$(basename "$img")
  ffmpeg -y -i "$img" -vf scale=1280:-1 \
    ~/vln_3dgs_project/raw_data/image_test/images_720p/"$filename"
done
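B2. Process the images
If you skipped the resize step, point --data at ~/vln_3dgs_project/raw_data/image_test/images instead of images_720p.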
ns-process-data images \
--data ~/vln_3dgs_project/raw_data/image_test/images_720p \
--output-dir ~/vln_3dgs_project/processed_data/image_test
Check:
ls ~/vln_3dgs_project/processed_data/image_test
Expected:
images
transforms.json
colmap
B3. Train Splatfacto from images
CUDA_VISIBLE_DEVICES=0 ns-train splatfacto \
--data ~/vln_3dgs_project/processed_data/image_test \
--output-dir ~/vln_3dgs_project/outputs
Open the viewer:
http://localhost:7007
6. How to use both GPUs
For your first training run, use only one GPU.
The best use of the dual RTX 5080s is running experiments in parallel:
Terminal 1:
CUDA_VISIBLE_DEVICES=0 ns-train splatfacto \
--data ~/vln_3dgs_project/processed_data/video_test_300 \
--output-dir ~/vln_3dgs_project/outputs
Terminal 2:
CUDA_VISIBLE_DEVICES=1 ns-train splatfacto \
--data ~/vln_3dgs_project/processed_data/image_test \
--output-dir ~/vln_3dgs_project/outputs
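This is easier than trying to use two GPUs for one 3DGS scene. Each run prints its own viewer link, so open the one shown in that terminal rather than assuming port 7007.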
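The two-terminal setup can also be driven from one script. The sketch below builds one ns-train command per GPU using only the flags already shown in this guide; build_job and launch_all are hypothetical helper names, and the dataset folder names are the ones created earlier.

```python
import os
import subprocess
from pathlib import Path

PROJECT = Path("~/vln_3dgs_project").expanduser()


def build_job(gpu_index, dataset_name):
    """Return (env, cmd) for one Splatfacto run pinned to a single GPU."""
    # Copy the current environment and restrict the run to one device.
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=str(gpu_index))
    cmd = [
        "ns-train", "splatfacto",
        "--data", str(PROJECT / "processed_data" / dataset_name),
        "--output-dir", str(PROJECT / "outputs"),
    ]
    return env, cmd


def launch_all(jobs):
    """Start every job at once, then wait and return the exit codes."""
    procs = [subprocess.Popen(cmd, env=env) for env, cmd in jobs]
    return [proc.wait() for proc in procs]
```

Calling launch_all([build_job(0, "video_test_300"), build_job(1, "image_test")]) starts both trainings together. Their logs are interleaved in one terminal, which is the main drawback compared with two separate terminals.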
7. How to judge whether the test succeeded
In the viewer, check:
| Item              | Good result                    |
| ----------------- | ------------------------------ |
| camera trajectory | smooth, not scattered randomly |
| object appearance | recognizable                   |
| table/wall/floor  | stable, not twisted            |
| novel view        | not too blurry                 |
| ghost artifacts   | few                            |
| overall scene     | consistent 3D structure        |
If the result is broken, the most common reason is bad camera pose estimation, not the 3DGS model.
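The "smooth, not scattered" trajectory check can be approximated numerically before opening the viewer. The sketch below assumes each frame in transforms.json carries a 4x4 camera-to-world transform_matrix whose last column holds the camera position, and that sorting by file_path gives capture order (reasonable for video frames, not guaranteed for loose photos). The jump threshold of 5 times the median step is an arbitrary choice for illustration.

```python
import math


def camera_positions(transforms):
    """Extract (x, y, z) camera positions, ordered by image file name."""
    frames = sorted(transforms["frames"], key=lambda f: f["file_path"])
    return [tuple(row[3] for row in f["transform_matrix"][:3]) for f in frames]


def step_lengths(positions):
    """Distances between consecutive camera positions."""
    return [math.dist(a, b) for a, b in zip(positions, positions[1:])]


def find_jumps(steps, factor=5.0):
    """Indices of steps much longer than the median step."""
    if not steps:
        return []
    median = sorted(steps)[len(steps) // 2]
    if median == 0:
        return []
    return [i for i, step in enumerate(steps) if step > factor * median]
```

Load the file with json.load, then run find_jumps(step_lengths(camera_positions(data))). A few flagged indices point at individual bad poses, while many suggest COLMAP never found a consistent path.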
8. Common problems and fixes
Problem 1: COLMAP fails
Symptoms:
No transforms.json
Very few registered images
Broken camera path
Fix:
Record slower.
Use more visual texture.
Avoid white walls and glass.
Avoid only rotating the camera.
Move with translation.
Avoid moving people.
Use 150–300 good frames.
Problem 2: CUDA out of memory
Fix:
Use 150 frames instead of 300.
Resize to 720p.
Use a smaller scene.
Use one GPU only.
Close other GPU programs.
Command:
nvidia-smi
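To read the free memory per GPU programmatically, nvidia-smi can emit CSV that is easy to parse. In this sketch the helper names are hypothetical; the query fields (index, memory.used, memory.total) and the csv,noheader,nounits format are standard nvidia-smi options, with values reported in MiB.

```python
import subprocess

QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text):
    """Parse 'index, used, total' lines (MiB) into a list of dicts."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, used, total = (int(field.strip()) for field in line.split(","))
        rows.append({"index": index, "used_mib": used, "free_mib": total - used})
    return rows


def query_gpu_memory():
    """Run nvidia-smi and return the parsed per-GPU memory figures."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_gpu_memory(result.stdout)
```

Calling query_gpu_memory() before training shows whether another program is already holding VRAM on the GPU you plan to use.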
Problem 3: gsplat compile/import error
Try:
sudo apt install -y gcc-12 g++-12
export CC=/usr/bin/gcc-12
export CXX=/usr/bin/g++-12
pip uninstall -y gsplat
pip install --no-cache-dir gsplat
Problem 4: Viewer cannot open through SSH
Use port forwarding:
ssh -L 7007:localhost:7007 user@your_server_ip
Then open:
http://localhost:7007
9. Recommended first experiment for your workstation
Because your GPUs have 16 GB VRAM, use this setting first:
Input type: video
Scene: lab desk or robot workspace
Video: 30–60 seconds
Resolution: 720p
Frames: 150–300
GPU: CUDA_VISIBLE_DEVICES=0
Method: ns-train splatfacto
After this works, try:
RGB-D / robot-view video
ROS 2 pose alignment
SAM 2 semantic masks
language-guided correction
VLN/VLA training environment
10. Full command summary
# Create environment
conda create -n ns3dgs python=3.10 -y
conda activate ns3dgs
python -m pip install --upgrade pip setuptools wheel
# Install PyTorch CUDA 12.8
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
# Test PyTorch
python - <<'PY'
import torch
print("torch version:", torch.__version__)
print("torch cuda:", torch.version.cuda)
print("cuda available:", torch.cuda.is_available())
print("gpu count:", torch.cuda.device_count())
for i in range(torch.cuda.device_count()):
    print(i, torch.cuda.get_device_name(i))
PY
# Install system tools
sudo apt update
sudo apt install -y git cmake ninja-build build-essential ffmpeg colmap
# Install Nerfstudio and gsplat
pip install nerfstudio
pip install gsplat
# Test installation
ns-train --help
ns-process-data --help
python -c "import gsplat; print('gsplat OK')"
# Create folders
mkdir -p ~/vln_3dgs_project/raw_data/video_test/video
mkdir -p ~/vln_3dgs_project/raw_data/image_test/images
mkdir -p ~/vln_3dgs_project/processed_data
mkdir -p ~/vln_3dgs_project/outputs
# Process video
ns-process-data video \
--data ~/vln_3dgs_project/raw_data/video_test/video/your_video_720p.mp4 \
--output-dir ~/vln_3dgs_project/processed_data/video_test_300 \
--num-frames-target 300
# Train video test
CUDA_VISIBLE_DEVICES=0 ns-train splatfacto \
--data ~/vln_3dgs_project/processed_data/video_test_300 \
--output-dir ~/vln_3dgs_project/outputs
# Process images
ns-process-data images \
--data ~/vln_3dgs_project/raw_data/image_test/images_720p \
--output-dir ~/vln_3dgs_project/processed_data/image_test
# Train image test
CUDA_VISIBLE_DEVICES=0 ns-train splatfacto \
--data ~/vln_3dgs_project/processed_data/image_test \
--output-dir ~/vln_3dgs_project/outputs
The next thing you should do is run the PyTorch GPU test first. Once both RTX 5080 GPUs are detected, install Nerfstudio and we can test one small video.