Video to Gaussian Splat Scene Converter (as docker container)


Project scope
Categories
Cloud technologies, Robotics, Software development, Machine learning
Skills
docker container, docker (software), python (programming language), report writing, slam algorithms (simultaneous localization and mapping), pose estimation, match moving, computer vision, gaussian process
Sunny Days Technologies is seeking to develop a streamlined solution for creating game-like scenes from videos in order to enhance learners' experience.
This project challenges the student to develop a comprehensive pipeline for converting single-shot video footage (drone footage, circular motion recording, linear recording, etc.) into Gaussian Splat representations. The student will research, implement, and package a complete solution as a Docker image that processes video input and generates a Gaussian Splat scene.
Ideal candidates should have some basic experience with computer vision fundamentals and 3D, and be comfortable exploring open-source programs. Familiarity with Python and Docker containerization is beneficial.
The core of this project involves creating an end-to-end pipeline that:
- Processes continuous video footage
- Extracts camera poses and scene geometry
- Optimizes a 3D Gaussian Splat representation
- Outputs a standardized Gaussian Splat format that can be visualized
The pipeline should leverage existing open-source technologies while creating a seamless workflow. Students will need to understand and integrate various components including Structure from Motion (SfM), camera tracking, and Gaussian Splat optimization techniques.
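One way to picture the four steps listed above is as an ordered plan in which each stage writes into its own working directory. The sketch below is illustrative only: the `Stage` class, the stage names, and the directory layout are assumptions made for this example, not part of the project brief.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Stage:
    """One step of the hypothetical pipeline and the directory it writes to."""
    name: str
    output_dir: Path


def plan_stages(video: Path, work_dir: Path) -> list[Stage]:
    """Return the stages in execution order for a single input video."""
    run_dir = work_dir / video.stem  # one working directory per video
    return [
        Stage("extract_frames", run_dir / "frames"),   # video -> still images
        Stage("estimate_poses", run_dir / "sfm"),      # SfM: camera poses and sparse points
        Stage("optimize_splats", run_dir / "splats"),  # Gaussian Splat optimization
        Stage("export_scene", run_dir / "export"),     # final .ply or .spz output
    ]


if __name__ == "__main__":
    for stage in plan_stages(Path("/videos/input.mp4"), Path("/tmp/work")):
        print(f"{stage.name:16s} -> {stage.output_dir}")
```

Keeping each stage's output in a separate directory makes it easier to rerun a single stage while debugging, at the cost of some extra disk space.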
Timeline (5 weeks)
Week 1: Research phase, investigating existing tools and techniques
Week 2: Present working prototype
Week 3: Integration, optimization, and Docker containerization
Weeks 4 and 5: Testing, documentation, and final report preparation
Relevant Open Source tools: COLMAP, gsplat, opensplat, SuperSplat, nianticlabs/spz
Input: A single video file (.mp4, .mov, etc.) with camera motion around or through a scene
Output: A Gaussian Splat representation (.ply or .spz or other appropriate format) that can be visualized in standard Gaussian Splat viewers
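A small sanity check on both ends of the pipeline can catch obvious problems early. The sketch below assumes the output is a `.ply` file and reads only its text header; in the PLY layout commonly used for Gaussian Splats, each vertex element is understood to correspond to one Gaussian, but that should be confirmed against whichever tool produces the file. The function names and the set of accepted video suffixes are hypothetical.

```python
from pathlib import Path

# Assumption: only the two extensions named in the brief; extend as needed.
VIDEO_SUFFIXES = {".mp4", ".mov"}


def is_supported_video(path: Path) -> bool:
    """Check the file extension only; it does not inspect the video stream."""
    return path.suffix.lower() in VIDEO_SUFFIXES


def read_ply_vertex_count(path: Path) -> int:
    """Parse the text header of a PLY file and return its vertex count."""
    with path.open("rb") as fh:
        if fh.readline().strip() != b"ply":
            raise ValueError(f"{path} does not start with the PLY magic line")
        count = None
        for raw in fh:
            line = raw.strip()
            if line.startswith(b"element vertex"):
                count = int(line.split()[2])
            elif line == b"end_header":
                break  # stop before the (possibly binary) payload
        else:
            raise ValueError(f"{path} has no end_header line")
    if count is None:
        raise ValueError(f"{path} declares no vertex element")
    return count
```

An output file with zero vertices, or one that fails to parse, is a cheap signal that the optimization stage did not produce a usable scene.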
Students will develop a complete video-to-Gaussian-Splat pipeline containerized as a Docker image. This will involve several different steps:
- Research current Gaussian Splat generation techniques and their requirements
- Create a processing pipeline utilizing tools such as COLMAP, gsplat, opensplat, SuperSplat, and/or nianticlabs/spz
- Develop frame extraction and preprocessing modules
- Implement Structure from Motion (SfM) and camera pose estimation
- Build Gaussian Splat optimization and generation components
- Package the entire solution as a Docker image with appropriate interfaces
- Test the pipeline on various video inputs (drone footage, walk-throughs, orbiting captures)
- Document the architecture, implementation choices, and performance characteristics
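The frame extraction and SfM steps above could be driven by external command-line tools. The sketch below only builds the command lines and does not run them. It assumes `ffmpeg` is used for frame extraction (the brief does not name a tool for that step) and that COLMAP's `feature_extractor`, `sequential_matcher`, and `mapper` subcommands are used for SfM. The function names, the default sampling rate, and the file layout are hypothetical.

```python
import shlex
from pathlib import Path


def ffmpeg_extract_cmd(video: Path, frames_dir: Path, fps: float = 2.0) -> list[str]:
    """Build an ffmpeg command that samples `fps` frames per second as JPEGs."""
    return [
        "ffmpeg", "-i", str(video),
        "-vf", f"fps={fps}",   # temporal subsampling of the video
        "-qscale:v", "2",      # high JPEG quality
        str(frames_dir / "frame_%05d.jpg"),
    ]


def colmap_sfm_cmds(frames_dir: Path, sfm_dir: Path) -> list[list[str]]:
    """Build the COLMAP commands for sparse reconstruction, in execution order."""
    db = str(sfm_dir / "database.db")
    return [
        # All frames come from one video, so a single shared camera model is assumed.
        ["colmap", "feature_extractor", "--database_path", db,
         "--image_path", str(frames_dir), "--ImageReader.single_camera", "1"],
        # Sequential matching suits ordered video frames.
        ["colmap", "sequential_matcher", "--database_path", db],
        # The output directory should be created before the mapper runs.
        ["colmap", "mapper", "--database_path", db,
         "--image_path", str(frames_dir), "--output_path", str(sfm_dir / "sparse")],
    ]


if __name__ == "__main__":
    frames, sfm = Path("/work/frames"), Path("/work/sfm")
    print(shlex.join(ffmpeg_extract_cmd(Path("/videos/input.mp4"), frames)))
    for cmd in colmap_sfm_cmds(frames, sfm):
        print(shlex.join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` inside the container. The right sampling rate depends on camera speed and scene overlap, so the default here is only a starting point.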
Final deliverables should include:
- A Docker image that can be run with the command:
docker run -v /path/to/your/videos:/videos gaussian_processor_image /videos/input.mp4
- Two pages of documentation covering the pipeline architecture, recommended usage, and limitations
- A report on "Simultaneous Localization and Mapping (SLAM)" techniques and their potential applications in aligning multiple related videos into one unified Gaussian Splat
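The `docker run` command in the deliverables above passes the video path as the only argument, so the container's entrypoint needs a matching command-line interface. The following is a minimal sketch of such an interface. The `--output` option and its default (the input path with a `.ply` suffix, so the result lands in the mounted volume) are assumptions for illustration and are not specified in the brief.

```python
import argparse
import sys
from pathlib import Path


def parse_args(argv=None) -> argparse.Namespace:
    """Parse the container's arguments; `argv=None` means use sys.argv."""
    parser = argparse.ArgumentParser(
        description="Convert a video into a Gaussian Splat scene."
    )
    parser.add_argument("video", type=Path, help="path to the input video inside the container")
    # Hypothetical option: where to write the resulting scene.
    parser.add_argument("--output", type=Path, default=None, help="output scene path")
    args = parser.parse_args(argv)
    if args.output is None:
        # Default: write next to the input so the result appears in the mounted volume.
        args.output = args.video.with_suffix(".ply")
    return args


def main(argv=None) -> int:
    """Validate the input and report what would be done; returns an exit code."""
    args = parse_args(argv)
    if not args.video.is_file():
        print(f"error: input video not found: {args.video}", file=sys.stderr)
        return 2
    print(f"would process {args.video} -> {args.output}")  # pipeline stages go here
    return 0
```

An entrypoint script would typically end with `raise SystemExit(main())`, so that a missing input file produces a non-zero exit status from `docker run`.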
Bonus objectives:
- Implement .spz export as the final step of the pipeline
- Optimize the pipeline for reasonable processing time on consumer hardware
- Benchmark performance on various input types (sample inputs can be provided by the company)
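For the benchmarking objective, per-stage wall-clock timings are a simple starting point. The helper below is a sketch using only the Python standard library; the name `timed` and the stage labels are hypothetical.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(stage: str, results: dict):
    """Record the wall-clock duration of the enclosed block under `stage`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the stage raises, so failed runs are still measured.
        results[stage] = time.perf_counter() - start


if __name__ == "__main__":
    timings = {}
    with timed("extract_frames", timings):
        time.sleep(0.05)  # stand-in for real work
    for name, seconds in timings.items():
        print(f"{name}: {seconds:.2f} s")
```

Wall-clock time alone does not capture GPU memory use or output quality, which would need separate measurements.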
Sharing knowledge in specific technical skills, techniques, methodologies required for the project.
Providing access to necessary tools, software, and resources required for project completion.
Scheduled check-ins to discuss progress, address challenges, and provide feedback.
About the company
We're a startup that provides effective, fun, conversation-based tools for language learners to speak a new language with confidence!
Portals
Vancouver, British Columbia, Canada