Implement photometric stereo and plane sweep stereo algorithms for 3D scene reconstruction.
This project explores classical computer vision techniques for recovering 3D surface properties from 2D images. Photometric stereo uses varying illumination to estimate surface normals and albedo, while plane sweep stereo uses multiple viewpoints to compute depth maps through normalized cross-correlation matching.
| | |
| --- | --- |
| Completed: | August 2025 |
| Implementation: | Python with NumPy, OpenCV, Open3D |
| Datasets: | Harvard Photometric Stereo, Middlebury Stereo |
| Key Concepts: | Multi-view geometry, surface reconstruction, mesh generation |
Stereo vision encompasses multiple approaches to 3D scene reconstruction, each exploiting different image formation principles. This project implements two complementary techniques: photometric stereo for detailed surface analysis and geometric stereo for spatial structure recovery.
The complete pipeline runs photometric and geometric analysis on the input images and produces surface normals, albedo maps, depth estimates, and final mesh models.
Photometric stereo recovers surface properties by analyzing how appearance changes under different lighting conditions. The method assumes Lambertian reflectance and uses least squares optimization to solve for surface normals and albedo.
For Lambertian surfaces, observed intensity equals surface albedo times the dot product of surface normal and light direction:
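Written out, the relation described above takes the following form. The symbols are assumptions chosen for illustration: $I$ is the observed intensity at a pixel, $\rho$ the albedo, $\mathbf{n}$ the unit surface normal, and $\mathbf{l}_i$ the unit direction toward the $i$-th light.

$$
I_i = \rho \, (\mathbf{n} \cdot \mathbf{l}_i), \qquad i = 1, \dots, k
$$

Stacking the $k$ observations gives a linear system in the scaled normal $\mathbf{g} = \rho\,\mathbf{n}$, where the rows of $L$ are the light directions:

$$
\mathbf{I} = L\,\mathbf{g}, \qquad
\hat{\mathbf{g}} = (L^\top L)^{-1} L^\top \mathbf{I}, \qquad
\rho = \lVert \hat{\mathbf{g}} \rVert, \qquad
\mathbf{n} = \hat{\mathbf{g}} / \lVert \hat{\mathbf{g}} \rVert
$$

This ignores attached shadows, where $\mathbf{n} \cdot \mathbf{l}_i < 0$ and the observed intensity is clamped to zero.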
The implementation solves a linear system for each pixel using multiple illumination observations:
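A minimal sketch of such a per-pixel solve is given here. It is not the project's actual code. The function name `photometric_stereo`, the array layouts, and the synthetic test scene are assumptions made for illustration. It solves for the albedo-scaled normal at every pixel in one least-squares call, then splits the result into albedo (the vector length) and unit normal (the direction).

```python
import numpy as np

def photometric_stereo(images, lights):
    """Estimate normals and albedo under a Lambertian model.

    images: (k, h, w) grayscale stack, one image per light (assumed layout).
    lights: (k, 3) unit light directions, assumed known.
    Returns normals (h, w, 3) and albedo (h, w).
    """
    k, h, w = images.shape
    intensities = images.reshape(k, -1)             # (k, h*w)
    # Solve lights @ G = intensities for G = albedo * normal, all pixels at once.
    G, *_ = np.linalg.lstsq(lights, intensities, rcond=None)  # (3, h*w)
    albedo = np.linalg.norm(G, axis=0)
    normals = G / np.maximum(albedo, 1e-8)          # avoid division by zero
    return normals.T.reshape(h, w, 3), albedo.reshape(h, w)

# Synthetic check: a flat patch facing the camera with albedo 0.5.
lights = np.array([[0.0, 0.0, 1.0],
                   [0.5, 0.0, 0.866],
                   [0.0, 0.5, 0.866],
                   [-0.5, 0.0, 0.866]])
lights /= np.linalg.norm(lights, axis=1, keepdims=True)
true_normal = np.array([0.0, 0.0, 1.0])
true_albedo = 0.5
images = (true_albedo * lights @ true_normal)[:, None, None] * np.ones((1, 2, 2))
normals, albedo = photometric_stereo(images, lights)
```

At least three non-coplanar light directions are needed for the system to have a unique solution. Using more makes the estimate less sensitive to noise.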
Plane sweep stereo estimates scene depth by systematically testing different depth hypotheses and finding the best stereo correspondences using normalized cross-correlation matching.
3D points are projected to 2D image coordinates using calibrated camera parameters:
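The projection step can be sketched with a standard pinhole model, in which a world point is moved into the camera frame, multiplied by the intrinsic matrix, and divided by its depth. The function name `project_points`, the world-to-camera convention `X_cam = R @ X + t`, and the example intrinsics are assumptions for illustration, not values from the project.

```python
import numpy as np

def project_points(points_3d, K, R, t):
    """Project (n, 3) world points to (n, 2) pixel coordinates.

    Assumes the world-to-camera convention X_cam = R @ X_world + t
    and a 3x3 intrinsic matrix K.
    """
    cam = points_3d @ R.T + t             # world frame -> camera frame
    homog = cam @ K.T                     # apply intrinsics
    return homog[:, :2] / homog[:, 2:3]   # perspective divide by depth

# Hypothetical camera: focal length 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.zeros(3)
points = np.array([[0.0, 0.0, 2.0],
                   [1.0, 0.0, 2.0]])
pixels = project_points(points, K, R, t)
```

A point on the optical axis lands on the principal point. Shifting it 1 unit sideways at depth 2 moves it by `500 * 1 / 2 = 250` pixels.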
Correspondence quality is measured using normalized cross-correlation between image patches:
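As an illustration, the score for a single pair of patches might be computed as shown in this sketch. The function name `ncc` and the choice to return 0 for textureless patches are assumptions. Each patch is mean-centered and the dot product is divided by the product of the norms, so the score lies in [-1, 1] and is invariant to gain and offset changes in brightness.

```python
import numpy as np

def ncc(patch_a, patch_b, eps=1e-8):
    """Normalized cross-correlation between two equally sized patches."""
    a = patch_a.astype(np.float64).ravel()
    b = patch_b.astype(np.float64).ravel()
    a = a - a.mean()                      # remove brightness offset
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom < eps:                       # textureless patch: score undefined
        return 0.0
    return float(a @ b / denom)

rng = np.random.default_rng(0)
patch = rng.random((5, 5))
score_same = ncc(patch, 2.0 * patch + 3.0)   # same pattern, different gain/offset
score_inverted = ncc(patch, -patch)          # inverted contrast
score_flat = ncc(patch, np.ones((5, 5)))     # constant patch
```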
The algorithm systematically tests depth hypotheses by projecting one image to the other's viewpoint:
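One common way to realize this step is to compute, for every candidate depth, the homography induced by a fronto-parallel plane at that depth. The sketch below shows only that computation. The function names, the convention `X_src = R @ X_ref + t`, and the example camera are assumptions, and the project's own warping code may differ.

```python
import numpy as np

def plane_homography(K_ref, K_src, R, t, depth):
    """Homography taking reference pixels to source pixels for the
    plane Z = depth in the reference camera frame.

    Assumes X_src = R @ X_ref + t relates the two camera frames.
    """
    n = np.array([0.0, 0.0, 1.0])         # fronto-parallel plane normal
    return K_src @ (R + np.outer(t, n) / depth) @ np.linalg.inv(K_ref)

def sweep_homographies(K_ref, K_src, R, t, depths):
    """One homography per depth hypothesis."""
    return [plane_homography(K_ref, K_src, R, t, d) for d in depths]

# Hypothetical rectified pair with a 0.1-unit horizontal baseline.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([-0.1, 0.0, 0.0])
H_list = sweep_homographies(K, K, R, t, depths=[1.0, 2.0, 4.0])

# Map the reference principal point through the depth-2 plane.
mapped = H_list[1] @ np.array([320.0, 240.0, 1.0])
mapped = mapped[:2] / mapped[2]
```

In a full sweep, each homography would be used to resample the source image into the reference view (for example with `cv2.warpPerspective` and the `cv2.WARP_INVERSE_MAP` flag, since `H` maps reference pixels to source pixels). The per-pixel depth is then the hypothesis with the highest matching score.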
The final stage converts 2D analysis results into complete 3D mesh models through surface integration and point cloud reconstruction techniques.
Surface normals are integrated to produce depth maps using the Frankot-Chellappa algorithm, which enforces integrability constraints in the frequency domain.
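A compact NumPy sketch of the Frankot-Chellappa integration follows. The function names and the sign convention `p = -n_x / n_z`, `q = -n_y / n_z` are assumptions that depend on the coordinate system in use. The FFT also treats the surface as periodic, which real data only approximates.

```python
import numpy as np

def normals_to_gradients(normals, eps=1e-8):
    """Convert (h, w, 3) unit normals to gradients p = dz/dx, q = dz/dy."""
    nz = np.where(np.abs(normals[..., 2]) < eps, eps, normals[..., 2])
    return -normals[..., 0] / nz, -normals[..., 1] / nz

def frankot_chellappa(p, q):
    """Integrate a gradient field into a depth map (up to a constant)."""
    h, w = p.shape
    u = 2.0 * np.pi * np.fft.fftfreq(w)   # angular frequency along x (columns)
    v = 2.0 * np.pi * np.fft.fftfreq(h)   # angular frequency along y (rows)
    U, V = np.meshgrid(u, v)              # both shaped (h, w)
    P, Q = np.fft.fft2(p), np.fft.fft2(q)
    denom = U**2 + V**2
    denom[0, 0] = 1.0                     # avoid 0/0 at the DC term
    Z = (-1j * U * P - 1j * V * Q) / denom
    Z[0, 0] = 0.0                         # mean height is unrecoverable; set to 0
    return np.real(np.fft.ifft2(Z))

# Synthetic periodic surface with known analytic gradients.
size = 32
ys, xs = np.mgrid[0:size, 0:size]
k = 2.0 * np.pi / size
z_true = np.sin(k * xs) + np.cos(k * ys)
p_true = k * np.cos(k * xs)
q_true = -k * np.sin(k * ys)
z_est = frankot_chellappa(p_true, q_true)
```

Projecting onto the Fourier basis yields the integrable surface whose gradients are closest, in the least-squares sense, to the measured ones. This is why noisy, non-integrable normal fields still produce a consistent depth map.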
Point clouds are converted to smooth meshes using Poisson surface reconstruction:
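The meshing step might look like this sketch, which uses Open3D's Poisson reconstruction. The helper names, the pixel-grid back-projection in `depth_to_points`, and the default octree depth of 8 are assumptions for illustration. A calibrated pipeline would back-project through the camera intrinsics instead.

```python
import numpy as np

def depth_to_points(depth, normals):
    """Turn an (h, w) depth map and (h, w, 3) normals into point arrays.

    Uses pixel coordinates as x and y, which is a simplification.
    """
    h, w = depth.shape
    ys, xs = np.mgrid[0:h, 0:w]
    points = np.stack([xs, ys, depth], axis=-1).reshape(-1, 3).astype(np.float64)
    return points, normals.reshape(-1, 3).astype(np.float64)

def poisson_mesh(points, normals, octree_depth=8):
    """Run Poisson surface reconstruction on an oriented point cloud."""
    import open3d as o3d  # imported lazily; only needed for meshing
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    pcd.normals = o3d.utility.Vector3dVector(normals)
    mesh, densities = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
        pcd, depth=octree_depth)
    return mesh, np.asarray(densities)

# Small example: a 2x3 depth map with camera-facing normals.
depth = np.arange(6, dtype=np.float64).reshape(2, 3)
normals = np.zeros((2, 3, 3))
normals[..., 2] = 1.0
points, flat_normals = depth_to_points(depth, normals)
```

Poisson reconstruction needs consistently oriented normals. A higher `octree_depth` preserves finer detail at the cost of memory and of fitting noise. The returned densities can be used to trim poorly supported regions of the mesh.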
The project implements all three components of the stereo vision pipeline: surface normal estimation, depth map computation, and 3D mesh reconstruction. Together they produce accurate normals and depth maps and high-quality meshes on multiple datasets.