Stereo Vision and 3D Reconstruction

Implement photometric stereo and plane sweep stereo algorithms for 3D scene reconstruction.

This project explores classical computer vision techniques for recovering 3D surface properties from 2D images. Photometric stereo uses varying illumination to estimate surface normals and albedo, while plane sweep stereo uses multiple viewpoints to compute depth maps through normalized cross-correlation matching.

Complete 3D reconstruction pipeline: from input images to surface normals, depth maps, and final 3D meshes

Project Details

Completed:	August 2025
Implementation:	Python with NumPy, OpenCV, Open3D
Datasets:	Harvard Photometric Stereo, Middlebury Stereo
Key Concepts:	Multi-view geometry, surface reconstruction, mesh generation

Overview

Stereo vision encompasses multiple approaches to 3D scene reconstruction, each exploiting different image formation principles. This project implements two complementary techniques: photometric stereo for detailed surface analysis and geometric stereo for spatial structure recovery.

The complete pipeline processes input images through photometric and geometric analysis to produce comprehensive 3D reconstructions including surface normals, albedo maps, depth estimates, and final mesh models.

Photometric Stereo

Photometric stereo recovers surface properties by analyzing how appearance changes under different lighting conditions. The method assumes Lambertian reflectance and uses least squares optimization to solve for surface normals and albedo.

Lambertian Reflectance Model

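For Lambertian surfaces, observed intensity equals surface albedo times the dot product of the surface normal and the light direction: I = ρ(n·l), where ρ is the albedo, n the unit surface normal, and l the unit light direction.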
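The Lambertian model can be illustrated with a small forward renderer, which is also handy for generating synthetic test data. This is a minimal sketch rather than the project's code: the function name `render_lambertian` and the array layouts are assumptions, and the clamp to zero models attached shadows only.

```python
import numpy as np

def render_lambertian(normals, albedo, light_dirs):
    """Render N images of a Lambertian surface (hypothetical helper).

    normals:    (H, W, 3) unit surface normals
    albedo:     (H, W) reflectance values
    light_dirs: (N, 3) unit light directions
    returns:    (N, H, W) intensities
    """
    # n . l for every pixel and every light: (H, W, 3) x (N, 3) -> (N, H, W)
    shading = np.einsum("hwc,nc->nhw", normals, light_dirs)
    # A surface facing away from the light receives no direct illumination.
    shading = np.clip(shading, 0.0, None)
    return albedo[None, :, :] * shading
```

A flat patch facing the camera with albedo 0.5 renders as 0.5 under a frontal light and as 0 when lit from behind.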

Algorithm Implementation

The implementation solves a linear system for each pixel using multiple illumination observations:

Data Setup: Load N images under known illumination directions
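Linear System: For each pixel, stack the N observations I_k = ρ(n·l_k) into I = L g, where L holds the light directions as rows and g = ρn
Least Squares: Solve the overdetermined system for the albedo-normal product g
Decomposition: Extract albedo as the magnitude of g and the unit normal as its direction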
Coordinate Frame: Ensure normals follow RGB color encoding (R=+X, G=+Y, B=+Z)
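The steps above can be sketched in a few lines of NumPy. This is an illustrative version under stated assumptions, not the project's actual code: the names `photometric_stereo` and `normals_to_rgb`, the `(N, H, W)` grayscale image stack, and the `eps` threshold are all hypothetical, and shadows and specular highlights are ignored.

```python
import numpy as np

def photometric_stereo(images, light_dirs, eps=1e-8):
    """Estimate albedo and normals from N images (hypothetical helper).

    images:     (N, H, W) grayscale intensities
    light_dirs: (N, 3) unit light directions
    returns:    albedo (H, W), normals (H, W, 3)
    """
    n_imgs, h, w = images.shape
    intensities = images.reshape(n_imgs, -1)              # (N, H*W)
    # Solve L g = I for every pixel at once, where g = albedo * normal.
    g, *_ = np.linalg.lstsq(light_dirs, intensities, rcond=None)  # (3, H*W)
    albedo = np.linalg.norm(g, axis=0)                     # |g| = albedo
    normals = g / np.maximum(albedo, eps)                  # g / |g| = normal
    normals[:, albedo < eps] = 0.0                         # no signal: no normal
    return albedo.reshape(h, w), normals.T.reshape(h, w, 3)

def normals_to_rgb(normals):
    """Map normal components from [-1, 1] to 8-bit RGB (R=+X, G=+Y, B=+Z)."""
    return np.round((normals + 1.0) * 0.5 * 255.0).astype(np.uint8)
```

Solving all pixels in a single `np.linalg.lstsq` call avoids a per-pixel Python loop; with N = 3 lights the system is exactly determined, and more lights make the estimate less sensitive to noise.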

Photometric Stereo Results

Input: Multiple Illuminations

Same viewpoint, 9 different lighting directions

Recovered Surface Normals

RGB encoding: R=+X, G=+Y, B=+Z directions

Surface Albedo

Intrinsic surface reflectance properties

Cat Dataset Results

Real photometric stereo reconstruction

Plane Sweep Stereo

Plane sweep stereo estimates scene depth by systematically testing different depth hypotheses and finding the best stereo correspondences using normalized cross-correlation matching.

Camera Projection and Epipolar Geometry

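3D points are projected to 2D image coordinates using calibrated camera parameters: x ~ K [R | t] X, where X is the 3D point in homogeneous coordinates, K is the intrinsic matrix, [R | t] is the extrinsic rotation and translation, and ~ denotes equality up to scale.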
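As a concrete illustration of the pinhole projection, the sketch below maps world points to pixel coordinates. The function name `project_points` and the world-to-camera convention `X_cam = R X_world + t` are assumptions made for this example; calibration files sometimes store the inverse transform.

```python
import numpy as np

def project_points(points_world, K, R, t):
    """Project 3D points to pixels with a pinhole camera (hypothetical helper).

    points_world: (M, 3) points in world coordinates
    K:            (3, 3) intrinsic matrix
    R, t:         (3, 3) rotation and (3,) translation, world -> camera
    returns:      pixels (M, 2) and camera-frame depths (M,)
    """
    cam = points_world @ R.T + t          # world -> camera coordinates
    hom = cam @ K.T                       # homogeneous pixel coordinates
    pixels = hom[:, :2] / hom[:, 2:3]     # perspective divide
    return pixels, cam[:, 2]
```

With focal length 500 and principal point (320, 240), a point at (1, 0, 2) in front of an identity-pose camera lands at pixel (570, 240).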

Normalized Cross-Correlation Matching

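Correspondence quality is measured using normalized cross-correlation between image patches: NCC(a, b) = Σ (a_i − ā)(b_i − b̄) / sqrt(Σ (a_i − ā)² · Σ (b_i − b̄)²), where ā and b̄ are the patch means. The score ranges from −1 to 1 and is unaffected by gain and offset changes in patch brightness.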
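A minimal sketch of the patch score follows, assuming the common zero-mean form of NCC; the function name `ncc` and the `eps` guard for textureless patches are illustrative choices, not taken from the project.

```python
import numpy as np

def ncc(patch_a, patch_b, eps=1e-8):
    """Zero-mean normalized cross-correlation of two equal-size patches."""
    a = patch_a.astype(np.float64).ravel()
    b = patch_b.astype(np.float64).ravel()
    a = a - a.mean()                      # remove brightness offset
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom < eps:                       # flat patch: correlation undefined
        return 0.0
    return float(a @ b / denom)
```

Because the mean is subtracted and the result is divided by both norms, a patch compared against a brightened, contrast-scaled copy of itself still scores 1.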

Plane Sweep Algorithm

The algorithm systematically tests depth hypotheses by projecting one image to the other's viewpoint:

Calibration: Load camera intrinsics and extrinsics for stereo pair
Depth Sampling: Define range of depth hypotheses to test
Image Projection: For each depth, project left image to right viewpoint
NCC Computation: Calculate normalized cross-correlation between projected and actual right image
Cost Volume: Build 3D array of NCC scores (H×W×D), one slice per depth hypothesis
Depth Selection: Choose depth with highest correlation at each pixel
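Two of the steps above lend themselves to a short sketch: the warp for a single depth hypothesis and the final depth selection. The code assumes fronto-parallel planes Z = depth in the left camera frame and a relative pose `X_right = R X_left + t`; the names `plane_homography` and `select_depth` are hypothetical, and the project's own warping may be organized differently.

```python
import numpy as np

def plane_homography(K_left, K_right, R, t, depth):
    """Homography taking left pixels to right pixels for the plane Z = depth.

    Points on the plane satisfy n . X = depth with n = (0, 0, 1), so
    X_right = R X + t = (R + t n^T / depth) X for every point on it.
    """
    n = np.array([0.0, 0.0, 1.0])
    return K_right @ (R + np.outer(t, n) / depth) @ np.linalg.inv(K_left)

def select_depth(score_volume, depths):
    """Pick the best-scoring depth per pixel.

    score_volume: (H, W, D) NCC scores, higher is better
    depths:       (D,) depth value of each hypothesis
    returns:      (H, W) depth map
    """
    best = np.argmax(score_volume, axis=2)
    return depths[best]
```

In practice each homography would be applied to the whole image with an image-warping routine (OpenCV's `cv2.warpPerspective` is one option), and the NCC score would be computed per pixel between the warped and reference images before `select_depth` is called.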

Plane Sweep Stereo Results

Stereo Image Pair

Calibrated cameras with known baseline

Plane Sweep Process

Animation showing depth hypothesis testing

NCC Cost Volume

Correlation scores across depth layers

Computed Depth Map

White=near, Black=far depth estimates

3D Mesh Reconstruction

The final stage converts 2D analysis results into complete 3D mesh models through surface integration and point cloud reconstruction techniques.

Surface Integration

Surface normals are integrated to produce depth maps using the Frankot-Chellappa algorithm, which enforces integrability constraints in the frequency domain.
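The frequency-domain integration can be sketched with NumPy's FFT. This is an illustrative version that assumes periodic boundaries, unit pixel spacing, and normals pointing toward +Z so that the gradients are p = −n_x/n_z and q = −n_y/n_z; the function names and the `eps` guard are hypothetical, and an image with a downward-pointing y axis may need a sign flip on q.

```python
import numpy as np

def normals_to_gradients(normals, eps=1e-6):
    """Convert (H, W, 3) normals to surface gradients p = dz/dx, q = dz/dy."""
    nz = normals[..., 2]
    nz = np.where(np.abs(nz) < eps, eps, nz)   # avoid division by zero
    return -normals[..., 0] / nz, -normals[..., 1] / nz

def frankot_chellappa(p, q):
    """Integrate a gradient field into a height map, up to a constant offset."""
    h, w = p.shape
    # Angular frequencies (radians per pixel) along x (columns) and y (rows).
    u, v = np.meshgrid(2 * np.pi * np.fft.fftfreq(w),
                       2 * np.pi * np.fft.fftfreq(h))
    P, Q = np.fft.fft2(p), np.fft.fft2(q)
    denom = u ** 2 + v ** 2
    denom[0, 0] = 1.0                          # avoid 0/0 at the DC term
    Z = (-1j * u * P - 1j * v * Q) / denom     # least-squares integrable fit
    Z[0, 0] = 0.0                              # mean height is unrecoverable
    return np.real(np.fft.ifft2(Z))
```

The result has zero mean because the absolute height is not observable from gradients alone; any desired offset can be added afterwards.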

Mesh Generation Pipeline

Point clouds are converted to smooth meshes using Poisson surface reconstruction:

Point Cloud Generation: Unproject depth values to 3D coordinates using camera intrinsics
Normal Estimation: Compute point normals from local surface orientation
Poisson Reconstruction: Generate smooth mesh surface from oriented point cloud
Texture Mapping: Project original images for photorealistic rendering
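The first step of the pipeline above, unprojecting a depth map into a point cloud, can be sketched as follows. The function name `depth_to_points` is hypothetical, and the code assumes a pinhole camera without lens distortion and treats non-positive or non-finite depths as invalid.

```python
import numpy as np

def depth_to_points(depth, K):
    """Unproject an (H, W) depth map to (M, 3) camera-frame points."""
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]                  # pixel row (v) and column (u)
    fx, fy = K[0, 0], K[1, 1]
    cx, cy = K[0, 2], K[1, 2]
    x = (u - cx) * depth / fx                  # invert u = fx * X / Z + cx
    y = (v - cy) * depth / fy                  # invert v = fy * Y / Z + cy
    points = np.stack([x, y, depth], axis=-1).reshape(-1, 3)
    valid = np.isfinite(points[:, 2]) & (points[:, 2] > 0)
    return points[valid]
```

The resulting array can then be handed to a point cloud library such as Open3D for normal estimation and Poisson reconstruction; those calls are not shown here.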

3D Reconstruction Results

Mesh from Photometric Normals

High surface detail from lighting analysis

Mesh from Stereo Depth

Accurate global geometry from multiple viewpoints

Final 3D reconstruction combining photometric detail with geometric accuracy

Implementation Results

Technical Achievements

All three components of the pipeline (photometric stereo, plane sweep stereo, and mesh reconstruction) were implemented, producing surface normal estimates, depth maps, and 3D meshes on multiple datasets.

Performance Metrics

Photometric Stereo: Sub-second processing for surface normal and albedo recovery on 128×128 images
Plane Sweep Stereo: Cost volume computation with vectorized NCC matching completes in under 10 seconds
Mesh Reconstruction: Robust surface integration and Poisson reconstruction for clean 3D models
Multi-Dataset Validation: Tested on Harvard photometric stereo and Middlebury stereo benchmark datasets

Implementation Details

Vectorization: NumPy operations for efficient computation without nested loops
Numerical Stability: Proper handling of singular cases and division by zero
Memory Management: Efficient processing of large cost volumes and image stacks
Coordinate Systems: Consistent normal and depth representations for mesh compatibility
