|
""" |
|
Visual Odometry for Drone Navigation |
|
===================================== |
|
A lightweight 2D visual odometry system for aerial drone navigation. |
|
Built for the 2024 AI Transportation Competition. |
|
|
|
This module estimates drone position using only a downward-facing camera, |
|
solving two key challenges: |
|
1. Monocular Scale Ambiguity - using GPS calibration |
|
2. Coordinate System Alignment - using brute-force auto-alignment |
|
|
|
Author: myavuzokumus |
|
License: MIT |
|
""" |
|
|
|
import cv2 |
|
import numpy as np |
|
from undistort_image import undistort_image |
|
from const import CameraParams |
|
|
|
|
|
class DroneTrajectoryEstimator:
    """
    Estimates drone trajectory using monocular visual odometry.

    The algorithm works in two phases:
    1. Calibration Phase (GPS available): Collects data to calculate scale factor
       and determine correct coordinate transformation
    2. Navigation Phase (GPS unavailable): Uses calibrated parameters to estimate
       position from visual features alone

    Attributes:
        MATCH_DISTANCE_THRESHOLD: Fraction of the mean match distance used to
            keep a match (matches significantly worse than average are dropped)
        RANSAC_REPROJ_THRESHOLD: RANSAC reprojection error threshold in pixels
    """

    # Match filter threshold: keep matches whose distance is below this
    # fraction of the mean match distance. (Note: this is a mean-distance
    # filter, not Lowe's best-vs-second-best ratio test.)
    MATCH_DISTANCE_THRESHOLD = 0.75

    # RANSAC reprojection threshold for outlier rejection
    RANSAC_REPROJ_THRESHOLD = 3.0

    def __init__(self):
        # Previous frame's keypoints and descriptors for matching.
        # prev_des stays None until a frame with detectable features arrives.
        self.prev_keypoints = None
        self.prev_des = None

        # ORB (Oriented FAST and Rotated BRIEF) feature detector
        # - nfeatures: Maximum number of features to retain
        # - scaleFactor: Pyramid decimation ratio (>1)
        # - nlevels: Number of pyramid levels
        # - WTA_K: Number of points for oriented BRIEF descriptor
        self.orb = cv2.ORB_create(
            nfeatures=1000,
            scaleFactor=1.2,
            nlevels=8,
            edgeThreshold=31,
            firstLevel=0,
            WTA_K=2,
            scoreType=cv2.ORB_HARRIS_SCORE,
            patchSize=31,
            fastThreshold=20
        )

        # Per-axis scale factor: converts pixel displacement to meters.
        # Calculated once, at the end of the calibration phase.
        self.scale = None

        # Current estimated position [x, y] in meters
        self.current_position = np.zeros(2)

        # GPS positions recorded during calibration (ground truth)
        self.real_recorded_positions = np.empty((0, 2))

        # Accumulated visual odometry positions in pixel space (before scaling)
        self.vo_recorded_positions = np.zeros((1, 2))

        # Best coordinate transformation found during calibration.
        # Format: (flip_x, flip_y, switch_axes)
        self.best_transformation = (1, 1, False)

    def process_frame(self, frame, health_status, gt_translation=None):
        """
        Process a single frame and update position estimate.

        Args:
            frame: BGR image from drone camera
            health_status: 1 = GPS available, 0 = GPS unavailable
            gt_translation: Ground truth position [x, y] from GPS (if available)

        Returns:
            np.ndarray: Current estimated position [x, y] in meters
        """
        # Convert to grayscale for feature detection
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

        # No reference descriptors yet (first frame, or every frame so far was
        # featureless): initialize and wait for the next frame. Checking only
        # prev_des ensures we retry if detectAndCompute returned des=None.
        if self.prev_des is None:
            img_undistorted = undistort_image(frame)
            self.prev_keypoints, self.prev_des = self.orb.detectAndCompute(
                img_undistorted, None
            )
            return self.current_position

        # Remove lens distortion for accurate feature positions
        img_undistorted = undistort_image(frame)

        # Detect features in current frame
        keypoints, des = self.orb.detectAndCompute(img_undistorted, None)

        # Estimate 2D similarity transform (rotation + translation + scale)
        # between the previous and current frame, using RANSAC for robustness.
        # cv2.estimateAffinePartial2D needs at least 2 correspondences; on a
        # featureless frame or with too few matches we skip the motion update
        # (M stays None) instead of crashing.
        M = None
        if des is not None:
            matches = self.match_features(self.prev_des, des)
            good_matches = self.filter_good_matches(matches)
            src_pts, dst_pts = self.get_matched_points(good_matches, keypoints)
            if len(good_matches) >= 2:
                M, _ = cv2.estimateAffinePartial2D(
                    src_pts, dst_pts,
                    method=cv2.RANSAC,
                    ransacReprojThreshold=self.RANSAC_REPROJ_THRESHOLD
                )

        # End of calibration phase: GPS just dropped and scale is not yet set.
        if self.scale is None and health_status == 0:
            self.calculate_scale()
            # Copy: without it, the in-place += in update_position() would
            # corrupt the recorded GPS calibration history.
            self.current_position = self.real_recorded_positions[-1].copy()

        # Update position using the estimated transformation (if any)
        if M is not None:
            self.update_position(M)

        # Store current frame data for next iteration. Keep the previous
        # reference frame when the current frame yielded no descriptors so
        # tracking can resume on the next good frame.
        if des is not None:
            self.prev_keypoints = keypoints
            self.prev_des = des

        # During calibration: record GPS ground truth
        if health_status == 1 and gt_translation is not None:
            self.real_recorded_positions = np.concatenate(
                (self.real_recorded_positions, gt_translation.copy().reshape(1, -1)),
                axis=0
            )
            # Float copy: the in-place += in update_position() must not mutate
            # the caller's array, and an int-dtype input would reject float +=.
            self.current_position = np.array(gt_translation, dtype=float)

        return self.current_position

    def filter_good_matches(self, matches):
        """
        Filter matches using an adaptive mean-distance threshold.

        Uses the mean distance as baseline - matches significantly worse than
        average are likely incorrect.

        Args:
            matches: List of cv2.DMatch objects

        Returns:
            List of filtered matches (empty list if no matches were given)
        """
        # Guard: np.mean([]) would emit a RuntimeWarning and yield NaN.
        if not matches:
            return []
        mean_distance = np.mean([match.distance for match in matches])
        return [m for m in matches if m.distance < self.MATCH_DISTANCE_THRESHOLD * mean_distance]

    def get_matched_points(self, good_matches, keypoints):
        """
        Extract point coordinates from matches.

        Args:
            good_matches: Filtered list of cv2.DMatch objects
            keypoints: Keypoints from current frame

        Returns:
            Tuple of (source_points, destination_points) as Nx1x2 float32 arrays
        """
        # queryIdx indexes the previous frame's keypoints (first matcher arg),
        # trainIdx indexes the current frame's keypoints (second matcher arg).
        src_pts = np.float32(
            [self.prev_keypoints[m.queryIdx].pt for m in good_matches]
        ).reshape(-1, 1, 2)

        dst_pts = np.float32(
            [keypoints[m.trainIdx].pt for m in good_matches]
        ).reshape(-1, 1, 2)

        return src_pts, dst_pts

    def calculate_scale(self):
        """
        Calculate scale factor using GPS ground truth.

        Compares total displacement in pixel space (VO) against real-world
        displacement (GPS) to derive a meters-per-pixel ratio per axis.

        This solves the monocular scale ambiguity problem by using the
        calibration phase GPS data as reference.
        """
        # Real-world displacement from GPS (meters)
        delta_x_real = (self.real_recorded_positions[-1][0] -
                        self.real_recorded_positions[0][0])
        delta_y_real = (self.real_recorded_positions[-1][1] -
                        self.real_recorded_positions[0][1])

        # Visual displacement in pixels (with axis correction applied)
        corrected_data = self.correct_sign_and_axis(self.vo_recorded_positions[-1])
        delta_x_vo = corrected_data[0]
        delta_y_vo = corrected_data[1]

        # Scale = meters / pixels (separate for X and Y axes).
        # NOTE(review): if the drone did not move along an axis during
        # calibration, the corresponding denominator is 0 and numpy yields
        # inf for that axis — calibration flights should cover both axes.
        self.scale = np.array([
            abs(delta_x_real / delta_x_vo),
            abs(delta_y_real / delta_y_vo)
        ])

    def update_position(self, M):
        """
        Update position estimate using affine transformation matrix.

        The affine matrix M has structure:
            | a b tx |
            | c d ty |

        Where tx = M[0,2] and ty = M[1,2] are the translation components.

        Args:
            M: 2x3 affine transformation matrix from cv2.estimateAffinePartial2D
        """
        if self.scale is None:
            # Calibration phase: accumulate raw pixel displacements
            vo_current = self.vo_recorded_positions[-1].copy()
            vo_current[0] += M[0, 2]  # tx: horizontal displacement
            vo_current[1] += M[1, 2]  # ty: vertical displacement

            self.vo_recorded_positions = np.concatenate(
                (self.vo_recorded_positions, vo_current.reshape(1, -1)),
                axis=0
            )
        else:
            # Navigation phase: convert pixels to meters using scale
            corrected = self.apply_transformations(np.array([M[0, 2], M[1, 2]]))
            self.current_position[0] += corrected[0] * self.scale[0]
            self.current_position[1] += corrected[1] * self.scale[1]

    def match_features(self, desc1, desc2, method='bf'):
        """
        Match feature descriptors between two frames.

        Args:
            desc1: Descriptors from previous frame
            desc2: Descriptors from current frame
            method: 'bf' for BruteForce, 'flann' for FLANN-based matching

        Returns:
            List of cv2.DMatch objects, sorted by distance (best first)

        Raises:
            ValueError: If an unsupported matching method is requested
        """
        if method == 'bf':
            # BruteForce matcher with Hamming distance (for binary descriptors)
            matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        elif method == 'flann':
            # FLANN-based matcher with LSH index (suited to binary descriptors)
            FLANN_INDEX_LSH = 6
            index_params = dict(
                algorithm=FLANN_INDEX_LSH,
                table_number=6,
                key_size=12,
                multi_probe_level=1
            )
            search_params = dict(checks=50)
            matcher = cv2.FlannBasedMatcher(index_params, search_params)
        else:
            raise ValueError(f"Unsupported matching method: {method}")

        matches = matcher.match(desc1, desc2)
        return sorted(matches, key=lambda x: x.distance)

    def undistort_points(self, points):
        """
        Remove lens distortion from keypoint coordinates.

        Args:
            points: List of cv2.KeyPoint objects

        Returns:
            np.ndarray: Nx2 array of undistorted pixel coordinates
        """
        points = np.array([kp.pt for kp in points], dtype=np.float32)
        # Passing the intrinsic matrix as the new camera matrix keeps the
        # result in pixel coordinates instead of normalized coordinates.
        points_undistorted = cv2.undistortPoints(
            np.expand_dims(points, axis=1),
            CameraParams['IntrinsicMatrix'],
            CameraParams['DistCoeffs'],
            None,
            CameraParams['IntrinsicMatrix']
        )
        return points_undistorted[:, 0, :]

    def correct_sign_and_axis(self, sim_point):
        """
        Find optimal coordinate transformation by brute-force search.

        The camera coordinate system may not align with GPS coordinates:
        - X might be inverted
        - Y might be inverted
        - X and Y might be swapped

        This method tests all 8 possible combinations (2x2x2) and selects the
        one whose transformed point lies closest to any recorded GPS position.
        The winning combination is also stored in self.best_transformation for
        later use by apply_transformations().

        Args:
            sim_point: Point in visual odometry coordinate space

        Returns:
            np.ndarray: Transformed point that best matches GPS coordinates
        """
        min_distance = float('inf')
        best_correction = sim_point

        # Test all 8 geometric transformations
        for flip_x in [-1, 1]:  # X-axis sign
            for flip_y in [-1, 1]:  # Y-axis sign
                for switch_axes in [False, True]:  # Swap X and Y

                    # Apply candidate transformation
                    test_point = np.array([
                        sim_point[0] * flip_x,
                        sim_point[1] * flip_y
                    ])
                    if switch_axes:
                        test_point = test_point[::-1]

                    # Error = distance to the nearest recorded GPS position
                    distances = np.linalg.norm(
                        self.real_recorded_positions - test_point, axis=1
                    )
                    distance = np.min(distances)

                    # Keep transformation with minimum error
                    if distance < min_distance:
                        min_distance = distance
                        best_correction = test_point
                        self.best_transformation = (flip_x, flip_y, switch_axes)

        return np.array(best_correction)

    def apply_transformations(self, pixel_displacement):
        """
        Apply the locked-in coordinate transformation.

        After calibration, the best transformation is fixed and applied
        to all subsequent measurements.

        Args:
            pixel_displacement: Raw [dx, dy] from affine matrix

        Returns:
            np.ndarray: Corrected displacement in GPS coordinate system
        """
        flip_x, flip_y, switch_axes = self.best_transformation

        corrected = np.array([
            pixel_displacement[0] * flip_x,
            pixel_displacement[1] * flip_y
        ])

        if switch_axes:
            corrected = corrected[::-1]

        return corrected

    def show_result(self, frame1, frame2, keypoints1, keypoints2, matches):
        """
        Visualize feature matches between two frames (for debugging).

        Blocks until a key is pressed, then closes the window.

        Args:
            frame1, frame2: Input images
            keypoints1, keypoints2: Detected keypoints
            matches: List of matches to visualize
        """
        # Resize for display
        scale_percent = 50
        width = int(frame1.shape[1] * scale_percent / 100)
        height = int(frame2.shape[0] * scale_percent / 100)
        dim = (width, height)

        resized_img1 = cv2.resize(frame1, dim, interpolation=cv2.INTER_AREA)
        resized_img2 = cv2.resize(frame2, dim, interpolation=cv2.INTER_AREA)

        # Draw top 10 matches
        img_matches = cv2.drawMatches(
            resized_img1, keypoints1,
            resized_img2, keypoints2,
            matches[:10], None,
            flags=cv2.DrawMatchesFlags_NOT_DRAW_SINGLE_POINTS
        )

        cv2.imshow('Matches', img_matches)
        cv2.waitKey(0)
        cv2.destroyAllWindows()