Computer Vision Engineer (Detection, Tracking & 2D Metric Calibration Specialist)

🌍 Remote, USA 🎯 Full-time 🕐 Posted Recently

Job Description

Project Context

CrackCoach is an AI platform for automatic analysis of show-jumping videos.

This role builds the image-level perception and geometry stack that every downstream stage depends on: detection, tracking, obstacle understanding, jump segmentation, and metric calibration in real-world competition footage.

Without a rock-solid perception and geometric foundation, pose estimation, biomechanics, and AI coaching cannot be reliable.

Core Mission and Responsibilities

You will design, implement, and validate a production-grade computer vision pipeline capable of ingesting raw competition videos and producing robust, structured, and metric-aware outputs.

Your responsibilities include:

  • Video ingestion and preprocessing: handle codecs, resolutions, FPS, orientation, stabilization, and cropping policies.
  • Horse-and-rider detection using state-of-the-art detectors (YOLO / RT-DETR / Detectron2 or equivalent).
  • Persistent tracking across frames (ByteTrack, BoT-SORT, DeepSORT, Kalman-based trackers).
  • Obstacle detection and scene understanding for show-jumping arenas (rails, poles, standards).
  • Obstacle-to-jump association logic: correctly identify which obstacle is being jumped and when.
  • Automatic segmentation of a full round video into individual jump clips (per-obstacle segments).
  • 2D trajectory reconstruction of the horse in image space with stable, low-jitter trajectories.
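
As an illustration of the "stable, low-jitter trajectories" requirement, here is a minimal sketch of trajectory smoothing over per-frame track centers. It assumes exponential smoothing and a hold-last-position policy for missed detections; the function name `smooth_trajectory` and the `alpha` parameter are hypothetical, and a production pipeline might use a Kalman filter or a spline fit instead.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def smooth_trajectory(
    points: List[Optional[Point]], alpha: float = 0.4
) -> List[Optional[Point]]:
    """Exponentially smooth per-frame track centers given in pixels.

    A None entry marks a frame where the detector missed the horse.
    alpha close to 1 follows the raw detections; close to 0 smooths harder.
    """
    smoothed: List[Optional[Point]] = []
    state: Optional[Point] = None
    for p in points:
        if p is None:
            # Missed detection: hold the last smoothed position
            # (stays None if nothing has been detected yet).
            smoothed.append(state)
            continue
        if state is None:
            # First valid detection initializes the filter.
            state = p
        else:
            state = (
                alpha * p[0] + (1 - alpha) * state[0],
                alpha * p[1] + (1 - alpha) * state[1],
            )
        smoothed.append(state)
    return smoothed
```

Holding the last position keeps the output the same length as the input, which makes it easy to align with frame indices; interpolating across gaps after the fact is an equally reasonable choice.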

2D Metric Calibration (Image → Ground Plane)

In addition to perception, this role includes implementing a robust 2D metric calibration module:

  • Estimate a ground-plane homography (image → ground) using stable scene references such as obstacle bases or other ground contact points.
  • Compute a pixel-to-meter scale, ideally leveraging known or user-declared obstacle heights (e.g. “course at 1.35 m”) when available.
  • Project horse trajectories from image space to ground-plane coordinates in meters.
  • Enable metric estimates such as:
      • approach speed (m/s)
      • distances between obstacles (m)
      • take-off and landing distances at ground level (m)
      • approximate stride length at ground level (when combined later with biomechanics)
  • Provide a calibration confidence indicator and gracefully fall back to relative (pixel-based) measures when calibration is unreliable.

The calibration module must be robust, non-blocking, and designed for real-world competition footage (single camera, uncontrolled viewpoints).
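
To make the projection and fallback behavior concrete, here is a minimal sketch of mapping an image-space track to the ground plane and deriving an approach speed. It assumes a 3x3 homography `H` (image → ground, in meters) and a confidence score have already been estimated elsewhere; the function names and the `min_confidence` threshold are hypothetical, not part of any existing codebase.

```python
import math
from typing import List, Optional, Sequence, Tuple

Matrix3 = Sequence[Sequence[float]]
Point = Tuple[float, float]


def project_to_ground(H: Matrix3, u: float, v: float) -> Point:
    """Map image pixel (u, v) to ground-plane (x, y) using homography H."""
    x = H[0][0] * u + H[0][1] * v + H[0][2]
    y = H[1][0] * u + H[1][1] * v + H[1][2]
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("pixel maps to infinity (it lies on the horizon line)")
    return x / w, y / w


def approach_speed(
    track_px: List[Point],
    fps: float,
    H: Optional[Matrix3],
    confidence: float,
    min_confidence: float = 0.6,  # hypothetical threshold
) -> Tuple[float, str]:
    """Return (speed, unit): m/s if calibration is trusted, else px/s."""
    if len(track_px) < 2:
        raise ValueError("need at least two track points")
    use_metric = H is not None and confidence >= min_confidence
    if use_metric:
        pts = [project_to_ground(H, u, v) for u, v in track_px]
    else:
        # Graceful fallback: stay in image space, report relative speed.
        pts = list(track_px)
    path = sum(math.dist(a, b) for a, b in zip(pts, pts[1:]))
    duration = (len(pts) - 1) / fps  # one frame interval per step
    unit = "m/s" if use_metric else "px/s"
    return path / duration, unit
```

Returning the unit alongside the value keeps the call non-blocking: downstream stages always receive a number and can decide how to present it based on whether it is metric or relative.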

Required Technical Skills

  • Strong background in computer vision applied to video (sports footage experience is a strong plus).
  • Proven experience with object detection (YOLO family, Detectron2, RT-DETR, etc.).
  • Multi-object tracking expertise (ByteTrack / BoT-SORT / DeepSORT; handling occlusions and ID switches).
  • Experience with segmentation models (Mask R-CNN, YOLO-Seg, SAM-family) if needed for background removal.
  • Solid understanding of image-space geometry and camera perspective limitations.
  • Experience implementing 2D metric calibration using planar homography and RANSAC.
  • Comfortable working with pixel-to-meter conversions and expressing metric uncertainty.
  • Advanced Python and OpenCV, plus a deep learning framework (PyTorch preferred).
  • Experience building modular, maintainable pipelines with clear interfaces and exports.

Key Technical Challenges

  • Highly variable camera angles, zoom levels, and lighting conditions.
  • Dynamic occlusions from obstacles, rails, other horses, and spectators.
  • Motion blur and compression artifacts in user-generated videos.
  • Background clutter and false positives (banners, rails, similar shapes).
  • Maintaining stable trajectories despite noisy detections and temporary misses.
  • Correct obstacle differentiation and obstacle association in multi-obstacle scenes.
  • Metric calibration with a single camera, limited scene control, and partial reference data.
  • Performance constraints: processing HD videos in minutes, not hours.

Expected Deliverables

  • A fully modular computer vision pipeline (source code) that ingests raw video and outputs:
      • detections
      • tracks
      • obstacle detections
      • jump segments
      • 2D trajectories
      • ground-plane metric projections (when calibration is reliable)
  • A 2D calibration module producing pixel-to-meter scale, ground-plane mapping, and confidence scores.
  • Trained detection/segmentation models (weights + training scripts) when custom training is required.
  • Clean data exports (JSON / CSV) and stable ROI frame exports for pose estimation and biomechanics.
  • Visual validation outputs (overlays showing boxes, tracks, obstacles, jump boundaries, and metric projections).
  • Clear technical documentation defining interfaces and data formats for downstream pose estimation, biomechanics, and AI coaching stages.
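
For a sense of what a clean JSON export could look like, here is a minimal sketch of a per-jump record. Every field name below is hypothetical and shown only for illustration; the actual interface is to be defined as part of the documentation deliverable.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class JumpSegment:
    """One per-obstacle segment of a round (hypothetical schema)."""
    obstacle_id: int
    start_frame: int
    end_frame: int
    trajectory_px: List[List[float]]            # [u, v] per frame, image space
    trajectory_m: Optional[List[List[float]]]   # [x, y] in meters, or None
    calibration_confidence: float               # 0.0 (unusable) to 1.0


def export_segments(segments: List[JumpSegment], fps: float) -> str:
    """Serialize jump segments to a JSON string for downstream stages."""
    payload = {"fps": fps, "jumps": [asdict(s) for s in segments]}
    return json.dumps(payload, indent=2)
```

Leaving `trajectory_m` as null when calibration is unreliable lets downstream consumers detect the pixel-only fallback without a separate flag.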

Important Notes

  • This role does NOT include pose estimation or biomechanics (handled by separate specialists).
  • Metric calibration is 2D ground-plane based, not full 3D reconstruction.
  • Robustness and graceful degradation are more important than theoretical precision.
