Job Description
Project Context
CrackCoach is an AI platform for automatic analysis of show-jumping videos.
This role builds the image-level perception and geometry stack that everything else depends on: detection, tracking, obstacle understanding, jump segmentation, and metric calibration in real-world competition footage.
Without a rock-solid perception and geometric foundation, pose estimation, biomechanics, and AI coaching cannot be reliable.
⸻
Core Mission and Responsibilities
You will design, implement, and validate a production-grade computer vision pipeline capable of ingesting raw competition videos and producing robust, structured, and metric-aware outputs.
Your responsibilities include:
- Video ingestion and preprocessing: handle codecs, resolutions, FPS, orientation, stabilization, and cropping policies.
- Horse-and-rider detection using state-of-the-art detectors (YOLO / RT-DETR / Detectron2 or equivalent).
- Persistent tracking across frames (ByteTrack, BoT-SORT, DeepSORT, Kalman-based trackers).
- Obstacle detection and scene understanding for show-jumping arenas (rails, poles, standards).
- Obstacle-to-jump association logic: correctly identify which obstacle is being jumped and when.
- Automatic segmentation of a full round video into individual jump clips (per-obstacle segments).
- 2D trajectory reconstruction of the horse in image space with stable, low-jitter trajectories.
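To illustrate what "stable, low-jitter trajectories" can mean in practice, here is a minimal sketch of a fixed-gain alpha-beta filter applied to per-frame box centers. It is a simpler stand-in for the Kalman-based trackers named above, not a prescribed method. The function name `smooth_track` and the gains `alpha` and `beta` are illustrative assumptions, and `None` is assumed to mark a frame where the detector missed.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def smooth_track(
    points: List[Optional[Point]],
    alpha: float = 0.5,  # assumed position gain, to be tuned per footage
    beta: float = 0.1,   # assumed velocity gain, to be tuned per footage
) -> List[Optional[Point]]:
    """Smooth box centers with a constant-velocity alpha-beta filter.

    Frames before the first detection stay None; later missed frames are
    filled by coasting on the predicted position.
    """
    smoothed: List[Optional[Point]] = []
    pos: Optional[Point] = None
    vel: Point = (0.0, 0.0)
    for obs in points:
        if pos is None:
            # Filter is not initialized until the first real detection.
            if obs is not None:
                pos = obs
            smoothed.append(pos)
            continue
        # Predict one frame ahead under a constant-velocity assumption.
        pred = (pos[0] + vel[0], pos[1] + vel[1])
        if obs is None:
            # Temporary miss: keep the prediction, leave velocity unchanged.
            pos = pred
        else:
            # Blend prediction and observation using the residual.
            rx, ry = obs[0] - pred[0], obs[1] - pred[1]
            pos = (pred[0] + alpha * rx, pred[1] + alpha * ry)
            vel = (vel[0] + beta * rx, vel[1] + beta * ry)
        smoothed.append(pos)
    return smoothed
```

Lower gains give smoother but laggier trajectories; a production tracker would more likely estimate the gains from noise models, as a Kalman filter does.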
2D Metric Calibration (Image → Ground Plane)
In addition to perception, this role includes implementing a robust 2D metric calibration module:
- Estimate a ground-plane homography (image → ground) using stable scene references such as obstacle bases or other ground contact points.
- Compute a pixel-to-meter scale, ideally leveraging known or user-declared obstacle heights (e.g. “course at 1.35 m”) when available.
- Project horse trajectories from image space to ground-plane coordinates in meters.
- Enable metric estimates such as:
  - approach speed (m/s)
  - distances between obstacles (m)
  - take-off and landing distances at ground level (m)
  - approximate stride length at ground level (when combined later with biomechanics)
- Provide a calibration confidence indicator and gracefully fall back to relative (pixel-based) measures when calibration is unreliable.
The calibration module must be robust, non-blocking, and designed for real-world competition footage (single camera, uncontrolled viewpoints).
⸻
Required Technical Skills
- Strong background in computer vision applied to video (sports footage experience is a strong plus).
- Proven experience with object detection (YOLO family, Detectron2, RT-DETR, etc.).
- Multi-object tracking expertise (ByteTrack / BoT-SORT / DeepSORT; handling occlusions and ID switches).
- Experience with segmentation models (Mask R-CNN, YOLO-Seg, SAM-family) for cases where background removal is needed.
- Solid understanding of image-space geometry and camera perspective limitations.
- Experience implementing 2D metric calibration using planar homography and RANSAC.
- Comfortable working with pixel-to-meter conversions and expressing metric uncertainty.
- Advanced Python and OpenCV skills; proficiency in a deep learning framework (PyTorch preferred).
- Experience building modular, maintainable pipelines with clear interfaces and exports.
⸻
Key Technical Challenges
- Highly variable camera angles, zoom levels, and lighting conditions.
- Dynamic occlusions from obstacles, rails, other horses, and spectators.
- Motion blur and compression artifacts in user-generated videos.
- Background clutter and false positives (banners, rails, similar shapes).
- Maintaining stable trajectories despite noisy detections and temporary misses.
- Correctly distinguishing obstacles and associating each jump with the right one in multi-obstacle scenes.
- Metric calibration with a single camera, limited scene control, and partial reference data.
- Performance constraints: processing HD videos in minutes, not hours.
⸻
Expected Deliverables
- A fully modular computer vision pipeline (source code) that ingests raw video and outputs:
  - detections
  - tracks
  - obstacle detections
  - jump segments
  - 2D trajectories
  - ground-plane metric projections (when calibration is reliable)
- A 2D calibration module producing pixel-to-meter scale, ground-plane mapping, and confidence scores.
- Trained detection/segmentation models (weights + training scripts) when custom training is required.
- Clean data exports (JSON / CSV) and stable ROI frame exports for pose estimation and biomechanics.
- Visual validation outputs (overlays showing boxes, tracks, obstacles, jump boundaries, and metric projections).
- Clear technical documentation defining interfaces and data formats for downstream pose estimation, biomechanics, and AI coaching stages.
⸻
Important Notes
- This role does NOT include pose estimation or biomechanics (handled by separate specialists).
- Metric calibration is 2D ground-plane based, not full 3D reconstruction.
- Robustness and graceful degradation are more important than theoretical precision.