Job Description:
- Red-team AI models and agents by testing jailbreak attempts, prompt injections, misuse scenarios, and exploit strategies
- Generate high-quality human evaluation data by annotating model failures, classifying vulnerabilities, and identifying systemic risks
- Apply structured testing methodologies using taxonomies, benchmarks, and playbooks to ensure consistent evaluation
- Document findings clearly and reproducibly, producing reports, datasets, and adversarial test cases that teams can act upon
- Work across multiple projects, supporting different AI systems and evaluation objectives
- Requirements:
- You have **prior red-teaming experience**, such as adversarial AI testing, cybersecurity, or socio-technical risk analysis
- You naturally think **adversarially**, exploring ways to push systems to their limits and uncover weaknesses
- You prefer **structured methodologies**, using frameworks and benchmarks rather than ad-hoc testing
- You communicate risks and vulnerabilities **clearly to both technical and non-technical audiences**
- You are comfortable **working across multiple projects and adapting to new evaluation challenges**

Nice-to-Have Specialties:
- **Adversarial Machine Learning:** jailbreak datasets, prompt injection attacks, RLHF/DPO vulnerabilities, or model extraction techniques
- **Cybersecurity:** penetration testing, exploit development, reverse engineering
- **Socio-technical risk analysis:** harassment or misinformation testing, abuse pattern analysis
- **Creative adversarial thinking:** backgrounds in psychology, acting, writing, or other disciplines that support unconventional attack strategies
Apply to this job