MDCalc

QA Engineer, AI Products

Posted 21 Days Ago
Remote
Hiring Remotely in USA
Senior level
As a QA Engineer, you will ensure the quality of AI features, design test strategies, maintain automated pipelines, and collaborate on quality metrics.
The summary above was generated by AI
The Opportunity

Since 2005, MDCalc has been an essential part of the clinician’s workflow to help achieve better patient outcomes. Actively used by more than 65% of physicians worldwide, MDCalc is the most broadly used medical reference at the point of care for clinical decision tools and content, and one of only four references used by >50% of US HCPs. These evidence-based tools and content are used by millions of medical professionals globally, supporting 50+ specialties and covering 200+ patient conditions.

To accelerate and steward this growth, we are expanding the AI product team with a QA Engineer. This role will be critical to MDCalc’s continued success in supporting our millions of clinical users worldwide as they care for hundreds of millions of patients.

The Role

As a QA Engineer in the AI Products group at MDCalc, you will play a key role in ensuring the quality, reliability, and clinical trustworthiness of MDCalc's AI-powered features. You'll focus on the unique challenges of testing LLM-based systems, where outputs are non-deterministic, correctness is often a spectrum rather than a binary, and regressions can be subtle. You'll be part of a collaborative, fast-moving team that takes pride in delivering software that clinicians trust to care for millions of patients worldwide.

Responsibilities include, but are not limited to:

  • Design and execute test strategies for LLM-powered features, including prompt regression testing, output evaluation, and hallucination detection

  • Build and maintain automated evaluation pipelines (eval sets, golden datasets, LLM-as-judge frameworks) to catch quality regressions in non-deterministic outputs

  • Perform black-box and exploratory testing of MDCalc's AI features across web and mobile, with particular attention to clinical accuracy, safety, and edge cases

  • Define quality metrics for AI outputs (accuracy, faithfulness, relevance, safety, latency, cost) and establish thresholds for release readiness

  • Collaborate cross-functionally with engineers, product managers, ML/AI engineers, and clinical reviewers to define what "good" looks like for AI responses

  • Investigate and triage AI failure modes, distinguishing model issues, prompt issues, retrieval issues, and integration bugs

  • Participate in team discussions, offering feedback on testability, risks, prompt design, and guardrails

  • Help develop QA strategies to expand future testing capacity, automation, and evaluation coverage as the AI product surface grows

Your Background
  • 5+ years of experience in software QA, with at least 1 year of hands-on testing of LLM-based or AI/ML-powered features

  • Strong understanding of QA principles, test case creation/documentation, and best practices for both deterministic and non-deterministic systems

  • Hands-on experience with LLM tooling and concepts: prompt engineering, RAG systems, evaluation frameworks (e.g., Promptfoo, Braintrust, LangSmith, DeepEval, Ragas, OpenAI Evals), and LLM APIs (OpenAI, Anthropic, etc.)

  • Experience designing automated qualitative evaluation approaches, including LLM-as-judge, rubric-based scoring, semantic similarity checks, and golden dataset regression testing

  • Proficiency with test automation tools, with a focus on Playwright

  • Strong SQL skills for data validation, test data creation, and verifying data integrity across systems

  • Familiarity with token usage, latency profiling, and cost monitoring as quality signals

  • Eagerness to learn quickly and a positive, solutions-oriented attitude

  • Clear and concise communicator, able to surface issues, blockers, and risks effectively, including failures that are ambiguous or probabilistic

  • Self-motivated, proactive, and able to manage time and priorities independently

What MDCalc offers:
  • Ability to make a true difference in medicine: MDCalc is the most broadly used medical reference by physicians, used by over 65% of US attending doctors weekly

  • Medical, Dental, & Vision Coverage, with option to extend to your dependents

  • Company-sponsored short-term insurance

  • Fully paid 8-week parental leave, after 6 months of employment

  • Company-sponsored 401k, after 3 months of employment

  • Unlimited vacation for salaried roles - we trust you to take the time you need

  • Bi-annual company offsites to connect, reflect, and plan together

  • Monthly work-from-home stipend

  • A culture of fun and motivated team members who believe in a greater mission here at MDCalc
