Job Title: Mid-Level QA Engineer – Automated Testing & AI Validation
Experience: 3+ Years
Employment Type: Contract (Short-Term)

Overview

We are seeking a Mid-Level QA Engineer – Automated Testing & AI Validation with hands-on experience in testing AI-powered systems, particularly LLM-driven applications. The ideal candidate will have a strong QA automation background, a growing understanding of probabilistic AI outputs, and enthusiasm for building and improving AI evaluation frameworks in real-world production environments. This role requires high autonomy, comfort with ambiguity, and strong written communication skills in an async-first setup.

Key Responsibilities

- Design, develop, and execute automated test cases in Python and Pytest for AI-driven applications.
- Validate LLM integrations, APIs, and multi-agent workflows through functional, regression, and smoke testing.
- Perform intent classification, semantic similarity, and response consistency testing for conversational AI systems.
- Conduct hallucination detection and factual accuracy checks using automated and semi-automated methods.
- Implement response quality scoring using LLM-as-a-Judge evaluation patterns.
- Use LLM observability and tracing tools such as Langfuse or LangSmith to monitor and validate model behavior.
- Test conversational AI applications, including chatbots and virtual assistants, across use cases.
- Support Kubernetes-based application health checks and basic smoke testing.
- Integrate automated tests into CI/CD pipelines using GitHub Actions.
- Document test cases, evaluation criteria, defects, and QA findings clearly and concisely.
- Collaborate with engineering and AI teams to improve evaluation pipelines and testing strategies.
- Continuously learn about and contribute to advanced AI evaluation methodologies.

Skills

- 3–5 years of experience in QA or test automation roles.
- Strong proficiency in Python-based test frameworks (Pytest).
- Basic to intermediate understanding of LLM evaluation concepts and AI system testing.
- Hands-on experience with LLM observability and tracing tools (Langfuse, LangSmith, or similar).
- Experience in API testing and validation of AI integrations.
- Exposure to conversational AI or chatbot testing.
- Familiarity with CI/CD processes and GitHub Actions.
- Knowledge of Kubernetes-based environments for testing and validation.
- Strong documentation and defect-reporting skills.
- Ability to evaluate semantic correctness rather than relying solely on strict assertion-based checks.
- Self-driven mindset with the ability to work independently and handle ambiguity.

Preferred Skills

- Experience testing production-grade LLM systems with real user traffic.
- Exposure to LLM-as-a-Judge frameworks and custom evaluation pipelines.
- Familiarity with Azure-based infrastructure (AKS, Key Vault, PostgreSQL).
- Understanding of multi-agent frameworks such as LangGraph or the Microsoft Agent Framework.
- Interest in AI ethics, responsible AI, and high-quality evaluation practices.