This is an incredible opportunity to be part of a company that has been at the forefront of AI and high-performance data storage innovation for over two decades. DataDirect Networks (DDN) is a global market leader renowned for powering many of the world's most demanding AI data centers, in industries ranging from life sciences and healthcare to financial services, autonomous cars, government, academia, research, and manufacturing.

“DDN's A3I solutions are transforming the landscape of AI infrastructure.” – IDC

“The real differentiator is DDN. I never hesitate to recommend DDN. DDN is the de facto name for AI Storage in high performance environments.” – Marc Hamilton, VP, Solutions Architecture & Engineering | NVIDIA

DDN is the global leader in AI and multi-cloud data management at scale. Our cutting-edge data intelligence platform is designed to accelerate AI workloads, enabling organizations to extract maximum value from their data. With a proven track record of performance, reliability, and scalability, DDN empowers businesses to tackle the most challenging AI and data-intensive workloads with confidence. Our success is driven by our unwavering commitment to innovation, customer-centricity, and a team of passionate professionals who bring their expertise and dedication to every project.

This is a chance to make a significant impact at a company that is shaping the future of AI and data management. Our commitment to innovation, customer success, and market leadership makes this an exciting and rewarding role for a driven professional looking to make a lasting impact in the world of AI and data storage.

Principal Architect – Quality Engineering Framework

About the Role
As Principal Architect for Quality Engineering Frameworks, you will own the technical vision, architecture, and evolution of Infinia’s pytest‑based automation platform.
This role is responsible for transforming automation into a scalable, developer‑centric service that validates correctness, performance, resilience, and real‑world behavior across highly distributed systems.

This is a hands‑on, staff‑plus individual contributor role with broad organizational influence. You will write and review production‑quality code, define architectural standards, and mentor engineers across QE and Development.

Responsibilities

Framework Architecture
· Own the end‑to‑end architecture and technical direction of the Python/pytest automation framework.
· Define architectural standards, extension points, and long‑term evolution.
· Make and document architectural tradeoffs through design reviews and Architectural Decision Records.
· Ensure the framework scales with system complexity and organizational growth.

Automation Platform and Reusable Tooling
· Design and implement reusable Python libraries, pytest fixtures, and plugins.
· Provide a self‑service automation platform with a clear, opinionated paved road for developers.
· Enable testing of APIs, CLIs, storage systems, and distributed workloads using shared abstractions.
· Maintain strict standards for determinism, readability, and maintainability.

System Correctness, POSIX, S3, and Storage Validation
· Architect automation validating POSIX filesystem semantics, including metadata operations, locking, concurrency, permissions, and consistency.
· Validate object, block, and networked storage systems including S3‑compatible object storage, NVMe/iSCSI, and NFS/SMB.
· Ensure correctness under failure scenarios, scale, and sustained load.

Distributed Systems, Resilience, and Scale
· Design automation covering clustering behavior, membership changes, failover, and recovery.
· Validate horizontal and elastic scaling in real deployment conditions.
· Extend automation into repeatable resilience and chaos testing beyond simple failure injection.

Performance and Stress Testing
· Integrate performance and stress testing into CI/CD pipelines.
· Use tools such as fio, IOR, MinIO Warp, Mongoose, and MLPerf.
· Validate throughput, latency, and stability and continuously detect performance regressions.

Cloud and Execution Environments
· Architect automation to execute consistently across AWS and GCP.
· Support execution on Kubernetes, Docker, hypervisors, and bare‑metal systems.
· Validate cloud‑specific behaviors including autoscaling, contention, networking variability, and zonal failure modes.
· Balance execution scale, cost, and feedback time.

Telemetry‑Driven Validation
· Integrate automation with Grafana, Prometheus, and ELK.
· Validate system behavior using metrics and logs in addition to test assertions.
· Enable deep diagnostics and root‑cause analysis from automated runs.

Code Quality and Technical Leadership
· Lead code reviews for framework and automation contributions.
· Enforce architectural and coding standards across the automation repository.
· Act as a technical authority for test design and framework usage.
· Mentor QE and Development engineers in Python, pytest, and automation architecture.

Job Title
QA Architect - Storage