Reinforcement Learning Engineer, Physical Intelligence (Humanoids)
MethdAI | New Delhi, India | Full-Time | On-Site

About MethdAI

Our Physical Intelligence initiative is where cutting-edge robotics meets real-world deployment, and we're just getting started. We are building humanoid robots that think, adapt, and act. If you want your code to move steel and change what robots can do in the real world, this is your seat at the table.

The Role

We are looking for a Reinforcement Learning Engineer who is equal parts researcher and builder. You will own the full RL pipeline, from policy design in simulation to hardware deployment on a real humanoid robot. This is a foundational hire on an early-stage team where your decisions directly shape the product and the science behind it.

What You'll Work On

- Design and iterate on RL-based grasping policies for real-world robotic manipulation tasks, pushing the boundaries of what our humanoid arm can autonomously achieve.
- Benchmark SB3 algorithms (PPO, SAC, TD3, and beyond) against manipulation and locomotion tasks, building rigorous evaluation pipelines to guide algorithm selection.
- Build and maintain sim-to-real pipelines, closing the gap between simulated training environments and the behaviour of physical hardware.
- Deploy trained policies on real humanoid hardware, collaborating closely with the robotics team on integration, testing, and iteration.
- Instrument and evaluate experiments end-to-end: reward shaping, exploration tuning, policy stability, and transfer robustness.

What We're Looking For

Must-Have:

- M-Tech in AI/Robotics or B-Tech/M-Tech in Computer Science, Electrical or Mechanical Engineering
- Strong Python skills with a commitment to clean, modular, and testable code
- Solid command of RL fundamentals: MDPs, policy gradients, value functions, actor-critic architectures, reward shaping, and exploration strategies
- Hands-on experience training and comparing policies using Stable Baselines3 (PPO, SAC, TD3, or equivalents)
- Working knowledge of robotic arm kinematics and the sim-to-real transfer problem
- Ability to operate independently in an ambiguous, fast-moving environment

Nice-to-Have:

- Experience with simulation platforms such as MuJoCo or Isaac Sim
- Familiarity with imitation learning, behaviour cloning, or inverse RL
- Prior work deploying policies on physical robotic hardware
- Contributions to open-source RL or robotics middleware (e.g., ROS/ROS2)

What We Offer

- Hands-on access to real humanoid hardware: you'll deploy and test policies on an actual robot, not just in simulation
- Full creative freedom to explore approaches; we value intellectual courage and novel thinking over rigid playbooks
- A rare opportunity to be an early team member shaping the product, the codebase, and the culture of Physical Intelligence at MethdAI
- Direct mentorship and collaboration in a high-ownership, low-bureaucracy environment
- Competitive compensation commensurate with experience