WebMD and its affiliates are an Equal Opportunity/Affirmative Action employer and do not discriminate on the basis of race, ancestry, color, religion, sex, gender, age, marital status, sexual orientation, gender identity, national origin, medical condition, disability, veteran status, or any other basis protected by law.

About PulsePoint:
PulsePoint is a fast-growing healthcare technology company (with adtech roots) using real-time data to transform healthcare. We help brands and agencies interpret the hard-to-read signals across the health journey and unify these digital determinants of health with real-world data to produce the most dimensional view of the customer. Our award-winning advertising platforms use machine learning and programmatic automation to seamlessly activate this data, making marketing, predictive analytics, and decision support easy and instantaneous.

Description
Our Data & Analytics team is at the very heart of what makes PulsePoint an innovative, fast-paced, and market-changing company. Our path forward is through data, and this team is in the driver's seat for the journey.

The Big Picture:
You will build, deliver, and continually innovate on PulsePoint's insightful reporting and data-driven solutions. Your efforts help alleviate friction points and streamline processes that enable internal teams to provide exceptional service, powering the decisions of our customers.

As a Data Engineer, ML/Data Science, you will use your data science and stats expertise to contribute to R&D projects for DTC, new Data Products, and Bespoke Segments expansion. You can work fully remotely in India, and we will provide you with a company-issued laptop.
This is an FTE role.

In short, you will be the conduit through which we will revolutionize health decisions through real-time data.

Key Responsibilities:
- Write robust, modular, production-ready code in Python and SQL, following best practices in OOP, version control (Git), and software design principles.
- Collaborate with other data scientists to design and productionize ML models and integrate them into end-to-end data systems.
- Build tools, frameworks, and ETL/ELT pipelines to enable efficient data access, processing, and model deployment.
- Apply a working knowledge of common ML algorithms (classification, regression, clustering, etc.) to support experimentation and solution design.

Here are some projects you can help with:
- Data science & stats-related projects
- Work on R&D projects for DTC
- Help build new Data Products
- Contribute to Bespoke Segments expansion
- Help us design and define the methodology for our measurement products and user identification
- Continuously improve the quality of HCP onboarding/Targeting/measurement
- Audience IQ/DTC product development, Identity graph/Data IQ
- Collaborate with internal teams to delight our customers with timely and accurate data reporting that meets all requirements
- Research & implement new data products or capabilities
- Automate data visualization and reporting capabilities that empower users (both internal and external) to access data on their own, thereby improving quality, accuracy, and speed
- Synthesize raw data into actionable insights to drive business results, identify key trends and opportunities for business teams, and report the findings in a simple, compelling way
- Evaluate and approve additional data partners or data assets to be utilized for identity resolution, targeting, or measurement
- Enhance PulsePoint's data reporting and insights generation capability by publishing internal reports about Health data
- Act as the “Subject Matter Expert” to help internal teams understand the capabilities of our platforms and how to implement & troubleshoot them

Requirements

Required qualifications:
- 2-6 years of hands-on experience as a Data Science Engineer, ML Engineer, or similar role
- 4-5+ years of relevant experience in:
  - Strong SQL skills for querying and managing structured datasets on cloud databases like GCP, AWS, Trino, etc.
  - High proficiency in Excel (pivot tables, VLOOKUP, formulas, functions)
  - Data analysis & manipulation
  - Solid programming experience in Python, especially in production environments (modular design, data validation, error handling, testing)
- At least a Bachelor’s degree in Business Intelligence and Analytics or a closely related field
- Practical experience with:
  - Distributed Systems and Cluster computing frameworks like Apache Spark, for large-scale data processing and machine learning with PySpark ML
  - Google Cloud Architecture covering BigQuery, Cloud Storage (GCS), Compute Engine VMs, Dataproc clusters
  - ML Pipeline Orchestration
  - Deploying and managing ML models, with working knowledge of Bagging & Boosting Techniques, Model performance metrics, hyperparameter tuning, etc.
  - MLOps practices, with exposure to MLflow, Vertex AI, or other MLOps tools
- Experience with Containerization (Docker) and Kubernetes
- Knowledge of Airflow, Dagster, or similar orchestration tools
- Proven experience in experimentation methods and Stats modeling in support of product development and optimization
- Willing and able to work 3:30pm-12:30am IST; you can work fully remotely

Preferred qualifications:
- Experience with LookML & DBT
- Understanding of Frontend Dev Tools
- And one of:
  - ELT experience
  - Tableau/Looker/PowerBI
  - Experience with automation
- Able to organize large data sets to answer critical questions, extrapolate trends, and tell a story
- Experience in Programmatic/Adtech
- Familiarity with health-related data sets

What are ‘red flags’ for us:
Candidates won’t succeed here if they haven’t worked closely with data sets or have simply translated requirements created by others into SQL without a deeper understanding of how the data impacts our business and, in turn, our clients’ success metrics.

Selection Process (order of these sessions may be subject to change):
1) Online SQL Test (40 mins)
2) Initial Screen (30 mins)
3) Hiring Manager Interview (45 mins)
4) Video call w/ Sr. Data Scientist (45 mins)
5) 1:1s w/ SVP of Data (30 mins)
Job Title
(Remote, India) Data Engineer, ML/Data Science