Senior NLP Data Engineer

GlaxoSmithKline

Cambridge, MA
Full Time
Senior
about 1 month ago

Job Description

About the Role

A Senior NLP Data Engineer at GSK's Onyx Research Data Platform organization will be a leading technical contributor responsible for designing, building, and maintaining data tools, services, and pipelines that support AI-driven products and NLP/GenAI solutions. The role involves collaborating with cross-functional teams to leverage modern data engineering tools and techniques, ensuring high-quality, scalable, and maintainable data systems that accelerate GSK's predictive capabilities and scientific research efforts.

Key Responsibilities

  • Design, build, and operate data tools, services, workflows, and pipelines that deliver high value through AI-driven products using modern data engineering tools (e.g., Spark, Kafka, Storm) and orchestration tools (e.g., Google Workflow, Airflow Composer).
  • Partner with AI/ML and knowledge graph platform teams to build, test, and deploy NLP and GenAI pipelines, systems, and solutions.
  • Apply graph-based data modeling techniques for efficient organization, integration, and data retrieval to ensure system flexibility and maintainability.
  • Produce well-engineered software, including automated test suites, technical documentation, and operational strategies.
  • Identify opportunities to reuse modular code and develop microservices to improve efficiencies.
  • Contribute to the roadmaps of upstream teams (e.g., Data Platforms, DataOps, DevOps) to enhance overall program effectiveness.
  • Ensure consistent application of platform abstractions for logging and lineage to maintain quality and traceability.
  • Participate in code reviews, adhere to coding best practices, and promote team standards.
  • Follow the QMS framework and CI/CD best practices, and help improve them.
  • Provide leadership and mentorship to team members to ensure high-quality delivery and operational robustness.

Requirements

  • Bachelor's degree in Data Engineering, Computer Science, Software Engineering, or related discipline.
  • 5+ years of industry experience in data engineering.
  • Knowledge of NLP and GenAI techniques, with experience processing unstructured data and using vector stores and approximate retrieval.
  • Experience building end-to-end systems based on machine learning or deep learning methods.
  • Experience overcoming high-volume, high-compute challenges.
  • Familiarity with orchestration tooling.
  • Cloud experience (e.g., AWS, Google Cloud, Azure).
  • Experience with automated testing and design.
  • Experience with DevOps practices.
  • Deep knowledge of at least one programming language (e.g., Python, Scala, Java).
  • Experience with big data tools (e.g., Spark, Kafka, Storm).
  • Proven experience with machine learning algorithms and NLP frameworks such as PyTorch, TensorFlow, and spaCy.
  • Experience with CI/CD implementations using Git and tools such as Jenkins, CircleCI, GitLab, and Azure DevOps.
  • Experience working in agile environments using tools like Jira and Confluence.
  • Experience with Infrastructure as Code and automation tools such as Terraform.

Nice to Have

  • Master's or PhD in Data Engineering, Computer Science, Software Engineering, or related discipline.
  • Good understanding of ontologies and semantic harmonization of data across sources.
  • Experience implementing Generative AI solutions.
  • Proven track record working with knowledge graphs and graph databases.
  • Proficiency in semantic web technologies (SPARQL, RDF, OWL).
  • Experience working with complex biomedical datasets, including genomics, proteomics, and high-throughput screening.

Benefits & Perks

  • Visit GSK US Benefits Summary to learn more about the comprehensive benefits program GSK offers US employees.

Working at GlaxoSmithKline

GSK is a global biopharma company committed to uniting science, technology, and talent to get ahead of disease. The company values an inclusive, inspiring, and growth-oriented environment where employees are encouraged to be themselves, feel valued, and contribute to impactful health solutions. GSK emphasizes innovation, collaboration, and employee wellbeing, fostering a workplace where people can thrive and make a difference in global health.

Job Details

Posted At: Jul 12, 2025
Job Category: Data Engineering
Salary: Competitive salary
Job Type: Full Time
Work Mode: Onsite
Experience: Senior

About GlaxoSmithKline

Website

gsk.com

Company Size

10000+ employees

Location

Cambridge, MA

Industry

Pharmaceutical and Medicine Manufacturing
