Lead Data Engineer

McKesson

Irving, TX
Full Time
Senior
150k-250k
6 days ago

Job Description

About the Role

McKesson is an impact-driven, Fortune 10 company that touches virtually every aspect of healthcare. We are known for delivering insights, products, and services that make quality care more accessible and affordable. Here, we focus on the health, happiness, and well-being of you and those we serve - we care. We foster a culture where you can grow, make an impact, and are empowered to bring new ideas. Together, we thrive as we shape the future of health for patients, our communities, and our people. If you want to be part of tomorrow's health today, we want to hear from you.

Key Responsibilities

  • Lead the design and development of enterprise data assets including data models, feature stores, and analytical datasets using the Azure modern data stack
  • Architect scalable data pipelines and ETL/ELT processes leveraging Azure Data Factory, Azure Synapse Analytics, and Azure Data Lake Storage
  • Implement advanced data processing solutions using Apache Spark on Azure Databricks for large-scale data transformation and analytics
  • Develop reusable data frameworks and libraries in Python to accelerate data asset creation and ensure consistency across the organization
  • Establish data asset governance including versioning, lineage tracking, and quality monitoring to ensure enterprise-grade reliability
  • Lead and mentor a team of 8-12 data engineers focused on data asset development and optimization
  • Provide technical guidance on complex data engineering challenges, architectural decisions, and best practices
  • Foster a collaborative development environment emphasizing code quality, testing, and continuous improvement
  • Drive knowledge sharing initiatives and technical training to elevate team capabilities in modern data engineering practices
  • Collaborate with cross-functional teams including Data Science, Analytics, and Business Intelligence to deliver integrated data solutions
  • Optimize Azure Synapse Analytics workflows for high-performance data processing and analytical workloads
  • Implement efficient data storage strategies using Azure Data Lake Storage Gen2 with appropriate partitioning and compression techniques
  • Leverage Azure Data Factory for orchestrating complex data workflows and managing data pipeline dependencies
  • Utilize Azure Databricks for advanced Spark-based data processing, machine learning pipelines, and real-time analytics
  • Integrate with Azure services including Cosmos DB, Event Hubs, and Service Bus for comprehensive data ecosystem solutions
  • Develop sophisticated data processing applications using Python with an emphasis on performance, scalability, and maintainability
  • Implement advanced Spark programming techniques including RDD operations, the DataFrame API, and Spark SQL for optimal data processing
  • Leverage PySpark for large-scale data transformations, aggregations, and complex analytical computations
  • Utilize Spark Streaming for real-time data processing and event-driven analytics solutions
  • Implement Delta Lake patterns for reliable data lakes with ACID transactions and time travel capabilities
  • Establish comprehensive data quality frameworks including validation, profiling, and anomaly detection
  • Implement performance monitoring and optimization strategies for data pipelines and processing workflows
  • Design and implement data testing strategies including unit testing, integration testing, and data validation
  • Optimize Spark jobs for cost efficiency and performance including cluster sizing, caching strategies, and partition optimization
  • Maintain data asset documentation, metadata management, and knowledge transfer processes

Requirements

  • Degree or equivalent, typically with 10+ years of relevant experience; fewer years if holding a relevant Master's or Doctorate
  • Expert-level proficiency in Python programming including advanced libraries (Pandas, NumPy, Scikit-learn, PyTorch/TensorFlow)
  • Deep expertise in Apache Spark ecosystem including Spark Core, Spark SQL, PySpark, and Spark Streaming
  • Extensive experience with Microsoft Azure data services including Azure Data Factory, Azure Synapse Analytics, Azure Data Lake Storage, and Azure Databricks
  • Strong background in SQL and database technologies including data modeling, query optimization, and performance tuning
  • Proficiency in version control systems (Git), CI/CD pipelines, and infrastructure-as-code practices
  • Proven experience leading technical teams of 5+ data engineers with a focus on mentorship and skill development
  • Strong project management skills with the ability to deliver complex data engineering projects on time and within scope
  • Advanced Python programming with a focus on data processing, analysis, and pipeline development
  • Experience with batch processing, real-time data processing, streaming analytics, and event-driven architectures
  • Knowledge of data governance, data quality, and metadata management best practices

Nice to Have

  • Azure certifications including Azure Data Engineer Associate or Azure Solutions Architect Expert
  • Databricks certifications (Spark Developer, Data Engineer Professional)
  • Experience with additional Azure services including Azure Machine Learning, Azure Cognitive Services, and Azure Functions
  • Knowledge of container technologies (Docker, Kubernetes) and serverless computing patterns
  • Understanding of data security, privacy, and compliance requirements in enterprise environments
  • Deep understanding of data architecture patterns including data lakes, data warehouses, and modern data platform design
  • Knowledge of async programming, multiprocessing, and performance optimization techniques
  • Familiarity with testing frameworks (pytest, unittest) and code quality tools (Black, Flake8, MyPy)
  • Advanced Azure Synapse Analytics usage including dedicated SQL pools, serverless SQL, and Spark pools

Benefits & Perks

  • Competitive compensation package including base salary, performance bonus, and equity participation
  • Comprehensive benefits including health, dental, vision, and retirement planning
  • Professional development opportunities including training, certifications, and conference attendance
  • Opportunity to work with cutting-edge data technologies at Fortune 10 scale
  • Collaborative culture emphasizing technical excellence, innovation, and continuous learning
  • Clear career advancement path within our growing data engineering organization

Working at McKesson

McKesson fosters a culture emphasizing impact, growth, innovation, technical excellence, and continuous learning. We are committed to diversity and inclusion, supporting our employees' well-being and professional development.

Job Details

Posted At: Jun 16, 2025
Job Category: Data Engineering
Salary: 150k-250k
Job Type: Full Time
Experience: Senior

About McKesson

Website

mckesson.com

Company Size

10000+ employees

Location

Irving, TX

Industry

Drugs and Druggists' Sundries Merchant Wholesalers
