Data Engineer | Remote

Hi,

Hope you are doing well.

I have a position with one of our clients. Below is the job description; let me know if you are interested.

Job Title: Data Engineer

Location: Remote

End Client: 84.51° (working with a prime vendor)

Work Authorization: U.S. citizens (USC) only

Job Description:

Top 3 Skills: Python, Databricks, Azure Cloud Services

Project: Clean Room Initiative – Measuring and reporting on campaign effectiveness

Interview Process: Panel interview with live technical screening assessment

Prescreening: Standard (5 questions + games)

We are a data science company and a wholly owned subsidiary of The Kroger Company. We own 10 petabytes of data and collect more than 35 terabytes of new data each week, sourced from 62 million households. As a member of our engineering team, you will use various cutting-edge technologies to further the Clean Room initiative by developing and scaling data pipelines that turn our data into actionable insights used to personalize the customer experience for shoppers at Kroger. We use agile development methodologies to iteratively build and deliver our products.

As a Senior Data Engineer, you will have the opportunity to design and build software products and features for both internal and external clients. We are a team of innovators, continuously exploring new technologies to ensure 84.51° remains at the forefront of data development. In this position, you will use software engineering best practices to build data pipelines utilizing Python, PySpark/Snowpark, Azure Cloud Services, and analytical platforms such as Databricks and Snowflake.

As part of the Clean Room engineering team, you will work with transaction, audience, and exposure data to measure and report on campaign effectiveness while ensuring that our data and third-party data are handled to the highest standards of privacy compliance.

Responsibilities

  • Develop distributed data processing pipelines to support large-scale data transformation and processing (a minimal sketch follows this list)
  • Orchestrate multi-step data transformation pipelines to ensure seamless data flow and integration
  • Perform unit, integration, and regression testing on packaged code to maintain system reliability and accuracy
  • Build transformation logic and code using Object-Oriented Programming principles to enhance maintainability and scalability
  • Enhance CI/CD pipelines to streamline the deployment process and improve efficiency in the path to production
  • Create data quality checks for both ingested and post-processed data to ensure accuracy and consistency
  • Implement data observability by setting up alerting and monitoring for automated pipeline solutions
  • Maintain and enhance existing applications, improving functionality and performance as needed
  • Build cloud resources using infrastructure as code to support scalable and efficient cloud deployments
  • Provide mentoring and technical guidance to junior team members to foster skill development and collaboration
  • Participate in retrospective reviews to evaluate and improve development processes
  • Contribute to the estimation process for new work and releases to ensure accurate planning and resource allocation
  • Bring new perspectives to problem-solving and contribute innovative ideas to enhance workflows and systems
  • Stay committed to continuous learning and process improvement, striving to improve personal skills and optimize development practices
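
A minimal sketch of the kind of pipeline work described above, assuming a hypothetical Delta table of exposure events on Databricks; the paths, table, and column names are illustrative only and not taken from the client:

    from pyspark.sql import SparkSession, functions as F

    # Hypothetical example: aggregate exposure data into a daily reach metric
    # and run a simple post-processing data quality check before writing.
    spark = SparkSession.builder.appName("campaign_effectiveness_sketch").getOrCreate()

    exposures = spark.read.format("delta").load("/mnt/clean_room/exposures")  # assumed input path

    daily_reach = (
        exposures
        .filter(F.col("event_type") == "impression")
        .groupBy("campaign_id", F.to_date("event_ts").alias("event_date"))
        .agg(F.countDistinct("household_id").alias("households_reached"))
    )

    # Data quality check: fail the run if any aggregated row is null or negative.
    bad_rows = daily_reach.filter(
        F.col("campaign_id").isNull() | (F.col("households_reached") < 0)
    ).count()
    if bad_rows > 0:
        raise ValueError(f"Data quality check failed: {bad_rows} invalid rows")

    daily_reach.write.format("delta").mode("overwrite").save("/mnt/clean_room/daily_reach")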

Requirements

  • 2+ years of professional data development experience
  • 2+ years of experience with SQL and Python development
  • Proficiency in software engineering best practices
  • Experience with data pipeline orchestration
  • Strong knowledge of distributed data processing using PySpark and/or Snowpark
  • Proficiency in automated testing with tools like PyTest (see the example after this list)
  • Experience with version control systems such as Git and GitHub
  • Proficiency in Python frameworks and object-oriented programming principles
  • Understanding of data observability, logging, monitoring, and alerting
  • Experience implementing data quality checks and processes
  • Familiarity with cloud technologies and services, with a preference for Azure (GCP and AWS also considered)
  • Experience with CI/CD practices and dependency management tools like Conda or venv
  • Ability to debug enterprise applications and optimize performance
  • Understanding of infrastructure as code and SOLID principles
  • Experience working within Agile methodologies, particularly Scrum
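
The automated testing called out above typically means PyTest-style unit tests around transformation logic; a small hypothetical example (the metric and function names are illustrative, not from the client):

    import pytest

    # Hypothetical transformation helper: unique households reached per 1,000 impressions.
    def households_per_thousand_impressions(households: int, impressions: int) -> float:
        if impressions <= 0:
            raise ValueError("impressions must be positive")
        return households / impressions * 1000


    def test_households_per_thousand_impressions():
        assert households_per_thousand_impressions(50, 2000) == pytest.approx(25.0)


    def test_rejects_non_positive_impressions():
        with pytest.raises(ValueError):
            households_per_thousand_impressions(10, 0)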

Nice to Have

  • Advanced Python development and object-oriented programming
  • Experience with Databricks and/or Snowflake
  • Strong background in distributed data processing with PySpark or Snowpark
  • Hands-on experience with CI/CD practices for data pipelines
  • Proficiency in automated data pipeline orchestration
  • Deep understanding of API development and integration
  • Expertise in implementing robust data quality checks
  • Extensive experience working with cloud technologies for scalable data solutions
