We are searching for professionals matching the business requirements below for one of our clients. Please read through the requirements and connect with us if they suit your profile.
[email protected]
Implementation Partner: HCL America
Key Responsibilities:
· Design, develop, and implement scalable, high-performance data solutions on GCP.
· Ensure that changes to data access permissions are reflected in the Tableau dashboard within 24 hours.
· Collaborate with technical and business users to share and manage data sets across multiple projects.
· Utilize GCP tools and technologies to optimize data processing and storage.
· Re-architect the data pipeline that builds the BigQuery dataset used for GCP IAM dashboards to make it more scalable.
· Run and customize DLP scans (see the illustrative sketch following this list).
· Build bidirectional integrations between GCP and Collibra.
· Explore and potentially implement Dataplex and custom format-preserving encryption to de-identify data for developers in lower environments.
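
For illustration only, below is a minimal Python sketch of running a customized DLP inspection scan with the google-cloud-dlp client, as referenced in the DLP responsibility above. The project ID, info types, and sample text are placeholders, not details from this role.

# Minimal sketch of a customized Cloud DLP inspection scan.
# Project ID, info types, and sample text are illustrative placeholders.
from google.cloud import dlp_v2

def run_dlp_scan(project_id: str, text: str) -> None:
    client = dlp_v2.DlpServiceClient()
    inspect_config = {
        # Customize the scan: which info types to detect and the sensitivity threshold.
        "info_types": [{"name": "EMAIL_ADDRESS"}, {"name": "US_SOCIAL_SECURITY_NUMBER"}],
        "min_likelihood": dlp_v2.Likelihood.POSSIBLE,
        "include_quote": True,
    }
    response = client.inspect_content(
        request={
            "parent": f"projects/{project_id}",
            "inspect_config": inspect_config,
            "item": {"value": text},
        }
    )
    # Report each finding with its info type, likelihood, and matched text.
    for finding in response.result.findings:
        print(finding.info_type.name, finding.likelihood, finding.quote)

if __name__ == "__main__":
    run_dlp_scan("my-gcp-project", "Contact me at jane.doe@example.com")
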
Qualifications – Required:
· Bachelor's degree in Computer Engineering or a related field.
· 5+ years of experience in an engineering role using Python, Java, Spark, and SQL.
· 5+ years of experience working as a Data Engineer in GCP.
· Proficiency with Google’s Identity and Access Management (IAM) API.
· Strong Linux/Unix background and hands-on knowledge.
· Experience with big data technologies such as HDFS, Spark, Impala, and Hive.
· Experience with shell scripting and Bash.
· Experience with version control platforms like GitHub.
· Experience with unit testing code.
· Experience with development ecosystems including Jenkins, Artifactory, CI/CD, and Terraform.
· Demonstrated proficiency with Airflow (see the Airflow sketch following this list).
· Ability to advise management on approaches to optimize for data platform success.
· Ability to effectively communicate highly technical information to various audiences, including management, the user community, and less-experienced staff.
· Proficiency in multiple programming languages, frameworks, domains, and tools.
· Coding skills in Scala.
· Experience with GCP platform development tools such as Pub/Sub, Cloud Storage, Bigtable, BigQuery, Dataflow, Dataproc, and Composer.
· Knowledge of Hadoop, cloud platforms, and their surrounding ecosystems.
· Experience with web services and APIs (RESTful and SOAP).
· Ability to document designs and concepts.
· Experience with API orchestration and choreography for consumer apps.
· Well-rounded technical expertise in Apache packages and hybrid cloud architectures.
· Experience creating and automating data acquisition pipelines.
· Experience designing and building metadata extraction pipelines between raw and transformed datasets.
· Experience collecting quality-control metrics on data acquisition pipelines.
· Experience contributing to and leveraging Jira and Confluence.
· Strong experience working with real-time streaming applications and batch-style large-scale distributed computing applications using tools like Spark, Kafka, Flume, Pub/Sub, and Airflow.
· Ability to work with different file formats like Avro, Parquet, and JSON.
· Hands-on experience in the Analysis, Design, Coding, and Testing phases of the Software Development Life Cycle (SDLC).
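
For illustration only, here is a minimal Airflow DAG sketch in the spirit of the Airflow and BigQuery pipeline items above. The DAG id, project, dataset, and SQL are hypothetical placeholders, not details of the client's actual pipeline.

# Minimal sketch of an Airflow DAG that rebuilds a BigQuery table on a schedule.
# All names (DAG id, project, dataset, table) are illustrative placeholders.
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator

with DAG(
    dag_id="rebuild_iam_dashboard_dataset",  # hypothetical DAG name
    schedule="@daily",
    start_date=datetime(2024, 1, 1),
    catchup=False,
) as dag:
    rebuild_table = BigQueryInsertJobOperator(
        task_id="rebuild_iam_bindings_table",
        configuration={
            "query": {
                "query": """
                    CREATE OR REPLACE TABLE `my-project.iam_dashboards.bindings` AS
                    SELECT * FROM `my-project.iam_raw.bindings_latest`
                """,
                "useLegacySql": False,
            }
        },
    )
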