Job Description
The road to the future is uncharted. By combining our expertise across connectivity, AI, security and more, we’ll map a new way forward. Working together, we’ll create a future that’s more connected, more intelligent, more sustainable for everyone.
PySpark Pipeline Development: Design, develop, deploy, and enhance petabyte-scale Spark Structured Streaming pipelines using PySpark on Databricks.
Databricks Platform Ownership: Understand the intricacies of the Databricks Platform, manage our Databricks deployment using Terraform infrastructure as code, and possess a firm grasp of data governance concepts.
CI/CD & Infrastructure Management: Take ownership of the pipeline deployment lifecycle. Implement and manage robust CI/CD pipelines using Terraform to ensure smooth, automated, and reliable deployments.
Performance Optimization: Monitor pipeline performance, identify bottlenecks, and implement optimizations so that pipelines operate at peak efficiency even as data volumes grow.
Troubleshooting & Issue Resolution: Proactively identify and resolve issues that may arise. The ability to diagnose problems quickly and implement effective solutions will be critical to maintaining system uptime.
Collaborative Development: Collaborate with data producers, data consumers, and data stewards to deliver dependable and scalable solutions.
Staying Ahead of the Curve: Keep abreast of the latest advancements in big data technologies, data engineering practices, and cloud computing. Explore new tools and techniques to improve our data infrastructure.
Education: Bachelor's or Master's degree in Computer Science, Data Science, or a related field.
Experience: 5+ years of hands-on experience in data engineering or a similar role, with a proven track record of building and maintaining large-scale data processing systems.
Spark Expertise: Deep proficiency in Apache Spark for data engineering, with a strong understanding of best practices for building high-performance applications. Proficiency with Structured Streaming preferred.
Big Data Technologies: Solid understanding of big data ecosystems such as Databricks. Experience with data warehousing and data lake concepts is a plus.
Problem-Solving Skills: Exceptional analytical and problem-solving skills, with the ability to break down complex problems into manageable components and develop effective solutions.
Communication & Collaboration: Excellent communication and interpersonal skills, with the ability to collaborate effectively with cross-functional teams.
Delta Lake: Familiarity with open source or Databricks proprietary Delta tables.
Equal Opportunity
Rivian and Volkswagen Group Technologies is committed to ensuring that our hiring process is accessible for persons with disabilities. If you have a disability or limitation, such as those covered by the Americans with Disabilities Act, that requires accommodations to assist you in the search and application process, please email us at candidateaccommodations@rivian.com.
Candidate Data Privacy
Rivian and Volkswagen Group Technologies may share your Candidate Personal Data with (i) internal personnel who have a need to know such information in order to perform their duties, including individuals on our People Team, Finance, Legal, and the team(s) with the position(s) for which you are applying; (ii) Rivian and Volkswagen Group Technologies affiliates; and (iii) Rivian and Volkswagen Group Technologies’ service providers, including providers of background checks, staffing services, and cloud services.
Rivian and Volkswagen Group Technologies may transfer or store internationally your Candidate Personal Data, including to or in the United States, Canada, and the European Union and in the cloud, and this data may be subject to the laws and accessible to the courts, law enforcement and national security authorities of such jurisdictions.
Please note that we are currently not accepting applications from third-party application services.