Overview
We are seeking an experienced, hands-on Databricks Specialist to join our partner’s team. In this role, you will audit current workloads and engineer robust, efficient data-processing pipelines.
Key Responsibilities
- Audit existing Databricks workspaces, notebooks, and jobs for performance, security, and cost-efficiency.
- Design and build modular batch and streaming sub-pipelines that land, cleanse, enrich, and publish data as physical tables across the bronze, silver, and gold medallion layers.
- Implement robust exception handling, alerting, and automated rollback/retry mechanisms (a minimal sketch of one such step follows this list).
- Create comprehensive documentation, reusable code samples, and lead knowledge transfer sessions for data engineers and analysts.
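For illustration, here is a minimal PySpark sketch of a bronze-to-silver step wrapped in a simple retry loop. The table names, columns, and retry policy are assumptions made for this example, not a prescribed design.

```python
# Sketch of a bronze -> silver sub-pipeline step with a simple retry
# wrapper. Table names, schema, and retry policy are illustrative
# assumptions, not a prescribed design.
import time

from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.getOrCreate()


def with_retries(fn, attempts=3, backoff_seconds=30):
    """Re-run a pipeline step a few times before surfacing the failure."""
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == attempts:
                raise  # let the job fail so alerting can fire
            time.sleep(backoff_seconds * attempt)


def bronze_to_silver():
    # Cleanse and enrich raw events, then publish a physical silver table.
    raw = spark.read.table("bronze.events")  # hypothetical table
    cleansed = (
        raw.dropDuplicates(["event_id"])
           .filter(F.col("event_ts").isNotNull())
           .withColumn("ingest_date", F.to_date("event_ts"))
    )
    (cleansed.write.format("delta")
             .mode("overwrite")
             .saveAsTable("silver.events"))  # hypothetical table


with_retries(bronze_to_silver)
```

In practice, orchestration-level retries (for example, task retries in Databricks Workflows) can replace the hand-rolled loop; the sketch only shows the shape of the pattern.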
Required Skills & Experience
- 5+ years of experience building production-grade ETL/ELT pipelines on Databricks (Spark 3.x, Delta Lake).
- In-depth knowledge of the medallion architecture, performance tuning (e.g., OPTIMIZE with Z-Ordering), and cost-effective cluster configurations (see the tuning sketch after this list).
- Proficiency in PySpark or Scala, along with advanced SQL skills, including job profiling and fine-tuning at the shuffle-partition level.
- Demonstrated experience building reliable pipelines with data quality checks, unit testing, and CI/CD (e.g., Databricks Repos, GitHub Actions, Azure DevOps); a test sketch follows this list.
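As a concrete example of the tuning called out above, the sketch below compacts a Delta table with OPTIMIZE and Z-Ordering and sets shuffle parallelism explicitly. The table name, Z-order column, and partition count are assumptions; the right values depend on data volume and cluster size.

```python
# Illustrative tuning snippet for a Databricks runtime: compact a Delta
# table, co-locate data on a frequently filtered column, then size
# shuffle parallelism for a heavy join. The table name, column, and
# partition count are assumptions made for this sketch.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Compact small files and Z-order by a commonly filtered column.
spark.sql("OPTIMIZE silver.events ZORDER BY (customer_id)")

# Tune shuffle parallelism for the workload instead of the 200 default;
# the right value depends on input size and available cores.
spark.conf.set("spark.sql.shuffle.partitions", "512")
```

Note that on recent Databricks runtimes, adaptive query execution can coalesce shuffle partitions automatically, so an explicit setting is a starting point rather than a fixed rule.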
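And a small example of the testing practice in the last bullet: a pytest unit test that exercises a cleansing function against a local SparkSession. The cleanse() function and its rules are hypothetical, introduced only for this sketch.

```python
# Sketch of a unit test for a cleansing step, runnable locally with
# pytest and pyspark. The cleanse() function is an assumption for this
# example, not the team's actual transformation.
import pytest
from pyspark.sql import SparkSession, functions as F


@pytest.fixture(scope="module")
def spark():
    return (SparkSession.builder
            .master("local[2]")
            .appName("dq-tests")
            .getOrCreate())


def cleanse(df):
    # Drop duplicate events and rows missing a timestamp.
    return (df.dropDuplicates(["event_id"])
              .filter(F.col("event_ts").isNotNull()))


def test_cleanse_removes_duplicates_and_nulls(spark):
    rows = [
        ("e1", "2024-01-01"),
        ("e1", "2024-01-01"),  # duplicate
        ("e2", None),          # missing timestamp
    ]
    df = spark.createDataFrame(rows, ["event_id", "event_ts"])
    result = cleanse(df)
    assert result.count() == 1
    assert result.first()["event_id"] == "e1"
```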
Nice to Have
- Relevant certifications, such as Databricks Certified Data Engineer Professional.