The Role:
As a Lead Data Engineer, you will spearhead the creation of both the team and the data pipeline from the ground up. You will tackle cutting-edge challenges in the big data and AI sectors, work that calls for a strong passion for innovation. You will collaborate with cross-functional teams to understand business requirements and translate them into technical solutions, and you will provide technical guidance, mentorship, and support to junior team members.
Responsibilities:
- Design and implement efficient, scalable, and reliable data pipelines and infrastructure for ingesting, processing, and transforming large datasets.
- Develop the technical strategy and roadmap for data engineering projects, aligning them with business objectives. Continuously evaluate and integrate industry best practices and advanced technical approaches, updating the strategy as the industry evolves.
- Build and lead a high-performing data engineering team, offering business, technical, and personal development coaching.
- Drive data engineering projects by leveraging both internal and cross-functional resources, setting ambitious targets, and achieving them through innovative methods.
- Promote a collaborative and inclusive team culture, working closely with data scientists and other teams.
Requirements:
Minimum Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field.
- At least 5 years of experience in data engineering, with proven expertise in building scalable data pipelines and infrastructure.
- Proficiency in programming languages such as Python, SQL, and Java.
- Strong understanding of database systems, data warehousing, and distributed computing concepts.
- Excellent leadership, communication, and interpersonal skills.
- Demonstrated success in leading teams and delivering complex data engineering projects.
- Ability to thrive in a fast-paced, dynamic environment and manage multiple priorities effectively.
Preferred Qualifications:
- At least 3 years of experience with big data technologies such as Apache Spark, Hadoop, and Kafka.