Our client, a well-known investment management firm in the Dallas area, is seeking a Senior Data Engineer to join their team.
They need a Senior Data Engineer with 8+ years of hands-on data engineering experience, focused on building reliable, efficient, and scalable systems. This role is pivotal in shaping the data architecture and pipelines for a fast-paced, data-driven environment. If you thrive on solving complex problems and want to be at the forefront of modern data engineering practices, apply today!
In this player-coach role, you will lead by example, combining technical leadership with hands-on development to drive the design and implementation of innovative data solutions. You'll collaborate with cross-functional teams to support diverse data needs, working in industries such as Oil & Gas, Titles & Leases, or Financial Services, where data precision and scalability are paramount.
Education: A bachelor's degree in Computer Science or Engineering (B.Tech, BE), or an equivalent qualification such as an MCA (Master of Computer Applications), is required for this role.
Experience: 8+ years in data engineering with a focus on building scalable and reliable data infrastructure.
Skills:
- Languages: Proficiency in Java, Python, or Scala. Prior experience in Oil & Gas, Titles & Leases, or Financial Services is a must-have.
- Databases: Expertise in relational and NoSQL databases like PostgreSQL, MongoDB, Redis, and Elasticsearch.
- Data Pipelines: Strong experience in designing and implementing ETL/ELT pipelines for large datasets.
- Tools: Hands-on experience with Databricks, Spark, and cloud platforms.
- Data Lakehouse: Expertise in data modeling, designing Data Lakehouses, and building data pipelines.
- Modern Data Stack: Familiarity with the modern data stack and data governance practices.
- Data Orchestration: Proficient in data orchestration and workflow tools.
- Data Modeling: Proficient in modeling and building data architectures for high-throughput environments.
- Stream Processing: Extensive experience with stream processing technologies such as Apache Kafka.
- Distributed Systems: Strong understanding of distributed systems, scalability, and availability.
- DevOps: Familiarity with DevOps practices, continuous integration, and continuous deployment (CI/CD).
- Problem-Solving: Strong problem-solving skills with a focus on scalable data infrastructure.
Key Responsibilities:
- This is a player-coach role, with high expectations of hands-on design and development.
- Design and develop systems for ingestion, persistence, consumption, ETL/ELT, and versioning across diverse data types (e.g., relational, document, geospatial, graph, time series) in both transactional and analytical patterns.
- Drive the development of data extraction applications, particularly from formats such as TIFF and PDF, including OCR and data classification/categorization.
- Analyze and improve the efficiency, scalability, and reliability of our data infrastructure.
- Assist in the design and implementation of robust ETL/ELT pipelines for processing large volumes of data.
- Collaborate with cross-functional scrum teams to respond quickly and effectively to business needs.
- Work closely with data scientists and analysts to define data requirements and develop comprehensive data solutions.
- Implement data quality checks and monitoring to ensure data integrity and reliability across all systems.
- Develop and maintain data models, schemas, and documentation to support data-driven decision-making.
- Manage and scale data infrastructure on cloud platforms, leveraging cloud-native tools and services.