Our client, a well-known investment management firm in the Dallas area, is seeking a Senior Data Engineer to join their team.
They need a Senior Data Engineer with 8+ years of hands-on data engineering experience, focused on building reliable, efficient, and scalable systems. This role is pivotal in shaping the data architecture and pipelines for a fast-paced, data-driven environment. If you thrive on solving complex problems and want to be at the forefront of modern data engineering practices, apply today!
In this player-coach role, you will lead by example, combining technical leadership with hands-on development to drive the design and implementation of innovative data solutions. You'll collaborate with cross-functional teams to support diverse data needs, working in industries such as Oil & Gas, Titles & Leases, or Financial Services, where data precision and scalability are paramount.
Education: A bachelor's degree in Computer Science or Engineering (B.Tech, BE), or an equivalent qualification such as an MCA (Master of Computer Applications), is required for this role.
Experience: 8+ years in data engineering with a focus on building scalable and reliable data infrastructure.
Skills:
- Languages: Proficiency in Java, Python, or Scala. Prior experience in Oil & Gas, Titles & Leases, or Financial Services is a must-have.
- Databases: Expertise in relational and NoSQL databases like PostgreSQL, MongoDB, Redis, and Elasticsearch.
- Data Pipelines: Strong experience in designing and implementing ETL/ELT pipelines for large datasets.
- Tools: Hands-on experience with Databricks, Spark, and cloud platforms.
- Data Lakehouse: Expertise in data modeling, designing Data Lakehouses, and building data pipelines.
- Modern Data Stack: Familiarity with the modern data stack and data governance practices.
- Data Orchestration: Proficient in data orchestration and workflow tools.
- Data Modeling: Proficient in modeling and building data architectures for high-throughput environments.
- Stream Processing: Extensive experience with stream processing technologies such as Apache Kafka.
- Distributed Systems: Strong understanding of distributed systems, scalability, and availability.
- DevOps: Familiarity with DevOps practices, continuous integration, and continuous deployment (CI/CD).
- Problem-Solving: Strong problem-solving skills with a focus on scalable data infrastructure.
Key Responsibilities:
- This is a player-coach role, with high expectations of hands-on design and development.
- Design and develop systems for ingestion, persistence, consumption, ETL/ELT, and versioning across diverse data types (e.g., relational, document, geospatial, graph, time series) in both transactional and analytical patterns.
- Drive the development of data extraction applications, particularly from formats such as TIFF and PDF, including OCR and data classification/categorization.
- Analyze and improve the efficiency, scalability, and reliability of our data infrastructure.
- Assist in the design and implementation of robust ETL/ELT pipelines for processing large volumes of data.
- Collaborate with cross-functional scrum teams to respond quickly and effectively to business needs.
- Work closely with data scientists and analysts to define data requirements and develop comprehensive data solutions.
- Implement data quality checks and monitoring to ensure data integrity and reliability across all systems.
- Develop and maintain data models, schemas, and documentation to support data-driven decision-making.
- Manage and scale data infrastructure on cloud platforms, leveraging cloud-native tools and services.