Overview:
A Senior Backend Java Kafka Developer is required to review and reverse engineer existing Java code, map data flows across multiple environments, and develop metadata and data lineage solutions. The role involves collaborating with technical teams to analyze applications, understand system design, and create data flow mappings across varied data sources. The candidate will also automate metadata extraction and document data lineage for both on-premises and cloud environments.
Key Responsibilities:
Code Review and Reverse Engineering:
Analyze and reverse engineer existing Java code to create source-to-target mappings for multiple data flows.
Map and document data flows from disparate data sources such as Kafka, Protocol Buffers, Redis, APIs, flat files, and databases (a representative consumer pipeline is sketched after this list).
Ensure data flow and lineage documentation aligns with business requirements and data governance standards.
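For illustration, the data flows to be reverse engineered often originate in consumers like the minimal Kafka client below. This is a hypothetical sketch, not code from the actual systems; the broker address, topic name, group id, and the mapping noted in the comment are invented.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class OrderEventConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder broker
        props.put("group.id", "lineage-demo");             // hypothetical group
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("orders"));         // source: Kafka topic
            while (true) {
                ConsumerRecords<String, String> records =
                        consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // A source-to-target mapping derived from code like this
                    // might read: orders.order_id -> warehouse.fact_orders.order_id
                    System.out.printf("offset=%d key=%s value=%s%n",
                            record.offset(), record.key(), record.value());
                }
            }
        }
    }
}
```

Reverse engineering such a class means tracing each field read from the record value to the system that ultimately stores it, and recording that path in the mapping document.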
Metadata and Data Lineage Solutions:
Develop metadata solutions and data lineage documentation for on-premises and cloud systems.
Automate metadata extraction using custom connectors and programming tools to simplify lineage tracking across data sources.
Work with technical SMEs and developers to create detailed data flow diagrams and technical documentation for complex systems.
Custom Metadata Development:
Develop and implement programs that automate the extraction of metadata and creation of data lineage documents for various data platforms.
Use tools such as Python, PySpark, and Java to build solutions that automate and integrate metadata from different systems; see the JDBC-based sketch below.
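As one possible shape for this automation, the sketch below uses JDBC's standard DatabaseMetaData API to enumerate tables and columns from a relational source; in a real connector these records would be written to the metadata catalog rather than printed. The connection URL, credentials, and schema name are placeholders.

```java
import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class JdbcMetadataExtractor {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details; substitute the real source system.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:postgresql://localhost/db", "user", "pass")) {
            DatabaseMetaData meta = conn.getMetaData();
            // Enumerate every column of every table in the schema.
            try (ResultSet cols = meta.getColumns(null, "public", "%", "%")) {
                while (cols.next()) {
                    System.out.printf("%s.%s : %s%n",
                            cols.getString("TABLE_NAME"),
                            cols.getString("COLUMN_NAME"),
                            cols.getString("TYPE_NAME"));
                }
            }
        }
    }
}
```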
Data Quality & Governance:
Implement data quality solutions across data sources (Kafka, APIs, flat files, JSON, databases) to ensure adherence to governance standards; a minimal example follows this list.
Maintain data catalogs and data dictionaries, and ensure compliance with data governance policies.
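A data quality rule in this context can be as small as a field-level validity check. The sketch below is purely illustrative; the email field and regex rule are invented, and a production rule would be driven by the governance catalog rather than hard-coded.

```java
import java.util.List;
import java.util.regex.Pattern;

public class EmailQualityCheck {
    // Hypothetical validity rule for an email field.
    private static final Pattern EMAIL =
            Pattern.compile("^[^@\\s]+@[^@\\s]+\\.[^@\\s]+$");

    // Returns the values that violate the rule, e.g. for quarantining.
    static List<String> invalidEmails(List<String> values) {
        return values.stream()
                .filter(v -> v == null || !EMAIL.matcher(v).matches())
                .toList();
    }

    public static void main(String[] args) {
        List<String> sample = List.of("a@example.com", "not-an-email");
        System.out.println(invalidEmails(sample)); // prints [not-an-email]
    }
}
```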
Documentation & Diagrams:
Create detailed technical documentation for Java-based applications that process data in both real-time and batch modes.
Use tools like draw.io to create architecture and data flow diagrams for multiple systems.
Collaboration & Project Execution:
Collaborate with technical SMEs to understand applications and systems, enabling effective reverse engineering.
Support the administration and ingestion of metadata management assets through custom extensions and connectors.
Manage multiple projects concurrently, meeting deadlines and providing regular updates to stakeholders.
Required Skills and Experience:
6+ years of data analysis experience focusing on metadata, data flows, and mappings.
Proficiency in Java (Java 8 and later), with hands-on experience in Spring, Spring Boot, microservices, and REST APIs.
Experience working with Kafka streams, Protocol Buffers, and APIs as data sources.
Strong SQL knowledge and proficiency in programming with Python or PySpark for data analysis.
Hands-on experience with a variety of databases (relational, NoSQL, object-based) and familiarity with Git for version control.
Proven experience with data lineage and metadata management across on-premises and cloud platforms.
Ability to understand Java codebases and reverse engineer data mappings and flows.
Experience working with draw.io or similar tools to create architecture or data flow diagrams.
Excellent technical writing skills, including creating and maintaining documentation for complex, real-time data systems.
Preferred Skills:
Familiarity with data governance tools and solutions.
Knowledge of object-oriented design and software design patterns.
Ability to design and develop data quality solutions for multiple data environments.