Senior Data Engineer
Character Biosciences (Hybrid)
Character Biosciences is a drug discovery and development company building world-class, deeply phenotyped databases that integrate genomics with longitudinal clinical and imaging data. Our interdisciplinary team, comprising experts in clinical science, data science, statistical genetics, machine learning, and drug discovery, uses this platform to determine genetic drivers of disease progression, advance novel therapeutics, and define genetics-based patient stratification. Powered by our data platform, Character Bio is currently advancing two programs in Dry Age-related Macular Degeneration, with additional programs in earlier stages of discovery research.
As a Senior Data Engineer, you will be responsible for developing and maintaining our data science and ML infrastructure. You will build and optimize automated pipelines, architect and maintain our databases, and help deliver data and ML products to our internal science teams. You will also integrate bioinformatics and data science workflows, dashboards, and ML models into our data platform for use by our science teams. This role is crucial in ensuring the integrity, performance, and scalability of our data systems, particularly in a biomedical setting.
Key Responsibilities:
- In collaboration with our DS, Genetics, and Engineering leads, contribute to and execute on our vision for cutting-edge biomedical data and ML infrastructure that is reliable, secure, and scalable
- Architect and optimize our databases and data warehouses to support both our drug discovery and clinical development workflows
- Build and scale automated data and ML pipelines to process biomedical data, such as sequencing data, medical records, and clinical images
- Integrate genetics and bioinformatics pipelines into our data platform and optimize their performance
- Design and integrate data products including interactive dashboards and datasets for use by scientists and clinicians for decision making
- Implement processes to monitor our data quality and ensure our data is accurate and accessible to our internal science teams
- Implement best practices for data architecture, governance, and security
- Continuously evaluate new technologies and tools to improve the efficiency of our data infrastructure
Skills and Qualifications:
- BS or MS degree in Computer Science or a related technical field
- 3+ years of experience as a Data Engineer or in a similar role.
- Strong proficiency with Java and Python development for data processing and analysis.
- Familiarity with data pipeline orchestration tools (e.g., Airflow, Luigi, Nextflow, Dataflow).
- Experience in high-performance cloud computing environments, particularly GCP or AWS
- Strong proficiency in SQL and in database management, including PostgreSQL, MySQL, or NoSQL systems.
- Experience with big data tools such as Apache Spark, PySpark, and BigQuery for data processing and analysis.
- Experience in data architecture, ETL processes, and database optimization, including data schema design and version management
- Strong problem-solving skills, with a focus on performance optimization and scalability.
- Excellent communication skills, with the ability to work cross-functionally in a team-oriented environment.
Preferred Qualifications:
- Experience working with biomedical data (e.g., electronic medical records, claims, sequencing data, and clinical images).
- Experience productionizing and scaling ML workflows, particularly using vision or large language models (LLMs) for inference.
- Familiarity with building interactive data visualizations using tools such as Shiny, Streamlit, Flask, or Django
Benefits include a competitive salary, strong equity incentives, medical, dental, vision, 401(k), and a flexible paid time off policy. Character is committed to recruiting, developing, and supporting colleagues from all backgrounds. We embrace diversity, equity, and inclusion as an integral part of our culture. Pursuant to the San Francisco Fair Chance Ordinance, we will consider for employment qualified applicants with arrest and conviction records. We are an E-Verify company.