Would you be interested in working for a fast-growing start-up in Palo Alto, CA that is building AI infrastructure and seeking an experienced AI Interpretability Staff Research Scientist to drive its efforts in developing transparent, explainable, and reliable AI systems?
In this role, you will work closely with our product and research teams to ensure our foundation models, including large language models and multimodal systems, are interpretable and align with our standards of safety, honesty, and helpfulness. Your work will be crucial in advancing the field of AI interpretability and shaping the responsible development and deployment of AI technologies that can positively impact millions of lives across India and beyond.
Responsibilities:
- Lead research initiatives in AI interpretability, focusing on models with tens to hundreds of billions of parameters.
- Develop and implement advanced interpretability techniques for large language and multimodal models, including but not limited to LIME (Local Interpretable Model-Agnostic Explanations), SHAP (SHapley Additive exPlanations), and attention-based methods (a from-scratch Shapley sketch follows this list).
- Design and conduct rigorous experiments to understand LLMs by reverse engineering the algorithms learned in their weights, both in quick toy scenarios and at scale in large models (a minimal toy sketch also appears after this list).
- Build infrastructure for running experiments and visualizing results to enhance model interpretability.
- Collaborate with the product team to integrate interpretability measures throughout the AI development pipeline, ensuring our models deliver helpful, honest, and transparent outputs.
- Lead the development of innovative methodologies for evaluating and improving the interpretability of large-scale AI models.
- Stay abreast of the latest developments in AI interpretability and explainable AI, contributing to the broader scientific community through publications and conference presentations.
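To give a concrete flavor of the techniques mentioned above, here is a from-scratch sketch of exact Shapley-value attribution on a toy linear model, the idea underlying SHAP. The model, input, and zero baseline are illustrative assumptions, not the team's actual stack.

```python
# Illustrative sketch: exact Shapley-value attribution for a toy model.
# All names, the linear model, and the zero baseline are hypothetical.
from itertools import combinations
from math import factorial

import numpy as np

def shapley_values(f, x, baseline):
    """Exact Shapley attribution for f at x: each feature's average marginal
    contribution over all subsets, with absent features set to `baseline`."""
    n = len(x)
    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for S in combinations(others, size):
                # Classic Shapley weight: |S|! (n - |S| - 1)! / n!
                weight = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_i, without = baseline.copy(), baseline.copy()
                idx = list(S)
                with_i[idx + [i]] = x[idx + [i]]
                without[idx] = x[idx]
                phi[i] += weight * (f(with_i) - f(without))
    return phi

# Toy model with a known ground truth, so the attributions are easy to check:
# for a linear f with a zero baseline, Shapley values equal w * x exactly.
w = np.array([2.0, -1.0, 0.5])
f = lambda z: float(z @ w)
x = np.array([1.0, 3.0, 2.0])
print(shapley_values(f, x, baseline=np.zeros(3)))  # -> [ 2. -3.  1.]
```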
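And a minimal sketch of the toy-scenario end of the reverse-engineering work: train a tiny model on data generated by a known rule, then read the rule back out of the learned weights. The task and architecture here are hypothetical stand-ins; real mechanistic work targets transformer internals rather than linear probes.

```python
# Illustrative toy scenario: recover the "algorithm" a model learned by
# inspecting its trained weights. Task and model are hypothetical.
import torch

torch.manual_seed(0)
true_w = torch.tensor([3.0, -2.0])  # the rule hidden in the data
X = torch.randn(256, 2)
y = X @ true_w

model = torch.nn.Linear(2, 1, bias=False)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
for _ in range(200):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(model(X).squeeze(-1), y)
    loss.backward()
    opt.step()

# Reading the weights recovers the generating rule.
print(model.weight.data)  # ~ [[ 3.0, -2.0]]
```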
Please reach out to Jia for more information.