Several of our growth-stage investments in San Francisco, CA are looking for experts in GPU Optimization / Inference Acceleration.
In general, these are the responsibilities:
- Primarily focused on GPGPU programming to increase the performance of the product -- writing, debugging, and optimizing CUDA code from the GPU kernel level upward to improve the overall performance of new AI models
- Play a key role in creating all of the tooling and associated infrastructure to increase the performance of the company -- from fairly straightforward projects (profilers) to incredibly complex ones (new inference engines)
In general, these are the expectations:
- Proven background in CPU acceleration and/or GPU optimization (the latter preferred), with a strong preference for candidates who have expertise in CUDA kernel hacking
- Experience working in deep learning environments and/or on products targeting high-performance ML systems
- Strong coding skills in high-performance environments (C/C++)
Please note: Due to the volume of applicants we typically receive, we will not send a follow-up email unless a match is identified (sorry).
About us:
We are full-time, salaried employees of Greylock, and there are no fees associated with any of the work we do. Our team provides free candidate referrals/introductions to all of our active investments (one of the many services we provide), and we're always looking to add new people to our network of talent.