Title – GenAI Data Scientist
Full-time
Location – Irving, TX / Piscataway, NJ
Role and responsibilities
• Collaborate with data engineers, data scientists, and stakeholders to understand data requirements, problem statements, system integrations, and RAG application functionalities.
• Utilize, apply & enhance GenAI models using state-of-the-art techniques like transformers, GANs, VAEs, LLMs (including experience with various LLM architectures and capabilities), and vector representations for efficient data processing.
• Implement and optimize GenAI models for performance, scalability, and efficiency, considering factors like chunking strategies for large datasets and efficient memory management.
• Integrate GenAI models, including LLMs, into production pipelines, applications, existing analytical solutions, and RAG workflows, ensuring seamless data flow and information exchange.
• Develop user-facing interfaces (UIs) using modern front-end frameworks (e.g., React, Angular) to deliver an intuitive and interactive experience for RAG applications.
• Develop robust APIs (RESTful or GraphQL) using back-end frameworks (e.g., Django, Node.js) to facilitate communication between the front-end UI, GenAI models, and data sources.
• Utilize LangChain and similar tools (e.g., PromptChain) to facilitate efficient data retrieval, processing, and prompt engineering for LLM fine-tuning within RAG applications.
• Apply software engineering principles to develop secure, scalable, maintainable, and production-ready GenAI applications.
• Build and deploy GenAI applications on cloud platforms (AWS, Azure, or GCP), leveraging containerization technologies (Docker, Kubernetes) for efficient resource management.
• Integrate GenAI applications with other applications, tools, and analytical solutions (including dashboards and reporting tools) to create a cohesive user experience and workflow within the RAG ecosystem.
• Continuously evaluate and improve GenAI models, applications, and user interfaces based on data, feedback, user needs, and RAG application performance metrics.
• Stay up-to-date with the latest advancements in GenAI research, development, front-end and back-end development practices, integration tools, LLM architectures, and RAG functionalities.
• Document code, models, processes, UI/UX design choices, and RAG application design for future reference and knowledge sharing.
Technical skills requirements
The candidate must demonstrate the following:
• Strong understanding of machine learning and deep learning concepts
• Proficiency in Python (libraries like TensorFlow, PyTorch) with experience in vector data manipulation libraries
• Experience with generative AI models (transformers, GANs, VAEs) and various LLM architectures
• Experience with front-end development frameworks (e.g., React, Angular) and UI/UX design principles
• Experience with back-end development frameworks (e.g., Django, Flask) and API development (RESTful or GraphQL)
• Experience with NLP techniques (text cleaning, pre-processing, text analysis)
• Experience with software engineering principles and best practices (object-oriented programming, design patterns, testing)
• Familiarity with cloud platforms (AWS, Azure, or GCP)
• Knowledge of containerization technologies (Docker, Kubernetes)
• Experience with data integration tools and techniques (a plus)
• Knowledge of chunking strategies for handling large datasets
• Experience working with RAG applications and their functionalities
• Expertise in LangChain and similar tools (e.g., PromptChain) for prompt engineering and data processing in RAG applications
• Experience with DevOps principles and tools for continuous integration and delivery (CI/CD)
• Experience building and integrating with analytical dashboards and reporting tools
Nice-to-have skills
• Experience working with RAG applications
• Experience with cloud-based data warehousing solutions (e.g., BigQuery, Redshift, Snowflake)
• Experience with cloud-based workflow orchestration tools (e.g., Airflow, Prefect)
• Familiarity with Kubernetes (K8s)
• Google Cloud certification
• Unix or shell scripting
Qualifications
• B.Tech., M.Tech. or MCA degree from a reputed university