Job Description
Job Details
- Develop and implement enterprise-level Gen AI models and tools for business problem-solving.
- Collaborate with customers to identify and address business challenges through data-driven solutions.
- Build evaluation frameworks to measure LLM efficacy and dataset quality and to guide product development.
- Research emerging trends and technologies in Gen AI and related data science areas.
- Perform exploratory data analysis to identify key features. Prepare and present findings to corporate leadership.
- Develop use cases for prioritized business problems. Devise an approach and solution. Identify and collect required datasets and establish milestones and metrics for project success.
- Develop and maintain RAG pipelines, and optimize information retrieval for latency.
- Communicate insights to both technical and non-technical stakeholders.
- Conduct descriptive statistical analysis to reveal trends and patterns in customer data.
- Build, validate, and implement predictive models in collaboration with business owners.
- Design pilots, experiments, or surveys to derive insights for solving business problems.
- Interpret complex statistical results in an easy-to-understand language.
- Convert statistical findings into actionable business plans.
- Present analytics or modeling results to both technical and non-technical stakeholders.
Qualifications:
- Bachelor’s degree at minimum in a quantitative field (mathematics, statistics, data science, physics, computer science, engineering, etc.); master’s or Ph.D. preferred.
- Experience solving business problems in the construction and distribution industry.
- 5+ years of experience in data science and machine learning development (2+ years with a master’s degree).
- 2+ years of experience in Natural Language Processing (NLP). Experience with Large Language Models (LLMs) is a plus.
- Proficient with machine learning technology stacks, including their tools, frameworks, and notebook environments.
- Proven expertise in building machine learning and deep learning models using common frameworks such as PyTorch, TensorFlow, Keras, Scikit-learn, XGBoost, etc.
- Solid understanding of, and preferably working knowledge in, the Gen AI tech stack, including LangChain, LlamaIndex, ReAct prompting, prompt engineering techniques, vector databases, chunking, RAG pipelines, and information retrieval optimization.
- Proficient in Python and experienced with machine learning and NLP.
- Working knowledge of data lake tech stacks (such as Snowflake, BigQuery, and Databricks) and vector databases.
- 5+ years of experience in data querying languages, scripting languages, or statistical/mathematical software.
- Extensive experience with statistical models and their application in data science.
- Ability to translate efficacy measurements of data science models into tangible business impact metrics.
- Proven knowledge of ML/AI platforms and workflows.
- Experience with data preprocessing techniques for big data containing text and tabular data, including feature engineering, dimensionality reduction, and normalization.
- Familiarity and hands-on experience with advanced ML models, including GPT-3/4, T5, Claude, and BERT.