About
Hi, I'm Saurabh Loya, a Data Scientist and AI Enthusiast passionate about transforming data into intelligent insights and building the future of AI. I hold a Master's in Computer Science from the University of Utah and have worked with global teams at BMW Group and Volkswagen Group, developing data-driven AI solutions and intelligent automation systems.
My core expertise lies in data science, machine learning, and artificial intelligence. I specialize in extracting meaningful insights from complex datasets, building predictive models, and developing AI systems that drive business value. My work spans data analysis, statistical modeling, deep learning, and Large Language Models (LLMs), using technologies such as Python, TensorFlow, PyTorch, and LangChain. I've developed data pipelines, built ML models, created AI-powered applications, and designed intelligent systems that solve real-world problems.
I also have strong software engineering skills, which let me build robust, scalable systems to support my data science and AI work. This combination allows me to deliver end-to-end, production-ready solutions, from data collection and analysis to model deployment and AI system implementation.
Feel free to check out my CV and drop me an email if you want to chat with me!
Projects
Citi Bike Rental Analytics & Forecasting
Developed comprehensive analytics and time-series forecasting models for Citi Bike rental data using Apache Spark and Facebook Prophet. Applied statistical methods to predict demand patterns, supporting data-driven decision-making and optimized resource allocation in urban transportation systems.
Python
Apache Spark
Prophet
Time Series Analysis
Statistical Modeling
Source Code:
Time-Series-Analytics-and-Forecasting-with-Apache-Spark
AI-Powered Medical Chatbot
Developed an advanced Medical Chatbot leveraging LLaMA2, LangChain, and Pinecone VectorDB to provide instant, accurate medical information. Applied natural language processing and vector similarity search techniques to enhance patient engagement and deliver personalized healthcare insights.
Python
LangChain
LLaMA2
Pinecone
NLP
Generative AI
Vector Search
Source Code:
Medical Chatbot
🏆 Hackathon Winner:
Taskformer's AI Chatbot Hackathon
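To illustrate the vector similarity search step mentioned above, here is a minimal, dependency-free sketch. It is not the project's code and does not use the Pinecone or LangChain APIs: the function names, document IDs, and 3-dimensional "embeddings" are all hypothetical, and a real system would obtain vectors from an embedding model and query a vector database instead of a Python dictionary.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the IDs of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [doc_id for _, doc_id in scored[:k]]


# Toy index: made-up document IDs mapped to made-up embedding vectors.
index = {
    "fever_overview": [0.9, 0.1, 0.0],
    "headache_causes": [0.1, 0.9, 0.1],
    "sleep_hygiene": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.0]  # hypothetical embedding of a user question

results = top_k(query, index, k=2)
print(results)
```

In a retrieval-augmented chatbot, the passages behind the returned IDs would typically be placed into the language model's prompt as context.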
Interactive Pokémon Data Visualization
Developed an award-winning interactive data visualization tool to explore Pokémon stats, type matchups, and battle outcomes using statistical analysis and machine learning techniques. Applied data science methodologies to uncover hidden patterns in Pokémon data; the project was selected as the winner in a class of 120 students.
D3.js
Python
Data Visualization
Statistical Analysis
Machine Learning
Source Code:
visual-journey-in-the-world-of-pokemon
🌐 Live Demo:
Explore the Pokémon World
ML-Based Android Malware Detection
Developed and compared multiple machine learning models to detect malicious Android apps using system call frequency data analysis. Implemented advanced feature engineering and model evaluation techniques, with most algorithms built from scratch to achieve high accuracy in malware classification.
Python
Machine Learning
Scikit-learn
Feature Engineering
Cybersecurity
Model Evaluation
Source Code:
Android Malware detection
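As a rough illustration of what system call frequency features can look like, the sketch below turns a trace of system call names into a fixed-length vector of relative frequencies. It is an assumption-laden example rather than the project's pipeline: the function name, the four-call vocabulary, and the sample trace are all hypothetical.

```python
from collections import Counter


def syscall_frequency_vector(trace, vocabulary):
    """Map a list of system call names to relative frequencies.

    The result has one entry per name in `vocabulary`, in order.
    Calls outside the vocabulary still count toward the trace length.
    """
    total = len(trace)
    if total == 0:
        return [0.0] * len(vocabulary)
    counts = Counter(trace)
    return [counts[name] / total for name in vocabulary]


# Hypothetical vocabulary and trace, for illustration only.
vocabulary = ["open", "read", "write", "sendto"]
trace = ["open", "read", "read", "sendto", "read", "ptrace"]

features = syscall_frequency_vector(trace, vocabulary)
print(features)
```

Vectors like this, one per app, could then be fed to any classifier, whether written from scratch or taken from a library.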
AI-Powered MCQ Generator
Developed an intelligent web application using OpenAI's language model, LangChain, and Streamlit to automate the creation of multiple-choice questions. Applied natural language processing and prompt engineering techniques to provide educators and content creators with customizable options for generating high-quality MCQs based on any input content.
Python
LangChain
Streamlit
OpenAI API
NLP
Prompt Engineering
Source Code:
MCQ Generator Web Application
MeetingMate - Automated Reminders for Google Calendar Events
Built an automation tool that integrates with the Google Calendar and Gmail APIs to send timely reminders to attendees of upcoming events.
Source Code:
MeetingMate
Technology: Python, Streamlit, Google APIs
AutoBotTrain - Automated Chatbot Training Pipeline
Crafted an automated pipeline to efficiently generate utterances, responses, and intents from user-entered text or business documents, streamlining the chatbot training process.
Source Code:
AutoBotTrain
Technology: Python, spaCy, NLP, Machine Learning
LLM-Based Medicine Recommendation System
Implemented an AI-driven medicine recommendation system that uses a Large Language Model (LLM) to suggest medications based on patient symptoms while providing insights into potential side effects.
Source Code:
Medicine Recommendation System
Technology: Python, spaCy, Machine Learning, Large Language Models
Smart Contract Fuzzing - Enhancing Security in Blockchain Applications
Applied fuzzing techniques and static analysis to identify vulnerabilities in Ethereum smart contracts, strengthening the security of decentralized applications.
Source Code:
Smart-Contract-Fuzzing
Technology: Solidity, Echidna