Farid Saud

Data Scientist | ML Engineer | AI Engineer

I am a Data Scientist & Infrastructure/Services Unit Manager at the Data Science Research Services (DSRS) unit at the Gies College of Business, University of Illinois at Urbana-Champaign. I specialize in Data Science, Machine Learning, Deep Learning, and AI/LLM Engineering.

At DSRS, I partner with faculty and research stakeholders to deliver datasets, models, and reproducible outputs. I lead the deployment of self-hosted AI services (LLM/VLM, embeddings, text-to-speech, image generation) on A100/H100 GPUs, manage infrastructure via Kubernetes, and mentor interns across Infrastructure and Services. I also lead the development of internal tools such as our Knowledge Base (RAG), a semantic caching layer, and an AI-powered paper highlighter.

I hold two Master’s degrees from UIUC - an M.S. in Statistics (Data Science concentration, GPA 4.0) and an M.S. in Data Science + Civil Engineering (GPA 4.0) - and a B.S. in Civil Engineering from Universidad San Francisco de Quito, Ecuador.

Beyond work, I follow advancements in AI, VR/AR, fast cars, drones, and video games. I love traveling and exploring diverse cuisines.

Experience

Data Science Research Services (DSRS) - Gies College of Business January 2024 - Present

Data Scientist & Infrastructure/Services Unit Manager (Jan 2025 - Present)

Partner with faculty/research stakeholders to scope research needs and deliver datasets, models, and clear written/visual outputs with reproducible code.
Lead internal tools and prototypes: DSRS Knowledge Base (RAG across 70+ repos), semantic caching layer, AI-powered paper highlighter (LLM/VLM + OCR).
Manage and mentor 4+ interns across Infrastructure and Services: screening, onboarding, code reviews, and milestone reviews.
Deploy and operate self-hosted AI services (LLM/VLM, embeddings, TTS, image generation) on A100/H100 GPUs via OpenAI-compatible endpoints; optimize single/multi-GPU inference.
Deploy and support applications on Kubernetes+Helm; manage databases (Postgres+pgvector), caching (Redis), CI/CD, and monitoring.

Data Science Intern - Infrastructure (Jan 2024 - Dec 2024)

Supported DevOps for DSRS applications and infrastructure across Azure and on-prem: containerization, health checks, CI/CD (GitHub Actions).
Executed scoped data science tasks end-to-end (data pulls, cleaning, analysis) with reproducible scripts/notebooks.
Evaluated LLM deployments (transformers/vLLM) with latency/throughput benchmarks to inform DSRS AI strategy.

Capital Programs - Facilities and Services August 2023 - December 2023

Project Manager

Led 2 end-to-end projects valued between $500K and $5M, ensuring on-time and on-budget delivery.
Coordinated cross-functional teams and ran stakeholder check-ins to keep scope, schedule, and handoffs aligned.

University of Illinois at Urbana-Champaign August 2021 - May 2024

Teaching Assistant

Instructed over 200 students across 5 semesters and 13 sections.
Consistently ranked as Excellent Teacher: Fall 2021, Spring 2022, Fall 2022, and Spring 2023 (4-time winner).
Courses Taught: Intermediate Spanish, Spanish Composition

Universidad San Francisco de Quito January 2018 - May 2019

Teaching Assistant

Recipient of the undergraduate Teaching Assistantship (2018). Ranked as Excellent Teacher.
Courses Taught: Topography, Geometrical Design of Roads

Education

Master of Science in Statistics 2024

University of Illinois at Urbana-Champaign

Concentration: Data Science
GPA: 4.0/4.0
Selected courses:

Applied Machine Learning
Statistical Learning
Deep Learning
Mathematical Statistics
Statistical Modeling
Advanced Data Analysis
Time Series Analysis
Big Data Analytics

Master of Science in Data Science + Civil and Environmental Engineering 2021 - 2023

University of Illinois at Urbana-Champaign

Concentration: Construction Engineering and Management
GPA: 4.0/4.0
Selected courses:

Data Science for CEE
Machine Learning for CEE
Construction Optimization
Construction Data Modeling

Bachelor of Science in Civil Engineering 2017 - 2021

Universidad San Francisco de Quito

GPA: 3.9/4.0
Thesis: “Productivity in Construction: Measurement Methodologies in Ecuador”. Score: 99/100

Skills

Languages

Python
R
SQL
Bash
JavaScript

ML / AI / Data Science

NumPy
Pandas
Matplotlib
Scikit-learn
PyTorch
TensorFlow
HuggingFace Transformers
LangChain / LlamaIndex
vLLM / llama.cpp
spaCy / NLTK
Streamlit
Plotly

Cloud & DevOps

AWS (EC2, S3, SageMaker)
Azure
HPC/GPU Clusters (A100/H100)
Docker
Kubernetes (Helm/kubectl/k9s)
GitHub Actions

Data Engineering & Viz

PostgreSQL / pgvector
Redis
MongoDB
Spark (PySpark)
Tableau
Power BI

Projects

HandCV - Hand-Gesture Interactive Resume

A webcam-powered, gesture-controlled resume viewer using MediaPipe hand tracking. Navigate resume sections by tilting your hand as a dial, expand cards with an open palm, and collapse with a fist. Built with vanilla JavaScript and HTML5 Canvas - no frameworks, no build step. Try it live at fsaudm.github.io/hand.

fsaudm/HandCV

LangExtract - LLM-Powered Structured Extraction

A tool for extracting structured information from unstructured text documents (PDFs, papers) using open-source LLMs deployed on institutional GPU clusters. Supports user-defined extraction instructions and ensures that every extracted item can be verified against the source document. Built with LangChain and locally hosted models.

DSRS Knowledge Base (RAG)

Centralized institutional knowledge system spanning 70+ DSRS repositories. Prototypes state-of-the-art RAG workflows using pgvector, FAISS, and LangChain/LlamaIndex for semantic search and retrieval across codebases, documentation, and research artifacts.

Semantic Cache for LLM APIs

Prototype caching layer to reduce redundant LLM API calls and cost. Uses embedding-based similarity matching to serve cached responses for semantically equivalent queries, significantly reducing external API dependence.
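
To illustrate the idea of embedding-based similarity matching, here is a minimal Python sketch. It is not the DSRS implementation: toy_embed is a hypothetical bag-of-words stand-in for a real embedding model, and SemanticCache, cached_call, and the 0.8 threshold are names and values assumed for the example.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for a real embedding model: bag-of-words counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Stores (embedding, response) pairs and answers near-duplicate queries."""

    def __init__(self, embed=toy_embed, threshold=0.8):
        self.embed = embed          # assumed embedding function
        self.threshold = threshold  # assumed similarity cutoff for a cache hit
        self.entries = []           # list of (embedding, response)

    def get(self, query):
        """Return the best cached response if it clears the threshold, else None."""
        q = self.embed(query)
        best_score, best_response = 0.0, None
        for emb, response in self.entries:
            score = cosine(q, emb)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def put(self, query, response):
        self.entries.append((self.embed(query), response))


def cached_call(cache, query, call_llm):
    """Serve from the cache when possible; otherwise call the model and store the result."""
    hit = cache.get(query)
    if hit is not None:
        return hit
    response = call_llm(query)
    cache.put(query, response)
    return response
```

In practice the linear scan over entries would likely be replaced by a vector index, and the threshold would need tuning against real queries to avoid serving a cached answer to a question that only looks similar.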

LLM Deployment & Inference Optimization

Deployment and optimization of open-source AI models (LLM/VLM, embeddings, TTS, image generation) on A100/H100-class GPUs using vLLM, llama.cpp, and HuggingFace. Includes single-GPU and multi-GPU/distributed inference configurations via OpenAI-compatible endpoints for scalable batch processing.

OpenAI Batch Processing at Scale

Production workflows for 100K+ structured API calls to OpenAI models for research data processing. Includes robust error handling, rate limiting, structured output parsing, and cost optimization for large-scale research workflows.

Ashby Prize in Computational Science - Finalist

Delivered a real-time demonstration of an agent-based AI image generation system combining proprietary and locally hosted models. Showcased practical LLM deployment and application for research workflows at the UIUC Ashby Prize hackathon.

Object Detection using YOLOv10 and RT-DETR in AWS

Implemented object detection in construction sites using YOLOv10 and RT-DETR architectures with PyTorch and Ultralytics. Compared performance in terms of speed and accuracy on AWS SageMaker with GPUs, using the SODA construction dataset.

fsaudm/YOLOv10_RT-DeTr_in_AWS

LLMs and Deep Learning Models with NVIDIA NIMs

Exploration of NVIDIA NIM (NVIDIA Inference Microservices) with Llama 3.1 (8B and 405B) for reasoning and multilingual queries, plus Stability AI’s Stable Diffusion XL for text-to-image generation.

fsaudm/Nvidia-nims

GMM and HMM - Implementation from Scratch in R

Implementation of a Gaussian Mixture Model using the EM algorithm, and a Hidden Markov Model using the Baum-Welch and Viterbi algorithms. Detailed R Markdown walkthrough available here.

fsaudm/statistical-learning/tree/main/Coding4

Linear SVM using SGD - Implementation from Scratch in R

Implementation of a Linear SVM using the Pegasos algorithm (Shalev-Shwartz et al., 2011). R Markdown walkthrough with details here.

fsaudm/statistical-learning/tree/main/Coding5
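
For readers unfamiliar with Pegasos, the core update can be sketched in a few lines. This is a Python illustration, not the R code in the repository: the function names, the default values of lam and n_iters, and the omission of a bias term and of the optional projection step are all assumptions made for the example.

```python
import random


def pegasos_train(X, y, lam=0.01, n_iters=2000, seed=0):
    """Pegasos-style stochastic sub-gradient descent for a linear SVM (no bias term).

    X is a list of feature lists, y a list of labels in {-1, +1}.
    lam is the regularization strength (assumed value for this sketch).
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for t in range(1, n_iters + 1):
        i = rng.randrange(n)                 # sample one training example
        eta = 1.0 / (lam * t)                # step size 1 / (lambda * t)
        margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
        # Shrink step from the L2 regularizer: w <- (1 - eta * lam) * w
        w = [(1.0 - eta * lam) * wj for wj in w]
        # Hinge-loss step, applied only when the margin is violated
        if margin < 1.0:
            w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def predict(w, x):
    """Classify x by the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) >= 0 else -1
```

A quick check on a small separable dataset:

```python
X = [[2, 1], [1, 2], [3, 3], [-2, -1], [-1, -2], [-3, -3]]
y = [1, 1, 1, -1, -1, -1]
w = pegasos_train(X, y)
print([predict(w, x) for x in X])
```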

Walmart Store Sales Forecasting

Data analysis of historical sales from 45 Walmart stores with Robust Linear Regression on SVD-smoothed data, forecasting future sales and identifying top/bottom performing stores and departments.

fsaudm/statistical-learning/tree/main/Project2

Post-Disaster Traffic Prediction

Replication and extension of Professor Hadi Meidani’s Kaczmarz algorithm for real-time, short-term traffic prediction. Explored spatial vs. time-based ordering of multivariate vectors, finding that spatial ordering achieved higher prediction accuracy under extreme weather conditions.

fsaudm/Recursive-Estimation-of-Polynomial-Approximation-Kaczmarz-Algorithm

Neural Network using NumPy for MNIST

Fully connected neural network built from scratch using only NumPy, trained on MNIST (94% accuracy). Four-layer architecture with 164K parameters. All activation functions and forward/backward propagation are implemented from scratch. Integrated with Weights & Biases for experiment tracking.

fsaudm/NeuralNet_in_NumPy

Awards & Certifications

Ashby Prize in Computational Science Hackathon: 3rd Place & Best Presentation

Center for Artificial Intelligence Innovation at the National Center for Supercomputing Applications - May 2024

Awarded 3rd place (out of 50 participants) in the Ashby Prize in Computational Science Hackathon, a competition focused on using LLMs as a front-end to computational workflows. We developed an end-to-end agent-based system capable of Retrieval Augmented Generation (RAG), integrated with an API-based model (GPT-4) and a locally hosted Llama 3 model. Received the best presentation recognition for our delivery and real-time demonstration!

Illinois Statistics Datathon 2024: 4th Place

Department of Statistics, UIUC, with Synchrony Financial - April 2024

Awarded 4th place (out of 345 participants) in the Illinois Statistics Datathon 2024. My team and I performed extensive data pre-processing, exploratory data analysis (EDA), and feature engineering to identify and build key features for Synchrony’s Interactive Voice Response (IVR) System, a “real-world” dataset with millions of observations. We employed Logistic Regression (for its interpretability) to evaluate the effects of measured and engineered features on reducing the number of “floored” calls, resulting in a data-driven decision with a savings potential of $300,000 per 1% reduction in calls. You can read about our experience in this post.

Data Parallelism: How to Train Deep Learning Models on Multiple GPUs

NVIDIA - March 2024

Certifies completion of the workshop Data Parallelism: How to Train Deep Learning Models on Multiple GPUs. Through this course, I trained and deployed a set of Convolutional Neural Networks on the Fashion MNIST dataset using Python, CUDA, and DDP (Distributed Data Parallel) across 4 GPUs. I also applied techniques such as gradual warmup, batch normalization, and the NovoGrad optimizer to improve model performance and training efficiency. Great experience!

Languages

Spanish: Native Language

English: Full Professional Proficiency

Italian: Professional Working Proficiency

French: Beginner

Arabic: Beginner

A Little More About Me

I recently achieved a 1-year streak on Duolingo for practicing Italian, Arabic, and Korean!
So far, I have traveled to 18 countries and 24 states in the US.
I cannot recommend Kurzgesagt, StatQuest, and Brilliant enough. Andrej Karpathy’s series are amazing, and I am very excited about Eureka Labs!