Machine Learning & MLOps Engineer
I design, build, and scale intelligent systems — from streaming NLP pipelines to real-time LLM applications. Driven by curiosity, I combine data engineering with AI research to turn ideas into production-ready systems.
View My LinkedIn Profile
May 1st, 2024 — I made a promise to rebuild from the ground up.
After two years studying data science, I realized I could use algorithms but didn’t understand them deeply enough.
So I started over — posting every day to rebuild my intuition and technical depth.
Over a year, I revisited Statistics, Probability, Machine Learning, and Databases, then expanded into Deep Learning, LLMs, and MLOps. That journey took me from learning algorithms to leading real-world ML & MLOps projects.
Through this commitment, I’ve written over 200 in-depth posts exploring everything from the foundations of statistics to the architecture of transformers and production-grade ML systems.
Each post became a reflection of growth — documenting not only what I learned but how I built, deployed, and optimized real systems.
Today, the blog stands as a living record of my continuous evolution from student to Machine Learning & MLOps Engineer — grounded in curiosity, consistency, and craftsmanship.
🔗 Visit My Blog →
📄 Download My 200-Day Challenge Summary (PDF)
| Kafka | FastAPI | Docker | AWS EC2 | Prometheus | Grafana | Hugging Face | LLMOps | MLOps | Machine Translation |
Designed and implemented a real-time multilingual translation system that translates continuous news streams with low latency, combining Hugging Face translation models with scalable cloud infrastructure on AWS EC2.
| PySpark Structured Streaming | Delta Lake | Databricks | Transformer Models | MLflow | Real-Time Inference | Hugging Face | MLOps | Sentiment Analysis |
Built a real-time tweet sentiment classification pipeline on Databricks using Spark Structured Streaming and Transformer models, with MLflow for experiment tracking and Delta Lake for fault-tolerant storage.
Delivered live dashboards visualizing sentiment trends across millions of tweets.

| NLP | BERTopic | RoBERTa | Hugging Face | Multilingual Analysis | Sentiment Modeling | Data Visualization |
Analyzed 500K+ multilingual social media posts (English & Spanish) to uncover public perceptions of e-cigarettes using BERTopic and RoBERTa, identifying cross-lingual sentiment shifts and key discussion themes.
| LDA | Topic Modeling | NLP | Sentiment Analysis | Cross-Cultural Analytics | Python | Visualization |
Explored European luxury hotel reviews using LDA topic modeling to uncover country-specific satisfaction drivers, linking linguistic tone to cultural preferences and customer experience.
🔗 Quality of Life Analysis: Tri-State Visualization (Report PDF)
| Data Visualization | Tableau | Statistics | Socioeconomic Indicators | Public Data Analysis |
Developed an interactive dashboard comparing education, income, housing, and healthcare metrics across New York, New Jersey, and Connecticut to evaluate regional quality-of-life patterns.
🔗 Multidimensional Analysis of Video Game Sales and Global Market Trends (Report PDF)
| Statistics | Regression Analysis | Market Analytics | Data Visualization | Exploratory Data Analysis |
Performed multivariate statistical modeling on global video game sales to uncover genre, platform, and regional trends, providing insight into market dynamics and sales forecasting.
| Machine Learning | Deep Learning | Time Series Forecasting | Data Visualization | Public Health Analytics |
Developed ML/DL models to predict COVID-19 trends in Ohio using feature engineering and time-series modeling, revealing key behavioral and social factors influencing public awareness.
Wonha Shin / leahnote01@gmail.com 📩 Email me!