Day108 Deep Learning Lecture Review - Advanced Techniques in DL
HW5: Out-of-distribution (OOD) Detection (Maximum Softmax Probability & ODIN) and Continual Learning (SLDA & IID Streaming)
HW4: Model Calibration (Platt Scaling & Label Smoothing) and Conformal Prediction (Naive and Adaptive Prediction Sets)
HW3: Optimization through Data Loading, Profiling, & Scaling, and Comparison of Data Parallel & Distributed Data Parallel
Model Drifting, Periodic Re-Training, Detecting Model Drift, Continual Learning (Pre-Trained Model, NCC), and Real-Time Machine Learning
Data-Centric AI: Label Noise, Selection Bias, Data Leakage, and Error Analysis for Model Improvement (Subgroup Errors)
Data-Centric AI: Active Learning, SEALS (Similarity Search for Efficient Active Learning), Dataset Pruning, and Data Engine
Data-Centric AI: Crowdsourcing, Methods to Estimate Annotator Quality, Neural Scaling Laws, Pareto Curves and Power Law
Variation of Conformal Prediction: Size of Calibration Set, Evaluation, and Group-Based & Adaptive Conformal Prediction
Understanding Conformal Prediction: Concepts, Applications, Marginal Coverage, and Recipes In Detail
Uncertainty in Deep Learning, Distribution Shifts, Model Calibration, and Out-of-Distribution (OOD) Detection
Language Models - Transfer Learning, Basic Concepts & Terminologies, Components of NLP Models, and Attention Mechanism
Bias Mitigation Strategies: Loss Reweighting, Sampling & Synthetic Samples and Architectural Changes (OccamNets, Adversarial Training & DANN)
Model Comparison and Bias Mitigation; McNemar’s Test, Dataset Bias, and Bias Detection
AI Ethics; AI Safety, Key Issues, AGI (Artificial General Intelligence), and Current AI Models’ Challenges
Llama 3: Framework, Workflow (RMSNorm, Grouped Query Attention, RoPE, SwiGLU Attention), Pre-training & Post-training
Comparing Pre-trained model embeddings (ResNet+SBERT vs. CLIP) and Prompt Engineering (Short and Direct, Few-Shot Learning, & Expert Prompting)
HW2: Understanding of LoRA and Pre-trained Model Embeddings (ResNet+SBERT) for Visual Question Answering (VQA)
Weights & Biases (W&B) for Monitoring and Fine-Tuning ResNet-18 and Post-Training Evaluations (Dying ReLU, Brightness Robustness)
HW0: Softmax Properties, PyTorch Lightning, and DataLoader
Revisiting Ensemble Method, Random Forest, and XGBoost
Deep Learning & Numerical Precision (Floating Point), Hardware Considerations, and Distributed Model Training
LLMs - Speeding Up LLMs (Grouped Query Attention, KV Caches, MoE, and DPO)
LLMs - Generating Texts, Positional Encoding, and Fine-Tuning LLMs (LoRA)
LLMs - Perplexity, Tokenizers, Data Cleaning, and Embedding Layer
Basic Machine Learning & Deep Learning, Word Embedding, CNNs, RNNs, LSTM and Transformer
Large Language Model - BERT, GPT, and GPT-2, 3 & 4
Transformers and Foundation Models: GELU, Layer Norm, Key Concepts & Workflow
Brief Explanation of Basic Algebra and Machine Learning
Primary Goals, Common Tasks, and Deep Learning NLP
Transformer Architecture, How the Models Are Different, and Q,K,V in Self-Attention
Basic Concepts and the Detailed Architecture
Neural Net Zoo: Transformers, Recurrent Neural Networks (RNNs) and Graph Neural Networks (GNNs)
Types of Learning and Neural Net Zoo: Fully Connected Networks (MLPs), Inductive Bias, and Convolutional Neural Networks (CNNs)
Basic Mathematics, Supervised ML, and Review of Multi-Layered Perceptron
Bagging & Boosting : Basic Concepts & Code Implementation
Using the Majority Voting Principle to Make Predictions, and Evaluating & Tuning the Ensemble Classifier
Code Structure of Combining Classifiers via Majority Vote
Key Concepts and Mathematics Explanation
Use Other Metrics, Assign Different Class Weights, or Upsample the Minority Class
ROC Area Under the Curve (ROC AUC)
Confusion Matrix and F1 Score
Grid Search for Fine-Tuning Machine Learning Models
Bias & Variance, and Learning & Validation Curves
Model Selection and K-Fold Cross Validation
Key Concepts and Example Code with Scikit-learn
Applying Kernel Principal Component Analysis (KPCA) to New Data Points
Implementing Kernel Principal Component Analysis (KPCA) in Python
Nonlinear Mappings with Kernel Principal Component Analysis
Compressing Data via Linear Discriminant Analysis
Compressing Data via Dimensionality Reduction and Summary of PCA
Partitioning a Dataset into Training & Test Datasets, Feature Scaling, and Feature Selection
Handling Categorical Data - Converting, Ordinal Encoding, and One-Hot Encoding
Handling Missing Data - Eliminating and Imputing & Estimators API
The Curse of Dimensionality
Distance Metrics - Euclidean, Manhattan, Minkowski & Chebyshev Distance, and Cosine Similarity
Basic Concepts, How It Works, and Parametric & Non-Parametric Model
Linear Algebra & Matrix for Programmers
Implementation Step by Step
Building a Decision Tree & Random Forest (1) - Key Concepts & How it Works
Information Gain (2) - Entropy & Classification Error
Components, How it Works & Maximizing Information Gain (1) - Gini Impurity
Solving Nonlinear Problems - Using a Kernel SVM
SVM: Nonlinear Separable Case
Basic Concepts and Mathematical Formulations
How To Train in Scikit-Learn, and Regularization with LR model
Cost Function of Logistic Regression
Basic Concepts and Sigmoid Function
Step by Step - Training Perceptron (3)
Step by Step - Training Perceptron (2)
Choosing a Classification Algorithm Step by Step - Training Perceptron (1)
Population Proportions, p-values & Confidence Intervals, and Type I & II Errors
Test Statistics (Z-Test, t-Test, and Chi-Squared Test)
Law of Large Numbers, Central Limit Theorem, and Hypothesis Testing (1) - General Setup
Properties of Random Variable
Continuous Probability Distribution and Markov Chains
Joint, Marginal, & Conditional Probability Distributions, and Discrete & Poisson Distributions
Basic Probability - Counting and Random Variables
Applying PCA in Machine Learning and Scree Plot
Concepts, Types of Regularization, and How It Works
Further Analysis on R-Squared
How R-Squared Is Used As a Performance Metric in Machine Learning
Concepts Overview, Mathematical Calculation, and Interpretation
Summarization and Types
Mathematical Explanation
Basic Concepts, Steps, and Key Considerations
Understanding Gradients in Machine Learning Applications
SSE (Sum of Squared Errors) and Choosing a Cost Function
Cost Function and MSE (Mean Squared Error)
Concepts Overview and Mathematical Calculation Exercise
Applications in Machine Learning & Further Explanations
Mathematical Definition & Algorithms
Bayesian Statistics, the Law of Total Probability, and Var-Cov Matrix
Eigendecomposition, Symmetric Matrix, and Eigendecomposition
Norm(2), Eigenvectors and Eigenvalues
Scalar, Norm, Matrix (Inverse, Basic Functions), Rank
The Start of Recording TIL Summer ‘24 - During Summer in Rochester as a Data Scientist Candidate
Designing Machine Learning Systems: Causes of ML System Failures (2) (Correcting Degenerate Feedback Loops) & Data Distribution Shifts
Designing Machine Learning Systems: Causes of ML System Failures (1) (Production data differing from training data, Edge Cases, and Degenerate Feedback Loops)
Designing Machine Learning Systems: ML on the Cloud and on the Edge & Model Optimization (AutoTVM & WebAssembly)
Designing Machine Learning Systems: Model Comparison (Low-Rank Factorization, Knowledge Distillation, Pruning, & Quantization)
Designing Machine Learning Systems: Model Deployment and Batch Prediction Versus Online Prediction (Unifying Batch and Streaming Pipeline)
Designing Machine Learning Systems: Model Offline Evaluation (Methods: Perturbation Tests, Invariance Tests, etc.)
Designing Machine Learning Systems: Auto ML (Hyperparameter Tuning & NAS), Model Offline Evaluation (1) (Establishing Baselines)
Designing Machine Learning Systems: Experiment Tracking, Versioning, and Distributed Training (Data Parallel)
Designing Machine Learning Systems: Model Selection, Evaluating ML Models, & Ensemble Method (Bagging, Boosting, and Stacking)
Designing Machine Learning Systems: Data Leakage (Definition, Common Causes, Detecting and Preventing it)
Designing Machine Learning Systems: Feature Engineering Techniques (2) (Positional Embeddings) & Engineering Good Features
Designing Machine Learning Systems: Feature Engineering Techniques (1) (Handling Missing Values, Scaling, Normalization, Binning, Encoding Categorical Values...
Designing Machine Learning Systems: Data Augmentation (Simple Label-Preserving Transformations, Perturbation, and Data Synthesis)
Designing Machine Learning Systems: Class Imbalance (How to Deal with the problems: Evaluation Metrics, Over & Undersampling, Resampling and Algorithm-le...
Designing Machine Learning Systems: Labeling (Hand Labels, Natural Labels, & Addressing the Lack of Labels - Active Learning, etc.)
Designing Machine Learning Systems: Sampling (Nonprobability, Simple Random, Stratified, Weighted, Reservoir, and Importance Sampling)
Designing Machine Learning Systems: Modes of Dataflow & Batch / Real-Time Processing
Designing Machine Learning Systems: Data Formats (JSON, Parquet & Binary Format), Data Models (Relational & NoSQL), and Data Storage Engines (ETL)
Designing Machine Learning Systems: Framing ML Problems (2) (Types of ML Tasks & Objective Functions)
Designing Machine Learning Systems: Iterative Process & Framing ML Problems (1)
Designing Machine Learning Systems: Business and ML Objectives & Requirements for ML Systems
Designing Machine Learning Systems (MLOps) Review Begins!
Practical Statistics for Data Scientists: Scaling and Categorical Variables (Scaling the Variables, Dominant Variables, Categorical Data, and Gower’s Distanc...
Practical Statistics for Data Scientists: Model-Based Clustering (Multivariate Normal Distribution, Mixtures of Normals & Selecting the Number of Cluster...
Practical Statistics for Data Scientists: Hierarchical Clustering (A Simple Example, the Dendrogram, the Agglomerative Algorithm & Measures of Dissimilar...
Practical Statistics for Data Scientists: K-Means Clustering (2) (Interpreting Clustering Results & Determining the Optimal Number of Clusters K)
Practical Statistics for Data Scientists: K-Means Clustering (1) (A Simple Example & K-Means Algorithm Code Source)
Practical Statistics for Data Scientists: Principal Components Analysis (2) (Formal Definition, Interpreting Components & Correspondence Analysis)
Practical Statistics for Data Scientists: Principal Components Analysis (1) (Unsupervised Learning, A Simple Example and Computing the Principal Components)
Practical Statistics for Data Scientists: Boosting (2) (Regularization, Hyperparameters & Cross-Validation)
Practical Statistics for Data Scientists: Boosting (1) (Key Concepts & XGBoost)
Practical Statistics for Data Scientists: Bagging and the Random Forest (2) (Random Forest II & Variable Importance)
Practical Statistics for Data Scientists: Bagging and the Random Forest (1) (Bagging and Random Forest)
Practical Statistics for Data Scientists: Tree Models (3) (Dealing With Overfitting Problems in R and Python & Predicting a Continuous Value)
Practical Statistics for Data Scientists: Tree Models (2) (A Simple Example, The Recursive Partitioning Algorithm, & Measuring Homogeneity or Impurity)
Practical Statistics for Data Scientists: K-Nearest Neighbors (3) (KNN as a Feature Engine) & Tree Models (1) (Key Concepts)
Practical Statistics for Data Scientists: K-Nearest Neighbors (2) (Standardization & Choosing K)
Practical Statistics for Data Scientists: K-Nearest Neighbors (1) (Example, Distance Metrics, and One Hot Encoder)
Practical Statistics for Data Scientists: Strategies for Imbalanced Data (2) (Data Generation, Cost-Based Classification, and Exploring the Predictions)
Practical Statistics for Data Scientists: Strategies for Imbalanced Data (1) (Undersampling & Oversampling)
Practical Statistics for Data Scientists: Evaluating Classification Models (Confusion Matrix, ROC-AUC & Lift)
Practical Statistics for Data Scientists: Logistic Regression (3) Assessing the Model
Practical Statistics for Data Scientists: Logistic Regression (2) (GLM, Interpretation, Fitting the Model)
Practical Statistics for Data Scientists: Logistic Regression (1) (Mathematical Foundation: Odds, Logit Function, Formula, and Examples)
Practical Statistics for Data Scientists: Discriminant Analysis (Covariance, Discriminant Function, and Application: Predicting Default Risk)
Practical Statistics for Data Scientists: Naive Bayes (Theoretical Approach, Code Source, & Prediction)
Practical Statistics for Data Scientists: Weighted Regression, and Interactions and Main Effects in Regression in Depth
Practical Statistics for Data Scientists: Stepwise Regression & Model Selection in Depth
Mathematical Principles: Non-parametric Inference (Wilcoxon Signed-Rank Test & Wilcoxon Rank-Sum Test)
Mathematical Principles: Inference on Proportions - Sample Size Estimation, Hypothesis Testing, and Chi-Squared Test
Mathematical Principles: Inference for Variance, Chi-Squared Distribution, F-Statistics, and Inference on Proportions (Wald & Wilson)
Mathematical Principles: Hypothesis Testing, Paired Samples, Independent Samples, and Additional Concepts
Practical Statistics for Data Scientists: Partial Residual Plots and Nonlinearity, Polynomial & Spline Regression, and Generalized Additive Models
Practical Statistics for Data Scientists: Regression Diagnostics - Outliers, Influential Observations, and Heteroskedasticity
Practical Statistics for Data Scientists: Interpreting the Regression Equation - Correlation, Multicollinearity, Confounding Variables, and Interactions
Practical Statistics for Data Scientists: Factor Variables in Regression, Dummy Variables, Many Levels, and Ordered Factor Variables
Practical Statistics for Data Scientists: Assessing the Model, Cross Validation, Model Selection, and Prediction Using Regression
Practical Statistics for Data Scientists: Simple Linear Regression, Least Squares, and Multiple Linear Regression
Practical Statistics for Data Scientists: Chi-Square Theories, Fisher’s Exact Test, Multi-Arm Bandit Algorithm, and Power & Sample Size
Practical Statistics for Data Scientists: ANOVA (One & Two-Way), F-Statistic, and Chi-Square Test
Practical Statistics for Data Scientists: t-Tests, Multiple Testing, False Discovery, and Degrees of Freedom
Practical Statistics for Data Scientists: p-Values, Practical Applications, and Type I & Type II Errors
Practical Statistics for Data Scientists: A/B Testing, Hypothesis Tests (One-Way & Two-Way), and Permutation Test
Practical Statistics for Data Scientists: t-Dist, Binomial, Chi-Square, F-Dist, and Poisson Distribution
Practical Statistics for Data Scientists: Bootstrap, Confidence Intervals, and Normal Distribution
Practical Statistics for Data Scientists: Sampling, Bias, and Sampling Distribution (Central Limit Theorem)
Practical Statistics for Data Scientists: Data Distribution, Correlation, and Various Data Visualization Plots
Practical Statistics for Data Scientists: Elements of Statistical Terminologies - Data Statistics Fundamentals, Data Types, and Estimates of Location & Var...
The Ongoing Chronicles of TIL25 — A Motivating Expedition as a Data Scientist & AI/ML Engineer Candidate
HW5: Out-of-distribution (OOD) Detection (Maximum Softmax Probability & ODIN) and Continual Learning (SLDA & IID Streaming)
HW4: Model Calibration (Platt Scaling & Label Smoothing) and Conformal Prediction (Naive and Adaptive Predictions Sets)
HW3: Optimization through Data Loading, Profiling, & Scaling, and Comparison of Data Parallel & Distributed Data Parallel
Model Drifting, Periodic Re-Training, Detecting Model Drift, Continual Learning (Pre-Trained Model, NCC), and Real-Time Machine Learning
Data-Centric AI: Label Noise, Selection Bias, Data Leakage, and Error Analysis for Model Improvement (Subgroup Errors)
Data-Centric AI: Active Learning, SEALS(Similarity Search for Efficient Active Learning), Dataset Pruning, and Data Engine
Data-Centric AI: Crowdsourcing, Methods to Estimate Annotator Quality, Neural Scaling Laws, Pareto Curves and Power Law
Variation of Conformal Prediction: Size of Calibration Set, Evaluation, and Group-Based & Adaptive Conformal Prediction
Understanding Conformal Prediction: Concepts, Applications, Marginal Coverage, and Recipes In Detail
Uncertainty in Deep Learning, Distribution Shifts, Model Calibration, and Out-of-Distribution (OOD) Detection
Language Models- Transfer Learning, Basic Concepts & Terminologies, Components of NLP Models, and Attention Mechanism
Bias Mitigation Strategies: Loss Reweighting, Sampling & Synthetic Samples and Architectural Changes (OccamNets, Adversarial Training & DANN)
Model Comparison and Bias Mitigation; McNemar’s Test, Dataset Bias, and Bias Detection
AI Ethics; AI Safety, Key Issues, AGI (Artificial General Intelligence), and Current AI Models’ Challenges
Llama 3: Framework, Workflow (RMSNorm, Grouped Query Attention, RoPE, SwiGLU Attention), Pre-training & Post-training
Comparing Pre-trained model embeddings (ResNet+SBERT vs. CLIP) and Prompt Engineering (Short and Direct, Few-Shot Learning, & Expert Prompting)
HW2: Understanding of LoRA and Pre-trained Model Embeddings (ResNet+SBERT) for Visual Question Answering (VQA)
Weights & Biases (W&B) for Monitoring and Fine-Tuning ResNet-18 and Post-Training Evaluations (Dying ReLU, Brightness Robustness)
HW0: Softmax Properties, PyTorch Lightning, and DataLoader
Deep Learning & Numerical Precision(Floating Point), Hardware Considerations, and Distributed Model Training
LLMs - Speeding Up LLMs (Grouped Query Attention, KV Caches, MoE, and DPO)
LLMs- Generating Texts, Positional Encoding, and Fine-Tuning LLMs (LoRA)
LLMs - Perplexity, Tokenizers, Data Cleaning, and Embedding Layer
Basic Machine Learning & Deep Learning, Word Embedding, CNNs, RNNs, LSTM and Transformer
Large Language Model - BERT, GPT, and GPT-2, 3 & 4
Transformers and Foundation Models: GELU, Layer Norm, Key Concepts & Workflow
Brief Explanation of Basic Algebra and Machine Learning
Primary Goals, Common Tasks, and Deep Learning NLP
Transformer Architecture, How the Models Are Different, and Q,K,V in Self-Attention
Designing Machine Learning Systems: Causes of ML System Failures (2) (Correcting Degenerate Feedback Loops) & Data Distribution Shifts
Designing Machine Learning Systems: Causes of ML System Failures (1) (Production data differing from training data, Edge Cases, and Degenerate Feedback Loops)
Designing Machine Learning Systems: ML on the Cloud and on the Edge & Model Optimization (AutoTVM & WebAssembly)
Designing Machine Learning Systems: Model Comparison (Low-Rank Factorization, Knowledge Distillation, Pruning, & Quantization)
Designing Machine Learning Systems: Model Deployment and Batch Prediction Versus Online Prediction (Unifying Batch and Streaming Pipeline)
Designing Machine Learning Systems: Model Offline Evaluation (Methods: Perturbation Tests, Invariance Tests, etc.)
Designing Machine Learning Systems: Auto ML (Hyperparameter Tuning & NAS), Model Offline Evaluation (1) (Establishing Baselines)
Designing Machine Learning Systems: Experiment Tracking, Versioning, and Distributed Training (Data Parallel)
Designing Machine Learning Systems: Model Selection, Evaluating ML Models, & Ensemble Method (Bagging, Boosting, and Stacking)
Designing Machine Learning Systems: Data Leakage (Definition, Common Causes, Detecting and Preventing it)
Designing Machine Learning Systems: Feature Engineering Techniques (2) (Positional Embeddings) & Engineering Good Features
Designing Machine Learning Systems: Feature Engineering Techniques (1) (Handling Missing Values, Scaling, Normalization, Binning, Encoding Categorical Values...
Designing Machine Learning Systems: Data Augmentation (Simple Label-Preserving Transformations, Perturbation, and Data Synthesis)
Designing Machine Learning Systems: Class Imbalance (How to Deal with the problems: Evaluation Metrics, Over & Undersampling, Resampling and Algorithm-le...
Designing Machine Learning Systems: Labeling (Hand Labels, Natural Labels, & Addressing the Lack of Labels - Active Learning, etc.)
Designing Machine Learning Systems: Sampling (Nonprobability, Simple Random, Stratified, Weighted, Reservoir, and Importance Sampling)
Designing Machine Learning Systems: Modes of Dataflow & Batch / Real-Time Processing
Designing Machine Learning Systems: Data Formats (JSON, Parquet & Binary Format), Data Models (Relational & NoSQL), and Data Storage Engines (ETL)
Designing Machine Learning Systems: Framing ML Problems (2) (Types of ML Tasks & Objective Functions)
Designing Machine Learning Systems: Iterative Process & Framing ML Problems (1)
Designing Machine Learning Systems: Business and ML Objectives & Requirements for ML Systems
Designing Machine Learning Systems (MLOps) Review Begins!
Population Proportions, p-values & Confidence Intervals, and Type I & II Errors
Test Statistics (Z-Test, t-Test, and Chi-Squared Test)
Law of Large Numbers, Central Limit Theorem, and Hypothesis Testing (1) - General Setup
Properties of Random Variable
Continuous Probability Distribution and Markov Chains
Joint, Marginal, & Conditional Probability Distributions, and Discrete & Poisson Distributions
Basic Probability - Counting and Random Variables
Applying PCA in Machine Learning and Scree Plot
Concepts, Types of Regularization, and How It Works
Further Analysis on R-Squared
How R-Squared Is Used As a Performance Metric in Machine Learning
Concepts Overview, Mathematical Calculation, and Interpretation
Applications on Machine Learning & Further Explanations
Mathematical Definition & Algorithms
Bayesian Statistics, the Law of Total Probability, and Var-Cov Matrix
Eigendecomposition, Symmetric Matrix, and Eigendecomposition
Norm(2), Eigenvectors and Eigenvalues
Scalar, Norm, Matrix (Inverse, Basic Functions), Rank
The Start of Recording TIL Summer ‘24 - During Summer in Rochester as a Data Scientist Candidate
Aggregating data with groupby
Adding new columns and refining unnecessary columns.
Change data type
Utilize public data to import two completely different data, pre-process them, and merge them
Basic Data Structure - 3. Deque
Basic Data Structure - 2. Queue
Basic Data Structure - 1. Stacks
TIL: CH02. Algorithm Analysis (2)
TIL: CH02. Algorithm Analysis
Python Basic Review
Interview Questions & Answers
Interview Questions & Answers
Interview Questions & Answers
Interview Questions & Answers
Interview Questions & Answers
Introduction of Statistics (2023) Whole Lecture Review
Use Other Metrics, Assign Different Class Weights, or Upsample the Minority Class
ROC area Under The Curve (ROC AUC)
Confusion Matrix and F1 score
Grid Search for Fine-Tuning Machine Learning Models
Bias & Variance, and Learning & Validation Curves
Model Selection and K-Fold Cross Validation
Interview Questions & Answers
Interview Questions & Answers
Interview Questions & Answers
Interview Questions & Answers
Introduction of Statistics (2023) Whole Lecture Review
Interview Questions & Answers
Interview Questions & Answers
Interview Questions & Answers
Interview Questions & Answers
Interview Questions & Answers
Interview Questions & Answers
Interview Questions & Answers
Interview Questions & Answers
Interview Questions & Answers
Interview Questions & Answers
Applying Kernel Principal Component Analysis (KPCA) to New Data Points
Implementing a Kernel Principal Component Analysis (KPCA) in Python
Nonlinear Mappings with Kernel Principal Component Analysis
Compressing Data via Linear Discriminant Analysis
Compressing Data via Dimensionality Reduction and Summary of PCA
Deep Learning & Numerical Precision (Floating Point), Hardware Considerations, and Distributed Model Training
LLMs - Speeding Up LLMs (Grouped Query Attention, KV Caches, MoE, and DPO)
LLMs - Generating Texts, Positional Encoding, and Fine-Tuning LLMs (LoRA)
LLMs - Perplexity, Tokenizers, Data Cleaning, and Embedding Layer
Large Language Model - BERT, GPT, and GPT-2, 3 & 4
Partitioning a Dataset into Training & Test Datasets, Feature Scaling, and Feature Selection
How To Train in Scikit-Learn, and Regularization with LR model
Cost Function of Logistic Regression
Basic Concepts and Sigmoid Function
Implementation Step by Step
Building a Decision Tree & Random Forest (1) - Key Concepts & How it Works
Information Gain (2) - Entropy & Classification Error
Components, How it Works & Maximizing Information Gain (1) - Gini Impurity
Bagging & Boosting : Basic Concepts & Code Implementation
Using the Majority Voting Principle to Make Predictions, and Evaluating & Tuning the Ensemble Classifier
Code Structure of Combining Classifiers via Majority Vote
Key Concepts and Mathematics Explanation
AI Ethics; AI Safety, Key Issues, AGI (Artificial General Intelligence), and Current AI Models’ Challenges
Llama 3: Framework, Workflow (RMSNorm, Grouped Query Attention, RoPE, SwiGLU Attention), Pre-training & Post-training
Comparing Pre-trained model embeddings (ResNet+SBERT vs. CLIP) and Prompt Engineering (Short and Direct, Few-Shot Learning, & Expert Prompting)
HW2: Understanding of LoRA and Pre-trained Model Embeddings (ResNet+SBERT) for Visual Question Answering (VQA)
Designing Machine Learning Systems (MLOps) Review Begins!
The Ongoing Chronicles of TIL25 — A Motivating Expedition as a Data Scientist & AI/ML Engineer Candidate
The Start of Recording TIL Summer ‘24 - During Summer in Rochester as a Data Scientist Candidate
Applying PCA in Machine Learning and Scree Plot
Applications on Machine Learning & Further Explanations
Mathematical Definition & Algorithms
Solving Nonlinear Problems - Using a Kernel SVM
SVM: Nonlinear Separable Case
Basic Concepts and Mathematical Formulations
The Curse of Dimensionality
Distance Metrics - Euclidean, Manhattan, Minkowski & Chebyshev Distance, and Cosine Similarity
Basic Concepts, How It Works, and Parametric & Non-Parametric Model
Basic Machine Learning & Deep Learning, Word Embedding, CNNs, RNNs, LSTM and Transformer
Basic Concepts and the Detailed Architecture
Types of Learning and Neural Net Zoo: Fully Connected Networks (MLPs), Inductive Bias, and Convolutional Neural Networks (CNNs)
Transformers and Foundation Models: GELU, Layer Norm, Key Concepts & Workflow
Primary Goals, Common Tasks, and Deep Learning NLP
Transformer Architecture, How the Models Are Different, and Q,K,V in Self-Attention
Implementation Step by Step
Building a Decision Tree & Random Forest (1) - Key Concepts & How it Works
Handling Categorical Data - Converting, Ordinal Encoding, and One-Hot Encoding
Handling Missing Data - Eliminating and Imputing & Estimators API
Types of Learning and Neural Net Zoo: Fully Connected Networks (MLPs), Inductive Bias, and Convolutional Neural Networks (CNNs)
Basic Mathematics, Supervised ML, and Review of Multi-Layered Perceptron
Basic Concepts and the Detailed Architecture
Neural Net Zoo: Transformers, Recurrent Neural Networks (RNNs) and Graph Neural Networks (GNNs)
Basic Machine Learning & Deep Learning, Word Embedding, CNNs, RNNs, LSTM and Transformer
Basic Concepts and the Detailed Architecture
Transformers and Foundation Models: GELU, Layer Norm, Key Concepts & Workflow
Primary Goals, Common Tasks, and Deep Learning NLP
Test Statistics (Z-Test, t-Test, and Chi-Squared Test)
Linear Algebra & Matrix for Programmers
Introduction of Statistics (2023) Whole Lecture Review
Interview Questions & Answers
Key Concepts and Example Code with Scikit-learn
Model Selection and K-Fold Cross Validation
Basic Mathematics, Supervised ML, and Review of Multi-Layered Perceptron
Neural Net Zoo: Transformers, Recurrent Neural Networks (RNNs) and Graph Neural Networks (GNNs)
Transformer Architecture, How the Models Are Different, and Q,K,V in Self-Attention
Brief Explanation of Basic Algebra and Machine Learning
Large Language Model - BERT, GPT, and GPT-2, 3 & 4
Revisiting Ensemble Method, Random Forest, and XGBoost
HW0: Softmax Properties, PyTorch Lightning, and DataLoader
Weights & Biases (W&B) for Monitoring and Fine-Tuning ResNet-18 and Post-Training Evaluations (Dying ReLU, Brightness Robustness)
HW2: Understanding of LoRA and Pre-trained Model Embeddings (ResNet+SBERT) for Visual Question Answering (VQA)
Model Comparison and Bias Mitigation; McNemar’s Test, Dataset Bias, and Bias Detection