Overview

The course extends the fundamental tools of "Machine Learning Foundations" into powerful and practical models along three directions: embedding numerous features, combining predictive features, and distilling hidden features.

Syllabus
- Lecture 1: Linear Support Vector Machine
  - more robust linear classification solvable with quadratic programming (sketched below)
- Lecture 2: Dual Support Vector Machine
  - another QP form of SVM with valuable geometric insights and almost no dependence on the dimension of the feature transform
- Lecture 3: Kernel Support Vector Machine
  - kernel as a shortcut to (transform + inner product), allowing a spectrum of models ranging from simple linear ones to infinite-dimensional ones with margin control (sketched below)
- Lecture 4: Soft-Margin Support Vector Machine
  - a new primal formulation that allows some penalized margin violations, equivalent to a dual formulation with upper-bounded variables
- Lecture 5: Kernel Logistic Regression
  - soft classification by an SVM-like sparse model using two-level learning, or by a "kernelized" logistic regression model using the representer theorem
- Lecture 6: Support Vector Regression
  - kernel ridge regression via ridge regression + representer theorem, or support vector regression via regularized tube error + Lagrange dual (kernel ridge regression sketched below)
- Lecture 7: Blending and Bagging
  - blending known diverse hypotheses uniformly, linearly, or even non-linearly; obtaining diverse hypotheses from bootstrapped data
- Lecture 8: Adaptive Boosting
  - "optimal" re-weighting for diverse hypotheses and adaptive linear aggregation to boost weak algorithms (sketched below)
- Lecture 9: Decision Tree
  - recursive branching (purification) for conditional aggregation of simple hypotheses (sketched below)
- Lecture 10: Random Forest
  - bootstrap aggregation of randomized decision trees with automatic validation (sketched below)
- Lecture 11: Gradient Boosted Decision Tree
  - aggregating trees from functional + steepest gradient descent subject to any error measure (sketched below)
- Lecture 12: Neural Network
  - automatic feature extraction from layers of neurons, with the back-propagation technique for stochastic gradient descent (sketched below)
- Lecture 13: Deep Learning
  - an early and simple deep learning model that pre-trains with denoising autoencoders and fine-tunes with back-propagation
- Lecture 14: Radial Basis Function Network
  - linear aggregation of distance-based similarities to prototypes found by clustering (sketched below)
- Lecture 15: Matrix Factorization
  - linear models of items on extracted user features (or vice versa), jointly optimized with stochastic gradient descent for recommender systems (sketched below)
- Lecture 16: Finale
  - summary from the angles of feature exploitation, error optimization, and overfitting elimination, towards practical use cases of machine learning
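
Code Sketches

The sketches below are minimal, illustrative Python; they follow the spirit of the lectures but are not the course's official code, and every function and parameter name in them is an assumption.

For Lecture 1, a sketch of the linear hard-margin SVM as the quadratic program: minimize (1/2) w^T w subject to y_n (w^T x_n + b) >= 1. The choice of cvxopt as the QP solver is illustrative.

```python
import numpy as np
from cvxopt import matrix, solvers

def linear_svm_qp(X, y):
    """Hard-margin linear SVM; X is (N, d), y is (N,) with labels in {-1, +1}."""
    N, d = X.shape
    # Optimization variable u = [b, w]; objective is (1/2) u^T P u + q^T u.
    P = np.zeros((d + 1, d + 1))
    P[1:, 1:] = np.eye(d)                       # penalize only w, never b
    q = np.zeros(d + 1)
    # Constraints y_n (w^T x_n + b) >= 1, rewritten as G u <= h for the solver.
    G = -y[:, None] * np.hstack([np.ones((N, 1)), X])
    h = -np.ones(N)
    sol = solvers.qp(matrix(P), matrix(q), matrix(G), matrix(h))
    u = np.array(sol['x']).ravel()
    return u[0], u[1:]                          # bias b, weights w

# Toy usage on a linearly separable set (the hard margin assumes separability).
X = np.array([[0.0, 0.0], [2.0, 2.0], [2.0, 0.0], [3.0, 0.0]])
y = np.array([-1.0, -1.0, +1.0, +1.0])
b, w = linear_svm_qp(X, y)
```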
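
For Lecture 3, a sketch of the kernel shortcut: the Gaussian (RBF) kernel evaluates the inner product of an infinite-dimensional feature transform directly from the raw inputs. Later sketches reuse this helper; gamma is an assumed parameter name.

```python
import numpy as np

def gaussian_kernel(X1, X2, gamma=1.0):
    """K[i, j] = exp(-gamma * ||X1[i] - X2[j]||^2), with no explicit transform."""
    sq_dists = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * sq_dists)
```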
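
For Lecture 6, kernel ridge regression: by the representer theorem the optimal hypothesis is a kernel expansion over the training points, and its coefficients solve the linear system beta = (lambda I + K)^(-1) y. Reuses gaussian_kernel above; lam names the regularization parameter.

```python
def kernel_ridge_fit(X, y, lam=0.1, gamma=1.0):
    """Solve (lam * I + K) beta = y for the representer coefficients."""
    K = gaussian_kernel(X, X, gamma)
    return np.linalg.solve(lam * np.eye(len(y)) + K, y)

def kernel_ridge_predict(X_train, beta, X_test, gamma=1.0):
    """g(x) = sum_n beta_n * K(x_n, x)."""
    return gaussian_kernel(X_test, X_train, gamma) @ beta
```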
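
For Lecture 8, adaptive boosting with decision stumps as the weak algorithm: scaling incorrect examples up and correct ones down by the same factor makes the previous stump look random, forcing diversity, and alpha_t = ln(scale) gives the linear aggregation weights. All names are illustrative.

```python
import numpy as np

def fit_stump(X, y, w):
    """Weighted-error-minimizing stump h(x) = s * sign(x[d] >= thr)."""
    best, best_err = None, np.inf
    for d in range(X.shape[1]):
        for thr in np.unique(X[:, d]):
            for s in (+1.0, -1.0):
                pred = s * np.where(X[:, d] >= thr, 1.0, -1.0)
                err = w[pred != y].sum()
                if err < best_err:
                    best, best_err = (d, thr, s), err
    return best, best_err

def adaboost_fit(X, y, T=10):
    w = np.full(len(y), 1.0 / len(y))            # start from uniform weights
    ensemble = []
    for _ in range(T):
        (d, thr, s), err = fit_stump(X, y, w)
        eps = max(err / w.sum(), 1e-10)          # weighted error rate
        scale = np.sqrt((1.0 - eps) / eps)       # the re-weighting factor
        pred = s * np.where(X[:, d] >= thr, 1.0, -1.0)
        w = np.where(pred != y, w * scale, w / scale)   # "optimal" re-weighting
        ensemble.append((np.log(scale), d, thr, s))     # alpha_t = ln(scale)
    return ensemble

def adaboost_predict(ensemble, X):
    return np.sign(sum(a * s * np.where(X[:, d] >= thr, 1.0, -1.0)
                       for a, d, thr, s in ensemble))
```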
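
For Lecture 9, recursive branching: each internal node takes the binary split that most purifies the two parts (lowest size-weighted Gini impurity), and each leaf returns the majority label as its simple hypothesis. The depth cap and all names are illustrative.

```python
import numpy as np

def gini(y):
    """Gini impurity of a label vector y in {-1, +1}."""
    if len(y) == 0:
        return 0.0
    p = np.mean(y == 1)
    return 2.0 * p * (1.0 - p)

def grow_tree(X, y, depth=0, max_depth=5):
    if depth == max_depth or len(np.unique(y)) == 1:
        return ('leaf', 1.0 if np.mean(y) >= 0 else -1.0)
    best, best_score = None, np.inf
    for d in range(X.shape[1]):
        for thr in np.unique(X[:, d])[1:]:       # skip the trivial split
            left = X[:, d] < thr
            score = left.sum() * gini(y[left]) + (~left).sum() * gini(y[~left])
            if score < best_score:
                best, best_score = (d, thr), score
    if best is None:                             # no non-trivial split exists
        return ('leaf', 1.0 if np.mean(y) >= 0 else -1.0)
    d, thr = best
    left = X[:, d] < thr
    return ('node', d, thr,
            grow_tree(X[left], y[left], depth + 1, max_depth),
            grow_tree(X[~left], y[~left], depth + 1, max_depth))

def tree_predict(tree, x):
    if tree[0] == 'leaf':
        return tree[1]
    _, d, thr, left, right = tree
    return tree_predict(left if x[d] < thr else right, x)
```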
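
For Lecture 10, bootstrap aggregation of the trees above; each example is judged only by trees that never saw it, which gives the "automatic validation" (out-of-bag estimate) the lecture refers to. Per-split feature randomization is omitted for brevity.

```python
def random_forest(X, y, n_trees=25, seed=0):
    rng = np.random.default_rng(seed)
    forest, oob = [], []
    for _ in range(n_trees):
        idx = rng.integers(0, len(y), size=len(y))          # bootstrap sample
        forest.append(grow_tree(X[idx], y[idx]))
        oob.append(np.setdiff1d(np.arange(len(y)), idx))    # unseen examples
    return forest, oob

def forest_predict(forest, x):
    return 1.0 if np.mean([tree_predict(t, x) for t in forest]) >= 0 else -1.0

def oob_error(forest, oob, X, y):
    """Automatic validation: vote with the trees that did not train on example n."""
    wrong = []
    for n in range(len(y)):
        votes = [tree_predict(t, X[n]) for t, o in zip(forest, oob) if n in o]
        if votes:
            wrong.append((1.0 if np.mean(votes) >= 0 else -1.0) != y[n])
    return np.mean(wrong)
```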
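
For Lecture 11, gradient boosting with squared error: the negative functional gradient at the current ensemble is just the residual, so each round fits a regression tree to the residuals and then takes the steepest step alpha_t by one-dimensional linear regression. sklearn's regression tree stands in for a hand-rolled one.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gbdt_fit(X, y, T=50, max_depth=3):
    s = np.zeros(len(y))                       # current ensemble scores
    trees, alphas = [], []
    for _ in range(T):
        tree = DecisionTreeRegressor(max_depth=max_depth)
        tree.fit(X, y - s)                     # fit the residuals (neg. gradient)
        g = tree.predict(X)
        alpha = g @ (y - s) / (g @ g + 1e-12)  # steepest step along direction g
        s += alpha * g
        trees.append(tree)
        alphas.append(alpha)
    return trees, alphas

def gbdt_predict(trees, alphas, X):
    return sum(a * t.predict(X) for t, a in zip(trees, alphas))
```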
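
For Lecture 12, a one-hidden-layer network: tanh neurons extract features, and back-propagation supplies the gradients for stochastic gradient descent on squared error. Layer sizes, learning rate, and names are illustrative.

```python
import numpy as np

def train_nn(X, y, hidden=8, eta=0.1, epochs=100, seed=0):
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0, 0.1, (X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = rng.normal(0, 0.1, hidden);               b2 = 0.0
    for _ in range(epochs):
        for n in rng.permutation(len(y)):          # stochastic: one example
            x, t = X[n], y[n]
            # Forward pass.
            z = np.tanh(x @ W1 + b1)
            out = z @ W2 + b2
            # Backward pass: propagate the squared-error gradient.
            delta_out = 2.0 * (out - t)
            delta_hid = delta_out * W2 * (1.0 - z ** 2)   # tanh' = 1 - tanh^2
            W2 -= eta * delta_out * z;          b2 -= eta * delta_out
            W1 -= eta * np.outer(x, delta_hid); b1 -= eta * delta_hid
    return W1, b1, W2, b2

def nn_predict(params, X):
    W1, b1, W2, b2 = params
    return np.tanh(X @ W1 + b1) @ W2 + b2
```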
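
For Lecture 14, an RBF Network: k-means picks the prototypes, Gaussian similarities to those prototypes become the features, and plain linear regression aggregates them. Reuses gaussian_kernel above; sklearn's KMeans stands in for the lecture's clustering step.

```python
import numpy as np
from sklearn.cluster import KMeans

def rbf_network_fit(X, y, M=4, gamma=1.0):
    centers = KMeans(n_clusters=M, n_init=10).fit(X).cluster_centers_
    Z = gaussian_kernel(X, centers, gamma)           # (N, M) similarity features
    beta, *_ = np.linalg.lstsq(Z, y, rcond=None)     # linear aggregation weights
    return centers, beta

def rbf_network_predict(centers, beta, X, gamma=1.0):
    return gaussian_kernel(X, centers, gamma) @ beta
```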
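
For Lecture 15, matrix factorization by SGD: every known rating is modeled as the inner product of a user feature vector and an item feature vector, and the two sides are optimized jointly, one rating at a time. The (user, item, rating) triple format and all names are assumptions.

```python
import numpy as np

def mf_sgd(ratings, n_users, n_items, k=10, eta=0.05, epochs=30, seed=0):
    """ratings: list of (user_index, item_index, rating) triples."""
    rng = np.random.default_rng(seed)
    U = rng.normal(0, 0.1, (n_users, k))    # per-user feature vectors
    V = rng.normal(0, 0.1, (n_items, k))    # per-item feature vectors
    for _ in range(epochs):
        for i in rng.permutation(len(ratings)):      # SGD: one rating at a time
            n, m, r = ratings[i]
            err = r - U[n] @ V[m]                    # residual on this rating
            # Simultaneous update of both feature vectors.
            U[n], V[m] = U[n] + eta * err * V[m], V[m] + eta * err * U[n]
    return U, V

# A predicted rating for user n on item m is then U[n] @ V[m].
```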