For the final project, my teammates and I decide to work on an open machine learning problem on Kaggle, Titanic: Survival Prediction. We try different models from the scikit-learn package, and it turns out that the AdaBoost algorithm has the best performance.
Category: Senior (2014)
Big Data: Language, Compile, and Application Homework 2
In this homework, I am asked to use Hadoop to manipulate a relatively large data set. It turns out that I need to install VirtualBox with Windows on my Mac before I can install Hadoop on the Windows system. After that, I write both Java and Python programs to perform the required calculations.
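As a rough illustration of what the Python side of such a Hadoop job can look like, here is a minimal sketch of the map, sort, and reduce pattern. The homework's actual calculation is not described here, so a word count is assumed purely for illustration, and the names `mapper`, `reducer`, and `run_local` are hypothetical. The sketch runs in a single process and only imitates the sort step that Hadoop performs between the map and reduce phases.

```python
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Emit a (word, 1) pair for every whitespace-separated token."""
    for line in lines:
        for word in line.strip().lower().split():
            yield word, 1


def reducer(pairs):
    """Sum the counts for each key; pairs must already be sorted by key."""
    for word, group in groupby(pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def run_local(lines):
    """Imitate map -> sort -> reduce locally (assumption: no real cluster)."""
    shuffled = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(shuffled))


if __name__ == "__main__":
    sample = ["big data big", "data tools"]
    print(run_local(sample))
```

On a real cluster, the mapper and reducer would typically live in separate scripts that read from standard input and write tab-separated key/value lines to standard output, which is how Hadoop Streaming passes data to non-Java programs.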
Big Data: Language, Compile, and Application Homework 1
In this project, I am given a problem that can be solved with either general machine learning models: Support Vector Machine (SVM), Random Forest (RF), or Gradient Boosting Machine (GBM), or collaborative models: Matrix Factorization (MF) or Tensor Decomposition (TD). In the end, I decide to work on the TD model and generate all required statistic