In this homework, I am asked to use Hadoop to fulfill relatively big data manipulation. It turns out that I need to install VirtualBox with Windows on my Mac before I can install Hadoop on Windows system. After that, I write both Java and Python files to fulfill the required calculation.