In this homework, I need to rank a set of documents by their similarity to a given query. The first challenge I faced was finding a good XML parser to preprocess the given XML files. After that, I tried many different TF-IDF weighting combinations to further improve the matching results.
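To make the overall pipeline concrete, here is a minimal sketch of the two steps, assuming Python's standard `xml.etree.ElementTree` as the parser and one common TF-IDF variant (raw term frequency times `log(N/df) + 1`, compared by cosine similarity). The XML layout (`<doc id="...">` with a `<text>` child), the sample data, and the function names `load_docs`, `tfidf_vectors`, `cosine`, and `rank` are all hypothetical; the actual homework files and the weighting finally chosen may differ.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample input; the real homework files may use another schema.
SAMPLE_XML = """<docs>
  <doc id="d1"><text>cats chase mice</text></doc>
  <doc id="d2"><text>dogs chase cats and cats</text></doc>
  <doc id="d3"><text>stock markets fell today</text></doc>
</docs>"""


def load_docs(xml_string):
    """Parse the XML and return {doc_id: list of lowercase tokens}."""
    root = ET.fromstring(xml_string)
    return {
        doc.get("id"): (doc.findtext("text") or "").lower().split()
        for doc in root.iter("doc")
    }


def tfidf_vectors(docs):
    """Build a sparse TF-IDF vector per document, plus the IDF table."""
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in docs.values() for term in set(tokens))
    # Assumed IDF variant: log(N / df) + 1, so shared terms keep some weight.
    idf = {term: math.log(n_docs / count) + 1.0 for term, count in df.items()}
    vectors = {
        doc_id: {term: tf * idf[term] for term, tf in Counter(tokens).items()}
        for doc_id, tokens in docs.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs):
    """Return document ids sorted from most to least similar to the query."""
    vectors, idf = tfidf_vectors(docs)
    # Query terms that never occur in the collection are ignored.
    query_vec = {
        term: tf * idf[term]
        for term, tf in Counter(query.lower().split()).items()
        if term in idf
    }
    scores = {doc_id: cosine(query_vec, vec) for doc_id, vec in vectors.items()}
    return sorted(scores, key=scores.get, reverse=True)


ranking = rank("cats", load_docs(SAMPLE_XML))
```

Swapping the term-frequency or IDF formula inside `tfidf_vectors` is one simple way to try out different TF-IDF combinations while leaving the rest of the sketch unchanged.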