This page summarizes the materials covered in CS 6601 – Artificial Intelligence I took at Georgia Tech in Fall 2012. Here, I discuss what I have learned throughout the course, my activities and findings, how well I think I did, and what impact it had on me. This page is targeted towards an audience with an academic background and a basic understanding of artificial intelligence.
FOUNDATION: READING AND ASSIGNMENTS
Throughout this course we completed 8 reading assignments and solved accompanying problems from the book Artificial Intelligence: A Modern Approach by Russell and Norvig. (AIMA). Each assignment covered approximately two chapters of the book and consisted of reading approximately 100 pages in AIMA and then completing between 3 and 5 problems from the book. We covered the following topics in depth:
- Search algorithms. such as tree and graph search, BFS and DFS, A*, iterative deepening, local search algorithms such as hill climbing, beam search or simulated annealing, bidirectional search and various heuristics.
- Adversarial Games and Constraint Satisfaction Problems. Alpha-beta pruning, minimax algorithm, constraint graphs, backtracking and forward chaining.
- Propositional and First-Order Logic. Basics of logic, axioms, clauses and normal forms, reduction to normal form and inference.
- Probabilistic Reasoning. Bayes Nets and Hidden Markov Models.
- Learning and Decision Making. (Partially observable) Markov decision process, value and policy iteration, decision trees, neural networks, support vector machines
- Reinforcement. Passive and active reinforcement learning, policy search
- Natural Language Processing. Text classification using n-grams, information retrieval, language classification, grammars.
A very general lesson from doing these assignments was certainly time management. The sheer volume of material covered in this course required quite some time and fitting this time into a busy schedule was a valuable lesson to learn as well.
In retrospect, these assignments were the most time consuming part of the course, even more so than that projects. This was in part due to the amount of material that we covered and in part due to my lacking background in artificial intelligence, which meant I had to complement the AIMA book with background reading. Overall though, it was a very valuable experience during which I gained a lot of knowledge and reviewed some familiar material that I will be able to apply to my research. In general, the scores I received for the assignments reflected the effort I spent on them, with the exception of the logic and Bayes net assignment. These concepts were rather new and foreign to me and seemed deceivingly simple at first. As a result, I did not spend enough time on them to go into enough depth for the assignments.
The most satisfying and enjoyable assignment, which was also the one I got the best grade for, was the one on natural language processing. Although this topic was completely new to me as well, the concept was very easy to understand and apply. Also, this assignment included an implementation aspect allowing me to solve the problem in a very practical. Specifically, learning about and implementing an agent capable of information retrieval from the web and parsing that information was invaluable for any future work that involves large-scale data collection.
This course required us to do two mini-projects on problems that are relevant to our future career and research. For each of these projects we had to frame the problem in a social context, propose an approach and a solution, implement it, and describe our findings in a mini-conference paper. These projects were done in teams of two with different partners. For each project, we had to write a proposal and a paper that were consequently reviewed and graded by our peers in class. The four best papers were then selected to give oral presentations, while the rest was presented in a poster session. Specifically, I completed the following two projects.
Battleships – A Game Playing Agent with Jacob Alperin-Sheriff
We implemented a agent capable of playing the popular game Battleships. Specifically, we designed a graphical user interface allowing human as well as computer players to participate in the game. The framework we implemented enabled us to initialize the game with arbitrary board sizes as well as number and sizes of ships to allow future extensions of our work to bigger game boards. However, to allow straight forward comparisons against other algorithms, we used a board size of 10×10 squares on which 10 ships of various sizes were placed. Our agent used a Monte Carlo sampling algorithm at its core, outperforming the rule-based approach that we compared against. Originally, we intended to solve this problem with search algorithms such as A*, but after some initial thought we noticed that a more elegant solution would be Monte Carlo sampling.
Even after we completely changed our approach, we received very good marks from our reviewers. This was after all not surprising, since we had a solid implementation to show and a very detailed validation and justification of our final approach as well as the change in our initial plan. This highlighted the fact that a deviation from the proposal does not necessarily lead to bad reviews as long as it can be explained in the paper. A minor disappointment after receiving almost perfect scores on the proposal and the paper itself was the fact that we weren’t selected for oral presentations.
The proposal of this project can be found here.
The paper for this project can be found here.
Music Genre Classification with Ramitha D. Chitloor
In this project, we implemented a classification framework for music that contained of feature extraction, classification using various classifiers (support vector machines, k-nearest neighbor, and decision trees), and a cross-validation module. We collected data from two freely available datasets including the Million Song dataset that was made available online and through a Python-based API. This project was certainly the most challenging of the two, given that neither I nor my partner had any expertise in the domain of music other than being passive consumers of music. Therefore, we relied on the scientific community and the literature on feature extraction and selection as well as on extensive song databases that offered readily extracted feature sets. Trying to avoid feature extraction directly from audio files, we settled on using global features (e.g. loudness of a song, contained energy, length, key, etc.) for classification. Nonetheless our algorithms are implemented in such a way that they easily extend to a richer feature set.
Our effort in familiarizing ourselves with this domain paid off, as our paper was one of the best in class and we were selected to give an oral presentation.
The proposal of this project can be found here.
The paper for this project can be found here.
Overall, I did very well on both projects yielding an almost perfect score for the first and a perfect score for the second project. The main reason for these results was the fact that we had a strict schedule splitting our time 2/3 for conceptualizing and implementing our idea and 1/3 for writing and summarizing our results. In addition to having a solid implementation we therefore had a well prepared paper with a variety of results, clear figures, and clear and concise text. Writing with a target audience in mind certainly helps to achieve such a writing style and the reviews I got for both of my proposals and papers confirmed this strategy.
In retrospect, this project-based approach to learning AI was a great way to introduce students to the process of peer review of scientific journals and conferences. We received frequent and timely feedback on our projects and learned to write with and audience in mind. Clear and concise writing as it turned out, was as important if not more so than a good implementation of our approach. I enjoyed honing my writing skills, especially since the reviewers rewarded me with excellent grades. In addition, I learned to present my work in both poster format as well as through a short 5-minute long oral presentation. This tight time constraint is in my opinion a good preparation for conference presentations that also have a tight upper bound of 15 to 20 minutes. Apart from these soft skills, I learned a variety of useful algorithms, most notably classification algorithms during my second project.
MY FAVORITE PROJECT
My favorite project was clearly the music genre classification project, which I will describe in more detail in the following section.
Music genre classification is widely discussed in the MIR (Music Information Retrieval) Society and has been of interest for a variety of reasons, including management of large music collections. As these collections are becoming more digital, genre classification is essential for creating recommendation systems, that can benefit collective music libraries, social networks, etc. Only limited agreement is achieved among human annotators when classifying music by genre and an automatic genre classifier aims to overcome those limitations by defining a standard for extracting features directly from the audio signal for classification. It is important that we extract the most robust set of features and establish a reliable ground truth for classifying into genres.
Figure 1: A taxonomy of music showing major and minor classification of genres (courtesy of )
Talupur et al.  describe a neural network approach that is trained for the classification tasks to determine the genre of audio files. Feature vectors are extracted using Fourier Transforms and fed into a neural network classifier. The algorithm – a linear vector quantization network – in this paper can only distinguish between four genres. Galkin et al.  also use neural networks for classification, specifically feed-forward neural networks. Their approach is limited to three genres as classification accuracy would drop significantly below the mentioned 72% if that number was increased. Tzanetakis et al.  apply Gaussian classifiers and Gaussian mixture models. They present a hierarchy of musical genres and an elaborate section on feature extraction. Yet their classification results in only 61% accuracy over ten genres. Salamon et al.  describe an approach using high-level melodic features for their classification. Various algorithms are compared including support vector machines, random forests, k-nearest neighbour networks and Bayesian networks. Recognition rates of over 90% are reported. This approach though requires the existence of a melody in an audio file, which is not the case for all genres.
The main purpose of this project was to implement different classification algorithms and compare their performance when applied to a practical problem. Specifically, we performed music genre classification of songs on a dataset containing 150 songs. The three algorithms we used for classification were support vector machines (SVM), k-nearest neighbor (k-NN) classification, and decision trees (DT). In general, three steps have to be performed to apply a classifier and evaluate the quality of the results:
- Data collection: Retrieve a feature set for a representative sample of music.
- Classification: Run the classification algorithm on this dataset.
- Validation: Confirm the validity of the results using cross-validation.
Data Collection: We used a combination of two datasets, specifically we retrieved song titles, artists and genres from  and used that information to retrieve features from .  offers a very convenient Python API that allows searching and downloading features for songs directly from their website. Specifically, due to unavailability of certain songs on , we used a data set of 112 songs in 5 genres.
Classification: We used the Machine Learning Toolkit (Milk, see ) to implement our classification framework. Milk provides basic implementations for most machine learning algorithms including SVMs, k-NNs, and DTs. We conditioned our dataset to fit the required input format for Milk and extended the algorithms to perform multilabel classification.
Validation: Cross-validation refers to a commonly used technique in machine learning where the dataset is divided into a training set and a test set. The classifier is trained using the training set and its classification accuracy determined using the test set. This process is repeated n times and the average accuracy is reported. We used 10-fold stratified cross-validation, where n = 10 and the data set is split 80%/20% into training and test data.
Our results showing the accuracy of the classification are summarized in Table I. Reported are the average accuracies over all possible permutations of genres. For example in the 2 genre case there are 10 possible non-redundant combinations of genres. All 10 accuracy results are averaged and reported.
|# of Genres||k-NN||SVM||DT|
|2||71.10 %||60.71 %||64.14 %|
|3||53.19 %||46.26 %||48.24 %|
|4||45.65 %||37.27 %||37.79 %|
|5||43.64 %||26.82 %||34.55 %|
It turns out that the classification rate not only depends on the number of genres, but also on the combinations of genres selected. We achieved accuracy rates of up to 97.5% when Pop and Techno are chosen. Fig. 2 shows the accuracy rates of each of the combinations of genres with the shaded bars representing the average value.
Figure 2: Accuracy rates in percent for all classifiers and a varying number and permutation of genres.
As can be clearly seen in Figure 1, k-NN outperforms SVM and decision tree classifiers. We attribute this result to distribution of our features in the input space. Figure 3 plots a selection of features against one another for each data point in our set. This is a sample of the 5 most discriminant features from our visual analysis. As one can see, the division of the input space according to the genres is not clear and most importantly, does not have any distinct linear boundaries. For an SVM to be efficient, we need the classes to be linearly separable , which is not the case for the data we had available. For decision trees, the data constructs a complex tree structure with many spurious relationships that affect the accuracy rate.
Figure 3: A sample of 2D plots showing pair-wise comparisons of the most discriminate features in our data set.
Comparing our feature set to the one used in , one can clearly see why they achieve average accuracies of up to 90 %. We believe that one of the main reasons Salamon et al.  have achieved such high accuracy is that the genres they used for classification are as disparate as opera and pop and therefore separate really well (as seen in Figure 4).
Figure 4: A 2D plot showing a sample of the feature set used in . M. Talupur, S. Nath, and H. Yan , “Classification of music genre,” 2001. [Online]. Available: http://www.cs.cmu.edu/∼yh/files/GCfA.pdf  A. Galkin, P. Rege, and R. Pocratsky, “24787 artifical intelligence and machine learning for engineering design – classficiation of music using neural networks,” 2011. [Online]. Available: agalkin.com/media/ music/proj_music_report.pdf  G. Tzanetakis and P. R. Cook, “Musical genre classification of audio signals,” IEEE Transactions on Speech and Audio Processing, vol. 10, no. 5, pp. 293–302, 2002.  J. Salamon, B. Rocha, and E. Gómez, “Musical genre classification using melody features extracted from polyphonic music signals,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Kyoto, Japan, 25/03/2012 2012. [Online]. Available: files/publications/SalamonRochaGomezICASSP2012.pdf  D. Turnbull. (2012, July) Music genre classification. [Online]. Available: http://modelai.gettysburg.edu/2012/music/  T. Jehan and B. Whitman. (2012) The echo nest. [Online]. Available: http://the.echonest.com/  L. P. Coelho. (2008 – 2012) Milk: Machine learning toolkit. [Online]. Available: http://packages.python.org/milk/  S. J. Russell and P. Norvig, Artificial Intelligence: A Modern Approach. Pearson Education, 2010.  J. G. A. Barbedo and A. Lopes, “Automatic genre classification of musical signals,” EURASIP Journal on Advances in Signal Processing, 2007.
What I learned
The main reason for picking the topic for this project was that I wanted to learn machine learning techniques / classification techniques and apply them to an interesting data set. By implementing SVMs, decision trees, and k-Nearest Neighbor classifiers, I gained a deeper understanding of how they work and what properties the input data must have for these algorithms to perform well.
Why this was my favorite project
This was my favorite project because when we started working on it, we did not know what the final result would be. It was certainly challenging given our limited background in music and music genre classification, but really enjoyable and rewarding once we got the data collection and classification working and gathered our first results.
Having to write a proposal for each of our projects that was later peer-reviewed seemed to be unnecessary overhead at first, but turned out to be very helpful in mainly two aspects. First, it made us review existing literature and research very carefully and get a feeling for what has been done already and where possible improvements can be made. Second, through writing a proposal, we were forced to create a plan of attack, design our experiments, organize our thoughts, and structure our project, which was certainly helpful and sped up the implementation and paper writing. What I found very positive in this process was the fact that changing your plans from what was written in the proposal to what was done for the final paper was not punished.
Finishing two projects in a similar manner, the lessons learned are certainly that time management and scheduling are key to a successful projects. Given that both of my projects yielded excellent results, my take home lesson is that for future team projects, I have to communicate more clearly who is responsible for which part of the project and enforce deadlines more vigorously. From a technical perspective, what certainly can be improved is the novelty of the projects I do. Given our tight time constraint though, I think I did interesting projects with good results.
The projects were certainly the best part of this graduate course since they gave us the chance to explore interesting areas of AI in further depth, work with different team partners, practice our time management and scheduling skills, and receive frequent feedback through the peer-evaluation system. Two projects per semester was also a good number as it allowed us to explore two different topics, it gave us enough time to go into more depth, but at the same time it left some room for improvement and future work. In retrospect, I would have preferred to do fewer assignments and therefore have more time for the projects.
The lectures on the other hand were not very useful since they explained concepts at a very high level and most of the time did not involve the basic concepts at all, but rather extensions. It seemed like this course was geared towards students with an already solid background in AI. In my opinion, lectures should teach the basic concepts and prepare students for the assignments in class and for doing research outside of class. This was generally not the case with this course, as materials were covered after assignments on the corresponding topics were due.
Nonetheless, this was a very valuable course, introducing a breadth of concepts, approaches, and algorithms, which will most certainly benefit me throughout my Ph.D. and in my future career.