Learning techniques for identifying vocal regions in music using the wavelet transformation version 1.0
Thesis (M.S.)--Georgetown University, 2011. Includes bibliographical references. Text (electronic thesis) in PDF format.

In this research I present a machine learning method for the automatic detection of vocal regions in music. I employ the wavelet transformation to extract wavelet coefficients, from which I build feature sets capable of constructing a model that distinguishes between regions of a song that contain vocals and those that are purely instrumental. Singing voice detection is an important aspect of the broader field of Music Information Retrieval, and efficient vocal region detection facilitates further research in other areas, such as genre classification and the management of large music databases. As such, it is important for researchers to be able to detect automatically, and accurately, which sections of music contain vocals and which do not. Previous methods used features such as the popular Mel-Frequency Cepstral Coefficients (MFCCs), which have several disadvantages when analyzing signals in the time-frequency domain that the wavelet transformation can overcome. The models constructed by applying the wavelet transformation to a windowed music signal achieve a classification accuracy of 86.66%, 11% higher than models built using MFCCs. Additionally, I show that applying a decision tree algorithm to the vocal region detection problem produces a more accurate model than other, more widely applied learning algorithms, such as Support Vector Machines.
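The abstract does not specify the wavelet family, window size, or feature set used in the thesis. As an illustration only, the following is a minimal sketch of the general pipeline it describes (window the signal, take a wavelet transform, and summarize the coefficients as features): it uses a Haar wavelet and log-energy features, both of which are my assumptions, implemented in plain NumPy so no wavelet library is required.

```python
import numpy as np

def haar_dwt(x):
    """One level of the Haar discrete wavelet transform.

    Splits the signal into approximation (low-pass) and
    detail (high-pass) coefficients at half the length.
    """
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail

def wavelet_features(window, levels=3):
    """Hypothetical feature vector for one analysis window:
    log-energy of the detail coefficients at each decomposition
    level, plus log-energy of the final approximation."""
    feats = []
    approx = np.asarray(window, dtype=float)
    for _ in range(levels):
        approx, detail = haar_dwt(approx)
        feats.append(np.log(np.sum(detail ** 2) + 1e-12))
    feats.append(np.log(np.sum(approx ** 2) + 1e-12))
    return np.array(feats)

def frame_signal(signal, frame_len=1024, hop=512):
    """Cut a mono signal into overlapping fixed-length windows
    (frame_len and hop are illustrative values, not the thesis's)."""
    frames = [signal[s:s + frame_len]
              for s in range(0, len(signal) - frame_len + 1, hop)]
    return np.array(frames)

# Example on a synthetic signal: each window yields one feature
# vector, which would then be fed to a classifier such as a
# decision tree or an SVM.
signal = np.sin(2 * np.pi * 440 * np.arange(4096) / 44100.0)
features = np.array([wavelet_features(f) for f in frame_signal(signal)])
```

Each row of `features` corresponds to one window of the signal; in a full system these vectors, labeled as vocal or instrumental, would form the training set for the learning algorithm.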