Unsupervised Event and Extremism Detection in Open Source Data Streams
Social media, blogs, and newspapers are all examples of noisy, open-source data streams. The volume of data in these sources is growing rapidly, making it challenging to make sense of them. In this dissertation, we focus on extracting two types of signals from these noisy data sets: events and extremists. While these two signals seem unrelated, both are useful for understanding potential movement in areas of conflict.
We make three major contributions in this dissertation. First, we develop methods for detecting targeted events, i.e., events of a particular type at a particular time and location, from different forms of open-source data, specifically newspapers, blogs, and tweets. We propose both an offline and an online approach for identifying and summarizing events of the target domain occurring in a particular location from a large number of different news article sources. Next, we turn our attention to a noisier data stream, tweets. The variability in sentence structure and vocabulary, together with the limited length of tweets, means that this data stream requires different methods. We propose a simple algorithm that leverages geotagged bursty term graphs to detect events from a tweet stream. Because Twitter is such a noisy domain and the Twitter API provides only samples of the tweet stream, we then focus on understanding the impact of sample size and noise level on location-based event detection. Finally, we consider extremist conversation on social media. We begin by identifying potential features of ISIS supporters on Twitter, grouping these features into categories, and presenting a case study of the ISIS extremist group. We then propose an approach for identifying users who engage in extremist discussions online. We conclude the dissertation by discussing future areas of work and highlighting current challenges in using big data to help make progress on societal-scale issues.