My research is interdisciplinary, encompassing social media/computational social science, web/data science, web archiving, and (local) news.
BLOC (Behavioral Language for Online Classification)
BLOC is a language for representing the online behaviors of social media accounts irrespective of class (human, cyborg, or bot) or intent (malicious or benign). BLOC words consist of letters drawn from various alphabets (e.g., action, pause, and content alphabets). The language is highly flexible and can be applied to model a broad spectrum of legitimate and suspicious online behaviors.
BLOC has been applied to explain online behaviors, to detect bots and coordinated accounts, and to detect accounts, controlled by various nation states, that engage in information operations.
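To make the idea of words built from action and pause alphabets concrete, here is a minimal sketch. The symbols, the pause thresholds, and the function names (`encode`, `pause_delimited_words`) are illustrative assumptions, not BLOC's published alphabets or API.

```python
import re
from datetime import datetime, timedelta

# Hypothetical action alphabet (illustrative only, not BLOC's actual symbols).
ACTION_SYMBOLS = {"post": "T", "reply": "p", "reshare": "r"}


def pause_symbol(gap_seconds):
    """Map the gap between two consecutive actions to a pause letter.

    The thresholds (one minute, one hour) are assumptions for this sketch.
    """
    if gap_seconds < 60:
        return ""   # rapid succession: no pause letter
    if gap_seconds < 3600:
        return "."  # short pause
    return "|"      # long pause


def encode(events):
    """Encode time-sorted (timestamp, action) pairs as one behavioral string."""
    letters = []
    previous = None
    for timestamp, action in events:
        if previous is not None:
            letters.append(pause_symbol((timestamp - previous).total_seconds()))
        letters.append(ACTION_SYMBOLS[action])
        previous = timestamp
    return "".join(letters)


def pause_delimited_words(encoded):
    """Split an encoded string into words at the pause letters."""
    return [word for word in re.split(r"[.|]", encoded) if word]


start = datetime(2024, 1, 1, 12, 0, 0)
events = [
    (start, "post"),
    (start + timedelta(seconds=30), "post"),
    (start + timedelta(minutes=10), "reply"),
    (start + timedelta(hours=3), "reshare"),
    (start + timedelta(hours=3, seconds=20), "reshare"),
]
encoded = encode(events)
print(encoded)                         # TT.p|rr
print(pause_delimited_words(encoded))  # ['TT', 'p', 'rr']
```

Words such as `TT` or `rr` can then be counted per account and turned into feature vectors (for example with TF-IDF), which is the kind of representation the classification and coordination analyses build on.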
2D PCA projections of BLOC TF-IDF vectors of accounts from three datasets that include both humans (blue) and bots (orange), illustrating the discriminative power of BLOC in separating accounts of different classes: (left) cresci-17 and (right) varol-17. The Venn diagrams show the top five pause-delimited BLOC words for the bot and human accounts shown.
StoryGraph
StoryGraph provides a collection of tools that analyze the news cycle. Its USA tool generates a news similarity graph every 10 minutes by computing the similarity of news stories from 17 US news sources across the partisanship spectrum (left, center, and right). In these graphs, the nodes represent news articles, and an edge between a pair of nodes indicates a high degree of similarity between the two articles (similar news stories).
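The graph construction described above can be sketched as follows. This is a simplified illustration, not StoryGraph's implementation: Jaccard overlap of lowercase word sets stands in for whatever similarity measure StoryGraph uses, and the `threshold` value, the article IDs, and the function names are assumptions.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_similarity_graph(articles, threshold=0.3):
    """Return (nodes, edges) for a dict mapping article ID to text.

    An edge links two articles whose word-set similarity meets the
    (assumed) threshold.
    """
    tokens = {aid: set(text.lower().split()) for aid, text in articles.items()}
    nodes = sorted(tokens)
    edges = [
        (u, v)
        for u, v in combinations(nodes, 2)
        if jaccard(tokens[u], tokens[v]) >= threshold
    ]
    return nodes, edges


# Made-up headlines, for illustration only.
articles = {
    "a1": "senate passes budget bill",
    "a2": "senate passes new budget bill",
    "a3": "local team wins championship",
}
nodes, edges = build_similarity_graph(articles)
print(nodes)  # ['a1', 'a2', 'a3']
print(edges)  # [('a1', 'a2')]
```

Comparing every pair of articles is quadratic in the number of articles, which is acceptable for a sketch covering a handful of sources per snapshot.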
Three news similarity graphs illustrating the dynamics of the news cycle. In these graphs, a single node represents a news article, and a connected component (multiple connected nodes) represents a single news story reported by the connected articles. StoryGraph uses the average degree of the connected components to quantify the level of attention stories receive. The first graph shows what is often referred to as a slow news day: low overlap across news media organizations. The second graph shows a scenario where the attention of the media is split across multiple news stories. The third graph, which covers the release of the [Mueller Report](https://en.wikipedia.org/wiki/Mueller_report), shows a major news event: a high degree of overlap and connectivity across news media organizations.
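As a rough illustration of the attention measure mentioned in the caption, the sketch below finds connected components in an undirected graph and computes each component's average degree. The node labels and function names are hypothetical, and the exact scoring StoryGraph applies may differ.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) edge pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def connected_components(nodes, adjacency):
    """Return the connected components as a list of node sets."""
    seen = set()
    components = []
    for node in nodes:
        if node in seen:
            continue
        component = set()
        stack = [node]
        while stack:  # iterative depth-first traversal
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            component.add(current)
            stack.extend(adjacency[current] - seen)
        components.append(component)
    return components


def average_degree(component, adjacency):
    """Mean number of neighbours per node in one component."""
    return sum(len(adjacency[n]) for n in component) / len(component)


# Toy graph: one three-article story, one two-article story, one isolated article.
nodes = ["a", "b", "c", "d", "e", "f"]
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("d", "e")]
adjacency = build_adjacency(edges)
components = connected_components(nodes, adjacency)
scores = sorted((average_degree(c, adjacency) for c in components), reverse=True)
print(scores)  # [2.0, 1.0, 0.0]
```

In this toy graph the fully connected three-article story scores highest, mirroring how a major news event shows up as a densely connected component.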
Local Memory Project helps users and small communities discover, collect, build, archive, and share collections of stories for important local events from local sources.
What Did It Look Like is a Twitter bot that replies to tweets containing the #whatdiditlooklike hashtag and a URL with a Tumblr post showing yearly snapshots of what the webpage looked like.