Space Apps NYC 2017

I won Best Use of Data at NASA’s Space Apps 2017 in NYC with a project to unify NASA’s ~38,000 open data file formats and APIs.

Geography at Internet Scale

Geography at Internet Scale is a talk I gave at Dataengconf 2016 on the top things data engineers and data scientists need to know about my favorite subject.

Robot Donald Trump Twitterbot

I created the Robot Donald Trump Twitter bot in a few days during summer 2016. It uses Markov chains trained on Trump campaign speeches, random songs, and crazy conspiracy theory sites to generate a tweet every 15 minutes. Automated grammar-correction functions turn the gibberish Markov chain output into something resembling English. Noun phrase chunking finds subjects to turn into hashtags, and sentiment analysis is used to add Trump’s characteristic terminating mood statements. It also occasionally adds a link to a site related to the tweet.
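For readers curious about the Markov chain step, here is a minimal sketch of word-level text generation. It is not the bot's actual code. The function names `build_chain` and `generate`, the two-word state, the 140-character cap, and the toy corpus are all assumptions made for illustration, and the grammar correction, hashtag, and sentiment steps are left out.

```python
import random
from collections import defaultdict


def build_chain(text, order=2):
    """Map each run of `order` words to the words observed right after it."""
    words = text.split()
    chain = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        chain[state].append(words[i + order])
    return dict(chain)


def generate(chain, max_chars=140, rng=None):
    """Random-walk the chain until a dead end or the character limit."""
    rng = rng or random.Random()
    state = rng.choice(sorted(chain))  # start from a random known state
    out = list(state)
    while state in chain:
        nxt = rng.choice(chain[state])
        if len(" ".join(out + [nxt])) > max_chars:
            break
        out.append(nxt)
        state = state[1:] + (nxt,)  # slide the window forward by one word
    return " ".join(out)


# Toy corpus, invented for illustration only (hypothetical training text).
corpus = (
    "the cat sat on the mat and the cat ate the fish "
    "and the dog sat on the log and the dog ate the bone"
)
chain = build_chain(corpus)
tweet = generate(chain, rng=random.Random(0))
print(tweet)
```

Because each new word is drawn only from words that followed the current two-word state in the training text, the output is locally plausible but globally incoherent, which is why a cleanup pass like the grammar correction described above is useful.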

As of Dec. 2016 it was getting 100,000+ hits a month.

I shut it down in Jan. 2017 because it just got so depressing – passion projects should be fun!

Here are its latest tweets:

2016 Iowa Caucus Research

I really love political data – so getting to use Dstillery’s billions of events and heavy-duty machine learning pipeline to make a marketing piece for the start of election year 2016 was right up my alley. Together with my colleague Pete Ibarra (who I also collaborated with on our …

Hack the Dinos

Carleen Pan, Karol Zięba, Carol Lin and I won Best Visualization at the American Museum of Natural History’s Hack the Dinos hackathon with our project, Sapling Detector, a way for paleontologists to automatically convert images of phylogenetic trees into human-readable text formats using computer vision and deep learning techniques.