<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

 <title>Michael Lai</title>
 <link href="http://mdlai.github.io/atom.xml" rel="self"/>
 <link href="http://mdlai.github.io/"/>
 <updated>2017-04-10T21:15:15+00:00</updated>
 <id>http://mdlai.github.io</id>
 <author>
   <name>Michael Lai</name>
   <email></email>
 </author>

 
 <entry>
   <title>Basketball Player Tracker</title>
   <link href="http://mdlai.github.io/2016/03/29/player-tracker/"/>
   <updated>2016-03-29T00:00:00+00:00</updated>
   <id>http://mdlai.github.io/2016/03/29/player-tracker</id>
   <content type="html">&lt;p&gt;I wanted to create a system that could track players in a basketball clip and translate them to a coordinate grid.&lt;/p&gt;

&lt;h3 id=&quot;background&quot;&gt;Background&lt;/h3&gt;
&lt;p&gt;This kind of motion tracking already exists in the form of SportVU.  But the aim of my project was to build player tracking from readily accessible YouTube clips.&lt;/p&gt;

&lt;p&gt;By satisfying this constraint, the system could be applied in college stadiums and foreign leagues where a $100,000 installation of six cameras just isn’t feasible.&lt;/p&gt;

&lt;p&gt;Player tracking has already shown many applications in strategic analysis, player health monitoring, and talent evaluation.  Making it more accessible levels the analytics playing field.&lt;/p&gt;

&lt;h3 id=&quot;challenges&quot;&gt;Challenges&lt;/h3&gt;
&lt;p&gt;Creating this system means I have to overcome a few challenges.
&lt;img src=&quot;http://mdlai.github.io/images/player-tracker/challenges.png&quot; alt=&quot;img&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;projective-transform&quot;&gt;Projective Transform&lt;/h3&gt;
&lt;p&gt;After a series of filters, we find probabilistic Hough lines.  Using the baseline, sideline, free throw line, and bottom of the free throw lane, we can create a map.  Since the dimensions of a real court are known, we can place objects on the drawn court.
&lt;img src=&quot;http://mdlai.github.io/images/player-tracker/transform.png&quot; alt=&quot;img&quot; /&gt;&lt;/p&gt;
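&lt;p&gt;The heart of this step is a planar homography: four point correspondences between the frame and a real court determine the transform.  Here’s a minimal numpy sketch, with invented pixel coordinates standing in for the detected landmarks:&lt;/p&gt;

```python
import numpy as np

def homography_from_points(src, dst):
    # Direct linear transform: solve for the 3x3 matrix H mapping each
    # src point (x, y) to the corresponding dst point (u, v).
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, vt = np.linalg.svd(np.array(A, dtype=float))
    return vt[-1].reshape(3, 3)   # null-space vector of A

def project(H, pt):
    x, y, w = H @ np.array([pt[0], pt[1], 1.0])
    return (x / w, y / w)

# Hypothetical pixel coordinates of four court landmarks in a frame...
frame_pts = [(120, 400), (520, 380), (460, 200), (180, 210)]
# ...and their known positions on a real half court, in feet.
court_pts = [(0, 0), (50, 0), (50, 19), (0, 19)]

H = homography_from_points(frame_pts, court_pts)
print(project(H, frame_pts[0]))  # maps back to roughly (0, 0)
```

Any other detected player position in the frame can then be pushed through the same `project` call to place it on the drawn court.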

&lt;h3 id=&quot;player-identification&quot;&gt;Player Identification&lt;/h3&gt;
&lt;p&gt;The next challenge we need to address is identifying players in each frame.  Issues such as overlapping players and distorted lighting make this a significant challenge.  I applied a convolutional neural network based on &lt;a href=&quot;http://www.image-net.org/challenges/LSVRC/2013/slides/overfeat_ilsvrc2013.pdf&quot;&gt;OverFeat&lt;/a&gt;.  The model is implemented using TensorBox and creates bounding boxes on identified players.&lt;/p&gt;

&lt;p&gt;Here’s a video of one of the early iterations applied to a &lt;a href=&quot;https://www.youtube.com/watch?v=eZK2_-rIzJE&quot;&gt;YouTube highlight reel&lt;/a&gt;.&lt;/p&gt;
&lt;iframe width=&quot;420&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/Qd8l2MbkKnM&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;p&gt;The detection model has a confidence threshold that controls which boxes are kept.  By lowering this threshold to 30% confidence, I increased recall by 40% at the cost of some precision.  The false positives are filtered out in the next stage.
&lt;img src=&quot;http://mdlai.github.io/images/player-tracker/cnn.png&quot; alt=&quot;img&quot; /&gt;&lt;/p&gt;
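&lt;p&gt;The thresholding itself amounts to a one-line filter over the detections.  A sketch, with an invented detection format rather than TensorBox’s actual output:&lt;/p&gt;

```python
def filter_boxes(detections, threshold=0.30):
    # Keep only boxes whose confidence clears the threshold; lowering
    # the threshold trades precision for recall.
    return [d for d in detections if d["conf"] >= threshold]

# Hypothetical detections (the dict layout here is illustrative only).
detections = [
    {"box": (10, 20, 50, 90), "conf": 0.92},
    {"box": (60, 25, 95, 88), "conf": 0.41},
    {"box": (5, 5, 15, 15), "conf": 0.12},
]
print(len(filter_boxes(detections)))  # 2
```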

&lt;h3 id=&quot;filtering-and-error-detection&quot;&gt;Filtering and Error Detection&lt;/h3&gt;
&lt;p&gt;I chose two color filters, in this case purple and white, based on the teams that are playing.  By applying the white and purple filters to each bounding box, I can compute a feature, color occurrence frequency, that lets me classify players by team.&lt;/p&gt;

&lt;p&gt;Furthermore, I can filter out erroneous bounding boxes by discarding any box in which neither team color is detected.
&lt;img src=&quot;http://mdlai.github.io/images/player-tracker/filter.png&quot; alt=&quot;img&quot; /&gt;&lt;/p&gt;
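&lt;p&gt;A rough sketch of the idea, with invented reference colors and tolerances: measure the fraction of pixels near each team color, keep the better match, and reject boxes that match neither.&lt;/p&gt;

```python
import numpy as np

def color_fraction(patch, color, tol=40):
    # Fraction of pixels within tol of the reference color on every channel.
    diff = np.abs(patch.astype(int) - np.array(color))
    return np.mean(np.all(np.less_equal(diff, tol), axis=-1))

def classify_patch(patch, team_colors, min_fraction=0.05):
    # Score the crop against each team color; if neither color shows up
    # often enough, treat the box as a false positive and return None.
    scores = {team: color_fraction(patch, c) for team, c in team_colors.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= min_fraction else None

# Invented RGB reference colors: white jerseys vs purple jerseys.
team_colors = {"white_team": (255, 255, 255), "purple_team": (128, 0, 128)}
patch = np.ones((20, 10, 3), dtype=np.uint8) * 250   # a mostly-white crop
print(classify_patch(patch, team_colors))  # white_team
```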

&lt;h3 id=&quot;results&quot;&gt;Results&lt;/h3&gt;
&lt;p&gt;When all these steps are applied to each frame of the video, the video can be reconstructed and looks something like this.
&lt;img src=&quot;http://mdlai.github.io/images/player-tracker/animate.gif&quot; alt=&quot;img&quot; /&gt;&lt;/p&gt;

&lt;p&gt;There’s a lot of room for improvement here.  As you can see, players cut in and out, and the coordinates carry significant noise.&lt;/p&gt;

&lt;p&gt;The takeaway is that each player can be mapped to a location on the court, based on their position on the screen and the landmarks on the court.&lt;/p&gt;

&lt;h3 id=&quot;future-work&quot;&gt;Future work&lt;/h3&gt;
&lt;p&gt;The model is trained on only 75 frames of training data, so more data is the first place I’d look for improvements.&lt;/p&gt;

&lt;p&gt;To make the translation more versatile, I need to improve the handling of court detection.  This is a particular issue when crossing half court.  Detecting objects on the court rather than lines might be more resistant to obstruction and ultimately prove more flexible.&lt;/p&gt;

&lt;p&gt;Next, by leveraging the temporal nature of video, I can reduce jitter by interpolating points between frames.  A Kalman filter might be applicable here.&lt;/p&gt;
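&lt;p&gt;As a sketch of what that might look like, here’s a minimal 1-D constant-velocity Kalman filter in numpy.  The real system would run one filter per player per coordinate, with tuned noise parameters:&lt;/p&gt;

```python
import numpy as np

def kalman_smooth(xs, q=1e-3, r=0.25):
    # Minimal 1-D constant-velocity Kalman filter: state = [position, velocity].
    F = np.array([[1.0, 1.0], [0.0, 1.0]])   # motion model, one frame per step
    Hm = np.array([[1.0, 0.0]])              # we only observe position
    Q = q * np.eye(2)                        # process noise
    x = np.array([xs[0], 0.0])
    P = np.eye(2)
    out = []
    for z in xs:
        # Predict the next state from the motion model.
        x = F @ x
        P = F @ P @ F.T + Q
        # Update with the noisy observed coordinate z.
        S = Hm @ P @ Hm.T + r
        K = P @ Hm.T / S
        x = x + (K * (z - Hm @ x)).ravel()
        P = (np.eye(2) - K @ Hm) @ P
        out.append(x[0])
    return out

noisy = [0.0, 1.2, 1.9, 3.3, 3.8, 5.1, 6.0]   # jittery court coordinates
smooth = kalman_smooth(noisy)
```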

&lt;p&gt;Finally, I want to create a time series where I can link rectangles to each other.  My idea is to use either a Kalman filter or an HMM to create a continuous sequence of coordinates based on a probability function incorporating distance and color.&lt;/p&gt;

&lt;p&gt;So there’s plenty of future work to do, and if you’d like to take a look at the code and try some of it yourself, check it out on my &lt;a href=&quot;https://github.com/mdlai/player_tracker&quot;&gt;GitHub&lt;/a&gt;.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>NBA Player Descriptions</title>
   <link href="http://mdlai.github.io/2016/03/03/nba-text/"/>
   <updated>2016-03-03T00:00:00+00:00</updated>
   <id>http://mdlai.github.io/2016/03/03/nba-text</id>
   <content type="html">&lt;p&gt;Draftexpress.com does a lot of analysis.  Sometimes I wonder if it’s all the same or if they’re actually coming up with some new and novel way to describe people every time.&lt;/p&gt;

&lt;p&gt;I scraped 761 player profile pages, such as &lt;a href=&quot;http://www.draftexpress.com/profile/Karl-Towns-61831/&quot;&gt;this one&lt;/a&gt;, from between 2000-2015 and used Latent Dirichlet Allocation (LDA) to model what topics writers were talking about.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://mdlai.github.io/images/nba-text/topics.png&quot; alt=&quot;img&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Each topic encompasses a position, along with its desirable attributes.  The point guard and power forward positions oddly share the phrase “mid range”.&lt;/p&gt;

&lt;p&gt;The positions are accurately categorized by the topics!  This confirms that both the writing and the model capture positions correctly.
&lt;img src=&quot;http://mdlai.github.io/images/nba-text/byposition.png&quot; alt=&quot;img&quot; /&gt;&lt;/p&gt;

&lt;p&gt;There’s also a very odd relationship between the clusters and the years that fall into those clusters.
&lt;img src=&quot;http://mdlai.github.io/images/nba-text/byyear.png&quot; alt=&quot;img&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Since the two clusters containing these years both contained the phrase “mid range”, I dug into how frequently it occurs.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://mdlai.github.io/images/nba-text/midrange.png&quot; alt=&quot;img&quot; /&gt;&lt;/p&gt;

&lt;p&gt;From the chart we can see that the usage of the phrase mid range is going out of style.  Today’s NBA is focused on efficiency, and it seems like writers are following that trend too.&lt;/p&gt;

&lt;p&gt;Apparently no one cares about midrange basketball anymore, not even NBA writers.  So long Kobe and MJ.  Hello Steph Curry.&lt;/p&gt;

&lt;p&gt;The chart below helped me better understand my data.&lt;/p&gt;

&lt;p&gt;It’s interactive!  Click on the vertical axis to filter out different lines.  It’s an easy way to filter each cluster in terms of year, pick, position, height, and weight.&lt;/p&gt;

&lt;iframe src=&quot;/images/nba-text/index.html&quot; marginwidth=&quot;0&quot; marginheight=&quot;0&quot; scrolling=&quot;no&quot; width=&quot;100%&quot; height=&quot;800&quot;&gt;&lt;/iframe&gt;
</content>
 </entry>
 
 <entry>
   <title>Predicting Numbers</title>
   <link href="http://mdlai.github.io/2016/02/23/Digit-Predictor/"/>
   <updated>2016-02-23T00:00:00+00:00</updated>
   <id>http://mdlai.github.io/2016/02/23/Digit-Predictor</id>
   <content type="html">&lt;p&gt;Lately I’ve been building a model that can read your mind… or just your handwriting.&lt;/p&gt;

&lt;p&gt;Using Tensorflow, Flask, and Heroku, I created an app that can guess numbers drawn on it.&lt;/p&gt;

&lt;div class=&quot;12u$&quot;&gt;&lt;span class=&quot;image fit&quot;&gt;&lt;img src=&quot;/images/lucky-number/presentation.jpg&quot; alt=&quot;&quot; /&gt;&lt;/span&gt;&lt;/div&gt;

&lt;p&gt;Here’s me presenting my app!&lt;/p&gt;

&lt;p&gt;The model is a convolutional neural network trained on the MNIST data set.  The next step would be for me to store the drawn data and add it to my model… maybe when I have a bit more time.&lt;/p&gt;

&lt;p&gt;It kinda sucks at guessing 9s and 0s, a consequence of a model that’s overfit to its specific training data.  Maybe adding some deformations to the dataset would improve performance.&lt;/p&gt;

&lt;p&gt;The instructions are simple, just click on the grid and draw a number.&lt;/p&gt;

&lt;iframe src=&quot;https://number-predictor.herokuapp.com/&quot; width=&quot;100%&quot; height=&quot;800&quot;&gt;
  &lt;p&gt;
    &lt;a href=&quot;https://number-predictor.herokuapp.com/&quot;&gt;
      Fallback link for browsers that don’t support iframes
    &lt;/a&gt;
  &lt;/p&gt;
&lt;/iframe&gt;

&lt;p&gt;If you want to see the code check out &lt;a href=&quot;https://github.com/mdlai/digit_recognition&quot;&gt;my github.&lt;/a&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>HTML5 To Jekyll</title>
   <link href="http://mdlai.github.io/2016/02/15/Blog-About-Blogs/"/>
   <updated>2016-02-15T00:00:00+00:00</updated>
   <id>http://mdlai.github.io/2016/02/15/Blog-About-Blogs</id>
   <content type="html">&lt;p&gt;I just finished my 5th week at Metis, so I decided what better way to celebrate than spend my weekend re-working my blog!&lt;/p&gt;

&lt;p&gt;Making Jekyll work with HTML5 is a bit of a headache.  I hope my pain can be your gain.&lt;/p&gt;

&lt;p&gt;There’s a great tutorial on how to convert a theme &lt;a href=&quot;http://jekyll.tips/guide/setup/&quot;&gt;here&lt;/a&gt;, but I’ll toss in a few things that would’ve helped past me.  These tips assume you’ve at least skimmed that guide.&lt;/p&gt;

&lt;h3 id=&quot;basics&quot;&gt;Basics&lt;/h3&gt;
&lt;ol&gt;
  &lt;li&gt;Find an HTML5 template you like.&lt;/li&gt;
  &lt;li&gt;Get Jekyll running.&lt;/li&gt;
  &lt;li&gt;Add the necessary folder structure and config files to your template.&lt;/li&gt;
  &lt;li&gt;Customize.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3 id=&quot;tips&quot;&gt;Tips&lt;/h3&gt;
&lt;p&gt;I’ll skip to number three since the first two should be pretty straightforward.  The best tip I have for past me is to break each section into an _include.  It adds a lot of flexibility in how your layouts are designed.&lt;/p&gt;

&lt;p&gt;For example, a header is always necessary, but splitting out the HTML for, say, a specific menu lets you build slightly different headers to suit your needs.&lt;/p&gt;

&lt;p&gt;In terms of customization, I spent far too much time wrestling with Liquid code.  I really hope I didn’t miss the easy way, but finding indexes and looping through arrays was a monstrosity.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;{% comment %}Find the index of the current page and the total number of posts.{% endcomment %}
{% for post in site.posts %}
  {% assign max = forloop.length | minus: 2 %}
  {% if page.url == post.url %}
    {% assign current = forloop.index %}
  {% endif %}
{% endfor %}

{% comment %}If the post is less than 3 from the start or greater than 3 from
the end, use the first 5 posts or the last 5 respectively.{% endcomment %}
{% if current &amp;lt; 3 %}
  {% assign current = 3 %}
{% elsif current &amp;gt; max %}
  {% assign current = max %}
{% endif %}

{% comment %}For each post, use the two posts forward in time and the two
backward as the links in the sidebar.{% endcomment %}
{% for post in site.posts %}
  {% assign last = current | plus: 3 %}
  {% assign first = current | minus: 3 %}
  {% if forloop.index &amp;gt; first and forloop.index &amp;lt; last %}
    &lt;!-- Show Post --&gt;
  {% endif %}
{% endfor %}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This little bit was necessary just to create chronologically adjacent sidebar links further than 1 post away.&lt;/p&gt;

&lt;p&gt;Finally, copy elements from other themes.  Tons of HTML5-based themes are available &lt;a href=&quot;http://jekyll.tips/templates/&quot;&gt;here&lt;/a&gt;, and they’re already fit for Jekyll.&lt;/p&gt;

&lt;p&gt;Liquid is really annoying and Jekyll is really amazing.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;mailto:Michael@mdlai.com&quot;&gt;E-mail me.&lt;/a&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Cleaning Pictures</title>
   <link href="http://mdlai.github.io/2016/02/12/Cleaning-Pictures/"/>
   <updated>2016-02-12T00:00:00+00:00</updated>
   <id>http://mdlai.github.io/2016/02/12/Cleaning-Pictures</id>
   <content type="html">&lt;p&gt;This week I discovered that image preprocessing is a ton of work.&lt;/p&gt;

&lt;p&gt;My current project is creating a model that predicts digits using the MNIST data set.&lt;/p&gt;

&lt;p&gt;To improve the performance of my model I needed to touch up my digits with a two-fold strategy.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Create a bounding box&lt;/li&gt;
  &lt;li&gt;De-Skew the image&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Easy, I thought.  Even with no experience doing image processing, how hard could it be?  Well, after step one I was feeling pretty confident.  A quick drop into the &lt;code class=&quot;highlighter-rouge&quot;&gt;skimage&lt;/code&gt; library and my numbers were bounded and resized.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;from skimage.filters import threshold_otsu
from skimage.measure import label, regionprops
from skimage.transform import resize
import matplotlib.pyplot as plt

thresh = threshold_otsu(image)
binary = image &amp;gt; thresh
# regionprops expects a labeled image; take the first region's cropped box
region = regionprops(label(binary))[0]
plt.matshow(resize(region.image.astype(float), (28, 28)))
&lt;/code&gt;&lt;/pre&gt;
&lt;div class=&quot;row uniform&quot;&gt;
  &lt;div class=&quot;6u&quot;&gt;&lt;span class=&quot;image fit&quot;&gt;&lt;img src=&quot;/images/cleaning-pictures/six.png&quot; alt=&quot;&quot; /&gt;&lt;/span&gt;&lt;/div&gt;
  &lt;div class=&quot;6u$&quot;&gt;&lt;span class=&quot;image fit&quot;&gt;&lt;img src=&quot;/images/cleaning-pictures/pca.png&quot; alt=&quot;&quot; /&gt;&lt;/span&gt;&lt;/div&gt;
&lt;/div&gt;

&lt;p&gt;Then I thought about de-skewing or rotating my image.  And I kept thinking.  For three days I had no real clue where to go.  After reading a ton of papers and consulting Professor Google countless times I finally came to a workable solution.&lt;/p&gt;

&lt;p&gt;I would use PCA to determine a principal axis vector and use that to calculate an angle and rotate my image.  And it worked, sort of.  Maybe I’ll get some better results next week.&lt;/p&gt;
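&lt;p&gt;The gist of that solution, sketched in numpy: treat the foreground pixel coordinates as data, take the covariance eigenvector with the largest eigenvalue as the principal axis, and measure its angle from vertical.&lt;/p&gt;

```python
import numpy as np

def principal_angle(binary):
    # Coordinates of the foreground pixels, centered on their mean.
    ys, xs = np.nonzero(binary)
    pts = np.column_stack([xs, ys]).astype(float)
    pts -= pts.mean(axis=0)
    # PCA: the covariance eigenvector with the largest eigenvalue
    # is the principal axis of the digit.
    vals, vecs = np.linalg.eigh(np.cov(pts.T))
    axis = vecs[:, np.argmax(vals)]
    if np.sign(axis[1]) == -1:
        axis = -axis   # resolve the eigenvector's sign ambiguity
    # Angle between the principal axis and vertical, in degrees.
    return np.degrees(np.arctan2(axis[0], axis[1]))

# A vertical bar of pixels has a vertical principal axis, so the angle ~ 0.
img = np.zeros((28, 28), dtype=bool)
img[4:24, 13:15] = True
print(principal_angle(img))
```

The resulting angle can then be fed to a rotation routine such as `skimage.transform.rotate` to de-skew the digit.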

&lt;p&gt;If you have any methods or bits of knowledge you want to drop on me about image processing, I’d love to hear them.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;mailto:Michael@mdlai.com&quot;&gt;E-mail me.&lt;/a&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Determiner Noun: Number</title>
   <link href="http://mdlai.github.io/2016/01/29/Determiner-Noun/"/>
   <updated>2016-01-29T00:00:00+00:00</updated>
   <id>http://mdlai.github.io/2016/01/29/Determiner-Noun</id>
   <content type="html">&lt;p&gt;Since the dawn of time, humanity has sought the answers to what to name their movies if they want to make the most money.  Finally someone has nudged the the space of understanding.&lt;/p&gt;

&lt;p&gt;Beginning with a data set of just movie Titles, Revenue, and Theaters, the struggle began: breaking down Titles into parts of speech to find the answers I was looking for.  To accomplish this I met a friend, “Spacey”, who helped me smash my Titles into tiny parts of speech.&lt;/p&gt;

&lt;div class=&quot;12u$&quot;&gt;&lt;span class=&quot;image fit&quot;&gt;&lt;img src=&quot;/images/determiner-noun/table.png&quot; alt=&quot;&quot; /&gt;&lt;/span&gt;&lt;/div&gt;

&lt;p&gt;Armed with my tiny parts of speech, I herded the Movies into bins based on their Total Theater counts: in one bin, small movies with 0-20 theaters; in the next, 20-1000; in the final bin, 1000+ theaters.  This way all the bins were of roughly equal size.&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;image right&quot;&gt; &lt;img src=&quot;/images/determiner-noun/bins.png&quot; alt=&quot;&quot; /&gt;&lt;/span&gt;&lt;/p&gt;

&lt;p&gt;In order to put my Total Grosses in line, I mushed them all through a natural log, which oddly enough made them more normal.&lt;/p&gt;

&lt;p&gt;As my quest for the answers progressed, I decided to cross my parts of speech with my bins, which gave me a new weapon: 56 features.  Again I had to normalize them, in order to put them on the same scale.  So they’d be a better fit, of course.&lt;/p&gt;

&lt;p&gt;I took these features and trained a ridge (regression) on them.  And what a beautiful ridge regression it was.  The lambda was all the way up at 10.&lt;/p&gt;

&lt;p&gt;After training on the ridge regression, each feature produced a beta.  I knew I was close to finding the answers.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://mdlai.github.io/images/determiner-noun/betas.png&quot; alt=&quot;img&quot; /&gt;&lt;/p&gt;
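&lt;p&gt;For reference, the fit is only a few lines in scikit-learn.  This sketch uses random stand-in data shaped like the real thing (56 features), not the actual movie data:&lt;/p&gt;

```python
import numpy as np
from sklearn.linear_model import Ridge

# Random stand-in data: rows are movies, columns are the 56
# (part of speech x bin) features, target is log total gross.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 56))
y = X @ rng.normal(size=56) + rng.normal(scale=0.1, size=200)

model = Ridge(alpha=10)    # the lambda "all the way up at 10"
model.fit(X, y)
betas = model.coef_        # one beta per feature, as in the chart above
print(betas.shape)  # (56,)
```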

&lt;p&gt;In order to bait out the answers, I subtracted the low betas from the high betas for each part of speech.  This was because my low betas were disguised: they were actually the baseline for my comparisons.&lt;/p&gt;

&lt;p&gt;Huzzah!  The answers!&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://mdlai.github.io/images/determiner-noun/widerelease.png&quot; alt=&quot;img&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://mdlai.github.io/images/determiner-noun/midrelease.png&quot; alt=&quot;img&quot; /&gt;&lt;/p&gt;

&lt;p&gt;For wide release movies I discovered it was best to name a movie with a noun, number, and punctuation.  My guess is the number trend is a result of wide release sequels producing a lot of money.&lt;/p&gt;

&lt;p&gt;For mid release movies, adpositions and verbs were the way to go.&lt;/p&gt;

&lt;p&gt;While the results weren’t everything I was looking for, they were something worthwhile.  Maybe I’ll be able to find more in &lt;em&gt;Determiner Noun: Number + 1&lt;/em&gt;!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Learning Hearthstone</title>
   <link href="http://mdlai.github.io/2016/01/19/Learning-Hearthstone/"/>
   <updated>2016-01-19T00:00:00+00:00</updated>
   <id>http://mdlai.github.io/2016/01/19/Learning-Hearthstone</id>
   <content type="html">&lt;p&gt;In any competitive game information is a huge advantage.  Imagine if an opponent played with an open hand.  Planning your own turn and planning your future turns would be a piece of cake.&lt;/p&gt;

&lt;p&gt;The big question is&lt;/p&gt;

&lt;h2 id=&quot;how-do-we-predict-our-opponents-next-move&quot;&gt;How do we predict our opponent’s next move?&lt;/h2&gt;

&lt;p&gt;Our data set is 50,000 games in which we have the opponent’s sequence of cards played.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://mdlai.github.io/images/learning-hearthstone//data.png&quot; alt=&quot;png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;To do this we’re going to use n-grams.  An n-gram, similar to the word engram, stores our data in a structure from which we can easily recall relevant information.&lt;/p&gt;

&lt;p&gt;An n-gram has two components: the level and the order.  The level is the building block of the n-gram, its lowest-level unit.  The order is the length of the n-gram, measured in those units.&lt;/p&gt;

&lt;p&gt;N-grams are contiguous subsequences pulled from a longer sequence.  Here’s what the n-grams look like for our data.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://mdlai.github.io/images/learning-hearthstone/n-gram.png&quot; alt=&quot;png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Using the n-grams for prediction is just a stone’s throw away.  Suppose we want to know what comes after a Leper Gnome based on our data.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://mdlai.github.io/images/learning-hearthstone/prediction.png&quot; alt=&quot;png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Based on our data we see two cards that occur after a Leper Gnome.  So we apply equal weight, and each has a 50% chance of being played next.&lt;/p&gt;
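&lt;p&gt;This count-and-normalize scheme is only a few lines of Python.  A minimal sketch, with arbitrary card sequences standing in for the real data:&lt;/p&gt;

```python
from collections import Counter, defaultdict

def build_ngram_model(games, order=2):
    # Count which card follows each (order - 1)-card context across all games.
    model = defaultdict(Counter)
    for seq in games:
        for i in range(len(seq) - order + 1):
            context = tuple(seq[i:i + order - 1])
            model[context][seq[i + order - 1]] += 1
    return model

def predict(model, context):
    # Normalize the counts seen after this context into probabilities.
    counts = model[tuple(context)]
    total = sum(counts.values())
    return {card: n / total for card, n in counts.items()}

games = [
    ["Leper Gnome", "Knife Juggler", "Wolfrider"],
    ["Leper Gnome", "Ironbeak Owl", "Wolfrider"],
]
model = build_ngram_model(games)
print(predict(model, ["Leper Gnome"]))  # each follower gets probability 0.5
```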

&lt;p&gt;Applying this model to Hearthstone requires a slight modification.  Since we can’t interrupt our opponent’s turn it’s more important to predict what their next turn will be rather than what card they’ll play next.  The application then looks like the following.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://mdlai.github.io/images/learning-hearthstone/application.png&quot; alt=&quot;png&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The author who implemented this model made a few other small tweaks, and you can read more about it &lt;a href=&quot;https://www.elie.net/blog/hearthstone/predicting-hearthstone-opponent-deck-using-machine-learning&quot; title=&quot;Predicting Hearthstone&quot;&gt;here!&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;N-grams have other applications in things like spam filters, text prediction, voice recognition, and more.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Turning Over MTA Data</title>
   <link href="http://mdlai.github.io/2016/01/14/Turning-Over/"/>
   <updated>2016-01-14T00:00:00+00:00</updated>
   <id>http://mdlai.github.io/2016/01/14/Turning-Over</id>
   <content type="html">&lt;p&gt;For my first data science project at Metis, I worked with a group tasked finding the best MTA station for the purpose of placing a kiosk and promoting a Women in Technology gala.&lt;/p&gt;

&lt;h2 id=&quot;goal&quot;&gt;Goal&lt;/h2&gt;
&lt;p&gt;The focus of the project was to use MTA turnstile data to find the optimal station.  To this end we decided the goal of the kiosk should be to reach the highest quantity of high-quality attendees.&lt;/p&gt;

&lt;h2 id=&quot;approach&quot;&gt;Approach&lt;/h2&gt;
&lt;p&gt;The quantity was defined simply as the foot traffic in a station.  The quality was defined as the median donation value of the surrounding area, which came from a 2012 study by a different organization.  Because the gala is scheduled for the summer, we chose data from April and May to find the best time to promote the event.&lt;/p&gt;

&lt;p&gt;Our methodology was to first find the top 5 foot traffic stations, then cross them with the median donation value for their respective areas.&lt;/p&gt;

&lt;p&gt;&lt;span class=&quot;image right&quot;&gt; &lt;img src=&quot;/images/turning-over/top_stations.png&quot; alt=&quot;&quot; /&gt;&lt;/span&gt;&lt;/p&gt;

&lt;h2 id=&quot;results&quot;&gt;Results&lt;/h2&gt;
&lt;p&gt;The top 9 stations were as follows.  The 86th St, 125 St, and 96 St stations were dropped due to the way the data was aggregated: those station names cover multiple lines at different locations, so their foot traffic counts were summed across multiple locations.&lt;/p&gt;

&lt;p&gt;Next we crudely took the product of the median contribution and the foot traffic to get some idea of the overall quality of a station.&lt;br /&gt;&lt;/p&gt;
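&lt;p&gt;With made-up traffic and donation numbers, the crude metric looks like this:&lt;/p&gt;

```python
# Hypothetical station figures: average daily foot traffic and the
# median donation for the surrounding area (invented numbers, not the
# real MTA or donation data).
stations = {
    "Wall St": {"traffic": 45000, "median_donation": 120},
    "Grand Central": {"traffic": 90000, "median_donation": 40},
    "Union Sq": {"traffic": 70000, "median_donation": 55},
}

# Crude "quality" metric: foot traffic times median donation.
quality = {name: s["traffic"] * s["median_donation"] for name, s in stations.items()}
best = max(quality, key=quality.get)
print(best)  # Wall St
```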

&lt;p&gt;We ordered them by our “quality” metric and the Wall St station turned out to be our best bet.  So we broke Wall St’s foot traffic down by the week to find a best day to send the street team.&lt;/p&gt;

&lt;div class=&quot;12u$&quot;&gt;&lt;span class=&quot;image fit&quot;&gt;&lt;img src=&quot;/images/turning-over/donations.png&quot; alt=&quot;&quot; /&gt;&lt;/span&gt;&lt;/div&gt;

&lt;p&gt;It appears Wednesdays or Thursdays are our best bet.  It turns out that other stations in the top 7 appear to follow a similar traffic trend.&lt;/p&gt;

&lt;div class=&quot;12u$&quot;&gt;&lt;span class=&quot;image fit&quot;&gt;&lt;img src=&quot;/images/turning-over/wall_st.png&quot; alt=&quot;&quot; /&gt;&lt;/span&gt;&lt;/div&gt;

&lt;p&gt;Some limitations of the analysis include the fact that median contributions for an area and MTA ridership may only be loosely connected.  Furthermore, the fact that a kiosk would be trying to catch a lot of people in a hurry was not considered.&lt;/p&gt;

&lt;h2 id=&quot;data-stuff&quot;&gt;Data Stuff&lt;/h2&gt;
&lt;p&gt;From my novice data cleaning perspective it was a significant challenge to correct the often screwed up turnstile data.  It was necessary to do things such as interpolating values when turnstiles reset, or deleting values when they bizarrely went backwards.  Automating all that was a nice learning experience.&lt;/p&gt;
</content>
 </entry>
 

</feed>
