This week I discovered that image preprocessing is a ton of work.
My current project is creating a model that predicts digits using the MNIST data set.
To improve the performance of my model, I needed to touch up my digits with a two-fold strategy:
- Create a bounding box
- De-Skew the image
Easy, I thought. Even with no experience in image processing, how hard could it be? Well, after step one I was feeling pretty confident. A quick drop into the skimage library and my numbers were bounded and resized.
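    from skimage.filters import threshold_otsu
    from skimage.measure import regionprops
    from skimage.transform import resize
    import matplotlib.pyplot as plt

    thresh = threshold_otsu(image)
    image = image > thresh  # binarize: True where the digit is
    # regionprops returns a list; casting the mask to int treats the whole digit as one region
    binary = regionprops(image.astype(int))[0].image.astype(float)
    plt.matshow(resize(binary, (28, 28)))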
Then I thought about de-skewing, or rotating, my image. And I kept thinking. For three days I had no real clue where to go. After reading a ton of papers and consulting Professor Google countless times, I finally came to a workable solution.
I would use PCA to determine a principal axis vector and use that to calculate an angle and rotate my image. And it worked, sort of. Maybe I’ll get some better results next week.
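A minimal sketch of the PCA idea might look like the following. The helper names `principal_axis_angle` and `deskew` and the 0.5 threshold are assumptions made up for illustration, and it assumes a grayscale digit scaled to [0, 1]. It leans on `rotate` from skimage, which turns the image counter-clockwise for positive angles.

```python
import numpy as np
from skimage.transform import rotate


def principal_axis_angle(binary):
    """Angle in degrees between the digit's principal axis and vertical.

    Positive means the stroke leans to the right as it goes down the image.
    (Hypothetical helper, not part of skimage.)
    """
    rows, cols = np.nonzero(binary)
    coords = np.stack([rows, cols]).astype(float)  # shape (2, n_pixels)
    coords -= coords.mean(axis=1, keepdims=True)   # center on the digit
    cov = np.cov(coords)                           # 2x2 covariance of (row, col)
    eigvals, eigvecs = np.linalg.eigh(cov)         # eigenvalues in ascending order
    dr, dc = eigvecs[:, -1]                        # direction of largest variance
    if dr < 0:                                     # make the axis point downward
        dr, dc = -dr, -dc
    return np.degrees(np.arctan2(dc, dr))


def deskew(image, threshold=0.5):
    """Rotate the image so its principal axis is vertical.

    The threshold is an assumed value for an image scaled to [0, 1].
    """
    angle = principal_axis_angle(image > threshold)
    # A right-leaning axis needs a clockwise turn, hence the negative angle.
    return rotate(image, -angle, mode="constant", cval=0.0)
```

One possible reason this only works "sort of": for round digits like 0, the two eigenvalues are nearly equal, so the principal axis is poorly defined and the computed angle can jump around.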
If you have any methods or bits of knowledge you want to drop on me about image processing, I’d love to hear them.