Project Luther - MVP

Project Luther passes MVP.

I presented the MVP for Luther earlier today.

I was actually quite bewildered about what to do, given the sense that there was nothing really to show.

This is a bit of a trap, and there are plenty of interesting little traps like it. There’s the problem of perfectionism… where we keep improving or tweaking or correcting well beyond a reasonable timeframe. There are also plenty of pitfalls around misunderstanding the data or not fully appreciating the algorithms or tools being used.

Then, there is the entire issue of what IS an MVP.

The curriculum today actually included a few pointers and references on this topic. The idea, it seems, is to have progressed as far as possible through the goals of the project using a small sample of the data. This irks me somewhat, because for a large class of projects, once you can do something with a small bit of data, there isn’t much left to do in terms of scaling. In short, if you can crunch a few records, you’re already almost completely done.

In my case, the goal was to have completed data scraping and have done some regression. I had reached that point. I hadn’t done much beyond that.
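For what it’s worth, the scraping step looked roughly like the sketch below. The URL and the table structure here are placeholders rather than the actual source, so treat it as the shape of the work rather than the real code.

```python
import requests
from bs4 import BeautifulSoup

# Placeholder URL and selectors -- the real source isn't named here.
BASE_URL = "https://example.com/listings?page={}"

def scrape_page(page_num):
    """Fetch one listing page and return its rows as lists of cell text."""
    resp = requests.get(BASE_URL.format(page_num), timeout=10)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    rows = []
    for tr in soup.select("table tr")[1:]:  # skip the header row
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:
            rows.append(cells)
    return rows

# A few thousand raw records, accumulated page by page.
records = []
for page in range(1, 51):
    records.extend(scrape_page(page))
```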

Since some of my peers had indeed developed slide presentations, I followed suit. I chose to build the presentation in reveal.js from the beginning. This made it straightforward to put it on the blog immediately, since it was quite literally being served from a local instance of the blog.

I was able to switch immediately to the underlying Jupyter notebook to show the bit of regression I’d done…

which seemed to show almost no correlation at all between my target and the various features.
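A minimal sketch of that kind of notebook check, with placeholder file and column names, might look like this:

```python
import pandas as pd
import statsmodels.api as sm

# "luther_clean.csv" and "target" are stand-ins for the real data.
df = pd.read_csv("luther_clean.csv").select_dtypes("number").dropna()

y = df["target"]
X = sm.add_constant(df.drop(columns=["target"]))

# An ordinary least squares fit; a weak relationship shows up
# as a low R-squared and mostly insignificant coefficients.
model = sm.OLS(y, X).fit()
print(model.summary())
```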

The other aspect of this round of presenting MVPs was the time allotted. It was to be about two minutes.

For this particular project, the final presentations need to be similarly terse. We get a whopping five minutes.

For the MVP, I scraped about 5,000 records. After some cleaning and such, I had a bit more than 1,500 records. The greatest feature-target correlation was about 0.1.
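Ranking the features by their absolute correlation with the target is a one-liner in pandas. Here’s a minimal sketch, again with placeholder names, of the kind of check that turned up that 0.1:

```python
import pandas as pd

# Placeholder file and column names.
df = pd.read_csv("luther_clean.csv").select_dtypes("number")

target = df["target"]
features = df.drop(columns=["target"])

# Absolute feature-target correlations, strongest first.
corrs = features.corrwith(target).abs().sort_values(ascending=False)
print(corrs.head())  # for my data, the top value was only around 0.1
```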

Nonetheless, it was easy to chart out the “next steps”: more data and more analysis.
