How to Layout and Manage Your Machine Learning Project

Last Updated on June 7, 2016

Project layout is critical for machine learning projects, just as it is for software development projects. I think of it like language: a project layout organizes thoughts and gives you context for ideas, just as knowing the names for things gives you a basis for thinking.

In this post I want to highlight some considerations in the layout and management of your machine learning project. This is closely related to the goals of project and scientific reproducibility. There is no “best” way; you will want to select and adopt the practices that best suit your preferences and project requirements.
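To make the idea of a layout concrete, here is a minimal sketch in Python of one way to create a consistent folder skeleton at the start of a project. The folder names (`data/raw`, `data/processed`, `src`, `reports`, `models`) and the function name `create_project_skeleton` are my own assumptions for illustration, not a recommended standard; swap in whatever structure suits your project.

```python
from pathlib import Path
import tempfile

# Hypothetical folder names, chosen only for illustration.
DEFAULT_FOLDERS = ("data/raw", "data/processed", "src", "reports", "models")


def create_project_skeleton(root, folders=DEFAULT_FOLDERS):
    """Create each subfolder under root and return the list of folder paths."""
    root = Path(root)
    created = []
    for name in folders:
        path = root / name
        # parents=True builds intermediate folders such as "data";
        # exist_ok=True makes the call safe to repeat on an existing project.
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    # Demonstrate in a temporary directory so nothing is left behind.
    with tempfile.TemporaryDirectory() as tmp:
        for path in create_project_skeleton(Path(tmp) / "my_ml_project"):
            print(path.relative_to(tmp))
```

Because the function is safe to run more than once, you could call it at the top of a setup script so that every copy of the project starts from the same structure.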

Workflow Motivating Questions

Jeromy Anglim gave a presentation at the Melbourne R Users group in 2010 on the state of project layout for R. The video is a bit shaky, but it provides a good discussion of the topic.

I really like the motivating questions from Jeromy’s presentation, each of which asks how to:

Divide a project into files and folders?
Incorporate R analyses into a report?
Convert default R output into publication quality tables, figures, and text?
Build the final product?
Sequence the analyses?
Divide
