Managing analysis workflows in geospatial data science with GNU Make


Martí Bosch wrote this guide on how to use Jupyter Notebooks while taking an iterative approach to both data analysis and software development. He also explains how to avoid some bad practices. Many issues can be settled by choosing helpful file names, organizing the project well, documenting the code, and keeping it under source control.

The article then describes an example from geospatial data science: analysis of the spatiotemporal patterns of urbanization.

It deals with:

  • Computational workflow approach
  • Automating the workflow with GNU Make
  • Detecting completed targets
  • Pattern rules and abstracting our workflow from the data
  • Detecting edited source files
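
To make the "completed targets" and "edited source files" items more concrete, here is a minimal Python sketch of the timestamp rule that Make applies: a target is rebuilt when it does not exist or when any prerequisite has a newer modification time. It is an illustration only, not code from the article; the function names `needs_rebuild` and `demo` and the file names `make_dataset.py` and `urban_extent.csv` are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def needs_rebuild(target: Path, prerequisites: list[Path]) -> bool:
    """Return True if `target` is missing or older than any prerequisite."""
    if not target.exists():
        return True
    target_mtime = target.stat().st_mtime
    return any(p.stat().st_mtime > target_mtime for p in prerequisites)


def demo() -> list[bool]:
    """Walk through three states of a one-step workflow (hypothetical files)."""
    with tempfile.TemporaryDirectory() as tmp:
        script = Path(tmp) / "make_dataset.py"   # hypothetical source file
        output = Path(tmp) / "urban_extent.csv"  # hypothetical target file

        script.write_text("# analysis code\n")
        os.utime(script, (1000, 1000))           # fix the mtime explicitly
        results = [needs_rebuild(output, [script])]      # target missing

        output.write_text("year,area\n")
        os.utime(output, (2000, 2000))           # target newer than source
        results.append(needs_rebuild(output, [script]))  # up to date

        os.utime(script, (3000, 3000))           # simulate editing the source
        results.append(needs_rebuild(output, [script]))  # stale again
        return results


if __name__ == "__main__":
    print(demo())  # [True, False, True]
```

Listing the analysis script itself as a prerequisite of its output is what lets an edit to the code trigger a rerun of only the steps that depend on it.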

All of the above comes with detailed code examples, illustrated charts, and animations. A link to the GitHub repository is included, too.

The main advantage of Make is its flexibility, which allows it to manage any kind of computational workflow precisely, from the compilation of source files to the urbanization analysis described above. Nice one!


Tags big-data machine-learning data-science miscellaneous python