Module 4

Overview | Module 4

Welcome to Module 4 | Production Data Science. In this module, we will build on the skills developed in the previous modules and begin bringing our work to production.

Specifically, this module focuses on disseminating and sharing scientific insights in a professional, reproducible manner. We will cover best practices for data visualization, visualization at scale, pipeline development and testing, coding (including peer review), and communication strategies for effective scientific development.

By the end of this module, you will be familiar with and conversant in the following areas:

  • Visualizing Earth Systems data at scale.
  • Production Machine Learning in the cloud.
  • End-to-End Pipeline Development.
  • Effective and Efficient Publication Development.

Specifically, by the end of the module, you will have accomplished the following:

  • Published a pipeline for dataset development
  • Developed visualizations of larger-than-memory data
  • Effectively documented your code
  • Launched a machine learning model using a REST API
  • Linted your code
  • Developed submission-quality reports using LaTeX and Overleaf
  • Communicated the results of your team projects in elevator pitches and formal presentations