Module 3

Upcoming Cohort

Cohort 4 of the Earth Systems Data Science in the Cloud Course will go through Module 2 the week of October 21, 2024. The information below contains the detailed course materials from the previous cohort and will be updated with the Cohort 4 schedule shortly.

Day 1 | Monday, March 25, 2024

Module 3 Introduction (30 min)

0900-0930 March 25, 2024

Content | Word | PDF | Zoom Recording

  • Welcome Back
  • Interim Check In
  • Where we are in Earth Systems Data Science in the Cloud
  • Course Goals and Objectives
  • Module Goals and Objectives
  • Course Logistics

AI/ML Overview (60 min)

0930-1030 March 25, 2024

Content | Word | PDF | Zoom Recording

  • All of Statistics
  • Understanding & Prediction
  • Bayesian & Frequentist
  • Machine Learning & AI
  • Unsupervised, Semi-Supervised, Supervised Machine Learning
  • Classification and Regression
  • Deep Learning and Not-so-Deep Learning
  • Types of data: Image, Text, Gridded, Tabular
  • Manual and Automated ML

Demo Day | Parallelization & ML (90 min)

1030-1200 March 25, 2024

Content | Word | PDF | Zoom Recording

  • Machine Learning: The Bad, the Good, and the Awesome!
  • Scale: Data Processing, Training, and Inference

Lunch and Learn

1200-1300 March 25, 2024

Word | PDF | Zoom

  • Individual and Team Progress Check In

Model Development Workflow (90 min)

1300-1430 March 25, 2024

Content | Word | PDF | Zoom Recording

  • Goals
  • Metrics
  • Training and Testing Data
  • Preparing your data
    • Information Leakage
    • Order of Operations
  • Training
    • Cross validation!
  • Optimization
  • Fitting
  • Out of Sample Performance
  • Workflow Evaluation
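
As a preview of the "Information Leakage" and "Order of Operations" topics above, the following is a minimal sketch rather than course material. It assumes a simple standardization step and shows one common ordering: split first, fit the preprocessing on the training split only, then apply the fitted parameters to both splits. The helper names and the data are hypothetical.

```python
import random
import statistics


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def fit_scaler(values):
    """Learn mean and standard deviation from the given values only."""
    std = statistics.pstdev(values)
    return statistics.mean(values), std if std > 0 else 1.0


def apply_scaler(values, mean, std):
    """Standardize values with previously fitted parameters."""
    return [(v - mean) / std for v in values]


# Hypothetical feature values, for illustration only.
data = [float(x) for x in range(1, 21)]

# 1. Split before any statistics are computed.
train, test = train_test_split(data)

# 2. Fit the preprocessing on the training split only, so that no
#    information from the test split leaks into the fitted parameters.
mean, std = fit_scaler(train)

# 3. Apply the same fitted parameters to both splits.
train_scaled = apply_scaler(train, mean, std)
test_scaled = apply_scaler(test, mean, std)
```

Fitting the scaler on the full data set before splitting would let test-set statistics influence training, which is the kind of leakage this ordering is meant to avoid.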

Team Project Module Goals (30 min)

1430-1500 March 25, 2024

Content | Word | PDF | Zoom Recording

  • Team Project Check-In
  • Team Project Goals
  • Presentation
  • Report

Team Project Work - Data Processing (60 min)

1500-1600 March 25, 2024

Word | PDF

  • Data Preprocessing

Day 2 | Tuesday, March 26, 2024

Feature Engineering (60 min)

0900-1000 March 26, 2024

Content | Word | PDF | Zoom Recording

  • Transformation
    • Imputation
    • Actual transformation
    • One-hot encoding
    • Outliers
    • Scaling
  • Extraction
    • Dimension Reduction
    • Clustering
    • Binning
  • Creation
    • Indexing
    • Orthogonal Matching
    • Lagging

Multi-Worker Parallelization | Dask (60 min)

1000-1100 March 26, 2024

Content | Word | PDF | Zoom Recording

  • Starting a Cluster
  • Running a Cluster
  • Troubleshooting a Cluster
  • Shutting Down a Cluster

Building Blocks of Machine Learning (60 min)

1100-1200 March 26, 2024

Content | Word | PDF | Zoom Recording

  • Linear Regression
  • Logistic Regression
  • K-Means Clustering

Lunch and Learn

1200-1300 March 26, 2024

Word | PDF | Zoom

  • Machine Learning Background Discussion

Machine Learning Metrics (30 min)

1300-1330 March 26, 2024

Content | Word | PDF | Zoom Recording

  • What they mean and how to interpret/implement them
  • Classification Metrics
  • Regression Metrics
  • Bias Variance Tradeoff
  • Stats Modeling criteria
  • Unbalanced data considerations
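
To illustrate why the "Classification Metrics" and "Unbalanced data considerations" topics above belong together, here is a minimal sketch with made-up labels. It assumes a binary problem in which only one of ten samples is positive and a hypothetical classifier that always predicts the majority class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives, false negatives, true negatives."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1, returning 0.0 where undefined."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up labels: nine negatives and one positive.
y_true = [0] * 9 + [1]
# Hypothetical classifier that always predicts the majority class.
y_pred = [0] * 10

tp, fp, fn, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision, recall, f1 = precision_recall_f1(tp, fp, fn)
```

Here accuracy looks strong even though the classifier never finds the positive sample, which shows up as zero recall and zero F1.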

Team Project Time

1330-1600 March 26, 2024

Word | PDF

  • Feature Engineering

Day 3 | Wednesday, March 27, 2024

The Tool Landscape (30 min)

0900-0930 March 27, 2024

Content | Word | PDF | Zoom Recording

  • scikit-learn
  • Darts
  • Deep Learning: TensorFlow, PyTorch, Keras
  • Hugging Face
  • AutoML Frameworks (Gluon, Canvas)

Serverless Multi-Worker Parallelization (150 min)

0930-1200 March 27, 2024

Content | Word | PDF | Zoom Recording

  • Container Orchestration
  • Cloud services
  • Servers vs Serverless
  • Lambda, Fargate, Lithops, Coiled, Modal

Lunch and Learn

1200-1300 March 27, 2024

Word | PDF | Zoom

Explainable AI (60 min)

1300-1400 March 27, 2024

Content | Word | PDF | Zoom Recording

  • What is XAI?
  • Principles
  • Applications
  • Techniques

Team Project Work

1400-1600 March 27, 2024

Word | PDF

  • Beginning ML

Day 4 | Thursday, March 28, 2024

Cross Validation (60 min)

0900-1000 March 28, 2024

Content | Word | PDF | Zoom Recording

  • Best Practices
  • Strategies
  • Validating across Space and Time
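
For the time half of "Validating across Space and Time", one widely used strategy is an expanding-window split, where every training fold ends before its test fold begins. The sketch below is one possible way to generate such folds in plain Python; the function name and its parameters are hypothetical, and it does not cover spatial blocking.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_indices, test_indices) with training always before testing."""
    first_test_start = n_samples - n_splits * test_size
    if first_test_start < 1:
        raise ValueError("not enough samples for the requested splits")
    for i in range(n_splits):
        test_start = first_test_start + i * test_size
        train_indices = list(range(test_start))
        test_indices = list(range(test_start, test_start + test_size))
        yield train_indices, test_indices


# Ten time-ordered samples, three folds, two test samples per fold.
splits = list(expanding_window_splits(n_samples=10, n_splits=3, test_size=2))
for train_indices, test_indices in splits:
    print("train:", train_indices, "test:", test_indices)
```

Unlike a shuffled k-fold split, no fold trains on observations that come after the ones it is tested on, so the score is closer to how a forecast would actually be used.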

Team Project Development (120 min)

1000-1200 March 28, 2024

Word | PDF

  • Report development
  • Presentation development

Lunch and Learn

1200-1300 March 28, 2024

Word | PDF | Zoom

  • Team Project Check In

ML Algorithms and Approaches (90 min)

1300-1430 March 28, 2024

Content | Word | PDF | Zoom Recording

  • Unsupervised
  • Semi-supervised
  • Supervised

Team Presentation Training (90 min)

1430-1600 March 28, 2024

Word | PDF

  • Clear Communication
  • Body Language
  • Storytelling
  • Methods Focus
  • Takeaways

Day 5 | Friday, March 29, 2024

Team Presentations | DPD (120 min)

0900-1100 March 29, 2024

Word | PDF | Zoom Recording

  • Capstone Presentations and Feedback

Module Wrap Up (30 min)

1100-1130 March 29, 2024

Content | Word | PDF | Zoom

  • Closing
  • Moving to Model Development
  • Interim Period
  • Next Steps