Module 1
Recordings
Recordings of each session are posted as soon as possible after the session ends, generally within 24 hours. Links to the recordings appear below, and the home directory that hosts them is available here. Cohort 4 recordings will be added below as they become available.
Day 1 | Monday, January 27, 2025
Course Introduction (45 min)
- Welcome
- Introductions
- What is Data Science?
- What is Earth System Data Science in the Cloud?
- Course Goals and Objectives
- Module Goals and Objectives
- Course Logistics
Introduction to Command Line (60 min)
- What is Bash?
- What is an Environment?
- Navigating your Environment
- Manipulating your Environment
- Command Line Text Editors
Introduction to Python (60 min)
- Launching Python
- Versions
- Python as a Calculator
- Variable Assignment
- Beginning Data Structures
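As a preview of the topics above, here is a minimal sketch of Python used as a calculator, followed by variable assignment and two beginning data structures. The station name and all values are hypothetical and chosen only for illustration; they are not course data.

```python
# Python as a calculator
total = 2 + 3 * 4   # multiplication binds tighter than addition
ratio = 7 / 2       # true division always returns a float
whole = 7 // 2      # floor division drops the fractional part

# Variable assignment (hypothetical example values)
station = "example_station"
temperature_c = 21.5
temperature_f = temperature_c * 9 / 5 + 32

# Beginning data structures
readings = [20.1, 21.5, 19.8]                  # list: ordered and mutable
site = {"name": station, "elevation_m": 120}   # dictionary: key/value pairs
warmest = max(readings)

print(total, ratio, whole)
print(site["name"], round(temperature_f, 1), warmest)
```

Typing these lines one at a time at the Python prompt, rather than running them as a script, shows each result immediately.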
Lunch & Learn (60 min)
1200-1300 January 27, 2025
- Introductions
- Personal Goals
- Ask Me Anything
Day 2 | Tuesday, January 28, 2025
Introduction to Git (60 min)
- Version Control
- Git Architecture
- Add, Commit, Push
Meet your Programming Assistant (60 min)
- Intro to LLMs
- Code Completion
- Best Practices for Communicating:
  - Generation
  - Correction
  - Documentation
  - Translation
Learning about Learning (30 min)
How do we learn programming languages?
- Read-Evaluate-Print Loop (REPL)
- Iteration
- Random Contextual Interference
- Programming Assistants
Programming in Python (30 min)
- Leveraging REPL for fast iteration
- Best Practices for Script Construction
- Comments and In-line Documentation
- Beginnings of Workflow Management
Lunch & Learn (60 min)
1200-1300 January 28, 2025
- Progress Check-In
- Questions
Fundamentals of Computing (60 min)
- What is a programming language?
  - Types of Programming Languages
  - Compiled and Scripted
  - Interpreters
  - Advantages and Disadvantages
  - The Programming Language Landscape
- What is a Computer?
  - Storage
  - Memory (RAM)
  - Compute
  - Networking
- Python
  - History
  - Advantages and Disadvantages
  - Parallelization (Thread Lock)
  - Support for Different Data Types
  - Support for Different Types of Analysis
Programming Paradigms (60 min)
- Objects
- Functions!
- Classes!
- Object Oriented vs Functional Programming
Day 3 | Wednesday, January 29, 2025
Collaborative Git I (60 min)
- GitLab & Git
- Cloning, pulling, and pushing
- Branching, Merging, and Issues
- Communication & Project Structure
Leveraging LLMs (60 min)
- Large Language Model (LLM) Landscape
- Accessing and using LLMs
- Learning with and from LLMs
- Prompt Engineering
Dependency Management in Python (30 min)
- Install Libraries/Packages
- Importing Libraries/Packages
- Managing Libraries/Packages
- Environment Management
- Python Library Ecosystem
Data Types | Python (30 min)
- Scalars
- Vectors
- Arrays/Matrices
- Data Frames
- Indexing
Lunch & Learn (60 min)
1200-1300 January 29, 2025
- Technical Session: Why Scientists say what they do...
Control Structures (60 min)
- Loops
- If/Then
- Case
- Try/Except
- Decorators
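The control structures listed above can be previewed in one short sketch. It combines a loop, an if/elif/else branch, a try/except block, and a decorator. The names `log_calls` and `safe_ratio` are hypothetical helpers invented for this illustration. Python's `match` statement (its case construct, available in Python 3.10 and later) is left out so the sketch also runs on older versions.

```python
import functools


def log_calls(func):
    """Decorator: record the name of each call before running func."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        log_calls.history.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper


# Store the call log as an attribute on the decorator itself
log_calls.history = []


@log_calls
def safe_ratio(numerator, denominator):
    """Divide two numbers, returning None instead of raising on zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return None


labels = []
for value in [4, 0, -2]:            # loop over a list
    result = safe_ratio(8, value)
    if result is None:              # branch on the outcome
        labels.append("undefined")
    elif result > 0:
        labels.append("positive")
    else:
        labels.append("negative")

print(labels)
print(log_calls.history)
```

The decorator wraps `safe_ratio` without changing its body, which is why the call log fills in even though the function never mentions it.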
Day 4 | Thursday, January 30, 2025
Introduction to Production Machine Learning (60 min)
- Models
- APIs
- Deployment
Collaborative Git I Continued (60 min)
- GitLab & Git
- Cloning, pulling, and pushing
- Branching, Merging, and Issues
- Communication & Project Structure
Team Kickoff (30 min)
- Introductions
- Research Theme Definition
- Personal and Project Goals
Working in Teams (60 min)
- Communication
- Organization
- Roles
- ESDS Team Projects
Lunch & Learn (60 min)
1200-1300 January 30, 2025
- Team Project Discussion
LLMs, Data, and Agents (60 min)
- LLM Context
- Data Sources, Projects, and Retrieval Augmented Generation
- Building and Using Agents
Beginning a Project (30 min)
- Defining a Project
- Choosing a Language
- Finding Data
- Accessing Data
- Introduction to Data Formats
Team Time (60 min)
1430-1500 January 30, 2025
- Meet your team
- Beginning Team Project Discussions
Day 5 | Friday, January 31, 2025
Collaborative Git II (60 min)
- GitLab & Git
- Cloning, pulling, and pushing
- Branching, Merging, and Issues
- Communication & Project Structure
Foundations of Parallel Computing (30 min)
- What is parallel computing?
- Units of parallelization
- MapReduce
- Why MapReduce changed the world
- The two key types of Parallel Computing
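To make the MapReduce idea concrete, here is a minimal single-process sketch of the pattern: a map step that processes each input independently, and a reduce step that merges the partial results. It is only an illustration of the pattern's shape, not a distributed framework, and the sample documents and the function names `map_phase` and `reduce_phase` are hypothetical.

```python
from collections import Counter
from functools import reduce

# Hypothetical inputs; each string stands in for one chunk of a larger dataset
documents = ["cloud data science", "earth system data", "data in the cloud"]


def map_phase(text):
    """Map: count the words in one document, independently of the others."""
    return Counter(text.split())


def reduce_phase(left, right):
    """Reduce: merge two partial word counts into one."""
    return left + right


# Each map call touches only its own input, so this step could run in parallel
partial_counts = map(map_phase, documents)

# The reduce step folds the partial results into a single total
word_counts = reduce(reduce_phase, partial_counts, Counter())

print(word_counts.most_common(2))
```

Because no map call depends on another, the unit of parallelization here is a single document; only the reduce step needs to see more than one result at a time.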
Closing Team Exercise (60 min)
- Closing Exercise
- Module Wrap Up
- Next Steps