IntroDataScience

Marimo Notebooks

Learn how to use Marimo, the reactive Python notebook.


1. What is Marimo?

Marimo is a reactive Python notebook that runs as pure Python files (.py), not JSON blobs (.ipynb).

Key Features

Reactive Execution 🔄

Pure Python Files 📄

Reproducible 🎯

Interactive 🎨


2. Why Marimo vs. Jupyter?

Marimo Advantages

| Feature | Jupyter | Marimo |
|---|---|---|
| File format | .ipynb (JSON) | .py (Python) |
| Git-friendly | ❌ Messy diffs | ✅ Clean diffs |
| Reactivity | ❌ Manual re-runs | ✅ Automatic updates |
| Hidden state | ⚠️ Common issue | ✅ Impossible |
| Execution order | ⚠️ Can be confusing | ✅ Always clear |
| IDE support | ⚠️ Limited | ✅ Full Python support |
| Sharing | .ipynb file | .py file or HTML |
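
The "hidden state" problem is easier to see in code. The sketch below does not use Jupyter or Marimo at all: it imitates a Jupyter-style kernel with a shared dictionary and `exec`, and the cell contents and variable names are made up for illustration. It shows how a deleted cell can leave a variable behind in memory, so the notebook works in the running session but fails when run from scratch.

```python
# Simulate a Jupyter-style kernel: every cell runs in one shared namespace,
# in whatever order the user chooses.
kernel_namespace = {}

cell_define = "threshold = 10"
cell_use = "result = threshold * 2"

exec(cell_define, kernel_namespace)  # user runs the defining cell
exec(cell_use, kernel_namespace)     # user runs the dependent cell

# The user now deletes the defining cell. The saved notebook no longer
# mentions `threshold`, but the running kernel still remembers it.
notebook_cells = [cell_use]
exec(cell_use, kernel_namespace)     # still works: this is hidden state
hidden_result = kernel_namespace["result"]

# A fresh run of the notebook as saved (what a colleague would get) fails.
fresh_namespace = {}
fresh_error = None
try:
    for source in notebook_cells:
        exec(source, fresh_namespace)
except NameError as exc:
    fresh_error = str(exc)

print(hidden_result)
print(fresh_error)
```

Marimo's documentation describes a different behavior: deleting a cell also removes its variables from memory, so the session and the file should not drift apart in this way.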

When to Use Each?

Use Marimo when:

Use Jupyter when:

For this course: We use Marimo because it's better for learning and collaboration!


3. Starting Marimo

Open an Existing Notebook

uv run marimo edit example_notebooks/01_python_basics.py

What happens:

  1. Marimo starts a local server
  2. Your browser opens automatically
  3. The notebook loads in the browser

Create a New Notebook

uv run marimo edit my_analysis.py

If the file doesn't exist, Marimo creates it for you.

Run a Notebook as an App (Read-Only)

uv run marimo run notebook.py

This runs the notebook as a read-only web app: viewers see the outputs and can use widgets, but the code is hidden by default and cells can't be edited.


4. Marimo UI Overview

When you open a Marimo notebook, you'll see:

Main Components

┌─────────────────────────────────────────┐
│ [▶] Run All    [+] Cell    [💾] Save   │  ← Toolbar
├─────────────────────────────────────────┤
│                                         │
│  Cell 1: Markdown                       │  ← Markdown cell
│  # My Data Analysis                     │
│                                         │
├─────────────────────────────────────────┤
│                                         │
│  Cell 2: Python Code                    │  ← Code cell
│  import polars as pl                    │
│  df = pl.read_csv("data.csv")           │
│                                         │
│  ┌───────────────────────┐              │
│  │ Output: 100 rows × 5  │              │  ← Cell output
│  └───────────────────────┘              │
├─────────────────────────────────────────┤
│                                         │
│  Cell 3: Interactive Plot               │  ← Another cell
│  ...                                    │
├─────────────────────────────────────────┤
│                                         │
│  📁 Files  📊 Variables  ⚙️ Settings   │  ← Sidebar
└─────────────────────────────────────────┘

Toolbar Buttons


5. Creating and Running Cells

Adding a Cell

Types of Cells

Code Cell:

import polars as pl
df = pl.read_csv("data/raw/students.csv")
df.head()

Markdown Cell:

# My Analysis

This is a **markdown** cell with _formatting_.

To make a cell Markdown:

Running Cells

Automatic execution:

What gets re-run:

Cell Output

Outputs appear below the cell:


6. Reactive Execution Model

This is Marimo's superpower! Let's see it in action.

Example: Reactivity

Cell 1:

import marimo as mo
slider = mo.ui.slider(1, 100, value=50)
slider

Cell 2:

number = slider.value
print(f"The number is: {number}")

What happens: when you move the slider, Marimo automatically re-runs Cell 2, which prints the new value.

How It Works

Marimo tracks dependencies:

# Cell A
x = 10

# Cell B (depends on A)
y = x * 2

# Cell C (depends on B)
z = y + 5

If you change x in Cell A:

  1. Cell B re-runs (depends on x)
  2. Cell C re-runs (depends on y)
  3. You always see current results!
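
The idea of tracking dependencies can be sketched in a few lines of plain Python. This is a toy model, not Marimo's actual implementation: it uses the standard library's `ast` module to find which names each cell assigns and reads, then walks those relationships to decide which cells are stale after a change. The helper names `defs_and_refs` and `cells_to_rerun` are made up for this example.

```python
import ast


def defs_and_refs(source):
    """Return (names assigned, names read but not assigned) for one cell."""
    defined, referenced = set(), set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Name):
            if isinstance(node.ctx, ast.Store):
                defined.add(node.id)
            else:
                referenced.add(node.id)
    return defined, referenced - defined


# The three cells from the example above.
cells = {"A": "x = 10", "B": "y = x * 2", "C": "z = y + 5"}
info = {name: defs_and_refs(src) for name, src in cells.items()}


def cells_to_rerun(changed):
    """List every cell that directly or indirectly reads names set by `changed`."""
    stale, frontier = [], [changed]
    while frontier:
        current_defs = info[frontier.pop(0)][0]
        for name, (_, refs) in info.items():
            if name != changed and name not in stale and refs & current_defs:
                stale.append(name)
                frontier.append(name)
    return stale


print(cells_to_rerun("A"))  # ['B', 'C']
print(cells_to_rerun("B"))  # ['C']
print(cells_to_rerun("C"))  # []
```

Because the relationships come from variable names rather than from the position of cells on the page, the order in which cells appear does not affect which ones re-run.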

Benefits


7. Exporting Notebooks

Export to HTML

uv run marimo export html notebook.py -o output.html

Result:

Export to Python Script

Marimo notebooks ARE Python files! Just run them:

uv run python notebook.py
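
To see why plain `python` can run a notebook, it helps to look at what a saved notebook contains. The sketch below embeds an approximation of that file layout as a string (the exact boilerplate varies between Marimo versions, so treat the layout as an assumption) and uses the standard library's `ast` module to confirm that it parses as ordinary Python and to list each cell's inputs.

```python
import ast

# Approximate shape of a saved Marimo notebook (an assumption for illustration;
# real files contain extra version-specific boilerplate).
notebook_source = '''
import marimo

app = marimo.App()


@app.cell
def _():
    import marimo as mo
    return (mo,)


@app.cell
def _(mo):
    slider = mo.ui.slider(1, 100, value=50)
    slider
    return (slider,)


if __name__ == "__main__":
    app.run()
'''

# Parsing succeeds only because the notebook is valid Python source.
tree = ast.parse(notebook_source)

# Each cell is a function decorated with @app.cell. Its parameters are the
# variables it reads; its return tuple lists the variables it defines.
cell_inputs = [
    [arg.arg for arg in node.args.args]
    for node in tree.body
    if isinstance(node, ast.FunctionDef)
    and any(ast.unparse(dec) == "app.cell" for dec in node.decorator_list)
]
print(cell_inputs)  # [[], ['mo']]
```

The `if __name__ == "__main__":` block at the end of the file is what lets `python notebook.py` execute the cells as a script.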

Share as Marimo App

Serve the notebook as a read-only app:

uv run marimo run notebook.py

Others who can reach the server can view and interact with the notebook, but not edit it.


8. Best Practices for Organizing Code

One Concept Per Cell

Good:

# Cell 1: Load data
df = pl.read_csv("data.csv")

# Cell 2: Clean data
df_clean = df.filter(pl.col("age") > 0)

# Cell 3: Analyze
summary = df_clean.group_by("category").agg(pl.mean("value"))

Avoid:

# Cell 1: Everything at once
df = pl.read_csv("data.csv")
df_clean = df.filter(pl.col("age") > 0)
summary = df_clean.group_by("category").agg(pl.mean("value"))
plot = px.bar(summary)

Use Descriptive Variable Names

Good:

student_df = pl.read_csv("students.csv")
high_scorers = student_df.filter(pl.col("score") > 85)

Avoid:

df = pl.read_csv("students.csv")
df2 = df.filter(pl.col("score") > 85)

Add Markdown Explanations

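# Cell 1 (Markdown): stored in the .py file as a call to mo.md
# (assumes `import marimo as mo` in an earlier cell)
mo.md("""
# Data Loading
Loading the sales dataset from Q4 2024.
""")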

# Cell 2 (Python)
sales = pl.read_csv("sales_q4.csv")

Use markdown headers to organize:

# Data Loading

... cells for loading data ...

# Data Cleaning

... cells for cleaning ...

# Analysis

... cells for analysis ...

# Visualization

... cells for plots ...

9. Sharing and Version Control

Git-Friendly

Marimo notebooks are .py files, so Git works perfectly!

What to commit:

git add my_analysis.py         # The notebook
git add data/raw/input.csv     # Input data
git add .gitignore             # Git configuration

What NOT to commit:

Viewing Diffs

Git diffs are readable:

  import polars as pl

- df = pl.read_csv("old_data.csv")
+ df = pl.read_csv("new_data.csv")

  df.head()

Compare this to Jupyter's JSON mess!
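
The difference can be measured with a small experiment. The sketch below uses only the standard library: it stores the same one-line edit as plain Python lines and as a minimal .ipynb-style JSON document, then counts the changed lines in each diff. The JSON structure is a simplified stand-in for a real notebook file, which would also carry outputs and metadata.

```python
import difflib
import json


def as_ipynb(source_lines, execution_count):
    """Build a minimal .ipynb-like JSON document holding one code cell."""
    notebook = {
        "cells": [{
            "cell_type": "code",
            "execution_count": execution_count,
            "metadata": {},
            "outputs": [],
            "source": source_lines,
        }],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 5,
    }
    return json.dumps(notebook, indent=1).splitlines()


def changed_lines(before, after):
    """Count added and removed lines in a unified diff, ignoring file headers."""
    diff = difflib.unified_diff(before, after, lineterm="")
    return sum(
        1 for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    )


old_code = ['import polars as pl\n', 'df = pl.read_csv("old_data.csv")\n', 'df.head()']
new_code = ['import polars as pl\n', 'df = pl.read_csv("new_data.csv")\n', 'df.head()']

# The same edit, stored two ways. Re-running the cell in Jupyter also bumps
# execution_count, which shows up as extra noise in the JSON diff.
py_changes = changed_lines(
    [line.rstrip("\n") for line in old_code],
    [line.rstrip("\n") for line in new_code],
)
ipynb_changes = changed_lines(as_ipynb(old_code, 1), as_ipynb(new_code, 2))

print(py_changes, ipynb_changes)
```

In a real .ipynb file the gap is usually wider, because stored cell outputs (tables, images) change as well.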

Collaborating

Team workflow:

  1. Person A: Creates notebook, commits to Git
  2. Person B: Pulls from Git, opens with uv run marimo edit notebook.py
  3. Person B: Makes changes, commits
  4. Person A: Pulls changes, sees readable diff
  5. Merge conflicts are rarer, and when they do happen they are in plain Python and easy to resolve 🎉

10. Interactive Widgets and Sliders

Marimo makes interactive notebooks easy!

Import Marimo UI

import marimo as mo

Slider

# Create slider
age_slider = mo.ui.slider(0, 100, value=25, label="Age")
age_slider

# Use slider value (in a separate cell, so it re-runs when the slider moves)
selected_age = age_slider.value
print(f"Selected age: {selected_age}")

Dropdown

category_dropdown = mo.ui.dropdown(
    options=["Electronics", "Clothing", "Books"],
    value="Electronics",
    label="Category"
)
category_dropdown

Text Input

name_input = mo.ui.text(placeholder="Enter name", label="Name")
name_input

Checkbox

filter_checkbox = mo.ui.checkbox(label="Show high scorers only")
filter_checkbox

Using Widgets in Analysis

# Cell 1: Create slider
import marimo as mo
threshold = mo.ui.slider(0, 100, value=75, label="Score Threshold")
threshold

# Cell 2: Filter based on slider (reactive!)
import polars as pl
df = pl.read_csv("data/raw/students.csv")
filtered = df.filter(pl.col("test_score") > threshold.value)
print(f"Students above {threshold.value}: {len(filtered)}")

Move the slider → Cell 2 updates automatically!


Common Patterns

Pattern 1: Load → Clean → Analyze → Visualize

# Cell 1: Load
import polars as pl
df = pl.read_csv("data.csv")

# Cell 2: Clean
df_clean = df.drop_nulls()

# Cell 3: Analyze
summary = df_clean.group_by("category").agg(pl.mean("value"))

# Cell 4: Visualize
import plotly.express as px
fig = px.bar(summary, x="category", y="value")
fig

Pattern 2: Interactive Exploration

# Cell 1: Create widgets
import marimo as mo
category = mo.ui.dropdown(options=["A", "B", "C"], value="A")
threshold = mo.ui.slider(0, 100, value=50)
mo.hstack([category, threshold])

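# Cell 2: Filter reactively (pl and df come from a data-loading cell, as in Pattern 1)
filtered = df.filter(
    (pl.col("category") == category.value) &
    (pl.col("score") > threshold.value)
)
filtered

# Cell 3: Plot reactively (px is plotly.express, imported as in Pattern 1)
fig = px.scatter(filtered, x="x", y="y")
fig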

Pattern 3: Parameterized Analysis

# Cell 1: Parameters
import marimo as mo
params = mo.ui.dictionary({
    "data_file": mo.ui.text(value="data.csv"),
    "min_score": mo.ui.slider(0, 100, value=70),
    "group_by": mo.ui.dropdown(options=["A", "B"], value="A")
})
params

# Cell 2: Run analysis with parameters
import polars as pl
df = pl.read_csv(params.value["data_file"])
result = df.filter(
    pl.col("score") > params.value["min_score"]
).group_by(params.value["group_by"]).len()  # .len() replaces the deprecated .count()
result
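
Pattern 3 amounts to passing a dictionary of parameters into an analysis step. The sketch below shows the same filter-then-count logic as a plain function so it can be tested outside the notebook. It deliberately swaps polars for the standard library's `csv` module, uses a plain dict in place of `params.value`, and runs on made-up sample data; the function name `run_analysis` is hypothetical.

```python
import csv
import io
from collections import Counter


def run_analysis(params, csv_text):
    """Keep rows above a minimum score, then count rows per group."""
    rows = csv.DictReader(io.StringIO(csv_text))
    kept = (row for row in rows if float(row["score"]) > params["min_score"])
    return Counter(row[params["group_by"]] for row in kept)


# Made-up data standing in for the CSV file named by params["data_file"].
sample_csv = "A,B,score\nx,p,90\nx,q,60\ny,p,75\ny,q,80\n"

# A plain dict standing in for params.value from the widget dictionary.
params = {"min_score": 70, "group_by": "A"}
counts = run_analysis(params, sample_csv)
print(dict(counts))  # {'x': 1, 'y': 2}
```

Keeping the analysis in a function that takes its parameters explicitly means the notebook cell only has to pass in the current widget values.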

Keyboard Shortcuts

| Action | Shortcut |
|---|---|
| Run cell | Shift + Enter |
| Run and stay | Ctrl/Cmd + Enter |
| New cell below | B |
| New cell above | A |
| Delete cell | D D (press D twice) |
| Convert to Markdown | M |
| Convert to Code | Y |
| Save | Ctrl/Cmd + S |
| Command palette | Ctrl/Cmd + K |

Troubleshooting

Notebook Won't Open

Problem: uv run marimo edit notebook.py shows an error

Solutions:

  1. Check you're in the right directory: ls should show pyproject.toml
  2. Make sure dependencies are installed: uv sync
  3. Try a different port: uv run marimo edit --port 8080 notebook.py
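
If you are not sure which port to try, a short script can look for one that is free. This is a minimal sketch using only the standard library; the helper names and the starting port 8080 are choices made for this example, and it only checks the local loopback address.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing accepts a connection on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        # connect_ex returns 0 when the connection succeeds, i.e. the port is in use.
        return probe.connect_ex((host, port)) != 0


def first_free_port(start=8080, attempts=20):
    """Scan upward from `start` and return the first port that looks free."""
    for port in range(start, start + attempts):
        if port_is_free(port):
            return port
    raise RuntimeError(f"no free port in {start}-{start + attempts - 1}")


port = first_free_port()
print(f"uv run marimo edit --port {port} notebook.py")
```

The printed command can be pasted into the terminal as is.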

Cell Won't Run

Problem: Cell doesn't execute when you change it

Solutions:

  1. Check for syntax errors (red underlines)
  2. Force run: Press Shift + Enter
  3. Restart kernel: Click "Restart" in settings

Reactivity Not Working

Problem: Changing one cell doesn't update others

Solutions:

  1. Make sure you're referencing variables correctly
  2. Check for variable name typos
  3. Look at the dependency graph (in the sidebar)

Can't Import Package

Problem: ModuleNotFoundError: No module named 'package'

Solutions:

  1. Install the package: uv add package-name
  2. Restart the notebook
  3. Make sure you're using uv run marimo edit
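
A ModuleNotFoundError often means the notebook is running under a different Python environment than the one where the package was installed. The following sketch, meant to be pasted into a cell, uses only the standard library to show which interpreter is running and which import names it cannot find. The helper name `missing_packages` and the second package name are made up for the example.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the import names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# "json" ships with Python; the second name is a made-up package that should not exist.
missing = missing_packages(["json", "surely_not_an_installed_package"])

print(sys.executable)  # the interpreter, and therefore the environment, in use
print(missing)
```

If the interpreter path does not point inside your project's environment, start the notebook again with uv run marimo edit.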

Next Steps


Official Resources:


Marimo makes data analysis interactive, reproducible, and fun! Start coding! 🚀