Heimdall Lab

Lab runs Python analytics on your Lake tables — profile data, chart it, and save transforms back to silver.

Before you start

At least one bronze or silver table in Lake
Sidebar → Data Warehouse → Lab (/lab)
Part of Path C when data quality is unknown

Product page

See heimdallapp.org/product/lab for overview and use cases.

Loading data in code

Lab follows the same pattern as Databricks (spark.table()), Hex, and Observable — you load tables in code, not through a separate import UI.

list_tables()                              # print your bronze/silver catalog

orders = lake("bronze", "Q1 Upload")       # load a bronze table
clean = lake("silver", "Clean Orders")     # load a silver dataset

result = orders.groupby("category")["amount"].sum().reset_index()
result

Clicking a table in the Lake catalog sidebar inserts the matching lake() snippet. Table names must match the catalog name in full; only letter case is ignored.
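Outside Lab, where lake() is not available, the aggregation in the example above can be tried on a small stand-in DataFrame. This is a minimal sketch: the rows are invented for illustration, and only the column names category and amount come from the example.

```python
import pandas as pd

# Stand-in for lake("bronze", "Q1 Upload"); these rows are invented for illustration.
orders = pd.DataFrame(
    {
        "category": ["books", "games", "books", "toys"],
        "amount": [12.0, 30.0, 8.0, 5.5],
    }
)

# Same aggregation as the Lab example: total amount per category.
result = orders.groupby("category")["amount"].sum().reset_index()
print(result)
```

reset_index() turns the category labels back into an ordinary column, so the result is a flat table with one row per category.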

What you can do

Load up to 6 tables per cell via lake()
Run pandas, numpy, matplotlib, seaborn, and plotly in code cells
Preview tables and charts inline
Save results as a new silver dataset (never overwrites existing data)
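Since several tables can be loaded in one cell, they can also be combined there. The sketch below assumes lake() returns a pandas DataFrame (the earlier example calls groupby on its result) and uses two invented stand-in DataFrames in place of lake() loads. All column names here are hypothetical.

```python
import pandas as pd

# Stand-ins for two lake() loads; in Lab these would come from the catalog.
orders = pd.DataFrame(
    {
        "order_id": [1, 2, 3],
        "customer_id": [10, 11, 10],
        "amount": [20.0, 35.0, 15.0],
    }
)
customers = pd.DataFrame(
    {
        "customer_id": [10, 11],
        "region": ["north", "south"],
    }
)

# Left join keeps every order; validate raises an error if customer_id
# is not unique in the customers table, which would silently duplicate rows.
joined = orders.merge(customers, on="customer_id", how="left", validate="many_to_one")

# Total amount per region from the joined table.
by_region = joined.groupby("region")["amount"].sum().reset_index()
print(by_region)
```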

Workflow

Open Lab from the sidebar.
Create a notebook and run list_tables() to see available data.
Load tables with lake("bronze", "name") or click tables in the catalog sidebar.
Run cells to explore and visualize.
Use Save result as silver when you want to persist a DataFrame output to the Lake catalog.
Publish gold from Lake when ready for ML or Forecast.

Profiling bronze → silver

Use Lab between bronze ingest and gold publish when you need to inspect quality before modeling:

Profile bronze — load with lake("bronze", "Q1 Upload"), check dtypes, null counts, and distributions with pandas.
Chart outliers — plot key columns with matplotlib, seaborn, or plotly to spot bad rows or skew.
Save to silver — filter, dedupe, or derive columns in code, then Save result as silver (creates a new catalog entry; never overwrites bronze).
Publish gold — when the silver dataset is model-ready, switch to Lake and publish for modeling.
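The profile and transform steps above can be sketched with plain pandas. This is an illustration under stated assumptions, not Lab's own implementation: the bronze DataFrame is an invented stand-in for a lake("bronze", ...) load, and its columns and cleaning rules are hypothetical.

```python
import pandas as pd

# Stand-in for a bronze table; rows invented to include a null,
# an exact duplicate, and a negative amount.
bronze = pd.DataFrame(
    {
        "order_id": [1, 2, 2, 3, 4],
        "category": ["books", "games", "games", None, "toys"],
        "amount": [12.0, 30.0, 30.0, 8.0, -5.0],
    }
)

# Profile: column types, nulls per column, and summary statistics.
print(bronze.dtypes)
null_counts = bronze.isna().sum()
print(null_counts)
print(bronze["amount"].describe())

# Transform: drop exact duplicate rows, drop rows missing a category,
# keep non-negative amounts, and derive a flag column.
silver_candidate = (
    bronze.drop_duplicates()
    .dropna(subset=["category"])
    .query("amount >= 0")
    .assign(is_large=lambda df: df["amount"] >= 20)
    .reset_index(drop=True)
)
print(silver_candidate)
```

In Lab, a DataFrame like silver_candidate is the kind of output you would persist with Save result as silver.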

This matches the exploratory profiling and ad-hoc transforms use cases on the marketing site.

Security model

Cells are read-only — they cannot modify bronze, silver, or gold tables.
Code runs in an isolated worker with no AWS credentials or network access.
Saving to silver is a separate explicit action that only creates new datasets.

Limits

30-second execution timeout per cell
Up to 6 lake() loads per run
Allowed imports: pandas, numpy, matplotlib, seaborn, plotly, and basic stdlib helpers

Common mistakes

Mistake	Fix
`lake()` can't find table	Run `list_tables()` and copy the name from the catalog; only letter case is ignored
Cell timeout	Reduce data scanned or aggregate before plotting
Skipped silver save	Bronze is never overwritten — use Save result as silver to persist transforms
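The cell timeout fix can be sketched as follows: reduce the data to one row per group before handing it to a plotting library, so the chart draws a handful of bars instead of every row. The data below is synthetic, and the plotting call is left as a comment so the sketch runs with pandas and numpy alone.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for a large table (100,000 rows); values are random, not real data.
rng = np.random.default_rng(seed=0)
events = pd.DataFrame(
    {
        "category": rng.choice(["books", "games", "toys"], size=100_000),
        "amount": rng.gamma(shape=2.0, scale=10.0, size=100_000),
    }
)

# Aggregate first: one row per category instead of 100,000 points.
summary = (
    events.groupby("category")["amount"]
    .agg(total="sum", mean="mean", rows="count")
    .reset_index()
)
print(summary)

# The small summary is then cheap to chart, for example:
# summary.plot.bar(x="category", y="total")
```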

Next steps

Publish gold when data is model-ready
User journeys Path C
Clean and combine — UI alternative to Lab for simple joins
