Publish data for modeling (gold)
Time to complete: ~5 minutes
What you'll accomplish: Create a gold artifact that ML or Forecast can use in the training wizard.
Before you start
- A bronze or silver dataset in Lake (/data)
- Know your modeling goal: classification/regression (ML) or time series (Forecast)
Steps
- Sidebar → Lake → open a bronze or silver table in the catalog.
- Click Publish gold (in the detail panel or row actions).
- Enter a gold artifact name.
- Select a use case:
  - Machine learning: choose the target column to predict
  - Forecast: choose the time column; values must form a valid time series
- Click Validate — read errors in the dialog and fix source data if needed.
- Click Publish → artifact appears under Gold tab, status ready.
Start training from gold
| Product | UI path |
|---|---|
| ML | Machine Learning → New model → Published dataset (Gold) → select artifact |
| Forecast | Forecast → new build → select gold forecast dataset |
| Shortcut | Gold detail → Start ML project / forecast equivalent (pre-fills wizard) |
Requirements (verified in app)
| Use case | Minimum rows | Notes |
|---|---|---|
| ML | 100 | Target column required; needs variation in the target |
| Forecast | 30 | Parseable datetime + numeric value column |
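
To catch ML validation failures before opening the Publish dialog, the ML row above can be approximated locally. The following is a minimal sketch, not the app's actual validator: the function name `precheck_ml`, the sample data, and the reading of "variation" as "at least two distinct non-empty target values" are all assumptions made for illustration.

```python
import csv
import io

MIN_ML_ROWS = 100  # minimum row count from the requirements table


def precheck_ml(rows, target):
    """Return a list of problems; an empty list means the local checks passed.

    Hypothetical helper: it mirrors the documented requirements only and
    does not reproduce the in-app Validate step.
    """
    problems = []
    if len(rows) < MIN_ML_ROWS:
        problems.append(f"need at least {MIN_ML_ROWS} rows, found {len(rows)}")
    if not rows or target not in rows[0]:
        problems.append(f"target column {target!r} is missing")
        return problems
    # Assumption: "variation" means two or more distinct non-empty values.
    values = {row[target] for row in rows if row[target] not in ("", None)}
    if len(values) < 2:
        problems.append(f"target column {target!r} has no variation")
    return problems


# Small in-memory CSV standing in for an exported table (made-up data).
sample = "feature,label\n" + "\n".join(f"{i},{i % 2}" for i in range(120))
rows = list(csv.DictReader(io.StringIO(sample)))

print(precheck_ml(rows, "label"))       # []
print(precheck_ml(rows[:10], "label"))  # reports the row-count problem
```

A passing local check does not guarantee that Validate succeeds, since the app may apply further rules (for example on dtypes).
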
Common mistakes
| Mistake | Fix |
|---|---|
| Gold missing in ML wizard | Confirm Gold tab shows ready and ML use case was enabled at publish |
| Validation failed (ML) | Add rows (100+), fix target column, check dtypes |
| Validation failed (Forecast) | Ensure datetime parses; value column is numeric |
| Skipped gold for "speed" | Direct CSV upload in ML wizard is disabled — gold is required |
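
For the Forecast validation failure in the table above, a rough local check of the two documented conditions (parseable datetime, numeric value column) might look like the sketch below. It assumes ISO 8601 timestamps and uses Python's `datetime.fromisoformat`; the formats the app itself accepts are not stated here, and the names `precheck_forecast`, `ts`, and `value` are hypothetical.

```python
from datetime import datetime

MIN_FORECAST_ROWS = 30  # minimum row count from the requirements table


def precheck_forecast(rows, time_col, value_col):
    """Return a list of problems; an empty list means the local checks passed.

    Hypothetical helper: only the first offending row per column is reported.
    """
    problems = []
    if len(rows) < MIN_FORECAST_ROWS:
        problems.append(
            f"need at least {MIN_FORECAST_ROWS} rows, found {len(rows)}"
        )
    for i, row in enumerate(rows):
        try:
            # Assumption: timestamps are ISO 8601 strings.
            datetime.fromisoformat(row[time_col])
        except (KeyError, TypeError, ValueError):
            problems.append(f"row {i}: {time_col!r} is not a parseable datetime")
            break
    for i, row in enumerate(rows):
        try:
            float(row[value_col])
        except (KeyError, TypeError, ValueError):
            problems.append(f"row {i}: {value_col!r} is not numeric")
            break
    return problems


# Made-up daily series with 31 rows.
series = [{"ts": f"2024-01-{d:02d}", "value": str(d * 1.5)} for d in range(1, 32)]
print(precheck_forecast(series, "ts", "value"))  # []

# Break one timestamp and one value to see both checks fire.
broken = [dict(r) for r in series]
broken[3]["ts"] = "01/04/2024"
broken[5]["value"] = "n/a"
print(precheck_forecast(broken, "ts", "value"))
```

Note that `float` accepts strings such as `"nan"`, so this sketch is more permissive than a strict numeric check.
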