Database connections

Connect operational databases to Heimdall data workflows — either by landing tables in Lake or by calling ingestion APIs directly.

Lake bronze today

In the Data catalog, structured ingest supports CSV and Excel uploads. A database connector in the Lake UI is coming soon. Until then, export query results to CSV and upload the file with Add your first table.

Supported connectors

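| Source             | In Lake UI                     | Ingestion service API                                            |
|--------------------|--------------------------------|------------------------------------------------------------------|
| CSV / Excel export | Yes, via Add your first table  | N/A                                                              |
| PostgreSQL         | Coming soon in Lake UI         | POST /ingestion/postgres_import (legacy ML/Forecast upload)      |
| MySQL              | Coming soon in Lake UI         | POST /ingestion/mysql_connect, mysql_import                      |
| MariaDB            | Coming soon in Lake UI         | POST /ingestion/mariadb_connect, mariadb_import                  |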

The ingestion API paths above upload directly to ML or Forecast storage buckets — not the Lake medallion catalog. Prefer CSV → Lake bronze → gold for new projects.

Export from PostgreSQL → Lake bronze

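import pandas as pd
import psycopg2

# Open a connection to the source database (placeholder credentials).
connection = psycopg2.connect(
    host='your-postgres-host',
    user='your-username',
    password='your-password',
    database='your-database',
    port=5432,
)

try:
    # Read the query result into a DataFrame, then write it out as CSV.
    df = pd.read_sql("SELECT * FROM orders LIMIT 100000", connection)
    df.to_csv("orders_export.csv", index=False)
finally:
    # Release the connection once the export is done.
    connection.close()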

Upload orders_export.csv in Data → Add data → Structured.
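For tables too large to hold in a DataFrame, the same export can be written batch by batch. The following is a minimal sketch rather than a Heimdall-provided utility: `export_query_to_csv` and its `batch_size` parameter are hypothetical names, and it relies only on the standard Python DB-API cursor interface. SQLite stands in for PostgreSQL in the demo so that the block runs without a database server. Whether rows are truly streamed from the server depends on the driver; psycopg2, for example, buffers the full result client-side unless a named (server-side) cursor is used.

```python
import csv
import os
import sqlite3  # stand-in driver so the sketch runs without a database server
import tempfile


def export_query_to_csv(connection, query, path, batch_size=10_000):
    """Write the rows returned by `query` to a CSV file in batches.

    Hypothetical helper: works with any DB-API 2.0 connection.
    Returns the number of data rows written (header excluded).
    """
    cursor = connection.cursor()
    try:
        cursor.execute(query)
        # DB-API cursors expose the column name as the first item of each entry.
        header = [column[0] for column in cursor.description]
        rows_written = 0
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)
            while True:
                batch = cursor.fetchmany(batch_size)
                if not batch:
                    break
                writer.writerows(batch)
                rows_written += len(batch)
        return rows_written
    finally:
        cursor.close()


if __name__ == "__main__":
    # Demo with an in-memory SQLite table named like the example above.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE orders (id INTEGER, total REAL)")
    demo.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 20.0)])
    target = os.path.join(tempfile.mkdtemp(), "orders_export.csv")
    print(export_query_to_csv(demo, "SELECT * FROM orders", target), "rows ->", target)
    demo.close()
```

With a real PostgreSQL connection, the call would look like `export_query_to_csv(connection, "SELECT * FROM orders", "orders_export.csv")`, and the resulting file is uploaded the same way.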

Legacy API — PostgreSQL import

The ingestion service can pull a table straight into ML or Forecast storage (bypassing Lake):

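import requests

# Ask the ingestion service to pull the table directly from PostgreSQL.
response = requests.post(
    'https://ingestion.heimdallapp.org/ingestion/postgres_import',
    params={'service': 'ml', 'table_name': 'orders'},
    json={
        'hostname': 'your-postgres-host',
        'port': 5432,
        'database': 'your-database',
        'username': 'your-username',
        'password': 'your-password',
    },
    headers={'Authorization': 'Bearer YOUR_TOKEN'},
    timeout=60,  # avoid waiting indefinitely if the service does not respond
)
response.raise_for_status()  # surface HTTP errors before reading the body
print(response.json())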

Use service=forecast for Forecast datasets. For production Lake workflows, export to CSV instead.

Legacy API — MySQL and MariaDB

List tables with POST /ingestion/mysql_connect or POST /ingestion/mariadb_connect, then import with mysql_import / mariadb_import using the same connection payload and service=ml or service=forecast. See the ingestion service OpenAPI for request schemas.
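The two-step flow (list tables, then import) might be sketched as below. This is an illustration under several assumptions, not the documented request schema: the payload field names and bearer-token header are copied from the PostgreSQL example, the host is assumed to be the same, the `table_name` query parameter is assumed to carry over to the import endpoints, and the `*_connect` endpoints are assumed to take no query parameters. `build_ingestion_request` is a hypothetical helper that only assembles the request and sends nothing. Check the ingestion service OpenAPI before relying on any of these details.

```python
# Assumption: same host as the PostgreSQL example.
INGESTION_BASE_URL = "https://ingestion.heimdallapp.org"


def build_ingestion_request(engine, action, connection, token,
                            service=None, table_name=None):
    """Assemble keyword arguments for requests.post(); nothing is sent here.

    Hypothetical helper. `engine` is "mysql" or "mariadb";
    `action` is "connect" (list tables) or "import".
    """
    if engine not in ("mysql", "mariadb"):
        raise ValueError(f"unsupported engine: {engine!r}")
    if action not in ("connect", "import"):
        raise ValueError(f"unsupported action: {action!r}")

    params = {}
    if action == "import":
        if service not in ("ml", "forecast"):
            raise ValueError("service must be 'ml' or 'forecast'")
        # Assumption: table_name is passed as in the postgres_import example.
        params = {"service": service, "table_name": table_name}

    return {
        "url": f"{INGESTION_BASE_URL}/ingestion/{engine}_{action}",
        "params": params,
        "json": dict(connection),  # assumed to match the PostgreSQL payload fields
        "headers": {"Authorization": f"Bearer {token}"},
    }


if __name__ == "__main__":
    payload = {
        "hostname": "your-mysql-host",
        "port": 3306,
        "database": "your-database",
        "username": "your-username",
        "password": "your-password",
    }
    list_request = build_ingestion_request("mysql", "connect", payload, "YOUR_TOKEN")
    import_request = build_ingestion_request(
        "mysql", "import", payload, "YOUR_TOKEN", service="ml", table_name="orders"
    )
    print(list_request["url"])
    print(import_request["url"], import_request["params"])
    # To send: requests.post(**list_request, timeout=60)
```

Keeping request construction separate from sending makes it easy to inspect the URL and parameters before any credentials leave the machine.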

Best practices

Connection management

  • Use connection pooling for large exports
  • Implement retry logic for transient network errors
  • Close connections after CSV export completes

Security

  • Store credentials in environment variables or a secrets manager
  • Enable SSL/TLS on database connections
  • Rotate passwords used by automation scripts
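As one way to apply the first two points, connection settings can be read from the environment and TLS requested by default. The sketch below is illustrative: `postgres_settings_from_env` is a hypothetical helper, and the variable names (`PGHOST`, `PGUSER`, `PGPASSWORD`, `PGDATABASE`, `PGPORT`, `PGSSLMODE`) follow the libpq conventions but are a choice, not a Heimdall requirement. The returned keys are standard libpq connection parameters, so the dictionary can be passed to `psycopg2.connect(**settings)`.

```python
import os

REQUIRED_VARIABLES = ("PGHOST", "PGUSER", "PGPASSWORD", "PGDATABASE")


def postgres_settings_from_env(environ=os.environ):
    """Build PostgreSQL connection settings from environment variables.

    Hypothetical helper. Raises RuntimeError if a required variable is missing,
    so that scripts fail fast instead of falling back to hard-coded credentials.
    """
    missing = [name for name in REQUIRED_VARIABLES if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "host": environ["PGHOST"],
        "user": environ["PGUSER"],
        "password": environ["PGPASSWORD"],
        "dbname": environ["PGDATABASE"],
        "port": int(environ.get("PGPORT", "5432")),
        # "require" refuses unencrypted connections; stricter modes also verify the server.
        "sslmode": environ.get("PGSSLMODE", "require"),
    }


if __name__ == "__main__":
    example_env = {
        "PGHOST": "your-postgres-host",
        "PGUSER": "your-username",
        "PGPASSWORD": "your-password",
        "PGDATABASE": "your-database",
    }
    settings = postgres_settings_from_env(example_env)
    # Avoid printing the password when logging settings.
    print({key: value for key, value in settings.items() if key != "password"})
```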

Next steps

  1. Add your first Lake table — upload exported CSV to bronze
  2. Monitor performance — track ingest and API usage
  3. Integrate APIs — connect applications to deployed models