Overview
Azure Blob Storage delivery mirrors the S3 experience but keeps data inside your Azure subscription. BlockDB writes archives into an ADLS Gen2 container using a dedicated service principal, so you can pipe the files into Synapse, Fabric, or Databricks.

Delivery Specs
- Containers: One per environment (e.g., `blockdb-prod-archives`). BlockDB targets a specific path such as `/datasets/0101_blocks_v1/`.
- Authentication: OAuth client credentials (service principal) with `Storage Blob Data Contributor` on the container.
- Formats: Parquet (recommended) or CSV; naming matches `dataset_id=0101/date=2024-01-01/part-*.parquet` (see the listing sketch after this list).
- Metadata: Manifests stored alongside the data with row counts, checksums, and `_tracing_id` ranges.
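To illustrate the layout, here is a minimal Python sketch that lists the Parquet parts for a single partition using `azure-identity` and `azure-storage-blob`. The storage account, container name, and credential values are placeholders; the prefix follows the naming convention above.

```python
# Minimal sketch: enumerate one delivered partition. Account, container, and
# credential values are placeholders; the prefix follows the naming convention
# described in the Delivery Specs list.
from azure.identity import ClientSecretCredential
from azure.storage.blob import ContainerClient

credential = ClientSecretCredential(
    tenant_id="<tenant-id>",
    client_id="<client-id>",
    client_secret="<client-secret>",
)

container = ContainerClient(
    account_url="https://<storage-account>.blob.core.windows.net",
    container_name="blockdb-prod-archives",
    credential=credential,
)

# List the Parquet parts for one dataset/date partition.
prefix = "datasets/0101_blocks_v1/dataset_id=0101/date=2024-01-01/"
for blob in container.list_blobs(name_starts_with=prefix):
    if blob.name.endswith(".parquet"):
        print(blob.name, blob.size)
```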
Provisioning Checklist
- Create the storage account (enable hierarchical namespace for ADLS Gen2).
- Register a service principal and grant it Storage Blob Data Contributor on the container path (a verification sketch follows this checklist).
- Share the tenant ID, client ID, and secret plus the target container URL with support@blockdb.io.
- Specify datasets, chains, and start/end timestamps; BlockDB schedules the first export drop.
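Before sharing credentials, it can be worth confirming the grant works end to end. The sketch below uses placeholder account, container, credential, and probe-file values; it writes and deletes a small probe blob under the agreed path and should succeed once the role assignment is in place.

```python
# Verification sketch with placeholder values: writes and removes a probe blob
# under the agreed delivery path. Succeeds once the service principal holds
# Storage Blob Data Contributor on the container. The probe file name is
# illustrative only.
from azure.identity import ClientSecretCredential
from azure.storage.blob import BlobServiceClient

credential = ClientSecretCredential("<tenant-id>", "<client-id>", "<client-secret>")
service = BlobServiceClient(
    account_url="https://<storage-account>.blob.core.windows.net",
    credential=credential,
)
container = service.get_container_client("blockdb-prod-archives")

probe = container.get_blob_client("datasets/0101_blocks_v1/_access_check.txt")
probe.upload_blob(b"blockdb access check", overwrite=True)
probe.delete_blob()
print("Service principal can write to the delivery path.")
```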
Integrating Downstream
- Use Synapse Pipelines or Azure Data Factory to copy the data into dedicated SQL pools.
- Mount the container to Databricks and hydrate tables defined in `/BlockDb.Postgres.Tables.Public`.
- Monitor ingestion by reading the manifest blobs and reconciling counts with your warehouse (see the reconciliation sketch after this list).
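For the monitoring step, a manifest can be pulled straight from the container and compared with a warehouse count. The sketch below assumes a JSON manifest named `manifest.json` with a `row_count` field; both the file name and the field name are illustrative, so match them to the manifests BlockDB actually delivers alongside your data.

```python
# Illustrative reconciliation sketch: read a manifest blob and surface its
# expected row count. The manifest file name and "row_count" field are
# assumptions; adjust them to the real manifest schema.
import json

from azure.identity import ClientSecretCredential
from azure.storage.blob import ContainerClient

credential = ClientSecretCredential("<tenant-id>", "<client-id>", "<client-secret>")
container = ContainerClient(
    account_url="https://<storage-account>.blob.core.windows.net",
    container_name="blockdb-prod-archives",
    credential=credential,
)

manifest_path = "datasets/0101_blocks_v1/dataset_id=0101/date=2024-01-01/manifest.json"
manifest = json.loads(container.download_blob(manifest_path).readall())

print("Expected rows:", manifest["row_count"])
# Reconcile against the warehouse, e.g. SELECT COUNT(*) on the hydrated table
# for the same date partition, and alert on any mismatch.
```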
If you enforce customer-managed keys, grant the service principal the `get` permission on the Key Vault secret so BlockDB can encrypt uploads.
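To confirm that grant took effect, a quick read of the secret with the same service principal should succeed. The vault URL and secret name below are placeholders.

```python
# Placeholder vault URL and secret name; the call succeeds only if the
# service principal holds the "get" secret permission mentioned above.
from azure.identity import ClientSecretCredential
from azure.keyvault.secrets import SecretClient

credential = ClientSecretCredential("<tenant-id>", "<client-id>", "<client-secret>")
client = SecretClient(
    vault_url="https://<vault-name>.vault.azure.net",
    credential=credential,
)

secret = client.get_secret("<secret-name>")
print("Key Vault grant verified for:", secret.name)
```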