

Overview

Azure Blob Storage mirrors the S3 experience but keeps data inside your Azure subscription. BlockDB writes archives into an ADLS Gen2 container using a dedicated service principal so you can pipe the files into Synapse, Fabric, or Databricks.

Delivery Specs

  • Containers: One per environment (e.g., blockdb-prod-archives). BlockDB targets a specific path such as /datasets/0101_blocks_v1/.
  • Authentication: OAuth client credentials (service principal) with Storage Blob Data Contributor on the container.
  • Formats: Parquet (recommended) or CSV; naming matches dataset_id=0101/date=2024-01-01/part-*.parquet.
  • Metadata: Manifests stored alongside the data with row counts, checksums, and _tracing_id ranges.
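Because delivered files follow Hive-style partitioning, downstream tools can recover the dataset and date directly from the blob path. A minimal sketch (the helper name and return shape are illustrative, not part of BlockDB's API):

```python
import re

# Matches the documented path convention:
#   dataset_id=0101/date=2024-01-01/part-*.parquet
# CSV deliveries use the same layout with a .csv suffix.
PART_RE = re.compile(
    r"^dataset_id=(?P<dataset_id>\d+)/date=(?P<date>\d{4}-\d{2}-\d{2})/"
    r"(?P<file>part-[^/]+\.(?:parquet|csv))$"
)

def parse_archive_path(blob_path: str) -> dict:
    """Split a delivered blob path into its partition keys."""
    m = PART_RE.match(blob_path)
    if not m:
        raise ValueError(f"unexpected blob path: {blob_path!r}")
    return m.groupdict()

# parse_archive_path("dataset_id=0101/date=2024-01-01/part-00000.parquet")
# -> {"dataset_id": "0101", "date": "2024-01-01", "file": "part-00000.parquet"}
```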

Provisioning Checklist

  1. Create the storage account (enable hierarchical namespace for ADLS Gen2).
  2. Register a service principal and grant it access to the container path.
  3. Share the tenant ID, client ID, and secret plus the target container URL with support@blockdb.io.
  4. Specify datasets, chains, and start/end timestamps; BlockDB schedules the first export drop.
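The credentials you share in step 3 are what BlockDB uses for the OAuth client-credentials exchange against Microsoft Entra ID. A sketch of that exchange, assuming the standard token endpoint and storage scope (the helper itself is illustrative, not BlockDB code):

```python
from urllib.parse import urlencode

def build_token_request(tenant_id: str, client_id: str, client_secret: str):
    """Return the token endpoint URL and form body for a client-credentials grant."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Resource scope for Azure Blob Storage / ADLS Gen2 access.
        "scope": "https://storage.azure.com/.default",
    })
    return url, body

# POSTing `body` (Content-Type: application/x-www-form-urlencoded) to `url`
# returns JSON whose `access_token` is sent as a Bearer token on storage
# requests. In practice, azure-identity's ClientSecretCredential performs
# this exchange for you.
```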

Integrating Downstream

  • Use Synapse Pipelines or Azure Data Factory to copy the data into dedicated SQL pools.
  • Mount the container to Databricks and hydrate tables defined in /BlockDb.Postgres.Tables.Public.
  • Monitor ingestion by reading the manifest blobs and reconciling counts with your warehouse.
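The reconciliation step can be sketched as follows. The manifest schema here is an assumption based on the fields listed under Delivery Specs (row counts, checksums); the exact format BlockDB writes may differ:

```python
import json

def reconcile(manifest_json: str, warehouse_counts: dict) -> list:
    """Compare per-file row counts from a manifest against warehouse counts.

    Returns a list of (file, manifest_rows, warehouse_rows) mismatches;
    an empty list means the delivery fully landed.
    """
    manifest = json.loads(manifest_json)
    mismatches = []
    for entry in manifest["files"]:  # assumed manifest layout
        expected = entry["row_count"]
        actual = warehouse_counts.get(entry["path"], 0)
        if actual != expected:
            mismatches.append((entry["path"], expected, actual))
    return mismatches

# Example with a hypothetical two-file manifest:
manifest = json.dumps({"files": [
    {"path": "part-00000.parquet", "row_count": 1000, "sha256": "..."},
    {"path": "part-00001.parquet", "row_count": 250, "sha256": "..."},
]})
# reconcile(manifest, {"part-00000.parquet": 1000, "part-00001.parquet": 240})
# -> [("part-00001.parquet", 250, 240)]
```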

If you enforce customer-managed keys, grant the service principal Get, Wrap Key, and Unwrap Key permissions on the Key Vault key so BlockDB can encrypt uploads.
Last modified on February 26, 2026