Overview

Lineage is BlockDB’s trust layer. Every record—whether it is a raw log or a derived liquidity metric—carries a deterministic identifier that links back to the chain event that produced it. Clients can follow that identifier across APIs and exports to prove provenance, replay calculations, and satisfy audit checks without learning BlockDB’s internal implementation.

Key Benefits

  • Deterministic traceability from blocks and logs through higher-level datasets such as pools, reserves, intents, and liquidity depth.
  • Privacy-aware identifiers that remain consistent for a client but cannot be reversed or correlated across tenants.
  • Auditable workflows: the same IDs appear in SQL exports, APIs, and release snapshots, so analysts can reconcile environments.
  • Composable insights: lineage metadata lets you stitch multiple datasets together or validate AI features against raw chain evidence.

Field Surfaces

Location | Field | What it provides
--- | --- | ---
APIs (/evm/*) | _tracing_id | Canonical handle for the record. Use it with lineage endpoints.
SQL exports | _tracing_id, _created_at, _updated_at | Mirror the API fields so downstream warehouses can join lineage metadata.
Derived datasets | tracking_ids_view[] | Compact list of contributing raw events (salted per tenant).
Lineage APIs | dataset, parents, genesis | Structured response describing where the record came from and how it evolved.

Datasets that are still 🟡 Beta may temporarily omit tracking_ids_view[] while backfills complete. Release notes call out any deviations before they ship.
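
To illustrate how the mirrored export fields support warehouse joins, here is a minimal sketch using an in-memory SQLite database. The table names (blockdb_export, fact_trades), the trade columns, and the sample identifier values are hypothetical; only the _tracing_id, _created_at, and _updated_at column names come from the table above.

```python
import sqlite3


def demo_join() -> list:
    """Join a hypothetical fact table to a hypothetical export on _tracing_id."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE blockdb_export (      -- stand-in for a BlockDB SQL export
            _tracing_id TEXT PRIMARY KEY,
            _created_at TEXT,
            _updated_at TEXT
        );
        CREATE TABLE fact_trades (         -- stand-in for your own fact table
            trade_key   TEXT PRIMARY KEY,
            _tracing_id TEXT,
            notional    REAL
        );
        """
    )
    conn.execute(
        "INSERT INTO blockdb_export VALUES (?, ?, ?)",
        ("trc_001", "2024-01-01T00:00:00Z", "2024-01-02T00:00:00Z"),
    )
    conn.executemany(
        "INSERT INTO fact_trades VALUES (?, ?, ?)",
        [("t-1", "trc_001", 1500.0), ("t-2", "trc_999", 20.0)],
    )
    # LEFT JOIN keeps fact rows whose lineage metadata has not arrived yet,
    # so gaps show up as NULLs instead of silently disappearing.
    rows = conn.execute(
        """
        SELECT f.trade_key, f._tracing_id, e._updated_at
        FROM fact_trades AS f
        LEFT JOIN blockdb_export AS e ON e._tracing_id = f._tracing_id
        ORDER BY f.trade_key
        """
    ).fetchall()
    conn.close()
    return rows
```

Calling demo_join() returns one matched row and one row with a missing _updated_at, which is the kind of gap a reconciliation check would flag.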

Lineage APIs

Endpoint | Purpose | Typical use case
--- | --- | ---
POST /evm/lineage/record | Fetch lightweight metadata for a single _tracing_id. | Look up a transaction or liquidity row directly from an analytics workflow.
POST /evm/lineage/genesis | Retrieve the raw on-chain artifacts (block, transaction, log) that seeded the record. | Produce audit packets or explainability notebooks showing source events.
POST /evm/lineage/parents | Enumerate upstream derived records. | Walk the DAG to understand how higher-level metrics (e.g., reserves) were calculated.

All endpoints share the same authentication model documented in Access & SLA. Response payloads always include dataset identifiers such as 0102 (transactions) so you can align lineage with the Dataset Index.

Embed _tracing_id in your downstream fact tables. During incidents or audits, you can replay provenance by calling POST /evm/lineage/record directly from BI or notebook tooling.
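
A call to one of these endpoints from a notebook might look like the sketch below. Only the endpoint paths come from this page. The base URL, the bearer-token header, and the JSON body keyed by "_tracing_id" are assumptions made for illustration; check Access & SLA and the endpoint reference for the real request format.

```python
import json
import urllib.request

# Placeholder host; replace with the base URL from your BlockDB onboarding.
BASE_URL = "https://api.example.invalid"

LINEAGE_ENDPOINTS = {"record", "genesis", "parents"}


def build_lineage_request(endpoint: str, tracing_id: str, api_key: str,
                          base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for a lineage endpoint."""
    if endpoint not in LINEAGE_ENDPOINTS:
        raise ValueError(f"unknown lineage endpoint: {endpoint}")
    # Assumed body shape: a single JSON object carrying the handle.
    body = json.dumps({"_tracing_id": tracing_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/evm/lineage/{endpoint}",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # Assumed auth scheme; the real one is documented in Access & SLA.
            "Authorization": f"Bearer {api_key}",
        },
    )


def post_lineage(endpoint: str, tracing_id: str, api_key: str,
                 base_url: str = BASE_URL, timeout: float = 10.0) -> dict:
    """Send the request and parse the JSON response."""
    request = build_lineage_request(endpoint, tracing_id, api_key, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Separating request construction from sending keeps the sketch easy to unit-test without network access.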

Typical Workflow

  1. Capture the lineage handle
    Every API payload and warehouse export ships with _tracing_id. Persist it alongside your business keys.
  2. Look up context
    Use /evm/lineage/record to confirm dataset ID, block coordinates, and the latest update timestamp.
  3. Fetch genesis when needed
    Call /evm/lineage/genesis to obtain the originating block/transaction/log payloads. This is useful for audit bundles, AI alignment, or dispute resolution.
  4. Traverse derived parents
    When you need to understand how a reserve, liquidity, or price record was calculated, query /evm/lineage/parents to see all upstream derived nodes.
  5. Correlate across datasets
    Join tracking_ids_view[] against your own salted hashes to prove that a specific raw log contributed to the analytic signal you are monitoring.
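
The steps above can be sketched as a small helper that assembles an audit bundle. The transport is injected as a `post(path, body)` callable so the sketch stays independent of any HTTP client. The request body shape, the bundle keys, and the helper names are hypothetical; only the endpoint paths and field names come from this page.

```python
from typing import Any, Callable, Dict, Iterable, List

# post(path, body) -> parsed JSON response; supplied by the caller.
PostFn = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def build_audit_bundle(tracing_id: str, post: PostFn,
                       include_genesis: bool = True) -> Dict[str, Any]:
    """Collect record, genesis, and parents payloads for one lineage handle."""
    body = {"_tracing_id": tracing_id}  # assumed request body shape
    bundle: Dict[str, Any] = {"_tracing_id": tracing_id}
    # Step 2: look up context for the record.
    bundle["record"] = post("/evm/lineage/record", body)
    # Step 3: fetch genesis only when the audit actually needs raw events.
    if include_genesis:
        bundle["genesis"] = post("/evm/lineage/genesis", body)
    # Step 4: enumerate upstream derived records.
    bundle["parents"] = post("/evm/lineage/parents", body)
    return bundle


def contributing_ids(tracking_ids_view: Iterable[str],
                     known_hashes: Iterable[str]) -> List[str]:
    """Step 5: return the identifiers present on both sides, sorted."""
    return sorted(set(tracking_ids_view) & set(known_hashes))
```

Passing a stub for `post` in tests makes it possible to check the call order without touching the network.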

Architecture Snapshot

  • Lineage is modeled as a directed acyclic graph that spans ingestion, decoding, enrichment, and aggregation stages.
  • Each node in the graph has a canonical identifier stored in BlockDB and a tenant-specific view that is shared externally.
  • The graph is persisted alongside business data so that historical replays, rollbacks, and provenance proofs remain deterministic.
  • Salted view identifiers allow clients to verify lineage while keeping cross-tenant visibility locked down.
Internal hashing, salts, and derivation logic are abstracted behind the APIs. Clients only need to store the identifiers and call the endpoints described here.
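
Because lineage is a DAG, a client that wants every upstream node can walk it breadth-first, visiting each node once even when two derived records share a parent. The sketch below is illustrative only: `get_parents` is a hypothetical callable (in practice it might wrap POST /evm/lineage/parents), and the node names are made up.

```python
from collections import deque
from typing import Callable, List


def walk_ancestors(start_id: str,
                   get_parents: Callable[[str], List[str]]) -> List[str]:
    """Breadth-first walk of upstream nodes; each ancestor appears once."""
    seen = {start_id}
    order: List[str] = []
    queue = deque([start_id])
    while queue:
        node = queue.popleft()
        for parent in get_parents(node):
            # Shared parents are common in a DAG; skip ones already visited.
            if parent not in seen:
                seen.add(parent)
                order.append(parent)
                queue.append(parent)
    return order


# Hypothetical in-memory graph standing in for lineage responses.
EXAMPLE_GRAPH = {
    "depth": ["reserve_a", "reserve_b"],
    "reserve_a": ["log_1"],
    "reserve_b": ["log_1", "log_2"],
}

ancestors = walk_ancestors("depth", lambda n: EXAMPLE_GRAPH.get(n, []))
# ancestors == ["reserve_a", "reserve_b", "log_1", "log_2"]
```

The `seen` set also guards against repeated work if a response ever lists the same parent twice.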

Privacy & Access

  • Lineage handles exposed to customers are unique per tenant. A compromised key from one tenant cannot unlock another tenant’s lineage.
  • Revoking a tenant’s lineage access takes effect within minutes. _tracing_id values remain valid, but lineage endpoints will no longer return payloads until access is restored.
  • Aggregated public datasets use opaque identifiers by default; lineage access is granted through contractual onboarding.

Operational Guidance

  • Capture _tracing_id in your observability stack so that tickets and dashboards link directly to lineage endpoints.
  • During incident response, pull /evm/lineage/genesis payloads into your RCA doc to show the exact chain events that triggered an alert.
  • Align lineage monitoring with Data Freshness and Delivery so you know when new datasets start emitting provenance fields.

Summary

Lineage turns every BlockDB dataset into an explainable asset. By storing _tracing_id, calling the lineage endpoints, and correlating tracking_ids_view[] with your own telemetry, you can always prove where analytics, AI features, or compliance reports originated—no proprietary implementation details required.