Query execution and overlay merge
This document describes the single query execution pipeline in Fluree DB and how it combines:
- Indexed data (binary columnar indexes)
- Overlay data (novelty + staged flakes)
It also calls out where graph scoping (g_id) is applied so named graphs remain isolated.
Pipeline overview
flowchart TD
LedgerState -->|produces| LedgerSnapshot
LedgerSnapshot -->|shared substrate| GraphDb
GraphDb -->|single-ledger| QueryRunner
GraphDb -->|member_of| DataSetDb
DataSetDb -->|federated| QueryRunner
QueryRunner -->|scan index + merge overlay| DatasetOperator
DatasetOperator -->|per-graph| BinaryScanOperator
BinaryScanOperator -->|fast path| BinaryCursor
BinaryScanOperator -->|fallback| range_with_overlay
BinaryCursor -->|graph-scoped decode| BinaryGraphView
range_with_overlay -->|delegates| RangeProvider
Where this exists in code
- API entrypoints
  - fluree-db-api/src/view/query.rs: single-ledger GraphDb queries (query)
  - fluree-db-api/src/view/dataset_query.rs: dataset queries (DataSetDb)
- Unified query runner
  - fluree-db-query/src/execute/runner.rs
    - prepare_execution(db: GraphDbRef<'_>, query: &ExecutableQuery) builds derived facts/ontology (if enabled), rewrites patterns, and builds the operator tree.
    - execute_prepared(...) runs the operator tree using an ExecutionContext.
- Dataset operator
  - fluree-db-query/src/dataset_operator.rs
    - DatasetOperator wraps every triple-pattern scan. In single-graph mode (the common case) it passes through to one inner BinaryScanOperator with negligible overhead. In multi-graph mode (FROM/FROM NAMED datasets) it fans out one inner operator per active graph, drives their lifecycles, and stamps ledger provenance (Binding::IriMatch) on results that span multiple ledgers.
    - DatasetBuilder trait (factory pattern): the planner constructs a ScanDatasetBuilder at plan time; DatasetOperator calls build() at execution time during open() to produce per-graph BinaryScanOperators.
    - Nested composition: inner operators can themselves be DatasetOperators; provenance stamping passes IriMatch through unchanged.
- Scan operators
  - fluree-db-query/src/binary_scan.rs
    - BinaryScanOperator handles single-graph scanning only. It selects between the binary cursor (streaming, integer-ID pipeline) and the range fallback at open() time, based on the ExecutionContext.
- Range fallback
  - fluree-db-core/src/range.rs: range_with_overlay(snapshot, g_id, overlay, ...)
  - fluree-db-core/src/range_provider.rs: RangeProvider trait implemented by the binary range provider
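The factory-plus-fan-out arrangement described for the dataset operator can be sketched as follows. This is a minimal illustration, not the code in dataset_operator.rs: the Operator trait, FakeScan, FakeScanBuilder, DatasetOp, and the u32 graph ID are all hypothetical stand-ins, and provenance stamping is omitted. It only shows the shape of the pattern: the builder is created up front, and the per-graph inner operators are created inside open().

```rust
// Hypothetical sketch: names echo the document, signatures are assumptions.

/// Graph identifier (the document's g_id); the concrete type is assumed.
type GraphId = u32;

/// Minimal operator lifecycle, assumed for illustration.
trait Operator {
    fn open(&mut self);
    fn next_batch(&mut self) -> Option<Vec<String>>;
    fn close(&mut self);
}

/// Factory called at execution time to create one scan per graph.
trait DatasetBuilder {
    fn build(&self, g_id: GraphId) -> Box<dyn Operator>;
}

/// Stand-in for a single-graph scan; emits one batch of labeled rows.
struct FakeScan {
    g_id: GraphId,
    rows: Vec<String>,
    done: bool,
}

impl Operator for FakeScan {
    fn open(&mut self) {
        self.done = false;
    }
    fn next_batch(&mut self) -> Option<Vec<String>> {
        if self.done {
            return None;
        }
        self.done = true;
        Some(self.rows.iter().map(|r| format!("g{}:{}", self.g_id, r)).collect())
    }
    fn close(&mut self) {
        self.done = true;
    }
}

/// Built at "plan time"; holds whatever the scans need (here, a pattern string).
struct FakeScanBuilder {
    pattern: String,
}

impl DatasetBuilder for FakeScanBuilder {
    fn build(&self, g_id: GraphId) -> Box<dyn Operator> {
        Box::new(FakeScan { g_id, rows: vec![self.pattern.clone()], done: false })
    }
}

/// Wraps a scan: one inner operator per active graph, built lazily in open().
struct DatasetOp<B: DatasetBuilder> {
    builder: B,
    active_graphs: Vec<GraphId>,
    inner: Vec<Box<dyn Operator>>,
    current: usize,
}

impl<B: DatasetBuilder> DatasetOp<B> {
    fn new(builder: B, active_graphs: Vec<GraphId>) -> Self {
        Self { builder, active_graphs, inner: Vec::new(), current: 0 }
    }
}

impl<B: DatasetBuilder> Operator for DatasetOp<B> {
    fn open(&mut self) {
        // Fan out: build and open one inner operator per active graph.
        self.inner = self.active_graphs.iter().map(|&g| self.builder.build(g)).collect();
        for op in self.inner.iter_mut() {
            op.open();
        }
        self.current = 0;
    }
    fn next_batch(&mut self) -> Option<Vec<String>> {
        // Drain inner operators in order; with one graph this is a pass-through.
        while self.current < self.inner.len() {
            if let Some(batch) = self.inner[self.current].next_batch() {
                return Some(batch);
            }
            self.current += 1;
        }
        None
    }
    fn close(&mut self) {
        for op in self.inner.iter_mut() {
            op.close();
        }
        self.inner.clear();
    }
}
```

Because DatasetOp itself implements Operator, a builder could return another DatasetOp, which is the nested-composition case mentioned above.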
Graph scoping (g_id)
Graph scoping is applied at two key boundaries:
- Binary streaming path: BinaryCursor operates on a BinaryGraphView (graph-scoped decode handle), ensuring leaf/leaflet decoding, predicate dictionaries, and specialty arenas are graph-isolated.
- Range path: range_with_overlay(snapshot, g_id, overlay, ...) passes g_id into the RangeProvider, which routes the range query to the correct per-graph index segments.
Overlay providers are graph-scoped at the trait boundary: the overlay hook receives g_id and must only return flakes for that graph. This keeps multi-tenant named graphs isolated even when overlay data is sourced externally.
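A graph-scoped overlay hook of this kind might look roughly like the sketch below. The trait name OverlaySource, the Flake layout, and the callback signature are assumptions made for illustration and are not the actual trait in fluree-db-core. What matters is the contract: g_id is part of the call, and the provider filters on it before handing anything back.

```rust
// Hypothetical sketch: trait and type names are illustrative assumptions.

type GraphId = u32;

#[derive(Debug, Clone, PartialEq)]
struct Flake {
    g_id: GraphId,
    s: u64,
    p: u64,
    o: i64,
    t: i64,
    op: bool, // true = assertion, false = retraction
}

/// Overlay hook: the caller passes g_id and the provider must honor it.
trait OverlaySource {
    fn for_each_flake(&self, g_id: GraphId, visit: &mut dyn FnMut(&Flake));
}

/// In-memory overlay that holds flakes for several graphs at once.
struct MemOverlay {
    flakes: Vec<Flake>,
}

impl OverlaySource for MemOverlay {
    fn for_each_flake(&self, g_id: GraphId, visit: &mut dyn FnMut(&Flake)) {
        // Only flakes of the requested graph ever reach the caller.
        for f in self.flakes.iter().filter(|f| f.g_id == g_id) {
            visit(f);
        }
    }
}

/// Collects the overlay flakes visible to a scan of one graph.
fn overlay_for_graph(src: &dyn OverlaySource, g_id: GraphId) -> Vec<Flake> {
    let mut out = Vec::new();
    src.for_each_flake(g_id, &mut |f: &Flake| out.push(f.clone()));
    out
}
```

Putting the filter behind the trait boundary means a scan never has to trust that an externally sourced overlay was pre-partitioned by graph.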
Overlay merge semantics (high level)
Both scan paths implement the same logical behavior:
- Read matching flakes from the indexed base (binary files)
- Read matching flakes from the overlay (novelty/staged)
- Merge them using (t, op) semantics so retractions cancel assertions as of the query time bound
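One way to read the three steps above is: among all flakes for the same fact with t at or below the query bound, the most recent one decides whether the fact is visible. The sketch below implements that reading. The Flake layout, the function name merge_as_of, and the tie-break (on equal t, the later input wins, so overlay beats base) are assumptions for illustration, not the project's actual rules.

```rust
use std::collections::BTreeMap;

// Hypothetical flake shape; the real one carries more fields.
#[derive(Debug, Clone, PartialEq)]
struct Flake {
    s: u64,
    p: u64,
    o: i64,
    t: i64,
    op: bool, // true = assertion, false = retraction
}

/// Returns the facts visible as of `to_t`, given indexed and overlay flakes.
fn merge_as_of(base: &[Flake], overlay: &[Flake], to_t: i64) -> Vec<Flake> {
    // Latest event per (s, p, o) among flakes within the time bound.
    let mut latest: BTreeMap<(u64, u64, i64), &Flake> = BTreeMap::new();
    for f in base.iter().chain(overlay.iter()).filter(|f| f.t <= to_t) {
        let key = (f.s, f.p, f.o);
        // Assumed tie-break: on equal t the later input (overlay) wins.
        let is_newer = latest.get(&key).map_or(true, |prev| f.t >= prev.t);
        if is_newer {
            latest.insert(key, f);
        }
    }
    // A fact is visible only if its latest event is an assertion.
    latest.into_values().filter(|f| f.op).cloned().collect()
}
```

For example, a fact asserted in the index at t = 1 and retracted in the overlay at t = 3 is returned for a query bounded at t = 2 and omitted for one bounded at t = 3.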
The details differ:
- BinaryScanOperator translates overlay flakes into integer-ID space and merges them into the decoded columnar stream.
- RangeScanOperator delegates to range_with_overlay, which combines RangeProvider output with overlay output.
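The translate-then-merge step on the binary path can be sketched as a dictionary lookup followed by an ordered two-way merge. Everything named here (Row, OverlayFlake, translate, merge_sorted, scan_with_overlay, and the string-keyed dictionary) is hypothetical. In particular, this sketch simply drops overlay flakes whose terms are missing from the dictionary, whereas a real implementation would need a strategy for terms that exist only in novelty.

```rust
use std::collections::HashMap;

/// Row in integer-ID space; derived ordering is (s, p, o, t, op).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct Row {
    s: u64,
    p: u64,
    o: u64,
    t: i64,
    op: bool,
}

/// Overlay flake still expressed with string terms (assumed shape).
struct OverlayFlake {
    s: String,
    p: String,
    o: String,
    t: i64,
    op: bool,
}

/// Maps each term to its integer ID; returns None if any term is unknown.
fn translate(dict: &HashMap<String, u64>, f: &OverlayFlake) -> Option<Row> {
    Some(Row {
        s: *dict.get(&f.s)?,
        p: *dict.get(&f.p)?,
        o: *dict.get(&f.o)?,
        t: f.t,
        op: f.op,
    })
}

/// Two-way merge of already-sorted inputs, preserving overall order.
fn merge_sorted(base: Vec<Row>, overlay: Vec<Row>) -> Vec<Row> {
    let mut out = Vec::with_capacity(base.len() + overlay.len());
    let (mut a, mut b) = (base.into_iter().peekable(), overlay.into_iter().peekable());
    loop {
        let take_a = match (a.peek(), b.peek()) {
            (Some(x), Some(y)) => x <= y,
            (Some(_), None) => true,
            (None, Some(_)) => false,
            (None, None) => break,
        };
        out.extend(if take_a { a.next() } else { b.next() });
    }
    out
}

/// Translates overlay flakes, sorts them, and merges them into the base stream.
fn scan_with_overlay(
    dict: &HashMap<String, u64>,
    base: Vec<Row>,
    overlay: &[OverlayFlake],
) -> Vec<Row> {
    let mut translated: Vec<Row> = overlay.iter().filter_map(|f| translate(dict, f)).collect();
    translated.sort();
    merge_sorted(base, translated)
}
```

Merging in ID space keeps the overlay rows adjacent to the indexed rows they affect, so a later pass can apply the (t, op) rule without a second lookup.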