Backend Index Behavior

This document describes how vertex and edge indexes are handled across different graph database backends. Understanding this helps ensure your schema has the right indexes for efficient lookups and MERGE operations.

Identity vs Secondary Indexes

  • Identity index: Required for vertex matching/upserts. Uses Vertex.identity (or _key/id for blank vertices). Each backend handles this differently.
  • Secondary indexes: Optional indexes for query performance. Configured in database_features.vertex_indexes and database_features.edge_specs[*].indexes.

database_features.vertex_indexes holds secondary indexes only. The identity index is handled by the backend itself, either during define_vertex_indexes or at collection/vertex-type creation.

Backend Summary

| Backend | Identity index | How |
| Neo4j | Explicit | define_vertex_indexes prepends identity index when schema is provided. No implicit primary index. |
| Memgraph | Explicit | Same as Neo4j. upsert_docs_batch also auto-creates on match_keys at runtime. |
| FalkorDB | Explicit | Same as Neo4j. |
| Nebula | Explicit | define_vertex_indexes always creates identity index first (required for LOOKUP/MATCH). |
| ArangoDB | At collection creation | create_collection receives vertex_config.index(u) and adds it. _key is auto-indexed and skipped. |
| TigerGraph | Implicit | Primary keys are auto-indexed at vertex type creation. |

Implications

  • Neo4j, Memgraph, FalkorDB: If you omit database_features.vertex_indexes for a vertex, the identity index is still created automatically when define_vertex_indexes runs with a schema. You only need vertex_indexes for additional (secondary) indexes.
  • ArangoDB, TigerGraph: Identity is covered at collection/vertex-type creation. define_vertex_indexes adds only secondary indexes from vertex_indexes.
  • Nebula: Identity index is always created in define_vertex_indexes; vertex_indexes adds secondary indexes.
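
The split between identity and secondary indexes can be sketched in a few lines of Python. This is an illustration of the rules in the summary table, not GraFlo's implementation: the function name plan_vertex_indexes, the lowercase backend labels, and the tuple-of-fields representation are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

# Backends where define_vertex_indexes creates the identity index itself,
# per the summary table. These labels are illustrative, not GraFlo enums.
EXPLICIT_IDENTITY = {"neo4j", "memgraph", "falkordb", "nebula"}


def plan_vertex_indexes(
    backend: str,
    identity: Sequence[str],
    secondary: Sequence[Sequence[str]],
) -> List[Tuple[str, ...]]:
    """Return the ordered index definitions to create for one vertex type.

    Hypothetical helper: each index is represented as a tuple of field names.
    """
    plan: List[Tuple[str, ...]] = []
    if backend in EXPLICIT_IDENTITY:
        # The identity index goes first, ahead of any secondary index.
        plan.append(tuple(identity))
    for fields in secondary:
        candidate = tuple(fields)
        if candidate not in plan:  # skip duplicates of the identity index
            plan.append(candidate)
    return plan


# No vertex_indexes configured: the identity index is still planned.
print(plan_vertex_indexes("neo4j", ["id"], []))
# Identity is covered at collection creation: only secondary indexes remain.
print(plan_vertex_indexes("arangodb", ["id"], [["name"]]))
```

Under these assumptions, the explicit backends always get the identity index first, while ArangoDB and TigerGraph only receive whatever vertex_indexes lists.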

Schema Required

When schema is None in define_vertex_indexes, identity indexes cannot be ensured for Neo4j, Memgraph, FalkorDB, and Nebula, and a warning is logged. Always pass the schema when calling define_vertex_indexes or define_indexes during init_db.
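
The guard described above might look roughly like the following. It is a minimal sketch: the schema is modelled as a plain dict mapping vertex names to identity fields, and identity_indexes_from_schema is a hypothetical helper, not a GraFlo function.

```python
import logging
from typing import Dict, List, Optional, Sequence, Tuple

logger = logging.getLogger("index_sketch")


def identity_indexes_from_schema(
    schema: Optional[Dict[str, Sequence[str]]],
) -> List[Tuple[str, Tuple[str, ...]]]:
    """Derive (vertex name, identity fields) pairs, or warn when no schema is given.

    Assumption: the schema is a dict of vertex name -> identity field names.
    """
    if schema is None:
        # Without a schema the identity fields are unknown, so nothing can be ensured.
        logger.warning("schema is None: identity indexes cannot be ensured")
        return []
    return [(name, tuple(fields)) for name, fields in schema.items()]


print(identity_indexes_from_schema({"Person": ["id"]}))
print(identity_indexes_from_schema(None))  # logs a warning, returns an empty list
```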

Edge upserts and MERGE (Neo4j, Memgraph, FalkorDB)

Vertex upserts use node keys from Vertex identity. For edges, endpoints are matched on those vertex keys; the relationship itself is merged using a relationship property map so parallel edges remain distinct.

GraFlo takes the property names for that map from the edge's logical identity policy, that is, the first entry in Edge.identities. The source and target tokens are excluded; a relation token, when present, maps to the relationship's relation property. If identities is empty or names no relationship fields, all weights.direct field names are used instead. Edge indexes compiled from identities (via database_features) are separate from this writer-time MERGE key selection, so make sure both match the uniqueness you intend for a given edge definition.
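
The selection rule and the resulting MERGE shape can be sketched as follows. Both helpers (merge_key_fields, build_edge_merge), the row.source / row.target parameter names, and the list-of-lists form of identities are assumptions for illustration; the Cypher that GraFlo actually emits may differ.

```python
from typing import List, Sequence

RESERVED_TOKENS = ("source", "target")  # endpoint tokens, never relationship properties


def merge_key_fields(
    identities: Sequence[Sequence[str]],
    direct_weights: Sequence[str],
) -> List[str]:
    """Pick relationship property names for MERGE.

    Uses the first identity entry minus the endpoint tokens; a "relation"
    token is kept and becomes the relationship's relation property.
    Falls back to all direct weight names when nothing remains.
    """
    if identities:
        fields = [f for f in identities[0] if f not in RESERVED_TOKENS]
        if fields:
            return fields
    return list(direct_weights)


def build_edge_merge(
    src_label: str,
    dst_label: str,
    rel_type: str,
    src_key: str,
    dst_key: str,
    rel_fields: Sequence[str],
) -> str:
    """Build a Cypher statement: match endpoints on vertex keys, merge on a property map."""
    prop_map = ", ".join(f"{f}: row.{f}" for f in rel_fields)
    rel = f"[r:{rel_type} {{{prop_map}}}]" if rel_fields else f"[r:{rel_type}]"
    return (
        f"MATCH (a:{src_label} {{{src_key}: row.source}}) "
        f"MATCH (b:{dst_label} {{{dst_key}: row.target}}) "
        f"MERGE (a)-{rel}->(b)"
    )


fields = merge_key_fields([["source", "target", "relation", "date"]], ["weight"])
print(fields)  # ['relation', 'date']
print(build_edge_merge("Person", "Person", "KNOWS", "id", "id", fields))
```

Because the relationship is merged on the property map rather than on its endpoints alone, two rows with the same endpoints but different date values produce two parallel edges. With an identity entry of only source and target, the sketch falls back to the direct weight names.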