Example 4: Dynamic Relations from Keys (Neo4j)¶

This example shows how to ingest nested JSON data into Neo4j, using the relation_from_key attribute to create relationships whose types are taken from the keys of the JSON structure.

Data Structure¶

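The input is Debian package metadata. Each record nests its dependencies under keys named after the dependency type (pre-depends, suggests, and so on) and carries the maintainer as a nested object: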

{
  "name": "0ad-data",
  "version": "0.0.26-1",
  "dependencies": {
    "pre-depends": [
      {
        "name": "dpkg",
        "version": ">= 1.15.6~"
      }
    ],
    "suggests": [
      {
        "name": "0ad"
      }
    ]
  },
  "description": "Real-time strategy game of ancient warfare (data files)",
  "maintainer": {
    "name": "Debian Games Team",
    "email": "pkg-games-devel@lists.alioth.debian.org"
  }
}

Schema Configuration¶

Vertices¶

We define three vertex types (package, maintainer and bug), each indexed on its identifying field:

vertex_config:
    vertices:
    -   name: package
        fields:
        -   name
        -   version
        indexes:
        -   fields:
            -   name
    -   name: maintainer
        fields:
        -   name
        -   email
        indexes:
        -   fields:
            -   email
    -   name: bug
        fields:
        -   id
        -   subject
        -   severity
        -   date
        indexes:
        -   fields:
            -   id

Edges¶

Edges are declared as plain source/target pairs; no relation type is fixed at this level:

edge_config:
    edges:
    -   source: package
        target: package
    -   source: maintainer
        target: package
    -   source: package
        target: bug

Graph Structure¶

[Figure: the resulting graph, showing the package dependency relationships]

Resource (Nested Structure)¶

Nested Structure Handling¶

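The resource configuration mirrors the nesting of the input: it descends into the dependencies key, and each dependency-type key below it yields package vertices: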

resources:
-   resource_name: package
    apply:
    -   vertex: package
    -   key: dependencies
        apply:
        -   key: breaks
            apply:
            -   vertex: package
        -   key: conflicts
            apply:
            -   vertex: package
        -   key: depends
            apply:
            -   vertex: package
        -   key: pre-depends
            apply:
            -   vertex: package
        -   key: suggests
            apply:
            -   vertex: package
        -   key: recommends
            apply:
            -   vertex: package
    -   source: maintainer
        target: package
    -   source: package
        target: package
        relation_from_key: true
    -   key: maintainer
        apply:
        -   vertex: maintainer

Setting relation_from_key: true on the package → package edge lets us:

Use the JSON keys as relationship types
Create a different edge type for each key in the nested structure

Instead of a single edge type, we get six: breaks, conflicts, depends, pre-depends, suggests and recommends.
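
As a rough illustration of the effect, the plain-Python sketch below walks the sample record and turns each key under dependencies into a relationship type. It does not use graphcast and does not claim to reproduce its internals. The function name typed_edges, the (source, relation, target) tuple layout and the edge direction (from the declaring package to the listed one) are assumptions made for this illustration.

```python
# Illustration only: mimics the effect of relation_from_key, not graphcast itself.
record = {
    "name": "0ad-data",
    "version": "0.0.26-1",
    "dependencies": {
        "pre-depends": [{"name": "dpkg", "version": ">= 1.15.6~"}],
        "suggests": [{"name": "0ad"}],
    },
}


def typed_edges(package):
    """Return (source, relation, target) triples, one per listed dependency."""
    edges = []
    for relation, targets in package.get("dependencies", {}).items():
        for target in targets:
            # The JSON key (e.g. "pre-depends") becomes the relationship type.
            edges.append((package["name"], relation, target["name"]))
    return edges


for edge in typed_edges(record):
    print(edge)
# ('0ad-data', 'pre-depends', 'dpkg')
# ('0ad-data', 'suggests', '0ad')
```

A record without a dependencies key simply yields no edges, which matches the intuition that only keys present in the data produce relationships.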

How It Works¶

Package Creation: Each package becomes a vertex
Dynamic Relations: Each dependency type (breaks, conflicts, etc.) becomes a relationship type
Maintainer Links: Maintainer information creates maintainer → package relationships
Bug Tracking: Bug reports create package → bug relationships
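
Once the data is loaded, the relationships can be inspected with Cypher. The sketch below only builds the query text, so it runs without a database. It assumes that the vertex name package is used as the node label and that the JSON key is stored verbatim as the relationship type; neither is confirmed on this page. Because a type such as pre-depends contains a hyphen, Cypher requires it to be quoted in backticks. The helper names quote_identifier and dependency_query are made up for this illustration.

```python
def quote_identifier(name):
    """Backtick-quote a Cypher identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def dependency_query(relation, label="package"):
    """Build a Cypher query listing node pairs joined by one relation type.

    Assumption: nodes carry `label` and relationships carry `relation` as-is.
    """
    rel = quote_identifier(relation)
    lbl = quote_identifier(label)
    return (
        f"MATCH (a:{lbl})-[:{rel}]->(b:{lbl}) "
        "RETURN a.name AS source, b.name AS target"
    )


print(dependency_query("pre-depends"))
# MATCH (a:`package`)-[:`pre-depends`]->(b:`package`) RETURN a.name AS source, b.name AS target
```

The resulting string can be passed to any Neo4j client, for example to session.run(...) of the official Python driver.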

Resource Structure¶

[Figure: the resource mapping for the nested package data]

Data Ingestion¶

The script below loads the schema, configures the Neo4j connection, maps file name patterns to resources and runs the ingestion:

from suthing import ConfigFactory, FileHandle
from graphcast import Caster, Patterns, Schema

# Load the schema (vertices, edges and resources) from YAML.
schema = Schema.from_dict(FileHandle.load("schema.yaml"))

# Neo4j connection settings; note that 7688 is not the default Bolt port (7687).
conn_conf = ConfigFactory.create_config({
    "protocol": "bolt",
    "hostname": "localhost",
    "port": 7688,
    "username": "neo4j",
    "password": "test!passfortesting",
})

# Map each pattern name to the file names it should pick up.
patterns = Patterns.from_dict({
    "patterns": {
        "package": {"regex": r"^package\.meta.*\.json(?:\.gz)?$"},
        "bugs": {"regex": r"^bugs.head.*\.json(?:\.gz)?$"},
    }
})

# Ingest every matching file found under ./data.
caster = Caster(schema)
caster.ingest_files(
    path="./data",
    conn_conf=conn_conf,
    patterns=patterns,
    clean_start=True,
)
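
Before running the ingestion, it can help to check which files the two regular expressions would pick up. The sketch below uses only the standard re module with the patterns copied from the script above. The file names are hypothetical, and applying the regex to the bare file name with re.search is an assumption about how the patterns are evaluated, not documented graphcast behavior.

```python
import re

# Regexes copied from the ingestion script above.
patterns = {
    "package": r"^package\.meta.*\.json(?:\.gz)?$",
    "bugs": r"^bugs.head.*\.json(?:\.gz)?$",
}


def matching_resource(filename):
    """Return the name of the first pattern that matches, or None."""
    for resource, regex in patterns.items():
        if re.search(regex, filename):
            return resource
    return None


# Hypothetical file names, used only to exercise the patterns.
for name in ["package.meta.2024.json.gz", "bugs.head.json", "notes.txt"]:
    print(name, "->", matching_resource(name))
# package.meta.2024.json.gz -> package
# bugs.head.json -> bugs
# notes.txt -> None
```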

Use Cases¶

This schema is useful for:

Package Management: Modeling software dependencies and conflicts
Ecosystem Analysis: Understanding complex dependency graphs
Compliance Checking: Identifying breaking changes and conflicts
Maintenance Planning: Tracking maintainer responsibilities

Key Takeaways¶

relation_from_key: true derives relationship types from the keys of the JSON structure
Nested Processing: nested key/apply blocks descend into hierarchical data
Flexible Relationships: a single package → package edge definition yields six dependency types
Scalable Modeling works with large package ecosystems

Comparison with Example 3¶

Example 3: Uses relation_field for CSV data with explicit relationship columns
Example 4: Uses relation_from_key for JSON data with implicit relationship structure
Both: Enable multiple relationship types between the same entity pairs
Difference: Data source format and relationship specification method
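
The contrast can be sketched in plain Python. The CSV rows and column names below are hypothetical (they are not taken from Example 3), and neither function uses graphcast. The point is only that the relation type comes from a column value in one case and from an enclosing dictionary key in the other.

```python
import csv
import io

# Hypothetical CSV: the relation type is an explicit column value.
csv_text = (
    "source,target,relation\n"
    "0ad-data,dpkg,pre-depends\n"
    "0ad-data,0ad,suggests\n"
)


def edges_from_rows(text, relation_field="relation"):
    """Read (source, relation, target) triples from CSV rows."""
    rows = csv.DictReader(io.StringIO(text))
    return [(row["source"], row[relation_field], row["target"]) for row in rows]


# JSON shaped like this example: the relation type is the enclosing key.
package = {
    "name": "0ad-data",
    "dependencies": {
        "pre-depends": [{"name": "dpkg"}],
        "suggests": [{"name": "0ad"}],
    },
}


def edges_from_keys(pkg):
    """Read the same triples from the nested dependency keys."""
    return [
        (pkg["name"], key, dep["name"])
        for key, deps in pkg["dependencies"].items()
        for dep in deps
    ]


# Both sources describe the same typed edges.
print(edges_from_rows(csv_text) == edges_from_keys(package))  # True
```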

Please refer to examples

For more examples and detailed explanations, refer to the API Reference.