Example 3: CSV with Edge Weights and Multiple Relations
This example demonstrates how to handle complex relationships where multiple edges can exist between the same pair of entities, each with different relation types and weights.
Data Structure
We have a CSV file representing business relationships between companies:
company_a | company_b | relation | date |
---|---|---|---|
Microsoft | OpenAI | invests_in | 2023-01-23 |
Microsoft | OpenAI | partners_with | 2023-03-15 |
Amazon | Whole Foods | acquires | 2017-08-28 |
Amazon | Whole Foods | integrates_with | 2018-02-14 |
Tesla | Panasonic | partners_with | 2014-07-31 |
Tesla | Panasonic | supplies_to | 2016-06-10 |
Apple | | competes_with | 2007-06-29 |
Apple | | collaborates_with | 2021-04-28 |
Notice that the same company pairs can have multiple different relationship types (e.g., Microsoft-OpenAI has both "invests_in" and "partners_with" relationships).
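To make the multi-relation structure concrete, here is a minimal sketch that groups the complete rows of the table by company pair using only the Python standard library. The inline CSV text and the variable names are illustrative assumptions and are not part of GraphCast.

```python
import csv
import io
from collections import defaultdict

# Inline copy of the complete rows from the table above (illustrative only).
CSV_TEXT = """company_a,company_b,relation,date
Microsoft,OpenAI,invests_in,2023-01-23
Microsoft,OpenAI,partners_with,2023-03-15
Amazon,Whole Foods,acquires,2017-08-28
Amazon,Whole Foods,integrates_with,2018-02-14
Tesla,Panasonic,partners_with,2014-07-31
Tesla,Panasonic,supplies_to,2016-06-10
"""

# Collect every relation type seen for each (company_a, company_b) pair.
relations_by_pair = defaultdict(list)
for row in csv.DictReader(io.StringIO(CSV_TEXT)):
    pair = (row["company_a"], row["company_b"])
    relations_by_pair[pair].append(row["relation"])

for pair, relations in relations_by_pair.items():
    print(pair, relations)
```

Each pair maps to more than one relation, which is exactly the situation a single fixed edge type cannot represent.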
Schema Configuration
Vertices
We define a simple company vertex; the resource mapping below populates its name field.
Edges
The key feature here is using relation_field to dynamically create different edge types:
edge_config:
  edges:
    - source: company
      target: company
      relation_field: relation
      weights:
        direct:
          - date
Key Concepts
relation_field Attribute
The relation_field: relation setting tells GraphCast to:
- Read the relation column from the CSV
- Create different edge types based on the values in that column
- Produce multiple edge types (invests_in, partners_with, acquires, etc.) instead of a single edge type
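Conceptually, this behaves like grouping rows by the value of one column. The sketch below illustrates that idea in plain Python; the function split_by_relation and the row dictionaries are hypothetical and do not reflect how GraphCast implements relation_field internally.

```python
from collections import defaultdict


def split_by_relation(rows, relation_field="relation"):
    """Group rows into per-edge-type buckets keyed by the relation column."""
    edges_by_type = defaultdict(list)
    for row in rows:
        edge_type = row[relation_field]
        edges_by_type[edge_type].append((row["company_a"], row["company_b"]))
    return dict(edges_by_type)


# Hypothetical sample rows, taken from the table in this example.
rows = [
    {"company_a": "Microsoft", "company_b": "OpenAI", "relation": "invests_in"},
    {"company_a": "Microsoft", "company_b": "OpenAI", "relation": "partners_with"},
    {"company_a": "Tesla", "company_b": "Panasonic", "relation": "partners_with"},
]

edges_by_type = split_by_relation(rows)
for edge_type, pairs in edges_by_type.items():
    print(edge_type, pairs)
```

The edge types are not declared anywhere in advance: they come entirely from the values found in the data.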
Edge Weights
The weights.direct: [date] configuration:
- Adds the date field as a weight property on each edge
- Allows temporal analysis of relationships
- Makes the date available as a property for filtering, sorting, or analysis
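As a rough illustration of what a date weight makes possible, the following sketch filters and sorts edges by date in plain Python. The edge dictionaries and the helper edges_since are assumptions made for this example; in practice the filtering would usually be done with a query against the graph database.

```python
from datetime import date

# Hypothetical in-memory edges carrying the date weight as an ISO string.
edges = [
    {"source": "Microsoft", "target": "OpenAI", "relation": "invests_in", "date": "2023-01-23"},
    {"source": "Amazon", "target": "Whole Foods", "relation": "acquires", "date": "2017-08-28"},
    {"source": "Amazon", "target": "Whole Foods", "relation": "integrates_with", "date": "2018-02-14"},
]


def edges_since(edges, cutoff):
    """Return edges dated on or after `cutoff`, oldest first."""
    selected = [e for e in edges if date.fromisoformat(e["date"]) >= cutoff]
    # ISO 8601 date strings sort chronologically, so a string sort is enough.
    return sorted(selected, key=lambda e: e["date"])


recent = edges_since(edges, date(2018, 1, 1))
for edge in recent:
    print(edge["date"], edge["source"], edge["relation"], edge["target"])
```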
Resource Mapping
The resource configuration maps the CSV columns to vertices and edges:
resources:
  - resource_name: relations
    apply:
      - target_vertex: company
        map:
          company_a: name
      - target_vertex: company
        map:
          company_b: name
This creates two company vertices for each row and establishes the relationship between them.
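The effect of the two map entries on a single row can be sketched as a simple column renaming. The helper apply_map and the resulting dictionaries are hypothetical and only illustrate the idea; they are not GraphCast's internal representation.

```python
def apply_map(row, column_map):
    """Rename mapped CSV columns to vertex fields, ignoring unmapped columns."""
    return {field: row[column] for column, field in column_map.items()}


# One row of the CSV, as a dictionary keyed by column name.
row = {
    "company_a": "Tesla",
    "company_b": "Panasonic",
    "relation": "supplies_to",
    "date": "2016-06-10",
}

# Each map entry yields one company vertex from the same row.
source = apply_map(row, {"company_a": "name"})
target = apply_map(row, {"company_b": "name"})

# The remaining columns describe the edge between the two vertices.
edge = {
    "source": source["name"],
    "target": target["name"],
    "relation": row["relation"],
    "date": row["date"],
}
print(source, target, edge)
```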
Graph Structure
The resulting graph structure shows multiple relationship types between the same entities.
Resource Structure
The resource mapping creates a clear structure for processing the CSV data.
Data Ingestion
The ingestion process is straightforward:
from suthing import ConfigFactory, FileHandle
from graphcast import Caster, Patterns, Schema

# Load the schema (vertices, edges, resources) from YAML
schema = Schema.from_dict(FileHandle.load("schema.yaml"))

# Connection settings for the target database
conn_conf = ConfigFactory.create_config({
    "protocol": "bolt",
    "hostname": "localhost",
    "port": 7688,
    "username": "neo4j",
    "password": "test!passfortesting",
})

# Select input files by name; the raw string keeps \. as a literal dot
patterns = Patterns.from_dict({
    "patterns": {
        "people": {"regex": r"^relations.*\.csv$"},
    }
})

# Ingest all matching files from the current directory
caster = Caster(schema)
caster.ingest_files(
    path=".",
    conn_conf=conn_conf,
    patterns=patterns,
    clean_start=True,
)
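Before running the ingestion, it can help to check which files the regular expression would select. The sketch below uses only the standard re module and assumes the pattern is matched against bare file names; the candidate names are made up for illustration.

```python
import re

# Same regular expression as in the patterns configuration.
pattern = re.compile(r"^relations.*\.csv$")

# Hypothetical file names that might sit in the input directory.
candidates = [
    "relations.csv",
    "relations_2023.csv",
    "people.csv",
    "relations.csv.bak",
]

# Keep only names that start with "relations" and end with ".csv".
matched = [name for name in candidates if pattern.match(name)]
print(matched)
```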
Use Cases
This pattern is particularly useful for:
- Business Intelligence: Tracking multiple types of relationships between companies
- Temporal Analysis: Analyzing how relationships evolve over time
- Network Analysis: Understanding complex business ecosystems
- Compliance: Tracking different types of business arrangements
Key Takeaways
- relation_field enables dynamic edge type creation from data
- Multiple edges can exist between the same vertex pair
- Edge weights add temporal or quantitative properties to relationships
- Flexible modeling supports complex real-world business scenarios
Please refer to the examples for more scenarios, and to the API Reference for detailed explanations.