Example 1: Multiple Tabular Sources
Suppose we have a table that represents people:
and a table that represents their roles in a company:
# departments
person_id,person,department
1,John Dow,Sales
2,Mary Arpe,R&D
3,Sid Mei,Customer Service
We want to define the vertices Person and Department and set up the rules for mapping tables to vertex key-value pairs.
Let's define vertices as
vertices:
  - name: person
    fields:
      - id
      - name
      - age
    indexes:
      - fields:
          - id
  - name: department
    fields:
      - name
    indexes:
      - fields:
          - name
and edges as
The graph structure is quite simple:
Let's define the mappings: we want to rename the fields person, person_id and department, and specify target_vertex explicitly to avoid a collision, since both Person and Department have a field called name.
- resource_name: people
  apply:
    - vertex: person
- resource_name: departments
  apply:
    - map:
        person: name
        person_id: id
    - target_vertex: department
      map:
        department: name
Department Resource
People Resource
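To make the effect of the mapping rules more concrete, here is a minimal sketch in plain Python that applies the same renames to the departments table shown earlier. It does not use graphcast and is not how the library is implemented; `RULES` and `apply_rules` are hypothetical names introduced only for illustration, and assigning the first `map` rule to the person vertex is an assumption of this sketch.

```python
import csv
import io

# Contents of the departments table from this example.
DEPARTMENTS_CSV = """person_id,person,department
1,John Dow,Sales
2,Mary Arpe,R&D
3,Sid Mei,Customer Service
"""

# Hypothetical stand-ins for the two `map` rules of the departments resource:
# each pairs a target vertex with a {source column: vertex field} rename.
# Attaching the first rule to "person" is an assumption made for illustration.
RULES = [
    ("person", {"person": "name", "person_id": "id"}),
    ("department", {"department": "name"}),
]


def apply_rules(row, rules):
    """Return {vertex name: document} for one table row."""
    return {
        vertex: {field: row[column] for column, field in renames.items()}
        for vertex, renames in rules
    }


rows = list(csv.DictReader(io.StringIO(DEPARTMENTS_CSV)))
docs = [apply_rules(row, RULES) for row in rows]

for doc in docs:
    print(doc)
```

Both vertices end up with a field called name, but the values stay apart because each rename is scoped to its own vertex. Note that the csv module reads every value as a string, so id is "1" rather than 1 in this sketch.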
Transforming the data and ingesting it into ArangoDB takes a few lines of code:
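from suthing import ConfigFactory, FileHandle
from graphcast import Caster, Patterns, Schema

# Load the schema defined above from schema.yaml.
schema = Schema.from_dict(FileHandle.load("schema.yaml"))

# Connection settings for the ArangoDB instance.
conn_conf = ConfigFactory.create_config(
    {
        "protocol": "http",
        "hostname": "localhost",
        "port": 8535,
        "username": "root",
        "password": "123",
        "database": "_system",
    }
)

# Associate each resource name with a regex that selects its files.
# Raw strings keep "\." from being treated as a string escape sequence.
patterns = Patterns.from_dict(
    {
        "patterns": {
            "people": {"regex": r"^people.*\.csv$"},
            "departments": {"regex": r"^dep.*\.csv$"},
        }
    }
)

# Transform the matching files found under `path` and ingest them.
caster = Caster(schema)
caster.ingest_files(
    path=".",
    conn_conf=conn_conf,
    patterns=patterns,
)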
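As a quick way to check which files the two regular expressions would select, the following sketch tests them against a few file names using only the standard `re` module. The file names are hypothetical, `match_resource` is a helper invented for this illustration, and the assumption that each regex is applied to the bare file name with `re.search` is not confirmed by this page.

```python
import re

# The same regular expressions as in the patterns above, keyed by resource name.
PATTERNS = {
    "people": r"^people.*\.csv$",
    "departments": r"^dep.*\.csv$",
}


def match_resource(filename):
    """Return the first resource whose regex matches the file name, else None."""
    for resource, regex in PATTERNS.items():
        if re.search(regex, filename):
            return resource
    return None


# Hypothetical file names, used only to exercise the patterns.
for name in ["people.csv", "people_2024.csv", "departments.csv", "notes.txt"]:
    print(name, "->", match_resource(name))
```

Because both expressions are anchored with ^ and $, a file such as people.csv.bak is not selected, while any CSV file whose name starts with dep is treated as a departments source.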
Please refer to the examples.
For more examples and detailed explanations, refer to the API Reference.