graphcast.cli.ingest
Data ingestion command-line interface for graph databases.
This module provides a CLI tool for ingesting data into graph databases. It supports batch processing, parallel execution, and various data formats. The tool can handle both initial database setup and incremental data ingestion.
Key Features
- Configurable batch processing
- Multi-core and multi-threaded execution
- Support for custom resource patterns
- Database initialization and cleanup options
- Flexible file discovery and processing
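As an illustration of the batch processing feature listed above, the following is a minimal sketch of splitting a stream of records into fixed-size batches. The helper name `iter_batches` is hypothetical and is not taken from graphcast; the library's actual batching logic may differ.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def iter_batches(items: Iterable[T], batch_size: int) -> Iterator[List[T]]:
    """Yield consecutive lists holding at most `batch_size` items each.

    Hypothetical helper for illustration; not part of graphcast.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be a positive integer")
    iterator = iter(items)
    while True:
        # islice pulls up to batch_size items without loading the whole input
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# 12 placeholder records split into batches of 5
batch_lengths = [len(batch) for batch in iter_batches(range(12), batch_size=5)]
print(batch_lengths)  # [5, 5, 2]
```

A larger batch size generally means fewer round trips to the database at the cost of more memory per batch, which is the trade-off an option like `--batch-size` exposes.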
Example
$ uv run ingest --db-config-path config/db.yaml --schema-path config/schema.yaml --source-path data/ --batch-size 5000 --n-cores 4
ingest(db_config_path, schema_path, source_path, limit_files, batch_size, n_cores, n_threads, fresh_start, init_only, resource_pattern_config_path)
Ingest data into a graph database.
This command processes data files and ingests them into a graph database according to the provided schema. It supports various configuration options for controlling the ingestion process.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
db_config_path | | Path to database configuration file | required |
schema_path | | Path to schema configuration file | required |
source_path | | Path to source data directory | required |
limit_files | | Optional limit on the number of files to process | optional |
batch_size | | Number of items to process in each batch | 5000 |
n_cores | | Number of CPU cores to use for parallel processing | 1 |
n_threads | | Number of threads per core for parallel processing | 1 |
fresh_start | | Whether to wipe the existing database before ingestion | optional |
init_only | | Whether to only initialize the database without ingesting data | optional |
resource_pattern_config_path | | Optional path to resource pattern configuration | optional |
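To make the relationship between the parameters and their command-line flags concrete, here is a hedged sketch that mirrors the options with the standard-library `argparse` module. It is not graphcast's implementation, which may use a different CLI library. Only the flag spellings that appear in the examples on this page are confirmed; the others (`--limit-files`, `--n-threads`, `--init-only`, `--resource-pattern-config-path`) are inferred from the parameter names, and the `None`/`False` defaults are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical argparse mirror of the documented ingest options."""
    parser = argparse.ArgumentParser(prog="ingest")
    # Required paths
    parser.add_argument("--db-config-path", required=True)
    parser.add_argument("--schema-path", required=True)
    parser.add_argument("--source-path", required=True)
    # Optional tuning knobs; numeric defaults follow the documented values
    parser.add_argument("--limit-files", type=int, default=None)  # assumed default
    parser.add_argument("--batch-size", type=int, default=5000)
    parser.add_argument("--n-cores", type=int, default=1)
    parser.add_argument("--n-threads", type=int, default=1)
    # Boolean switches, assumed to be off unless passed
    parser.add_argument("--fresh-start", action="store_true")
    parser.add_argument("--init-only", action="store_true")
    parser.add_argument("--resource-pattern-config-path", default=None)  # assumed default
    return parser


args = build_parser().parse_args(
    [
        "--db-config-path", "config/db.yaml",
        "--schema-path", "config/schema.yaml",
        "--source-path", "data/",
        "--n-cores", "4",
    ]
)
print(args.batch_size, args.n_cores, args.fresh_start)  # 5000 4 False
```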
Example
$ uv run ingest --db-config-path config/db.yaml --schema-path config/schema.yaml --source-path data/ --batch-size 5000 --n-cores 4 --fresh-start
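When the command needs to be launched from a script rather than a shell, the argument list can be assembled programmatically. The sketch below is one possible way to do that; `build_ingest_command` is a hypothetical helper, not a graphcast function, and it only uses flags that appear in the example above.

```python
import shlex
from typing import List


def build_ingest_command(
    db_config_path: str,
    schema_path: str,
    source_path: str,
    batch_size: int = 5000,
    n_cores: int = 1,
    fresh_start: bool = False,
) -> List[str]:
    """Assemble the argv list for the ingest CLI (hypothetical helper)."""
    cmd = [
        "uv", "run", "ingest",
        "--db-config-path", db_config_path,
        "--schema-path", schema_path,
        "--source-path", source_path,
        "--batch-size", str(batch_size),
        "--n-cores", str(n_cores),
    ]
    if fresh_start:
        # Boolean switch: present means "wipe the database first"
        cmd.append("--fresh-start")
    return cmd


cmd = build_ingest_command(
    "config/db.yaml", "config/schema.yaml", "data/", n_cores=4, fresh_start=True
)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing the command as a list to `subprocess.run` avoids shell quoting problems with paths that contain spaces.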
Source code in graphcast/cli/ingest.py