Example 10: TigerGraph bulk load and S3 staging
This example shows how to combine:
- TigergraphConfig.bulk_load: CSV staging + native LOADING JOB instead of REST++ upserts for the ingest run.
- Bindings.staging_proxy: manifest-visible names that map to S3GeneralizedConnConfig on an InMemoryConnectionProvider (no secrets in YAML).
The companion directory is:
Prerequisites
- A running TigerGraph instance (for example, TigergraphConfig.from_docker_env() against docker/tigergraph/.env).
- For S3 upload during finalize: either AWS S3, or a MinIO (or other S3-compatible) server reachable from your machine, and also from TigerGraph if the loader must read s3:// URIs (network and IAM policies are deployment-specific).
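Before running the example, it can help to confirm that the S3 endpoint accepts connections from your machine. The following is a minimal sketch using only the Python standard library; the helper name endpoint_reachable is hypothetical and not part of graflo. It only checks TCP reachability from the host where it runs, so it says nothing about whether TigerGraph itself can reach the endpoint or whether the credentials are valid.

```python
import socket
from urllib.parse import urlparse


def endpoint_reachable(endpoint_url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the endpoint's host and port succeeds.

    Hypothetical helper for illustration; not part of graflo.
    """
    parsed = urlparse(endpoint_url)
    host = parsed.hostname
    if host is None:
        # Not a URL with a network location, e.g. a bare string.
        return False
    # Fall back to the scheme's default port when none is given.
    port = parsed.port or (443 if parsed.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the MinIO setup used later in this example, the call would be endpoint_reachable("http://127.0.0.1:9000").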
Manifest: staging_proxy
The manifest adds a small staging table beside ordinary connectors:
The label bulk_s3 is referenced from TigergraphConfig.bulk_load.s3_staging_name. The label minio_bulk is the key used when registering S3GeneralizedConnConfig in Python.
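The two labels form a two-step lookup: the name that bulk_load refers to presumably resolves through the manifest's staging table to a registration key, and that key resolves to the S3 config registered at runtime. The sketch below illustrates that indirection with plain dictionaries. It is not graflo's manifest schema or its resolution code; the function resolve_staging and both dictionaries are hypothetical stand-ins.

```python
# Hypothetical stand-ins for illustration; not graflo's classes or manifest schema.

# Step 1 (manifest side): manifest-visible name -> registration key.
staging_proxy = {"bulk_s3": "minio_bulk"}

# Step 2 (runtime side): registration key -> S3 settings held in memory.
registered_configs = {
    "minio_bulk": {
        "bucket": "graflo-staging",
        "endpoint_url": "http://127.0.0.1:9000",
    }
}


def resolve_staging(name: str, proxies: dict, configs: dict) -> dict:
    """Follow name -> registration key -> config, failing loudly at either step."""
    if name not in proxies:
        raise KeyError(f"no staging_proxy entry named {name!r}")
    key = proxies[name]
    if key not in configs:
        raise KeyError(f"no S3 config registered under {key!r}")
    return configs[key]


s3_settings = resolve_staging("bulk_s3", staging_proxy, registered_configs)
```

The point of the indirection is that the manifest only ever contains names, while bucket, endpoint, and credentials stay in the process that registers the config.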
Runtime: register S3 config and ingest
import os

from graflo.hq.connection_provider import (
    InMemoryConnectionProvider,
    S3GeneralizedConnConfig,
)

# Register the S3 settings under the key that the manifest's staging table points to.
# Credentials are read from the environment, so no secrets appear in YAML.
provider = InMemoryConnectionProvider()
provider.register_generalized_config(
    conn_proxy="minio_bulk",
    config=S3GeneralizedConnConfig(
        bucket=os.environ.get("BULK_S3_BUCKET", "graflo-staging"),
        region="us-east-1",
        endpoint_url=os.environ.get("MINIO_ENDPOINT", "http://127.0.0.1:9000"),
        aws_access_key_id=os.environ.get("MINIO_ACCESS_KEY", "minioadmin"),
        aws_secret_access_key=os.environ.get("MINIO_SECRET_KEY", "minioadmin"),
    ),
)

# engine, manifest, conn_conf, and ingestion_params are not defined in this
# excerpt; see ingest.py for how they are built.
engine.define_and_ingest(
    manifest=manifest,
    target_db_config=conn_conf,
    ingestion_params=ingestion_params,
    connection_provider=provider,
)
See ingest.py in the example folder for a full script that sets bulk_load on the TigerGraph config and runs define_and_ingest.
Emulating S3 locally
The TigerGraph bulk load guide compares MinIO, LocalStack, and moto. For a quick MinIO container:
docker run -p 9000:9000 -p 9001:9001 \
  -e MINIO_ROOT_USER=minioadmin -e MINIO_ROOT_PASSWORD=minioadmin \
  minio/minio server /data --console-address ":9001"
Port 9000 serves the S3 API and port 9001 serves the web console, so both must be published. Create a bucket (e.g. graflo-staging) in the console at http://127.0.0.1:9001, then point MINIO_ENDPOINT at http://127.0.0.1:9000 when running the example.
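The bucket can also be created from Python rather than through the console. The sketch below assumes boto3 is installed and reuses the environment variable names and defaults from the runtime snippet above; the helpers ensure_bucket and minio_client are hypothetical and not part of graflo.

```python
import os


def ensure_bucket(client, bucket: str) -> bool:
    """Create the bucket if it is missing. Return True only when it was created."""
    existing = {b["Name"] for b in client.list_buckets().get("Buckets", [])}
    if bucket in existing:
        return False
    client.create_bucket(Bucket=bucket)
    return True


def minio_client():
    """Build a boto3 S3 client pointed at the local MinIO endpoint."""
    import boto3  # assumed installed; imported lazily so ensure_bucket has no hard dependency

    return boto3.client(
        "s3",
        endpoint_url=os.environ.get("MINIO_ENDPOINT", "http://127.0.0.1:9000"),
        aws_access_key_id=os.environ.get("MINIO_ACCESS_KEY", "minioadmin"),
        aws_secret_access_key=os.environ.get("MINIO_SECRET_KEY", "minioadmin"),
        region_name="us-east-1",
    )


# With the MinIO container running:
# ensure_bucket(minio_client(), os.environ.get("BULK_S3_BUCKET", "graflo-staging"))
```

Because ensure_bucket checks the existing buckets first, it is safe to call at the start of every run.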