graflo.data_source.memory
In-memory data source implementations.
This module provides data source implementations for in-memory data structures, including lists of dictionaries, lists of lists, and Pandas DataFrames.
InMemoryDataSource (dataclass)
Bases: AbstractDataSource
Data source for in-memory data structures.
This class provides a data source for Python objects that are already in memory, including lists of dictionaries, lists of lists, and Pandas DataFrames.
Attributes:
| Name | Type | Description |
|---|---|---|
| data | `list[dict] \| list[list] \| DataFrame` | Data to process (list[dict], list[list], or pd.DataFrame) |
| columns | `list[str] \| None` | Optional column names for list[list] data |
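To illustrate why `columns` matters for `list[list]` data, here is a minimal sketch of how positional rows could be turned into dictionaries keyed by column name. The helper `rows_to_dicts` is hypothetical and is not part of graflo. How `InMemoryDataSource` behaves when `columns` is omitted is not stated here, so the sketch simply requires it.

```python
from __future__ import annotations


def rows_to_dicts(rows: list[list], columns: list[str]) -> list[dict]:
    """Hypothetical helper: pair each positional row with the column names."""
    for row in rows:
        # Reject rows that would silently lose values when zipped with columns.
        if len(row) != len(columns):
            raise ValueError(
                f"row has {len(row)} values but {len(columns)} columns were given"
            )
    return [dict(zip(columns, row)) for row in rows]


rows = [["alice", 30], ["bob", 25]]
docs = rows_to_dicts(rows, columns=["name", "age"])
print(docs)
```

A list of dictionaries already carries its own keys, which is presumably why `columns` is documented only for the `list[list]` case.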
Source code in graflo/data_source/memory.py
__post_init__()
iter_batches(batch_size=1000, limit=None)
Iterate over in-memory data in batches.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| batch_size | `int` | Number of items per batch | `1000` |
| limit | `int \| None` | Maximum number of items to retrieve | `None` |
Yields:
| Type | Description |
|---|---|
| `list[dict]` | Batches of documents as dictionaries |
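The interaction between `batch_size` and `limit` can be sketched with a small standalone generator. This is an illustration under assumptions, not graflo's implementation: `iter_batches_sketch` is a hypothetical stand-in, and it reads `limit` as a cap on the total number of items (as the parameter description suggests) rather than on the number of batches.

```python
from __future__ import annotations

from collections.abc import Iterator
from itertools import islice


def iter_batches_sketch(
    records: list[dict],
    batch_size: int = 1000,
    limit: int | None = None,
) -> Iterator[list[dict]]:
    """Hypothetical stand-in: yield lists of at most batch_size records."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # islice(records, None) iterates over every record; an integer limit
    # truncates the stream after that many items.
    stream = islice(records, limit)
    # Pull batch_size items at a time until the stream is exhausted.
    while batch := list(islice(stream, batch_size)):
        yield batch


records = [{"id": i} for i in range(7)]
batches = list(iter_batches_sketch(records, batch_size=3, limit=5))
print([len(b) for b in batches])  # [3, 2]
```

With seven records, `limit=5` keeps the first five, and `batch_size=3` splits them into one full batch and one partial batch. The final batch can be shorter than `batch_size`, so consumers should not assume a fixed length.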