API Reference¶
Complete API documentation for the ml4t-data library, auto-generated from
source docstrings via mkdocstrings.
DataManager¶
The primary entry point for all data operations. DataManager is a facade that
delegates to focused manager classes for configuration, fetching, storage,
metadata, and batch operations.
from ml4t.data import DataManager
# Fetch-only (no storage)
manager = DataManager()
df = manager.fetch("AAPL", "2024-01-01", "2024-12-31", provider="yahoo")
# With storage for load/update workflows
from ml4t.data.storage import HiveStorage, StorageConfig
storage = HiveStorage(StorageConfig(base_path="./data"))
manager = DataManager(storage=storage, use_transactions=True)
key = manager.load("AAPL", "2024-01-01", "2024-12-31")
key = manager.update("AAPL")
DataManager
¶
DataManager(
config_path=None,
output_format="polars",
providers=None,
storage=None,
use_transactions=False,
enable_validation=True,
progress_callback=None,
**kwargs,
)
Unified interface for financial data access and storage.
The DataManager provides a single, consistent API for fetching and managing data from multiple providers. It handles:
Data Fetching:
- Provider selection based on symbol patterns
- Configuration management (YAML, environment, parameters)
- Connection pooling and session management
- Output format conversion (Polars, pandas, lazy)
- Batch fetching with error handling
Storage Operations (when storage configured):
- Initial data loading with validation
- Incremental updates with gap detection and filling
- Transaction support for ACID guarantees
- Progress callbacks for UI integration
- Data validation (OHLCV, cross-validation)
Usage:
Fetch only (no storage):
>>> manager = DataManager()
>>> df = manager.fetch("AAPL", "2024-01-01", "2024-12-31", provider="yahoo")
With storage for load/update:
>>> from ml4t.data.storage.hive import HiveStorage
>>> from ml4t.data.storage.backend import StorageConfig
>>> storage = HiveStorage(StorageConfig(base_path="./data"))
>>> manager = DataManager(storage=storage, use_transactions=True)
>>> key = manager.load("AAPL", "2024-01-01", "2024-12-31")
>>> key = manager.update("AAPL")  # Incremental update
Initialize DataManager.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config_path` | `str \| None` | Path to YAML configuration file | `None` |
| `output_format` | `str` | Output format ('polars', 'pandas', 'lazy') | `'polars'` |
| `providers` | `dict[str, dict[str, Any]] \| None` | Provider-specific configuration overrides | `None` |
| `storage` | `Any \| None` | Optional storage backend for load/update operations | `None` |
| `use_transactions` | `bool` | Enable transactional storage for ACID guarantees | `False` |
| `enable_validation` | `bool` | Enable data validation during load/update | `True` |
| `progress_callback` | `Callable[[str, float], None] \| None` | Optional callback for progress updates (message, progress) | `None` |
| `**kwargs` | | Additional configuration parameters | `{}` |
fetch
¶
Fetch data for a symbol.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `symbol` | `str` | Symbol to fetch | required |
| `start` | `str` | Start date (YYYY-MM-DD) | required |
| `end` | `str` | End date (YYYY-MM-DD) | required |
| `frequency` | `str` | Data frequency (daily, hourly, etc.) | `'daily'` |
| `provider` | `str \| None` | Optional provider override | `None` |
| `**kwargs` | | Additional provider-specific parameters | `{}` |
Returns:
| Type | Description |
|---|---|
| `DataFrame \| LazyFrame \| Any` | Data in configured output format |
Raises:
| Type | Description |
|---|---|
| `ValueError` | If no provider found or data fetch fails |
fetch_batch
¶
Fetch data for multiple symbols.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `symbols` | `list[str]` | List of symbols to fetch | required |
| `start` | `str` | Start date (YYYY-MM-DD) | required |
| `end` | `str` | End date (YYYY-MM-DD) | required |
| `frequency` | `str` | Data frequency | `'daily'` |
| `**kwargs` | | Additional parameters | `{}` |
Returns:
| Type | Description |
|---|---|
| `dict[str, DataFrame \| LazyFrame \| Any \| None]` | Dictionary mapping symbols to data (or None if fetch failed) |
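Because `fetch_batch` maps each symbol either to its data or to `None` when the fetch failed, callers usually separate the two cases before continuing. The following is a minimal sketch that uses a plain dictionary as a stand-in for the real return value; the helper name `split_batch_results` is hypothetical and not part of ml4t-data.

```python
from typing import Any


def split_batch_results(results: dict[str, Any]) -> tuple[dict[str, Any], list[str]]:
    """Separate successful fetches from failed ones (value is None)."""
    succeeded = {sym: data for sym, data in results.items() if data is not None}
    failed = sorted(sym for sym, data in results.items() if data is None)
    return succeeded, failed


# Stand-in for the dictionary returned by manager.fetch_batch(...);
# real values would be DataFrames rather than lists.
results = {"AAPL": [101.2, 102.5], "MSFT": [410.0], "BADSYM": None}
succeeded, failed = split_batch_results(results)
print(sorted(succeeded), failed)
```

Keeping the failed symbols in a separate list makes it straightforward to retry them or report them without touching the successful results.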
batch_load
¶
batch_load(
symbols,
start,
end,
frequency="daily",
provider=None,
max_workers=4,
fail_on_partial=False,
**kwargs,
)
Fetch data for multiple symbols and return in multi-asset stacked format.
batch_load_universe
¶
batch_load_universe(
universe,
start,
end,
frequency="daily",
provider=None,
max_workers=4,
fail_on_partial=False,
**kwargs,
)
Fetch data for all symbols in a pre-defined universe.
batch_load_from_storage
¶
batch_load_from_storage(
symbols,
start,
end,
frequency="daily",
asset_class="equities",
provider=None,
fetch_missing=True,
max_workers=4,
**kwargs,
)
Load multiple symbols from storage with optional fetch fallback.
load
¶
load(
symbol,
start,
end,
frequency="daily",
asset_class="equities",
provider=None,
bar_type="time",
bar_threshold=None,
exchange="UNKNOWN",
calendar=None,
)
Load data from provider and store it.
import_data
¶
import_data(
data,
symbol,
provider,
frequency="daily",
asset_class="equities",
bar_type="time",
bar_threshold=None,
exchange="UNKNOWN",
calendar=None,
)
Import external data into storage with metadata.
update
¶
update(
symbol,
frequency="daily",
asset_class="equities",
lookback_days=7,
fill_gaps=True,
provider=None,
)
Update existing data with incremental fetch.
list_symbols
¶
List all symbols in storage, optionally filtered by metadata.
get_metadata
¶
Get metadata for a specific symbol.
assign_sessions
¶
Assign session_date column to DataFrame based on exchange calendar.
complete_sessions
¶
complete_sessions(
df,
exchange=None,
calendar=None,
fill_gaps=True,
fill_method="forward",
zero_volume=True,
)
Complete sessions by filling gaps.
update_all
¶
Update all stored data matching the filters.
Storage¶
StorageConfig¶
Dataclass configuring the storage backend. Controls partitioning strategy, compression, locking, and metadata tracking.
from ml4t.data.storage import StorageConfig
# Hive-partitioned storage for minute data
config = StorageConfig(
base_path="./market_data",
strategy="hive",
partition_granularity="day",
compression="zstd",
)
# Flat storage for small datasets
config = StorageConfig(
base_path="./data",
strategy="flat",
compression="snappy",
)
StorageConfig
dataclass
¶
StorageConfig(
base_path,
strategy="hive",
compression="zstd",
partition_granularity="month",
partition_cols=None,
atomic_writes=True,
enable_locking=True,
metadata_tracking=True,
generate_profile=True,
)
Configuration for storage backends.
Attributes:
| Name | Type | Description |
|---|---|---|
| `base_path` | `Path` | Base directory for storage. |
| `strategy` | `str` | Storage strategy ("hive" or "flat"). |
| `compression` | `str \| None` | Compression type for Parquet files. |
| `partition_granularity` | `PartitionGranularityType` | Time-based partition granularity for Hive storage. "year": best for daily data (~252 rows/partition for stocks); "month": best for hourly data (~720 rows/partition); "day": best for minute data (~1,440 rows/partition); "hour": best for second/tick data (~3,600 rows/partition). |
| `partition_cols` | `list[str] \| None` | Deprecated. Use partition_granularity instead. |
| `atomic_writes` | `bool` | Use atomic writes with temp file rename. |
| `enable_locking` | `bool` | Enable file locking for concurrent access. |
| `metadata_tracking` | `bool` | Track metadata in manifest files. |
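The rows-per-partition figures quoted for `partition_granularity` follow from simple arithmetic, and the guidance can be read as a lookup from data frequency to granularity. The sketch below spells that out; the mapping and the helper `suggest_granularity` are illustrative assumptions, not library functions.

```python
# Rough rows-per-partition arithmetic behind the guidance above.
ROWS_PER_PARTITION = {
    "year": 252,       # daily bars: ~252 trading days per year
    "month": 24 * 30,  # hourly bars: 24 h x ~30 days = 720
    "day": 60 * 24,    # minute bars: 60 min x 24 h = 1,440
    "hour": 60 * 60,   # second bars: 60 s x 60 min = 3,600
}

# Hypothetical mapping from data frequency to the suggested granularity.
GRANULARITY_BY_FREQUENCY = {
    "daily": "year",
    "hourly": "month",
    "minute": "day",
    "second": "hour",
    "tick": "hour",
}


def suggest_granularity(frequency: str) -> str:
    """Return the partition granularity suggested for a data frequency."""
    try:
        return GRANULARITY_BY_FREQUENCY[frequency]
    except KeyError:
        raise ValueError(f"no suggestion for frequency {frequency!r}") from None


print(suggest_granularity("minute"), ROWS_PER_PARTITION["day"])
```

The result of `suggest_granularity` could then be passed as `partition_granularity` when building a `StorageConfig`.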
StorageBackend¶
Abstract base class defining the storage interface. All backends (Hive, Flat) implement this contract.
StorageBackend
¶
Bases: ABC
Abstract base class for storage backends.
Initialize storage backend with configuration.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `StorageConfig` | Storage configuration | required |
write
abstractmethod
¶
Write data to storage.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `LazyFrame` | Polars LazyFrame to write | required |
| `key` | `str` | Storage key (e.g., "BTC-USD", "SPY") | required |
| `metadata` | `dict[str, Any] \| None` | Optional metadata to store alongside data | `None` |
Returns:
| Type | Description |
|---|---|
| `Path` | Path to written file |
read
abstractmethod
¶
Read data from storage.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `key` | `str` | Storage key | required |
| `start_date` | `datetime \| None` | Optional start date filter | `None` |
| `end_date` | `datetime \| None` | Optional end date filter | `None` |
| `columns` | `list[str] \| None` | Optional columns to select | `None` |
Returns:
| Type | Description |
|---|---|
| `LazyFrame` | Polars LazyFrame with requested data |
list_keys
abstractmethod
¶
List all available keys in storage.
Returns:
| Type | Description |
|---|---|
| `list[str]` | List of storage keys |
exists
abstractmethod
¶
Check if a key exists in storage.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `key` | `str` | Storage key to check | required |
Returns:
| Type | Description |
|---|---|
| `bool` | True if key exists |
delete
abstractmethod
¶
Delete data for a key.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `key` | `str` | Storage key to delete | required |
Returns:
| Type | Description |
|---|---|
| `bool` | True if deletion was successful |
get_metadata
¶
Get metadata for a key.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `key` | `str` | Storage key | required |
Returns:
| Type | Description |
|---|---|
| `dict[str, Any] \| None` | Metadata dict or None |
HiveStorage¶
Hive-partitioned storage with configurable time-based partitioning. Partition pruning gives time-range queries a reported 7x performance improvement.
from ml4t.data.storage import HiveStorage, StorageConfig
config = StorageConfig(
base_path="./data",
partition_granularity="month", # year, month, day, or hour
)
storage = HiveStorage(config)
# Write data (partitions by timestamp automatically)
storage.write(df, "equities/daily/AAPL")
# Read with partition pruning
from datetime import datetime
lf = storage.read(
"equities/daily/AAPL",
start_date=datetime(2024, 6, 1),
end_date=datetime(2024, 12, 31),
columns=["timestamp", "close", "volume"],
)
df = lf.collect()
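Partition pruning means that a time-range read only opens the partitions overlapping the requested range. The sketch below illustrates the idea for month granularity. It assumes `year=YYYY/month=MM` directory names, a common Hive convention; the actual on-disk layout used by HiveStorage is not shown in this reference, and `month_partitions` is a hypothetical helper.

```python
from datetime import datetime


def month_partitions(start: datetime, end: datetime) -> list[str]:
    """List the month partitions that overlap the range [start, end]."""
    if end < start:
        raise ValueError("end must not be earlier than start")
    partitions = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        partitions.append(f"year={year}/month={month:02d}")
        # Advance one calendar month, rolling over the year in December.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return partitions


# The range from the read() example touches 7 of the 12 month partitions of 2024.
needed = month_partitions(datetime(2024, 6, 1), datetime(2024, 12, 31))
print(needed)
```

Every partition outside this list can be skipped without being opened, which is where the speedup for time-range queries comes from.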
HiveStorage
¶
Bases: StorageBackend
Hive partitioned storage with configurable time-based partitioning.
This implementation provides:
- 7x query performance improvement for time-based queries
- Configurable partition granularity (year, month, day, hour)
- Atomic writes with temp file pattern
- Metadata tracking in JSON manifests
- File locking for concurrent access safety
- Polars lazy evaluation throughout
Partition Granularity
Configure via StorageConfig.partition_granularity:
- "year": Best for daily data (~252 rows/partition)
- "month": Best for hourly data (~720 rows/partition) [default]
- "day": Best for minute data (~1,440 rows/partition)
- "hour": Best for second/tick data (~3,600 rows/partition)
Example
from ml4t.data.storage import HiveStorage, StorageConfig
# For minute data, use day-level partitioning
config = StorageConfig(base_path="./data", partition_granularity="day")
storage = HiveStorage(config)
Initialize Hive storage backend.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `StorageConfig` | Storage configuration | required |
write
¶
Write data using Hive partitioning.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `LazyFrame \| DataFrame \| DataObject` | Data to write (DataFrame, LazyFrame, or DataObject) | required |
| `key` | `str \| None` | Storage key (e.g., "BTC-USD" or "equities/daily/AAPL"). Optional if data is DataObject. | `None` |
| `metadata` | `dict[str, Any] \| None` | Optional metadata dict | `None` |
Returns:
| Type | Description |
|---|---|
| `Path \| str` | Path to base directory (old API) or storage key string (new DataObject API) |
read
¶
Read data from Hive partitions.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `key` | `str` | Storage key | required |
| `start_date` | `datetime \| None` | Optional start date filter | `None` |
| `end_date` | `datetime \| None` | Optional end date filter | `None` |
| `columns` | `list[str] \| None` | Optional columns to select | `None` |
Returns:
| Type | Description |
|---|---|
| `LazyFrame` | LazyFrame with requested data |
list_keys
¶
List all keys in storage.
Returns:
| Type | Description |
|---|---|
| `list[str]` | List of storage keys |
exists
¶
Check if key exists.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `key` | `str` | Storage key | required |
Returns:
| Type | Description |
|---|---|
| `bool` | True if key exists |
delete
¶
Delete all data for a key.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `key` | `str` | Storage key | required |
Returns:
| Type | Description |
|---|---|
| `bool` | True if successful |
get_latest_timestamp
¶
Get the latest timestamp for a symbol from a provider.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `symbol` | `str` | Symbol identifier | required |
| `provider` | `str` | Data provider name | required |
Returns:
| Type | Description |
|---|---|
| `datetime \| None` | Latest timestamp in the dataset, or None if no data exists |
save_chunk
¶
Save an incremental data chunk.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `DataFrame` | DataFrame with OHLCV data | required |
| `symbol` | `str` | Symbol identifier | required |
| `provider` | `str` | Data provider name | required |
| `start_time` | `datetime` | Start time of this chunk | required |
| `end_time` | `datetime` | End time of this chunk | required |
Returns:
| Type | Description |
|---|---|
| `Path` | Path to the saved chunk file |
update_combined_file
¶
Update the main combined file with new data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `DataFrame` | New data to append | required |
| `symbol` | `str` | Symbol identifier | required |
| `provider` | `str` | Data provider name | required |
Returns:
| Type | Description |
|---|---|
| `int` | Number of new records added (after deduplication) |
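The return value counts only rows that survive deduplication, so re-sending rows that are already stored adds nothing. The sketch below shows one way such a count could be computed on plain dictionaries. It assumes rows are keyed by `timestamp` and that an existing row wins over a re-sent one; the library's actual deduplication rule is not documented here, and `append_deduplicated` is a hypothetical name.

```python
from datetime import datetime


def append_deduplicated(existing: list[dict], new: list[dict]) -> tuple[list[dict], int]:
    """Merge new rows into existing ones, keyed by timestamp.

    Returns the combined rows sorted by timestamp and the number of rows
    that were actually added (timestamps not already present).
    """
    by_timestamp = {row["timestamp"]: row for row in existing}
    added = 0
    for row in new:
        if row["timestamp"] not in by_timestamp:
            by_timestamp[row["timestamp"]] = row
            added += 1
    combined = sorted(by_timestamp.values(), key=lambda row: row["timestamp"])
    return combined, added


existing = [
    {"timestamp": datetime(2024, 1, 2), "close": 10.0},
    {"timestamp": datetime(2024, 1, 3), "close": 11.0},
]
new = [
    {"timestamp": datetime(2024, 1, 3), "close": 11.0},  # overlaps stored data
    {"timestamp": datetime(2024, 1, 4), "close": 12.0},  # genuinely new
]
combined, added = append_deduplicated(existing, new)
print(len(combined), added)
```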
read_data
¶
Read data for a symbol with optional time filtering.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `symbol` | `str` | Symbol identifier | required |
| `provider` | `str` | Data provider name | required |
| `start_time` | `datetime \| None` | Optional start time filter | `None` |
| `end_time` | `datetime \| None` | Optional end time filter | `None` |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | DataFrame with filtered data |
update_metadata
¶
Update metadata after incremental update.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `symbol` | `str` | Symbol identifier | required |
| `provider` | `str` | Data provider name | required |
| `last_update` | `datetime` | Timestamp of this update | required |
| `records_added` | `int` | Number of records added | required |
| `chunk_file` | `str` | Name of the chunk file saved | required |
FlatStorage¶
Simple single-file-per-key storage. Suitable for smaller datasets or when partition pruning is not beneficial.
from ml4t.data.storage import FlatStorage, StorageConfig
config = StorageConfig(base_path="./data", strategy="flat")
storage = FlatStorage(config)
storage.write(df, "reference/spy")
lf = storage.read("reference/spy")
FlatStorage
¶
Bases: StorageBackend
Flat file storage without partitioning.
This implementation provides:
- Simple single-file storage per key
- Atomic writes with temp file pattern
- Metadata tracking in JSON manifests
- File locking for concurrent access safety
- Polars lazy evaluation throughout
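The "atomic writes with temp file pattern" item refers to a general technique: write to a temporary file next to the target, then rename it over the target so readers never see a half-written file. The sketch below shows that technique with the standard library only; it is not the library's own implementation, and `atomic_write_bytes` is a hypothetical helper.

```python
import os
import tempfile
from pathlib import Path


def atomic_write_bytes(target: Path, payload: bytes) -> Path:
    """Write payload to target so readers never observe a partial file."""
    target.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file in the target directory so the rename stays on
    # one filesystem, which is what lets os.replace swap it in atomically.
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, target)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return target


with tempfile.TemporaryDirectory() as root:
    path = atomic_write_bytes(Path(root) / "reference" / "spy.parquet", b"demo")
    print(path.read_bytes())
```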
Initialize flat storage backend.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `config` | `StorageConfig` | Storage configuration | required |
write
¶
Write data as a single file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `data` | `LazyFrame \| DataFrame` | Data to write | required |
| `key` | `str` | Storage key (e.g., "BTC-USD") | required |
| `metadata` | `dict[str, Any] \| None` | Optional metadata | `None` |
Returns:
| Type | Description |
|---|---|
| `Path` | Path to written file |
read
¶
Read data from flat file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `key` | `str` | Storage key | required |
| `start_date` | `datetime \| None` | Optional start date filter | `None` |
| `end_date` | `datetime \| None` | Optional end date filter | `None` |
| `columns` | `list[str] \| None` | Optional columns to select | `None` |
Returns:
| Type | Description |
|---|---|
| `LazyFrame` | LazyFrame with requested data |
list_keys
¶
List all keys in storage.
Returns:
| Type | Description |
|---|---|
| `list[str]` | List of storage keys |
exists
¶
Check if key exists.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `key` | `str` | Storage key | required |
Returns:
| Type | Description |
|---|---|
| `bool` | True if key exists |
delete
¶
Delete data for a key.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `key` | `str` | Storage key | required |
Returns:
| Type | Description |
|---|---|
| `bool` | True if successful |
create_storage¶
Factory function for creating storage backends from a strategy name.
from ml4t.data.storage import create_storage
storage = create_storage("./data", strategy="hive", partition_granularity="day")
create_storage
¶
Create a storage backend with the specified strategy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `base_path` | `str \| Path` | Base directory for storage | required |
| `strategy` | `str` | Storage strategy ("hive" or "flat") | `'hive'` |
| `**kwargs` | | Additional configuration options | `{}` |
Returns:
| Type | Description |
|---|---|
| `StorageBackend` | Configured storage backend |
Example
storage = create_storage("/data", strategy="hive")
storage.write(df.lazy(), "BTC-USD")
Providers¶
BaseProvider¶
Abstract base class for all data providers. Composes rate-limiting, circuit-breaker, validation, and HTTP session mixins into a single base.
Concrete providers implement either:
- _fetch_and_transform_data() for a single-step workflow, or
- _fetch_raw_data() + _transform_data() for a two-step workflow.
from ml4t.data.providers.base import BaseProvider
import polars as pl
class MyProvider(BaseProvider):
    @property
    def name(self) -> str:
        return "my_provider"

    def _fetch_and_transform_data(self, symbol, start, end, frequency):
        # Fetch from API and return canonical OHLCV DataFrame
        ...
BaseProvider
¶
Bases: RateLimitMixin, CircuitBreakerMixin, ValidationMixin, SessionMixin, ABC
Enhanced base provider composing all mixins.
All providers must return OHLCV data in the canonical schema with columns in standard order: [timestamp, symbol, open, high, low, close, volume].
Each provider must implement either:
- _fetch_and_transform_data() for single-step implementation
- _fetch_raw_data() + _transform_data() for two-step implementation
Class Variables
- DEFAULT_RATE_LIMIT: Default (calls, period_seconds) for rate limiting
- FREQUENCY_MAP: Mapping of frequency names to provider-specific values
- CIRCUIT_BREAKER_CONFIG: Circuit breaker failure threshold and reset timeout
Key Contracts
- Columns always in order: timestamp, symbol, open, high, low, close, volume
- Timestamps are Datetime type
- OHLCV values are Float64
- Symbol is uppercase String
- Data sorted by timestamp ascending
- No duplicate timestamps
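The contracts listed above can be checked mechanically. The sketch below does so on plain row dictionaries rather than a Polars DataFrame, so it only approximates the type rules (Python `datetime` and `float` stand in for Datetime and Float64). The function `contract_violations` is a hypothetical helper, not part of the library's validation mixin.

```python
from datetime import datetime

CANONICAL_COLUMNS = ["timestamp", "symbol", "open", "high", "low", "close", "volume"]


def contract_violations(rows: list[dict]) -> list[str]:
    """Return the documented contracts that a list of row dicts violates."""
    if any(list(row) != CANONICAL_COLUMNS for row in rows):
        # Without the canonical columns the remaining checks are meaningless.
        return ["columns missing or out of order"]
    timestamps = [row["timestamp"] for row in rows]
    if any(not isinstance(ts, datetime) for ts in timestamps):
        return ["timestamp is not a datetime"]
    problems = []
    if timestamps != sorted(timestamps):
        problems.append("rows not sorted by timestamp ascending")
    if len(set(timestamps)) != len(timestamps):
        problems.append("duplicate timestamps")
    if any(not isinstance(r["symbol"], str) or r["symbol"] != r["symbol"].upper() for r in rows):
        problems.append("symbol is not an uppercase string")
    if any(not isinstance(r[col], float) for r in rows for col in CANONICAL_COLUMNS[2:]):
        problems.append("OHLCV values are not floats")
    return problems


good = [
    {"timestamp": datetime(2024, 1, 2), "symbol": "AAPL", "open": 1.0,
     "high": 2.0, "low": 0.5, "close": 1.5, "volume": 100.0},
    {"timestamp": datetime(2024, 1, 3), "symbol": "AAPL", "open": 1.5,
     "high": 2.5, "low": 1.0, "close": 2.0, "volume": 120.0},
]
# Reverse the order and lowercase one symbol to trigger two violations.
bad = [dict(good[1], symbol="aapl"), good[0]]
print(contract_violations(good))
print(contract_violations(bad))
```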
Initialize base provider with common infrastructure.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `rate_limit` | `tuple[int, float] \| None` | Tuple of (calls, period_seconds) for rate limiting | `None` |
| `session_config` | `dict[str, Any] \| None` | HTTP session configuration | `None` |
| `circuit_breaker_config` | `dict[str, Any] \| None` | Circuit breaker configuration | `None` |
fetch_ohlcv
¶
Template method for fetching OHLCV data.
This method implements the common workflow:
1. Validate inputs
2. Apply rate limiting
3. Fetch and transform data (provider-specific)
4. Validate and normalize data
Providers can implement either:
- _fetch_and_transform_data() for single-step implementation
- _fetch_raw_data() + _transform_data() for two-step implementation
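The four-step workflow is an instance of the template method pattern: the base class fixes the order of the steps and subclasses fill in only the provider-specific hook. The sketch below is a stand-in written for illustration, not the real `BaseProvider`; `SketchProvider`, `FakeProvider`, and `min_interval` are hypothetical names, and rows are plain dictionaries instead of a DataFrame.

```python
import time
from abc import ABC, abstractmethod


class SketchProvider(ABC):
    """Stand-in showing the four-step workflow; not the real BaseProvider."""

    min_interval = 0.0  # seconds between calls; hypothetical rate-limit knob

    def __init__(self) -> None:
        self._last_call = 0.0

    def fetch_ohlcv(self, symbol: str, start: str, end: str, frequency: str = "daily") -> list[dict]:
        # 1. Validate inputs (ISO dates compare correctly as strings).
        if not symbol or start > end:
            raise ValueError("invalid symbol or date range")
        # 2. Apply rate limiting.
        wait = self.min_interval - (time.monotonic() - self._last_call)
        if wait > 0:
            time.sleep(wait)
        self._last_call = time.monotonic()
        # 3. Fetch and transform (provider-specific hook).
        rows = self._fetch_and_transform_data(symbol.upper(), start, end, frequency)
        # 4. Validate and normalize: drop duplicates, sort by timestamp.
        unique = {row["timestamp"]: row for row in rows}
        return [unique[ts] for ts in sorted(unique)]

    @abstractmethod
    def _fetch_and_transform_data(self, symbol, start, end, frequency) -> list[dict]:
        ...


class FakeProvider(SketchProvider):
    def _fetch_and_transform_data(self, symbol, start, end, frequency):
        # Deliberately unsorted with a duplicate to exercise step 4.
        return [
            {"timestamp": "2024-01-03", "symbol": symbol, "close": 2.0},
            {"timestamp": "2024-01-02", "symbol": symbol, "close": 1.0},
            {"timestamp": "2024-01-03", "symbol": symbol, "close": 2.0},
        ]


rows = FakeProvider().fetch_ohlcv("aapl", "2024-01-01", "2024-01-31")
print([row["timestamp"] for row in rows])
```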
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `symbol` | `str` | The symbol to fetch data for | required |
| `start` | `str` | Start date in YYYY-MM-DD format (inclusive) | required |
| `end` | `str` | End date in YYYY-MM-DD format (see note below) | required |
| `frequency` | `str` | Data frequency (daily, minute, etc.) | `'daily'` |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | DataFrame with OHLCV data in canonical schema: [timestamp, symbol, open, high, low, close, volume] |
Note
Date range semantics vary by provider:
- Most providers: Both start and end are INCLUSIVE
- Yahoo Finance: end is EXCLUSIVE (internally adds 1 day)
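One reading of the note is that an API with an exclusive end date needs the end shifted forward by one day to cover an inclusive request. The sketch below shows that date arithmetic in isolation; `to_exclusive_end` is a hypothetical helper and the exact adjustment each provider applies is not documented here.

```python
from datetime import date, timedelta


def to_exclusive_end(end: str) -> str:
    """Convert an inclusive YYYY-MM-DD end date to an exclusive one."""
    return (date.fromisoformat(end) + timedelta(days=1)).isoformat()


print(to_exclusive_end("2024-12-31"))
```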
fetch_ohlcv_async
async
¶
Async wrapper around fetch_ohlcv using a thread pool.
Providers with native async support should override this method.
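Wrapping a blocking fetch in a thread pool can be sketched with the standard library alone. The example below uses `asyncio.to_thread` (Python 3.9+) around a stand-in synchronous function; the function bodies are placeholders and do not reflect how the library implements `fetch_ohlcv_async`.

```python
import asyncio
import time


def fetch_ohlcv(symbol: str) -> dict:
    """Blocking stand-in for a provider's synchronous fetch."""
    time.sleep(0.05)  # simulate network latency
    return {"symbol": symbol, "rows": 1}


async def fetch_ohlcv_async(symbol: str) -> dict:
    # Run the blocking call in the default thread pool so the event loop
    # stays free to schedule other coroutines.
    return await asyncio.to_thread(fetch_ohlcv, symbol)


async def main() -> list:
    # gather() returns results in the order the awaitables were passed.
    return await asyncio.gather(*(fetch_ohlcv_async(s) for s in ["AAPL", "MSFT", "SPY"]))


results = asyncio.run(main())
print([r["symbol"] for r in results])
```

A provider whose client library is natively asynchronous would skip the thread pool and await the client directly, which is why the method is meant to be overridden in that case.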
capabilities
¶
Return provider capabilities (default implementation).
Override in subclasses to provide accurate capabilities.
ProviderCapabilities¶
Frozen dataclass describing what a provider supports (intraday, crypto, forex, futures, authentication requirements, rate limits).
from ml4t.data.providers.protocols import ProviderCapabilities
caps = ProviderCapabilities(
supports_intraday=True,
supports_crypto=True,
requires_api_key=True,
rate_limit=(120, 60.0), # 120 calls per 60 seconds
)
ProviderCapabilities
dataclass
¶
ProviderCapabilities(
supports_intraday=False,
supports_crypto=False,
supports_forex=False,
supports_futures=False,
requires_api_key=False,
max_history_days=None,
rate_limit=(60, 60.0),
)
Describes what a provider can do.
Attributes:
| Name | Type | Description |
|---|---|---|
| `supports_intraday` | `bool` | Can fetch minute/hourly data |
| `supports_crypto` | `bool` | Handles cryptocurrency symbols |
| `supports_forex` | `bool` | Handles forex pairs |
| `supports_futures` | `bool` | Handles futures contracts |
| `requires_api_key` | `bool` | Needs authentication |
| `max_history_days` | `int \| None` | Maximum historical data available |
| `rate_limit` | `tuple[int, float]` | (calls, period_seconds) tuple |
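A `(calls, period_seconds)` tuple translates directly into a minimum spacing between requests, and a frozen dataclass rejects mutation after construction. The sketch below demonstrates both with a reduced stand-in class; `Capabilities` and its `min_interval` property are hypothetical and only mirror two of the documented fields.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class Capabilities:
    """Stand-in mirroring two documented ProviderCapabilities fields."""

    requires_api_key: bool = False
    rate_limit: tuple = (60, 60.0)  # (calls, period_seconds)

    @property
    def min_interval(self) -> float:
        """Hypothetical helper: seconds between calls to stay within the limit."""
        calls, period_seconds = self.rate_limit
        return period_seconds / calls


caps = Capabilities(requires_api_key=True, rate_limit=(120, 60.0))
print(caps.min_interval)

try:
    caps.rate_limit = (1, 1.0)  # frozen dataclasses reject mutation
except FrozenInstanceError:
    print("immutable")
```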
OHLCVProvider (Protocol)¶
Structural typing protocol for OHLCV providers. Any class implementing
name, fetch_ohlcv(), and capabilities() satisfies this protocol
without inheriting from BaseProvider.
OHLCVProvider
¶
Bases: Protocol
Protocol for OHLCV data providers.
Any class implementing these methods is considered an OHLCVProvider, regardless of inheritance. This enables duck typing with type safety.
Example
class MyCustomProvider:
    @property
    def name(self) -> str:
        return "custom"

    def fetch_ohlcv(self, symbol, start, end, frequency="daily"):
        # Custom implementation
        pass

    def capabilities(self) -> ProviderCapabilities:
        return ProviderCapabilities()

isinstance(MyCustomProvider(), OHLCVProvider)  # True
fetch_ohlcv
¶
Fetch OHLCV data for a symbol.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `symbol` | `str` | Symbol to fetch (e.g., 'AAPL', 'BTCUSDT') | required |
| `start` | `str` | Start date in YYYY-MM-DD format | required |
| `end` | `str` | End date in YYYY-MM-DD format | required |
| `frequency` | `str` | Data frequency ('daily', 'hourly', 'minute', etc.) | `'daily'` |
Returns:
| Type | Description |
|---|---|
| `DataFrame` | DataFrame with columns: [timestamp, symbol, open, high, low, close, volume] |
Configuration¶
Config¶
Pydantic model for top-level library configuration. Reads defaults from
environment variables (QLDM_DATA_ROOT, QLDM_LOG_LEVEL).
from ml4t.data import Config
# Use defaults
config = Config()
# Override data root
config = Config(data_root="/mnt/fast/market_data", log_level="DEBUG")
Config
¶
Bases: BaseModel
Main configuration for QLDM.
Initialize config with environment variables.
validation
class-attribute
instance-attribute
¶
RetryConfig¶
Configuration for automatic retry with exponential backoff.
RetryConfig
¶
Bases: BaseModel
Retry configuration.
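Exponential backoff multiplies the wait between retries by a fixed factor, usually up to a cap. Since the fields of `RetryConfig` are not listed in this reference, the sketch below uses hypothetical parameter names (`max_attempts`, `base_delay`, `factor`, `max_delay`) purely to illustrate the schedule.

```python
def backoff_delays(
    max_attempts: int,
    base_delay: float = 1.0,
    factor: float = 2.0,
    max_delay: float = 30.0,
) -> list[float]:
    """Delays to wait before each retry (max_attempts - 1 retries in total)."""
    return [min(base_delay * factor**attempt, max_delay) for attempt in range(max_attempts - 1)]


print(backoff_delays(max_attempts=6))
```

Production retry loops often add random jitter to each delay so that many clients do not retry in lockstep; that refinement is omitted here.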
CacheConfig¶
Configuration for in-memory caching.
CacheConfig
¶
Bases: BaseModel
Cache configuration.
Exceptions¶
All exceptions inherit from ML4TDataError, which carries an optional
details dictionary for structured error context.
ML4TDataError
├── ProviderError
│ ├── NetworkError
│ │ └── RateLimitError
│ ├── AuthenticationError
│ ├── DataValidationError
│ ├── SymbolNotFoundError
│ └── DataNotAvailableError
├── StorageError
│ └── LockError
├── ConfigurationError
└── CircuitBreakerOpenError
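Because the exceptions form a tree, `except` clauses should run from the most specific class to the most general one. The sketch below rebuilds a slice of the hierarchy with stand-in classes so it runs without the library installed. The constructors are simplified (the real provider errors take a `provider` argument first), and `classify` is a hypothetical helper.

```python
from typing import Optional


class ML4TDataError(Exception):
    """Stand-in base class carrying an optional details dictionary."""

    def __init__(self, message: str, details: Optional[dict] = None) -> None:
        super().__init__(message)
        self.details = details or {}


class ProviderError(ML4TDataError):
    """Stand-in for provider-related errors."""


class NetworkError(ProviderError):
    """Stand-in for network-related errors."""


class RateLimitError(NetworkError):
    """Stand-in for rate-limit errors."""


class StorageError(ML4TDataError):
    """Stand-in for storage-related errors."""


def classify(error: Exception) -> str:
    """Pick a reaction, testing the most specific classes first."""
    try:
        raise error
    except RateLimitError:
        return "wait and retry"
    except NetworkError:
        return "retry"
    except ProviderError:
        return "try another provider"
    except ML4TDataError:
        return "report"


print(classify(RateLimitError("slow down", {"retry_after": 2.0})))
print(classify(StorageError("disk full")))
```

Reversing the order of the clauses would send every error to the first, most general handler, since a `RateLimitError` is also a `NetworkError`, a `ProviderError`, and an `ML4TDataError`.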
ML4TDataError¶
ML4TDataError
¶
Bases: Exception
Base exception for all ml4t-data errors.
Initialize ml4t-data error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `message` | `str` | Error message | required |
| `details` | `dict[str, Any] \| None` | Optional dictionary with error details | `None` |
ProviderError¶
ProviderError
¶
Bases: ML4TDataError
Base exception for provider-related errors.
Initialize provider error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `provider` | `str` | Provider name | required |
| `message` | `str` | Error message | required |
| `details` | `dict[str, Any] \| None` | Optional error details | `None` |
NetworkError¶
NetworkError
¶
Bases: ProviderError
Network-related errors (connection, timeout, etc.).
Initialize network error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `provider` | `str` | Provider name | required |
| `message` | `str` | Error message | `'Network error occurred'` |
| `details` | `dict[str, Any] \| None` | Optional error details | `None` |
| `retry_after` | `float \| None` | Seconds to wait before retry | `None` |
RateLimitError¶
RateLimitError
¶
Bases: NetworkError
Rate limit exceeded error.
Initialize rate limit error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `provider` | `str` | Provider name | required |
| `retry_after` | `float \| None` | Seconds to wait before retry | `None` |
| `remaining` | `int \| None` | Remaining API calls | `None` |
| `limit` | `int \| None` | API call limit | `None` |
AuthenticationError¶
AuthenticationError
¶
DataValidationError¶
DataValidationError
¶
Bases: ProviderError
Data validation errors.
Initialize data validation error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `provider` | `str` | Provider name | required |
| `message` | `str` | Error message | required |
| `field` | `str \| None` | Field that failed validation | `None` |
| `value` | `Any \| None` | Invalid value | `None` |
| `details` | `dict[str, Any] \| None` | Optional error details | `None` |
SymbolNotFoundError¶
SymbolNotFoundError
¶
Bases: ProviderError
Symbol not found or invalid.
Initialize symbol not found error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `provider` | `str` | Provider name | required |
| `symbol` | `str` | The symbol that was not found | required |
| `details` | `dict[str, Any] \| None` | Optional error details | `None` |
DataNotAvailableError¶
DataNotAvailableError
¶
Bases: ProviderError
Data not available for the requested period.
Initialize data not available error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `provider` | `str` | Provider name | required |
| `symbol` | `str` | Symbol requested | required |
| `start` | `str \| None` | Start date | `None` |
| `end` | `str \| None` | End date | `None` |
| `frequency` | `str \| None` | Data frequency | `None` |
| `details` | `dict[str, Any] \| None` | Optional error details | `None` |
StorageError¶
StorageError
¶
Bases: ML4TDataError
Storage-related errors.
Initialize storage error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `message` | `str` | Error message | required |
| `key` | `str \| None` | Storage key involved | `None` |
| `details` | `dict[str, Any] \| None` | Optional error details | `None` |
LockError¶
LockError
¶
Bases: StorageError
File locking errors.
Initialize lock error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `key` | `str` | Storage key | required |
| `timeout` | `float` | Lock timeout that was exceeded | required |
| `details` | `dict[str, Any] \| None` | Optional error details | `None` |
ConfigurationError¶
ConfigurationError
¶
Bases: ML4TDataError
Configuration-related errors.
Initialize configuration error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `message` | `str` | Error message | required |
| `parameter` | `str \| None` | Configuration parameter involved | `None` |
| `details` | `dict[str, Any] \| None` | Optional error details | `None` |
CircuitBreakerOpenError¶
CircuitBreakerOpenError
¶
Bases: ML4TDataError
Circuit breaker is open and preventing calls.
Initialize circuit breaker open error.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `message` | `str` | Error message | `'Circuit breaker is open'` |
| `failure_count` | `int \| None` | Number of failures that caused circuit to open | `None` |
| `details` | `dict[str, Any] \| None` | Optional error details | `None` |
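For context on when `CircuitBreakerOpenError` would be raised, the sketch below implements a minimal circuit breaker: after a threshold of consecutive failures it rejects calls until a reset timeout has elapsed. It is an illustration of the general pattern, not the library's `CircuitBreakerMixin`; the class and its parameter names (`failure_threshold`, `reset_timeout`) are assumptions.

```python
import time
from typing import Optional


class CircuitBreakerOpenError(Exception):
    """Stand-in for the library exception of the same name."""


class CircuitBreaker:
    def __init__(self, failure_threshold: int = 3, reset_timeout: float = 30.0) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failure_count = 0
        self.opened_at: Optional[float] = None

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise CircuitBreakerOpenError("Circuit breaker is open")
            # Timeout elapsed: allow one trial call (half-open state).
            self.opened_at = None
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failure_count += 1
            if self.failure_count >= self.failure_threshold:
                self.opened_at = time.monotonic()
            raise
        self.failure_count = 0  # any success closes the circuit again
        return result


def flaky():
    raise ConnectionError("provider unreachable")


breaker = CircuitBreaker(failure_threshold=2, reset_timeout=60.0)
outcomes = []
for _ in range(3):
    try:
        breaker.call(flaky)
    except CircuitBreakerOpenError:
        outcomes.append("blocked")
    except ConnectionError:
        outcomes.append("failed")
print(outcomes)
```

The third call never reaches `flaky`: the breaker opened after the second failure, so the caller gets an immediate error instead of waiting on a provider that is known to be down.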