Metadata-Version: 2.4
Name: dfm-python
Version: 0.1.0
Summary: Dynamic Factor Model (DFM) estimation and nowcasting in Python
Project-URL: Homepage, https://github.com/yourusername/dfm-python
Project-URL: Documentation, https://github.com/yourusername/dfm-python#readme
Project-URL: Repository, https://github.com/yourusername/dfm-python
Project-URL: Issues, https://github.com/yourusername/dfm-python/issues
Author: DFM Python Contributors
License: MIT
Keywords: dfm,dynamic-factor-model,econometrics,forecasting,nowcasting,time-series
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Science/Research
Classifier: License :: OSI Approved :: MIT License
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.12
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.12
Requires-Dist: numpy>=1.24.0
Requires-Dist: pandas>=2.0.0
Requires-Dist: scipy>=1.10.0
Provides-Extra: all
Requires-Dist: httpx>=0.24.0; extra == 'all'
Requires-Dist: hydra-core>=1.3.0; extra == 'all'
Requires-Dist: omegaconf>=2.3.0; extra == 'all'
Requires-Dist: psycopg2-binary>=2.9.0; extra == 'all'
Requires-Dist: pydantic-settings>=2.11.0; extra == 'all'
Requires-Dist: pytest>=7.0.0; extra == 'all'
Requires-Dist: supabase>=2.0.0; extra == 'all'
Provides-Extra: database
Requires-Dist: httpx>=0.24.0; extra == 'database'
Requires-Dist: psycopg2-binary>=2.9.0; extra == 'database'
Requires-Dist: pydantic-settings>=2.11.0; extra == 'database'
Requires-Dist: supabase>=2.0.0; extra == 'database'
Provides-Extra: dev
Requires-Dist: pytest>=7.0.0; extra == 'dev'
Provides-Extra: hydra
Requires-Dist: hydra-core>=1.3.0; extra == 'hydra'
Requires-Dist: omegaconf>=2.3.0; extra == 'hydra'
Description-Content-Type: text/markdown

# dfm-python

A generic Python implementation of Dynamic Factor Models (DFM) for nowcasting and forecasting.

## Features

- **Generic DFM Estimation**: EM algorithm for estimating dynamic factor models
- **Mixed-Frequency Data**: Handle data with different frequencies (daily, monthly, quarterly)
- **Missing Data Handling**: Robust handling of missing observations
- **News Decomposition**: Compare forecasts between vintages and attribute changes
- **Flexible Configuration**: Support for YAML and CSV configuration files
- **Optional Hydra Integration**: Use Hydra for experiment management (optional dependency)
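
To make the mixed-frequency and missing-data features concrete, the sketch below places a quarterly series on a monthly grid with pandas, leaving the months without a quarterly value as `NaN`. The series names and the quarter-end placement are assumptions made for illustration. This is a common layout for mixed-frequency panels, not a confirmed description of what `load_data` produces.

```python
import pandas as pd

# Monthly grid (month-start dates) with one fully observed monthly series.
months = pd.date_range("2000-01-01", periods=6, freq="MS")
monthly = pd.Series([100.0, 101.0, 102.0, 103.0, 104.0, 105.0], index=months)

# Quarterly values stored in the last month of each quarter (assumed convention).
quarterly = pd.Series([50.0, 51.0], index=[months[2], months[5]])

# Reindexing onto the monthly grid leaves NaN where no quarterly value exists.
panel = pd.DataFrame({
    "series_m": monthly,
    "series_q": quarterly.reindex(months),
})

observed = panel.notna()  # boolean mask of available observations
print(panel)
```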

## Installation

```bash
pip install dfm-python
```

### Optional Dependencies

For Hydra configuration support:
```bash
pip install "dfm-python[hydra]"
```

For database integration (application-specific):
```bash
pip install "dfm-python[database]"
```

Install all optional dependencies:
```bash
pip install "dfm-python[all]"
```

## Quick Start

### Basic Usage

```python
from dfm_python import load_config, load_data, dfm
import pandas as pd

# Load configuration from YAML or CSV
config = load_config('config.yaml')

# Load data from CSV file
X, Time, Z = load_data('data.csv', config, sample_start=pd.Timestamp('2000-01-01'))

# Estimate DFM
Res = dfm(X, config, threshold=1e-4)

# Access results
factors = Res.Z  # Extracted factors
loadings = Res.C  # Factor loadings
```

### Configuration

The DFM module uses a configuration object (`DFMConfig`) that defines:

- **Series**: Each time series with frequency, transformation, and block membership
- **Blocks**: Factor blocks with number of factors per block
- **Estimation Parameters**: EM algorithm settings (threshold, max_iter, etc.)

#### YAML Configuration Example

```yaml
# config.yaml
model:
  series:
    series_1:
      frequency: "m"
      transformation: "pch"
      blocks: [Global]
    series_2:
      frequency: "q"
      transformation: "pca"
      blocks: [Global, Investment]
  
  blocks:
    Global:
      factors: 1
    Investment:
      factors: 1

dfm:
  ar_lag: 1
  threshold: 1e-5
  max_iter: 5000
```
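
The block lists in the YAML file and the 0/1 flags used in the CSV format and in `SeriesConfig` describe the same membership. A minimal sketch of that mapping, using a plain dictionary that mirrors the YAML above. `membership_flags` is a hypothetical helper written for this example, not part of the package.

```python
# Hypothetical helper for illustration; not part of dfm_python.
def membership_flags(series_blocks, block_names):
    """Return one 0/1 flag per block, in block_names order, for each series."""
    return {
        series_id: [1 if name in blocks else 0 for name in block_names]
        for series_id, blocks in series_blocks.items()
    }

# Mirrors the `model.series` and `model.blocks` entries of config.yaml.
block_names = ["Global", "Investment"]
series_blocks = {
    "series_1": ["Global"],
    "series_2": ["Global", "Investment"],
}

flags = membership_flags(series_blocks, block_names)
print(flags)
```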

#### CSV Configuration Example

```csv
SeriesID,Frequency,Transformation,Block_Global,Block_Investment
series_1,m,pch,1,0
series_2,q,pca,1,1
```
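
To show how the columns of this table can be read, here is a small sketch that parses it with the standard-library `csv` module. It assumes that every block column starts with the `Block_` prefix, as in the example. `parse_series_table` is a hypothetical function and does not reproduce what `load_config` does internally.

```python
import csv
import io

CSV_TEXT = """SeriesID,Frequency,Transformation,Block_Global,Block_Investment
series_1,m,pch,1,0
series_2,q,pca,1,1
"""

def parse_series_table(text):
    """Parse the series table into plain dictionaries (illustrative only)."""
    reader = csv.DictReader(io.StringIO(text))
    # Block columns are recognised by their prefix (an assumption of this sketch).
    block_columns = [c for c in reader.fieldnames if c.startswith("Block_")]
    rows = [
        {
            "series_id": row["SeriesID"],
            "frequency": row["Frequency"],
            "transformation": row["Transformation"],
            "blocks": [int(row[c]) for c in block_columns],
        }
        for row in reader
    ]
    block_names = [c[len("Block_"):] for c in block_columns]
    return rows, block_names

rows, block_names = parse_series_table(CSV_TEXT)
print(block_names, rows)
```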

### Data Format

CSV data files should have:
- First column: `Date`, in YYYY-MM-DD format
- Remaining columns: one per series, with each column name matching a `SeriesID` from the configuration

Example:
```csv
Date,series_1,series_2
2000-01-01,100.0,50.0
2000-02-01,101.0,51.0
...
```
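
Before fitting a model, it can help to check that a data file follows this layout. The sketch below reads a small inline sample with pandas and compares the column names against a list of series IDs. `check_data_columns` is a hypothetical helper, and the check is independent of any validation the package may perform itself.

```python
import io
import pandas as pd

CSV_TEXT = """Date,series_1,series_2
2000-01-01,100.0,50.0
2000-02-01,101.0,51.0
"""

def check_data_columns(df, series_ids):
    """Return (missing, extra) column names relative to the configured series."""
    missing = [s for s in series_ids if s not in df.columns]
    extra = [c for c in df.columns if c not in series_ids]
    return missing, extra

# Parse the first column as dates and use it as the index.
df = pd.read_csv(io.StringIO(CSV_TEXT), parse_dates=["Date"], index_col="Date")

missing, extra = check_data_columns(df, ["series_1", "series_2"])
print("missing:", missing, "extra:", extra)
```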

## Core Components

### DFMConfig

Configuration dataclass that defines the model structure:

```python
from dfm_python import DFMConfig, SeriesConfig

config = DFMConfig(
    series=[
        SeriesConfig(
            series_id="gdp",
            frequency="q",
            transformation="pca",
            blocks=[1, 0]  # One 0/1 flag per block: loads on Global only
        )
    ],
    block_names=["Global", "Investment"],
    factors_per_block=[1, 1],
    ar_lag=1,
    threshold=1e-5,
    max_iter=5000
)
```

### DFM Estimation

```python
from dfm_python import dfm

# Estimate DFM model
result = dfm(X, config, threshold=1e-4, max_iter=1000)

# Access results
factors = result.Z      # Factor estimates (T x r)
loadings = result.C     # Factor loadings (N x r)
transition = result.A   # Transition matrix
covariance = result.Q   # Factor covariance
```
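
The shapes in these comments are consistent with the state-space form commonly used for dynamic factor models: observations equal the loadings `C` times the factors plus noise, and the factors follow an autoregression with transition `A` and innovation covariance `Q`. The sketch below simulates such a system with NumPy to show how the dimensions fit together. It assumes a single lag and made-up parameter values, and it is not the package's estimator.

```python
import numpy as np

rng = np.random.default_rng(0)
T, N, r = 120, 5, 2          # periods, series, factors (illustrative sizes)

A = np.diag([0.8, 0.5])      # transition matrix (r x r), stable by construction
Q = np.eye(r)                # factor innovation covariance (r x r)
C = rng.normal(size=(N, r))  # factor loadings (N x r)

Z = np.zeros((T, r))         # factor paths (T x r), started at zero
for t in range(1, T):
    shock = rng.multivariate_normal(np.zeros(r), Q)
    Z[t] = A @ Z[t - 1] + shock

noise = 0.1 * rng.normal(size=(T, N))
X = Z @ C.T + noise          # observed panel (T x N)

print(X.shape, Z.shape, C.shape)
```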

### News Decomposition

Compare forecasts between two vintages:

```python
from dfm_python import update_nowcast

# Compare old vs new vintage
news_df, forecast_df = update_nowcast(
    X_old, X_new, Time, config, result,
    series="gdp",
    vintage_old="2024-01-01",
    vintage_new="2024-02-01"
)
```
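
In the nowcasting literature, a news decomposition usually attributes a forecast revision to the surprises in newly released data: each surprise is the released value minus what the old vintage predicted, multiplied by a model-implied weight. The arithmetic can be sketched as follows. All numbers are invented for illustration, the weights would in practice come from the estimated model, and revisions to previously published data are ignored. This does not describe the internals of `update_nowcast`.

```python
import numpy as np

# Illustrative numbers only; none of these come from a real model or dataset.
released = np.array([0.6, -0.2])   # newly published values for two series
expected = np.array([0.4, 0.1])    # values the old vintage predicted for them
weights = np.array([0.5, 0.3])     # hypothetical effect of each surprise on the target

news = released - expected         # surprise per series
impact = weights * news            # contribution of each series to the revision

old_nowcast = 2.0
new_nowcast = old_nowcast + impact.sum()

print("impact per series:", impact)
print("revision:", new_nowcast - old_nowcast)
```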

## API Reference

### Main Functions

- `load_config(file)`: Load configuration from YAML or CSV
- `load_data(file, config)`: Load and transform data from CSV
- `dfm(X, config)`: Estimate DFM model using EM algorithm
- `update_nowcast(...)`: Perform news decomposition between vintages

### Configuration

- `DFMConfig`: Main configuration class
- `SeriesConfig`: Individual series configuration

### Results

- `DFMResult`: Result object with factors, loadings, and model parameters

## Architecture

The DFM module is designed to be **generic and application-agnostic**:

- **Core Module** (`dfm_python`): Generic DFM estimation logic
- **Adapters** (application-specific): Database integration, API clients, etc.

This separation allows the core module to be used in any context while keeping application-specific code separate.
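
One way to picture this split is an adapter that hides where the data comes from and hands the core a date-indexed panel. The sketch below is a rough illustration under that assumption. `DataSource`, `InMemorySource`, and `prepare_panel` are hypothetical names and are not part of the package.

```python
from typing import Protocol

import pandas as pd

class DataSource(Protocol):
    """Hypothetical adapter interface: anything that returns a date-indexed panel."""

    def fetch(self) -> pd.DataFrame: ...

class InMemorySource:
    """Stand-in for an application-specific adapter (database, API client, ...)."""

    def __init__(self, frame: pd.DataFrame) -> None:
        self._frame = frame

    def fetch(self) -> pd.DataFrame:
        return self._frame.copy()

def prepare_panel(source: DataSource, series_ids):
    """Generic step: select the configured series and sort by date."""
    frame = source.fetch()
    return frame[list(series_ids)].sort_index()

raw = pd.DataFrame(
    {"b": [2.0, 1.0], "a": [4.0, 3.0]},
    index=pd.to_datetime(["2000-02-01", "2000-01-01"]),
)
panel = prepare_panel(InMemorySource(raw), ["a", "b"])
print(panel)
```

Only `InMemorySource` would change when the data comes from somewhere else; `prepare_panel` depends on the interface alone.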

## Requirements

- Python >= 3.12
- numpy >= 1.24.0
- pandas >= 2.0.0
- scipy >= 1.10.0

## License

MIT License

## Contributing

Contributions are welcome! Please ensure that:
- Core module remains generic (no application-specific code)
- Optional dependencies stay optional (the core module must import and run without them)
- Tests pass for the core functionality
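
One common way to keep an optional dependency out of the core import path is to check for it at call time and fail with an actionable message. A minimal sketch using only the standard library follows. `has_optional` and `require_optional` are hypothetical helpers, not functions provided by the package.

```python
import importlib.util

def has_optional(module_name):
    """Return True if a top-level module can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None

def require_optional(module_name, extra):
    """Raise a helpful error when the module behind an extra is missing."""
    if not has_optional(module_name):
        raise ImportError(
            f"{module_name!r} is needed for this feature; "
            f"install it with: pip install 'dfm-python[{extra}]'"
        )

# Evaluated without importing hydra, so the core stays importable either way.
HYDRA_AVAILABLE = has_optional("hydra")
print("hydra available:", HYDRA_AVAILABLE)
```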

## Citation

If you use this package in your research, please cite:

```bibtex
@software{dfm-python,
  title = {dfm-python: Dynamic Factor Models for Nowcasting},
  author = {DFM Python Contributors},
  year = {2024},
  url = {https://github.com/yourusername/dfm-python}
}
```
