# Implementation Plan: TOON Python Library - Core Encoding Implementation

**Branch**: `001-toon-encoding` | **Date**: 2025-01-27 | **Spec**: [spec.md](spec.md)  
**Input**: Feature specification from `/specs/001-toon-encoding/spec.md`

**Note**: This template is filled in by the `/speckit.plan` command. See `.specify/templates/commands/plan.md` for the execution workflow.

## Summary

Implement a pure Python library that converts Python data structures to TOON (Token-Oriented Object Notation) format, achieving 30-60% token reduction compared to JSON. The implementation follows a multi-pass optimization approach with strict type safety, comprehensive testing, and zero runtime dependencies. The library prioritizes maximum token reduction over raw encoding speed for datasets under 10MB.
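
To illustrate where the token savings come from, here is a minimal sketch of a tabular encoding for a uniform list of flat records, compared against `json.dumps`. The helper name `encode_table` is hypothetical, the header layout is an assumption that should be checked against SPEC.md, and character count is used only as a rough stand-in for token count. The sketch handles integers and plain strings only; quoting and other primitives are out of scope here.

```python
import json


def encode_table(key: str, rows: list[dict[str, object]]) -> str:
    """Encode a non-empty, uniform list of flat dicts as a tabular block.

    Hypothetical helper: field names are written once in the header
    instead of being repeated for every row, as JSON does.
    """
    fields = list(rows[0])
    field_list = ",".join(fields)
    header = f"{key}[{len(rows)}]{{{field_list}}}:"
    lines = [header]
    for row in rows:
        # Two-space indent, one comma-separated line per record.
        lines.append("  " + ",".join(str(row[name]) for name in fields))
    return "\n".join(lines)


users = [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
toon_text = encode_table("users", users)
json_text = json.dumps({"users": users})

print(toon_text)
print(len(toon_text), "characters vs", len(json_text), "for JSON")
```

The saving grows with the number of rows, because the per-row cost no longer includes the key names, quotes, and braces.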

## Technical Context

**Language/Version**: Python 3.10+ (constitution requirement)  
**Primary Dependencies**: None (pure Python implementation)  
**Storage**: N/A (in-memory processing with streaming output)  
**Testing**: pytest with hypothesis for property-based testing  
**Target Platform**: Cross-platform (Linux, macOS, Windows)  
**Project Type**: Single Python library package  
**Performance Goals**: 30-60% token reduction vs JSON; optimization targets datasets under 10MB  
**Constraints**: Strict error handling, zero runtime dependencies, mypy compliance  
**Scale/Scope**: Small datasets (<10MB) with maximum token optimization

## Constitution Check

*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*

### Constitution Compliance Analysis

**✅ I. Token Efficiency First**: Plan prioritizes 30-60% token reduction with multi-pass optimization  
**✅ II. Pythonic Design**: Python 3.10+, strict type annotations, pure Python implementation  
**✅ III. Test-Driven Development**: pytest + hypothesis with comprehensive test categories  
**✅ IV. Feature Parity**: Research based on complete SPEC.md requirements for TypeScript parity  
**✅ V. Simplicity and Maintainability**: Module structure follows constitution-mandated separation

### Development Standards Compliance

**✅ Code Quality**: ruff + black + mypy integration planned  
**✅ Package Structure**: Following exact module structure from constitution  
**✅ Performance Requirements**: Token efficiency prioritized over raw speed  
**✅ Testing Requirements**: All 8 mandated test categories addressed

### Governance Compliance

**✅ Amendment Process**: Following documented constitution principles  
**✅ Compliance Requirements**: All gates passed without violations

**Result**: ✅ ALL CONSTITUTION GATES PASSED - No violations identified

### Post-Design Constitution Re-check

**✅ I. Token Efficiency First**: Multi-pass optimization and tabular array encoding designed for maximum token reduction  
**✅ II. Pythonic Design**: Complete type system, proper module structure, Python 3.10+ compatibility  
**✅ III. Test-Driven Development**: Comprehensive test strategy with pytest + hypothesis  
**✅ IV. Feature Parity**: Data model covers all TypeScript requirements from SPEC.md  
**✅ V. Simplicity and Maintainability**: Clear separation of concerns across 7 modules

**Final Status**: ✅ CONSTITUTION FULLY COMPLIANT - Ready for implementation

## Project Structure

### Documentation (this feature)

```text
specs/001-toon-encoding/
├── plan.md              # This file (/speckit.plan command output)
├── research.md          # Phase 0 output (/speckit.plan command)
├── data-model.md        # Phase 1 output (/speckit.plan command)
├── quickstart.md        # Phase 1 output (/speckit.plan command)
├── contracts/           # Phase 1 output (/speckit.plan command)
└── tasks.md             # Phase 2 output (/speckit.tasks command - NOT created by /speckit.plan)
```

### Source Code (repository root)

```text
src/toon_python/
├── __init__.py          # Public API - encode() function and EncodeOptions class
├── encoder.py           # Main orchestration logic - ToonEncoder class
├── normalizer.py        # Type normalization - ValueNormalizer class
├── primitives.py        # Primitive encoding/quoting - PrimitiveEncoder class
├── formatter.py         # Output formatting and indentation
├── constants.py         # Enums and configuration constants - Delimiter enum
├── types.py             # Type definitions - JsonValue, JsonPrimitive types
└── writer.py            # Line writing utilities - LineWriter class

tests/
├── __init__.py
├── conftest.py          # pytest configuration and fixtures
├── test_encoder.py      # Main encoder functionality tests
├── test_normalizer.py   # Type normalization tests
├── test_primitives.py   # Primitive encoding and quoting tests
├── test_formatter.py    # Output formatting tests
└── test_integration.py  # End-to-end encoding workflow tests
```

**Structure Decision**: Single Python library package following constitution-mandated module separation. The `src/toon_python/` directory contains the core library with clear separation of concerns, while `tests/` contains comprehensive test coverage for all components.
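
To make the roles of `constants.py` and `writer.py` more concrete, the following sketch shows one possible shape for the `Delimiter` enum and the `LineWriter` class. Only the class names come from the structure above; the enum members, the constructor parameters, and the `write` method are assumptions for illustration.

```python
import io
from enum import Enum
from typing import TextIO


class Delimiter(str, Enum):
    """Assumed set of field delimiters; the actual members are not specified here."""

    COMMA = ","
    TAB = "\t"
    PIPE = "|"


class LineWriter:
    """Write indented lines to a text stream without buffering the whole output."""

    def __init__(self, stream: TextIO, indent_size: int = 2) -> None:
        self._stream = stream
        self._indent_size = indent_size
        self._first = True

    def write(self, depth: int, text: str) -> None:
        # Emit the newline before each line except the first, so the
        # output never ends with a trailing newline.
        if not self._first:
            self._stream.write("\n")
        self._stream.write(" " * (self._indent_size * depth) + text)
        self._first = False


buffer = io.StringIO()
writer = LineWriter(buffer)
writer.write(0, "user:")
writer.write(1, "id: 1")
writer.write(1, "name: Alice")
output = buffer.getvalue()
row = Delimiter.TAB.value.join(["1", "Alice"])
print(output)
```

Because `LineWriter` accepts any `TextIO`, the same code path can target an in-memory buffer in tests and a file or socket in production.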

## Complexity Tracking

> **No constitution violations - all complexity justified by requirements**

| Design Decision | Why Needed | Simpler Alternative Rejected Because |
|----------------|------------|-------------------------------------|
| Multi-pass optimization | Maximum token reduction requires separate analysis phases | Single-pass would sacrifice token efficiency |
| Strict error handling | Clarified requirement for fail-fast behavior | Graceful degradation would hide data quality issues |
| Streaming output | Linear memory scaling for nested structures | Building the full output in memory before returning it would violate memory requirements |
| Comprehensive test categories | Constitution mandates 8 specific test categories | Reduced testing would violate constitution compliance |
