Metadata-Version: 2.4
Name: codevec
Version: 1.0.0
Summary: Semantic search for functions in Python codebases
Author-email: Lucas Monroe <lucas.i.monroe1@gmail.com>
License-Expression: Apache-2.0
Project-URL: Homepage, https://github.com/mlucas55/codevec
Project-URL: Bug Tracker, https://github.com/mlucas55/codevec/issues
Project-URL: Source, https://github.com/mlucas55/codevec
Keywords: semantic search,code search,embeddings,vector search,python
Classifier: Development Status :: 4 - Beta
Classifier: Intended Audience :: Developers
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Software Development :: Libraries :: Python Modules
Classifier: Topic :: Text Processing :: Indexing
Requires-Python: >=3.10
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: chromadb<2.0.0,>=1.3.7
Requires-Dist: sentence-transformers==5.2.0
Requires-Dist: fastapi>=0.100.0
Requires-Dist: uvicorn>=0.23.0
Requires-Dist: requests>=2.28.0
Dynamic: license-file


<div align="center">

# Codevec

#### Codevec is a user-friendly semantic search tool for Python codebases.


![License: Apache 2.0](https://img.shields.io/badge/License-Apache%202.0-blue.svg)

</div>


```bash
pip install codevec
```

## Overview

Codevec is a semantic search tool for Python codebases that lets you find functions using plain English queries. You don't need to know exact function names or keywords to grep for.

It runs entirely on lightweight local models, so indexing and searching a codebase take only seconds. Because everything runs locally, your code never leaves your machine: no API calls, no usage limits.

Unlike general-purpose AI assistants, Codevec is purpose-built for code search. It focuses on quickly pinpointing relevant function definitions without verbose explanations, making it especially effective for navigating large or unfamiliar repositories.

> **Note:** Codevec currently indexes Python functions only. Module-level code is not indexed.


## Quick Start


### 1. Index your codebase



```bash
vec-index ./your/project/filepath
```
> **Note:** Re-index after making significant changes to your codebase!

### 2. Search with natural language

#### Search from a terminal within the indexed codebase:
```bash
vec-search email validation
```

#### Search from a different directory:
```bash
vec-search "authentication logic" --repo ./your/project/filepath
```

### 3. View the results
```
(.venv) user@Computer demo-repo % vec-search email validation
Initializing search system...
Found 5 results

================================================================================

┌─ Result #1 ───────────────────────────────────────────────────────────
│ Similarity: 49.3%  │  Rerank: -2.527
├───────────────────────────────────────────────────────────────────────────────
│ 📁 File: /Users/user/development/project/utils/validation.py
│ ⚙️ Function: validate_email (line 5)
├───────────────────────────────────────────────────────────────────────────────
│ Code:
│    5 │ def validate_email(email):
│    6 │     """Check if email address is in valid format"""
│    7 │     pattern = r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
│    8 │     return re.match(pattern, email) is not None
└───────────────────────────────────────────────────────────────────────────────
```

> **Note:** The filepath is clickable in VS Code terminals!

## Advanced Usage: Model Server

Run the model server to keep models loaded in memory for faster searches:

```bash
vec-server  # Starts server on localhost:8000
            # Codevec will automatically use the server when available
```
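The "use the server when available" behaviour can be pictured as a probe followed by a fallback. The sketch below is an illustration only: this README does not say how Codevec detects the server, so the TCP probe and the names `server_is_up` and `choose_backend` are assumptions. Only the `localhost:8000` address comes from the command above.

```python
import socket


def server_is_up(host="127.0.0.1", port=8000, timeout=0.25):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False


def choose_backend(probe=server_is_up):
    """Pick 'server' when the probe succeeds, otherwise 'in-process'."""
    return "server" if probe() else "in-process"
```

Passing the probe as a parameter keeps the fallback logic easy to exercise without a running server, for example `choose_backend(lambda: False)`.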

## How It Works

**Indexing & Embedding**: Codevec walks your codebase and uses AST parsing to discover Python functions, then generates an embedding for each one with a lightweight local transformer model.
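To make the discovery step concrete, here is a minimal sketch of finding functions with Python's standard `ast` module. It is not Codevec's actual implementation. The helper name `find_functions` and the sample source are made up for illustration.

```python
import ast


def find_functions(source: str, filename: str = "<string>"):
    """Return (name, first_line, last_line) for every function in source."""
    tree = ast.parse(source, filename=filename)
    found = []
    # ast.walk visits nested nodes too, so methods inside classes are included.
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            found.append((node.name, node.lineno, node.end_lineno))
    # Sort by starting line so results follow the order of the file.
    return sorted(found, key=lambda item: item[1])


sample = '''import re

def validate_email(email):
    """Check if email address is in valid format"""
    return re.match(r"[^@]+@[^@]+", email) is not None

class Mailer:
    async def send(self, message):
        return message
'''

functions = find_functions(sample)
print(functions)
```

The line ranges are what an indexer would need to slice each function's source text out of the file before embedding it.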

**ChromaDB Storage**: Embeddings are stored in a ChromaDB collection located at `.codevec/` in your project root.

**Searching**: Queries are embedded and matched against ChromaDB using semantic similarity, then the results are reranked with a cross-encoder for improved relevance.
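The two-stage shape of this search (retrieve by vector similarity, then rerank the shortlist) can be sketched in plain Python. Everything here is a stand-in: the vectors are hand-written toy values rather than real embeddings, `overlap_score` is a hypothetical token-overlap scorer used in place of a cross-encoder, and the in-memory list replaces ChromaDB.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def overlap_score(query, text):
    """Toy pairwise scorer: number of shared tokens (cross-encoder stand-in)."""
    q = set(query.lower().split())
    t = set(text.lower().replace("_", " ").split())
    return len(q & t)


def search(query_vec, query_text, index, score_pair, top_k=2):
    # Stage 1: shortlist the top_k entries by cosine similarity.
    shortlist = sorted(
        index, key=lambda e: cosine(query_vec, e["vector"]), reverse=True
    )[:top_k]
    # Stage 2: reorder the shortlist with the pairwise scorer.
    return sorted(
        shortlist, key=lambda e: score_pair(query_text, e["text"]), reverse=True
    )


index = [
    {"name": "send_email", "text": "send_email deliver message", "vector": [0.9, 0.1, 0.0]},
    {"name": "validate_email", "text": "validate_email check email format", "vector": [0.8, 0.3, 0.1]},
    {"name": "parse_date", "text": "parse_date read a date string", "vector": [0.0, 0.2, 0.9]},
]

results = search([1.0, 0.2, 0.0], "email validation format", index, overlap_score)
names = [entry["name"] for entry in results]
print(names)
```

In this toy data `send_email` is the closest vector, but the rerank step moves `validate_email` to the top because it shares more tokens with the query. That is the kind of correction a reranker is meant to provide.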

**Re-indexing**: Run `vec-index` again on the same directory to update the index with new or modified functions.
