Add CLAUDE.md with comprehensive project documentation

Documents the CouncilOS architecture, tech stack, development roadmap,
CouncilState data model, UI structure, and coding conventions for AI
assistants working in this repository.

https://claude.ai/code/session_018ZdWbY5UpCiwhSA9SFkReL
Claude 2026-02-20 15:20:42 +00:00
parent 91ee526941
commit 3eda043e9b

CLAUDE.md Normal file

@ -0,0 +1,233 @@
# CLAUDE.md — CouncilOS (KI-Konzil)
This file provides context for AI assistants (Claude and others) working in this repository. Read it fully before making any changes.
---
## Project Overview
**CouncilOS** ("KI-Rat Baukasten") is a visual no-code platform for building multi-agent AI pipelines. Users compose a "council" of AI experts — each with a defined role, system prompt, LLM model, and optional tools — and connect them via a drag-and-drop canvas. The agents then collaborate in iterative loops until a document or task reaches the desired quality.
**Repository status:** The project is currently at the concept and architecture stage. The only file is `README.md`, which serves as the master PRD and development roadmap. No application code exists yet.
**Primary language of documentation:** German. Code, variable names, and commit messages should be in English.
---
## Planned Tech Stack
Follow these technology choices exactly — they are architectural requirements, not suggestions:
| Layer | Technology | Reason |
|---|---|---|
| AI Orchestration | **LangGraph** (Python) | Native support for cyclic graphs and `interrupt_before` (Human-in-the-Loop) |
| Backend API | **FastAPI** (Python) | WebSocket support for real-time agent status updates |
| Frontend | **React** or **Next.js** + **React Flow** | Industry standard for interactive drag-and-drop canvas UIs |
| Database | **PostgreSQL** | Stores user data and council blueprints as JSON |
| Vector DB | **ChromaDB** (local) or **Pinecone** | Powers the PDF-reader tool |
| LLMs | **Anthropic Claude 3.5 Sonnet** + **OpenAI GPT-4o** | Via API |
| Search Tool | **Tavily Search API** | For the web-search agent tool |
| PDF Tool | **PyPDF** + vector store | For the PDF-reader agent tool |
---
## Architecture Concepts
### Core Idea: Cyclic Multi-Agent Graphs
Unlike linear pipelines, CouncilOS agents run in **loops**. A critic agent can reject a draft and send it back to the master agent repeatedly until quality meets the threshold. This is the core differentiator.
### Agent Node Properties
Each agent node on the canvas has:
- **Name** — display label
- **System Prompt** — the role/persona definition
- **LLM Model** — which model to use (Claude, GPT-4o, or local)
- **Tools** — optional toggle switches: Web Search, PDF Reader
### Edge Types
- **Linear edges** — agent A always passes output to agent B
- **Conditional edges** — agent A routes dynamically (e.g. `"rework"` → back to master, `"approve"` → next stage)
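The two edge types could be modeled roughly as follows. This is a minimal sketch, not existing code: the `Edge` dataclass and the `next_node` helper are hypothetical names introduced only for illustration. In LangGraph itself, a mapping like `routes` would typically be passed to `add_conditional_edges`.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Edge:
    """Hypothetical edge record; not part of any existing code."""
    source: str
    target: Optional[str] = None                           # set for linear edges
    routes: Dict[str, str] = field(default_factory=dict)   # set for conditional edges

    @property
    def is_conditional(self) -> bool:
        return bool(self.routes)


def next_node(edge: Edge, route_decision: str) -> str:
    """Return the node that should run after edge.source."""
    if not edge.is_conditional:
        if edge.target is None:
            raise ValueError(f"linear edge from {edge.source!r} has no target")
        return edge.target
    try:
        return edge.routes[route_decision]
    except KeyError:
        raise ValueError(f"no route for decision {route_decision!r}") from None


# A linear edge ignores the routing signal; a conditional edge depends on it.
linear = Edge(source="master", target="critic")
conditional = Edge(source="critic", routes={"rework": "master", "approve": "writer"})
```

Raising on an unknown decision keeps a misspelled routing value from silently ending the run.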
### The Global State (`CouncilState`)
This is the central data structure passed between all agents in LangGraph. Always use and extend this TypedDict:
```python
from typing import TypedDict, Annotated, List
import operator


class CouncilState(TypedDict):
    input_topic: str  # The user's original prompt or uploaded PDF content
    current_draft: str  # The document currently being worked on
    feedback_history: List[str]  # All critic feedback accumulated across loop iterations
    route_decision: str  # Routing signal: "rework" | "approve" | custom values
    messages: Annotated[list, operator.add]  # LLM message history (system + responses)
```
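Agents should append to `feedback_history` rather than overwrite it, so that the master agent can draw on every earlier critique in the loop. Because the field is declared without a reducer, LangGraph replaces the old value with whatever a node returns; a node must therefore return the full extended list, or the field can be annotated with `operator.add` in the same way as `messages`.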
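A minimal sketch of a critic node that preserves earlier feedback, assuming the `CouncilState` fields shown above. The node body, the `CriticUpdate` type, and the length-based quality rule are hypothetical placeholders; a real node would call an LLM. The node returns the full extended list, which is what a field without a reducer needs.

```python
from typing import List, TypedDict


class CriticUpdate(TypedDict):
    """Hypothetical partial state update returned by the critic."""
    feedback_history: List[str]
    route_decision: str


def critic_agent_node(state: dict) -> CriticUpdate:
    """Stub critic: a real node would call an LLM instead of this fixed rule."""
    draft = state["current_draft"]
    approved = len(draft) >= 40  # placeholder quality check, purely illustrative
    note = "Looks complete." if approved else "Too short, add more detail."
    return {
        # Build a new list so earlier feedback is preserved, not replaced.
        "feedback_history": state["feedback_history"] + [note],
        "route_decision": "approve" if approved else "rework",
    }


state = {
    "input_topic": "Remote work policy",
    "current_draft": "Short draft.",
    "feedback_history": ["First round: unclear structure."],
    "route_decision": "",
    "messages": [],
}
update = critic_agent_node(state)
```

The input state is left untouched, which keeps the node function pure and easy to test.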
### Execution Modes
| Mode | Behavior |
|---|---|
| **Auto-Pilot** | Agents run fully autonomously until completion |
| **God Mode** | LangGraph pauses at each decision point via `interrupt_before`; user approves/rejects/modifies before continuing |
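One way to switch between the two modes is to vary only the options passed when the graph is compiled. The sketch below assumes hypothetical mode strings (`"auto_pilot"`, `"god_mode"`) and a hypothetical helper `compile_options`; the only LangGraph detail it relies on is that `compile` accepts an `interrupt_before` list of node names, and that pausing also requires a checkpointer.

```python
from typing import Dict, List


def compile_options(mode: str, decision_nodes: List[str]) -> Dict[str, List[str]]:
    """Map an execution mode to keyword arguments for graph compilation."""
    if mode == "auto_pilot":
        return {}  # no pauses: the graph runs to completion
    if mode == "god_mode":
        # Pause before every decision node so the user can intervene.
        return {"interrupt_before": list(decision_nodes)}
    raise ValueError(f"unknown execution mode: {mode!r}")


auto_options = compile_options("auto_pilot", ["critic"])
god_options = compile_options("god_mode", ["critic"])
# Intended use (not executed here): builder.compile(checkpointer=..., **god_options)
```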
---
## Development Roadmap
Build in this order — **backend first, frontend second**:
### Phase 1: LangGraph Engine (Backend MVP)
- Set up Python environment and FastAPI
- Hard-code a fixed test graph: `User Input → Master AI → Critic AI → (if score < 8: back to Master; if ≥ 8: Writer AI)`
- Implement `CouncilState` and the routing logic
- Verify the loop runs correctly via terminal or Postman
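Before wiring up LangGraph, the control flow of the fixed test graph can be checked with plain Python and stub agents. The sketch below is an assumption-laden stand-in, not the LangGraph implementation: `run_test_graph`, the stub functions, and the `MAX_ITERATIONS` safety cap are hypothetical. Only the score threshold of 8 comes from the roadmap.

```python
from typing import Callable, Dict, List

SCORE_THRESHOLD = 8  # from the Phase 1 description: below 8 goes back to Master
MAX_ITERATIONS = 5   # assumption: a safety cap so a stubborn critic cannot loop forever


def run_test_graph(
    topic: str,
    master: Callable[[str, List[str]], str],
    critic: Callable[[str], int],
    writer: Callable[[str], str],
) -> Dict[str, object]:
    """Run Master -> Critic in a loop, then Writer once the score is high enough."""
    feedback: List[str] = []
    draft = ""
    for iteration in range(1, MAX_ITERATIONS + 1):
        draft = master(topic, feedback)
        score = critic(draft)
        if score >= SCORE_THRESHOLD:
            return {"output": writer(draft), "iterations": iteration, "approved": True}
        feedback.append(f"Iteration {iteration}: score {score}, needs rework")
    # Cap reached: return the last draft unapproved instead of looping on.
    return {"output": draft, "iterations": MAX_ITERATIONS, "approved": False}


# Stub agents standing in for LLM calls.
def stub_master(topic: str, feedback: List[str]) -> str:
    return f"{topic} (revision {len(feedback) + 1})"


def stub_critic(draft: str) -> int:
    return 9 if "revision 3" in draft else 5


def stub_writer(draft: str) -> str:
    return f"FINAL: {draft}"


result = run_test_graph("AI policy", stub_master, stub_critic, stub_writer)
```

With these stubs the critic rejects the first two drafts and approves the third, so the loop is exercised before the writer runs.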
### Phase 2: Visual Builder (Frontend MVP)
- Set up React + React Flow
- Build custom node components with editable name, system prompt, model selector, and tool toggles
- Build edge drawing (linear and conditional)
- Write a **parser** that converts the React Flow graph into a structured JSON blueprint and saves it to PostgreSQL
### Phase 3: Integration (Frontend ↔ Backend)
- Make LangGraph **dynamic**: read the JSON blueprint from Phase 2 and construct the graph at runtime
- Add WebSocket events: when LangGraph enters a node, emit an event so the frontend highlights that node
- Display the final output text in the frontend
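The first step of dynamic construction is reading the blueprint and checking that its edges are consistent. The blueprint format has not been defined yet, so the JSON shape below (`nodes`, `edges`, `target`, `routes`) is purely an assumption for illustration, as is the helper `build_adjacency`. A real builder would go on to register these nodes and edges with LangGraph.

```python
import json
from typing import Dict, List

# Hypothetical blueprint; the real schema is still to be designed in Phase 2.
BLUEPRINT_JSON = """
{
  "version": 1,
  "nodes": [
    {"id": "master", "name": "Master AI"},
    {"id": "critic", "name": "Critic AI"},
    {"id": "writer", "name": "Writer AI"}
  ],
  "edges": [
    {"source": "master", "target": "critic"},
    {"source": "critic", "routes": {"rework": "master", "approve": "writer"}}
  ]
}
"""


def build_adjacency(blueprint: dict) -> Dict[str, List[str]]:
    """Return node id -> sorted list of possible successor ids."""
    node_ids = {node["id"] for node in blueprint["nodes"]}
    adjacency: Dict[str, List[str]] = {node_id: [] for node_id in node_ids}
    for edge in blueprint["edges"]:
        # Linear edges carry one target; conditional edges carry a route map.
        targets = [edge["target"]] if "target" in edge else list(edge["routes"].values())
        for target in targets:
            if edge["source"] not in node_ids or target not in node_ids:
                raise ValueError(f"edge references unknown node: {edge}")
            adjacency[edge["source"]].append(target)
    return {node_id: sorted(successors) for node_id, successors in adjacency.items()}


graph = build_adjacency(json.loads(BLUEPRINT_JSON))
```

Note that the result contains a cycle (`critic` can lead back to `master`), which the builder must preserve rather than reject.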
### Phase 4: Tools & God Mode (Enterprise Features)
- Integrate Tavily Search API and PyPDF + vector store as agent tools
- Assign tools to specific nodes in the frontend
- Implement Human-in-the-Loop using LangGraph's `interrupt_before`
- Build the approval UI: display the paused state, reason, and Approve / Reject / Modify buttons
---
## UI Structure
### Tab A: "Rat-Architekt" (Setup Mode)
- Infinite canvas (React Flow)
- Drag nodes from a sidebar panel onto the canvas
- Click a node → open settings panel (name, system prompt, LLM, tool toggles)
- Draw edges between nodes; mark edges as conditional where needed
### Tab B: "Konferenzzimmer" (Execution Mode)
- Text input or PDF upload to start a council run
- Auto-Pilot / God Mode toggle
- Live diagram view: active agent node pulses/glows (WebSocket-driven)
- God Mode: approval popup when the graph pauses
---
## Conventions for AI Assistants
### Language
- All **code, variable names, function names, comments, and commit messages** must be in **English**
- User-facing UI text and in-code string literals for the UI may be in German (matching the product's target audience)
- Do not translate the existing German `README.md`
### Python Code Style
- Use Python 3.11+
- Type hints are mandatory — use `TypedDict` for state classes, `Annotated` for LangGraph reducers
- Follow PEP 8; use `ruff` for linting and `black` for formatting if configured
- Keep LangGraph node functions pure where possible (single input state → output state dict)
- Name node functions descriptively: `master_agent_node`, `critic_agent_node`, `route_decision`
### FastAPI Conventions
- Use async route handlers
- Separate route definitions from business logic (use service modules)
- WebSocket endpoint for live agent status: `/ws/council/{run_id}`
- REST endpoints for CRUD on council blueprints: `/api/councils/`
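The status messages sent over the WebSocket endpoint could be built in a small service function that is independent of FastAPI, which keeps the route handler thin. This is a sketch under assumptions: the message fields, the status values, and the helper names `ws_path` and `node_event` are hypothetical; only the path pattern comes from the convention above.

```python
import json
from datetime import datetime, timezone

ALLOWED_STATUSES = {"entered", "finished", "paused"}  # assumed set of status values


def ws_path(run_id: str) -> str:
    """Build the WebSocket path for one council run."""
    return f"/ws/council/{run_id}"


def node_event(run_id: str, node_id: str, status: str, at: datetime) -> str:
    """Serialize one agent status update as a JSON text message."""
    if status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {status!r}")
    payload = {
        "run_id": run_id,
        "node_id": node_id,
        "status": status,
        "timestamp": at.isoformat(),
    }
    return json.dumps(payload, sort_keys=True)


event = node_event(
    "run-42",
    "critic",
    "entered",
    datetime(2026, 2, 20, 15, 20, 42, tzinfo=timezone.utc),
)
```

A route handler would then only need to forward the resulting string to the connected client.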
### React / Frontend Conventions
- Use functional components with hooks
- React Flow nodes must be wrapped in custom components (never use raw default nodes for agent cards)
- The JSON format emitted by the parser (Phase 2) must be the canonical exchange format between frontend and backend; keep it versioned
- State management: use React context or Zustand (avoid Redux unless team decides otherwise)
### Database
- Council blueprints are stored as JSONB columns in PostgreSQL
- Include a `version` field in blueprint JSON to allow schema evolution
- Use Alembic for migrations
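The `version` field lets the backend upgrade older blueprints when it loads them. The sketch below shows the general pattern only: the version numbers, the `execution_mode` field added in the imaginary version 2, and the helper `upgrade_blueprint` are all hypothetical.

```python
SUPPORTED_VERSIONS = {1, 2}  # assumed; no blueprint versions exist yet
CURRENT_VERSION = 2


def upgrade_blueprint(blueprint: dict) -> dict:
    """Return a copy of the blueprint upgraded to CURRENT_VERSION."""
    version = blueprint.get("version")
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(f"unsupported blueprint version: {version!r}")
    upgraded = dict(blueprint)  # shallow copy so the stored original is not mutated
    if version == 1:
        # Imaginary v1 -> v2 change: an execution mode with a default value.
        upgraded.setdefault("execution_mode", "auto_pilot")
        upgraded["version"] = 2
    return upgraded


old = {"version": 1, "nodes": [], "edges": []}
new = upgrade_blueprint(old)
```

Rejecting unknown versions early is safer than attempting to run a blueprint the backend does not understand.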
### Testing
- Write pytest tests for all LangGraph node functions and routing logic
- Mock LLM calls in unit tests (do not make real API calls in CI)
- Frontend: React Testing Library for component tests
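A unit test with a mocked LLM might look like the sketch below. It assumes a hypothetical node signature in which the LLM is injected as a plain callable; the actual node functions may receive their model differently. `unittest.mock.Mock` from the standard library stands in for the model, so no API call is made.

```python
from unittest.mock import Mock


def master_agent_node(state: dict, llm) -> dict:
    """Hypothetical node: `llm` is any callable that maps a prompt to text."""
    prompt = f"Write a draft about: {state['input_topic']}"
    return {"current_draft": llm(prompt)}


def test_master_agent_node_uses_llm_output():
    fake_llm = Mock(return_value="Mocked draft")
    update = master_agent_node({"input_topic": "solar power"}, fake_llm)
    # The node should pass the LLM output through as the new draft...
    assert update == {"current_draft": "Mocked draft"}
    # ...and call the model exactly once with the expected prompt.
    fake_llm.assert_called_once_with("Write a draft about: solar power")
```

Injecting the model as a parameter is one design option; patching a module-level client with `unittest.mock.patch` would work as well.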
### Git Workflow
- Branch naming: `feature/<short-description>`, `fix/<short-description>`
- Commit messages: imperative mood, English, e.g. `Add critic agent routing logic`
- Never commit API keys or `.env` files
### Environment Variables
Once code exists, the expected environment variables include:
```
ANTHROPIC_API_KEY=
OPENAI_API_KEY=
TAVILY_API_KEY=
DATABASE_URL=postgresql://...
CHROMA_PERSIST_DIR=./chroma_db
```
Create a `.env.example` file (no real values) and add `.env` to `.gitignore`.
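At startup the backend could fail fast when a required variable is missing. The sketch below uses only the standard library; the helper `load_settings` is hypothetical, and treating `CHROMA_PERSIST_DIR` as optional with a default is an assumption, not a stated requirement.

```python
import os
from typing import Dict, Mapping

REQUIRED_VARS = ["ANTHROPIC_API_KEY", "OPENAI_API_KEY", "TAVILY_API_KEY", "DATABASE_URL"]
DEFAULTS = {"CHROMA_PERSIST_DIR": "./chroma_db"}  # assumed to be optional


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect settings from the environment, raising if a required one is absent."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    settings = {name: env[name] for name in REQUIRED_VARS}
    for name, default in DEFAULTS.items():
        settings[name] = env.get(name, default)
    return settings


# Example with an explicit mapping instead of the real environment.
example_env = {
    "ANTHROPIC_API_KEY": "test-anthropic",
    "OPENAI_API_KEY": "test-openai",
    "TAVILY_API_KEY": "test-tavily",
    "DATABASE_URL": "postgresql://localhost/test",
}
settings = load_settings(example_env)
```

Accepting the mapping as a parameter makes the function testable without touching the process environment.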
---
## Key Design Constraints
1. **Cycles are first-class** — never flatten the graph into a DAG just to simplify code. LangGraph's cycle support is the core value proposition.
2. **State is the source of truth** — agents must not store state internally; everything passes through `CouncilState`.
3. **No hardcoded graphs in production** — Phase 1 may hard-code a test graph, but from Phase 3 onward the graph must be dynamically built from the JSON blueprint.
4. **WebSockets for real-time updates** — polling is not acceptable for agent status; use WebSocket events.
5. **Human-in-the-Loop via `interrupt_before`** — do not build a custom pause mechanism; use LangGraph's built-in support.
---
## Repository Structure (Target — Once Code Exists)
```
KI-Konzil/
├── README.md # German PRD / project blueprint (do not modify)
├── CLAUDE.md # This file
├── .env.example # Template for required environment variables
├── backend/
│ ├── main.py # FastAPI app entrypoint
│ ├── api/ # Route definitions
│ ├── services/ # Business logic, LangGraph graph builder
│ ├── agents/ # Individual agent node functions
│ ├── state.py # CouncilState TypedDict definition
│ ├── tools/ # Web search and PDF reader tool wrappers
│ └── tests/ # pytest test suite
├── frontend/
│ ├── src/
│ │ ├── components/ # React components
│ │ │ ├── nodes/ # React Flow custom node components
│ │ │ └── edges/ # React Flow custom edge components
│ │ ├── pages/ # Page-level components / Next.js pages
│ │ ├── hooks/ # Custom React hooks (WebSocket, council API)
│ │ └── utils/ # Parser: React Flow JSON → blueprint JSON
│ └── public/
└── docker-compose.yml # Local development environment
```
---
## Getting Started (Once Implemented)
```bash
# Backend
cd backend
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
uvicorn main:app --reload
# Frontend (in a second terminal, starting from the repository root)
cd frontend
npm install
npm run dev
```