Domain-Specific Knowledge Base

The trading system uses a Retrieval-Augmented Generation (RAG) architecture to ground agent recommendations in expert financial literature and real-time market data. The knowledge base is partitioned by domain, so each agent queries the vector store relevant to its mandate.

Knowledge Repository Structure

The knowledge base lives in the /knowledge directory, which is divided into domain sub-folders. Documents placed in these sub-folders are automatically chunked and indexed into the system's vector database.
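
As a rough illustration of how documents could be grouped by domain before indexing, the sketch below walks a knowledge directory and collects supported files per sub-folder. The function name discover_documents and the SUPPORTED_SUFFIXES constant are hypothetical and not part of KnowledgeIndexer; the extension filter assumes only the .md and .txt formats described under Supported Formats.

```python
from pathlib import Path
from typing import Dict, List

# Assumption: only Markdown and plain-text files are indexed.
SUPPORTED_SUFFIXES = {".md", ".txt"}


def discover_documents(knowledge_dir: str) -> Dict[str, List[Path]]:
    """Map each domain sub-folder name to the supported documents inside it."""
    root = Path(knowledge_dir)
    documents: Dict[str, List[Path]] = {}
    # Each immediate sub-folder is treated as one domain (hypothetical layout).
    for domain_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        files = sorted(
            f for f in domain_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in SUPPORTED_SUFFIXES
        )
        documents[domain_dir.name] = files
    return documents
```

Files sitting directly in the root directory, and files with other extensions such as .pdf, are skipped in this sketch.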

Knowledge Indexing

The KnowledgeIndexer class provides the interface for transforming raw text and markdown files into searchable vector embeddings.

Usage Example

To re-index the entire knowledge base (e.g., after adding new strategy documents):

from agents.rag.indexer import KnowledgeIndexer

# Initialize indexer pointing to the knowledge directory
indexer = KnowledgeIndexer(knowledge_dir="./knowledge")

# Process and index all domains
stats = indexer.index_all()

print(f"Indexing complete: {stats}")
# Example output: Indexing complete: {'technical': 150, 'sentiment': 85, ...}

Supported Formats

Markdown (.md) / Text (.txt): Preferred formats. The indexer uses header-aware splitting to preserve the context of each section.
Curated Summaries: PDF processing via PyPDF2 exists in the system but is currently internal and disabled for raw files. The recommended workflow is to provide curated Markdown summaries of research papers, so that the resulting chunks are high-quality and actionable.

Data Schema & Structured Knowledge

Knowledge retrieved via RAG is structured through Pydantic schemas. When an agent acts on retrieved information, it emits its output in a format the Orchestrator can parse for the dashboard and backtesting engine.

Specialization Metadata

Each agent type extends the AgentRecommendation base class with domain-specific metadata derived from the knowledge base:

from typing import Optional

from pydantic import BaseModel


class TechnicalMetadata(BaseModel):
    rsi_signal: Optional[str]  # OVERSOLD/NEUTRAL/OVERBOUGHT
    macd_signal: Optional[str]
    support_level: Optional[float]
    resistance_level: Optional[float]


class SentimentMetadata(BaseModel):
    fear_greed_interpretation: Optional[str]
    news_sentiment: Optional[str]  # POSITIVE/NEGATIVE/NEUTRAL
    contrarian_signal: Optional[bool]
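
To make the parsing step more concrete, here is a dependency-free sketch of how an agent's JSON payload might be checked before it reaches the Orchestrator. It uses a standard-library dataclass as a stand-in for the Pydantic model, so it is not the project's actual schema. The names TechnicalMetadataSketch and parse_technical_metadata are hypothetical, and restricting rsi_signal to the three values in the comment above is an assumption.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Assumption: rsi_signal is limited to the values named in the schema comment.
RSI_SIGNALS = {"OVERSOLD", "NEUTRAL", "OVERBOUGHT"}


@dataclass
class TechnicalMetadataSketch:
    """Dataclass stand-in for TechnicalMetadata; every field defaults to None."""

    rsi_signal: Optional[str] = None
    macd_signal: Optional[str] = None
    support_level: Optional[float] = None
    resistance_level: Optional[float] = None

    def __post_init__(self) -> None:
        if self.rsi_signal is not None and self.rsi_signal not in RSI_SIGNALS:
            raise ValueError(f"unexpected rsi_signal: {self.rsi_signal!r}")
        # Coerce numeric levels to float so downstream consumers see one type.
        for name in ("support_level", "resistance_level"):
            value = getattr(self, name)
            if value is not None:
                setattr(self, name, float(value))


def parse_technical_metadata(raw_json: str) -> TechnicalMetadataSketch:
    """Parse a JSON payload; unknown keys raise TypeError, bad values ValueError."""
    return TechnicalMetadataSketch(**json.loads(raw_json))
```

With the real Pydantic models, the same checks would normally come from field types and validators rather than from a hand-written __post_init__.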

Real-Time News Integration

In addition to static files, the knowledge base is augmented by real-time news streams via the NewsAPIConnector. This component fetches the latest developments to provide context that may not yet be captured in static documentation.

Interface

The connector prioritizes free RSS feeds (CoinTelegraph, CoinDesk) and falls back to NewsAPI for broader coverage.

from data_connectors.newsapi_connector import NewsAPIConnector

connector = NewsAPIConnector(api_key="your_api_key")
news = connector.get_bitcoin_news(limit=10)

# Returns a list of dicts:
# [{"title": "...", "description": "...", "source": "..."}]
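
The priority order described above can be sketched as a small fallback loop. This is an illustration only, not the internals of NewsAPIConnector: the fetch_with_fallback helper, the fetcher callables, and the de-duplication by title are all assumptions made for the example.

```python
from typing import Callable, Dict, Iterable, List

NewsItem = Dict[str, str]
Fetcher = Callable[[int], List[NewsItem]]


def fetch_with_fallback(fetchers: Iterable[Fetcher], limit: int = 10) -> List[NewsItem]:
    """Query sources in priority order until `limit` unique items are collected."""
    collected: List[NewsItem] = []
    seen_titles = set()
    for fetch in fetchers:
        if len(collected) >= limit:
            break  # Higher-priority sources already supplied enough items.
        try:
            items = fetch(limit)
        except Exception:
            # A failing source should not block lower-priority ones.
            continue
        for item in items:
            title = item.get("title", "")
            if title in seen_titles:
                continue  # Skip stories already returned by an earlier source.
            seen_titles.add(title)
            collected.append(item)
            if len(collected) >= limit:
                break
    return collected
```

In this arrangement the RSS fetchers would be listed first and the NewsAPI fetcher last, so the paid source is only queried when the free feeds fail or return too few items.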

Internal Processing Logic

During ingestion, the KnowledgeIndexer performs the following steps:

Header-Aware Splitting: Splits documents by Markdown headers (#, ##) to keep related concepts together.
Paragraph Chunking: Further breaks down large sections into target chunks of ~500 characters.
Metadata Tagging: Each chunk is tagged with its source file and domain to allow for filtered retrieval during agent execution.
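
The three steps above can be sketched in a few lines of standard-library Python. This is a minimal approximation under stated assumptions, not the KnowledgeIndexer source: the function names are hypothetical, the greedy paragraph packing is one possible reading of "target chunks of ~500 characters", and the chunk dictionary keys (text, source, domain) are illustrative.

```python
import re
from typing import Dict, List

# Matches level-1 and level-2 Markdown headers at the start of a line.
HEADER_RE = re.compile(r"^#{1,2} .*$", re.MULTILINE)


def split_by_headers(text: str) -> List[str]:
    """Split at # and ## headers, keeping each header with its body."""
    starts = [m.start() for m in HEADER_RE.finditer(text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # Keep any text that precedes the first header.
    bounds = starts + [len(text)]
    sections = [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    return [s for s in sections if s]


def chunk_paragraphs(section: str, target: int = 500) -> List[str]:
    """Greedily pack paragraphs into chunks of roughly `target` characters."""
    chunks: List[str] = []
    current = ""
    for para in (p.strip() for p in re.split(r"\n\s*\n", section)):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > target:
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def build_chunks(text: str, source: str, domain: str, target: int = 500) -> List[Dict[str, str]]:
    """Run both splitting steps and tag every chunk with its source and domain."""
    return [
        {"text": chunk, "source": source, "domain": domain}
        for section in split_by_headers(text)
        for chunk in chunk_paragraphs(section, target)
    ]
```

Because every chunk carries a domain tag, filtered retrieval reduces to a simple predicate such as `[c for c in chunks if c["domain"] == "technical"]`. Two limitations of the sketch: a single paragraph longer than the target is kept whole, and lines starting with # inside code samples are treated as headers.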