Domain-Specific Knowledge Base
The trading system uses a Retrieval-Augmented Generation (RAG) architecture so that agent recommendations are grounded in expert financial literature and real-time market data. The knowledge base is partitioned by domain, and each agent queries a vector store relevant to its own mandate.
Knowledge Repository Structure
The knowledge base lives in the knowledge/ directory. Documents placed in its domain sub-folders (listed in the table below) are automatically chunked and indexed into the system's vector database.
| Domain | Folder Path | Agent Consumer | Description |
| :--- | :--- | :--- | :--- |
| Technical | knowledge/technical_analysis | Technical Agent | Indicators (RSI, MACD), chart patterns, and trend analysis strategies. |
| Sentiment | knowledge/sentiment | Sentiment Agent | Market psychology, Fear & Greed interpretations, and contrarian indicators. |
| Fundamental | knowledge/fundamental | Fundamental Agent | On-chain metrics, network health, and valuation frameworks. |
| Risk | knowledge/risk_management | Risk Agent | Position sizing, Kelly Criterion, and volatility-based stop-loss logic. |
| Research | knowledge/papers | All Agents | Academic papers and whitepapers providing deep-context theory. |
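As one way to picture the routing in the table, the sketch below maps each agent domain to its folder and appends the shared Research folder for every agent. The names `DOMAIN_FOLDERS`, `SHARED_FOLDER`, and `folders_for_agent` are hypothetical and chosen for illustration; they are not taken from the system's code.

```python
from pathlib import PurePosixPath
from typing import List

# Hypothetical mapping that mirrors the table: domain -> folder path.
DOMAIN_FOLDERS = {
    "technical": "knowledge/technical_analysis",
    "sentiment": "knowledge/sentiment",
    "fundamental": "knowledge/fundamental",
    "risk": "knowledge/risk_management",
}

# Research papers are consumed by all agents, so this folder is always included.
SHARED_FOLDER = "knowledge/papers"


def folders_for_agent(domain: str) -> List[PurePosixPath]:
    """Return the folders an agent would read: its own domain plus shared papers."""
    if domain not in DOMAIN_FOLDERS:
        raise KeyError(f"unknown agent domain: {domain!r}")
    return [PurePosixPath(DOMAIN_FOLDERS[domain]), PurePosixPath(SHARED_FOLDER)]
```

Calling `folders_for_agent("risk")` would yield the risk_management folder followed by the papers folder, which matches the "All Agents" row of the table.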
Knowledge Indexing
The KnowledgeIndexer class provides the interface for transforming raw text and markdown files into searchable vector embeddings.
Usage Example
To re-index the entire knowledge base (e.g., after adding new strategy documents):
from agents.rag.indexer import KnowledgeIndexer
# Initialize indexer pointing to the knowledge directory
indexer = KnowledgeIndexer(knowledge_dir="./knowledge")
# Process and index all domains
stats = indexer.index_all()
print(f"Indexing complete: {stats}")
# Output: {'technical': 150, 'sentiment': 85, ...}
Supported Formats
- Markdown (.md) / Text (.txt): Preferred formats. The indexer uses header-aware splitting to preserve the context of each section.
- Curated Summaries: While the system supports PDF processing via PyPDF2 (currently internal/disabled for raw files), the recommended workflow is to provide curated Markdown summaries of research papers to ensure high-quality, actionable chunks.
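The format rules above could translate into a file-discovery step along the following lines. This is a minimal sketch, not the indexer's actual code: `SUPPORTED_SUFFIXES` and `iter_indexable_files` are hypothetical names, and the recursive walk and case-insensitive suffix match are assumptions.

```python
from pathlib import Path
from typing import Iterator, Union

# Assumption: only Markdown and plain-text files are indexed; raw PDFs are skipped.
SUPPORTED_SUFFIXES = {".md", ".txt"}


def iter_indexable_files(knowledge_dir: Union[str, Path]) -> Iterator[Path]:
    """Yield supported files under knowledge_dir in a stable, sorted order."""
    for path in sorted(Path(knowledge_dir).rglob("*")):
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES:
            yield path
```

With a filter like this, a PDF dropped into a domain folder would simply be ignored until a curated Markdown summary is added next to it.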
Data Schema & Structured Knowledge
Knowledge retrieved via RAG is structured through Pydantic schemas, so every agent emits its output in a format that the Orchestrator can parse for the dashboard and the backtesting engine.
Specialization Metadata
Each agent type extends the AgentRecommendation base class with domain-specific metadata derived from the knowledge base:
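from typing import Optional
from pydantic import BaseModel


class TechnicalMetadata(BaseModel):
    # Explicit None defaults keep every field optional in both Pydantic v1 and v2
    rsi_signal: Optional[str] = None  # OVERSOLD/NEUTRAL/OVERBOUGHT
    macd_signal: Optional[str] = None
    support_level: Optional[float] = None
    resistance_level: Optional[float] = None


class SentimentMetadata(BaseModel):
    fear_greed_interpretation: Optional[str] = None
    news_sentiment: Optional[str] = None  # POSITIVE/NEGATIVE/NEUTRAL
    contrarian_signal: Optional[bool] = None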
Real-Time News Integration
In addition to static files, the knowledge base is augmented by real-time news streams via the NewsAPIConnector. This component fetches the latest developments to provide context that may not yet be captured in static documentation.
Interface
The connector prioritizes free RSS feeds (CoinTelegraph, CoinDesk) and falls back to NewsAPI for broader coverage.
from data_connectors.newsapi_connector import NewsAPIConnector
connector = NewsAPIConnector(api_key="your_api_key")
news = connector.get_bitcoin_news(limit=10)
# Returns a list of dicts:
# [{"title": "...", "description": "...", "source": "..."}]
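The RSS-first priority described in this section might be structured roughly as follows. This sketch is independent of `NewsAPIConnector` internals, which are not shown here: `fetch_with_fallback`, the injectable fetcher callables, the title-based deduplication, and the rule that the fallback runs only when the primary feeds return fewer than `limit` items are all assumptions made for illustration.

```python
from typing import Callable, Dict, List

Article = Dict[str, str]
Fetcher = Callable[[int], List[Article]]  # takes a limit, returns article dicts


def fetch_with_fallback(primary: List[Fetcher], fallback: Fetcher, limit: int = 10) -> List[Article]:
    """Collect articles from primary fetchers first; use the fallback to fill any gap."""
    articles: List[Article] = []
    seen_titles = set()

    def add(batch: List[Article]) -> None:
        # Deduplicate by title and never exceed the requested limit.
        for item in batch:
            title = item.get("title", "")
            if title and title not in seen_titles and len(articles) < limit:
                seen_titles.add(title)
                articles.append(item)

    for fetch in primary:
        try:
            add(fetch(limit))
        except Exception:
            # A failing feed should not block the remaining sources.
            continue

    if len(articles) < limit:
        add(fallback(limit))
    return articles
```

Passing the fetchers in as callables keeps the priority logic testable without network access; in practice each callable would wrap an RSS parser or a NewsAPI request.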
Internal Processing Logic
The indexer performs the following steps during the ingestion phase:
- Header-Aware Splitting: Splits documents by Markdown headers (#, ##) to keep related concepts together.
- Paragraph Chunking: Further breaks down large sections into target chunks of ~500 characters.
- Metadata Tagging: Each chunk is tagged with its source file and domain to allow for filtered retrieval during agent execution.
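The three ingestion steps can be sketched in plain Python as shown below. This is an illustration under stated assumptions rather than the `KnowledgeIndexer` implementation: the function names, the greedy paragraph packing, and the dict layout of a chunk are hypothetical, and the embedding step is left out.

```python
import re
from typing import Dict, List

# Matches level-1 and level-2 Markdown headers at the start of a line.
HEADER_RE = re.compile(r"^#{1,2} ", re.MULTILINE)


def split_by_headers(text: str) -> List[str]:
    """Split Markdown text into sections that each start at a '#' or '##' header."""
    starts = [m.start() for m in HEADER_RE.finditer(text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any preamble that precedes the first header
    bounds = starts + [len(text)]
    sections = [text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    return [s for s in sections if s]


def chunk_paragraphs(section: str, target: int = 500) -> List[str]:
    """Greedily pack paragraphs into chunks of roughly `target` characters."""
    chunks: List[str] = []
    current = ""
    for para in re.split(r"\n\s*\n", section):
        para = para.strip()
        if not para:
            continue
        if current and len(current) + len(para) + 2 > target:
            chunks.append(current)
            current = para  # a single oversized paragraph is kept whole
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks


def build_chunks(text: str, source: str, domain: str) -> List[Dict[str, str]]:
    """Run both splitting stages and tag each chunk with its source file and domain."""
    return [
        {"text": chunk, "source": source, "domain": domain}
        for section in split_by_headers(text)
        for chunk in chunk_paragraphs(section)
    ]
```

Because every chunk carries a `domain` key, a retrieval step could restrict results with a simple filter such as `[c for c in chunks if c["domain"] == "technical"]`. Note that this simple regex would also treat a `# comment` line inside a code sample as a header.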