System Performance & Research
Evaluation metrics and performance characteristics extracted from the automated benchmarking suite.
Precision@5 Retrieval
Share of the top 5 retrieved memories that are relevant to the LLM prompt context.
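
As a minimal sketch of how a Precision@5 score is typically computed, the function below counts how many of the first k retrieved items belong to a known set of relevant items. The names `precision_at_k`, `retrieved`, and `relevant` are illustrative assumptions and are not taken from the benchmarking suite.

```python
from typing import Sequence, Set


def precision_at_k(retrieved: Sequence[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the top-k retrieved items that are in the relevant set.

    Hypothetical helper for illustration; the benchmark's own scoring
    code is not shown in this document.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = list(retrieved)[:k]
    hits = sum(1 for item in top_k if item in relevant)
    # Standard precision@k divides by k, even if fewer than k items came back.
    return hits / k
```

For example, if 2 of the 5 returned memories are relevant, the score is 2 / 5 = 0.4.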
Average Query Latency
End-to-end query execution time (retrieving, scoring, and formatting facts) under concurrent workloads.
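
One way to measure this kind of latency is to time each query individually while several are in flight at once. The sketch below assumes a caller-supplied `run_query` callable (hypothetical; it stands in for whatever client issues the real query) and uses only the Python standard library. The worker count is an arbitrary assumption.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def timed_call(run_query: Callable[[str], object], query: str) -> float:
    """Return the wall-clock time of a single query, in milliseconds."""
    start = time.perf_counter()
    run_query(query)
    return (time.perf_counter() - start) * 1000.0


def measure_latencies(
    run_query: Callable[[str], object],
    queries: Sequence[str],
    workers: int = 8,  # assumed concurrency level, not from the benchmark
) -> List[float]:
    """Issue queries concurrently and collect per-query latency in ms."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda q: timed_call(run_query, q), queries))


def average_latency_ms(latencies: Sequence[float]) -> float:
    """Arithmetic mean of the collected latencies."""
    return statistics.mean(latencies)
```

Reporting percentiles (for instance p95) alongside the mean is a common design choice, since averages can hide slow outliers under load.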
Token Consumption Efficiency
Token volume reduction compared to feeding raw conversation logs into the LLM context.
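
A token reduction figure of this kind can be expressed as one minus the ratio of tokens sent with memory retrieval to tokens sent with raw logs. The sketch below is an assumption about the arithmetic only. The function name and the sample counts are hypothetical, and the tokenizer used by the benchmark is not stated in this document.

```python
def token_reduction(raw_tokens: int, memory_tokens: int) -> float:
    """Relative reduction in tokens versus sending the raw conversation log.

    Returns a value in [0, 1] when memory_tokens <= raw_tokens.
    """
    if raw_tokens <= 0:
        raise ValueError("raw_tokens must be positive")
    if memory_tokens < 0:
        raise ValueError("memory_tokens must be non-negative")
    return 1.0 - memory_tokens / raw_tokens


# Illustrative numbers only (not benchmark data):
# 10,000 raw-log tokens replaced by 90 retrieved-memory tokens.
reduction = token_reduction(10_000, 90)
print(f"{reduction:.1%}")
```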
Competitive Performance Analysis
Benchmark comparing self-hosted Kyros AI against Mem0 and standard RAG.
| Evaluation Dimension | Kyros AI (Self-Hosted) | Mem0 | Standard RAG |
|---|---|---|---|
| Precision@5 Retrieval | 100% | 85% | 62% |
| Average Query Latency | 37.2 ms | 148.6 ms | 120.4 ms |
| Token Consumption Efficiency | 99.1% Reduction | Baseline | Dynamic |
| Cryptographic Tamper Protection | Enabled (Merkle + SHA) | None | None |
| Biologically Inspired Decay Weights | Enabled (Ebbinghaus) | None | None |
| Context-Aware Causal Graphing | Enabled | None | None |
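
For readers unfamiliar with the Ebbinghaus row above: the forgetting curve is commonly modeled as exponential decay, R = exp(-t / S), where t is the time elapsed and S is a stability constant. The sketch below shows how such a curve could yield a decay weight for a memory. It is an illustration under that assumption, not Kyros AI's actual weighting code, and the names `retention_weight`, `elapsed_hours`, and `stability_hours` are hypothetical.

```python
import math


def retention_weight(elapsed_hours: float, stability_hours: float) -> float:
    """Exponential forgetting-curve weight R = exp(-t / S).

    elapsed_hours: time since the memory was stored or last reinforced.
    stability_hours: assumed stability constant; larger means slower decay.
    """
    if elapsed_hours < 0:
        raise ValueError("elapsed_hours must be non-negative")
    if stability_hours <= 0:
        raise ValueError("stability_hours must be positive")
    return math.exp(-elapsed_hours / stability_hours)
```

A weight like this could be multiplied into a similarity score so that older, unreinforced memories rank lower. Whether the system combines scores this way is not stated here.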
Evaluation Methodology
Retrieval latency was measured under concurrent requests against a PostgreSQL database with the pgvector extension, with Redis caching active queries. Precision was computed on synthetic chatbot conversation datasets: random conversational noise was injected into each conversation, and each test checked whether the core factual statements (e.g., user profile attributes) appeared in the top 5 results.
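
The noise-injection check described above can be sketched as follows. Everything here is a hypothetical stand-in: `inject_noise`, `fact_hit_rate`, and the toy word-overlap retriever `overlap_retrieve` are assumptions for illustration, and the real suite would call the system under test instead. Note that a "did the fact appear in the top 5" check is, strictly speaking, a hit rate at 5. How the suite maps it to its reported precision figure is not specified.

```python
import random
from typing import Callable, List, Sequence, Tuple

Case = Tuple[str, str, List[str]]  # (query, expected fact, noisy corpus)


def inject_noise(facts: Sequence[str], noise: Sequence[str], seed: int = 0) -> List[str]:
    """Mix core facts with noise messages in a reproducible random order."""
    mixed = list(facts) + list(noise)
    random.Random(seed).shuffle(mixed)
    return mixed


def overlap_retrieve(query: str, corpus: Sequence[str]) -> List[str]:
    """Toy retriever: rank messages by the number of words shared with the query."""
    q_words = set(query.lower().split())
    return sorted(
        corpus,
        key=lambda msg: len(q_words & set(msg.lower().split())),
        reverse=True,
    )


def fact_hit_rate(
    cases: Sequence[Case],
    retrieve: Callable[[str, Sequence[str]], List[str]],
    k: int = 5,
) -> float:
    """Share of cases whose expected fact appears in the top-k results."""
    if not cases:
        raise ValueError("cases must not be empty")
    hits = sum(
        1 for query, fact, corpus in cases if fact in retrieve(query, corpus)[:k]
    )
    return hits / len(cases)


# Illustrative data only (not from the benchmark datasets).
fact = "user lives in Lisbon"
noise = ["nice weather today", "lol ok", "what time is it", "brb", "see you", "haha"]
corpus = inject_noise([fact], noise)
score = fact_hit_rate([("where does the user live", fact, corpus)], overlap_retrieve)
```

Fixing the shuffle seed keeps runs reproducible, which makes comparisons across systems easier to interpret.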