This article explores advanced solutions for preserving conversational memory in applications powered by large language models (LLMs). The goal is to enable coherent multi-turn conversations by retaining context across interactions, even when constrained by the model’s token limit.
LLMs have a limited context window, which makes long-term conversational memory hard to maintain. Without proper memory management, the model can lose the thread of a conversation, and follow-up answers may drift off-topic or hallucinate details unrelated to earlier turns.
```
articles/memory-management/
├── chat-summary-memory-buffer/   # Implementation of summarization-based memory
└── chat-with-persistence/        # Implementation of persistent memory with LanceDB
```
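To make the summarization-based approach concrete, here is a minimal, self-contained sketch of the underlying idea, not the LlamaIndex API itself: when the transcript exceeds a token budget, the oldest turns are collapsed into a running summary so the most recent turns stay verbatim. The `count_tokens` and `summarize` helpers are hypothetical stand-ins for a real tokenizer and an LLM summarization call.

```python
def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer: one token per whitespace word.
    return len(text.split())


def summarize(messages: list) -> str:
    # Placeholder for an LLM summarization call over the evicted turns.
    return "Summary of %d earlier messages." % len(messages)


class SummaryMemoryBuffer:
    """Keeps recent turns verbatim; compresses older turns into a summary."""

    def __init__(self, token_limit: int):
        self.token_limit = token_limit
        self.summary = ""       # compressed history of evicted turns
        self.messages = []      # recent turns kept verbatim
        self._evicted = []      # all turns folded into the summary so far

    def add(self, message: str) -> None:
        self.messages.append(message)
        self._compress()

    def _tokens(self) -> int:
        return count_tokens(self.summary) + sum(
            count_tokens(m) for m in self.messages
        )

    def _compress(self) -> None:
        # Evict oldest turns into the summary until we fit the budget,
        # always keeping at least the latest message verbatim.
        while self._tokens() > self.token_limit and len(self.messages) > 1:
            self._evicted.append(self.messages.pop(0))
            self.summary = summarize(self._evicted)

    def context(self) -> str:
        # Prompt context: summary of old turns followed by recent turns.
        parts = [self.summary] if self.summary else []
        return "\n".join(parts + self.messages)
```

In the real implementation, `summarize` would call the LLM and `count_tokens` would use the model's tokenizer; the eviction loop is the same shape.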
If you have found another way to tackle the same issue using LlamaIndex, feel free to open a PR! Check out our CONTRIBUTING.md file for more guidance.