Knowledge Base

Also known as: KB, Document Store, Corpus

Definition

A structured collection of information that an AI system can query at runtime. In LLM contexts, knowledge bases typically store documents, chunks, or facts that are retrieved and injected into prompts. Unlike the model's parametric knowledge (encoded in its weights), a knowledge base is explicit, updatable, and auditable: its contents can be inspected, changed without retraining, and traced back to a source.
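The retrieve-and-inject pattern described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the `Document`, `KnowledgeBase`, and `build_prompt` names are hypothetical, the sample policies are made up, and relevance is approximated by simple word overlap where a real system would use a proper ranking method.

```python
from dataclasses import dataclass


@dataclass
class Document:
    doc_id: str
    text: str


class KnowledgeBase:
    """Toy in-memory knowledge base (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._docs: dict[str, Document] = {}

    def upsert(self, doc: Document) -> None:
        # Explicit and updatable: changing knowledge is a data write, not retraining.
        self._docs[doc.doc_id] = doc

    def retrieve(self, query: str, k: int = 2) -> list[Document]:
        # Naive relevance: number of query words that also appear in the document.
        terms = set(query.lower().split())
        scored = [
            (len(terms & set(doc.text.lower().split())), doc)
            for doc in self._docs.values()
        ]
        scored = [(score, doc) for score, doc in scored if score > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [doc for _, doc in scored[:k]]


def build_prompt(kb: KnowledgeBase, question: str) -> str:
    # Inject retrieved passages, tagged with their IDs so answers stay auditable.
    passages = kb.retrieve(question)
    context = "\n".join(f"[{d.doc_id}] {d.text}" for d in passages)
    return f"Context:\n{context}\n\nQuestion: {question}"


kb = KnowledgeBase()
kb.upsert(Document("policy-1", "Refunds are accepted within 30 days of purchase"))
kb.upsert(Document("policy-2", "Shipping takes 5 business days"))
prompt = build_prompt(kb, "How many days do refunds take")
print(prompt)
```

Calling `upsert` again with the same `doc_id` replaces the stored text, so the next prompt reflects the change immediately. That is the practical difference from parametric knowledge, which only changes when the model is retrained.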

What this is NOT

  • Not the same as the model's training data (training data is static; KB is queryable at runtime)
  • Not just a vector database (KB is the content; vector DB is one storage option)
  • Not a search engine (KB stores knowledge; search engine indexes the web)

Alternative Interpretations

Different communities use this term differently:

llm-practitioners

A vector database or document store containing chunked and embedded content that a RAG system retrieves from. Often includes metadata for filtering and source attribution.

Sources: Vector database documentation (Pinecone, Weaviate, etc.), LlamaIndex knowledge base patterns
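As a rough sketch of this interpretation, the Python below chunks text, embeds each chunk, and retrieves by similarity with a metadata filter. Everything here is an assumption made for illustration: `chunk`, `embed`, `Record`, `index`, and `search` are hypothetical names, the sample texts are invented, and `embed` is a bag-of-words stand-in for a real embedding model. It does not reproduce the API of Pinecone, Weaviate, or LlamaIndex.

```python
import math
from collections import Counter
from dataclasses import dataclass, field


def chunk(text: str, size: int = 8) -> list[str]:
    # Fixed-size word windows; real systems often split on sentences or headings.
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    # Stand-in for an embedding model: a sparse bag-of-words vector.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Record:
    text: str
    vector: Counter
    metadata: dict = field(default_factory=dict)


def index(text: str, metadata: dict) -> list[Record]:
    # Every chunk keeps the metadata of its source for filtering and attribution.
    return [Record(c, embed(c), metadata) for c in chunk(text)]


def search(records: list[Record], query: str, k: int = 1, **filters: str) -> list[Record]:
    q = embed(query)
    # Apply the metadata filter first, then rank the survivors by similarity.
    candidates = [
        r for r in records
        if all(r.metadata.get(key) == value for key, value in filters.items())
    ]
    return sorted(candidates, key=lambda r: cosine(q, r.vector), reverse=True)[:k]


records = index(
    "Employees accrue vacation monthly. Unused vacation carries over to the next year.",
    {"source": "hr-handbook", "dept": "hr"},
) + index(
    "Reset your password from the login page. Contact IT if the reset email never arrives.",
    {"source": "it-faq", "dept": "it"},
)

hits = search(records, "vacation carries over", dept="hr")
print(hits[0].metadata["source"], "->", hits[0].text)
```

Storing the metadata on every chunk is what makes source attribution possible: each retrieved passage can be shown alongside the document it came from.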

enterprise-it

A repository of organizational knowledge—documents, FAQs, procedures, and expertise—used for knowledge management and self-service support.

Sources: Knowledge management literature, Enterprise wiki and documentation systems

semantic-web

A structured store of facts and relationships, often as a knowledge graph with entities and relations, queryable via SPARQL or similar.

Sources: Knowledge graph literature, Wikidata, DBpedia documentation
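To make the facts-and-relationships view concrete, here is a small Python sketch of a triple store with wildcard pattern matching, loosely analogous to a basic SPARQL graph pattern. It is an illustration under stated assumptions: `TripleStore` and `match` are hypothetical names, the three facts are sample data, and real deployments would use an RDF store and SPARQL itself.

```python
from typing import Optional

Triple = tuple[str, str, str]  # (subject, predicate, object)


class TripleStore:
    """Toy fact store (hypothetical); not a substitute for an RDF database."""

    def __init__(self) -> None:
        self._triples: set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        s: Optional[str] = None,
        p: Optional[str] = None,
        o: Optional[str] = None,
    ) -> list[Triple]:
        # None acts as a wildcard, playing the role of a SPARQL variable.
        return sorted(
            t for t in self._triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        )


store = TripleStore()
store.add("type_2_diabetes", "treated_by", "metformin")
store.add("hypertension", "treated_by", "lisinopril")
store.add("lisinopril", "is_a", "ace_inhibitor")

# Two-hop query: which conditions are treated by a drug that is an ACE inhibitor?
drugs = {s for s, _, _ in store.match(p="is_a", o="ace_inhibitor")}
conditions = sorted(s for s, _, o in store.match(p="treated_by") if o in drugs)
print(conditions)
```

The two-step lookup is a hand-written join over shared entities. A query language such as SPARQL expresses the same join declaratively and lets the store plan it.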

Examples

  • A company's internal documentation indexed in Pinecone
  • A legal research database with case law and statutes
  • A product catalog with descriptions, specs, and FAQs
  • A knowledge graph of medical conditions and treatments

Counterexamples

Things that might seem like a knowledge base but are not:

  • The LLM's training corpus (not queryable at runtime)
  • A search engine like Google (indexes the web, not a curated KB)
  • The conversation history alone (that's context, not a KB)

Relations

Implementations

Tools and frameworks that implement this concept: