Kiji Intelligence
Standalone product · for any company

Make your knowledge answerable by any AI

The Knowledge & Context Orchestrator (KCO) turns sprawling document stores into clean, permission-aware knowledge that any AI can retrieve and answer from. Prepare it once for Claude, GPT, Gemini, NotebookLM or your own RAG, with every claim traceable to its source. It is the engine beneath Venture Intelligence, available on its own.

Claude Projects · OpenAI Custom GPTs · Gemini Gems · NotebookLM · RAG / vector DBs
What it does

Decouple your knowledge from any one AI

Modern AI is limited less by the model than by the data you feed it. KCO turns your document stores into clean, permission-aware, retrieval-ready knowledge — structured, deduped, redacted and sized for any model. Prepare once; ship it to Claude, GPT, Gemini, NotebookLM or your own RAG — or ask it directly. Your knowledge stays platform-agnostic, and every answer traces to its source.

Why not just upload to an LLM?

What a raw upload can't do

For a few clean files, a direct upload is fine. On real corporate data — messy folders, PDFs, spreadsheets, versions and permissions — it breaks down fast.

Limits
Raw upload to an LLM: Hits file-count and size limits fast, often only a handful of files per tool.
With KCO: Splits, sizes and curates output to fit each platform, with an index that routes the model.

Tokens
Raw upload to an LLM: PDFs and rich formatting burn the context budget.
With KCO: Strips bloat, OCRs and compresses, giving far more signal per token.

Tables
Raw upload to an LLM: Pasted rows lose their structure, so calculations over them break.
With KCO: Markdown, or JSONL with one record per row, so structure is preserved.

Permissions
Raw upload to an LLM: Everything is visible to anyone in the chat.
With KCO: Each chunk carries its access controls.

Privacy
Raw upload to an LLM: Raw PII goes straight to a third-party model.
With KCO: PII is redacted first, with an audit trail.

Freshness
Raw upload to an LLM: A one-time upload goes stale as soon as the source files change.
With KCO: Scheduled re-runs keep knowledge current.

How it works

How KCO prepares your data

KCO runs a multi-stage pipeline. The nine steps below are the foundation; the engine layers retrieval, enrichment and research on top.

Step 01

Front-matter metadata

A header on every file records its Drive path, name, type and ingestion time — so the model always knows where a chunk came from.
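
As a rough illustration, such a header could be a YAML-style block prepended to each converted file. This is a minimal sketch, not KCO's actual format: the function name `with_front_matter` and the field names are assumptions chosen to mirror the four fields listed above.

```python
import json
from datetime import datetime, timezone


def with_front_matter(body: str, path: str, name: str, file_type: str,
                      ingested_at: datetime) -> str:
    """Prepend a YAML-style front-matter header to a document body.

    Field names are hypothetical. Values are JSON-quoted so that colons
    or quotes inside a path cannot break the header.
    """
    fields = {
        "path": path,
        "name": name,
        "type": file_type,
        "ingested_at": ingested_at.isoformat(),
    }
    header = "\n".join(f"{key}: {json.dumps(value)}" for key, value in fields.items())
    return f"---\n{header}\n---\n\n{body}"


doc = with_front_matter(
    "# Q3 board notes\n\nRevenue grew 12%.",
    path="Finance/Board/Q3 notes.gdoc",
    name="Q3 notes",
    file_type="application/vnd.google-apps.document",
    ingested_at=datetime(2024, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
)
print(doc)
```

Because the header travels with the text, any chunk cut from the file can repeat it and stay traceable to its source.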

Step 02

Tabular parsing

Small tables become clean Markdown; large tables become JSONL, one self-contained record per row — no broken lookups or column drift.
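
The small-versus-large split can be sketched as follows. The function name `render_table` and the 20-row threshold are illustrative assumptions; the page does not say where KCO draws the line.

```python
import json


def render_table(headers, rows, max_markdown_rows=20):
    """Render a table as Markdown if small, else as JSONL (one record per row).

    The threshold is a hypothetical default, not a documented KCO setting.
    Cell values containing '|' are not escaped in this sketch.
    """
    if len(rows) <= max_markdown_rows:
        lines = [
            "| " + " | ".join(headers) + " |",
            "| " + " | ".join("---" for _ in headers) + " |",
        ]
        lines += ["| " + " | ".join(str(cell) for cell in row) + " |" for row in rows]
        return "\n".join(lines)
    # Each JSONL line repeats the column names, so a row is self-contained
    # even when a retriever returns it without its neighbours.
    return "\n".join(json.dumps(dict(zip(headers, row))) for row in rows)


print(render_table(["region", "revenue"], [["EU", 10], ["US", 20]]))
```

Repeating the keys on every row costs some tokens, but it is what prevents column drift when rows are retrieved in isolation.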

Step 03

OCR + PDF vision

Layout-heavy PDFs and images are converted to clean text; with vision on, pages — including diagrams — are transcribed faithfully.

Step 04

Permissions (ACL)

Each chunk carries its source permissions, so retrieval can be filtered to exactly what a given user is allowed to see.
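
A minimal sketch of permission-filtered retrieval, assuming each chunk stores the set of users and groups allowed to read its source file. The `Chunk` type and `visible_chunks` function are hypothetical names, not KCO's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    allowed: frozenset  # principals (users or groups) copied from the source file


def visible_chunks(chunks, user, groups=()):
    """Keep only chunks the user may read, directly or through a group."""
    principals = {user, *groups}
    return [chunk for chunk in chunks if chunk.allowed & principals]


corpus = [
    Chunk("Cap table summary", frozenset({"group:partners"})),
    Chunk("Office wifi guide", frozenset({"group:all-staff"})),
]
for chunk in visible_chunks(corpus, "ana@example.com", groups=["group:all-staff"]):
    print(chunk.text)
```

Filtering before the chunks reach the model means restricted text never enters the prompt at all.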

Step 05

Privacy & redaction

PII is redacted before anything leaves your environment, either by configurable rules or by Google Cloud DLP, with a full audit trail of what was masked.
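
The rule-based path can be sketched with plain regular expressions. The two patterns and the audit record shape below are illustrative assumptions; they are deliberately simple, will miss many real-world formats, and do not use or represent the Google Cloud DLP API.

```python
import re

# Hypothetical rule set: label -> pattern.
RULES = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text):
    """Mask matches of each rule and return (redacted_text, audit_trail).

    The audit trail records which rule fired and how long the masked value
    was, but never the value itself.
    """
    audit = []
    for label, pattern in RULES.items():
        def mask(match, label=label):
            audit.append({"rule": label, "length": len(match.group())})
            return f"[{label}]"
        text = pattern.sub(mask, text)
    return text, audit


clean, trail = redact("Mail ana@example.com or call +46 70 123 45 67.")
print(clean)
print(trail)
```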

Step 06

De-duplication & versioning

Content-hash dedup, near-duplicate detection and newest-wins version grouping — with incremental re-runs that only reprocess what changed.
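
Content-hash dedup and incremental re-runs can be sketched with the standard library. This covers only exact duplicates after whitespace and case normalization; near-duplicate detection and version grouping are not shown. The function names and the dict shape of a file are assumptions.

```python
import hashlib


def content_hash(text):
    """Hash the text after collapsing whitespace and lowercasing."""
    normalized = " ".join(text.split()).lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def dedupe_newest_wins(files):
    """Keep one file per content hash, preferring the latest 'modified' value."""
    newest = {}
    for file in files:
        key = content_hash(file["text"])
        if key not in newest or file["modified"] > newest[key]["modified"]:
            newest[key] = file
    return list(newest.values())


def changed_since(files, seen_hashes):
    """Return only files whose content hash was not seen in a previous run."""
    return [file for file in files if content_hash(file["text"]) not in seen_hashes]


files = [
    {"name": "plan.txt", "modified": 1, "text": "Hello  World"},
    {"name": "plan (copy).txt", "modified": 2, "text": "hello world"},
    {"name": "budget.txt", "modified": 1, "text": "Other"},
]
print([file["name"] for file in dedupe_newest_wins(files)])
```

Storing the hashes from one run and passing them to `changed_since` on the next is one simple way to reprocess only what changed.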

Step 07

Metadata & classification

Every file and chunk is tagged with domain, keywords and classification — so retrieval can filter to exactly the right material.

Step 08

Token cleansing & compression

Boilerplate, redundant whitespace and formatting bloat are stripped and normalized, which means more signal per token, lower cost, and less risk of details getting “lost in the middle” of a long context.
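
One simple way to approximate this step is to drop lines that repeat across most pages (running headers and footers) and collapse whitespace. The sketch below is an assumption about how such cleansing could work, not KCO's implementation; the `cleanse` name and the 0.6 ratio are hypothetical, and blank lines are dropped entirely for brevity.

```python
import re
from collections import Counter


def cleanse(pages, boilerplate_ratio=0.6):
    """Remove lines repeated on most pages and collapse runs of whitespace."""
    counts = Counter()
    for page in pages:
        # Count each distinct line once per page.
        counts.update({line.strip() for line in page.splitlines() if line.strip()})
    repeated = {
        line for line, n in counts.items()
        if n > 1 and n >= boilerplate_ratio * len(pages)
    }
    cleaned = []
    for page in pages:
        kept = [
            re.sub(r"\s+", " ", line.strip())
            for line in page.splitlines()
            if line.strip() and line.strip() not in repeated
        ]
        cleaned.append("\n".join(kept))
    return cleaned


pages = [
    "ACME Confidential\nRevenue   grew  12%\n\n\n",
    "ACME Confidential\nCosts fell",
    "ACME Confidential\nOutlook stable",
]
print(cleanse(pages))
```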

Step 09

Broad format coverage

Google Docs, Sheets, Excel, PowerPoint, PDFs, images, code and zip archives — resolved by content and extension, even when Drive mis-tags a file.
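
Resolving a format "by content and extension" can be sketched by checking a file's leading bytes first and falling back to its extension. The signatures below are the well-known ones for PDF, PNG, JPEG and ZIP; the function name `resolve_format` and the returned labels are assumptions for illustration.

```python
from pathlib import PurePosixPath

# Leading-byte signatures, checked before the extension is trusted.
MAGIC = [
    (b"%PDF-", "pdf"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"\xff\xd8\xff", "jpeg"),
    (b"PK\x03\x04", "zip"),
]
# Office Open XML files are ZIP containers, so the extension disambiguates them.
OOXML = {".docx": "docx", ".xlsx": "xlsx", ".pptx": "pptx"}


def resolve_format(filename, head):
    """Guess a file's format from its first bytes, then from its extension."""
    extension = PurePosixPath(filename).suffix.lower()
    for signature, kind in MAGIC:
        if head.startswith(signature):
            if kind == "zip" and extension in OOXML:
                return OOXML[extension]
            return kind
    return extension.lstrip(".") or "unknown"


print(resolve_format("report.txt", b"%PDF-1.7\n"))  # mislabelled PDF
```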

The format that matters

Why Markdown

Every pack KCO writes is Markdown, the native language of large language models. Because models are trained on huge amounts of it, they read its structure instead of guessing, so your knowledge lands cleaner, cheaper and more accurately. Write your prompts in Markdown too, and the model follows you the same way. Across the whole preparation (extraction, token cleansing, de-duplication and lean Markdown), a corpus typically shrinks by more than 90%, and often by as much as ~95%, with the reduction measured on every run.

Structure

Logic the model can see

Headings, lists and tables map directly to a document's structure, so the model reasons over the shape of your data — not just the words.

Efficiency

More signal per token

Plain text with minimal markup costs a fraction of the tokens of the same content in PDF, HTML or Word, so more of your knowledge fits the context window, at lower cost.

Prompts

Speak the same language

A prompt with a # goal heading, ## sections, a - list of requirements and fenced examples gives the model explicit structure, so it follows your instructions far more reliably.
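
The prompt shape described above can be assembled with a small helper. This is only a sketch: `build_prompt` and its parameters are hypothetical, and the fence string is built at runtime so the example itself stays easy to embed.

```python
def build_prompt(goal, sections, requirements, example):
    """Assemble a Markdown prompt: # goal, ## sections, a - list, a fenced example."""
    fence = "`" * 3  # three backticks, built at runtime
    parts = [f"# {goal}", ""]
    for title, body in sections.items():
        parts += [f"## {title}", body, ""]
    parts += ["## Requirements"] + [f"- {item}" for item in requirements] + [""]
    parts += ["## Example", fence, example, fence]
    return "\n".join(parts)


prompt = build_prompt(
    "Summarize the Q3 board pack",
    {"Context": "You are given Markdown chunks with source headers."},
    ["Cite the source path for every claim", "Keep it under 200 words"],
    "Revenue grew 12% (Finance/Board/Q3 notes.gdoc)",
)
print(prompt)
```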

Beyond cleaning

A full retrieval engine

KCO doesn't just tidy files — it prepares knowledge for how modern AI actually retrieves, and can answer over it directly.

Output

Every model, one prep

Sized, tagged packs for Claude, GPT, Gemini and NotebookLM — plus embedding-ready RAG / vector JSONL. Prepare once, ship to any AI.

Retrieval

Contextual & hybrid

Heading-aware chunks carry their document and section context, with per-chunk keywords so your store can fuse keyword + vector search.
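
A heading-aware chunker can be sketched as below: each chunk is prefixed with its document title and the nearest heading, so it still makes sense when retrieved alone. The function name and record fields are assumptions, per-chunk keywords are omitted, and only the most recent heading is tracked rather than the full heading hierarchy.

```python
import json
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def heading_aware_chunks(doc_title, markdown):
    """Split Markdown at headings and attach document and section context."""
    chunks, section, buffer = [], "", []

    def flush():
        text = "\n".join(buffer).strip()
        if text:
            context = f"{doc_title} > {section}" if section else doc_title
            chunks.append({
                "document": doc_title,
                "section": section,
                "text": f"[{context}]\n{text}",
            })
        buffer.clear()

    for line in markdown.splitlines():
        match = HEADING.match(line)
        if match:
            flush()  # close the previous section before starting a new one
            section = match.group(2).strip()
        else:
            buffer.append(line)
    flush()
    return chunks


memo = "Intro text.\n# Risks\nFX exposure.\n## Mitigation\nHedging."
for chunk in heading_aware_chunks("Memo", memo):
    print(json.dumps(chunk))  # one JSONL record per chunk
```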

Chunking

Semantic splitting

Splits at meaning boundaries — paragraph, then sentence — never mid-thought, for sharper recall than fixed-size windows.
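
The paragraph-then-sentence idea can be sketched as follows. The `split_semantic` name, the character budget and the naive sentence regex are assumptions; a sentence longer than the budget is kept whole rather than cut.

```python
import re


def split_semantic(text, max_chars=500):
    """Split on paragraphs first; split an oversized paragraph on sentences."""
    chunks = []
    for paragraph in re.split(r"\n\s*\n", text.strip()):
        paragraph = " ".join(paragraph.split())
        if not paragraph:
            continue
        if len(paragraph) <= max_chars:
            chunks.append(paragraph)
            continue
        current = ""
        # Naive sentence boundary: whitespace after '.', '!' or '?'.
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph):
            if current and len(current) + 1 + len(sentence) > max_chars:
                chunks.append(current)
                current = sentence
            else:
                current = f"{current} {sentence}".strip()
        if current:
            chunks.append(current)
    return chunks


sample = "Short paragraph.\n\nOne two three. Four five six. Seven eight nine."
print(split_semantic(sample, max_chars=32))
```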

Summaries

Hierarchical (RAPTOR)

Optional section- and document-level summaries indexed beside the chunks, so big-picture and multi-hop questions still land.

Reasoning

Graph + synthetic Q&A

An optional knowledge graph and generated question/answer pairs connect facts across documents for analytical, connect-the-dots questions.

Research

Ask your knowledge

Built-in deep research rewrites the question and routes simple queries to single-agent answering and complex ones to multi-agent answering, with every claim cited to its source.

Ask your knowledge

Multi-agent deep research

KCO doesn't just prepare your knowledge; it reasons over it. A team of AI agents plans, reads, writes and reviews, each role running on the model best suited to it, so hard questions get grounded, cited answers.

Plan

Decompose

A lead agent breaks a complex request into focused sub-questions and assigns each one.

Retrieve

Read in parallel

Subagents pull the relevant documents in parallel, using query rewriting (HyDE) to surface the right sources for each part.

Synthesize

Write one answer

A writer agent merges every finding into a single coherent, structured response.

Review

Check & fill gaps

A reviewer agent catches missing pieces and fills them before the answer reaches you.

Adaptive

Right-sized effort

Simple questions take a fast single pass; complex ones fan out to the full team — chosen automatically.
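
The plan, retrieve, synthesize and review roles, together with the simple-versus-complex routing, can be sketched as a small orchestration function. Everything here is hypothetical: the agents are passed in as plain callables, and the stub lambdas stand in for real model calls.

```python
from concurrent.futures import ThreadPoolExecutor


def answer(question, plan, retrieve, write, review, is_complex):
    """Route a question to a single pass or to the full plan/read/write/review loop."""
    if not is_complex(question):
        # Fast path: one retrieval, one write, no planning or review.
        return write(question, [retrieve(question)])
    sub_questions = plan(question)
    with ThreadPoolExecutor() as pool:
        # Subagents read in parallel; map() keeps results in sub-question order.
        findings = list(pool.map(retrieve, sub_questions))
    draft = write(question, findings)
    return review(question, draft, findings)


result = answer(
    "Compare A and B",
    plan=lambda q: ["A?", "B?"],
    retrieve=lambda q: f"notes on {q}",
    write=lambda q, findings: " | ".join(findings),
    review=lambda q, draft, findings: draft + " (reviewed)",
    is_complex=lambda q: " and " in q,
)
print(result)
```

Passing the roles in as callables keeps the routing logic independent of which model serves each role.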

Cited

Grounded & governed

Every claim traces to its source document, and answers stay within each user's permissions.

Not just for funds

For any data-heavy team

KCO began inside a venture fund, but the problem it solves is universal. It has been used for deep research and analysis well beyond VC, including for a publicly listed Swedish green-energy company, where it turned a sprawling document base into clean, source-traceable knowledge an AI could reason over.

Governed

Per-document access controls and a full audit trail make it safe to use on sensitive IP and financials.

Private

PII redacted before anything leaves your environment.

Always current

Scheduled re-runs keep your knowledge base fresh as files change.

Put your documents to work

Talk to us