
FAQ

What is DocDuck?

An open-source system to index your documents across providers and query them using AI with cited context.

Do I need deep ML knowledge?

No. Provide an OpenAI-compatible API key and follow the quick start.

Is my data sent to OpenAI?

Only chunk text (for embeddings) and the constructed prompts (for answers) are sent. If that is a concern, point DocDuck at a self-hosted or private OpenAI-compatible endpoint.

Which file types are supported?

Common text formats, Markdown, DOCX, ODT, RTF, and (optionally) PDF. Unsupported types are skipped.

Can I add my own provider?

Yes. Implement the small IDocumentProvider interface; a rough sketch of the idea follows. See the Provider Framework section for the actual contract.
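
For illustration only, here is a minimal sketch of what a provider might look like. The member names and shapes below are assumptions, not DocDuck's actual IDocumentProvider contract; the Provider Framework page documents the real one.

```typescript
// Hypothetical illustration — names and shapes are assumptions,
// not DocDuck's actual IDocumentProvider contract.
interface ProviderDocument {
  id: string;         // stable identifier within the provider
  title: string;
  content: string;    // extracted plain text to be chunked and embedded
  modifiedAt: Date;   // lets the indexer skip unchanged documents
}

interface IDocumentProvider {
  readonly name: string;                        // e.g. "filesystem"
  listDocuments(): Promise<ProviderDocument[]>; // enumerate indexable documents
}

// Example provider backed by an in-memory array (useful for tests or demos).
class InMemoryProvider implements IDocumentProvider {
  readonly name = "in-memory";
  constructor(private docs: ProviderDocument[]) {}
  async listDocuments(): Promise<ProviderDocument[]> {
    return this.docs;
  }
}
```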

How do I reindex everything?

Set FORCE_FULL_REINDEX=true and run the indexer.

How big should chunks be?

Start with 1000-character chunks and a 200-character overlap (see the sketch below); adjust based on how granular you need answers to be.
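
As a rough illustration of what 1000/200 means in practice, here is a generic sliding-window chunker (not DocDuck's internal code):

```typescript
// Illustrative sliding-window chunker: 1000-character chunks,
// with the last 200 characters of each chunk repeated at the
// start of the next one.
function chunkText(text: string, size = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  const step = size - overlap; // 800 new characters per chunk
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break; // reached the last chunk
  }
  return chunks;
}
```

The overlap means text near a chunk boundary also appears at the start of the following chunk, which reduces the chance that a relevant sentence is split across retrieval units.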

Does it support authentication on queries?

Not yet; it is planned. For now, restrict network access or put an authenticating reverse proxy in front of the API.

Can I change the embedding model?

Yes, but you must update the vector dimension to match the new model and reindex everything. Multi-model support is future work.

What database size should I expect?

Roughly: number of chunks × (chunk text size + ~6 KB per embedding + metadata).
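
For a rough feel, assuming 1536-dimensional float embeddings (1536 × 4 bytes ≈ 6 KB each, which is where the ~6 KB figure comes from): 100,000 chunks of ~1 KB text each work out to about 100,000 × (1 KB + 6 KB + metadata) ≈ 700 MB–1 GB, before index overhead.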

How do I deploy on Kubernetes?

Run the indexer as a CronJob and the query API as a Deployment. See the relevant docs sections for details.

Is there a UI?

Not yet; the project is API-first. A reference UI is on the roadmap.

Why not use dedicated vector DB X?

PostgreSQL + pgvector keeps operational complexity low and is sufficient for many workloads. A storage abstraction that would allow dedicated vector databases is being considered for the future.

What does the name / branding mean?

"DocDuck" = Get your document ducks in a row 🦆.

License?

MIT.