# Citare Strategy

*Last revised 2026-04-27.*

---

## Mission — One person processes a paper. Humanity inherits the result.

Every researcher who reads "Edmondson 1999" reads it from scratch. Every LLM that cites it re-derives the same claims by re-parsing the same PDF. Every reviewer who checks a citation re-walks the same logic. The collective cost — across millions of reads — is staggering, and it produces nothing reusable.

Citare's commitment: **structured understanding of a paper should be a one-shot, durable artefact.** When one researcher (or AI) extracts a paper's claims with the canonical prompt, the result becomes a public good. The next reader doesn't re-extract. They query.

This is not a literature search engine. It is **post-search infrastructure** — the place where what a paper *says*, with *what causal strength*, *under what conditions*, lives in a form an AI can cite honestly without re-reading the source.

## Vision — A claim graph is the substrate AI needs to discover new theories.

Once enough peer-reviewed claims are structured with their causal metadata, their boundary conditions, and their relations to other claims, three things become possible that aren't possible today:

- **Cross-domain pattern detection.** The same structural relation ("X moderates Y when Z is high") may appear in psychology, economics, and machine learning literature with completely different surface vocabulary. A claim graph lets an AI see the isomorphism.
- **Gap detection at scale.** "We have 12 cross-sectional studies of A→B but zero longitudinal" becomes a query, not a literature review.
- **Theory generation.** Given a partial claim chain, an AI can propose the missing link as a testable hypothesis, with full provenance to the claims that constrained the proposal.

We are not building "search papers faster."
We are building the **substrate for AI-driven scientific discovery** — a corpus where the unit is the claim, not the paper, and where every claim carries enough metadata that downstream reasoning is honest by construction.

The vision is genuinely speculative. We will not know if AI can really do these things until the corpus reaches the 10K-claim threshold. But the substrate must exist before the question can be asked.

---

## Three commitments that follow

### 1. The unit is the claim.

Indexing papers tells you "this paper exists" — useful for discovery, useless for citation. Indexing claims tells you "this paper says X under condition Y with strength Z." That's what citation actually needs. Citare's schema is built around the claim, not the paper.

### 2. Causal strength is metadata, not opinion.

Every claim carries the study design that produced it. From that, the server derives **safe verbs** — the verbs an honest citation can use. A cross-sectional finding returns *"is associated with"* or *"correlates with"* regardless of how the original author framed it. This single rule prevents the most common citation failure: silently upgrading "associated with" to "causes."

### 3. The corpus belongs to everyone.

`register_claims` is open. No auth. Anyone with a valid extraction can contribute. The protection layer is schema validation plus minimum content checks — not human moderation. This is the only path that scales to the 10K claims the vision needs.

The reciprocal commitment: the data stays free. Read access is free. Write access is free. Citare will accept hosting cost rather than introduce paywalls that fragment the corpus.

---

## What this is *not*

Stating these explicitly so the project doesn't drift into them.

- **Not a literature search engine.** Google Scholar indexes papers. Citare indexes claims. Different products, different success metrics.
- **Not a personalised research assistant.** No accounts, no per-user state, no recommendations.
- **Not a paper-summary tool.** A summary loses the very metadata that makes citation safe.
- **Not a debate platform.** Disputed claims are tagged; we do not host the debate.
- **Not a corpus to be sold.** Public good, by design.

---

## How to participate

Two ways:

**Use it.** Connect your AI to `https://citare.dev/sse` (MCP). When you write a paper or check a citation, your AI can query the graph and follow the safe verbs. Every honest citation that gets made because of Citare is a small win for the field.

**Grow it.** When your AI finds that a paper is missing from Citare, let it run the canonical extraction prompt (also exposed via MCP) and submit the result. One paper takes about five minutes and roughly $1 of API cost. The result is permanent and benefits every future reader.

The full extraction prompt, the PDF acquisition guide, and the registration tool are all exposed through the MCP server. No local install required.

---

## Where we are

84 papers. 2,634 claims. 2,008 relations. 1,115 integrity warnings.

This is enough to validate the design and not nearly enough to enable the vision. The path from here:

- **200 papers**: contributor loop hardens. Cross-paper queries start returning multiple sources with different design bases.
- **1,000 papers**: the discovery surface starts to matter. Citation resolution between cited and citing papers becomes the highest-value feature.
- **10,000 papers**: the vision becomes testable. Can an AI identify cross-domain isomorphisms? Can it propose hypotheses grounded in five separate claim chains? We do not yet know.

---

*Citare. AI-native scholarly claim infrastructure.*
*MCP endpoint: *