What Casefile Is For

Casefile has been live since April. So far it has been used mostly informally, on a small number of matters that reached it through people I know or one degree out from there. Real volume will come over the next few months. Two of the early matters are worth describing, abstractly, because between them they show what the system is for.

The premise, briefly

If you're a self-represented litigant in any of the usual situations where the paperwork has gotten away from you, you can hand Casefile a literal dump of unstructured documents. WhatsApp message exports. Email archives. A Google Takeout file. Scanned papers from a filing cabinet. Whatever shape the evidence is in, in whatever quantity.

The system indexes everything. The archive becomes full-text searchable, and the index is attachment-aware, so a thread of emails with PDFs reads as one continuous record. It builds a timeline of who said what and when. Where ordinary indexing doesn't reach what you need, it brings LLMs in for the things they're good at: reading scanned handwritten content, normalising mixed formats into a comparable form, summarising long threads into usable signal.
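To make the timeline step concrete, here is a minimal sketch of the idea, not Casefile's actual code. It assumes each source has already been parsed into records with an author, a timestamp in a common timezone, and the text. The `Event` type, the `build_timeline` function, and the sample messages are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass(frozen=True)
class Event:
    when: datetime   # when the message was sent or the document was dated
    author: str      # who said it
    source: str      # where it came from, e.g. "whatsapp" or "email"
    text: str        # what was said


def build_timeline(*sources: Iterable[Event]) -> List[Event]:
    """Merge events from any number of sources into one chronological list."""
    merged = [event for source in sources for event in source]
    # sorted() is stable, so events with equal timestamps keep their input order.
    return sorted(merged, key=lambda e: e.when)


# Invented sample data: one WhatsApp export and one email archive.
whatsapp = [Event(datetime(2021, 3, 2, 9, 15), "A", "whatsapp", "I'll send the form today.")]
email = [Event(datetime(2021, 3, 1, 17, 40), "B", "email", "Please send the form.")]

timeline = build_timeline(whatsapp, email)
for e in timeline:
    print(e.when.isoformat(), e.author, e.source, e.text)
```

The point of the sketch is only that once every source is reduced to the same record shape, the format it arrived in stops mattering.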

The thing that layer mostly does is extract what lawyers call statements of fact: the discrete claims a party is making that a court eventually has to make findings on. Casefile pulls every statement of fact out of an archive and attaches it to a source and a date. The result is a structured database of every claim anyone has made, comparable on its face, and that database is what does most of the work after it's built. It's what lets the system put one party's claims next to another's, or next to that same party's earlier statements, and find contradictions that would otherwise be sitting in three different documents on three different dates. A court case turns on findings of fact. Most of how you get there is having a clean, comparable record of every fact in evidence.
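A toy version of that comparison, under heavy assumptions: it supposes the extraction step has already reduced each claim to a normalised `topic` and `value`, which is the hard part and is not shown. The `Statement` fields, the `find_contradictions` function, and the sample claims are hypothetical and are not Casefile's real schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from itertools import combinations
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Statement:
    party: str     # who made the claim
    topic: str     # hypothetical normalised key for what the claim is about
    value: str     # hypothetical normalised content of the claim
    source: str    # document the claim was extracted from
    made_on: date  # date of that document


def find_contradictions(statements: List[Statement]) -> List[Tuple[Statement, Statement]]:
    """Return pairs of statements on the same topic whose values differ."""
    by_topic: Dict[str, List[Statement]] = defaultdict(list)
    for s in statements:
        by_topic[s.topic].append(s)

    pairs = []
    for group in by_topic.values():
        # Compare in date order so the earlier statement comes first in each pair.
        ordered = sorted(group, key=lambda s: s.made_on)
        for earlier, later in combinations(ordered, 2):
            if earlier.value != later.value:
                pairs.append((earlier, later))
    return pairs


# Invented sample data: three claims about the same fact.
statements = [
    Statement("A", "deposit_paid", "yes", "witness statement", date(2022, 5, 3)),
    Statement("A", "deposit_paid", "no", "letter", date(2022, 1, 14)),
    Statement("B", "deposit_paid", "yes", "email", date(2022, 1, 15)),
]

contradictions = find_contradictions(statements)
for earlier, later in contradictions:
    print(f"{earlier.party} ({earlier.source}, {earlier.made_on}): {earlier.value}"
          f"  vs  {later.party} ({later.source}, {later.made_on}): {later.value}")
```

Here party A's letter conflicts both with party B's email and with A's own later witness statement, which is the "same party's earlier statements" case described above. Real claims rarely reduce to an exact string match, so treat the equality test as a stand-in for whatever comparison the extraction layer actually supports.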

The output is the bones of a court bundle: paginated, ordered, indexed, in a shape a court will accept. That's most of what Casefile produces by volume. The other thing it produces, and the reason for the rest of this post, is evidence that would not otherwise have been found.
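The pagination part of a bundle is simple enough to sketch. The following is an illustration only: it assumes documents are ordered by date and numbered continuously, which is one common convention, and individual courts may require something different. `Document`, `paginate`, and the sample documents are invented names and data.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Tuple


@dataclass(frozen=True)
class Document:
    title: str
    dated: date
    pages: int  # number of pages in the document


def paginate(documents: List[Document]) -> List[Tuple[str, int, int]]:
    """Order documents by date and assign continuous bundle page ranges."""
    index = []
    next_page = 1
    for doc in sorted(documents, key=lambda d: d.dated):
        first, last = next_page, next_page + doc.pages - 1
        index.append((doc.title, first, last))
        next_page = last + 1
    return index


# Invented sample data.
docs = [
    Document("Email thread", date(2020, 6, 1), 4),
    Document("Letter", date(2018, 2, 10), 2),
    Document("Invoice", date(2023, 9, 5), 1),
]

bundle_index = paginate(docs)
for title, first, last in bundle_index:
    print(f"{title}: pp. {first}-{last}")
```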

Two early matters

In a corporate dispute between the two principals of a small private company, Casefile worked through several hundred thousand pages of documents. One of the principals had moved to remove the other from the company, and the action looked clean on the surface, executed by written resolution. Buried in the corpus was a small set of scanned and signed pages from years before, describing an arrangement under which the acting principal had been holding a significant portion of the other's shares on their behalf. Both parties had forgotten the document existed. Its existence materially affects whether the removal was procedurally valid. The pages were sitting in a much larger archive that had been in scanned-and-forgotten storage for years, and nobody was going to find them by reading.

In a separate family-law matter, Casefile processed a large archive of documents alongside text-message and email exports. It identified a pattern of contradictions across statements of fact made to courts and to police, with the contradictions falling out cleanly once the statements were laid against a timeline assembled from the surrounding material. The result reframed the case. A litigant working alone, in the time available before a hearing, was not going to find this; a solicitor doing it manually would have read every page at hourly rates, and most aren't engaged for that level of forensic work in the first place.

Both matters are live. I've kept the specifics abstract for that reason.

Why a system like this finds things human review misses

There are two reasons this works, and neither is sophisticated.

Volume. A solicitor reading every page of a several-hundred-thousand-page archive is not a thing that happens. The hourly maths doesn't work for the client, and even if it did, the lawyer would burn out before they got through a tenth. The reading doesn't get done, and the things sitting in the unread parts don't get found.

Structure. A case file gets organised around a narrative: chronology, theme, whichever angle the person assembling it is working with. The piece of evidence that turns a case is often the one sitting outside whichever narrative was being built. It might be in an old scanned envelope nobody thought to open, or in an email thread that didn't seem related when it was filed. A system that reads everything without knowing in advance which pile is supposed to matter finds those.

Most of the time, the haystack is the whole problem.

What it doesn't do yet

Casefile is not perfect. It misses things, surfaces things that aren't relevant, and mislabels relationships between documents in ways that have to be corrected after the fact. The honest summary is that it is getting better every day, and that what it currently produces is roughly the equivalent of a competent junior paralegal's first pass over a corpus. The difference is that this junior has read every page and never gets tired.

What takes Casefile from a useful first pass to a working partner is the layer being built on top of it now: a set of beta tools, not yet on the live site, that put human judgment in the loop. Anything the system extracted, flagged, or inferred becomes reviewable. The user can agree, disagree, push back on relevance, or ask why something was included or left out, and they add the nuance the system can't yet have on its own. Without that layer, Casefile is a fast machine. With it, Casefile is a working partner.

That's the next thing.

What it's for

This matters more than "AI makes document review faster" because of who it serves: the people who currently cannot get competent paperwork done. Divorcing parents on unequal incomes, employees fighting employers many times their size, anyone whose evidence is sitting in a stack of scanned papers no lawyer is going to read for less than tens of thousands of pounds. Their options have been bounded by the cost of being heard.

The first cases through Casefile are early and small. The mechanic is real. The work it's doing, on a few matters that would not have produced this evidence any other way, is what takes a service from a great improvement in document review and discovery to something genuinely transformative for people who are underfunded and overwhelmed.

That's what it's for.