Improving document segmentation to preserve medical record context at scale

Author: Ashley Grodnitzky


This post is part of Pattern Labs, our research initiative focused on translating real-world experimentation into defensible, production-ready workflows for mass tort litigation.


 

Medical record review depends on context.

Attorneys, paralegals, and medical record reviewers are not simply locating facts. They are determining what happened during a specific medical visit, when it happened, and how it fits into a larger case narrative. That work depends on understanding records as a series of real-world medical events that often span multiple pages.

As records grow longer and more fragmented, preserving that context becomes harder, especially at scale.

Page-level review vs. encounter-level review

Context is often lost early in the review process, driven by how medical records are segmented. Many systems segment records at the page level, while reviewers naturally think in terms of medical visits. That difference explains why important context is dropped, even when the information exists in the record.

  Page-level review Encounter-level review
Unit of review Individual page Medical visit (encounter)
How records are evaluated Each page is reviewed on its own Pages are grouped before review
What gets included Only what appears on that page All pages from the same visit
Common outcome Diagnosis without clear date or provider Diagnosis tied to a specific visit
Reviewer experience Manual backtracking to find context Context preserved automatically
Core question answered What is written on this page? What happened during this visit?

 

How existing medical record review tools work

Medical record review in mass tort litigation is time-consuming and increasingly high-volume. Teams often review thousands of records, each containing hundreds or even thousands of pages. To keep pace, many organizations rely on tools designed to speed up review by narrowing what a reviewer needs to look at.

Most of these tools work by scanning records for specific terms such as diagnoses, procedures, or conditions, then isolating the pages where those terms appear. By focusing attention on a smaller subset of pages, they can significantly reduce review time and surface potentially relevant information more quickly.

The tradeoff is context.

Because pages are evaluated in isolation, these tools often lack awareness of what comes before or after a flagged page. Information may be found, but the surrounding visit-level context that gives it meaning is often lost.

A common page-level breakdown in litigation review

This limitation shows up in ways that feel familiar to anyone who reviews medical records.

For example:

  • One page includes a diagnosis or assessment and is flagged as relevant
  • The previous page includes the patient name, encounter date, and provider

The diagnosis is captured, but the context is missing. The output may list the condition but omit the date or leave it unclear which visit the diagnosis belongs to.

This does not happen because the information was absent from the record. It happens because the page containing the context was never included in the review.

That is the practical impact of page-level medical record review in litigation.

How legal and medical reviewers actually read medical records

Medical record review is event-based, not page-based.

When reviewers work through records, they naturally organize information by real-world medical events such as office visits, hospital stays, imaging appointments, or procedures. These events often span multiple pages, but reviewers instinctively group them together.

A page break does not signal the end of a visit. A header on one page often explains the notes on the next. Reviewers scan backward and forward to confirm dates, providers, and patient identifiers as part of understanding the full visit.

This grouping of pages into a single medical event is called a clinical encounter. It reflects how records are understood in practice, even if they are stored as individual pages.

Why reviewing the entire medical record is not a practical solution

In litigation, reviewing entire medical records at once rarely works. Medical records can span tens of thousands of pages. Review typically needs to produce findings for individual visits, not a single summary for an entire file. Reviewing everything at once also creates inefficiency and noise.

The real challenge is not accessing more pages. The challenge is keeping the right pages together.

How encounter-level review organizes records by medical events

Encounter-level review starts with a different assumption: pages from the same medical visit should be reviewed together.

Screenshot 2026-01-11 at 12.27.04 PM

At a high level, the system evaluates each medical record as a sequence of pages. The segmentation model looks at neighboring pages across the full document to determine where one clinical encounter ends and another begins. Those boundaries are then used to group pages into encounters, which can be classified by record type such as a clinical visit, hospitalization, radiology report, or lab result. This structure allows review and extraction to operate on complete medical events rather than isolated pages.

Encounter-level review focuses on whether adjacent pages describe the same medical event so that we

  • keep clinical encounters grouped
  • identify each encounter type

As a result, when a diagnosis is reviewed, the system also considers:

  • The page with the visit date
  • The page with patient and provider information
  • Any continuation pages related to that visit

The output reflects a complete picture of the encounter rather than an isolated statement pulled from a single page. This encounter-based way of organizing records mirrors how litigation teams already review medical evidence. It is also the approach used by Pattern Data.

Records are structured around medical visits before they are evaluated, so diagnoses, dates, providers, and supporting details stay connected even when they span multiple pages. The goal is not to change how reviewers think, but to reduce the friction caused by page-based document formatting that does not reflect how care is delivered or how cases are built.

For reviewers working at scale

If you spend your day flipping backward and forward through records to reconstruct visits, you already understand why context matters.

The future of medical record review depends on systems that respect how reviewers actually work, not just how documents are formatted.

FAQs for document segmentation for medical record reviews

What is document segmentation in medical record review?

Document segmentation is the process of organizing a long medical record into smaller sections so related pages stay together. In this post, it refers to grouping pages that belong to the same medical visit before review.

What is the difference between page-level review and encounter-level review?

Page-level review treats each page as its own unit, which can separate a diagnosis from the date or provider on the surrounding pages. Encounter-level review groups pages by medical visit so the context stays connected.

Why does page-level review lose important context?

Key visit details often appear on a different page than the diagnosis or procedure. When tools isolate only the “hit” page, they may exclude the page that contains the date, patient identifiers, or provider information.

Why not review the entire medical record at once?

Many records are thousands of pages long, and review typically needs to produce findings tied to individual visits. Reviewing everything at once can introduce noise and make it harder to preserve visit-level structure.

How does better segmentation help mass tort litigation workflows?

When pages are grouped by visit, reviewers spend less time reconstructing timelines, and extracted information is more likely to be tied to the correct encounter. That improves accuracy and reduces rework at scale.

Learn more about Pattern Data


back to all news