The bottleneck was always transcription
Cemetery digitization projects have existed for decades. Find a Grave, BillionGraves, and countless volunteer groups have done extraordinary work. But the fundamental process, in which a person reads a stone and types what they see, has not scaled well. A single volunteer can photograph a hundred stones in a day. Transcribing them accurately takes much longer.
What AI-powered OCR changes
Optical character recognition trained on memorial inscriptions can extract text from a headstone photo in seconds. But the real value is not speed alone. It is structured output with confidence scoring.
When a human reads a stone and types "SMITH", that looks like certainty. When OCR reads a stone and returns "SMITH" with 94% confidence and "SMYTH" as a 6% alternative, you have something more useful: a transparent reading with a measurable margin of doubt.
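As a rough illustration of what such a structured reading might look like, the Python sketch below models one field with ranked candidates. The class and attribute names (Candidate, FieldReading) are hypothetical and not taken from any particular OCR product; the values simply mirror the SMITH/SMYTH example above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    """One possible reading of a field (hypothetical structure)."""
    text: str
    confidence: float  # fraction between 0.0 and 1.0


@dataclass(frozen=True)
class FieldReading:
    """All candidate readings for a single field on a stone."""
    field: str
    candidates: Tuple[Candidate, ...]

    def best(self) -> Candidate:
        # The candidate the OCR step considers most likely.
        return max(self.candidates, key=lambda c: c.confidence)

    def alternatives(self) -> List[Candidate]:
        # Every other candidate, most likely first.
        top = self.best()
        others = [c for c in self.candidates if c is not top]
        return sorted(others, key=lambda c: c.confidence, reverse=True)


surname = FieldReading(
    field="surname",
    candidates=(Candidate("SMITH", 0.94), Candidate("SMYTH", 0.06)),
)

print(surname.best().text, surname.best().confidence)
print([c.text for c in surname.alternatives()])
```

Keeping the alternatives alongside the top reading is the point: the doubt stays attached to the record rather than being discarded at transcription time.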
Why confidence matters more than accuracy
No system, human or machine, reads every headstone correctly. The difference is whether errors are visible. A confident human mistake enters the record and stays there. An OCR result with a confidence score below a set threshold enters a review queue instead.
GraveLedger routes every extraction below 80% confidence to human review. The reviewer sees the original photo alongside the OCR output and makes the call. This hybrid approach catches errors that pure automation misses and moves faster than pure manual work.
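A minimal sketch of this routing rule, assuming confidence is expressed as a fraction between 0 and 1. The function names, the queue labels, and the sample readings are invented for illustration and do not describe GraveLedger's actual implementation; only the 80% cutoff comes from the text.

```python
from typing import Dict, List, Tuple

# From the text: extractions below 80% confidence go to human review.
REVIEW_THRESHOLD = 0.80


def route(confidence: float, threshold: float = REVIEW_THRESHOLD) -> str:
    """Return a hypothetical queue name for one extraction."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    # Strictly below the threshold means review; the threshold itself passes.
    return "human_review" if confidence < threshold else "accepted"


def partition(extractions: List[Tuple[str, float]]) -> Dict[str, List[str]]:
    """Split (text, confidence) pairs into the two queues."""
    queues: Dict[str, List[str]] = {"human_review": [], "accepted": []}
    for text, confidence in extractions:
        queues[route(confidence)].append(text)
    return queues


# Made-up sample readings, not real OCR output.
sample = [("SMITH", 0.94), ("B0RN 18?4", 0.41), ("1902", 0.80)]
queues = partition(sample)
print(queues)
```

Treating a reading at exactly 80% as accepted is one interpretation of "below 80%"; a stricter deployment could just as easily send boundary cases to review.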
What AI handles well
- Cleanly carved modern headstones with standard fonts
- Dates in common formats
- Names in the Roman alphabet with clear spacing
- Flat granite markers with high contrast
What still needs human eyes
- Deeply weathered limestone and marble from the 1800s
- Ornate Victorian script with decorative flourishes
- Stones with lichen, moss, or shadow interference
- Non-English inscriptions and historical spelling conventions
- Symbols, emblems, and lodge insignia
The review queue is the product
The temptation with automation is to remove humans from the loop. That is exactly wrong for cemetery records. The review queue is not a fallback. It is the core quality mechanism. AI handles volume. Humans handle judgment. The combination produces records that neither could achieve alone.
Where this is going
As training data grows, with each reviewed and corrected extraction feeding back into the model, OCR accuracy on memorial text should keep rising. But the architecture should always assume that some readings will be uncertain. Building confidence visibility into the system from the start means accuracy can improve without the system ever pretending to be more certain than it is.