 
The views expressed are those of the authors and do not represent the views of the SNF Agora Institute or Johns Hopkins University.
Authors: Samuel Backer, Louis Hyman
New LLM-based OCR and post-OCR correction methods promise to transform computational historical research, yet their efficacy remains contested. We compare multiple correction approaches, including methods for “bootstrapping” fine-tuning with LLM-generated data, and measure their effect on downstream tasks. Our results suggest that standard OCR metrics often underestimate performance gains for historical research, underscoring the need for discipline-driven evaluations that can better reflect the needs of computational humanists.