TY - GEN
T1 - Tabular Corner Detection in Historical Irish Records
AU - O'Shea, Enda
N1 - Publisher Copyright:
© 2023 Owner/Author.
PY - 2023/8/22
Y1 - 2023/8/22
AB - Extracting relevant data from historical handwritten documents can be time-consuming and challenging. In Ireland, from 1864 to 1922, local registrars documented government records of births, deaths, and marriages using printed tabular structures. Leveraging this systematic layout, we employ a neural network to segment scanned versions of these records. We isolate the corner points of each table so that the vital tabular elements can be extracted and transformed into consistently structured standalone images. Uniform segmented images enable more accurate row and column segmentation, which in turn improves our ability to isolate and classify individual cell contents. The process must accommodate varying image quality, differing table orientations and sizes resulting from diverse scanning procedures, and ink lines that have faded or been damaged over time.
KW - corner detection
KW - historical documents
KW - image segmentation
UR - http://www.scopus.com/inward/record.url?scp=85173573731&partnerID=8YFLogxK
U2 - 10.1145/3573128.3609349
DO - 10.1145/3573128.3609349
M3 - Conference contribution
AN - SCOPUS:85173573731
T3 - DocEng 2023 - Proceedings of the 2023 ACM Symposium on Document Engineering
BT - DocEng 2023 - Proceedings of the 2023 ACM Symposium on Document Engineering
PB - Association for Computing Machinery, Inc
T2 - 2023 ACM Symposium on Document Engineering, DocEng 2023
Y2 - 22 August 2023 through 25 August 2023
ER -