Graph Attention Networks for Efficient Text Line Detection on Receipt-Layout Documents
Date
August 14, 2022
Source
KDD. Document Intelligence Workshop
Authors
David Montero Martin
Mukul Kumar
David Jiménez
Javier Yebes
Abstract
Detecting text lines from OCR output is an essential step in many information-extraction pipelines. It matters most for unstructured documents such as purchase receipts, where line information is needed to match key-value pairs that sit on the same line. Existing models, however, are limited to structured documents and do not generalize well to unstructured ones. To address this issue, we propose a GNN-based line detection model optimized for receipt-layout documents. Experiments show that the proposed method outperforms other approaches in accuracy, processing time, and resource consumption.
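To make the task concrete, here is a minimal sketch of line detection framed as a graph problem: OCR boxes are nodes, an edge links two boxes judged to be on the same line, and each connected component is one text line. This is an illustration under stated assumptions, not the authors' model. The `Box` type, the `same_line_score` function, and the `threshold` value are hypothetical, and the score is a hand-written vertical-overlap heuristic standing in for where a learned GNN edge classifier would go.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical OCR detection: text plus an axis-aligned box (h must be > 0)."""
    text: str
    x: float  # left edge
    y: float  # top edge
    w: float  # width
    h: float  # height


def same_line_score(a: Box, b: Box) -> float:
    """Stand-in for a learned edge score: vertical overlap over the smaller height."""
    top = max(a.y, b.y)
    bottom = min(a.y + a.h, b.y + b.h)
    overlap = max(0.0, bottom - top)
    return overlap / min(a.h, b.h)


def group_lines(boxes: list[Box], threshold: float = 0.5) -> list[list[str]]:
    """Link boxes whose score passes the threshold, then read off components."""
    parent = list(range(len(boxes)))

    def find(i: int) -> int:
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Score every pair of boxes; a real system would restrict this to neighbours.
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if same_line_score(boxes[i], boxes[j]) >= threshold:
                parent[find(i)] = find(j)

    groups: dict[int, list[Box]] = {}
    for i, box in enumerate(boxes):
        groups.setdefault(find(i), []).append(box)

    # Order words left to right within a line, and lines top to bottom.
    lines = [sorted(g, key=lambda b: b.x) for g in groups.values()]
    lines.sort(key=lambda line: min(b.y for b in line))
    return [[b.text for b in line] for line in lines]


receipt = [
    Box("12.50", 200, 102, 40, 12),
    Box("CASH", 10, 120, 40, 12),
    Box("TOTAL", 10, 100, 50, 12),
    Box("20.00", 200, 121, 40, 12),
]
print(group_lines(receipt))
```

On this toy receipt, each key ends up grouped with the value on its line, which is the key-value matching use case described in the abstract. A fixed overlap threshold like this one is brittle on skewed or crumpled receipts, which is the kind of situation where a learned edge score is expected to help.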