Graph Attention Networks for Efficient Text Line Detection on Receipt-Layout Documents


August 14, 2022


KDD Document Intelligence Workshop


David Montero Martin
Mukul Kumar
David Jiménez
Javier Yebes


Detecting text lines from OCR detections is an essential step in many information-extraction pipelines. It is particularly important for unstructured documents such as purchase receipts, where knowing which detections share a line is crucial for matching key-value pairs. Existing models, however, are limited to structured documents and do not generalize well to unstructured ones. To address this issue, we propose a line detection model based on graph neural networks (GNNs) and optimized for receipt-layout documents. Experiments show that the proposed method outperforms other approaches in accuracy, processing time, and resource consumption.