Title
Reassembling Shredded Document Stripes Using Word-Path Metric and Greedy Composition Optimal Matching Solver
Document Type
Article
Publication Date
5-1-2020
Abstract
This paper develops a shredded document reassembly algorithm based on character/word detection. We propose a new word compatibility estimation metric and a search strategy, Greedy Composition and Optimal Matching (GCOM), to compose documents from their vertically shredded stripes. We reduce the stripe puzzle reassembly problem to the traveling salesman problem (TSP) on a sparse graph. The word-path compatibility metric takes advantage of optical character recognition (OCR) to compute the compatibility score among a group of stripes. The global composition strategy, which integrates greedy composition and optimal matching, searches for a maximal Hamiltonian path and the final global reassembly. We demonstrate that our solver outperforms state-of-the-art puzzle solvers on reassembling stripe-shredded documents.
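To make the reduction to a path search on a sparse graph concrete, the following is a minimal sketch of generic greedy path composition over pairwise stripe scores. It is not the authors' GCOM solver or their word-path metric: the function name `greedy_path`, the `scores` dictionary, and the example values are all hypothetical, and the scores are assumed to come from some external compatibility measure.

```python
from typing import Dict, List, Tuple


def greedy_path(n: int, scores: Dict[Tuple[int, int], float]) -> List[int]:
    """Order n stripes by greedily accepting the highest-scoring directed edges.

    scores[(i, j)] is an assumed compatibility score for placing stripe j
    immediately to the right of stripe i. Missing pairs are treated as
    non-edges, which keeps the graph sparse.
    """
    right: Dict[int, int] = {}  # accepted right neighbor of each stripe
    left: Dict[int, int] = {}   # accepted left neighbor of each stripe
    parent = list(range(n))     # union-find forest used to detect cycles

    def find(x: int) -> int:
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    for (i, j), _ in ranked:
        # Each stripe may have at most one right and one left neighbor.
        if i == j or i in right or j in left:
            continue
        ri, rj = find(i), find(j)
        if ri == rj:
            continue  # accepting this edge would close a cycle
        right[i] = j
        left[j] = i
        parent[ri] = rj

    # Walk each fragment from its leftmost stripe. If the accepted edges do
    # not form a single path, fragments are concatenated in index order.
    order: List[int] = []
    for start in range(n):
        if start in left:
            continue
        cur = start
        while cur is not None:
            order.append(cur)
            cur = right.get(cur)
    return order


# Hypothetical scores for four stripes whose true order is 2, 0, 3, 1.
example_scores = {(2, 0): 0.9, (0, 3): 0.8, (3, 1): 0.7, (1, 2): 0.6, (0, 1): 0.3}
print(greedy_path(4, example_scores))
```

A purely greedy pass like this can commit to a locally strong but globally wrong edge and can leave several disconnected fragments, which is the kind of weakness that combining greedy composition with an optimal matching step is meant to address.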
Publication Source (Journal or Book title)
IEEE Transactions on Multimedia
First Page
1168
Last Page
1181
Recommended Citation
Liang, Y., & Li, X. (2020). Reassembling Shredded Document Stripes Using Word-Path Metric and Greedy Composition Optimal Matching Solver. IEEE Transactions on Multimedia, 22(5), 1168-1181. https://doi.org/10.1109/TMM.2019.2941777