2020, iss. 37, pp. 35-46
Searchable digitized manuscript collections: An opportunity to read Serbian Cyrillic
Univerzitetska biblioteka "Svetozar Marković", Beograd

email: andonovski@unilib.rs, dakic@unilib.rs, aleksandra@unilib.rs
Keywords: libraries; archives; manuscripts; READ project; Transkribus; transcription; neural networks; virtual research environment; Handwritten Text Recognition (HTR); Keyword Spotting (KWS)
Abstract
The READ (Recognition and Enrichment of Archival Documents) project has the potential to revolutionise access to historical collections held by cultural institutions all over Europe. The project ran from 2016 to 2019, was funded by the European Commission, and involved 13 partners from the European Union. Its overall objective was to build a virtual research environment in which archivists, humanities scholars, IT specialists and volunteers collaborate to advance research, innovation, development and use of cutting-edge technology for the automated recognition, transcription, indexing and enrichment of handwritten archival documents. From its launch in 2016, in line with this concept, the READ project developed advanced text recognition technology based on artificial neural networks. Research in pattern recognition, computer vision, document image analysis and language modelling, as well as in digital humanities, archival research and related fields, has seen unprecedented progress in recent years, and European research groups are at the forefront of this field. Newly developed technologies and tools are integrated through a publicly available infrastructure, the Transkribus platform. The primary goal of Transkribus is to support users who transcribe printed or handwritten documents. Only a few years ago, the idea that computers could read historical scripts and automatically recognise and transcribe handwritten documents from past centuries still belonged to the realm of fantasy.
Today, users of Transkribus can extract data from handwritten and printed texts using HTR (Handwritten Text Recognition) technology and search digitized text without retyping it, using KWS (Keyword Spotting), while at the same time contributing to the improvement of that technology through machine learning. The automated recognition of a wide variety of historical texts has significant implications for the accessibility of the written records of global cultural heritage.
 

About

article language: Serbian
document type: Review Paper
DOI: 10.19090/cit.2020.37.35-46
received: 24/08/2020
revised: 07/10/2020
accepted: 12/10/2020
published in SCIndeks: 23/12/2020
