The workshop will provide a forum for highlighting current research on multilingual document analysis systems, with particular emphasis on OCR. The predecessors to this workshop were held in conjunction with ICDAR 1999 in Bangalore, India, ICDAR 2009 in Barcelona, Spain, ICDAR 2011 in Beijing, China, ICDAR 2013 in Washington DC, USA, and ICDAR 2015 in Nancy, France. A joint Workshop on Multilingual OCR and Analytics for Noisy Unstructured Text Data was held in conjunction with ICDAR 2011 in Beijing, China. The scope of 'Multilingual OCR' is defined to include systems that are capable of reading more than one language in the same document, as well as one-language-per-document systems that can be easily retargeted to new languages. The proposed workshop will provide a forum for technical discussions on three important themes: i) recent progress in the field and promising new techniques, ii) attempts to identify and address 'hard' open research problems, and iii) performance evaluation of multilingual OCR systems.
The topics that will be addressed by this Workshop are:
Proven methodologies for OCR: Efficacy of methodologies developed for Latin script (HMMs, neural networks, etc.) when applied to other scripts
Mixed languages: Techniques applicable/retargetable to multiple languages/scripts; documents containing multiple languages/scripts
Newer languages/scripts: Techniques for dealing with problems of scripts for which OCR technology has not matured
Document Analysis: Language and script identification, machine print vs handwriting, layout analysis, reading order
Special domains: Scene text and video text, mathematical formulas, tables, abbreviations, annotations
Degraded and historical documents
Domain knowledge: Colloquialism, dialect, language models
Evaluation methodologies: Metrics, standards, ground truth, benchmark datasets
Demo systems
Conference date: November 11, 2017