Introduction
NeuralMarker offers a variety of pretrained deep learning models for Optical Character Recognition (OCR), drawn from frameworks such as TensorFlow and PyTorch.
This list is not static: as more models are added to NeuralMarker's Neural Network Library, the list below will be updated accordingly.
Pretrained Models
1. EAST-Text-Detector
2. CRAFT
3. EasyOCR-Telugu
4. EasyOCR-Azerbaijani
5. EasyOCR-Croatian
6. EasyOCR-Estonian
7. EasyOCR-Turkish
8. EasyOCR-Chinese
9. EasyOCR-Maithili
10. EasyOCR-Slovenian
11. EasyOCR-Swedish
12. EasyOCR-Afrikaans
13. EasyOCR-Marathi
14. EasyOCR-Maltese
15. EasyOCR-Bhojpuri
16. EasyOCR-Occitan
17. EasyOCR-Ingush
18. EasyOCR-Goan
19. EasyOCR-Irish
20. EasyOCR-Norwegian
21. EasyOCR-Portuguese
22. EasyOCR-Kannada
23. EasyOCR-Maori
24. EasyOCR-Serbian_cyr
25. EasyOCR-Icelandic
26. EasyOCR-Tamil
27. EasyOCR-Arabic
28. EasyOCR-Uyghur_ar
29. EasyOCR-Mongolian
30. EasyOCR-Uyghur_cyr
31. EasyOCR-Lithuanian
32. EasyOCR-Dutch
33. EasyOCR-Danish
34. EasyOCR-Malay_lat
35. EasyOCR-Belarusian
36. EasyOCR-Urdu
37. EasyOCR-Hindi
38. EasyOCR-Kurdish
39. EasyOCR-Vietnamese
40. EasyOCR-Slovak
41. EasyOCR-French
42. EasyOCR-Russian
43. EasyOCR-Kabardian
44. EasyOCR-Chechen
45. EasyOCR-Hungarian
46. EasyOCR-Romanian_lat
47. EasyOCR-Indonesian
48. EasyOCR-Nagpuri
49. EasyOCR-Polish
50. EasyOCR-Newari
51. EasyOCR-Magahi
52. EasyOCR-Chinese_Simplified
53. EasyOCR-German
54. EasyOCR-Czech
55. EasyOCR-Adyghe
56. EasyOCR-Persian
57. EasyOCR-Albanian
58. EasyOCR-Spanish
59. EasyOCR-Serbian_lat
60. EasyOCR-Korean
61. EasyOCR-Welsh
62. EasyOCR-Lak
63. EasyOCR-Assamese
64. EasyOCR-Bulgarian
65. EasyOCR-Angika
66. EasyOCR-Italian
67. EasyOCR-Avar
68. EasyOCR-Dargwa
69. EasyOCR-Ukranian
70. EasyOCR-Latin
71. EasyOCR-Swahili_lat
72. EasyOCR-Japanese
73. EasyOCR-Lezgi
74. EasyOCR-Uzbek
75. EasyOCR-Bengali
76. EasyOCR-Tabarassan
77. EasyOCR-Nepali
78. EasyOCR-Bosnian
79. EasyOCR-Latvian
80. EasyOCR-Tagalog
81. EasyOCR-Thai
82. EasyOCR-Abaza
83. EasyOCR-English
84. East Tesseract
85. PMTD_ICDAR2015
86. PMTD_ICDAR2017_MLT
87. Box_DN_ImageNet
How to Create Annotations on an OCR Dataset with Pretrained Models
Steps to Follow
1. Log in to the tool.
2. Click the Add button.
3. The Add Dataset form will appear.
4. Fill in all the fields: dataset name, dataset description, category-type, and categories.
4 a. Select OCR as the category-type.
4 b. Choose a pretrained model from the available list.
4 c. The category "text_box" will appear regardless of which model is chosen.
5. Add data to NeuralMarker using one of the options listed below:
- Google Drive link or S3 link
- CSV file with image URLs
- Drag and drop
6. Click the Submit button.
7. A new dataset will be created.
8. After the new dataset is created:
- The dataset card will display the status "Pretrain Model running".
- Hovering over the brain symbol will display "AI labeling in Progress".
9. After the pretrained model finishes running over the entire dataset:
- The dataset card will display the status "Pretrain Model done & ready for annotation".
- The brain symbol on the card will display "AI labeling report".
10. Once the run is complete, an AI labeling report will also be generated:
- It includes the confidence score of the annotations over the images in the dataset.
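For the CSV upload option in step 5, the file can be prepared with a short script. The following is a minimal sketch only: the source does not specify the CSV layout NeuralMarker expects, so the single-column layout, the "image_url" header, and the example URLs are all assumptions to be checked against the Add Dataset form.

```python
import csv
import io


def build_image_url_csv(urls, header="image_url"):
    """Return CSV text with one image URL per row under a single header.

    The header name and one-column layout are assumptions; the layout
    NeuralMarker actually expects is not documented here.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([header])
    seen = set()
    for url in urls:
        url = url.strip()
        # Skip blank entries and duplicates so each image is listed once.
        if not url or url in seen:
            continue
        seen.add(url)
        writer.writerow([url])
    return buffer.getvalue()


# Hypothetical example URLs, for illustration only.
csv_text = build_image_url_csv([
    "https://example.com/images/receipt_001.jpg",
    "https://example.com/images/receipt_002.jpg",
    "https://example.com/images/receipt_001.jpg",  # duplicate, dropped
])
print(csv_text)
```

Writing csv_text to a file (for example with open("images.csv", "w")) gives a file that can then be tried with the CSV option in the form.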
OCR Output with Human in the Loop
A. EAST-Text-Detector
B. CRAFT
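Because the AI labeling report carries a confidence score for the annotations, a reviewer may want to look at the least certain images first. The sketch below illustrates that idea under stated assumptions: the Annotation record, its field names, the 0 to 1 score range, and the 0.8 threshold are all hypothetical and are not taken from NeuralMarker's report format.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    image: str         # hypothetical field: image file name
    label: str         # category, e.g. "text_box"
    confidence: float  # hypothetical field: score assumed to lie in [0, 1]


def images_needing_review(annotations, threshold=0.8):
    """Return images with at least one annotation below the threshold,
    ordered from lowest to highest minimum confidence."""
    lowest = {}
    for ann in annotations:
        if ann.confidence < threshold:
            # Track the weakest annotation seen so far for each image.
            lowest[ann.image] = min(ann.confidence, lowest.get(ann.image, 1.0))
    return sorted(lowest, key=lowest.get)


# Hypothetical report rows, for illustration only.
report = [
    Annotation("a.jpg", "text_box", 0.95),
    Annotation("b.jpg", "text_box", 0.40),
    Annotation("c.jpg", "text_box", 0.72),
    Annotation("c.jpg", "text_box", 0.91),
]
review_queue = images_needing_review(report)
print(review_queue)
```

Images that do not appear in the queue had every annotation at or above the threshold and could be spot-checked rather than reviewed in full.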