Introduction


NeuralMarker offers a variety of pretrained deep learning models for Optical Character Recognition (OCR), built on frameworks such as TensorFlow and PyTorch.

The list of available models is not static: as more models are added to the NeuralMarker Neural Network Library, the list below will be updated accordingly.


Pretrained Models 


1.  EAST-Text-Detector



2.  CRAFT




3.  EasyOCR-Telugu


4.  EasyOCR-Azerbaijani


5.  EasyOCR-Croatian


6.  EasyOCR-Estonian


7.  EasyOCR-Turkish


8.  EasyOCR-Chinese


9.  EasyOCR-Maithili


10.  EasyOCR-Slovenian


11.  EasyOCR-Swedish


12.  EasyOCR-Afrikaans


13.  EasyOCR-Marathi


14.  EasyOCR-Maltese


15.  EasyOCR-Bhojpuri


16.  EasyOCR-Occitan


17.  EasyOCR-Ingush


18.  EasyOCR-Goan


19.  EasyOCR-Irish


20.  EasyOCR-Norwegian


21.  EasyOCR-Portuguese


22.  EasyOCR-Kannada


23.  EasyOCR-Maori


24.  EasyOCR-Serbian_cyr


25.  EasyOCR-Icelandic


26.  EasyOCR-Tamil


27.  EasyOCR-Arabic


28.  EasyOCR-Uyghur_ar


29.  EasyOCR-Mongolian


30.  EasyOCR-Uyghur_cyr


31.  EasyOCR-Lithuanian


32.  EasyOCR-Dutch


33.  EasyOCR-Danish


34.  EasyOCR-Malay_lat


35.  EasyOCR-Belarusian


36.  EasyOCR-Urdu


37.  EasyOCR-Hindi


38.  EasyOCR-Kurdish


39.  EasyOCR-Vietnamese


40.  EasyOCR-Slovak


41.  EasyOCR-French


42.  EasyOCR-Russian


43.  EasyOCR-Kabardian


44.  EasyOCR-Chechen


45.  EasyOCR-Hungarian


46.  EasyOCR-Romanian_lat


47.  EasyOCR-Indonesian


48.  EasyOCR-Nagpuri


49.  EasyOCR-Polish


50.  EasyOCR-Newari


51.  EasyOCR-Magahi


52.  EasyOCR-Chinese_Simplified


53.  EasyOCR-German


54.  EasyOCR-Czech


55.  EasyOCR-Adyghe


56.  EasyOCR-Persian


57.  EasyOCR-Albanian


58.  EasyOCR-Spanish


59.  EasyOCR-Serbian_lat


60.  EasyOCR-Korean


61.  EasyOCR-Welsh


62.  EasyOCR-Lak


63.  EasyOCR-Assamese


64.  EasyOCR-Bulgarian


65.  EasyOCR-Angika


66.  EasyOCR-Italian


67.  EasyOCR-Avar


68.  EasyOCR-Dargwa


69.  EasyOCR-Ukrainian


70.  EasyOCR-Latin


71.  EasyOCR-Swahili_lat


72.  EasyOCR-Japanese


73.  EasyOCR-Lezgi


74.  EasyOCR-Uzbek


75.  EasyOCR-Bengali


76.  EasyOCR-Tabarassan


77.  EasyOCR-Nepali


78.  EasyOCR-Bosnian


79.  EasyOCR-Latvian


80.  EasyOCR-Tagalog


81.  EasyOCR-Thai


82.  EasyOCR-Abaza


83.  EasyOCR-English


84.  EAST Tesseract


85.  PMTD_ICDAR2015


86.  PMTD_ICDAR2017_MLT


87.  Box_DN_ImageNet














How to Create Annotations on an OCR Dataset with Pretrained Models


Steps to Follow


1. Log in to the tool.

2. Click on the Add button.

3. The Add Dataset Form will appear.

4. Fill in all the fields, such as dataset name, dataset description, category-type, and categories.


     4 a. Select OCR as the category-type.

     4 b. Choose a pretrained model from the available list.

     4 c. The category "text_box" will appear, whichever model is selected.



5. Add data to NeuralMarker using one of the available upload options.





6. Click the Submit button.

7. A new dataset will be created.

8. After the new dataset is created:

  • The Dataset card will display the status "Pretrain Model running".
  • The brain symbol will display "AI labeling in Progress" on hover.



9. After the pretrained model has finished running over the entire dataset:

  •  The Dataset card will display the status "Pretrain Model done & ready for annotation".
  •  The brain symbol on the card will display "AI labeling report".



10. At the same point, an AI labeling report is generated:

  • It gives the confidence scores of the annotations over the images in the dataset.
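
The confidence scores in the AI labeling report are what make human-in-the-loop review practical: low-confidence annotations can be checked first. The sketch below illustrates that idea only. The TextBox structure, the function names, the 0.8 threshold, and the sample values are all assumptions made for illustration; this guide does not describe the report's actual format or any export API.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TextBox:
    """One machine-generated "text_box" annotation (hypothetical structure)."""
    image: str         # image file name
    text: str          # text recognized by the pretrained model
    confidence: float  # model confidence, assumed to lie in [0, 1]


def split_for_review(boxes, threshold=0.8):
    """Split annotations into (accepted, needs_review) by confidence."""
    accepted = [b for b in boxes if b.confidence >= threshold]
    needs_review = [b for b in boxes if b.confidence < threshold]
    return accepted, needs_review


def mean_confidence_per_image(boxes):
    """Average the confidence of all annotations found on each image."""
    by_image = {}
    for b in boxes:
        by_image.setdefault(b.image, []).append(b.confidence)
    return {image: mean(scores) for image, scores in by_image.items()}


# Made-up sample data, for illustration only.
boxes = [
    TextBox("invoice_01.png", "TOTAL", 0.97),
    TextBox("invoice_01.png", "4l.50", 0.42),
    TextBox("receipt_02.png", "Thank you", 0.88),
]

accepted, needs_review = split_for_review(boxes, threshold=0.8)
per_image = mean_confidence_per_image(boxes)
```

With this sample data, the "4l.50" box falls below the threshold and would be queued for a human annotator, while the per-image averages indicate which images deserve attention first. The right threshold depends on the dataset and the chosen model.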






OCR Output with Human in the Loop



A. EAST-Text-Detector







B. CRAFT