Introduction


NeuralMarker supports several category types. For each one, users can upload annotation files in JSON, XML, CSV, or plain-text form, in annotation formats such as TensorFlow COCO, Create ML, and EAST.

 

The supported annotation formats for each category type are listed below:


1. Rectangle




A. Create ML


Create ML stores annotations of the rectangle category type in a JSON file. The content of the file is given below:




I. image: name of the image file


II. annotations: for each object of interest in the image, a.) the name of the category and b.) the coordinates of the object.
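As a rough illustration of the two fields above, the sketch below reads a small Create ML style file. The key names "label" and "coordinates", and the convention that x and y mark the centre of the box, follow Apple's Create ML object detection layout and are assumptions here; the file name and values are made up.

```python
import json

# Hypothetical file content. Key names follow Apple's Create ML
# object-detection layout and are assumptions, not NeuralMarker output.
create_ml_json = """
[
  {
    "image": "street_001.jpg",
    "annotations": [
      {
        "label": "car",
        "coordinates": {"x": 150.0, "y": 100.0, "width": 100.0, "height": 60.0}
      }
    ]
  }
]
"""


def center_to_corners(coords):
    """Convert a centre-based box to (xmin, ymin, xmax, ymax)."""
    half_w = coords["width"] / 2
    half_h = coords["height"] / 2
    return (
        coords["x"] - half_w,
        coords["y"] - half_h,
        coords["x"] + half_w,
        coords["y"] + half_h,
    )


entries = json.loads(create_ml_json)
for entry in entries:
    for ann in entry["annotations"]:
        print(entry["image"], ann["label"], center_to_corners(ann["coordinates"]))
```

Converting to corner coordinates makes the box directly comparable with the xmin/ymin/xmax/ymax style used by the CSV and VOC formats.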


B. TensorFlow API


TensorFlow Object Detection API uses a CSV file to store the labeled data in a human-readable format. The column names of the CSV file are as follows:

Filename: name of the image file

Size of the image: width, height

Category of the object: class

Dimension of the bounding box: xmin, ymin, xmax, ymax
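The sketch below shows one way such a file could be read and sanity-checked. The exact header spelling and column order (filename, width, height, class, xmin, ymin, xmax, ymax) are assumptions based on the list above, and the rows are made up.

```python
import csv
import io

# Hypothetical two-row file; header spelling and column order are assumptions.
csv_text = """filename,width,height,class,xmin,ymin,xmax,ymax
img_001.jpg,2000,1500,car,261,557,1617,1185
img_001.jpg,2000,1500,truck,40,60,400,380
"""

NUMERIC_COLUMNS = ("width", "height", "xmin", "ymin", "xmax", "ymax")


def read_boxes(text):
    """Return one dict per labeled box, with numeric columns parsed as int."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        for key in NUMERIC_COLUMNS:
            row[key] = int(row[key])
        rows.append(row)
    return rows


def box_is_inside_image(box):
    """Check that the box has positive size and lies within the image."""
    return (
        0 <= box["xmin"] < box["xmax"] <= box["width"]
        and 0 <= box["ymin"] < box["ymax"] <= box["height"]
    )


boxes = read_boxes(csv_text)
for box in boxes:
    status = "valid" if box_is_inside_image(box) else "out of bounds"
    print(box["filename"], box["class"], status)
```

Each row describes one bounding box, so an image with several objects appears on several rows.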





C. TensorFlow COCO / COCO


COCO stores image annotations in a JSON file. Before looking at the information stored for the bounding box/rectangle, let's have a look at the basic data structure of the JSON file:



  • info: contains general information about the image dataset.
  • images: contains information about every image in the dataset.
  • licenses: contains the image licenses that apply to the images in the dataset.


Now, let's focus on the parts of the JSON file that are specific to the rectangle annotation type:



  • annotations: each object instance annotation in an image consists of

      • id: an alphanumeric identifier [5faa71394a9f55002f2bba31]

      • image_id: the alphanumeric identifier of the image [5faa71394a9f55002f2aab31]

      • category_id: an integer

      • segmentation: the segmentation mask. Its format depends on whether the instance represents a single object (iscrowd=0, in which case polygons are used) or a collection of objects (iscrowd=1, in which case RLE is used).

      • area: the area covered by the bounding box around the instance of the object.

      • bbox: [x-top left, y-top left, width, height]

The categories field of the annotation data structure stores the mapping of category id to category and supercategory names.
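A minimal sketch of how the bbox and categories fields can be used together is given below. The category names are made up for illustration; only the [x-top left, y-top left, width, height] layout comes from the description above.

```python
# Hypothetical category entries; the ids and names are made up.
categories = [
    {"id": 2813, "name": "truck", "supercategory": "vehicle"},
    {"id": 2814, "name": "car", "supercategory": "vehicle"},
]

# Map each category id to its name for quick lookup.
category_names = {c["id"]: c["name"] for c in categories}


def coco_bbox_to_corners(bbox):
    """Convert [x, y, width, height] to (xmin, ymin, xmax, ymax)."""
    x, y, w, h = bbox
    return (x, y, x + w, y + h)


annotation = {"category_id": 2813, "bbox": [261.0, 557.0, 1356.0, 628.0]}
print(category_names[annotation["category_id"]], coco_bbox_to_corners(annotation["bbox"]))
```

The corner form is the one used by the TensorFlow CSV and VOC formats, so this conversion is the usual bridge between them and COCO.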


NeuralMarker also supports image tagging alongside bounding boxes. Therefore, an "image_tags": [] field appears in each entry of the images section, and an "annotation_tags": [] field appears in each entry of the annotations section.


An excerpt of the images and annotations sections of a COCO JSON file is included below (the segmentation list is truncated):


{
    "info": {},
    "licenses": {},
    "images": [
        {
            "file_name": "Transpo_G70_TA-518126.jpg",
            "coco_url": "https://xy/datasets/5e79c70f49da3d0018ac8820/images/5faa71394a9f55002f2xxf31.jpg",
            "height": 1500,
            "width": 2000,
            "date_captured": "2020-53-10 10:53:30",
            "id": "5faa71394a9f55002f2xxf31",
            "image_tags": []
        }
    ],
    "annotations": [
        {
            "id": "5fb60968b0de18001897de12",
            "segmentation": [
                [
                    791.0, 557.0,
                    635.0, 561.0,
                    479.0, 588.0,
                    323.0, 693.0,
                    261.0, 849.0,
                    1192.0,
                ]
            ],
            "area": 620881,
            "iscrowd": 0,
            "image_id": "5faa71394a9f55002f2xxf31",
            "bbox": [
                261.0,
                557.0,
                1356.0,
                628.0
            ],
            "category_id": 2813,
            "annotation_tags": []
        }
    ]
}
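Because a COCO file keeps all images and all annotations in two flat lists, a common first step is to group the annotations by image_id. The sketch below does this on a small, made-up file; the ids and values are hypothetical and only the field names come from the sections above.

```python
import json
from collections import defaultdict

# Hypothetical, complete COCO-style file with one image and two annotations.
coco_text = """
{
  "info": {},
  "licenses": {},
  "images": [
    {"id": "img-a", "file_name": "a.jpg", "height": 1500, "width": 2000, "image_tags": []}
  ],
  "annotations": [
    {"id": "ann-1", "image_id": "img-a", "category_id": 1, "iscrowd": 0,
     "bbox": [10.0, 20.0, 30.0, 40.0], "area": 1200, "segmentation": [], "annotation_tags": []},
    {"id": "ann-2", "image_id": "img-a", "category_id": 2, "iscrowd": 0,
     "bbox": [5.0, 5.0, 10.0, 10.0], "area": 100, "segmentation": [], "annotation_tags": []}
  ],
  "categories": []
}
"""


def annotations_by_image(coco):
    """Group the flat annotations list by the image each entry belongs to."""
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        grouped[ann["image_id"]].append(ann)
    return grouped


coco = json.loads(coco_text)
grouped = annotations_by_image(coco)
for image in coco["images"]:
    anns = grouped.get(image["id"], [])
    print(image["file_name"], [a["id"] for a in anns])
```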



D. TensorFlow VOC / VOC


PASCAL VOC / TensorFlow VOC stores annotations in XML format, with one XML file per image. The information stored in the XML file for the bounding box/rectangle is given below:


[Figure: a sample XML annotation based on the PASCAL VOC data format]



Annotation Format Details:

folder - the name of the folder where the images are stored

filename - the name of the image file

path - the path where the image file is stored

width - the width of the image

height - the height of the image

depth - the depth of the image (number of color channels)

segmented - 0 if the image is annotated with bounding boxes only, 1 if it has segmentation annotations


object: contains the object details. The components of the object tag are as follows:

name - the name of the annotated category

pose - the pose of the object

truncated - indicates that the bounding box does not represent the full extent of the object

difficult - marks an object that is difficult to recognize

occluded - marks an object that is not entirely visible

bndbox: specifies the extent of the object of interest visible in the image

xmin - left x point

xmax - right x point

ymin - top y point

ymax - bottom y point
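The fields listed above can be read with Python's standard XML parser, as sketched below. The sample XML is made up; nesting width, height, and depth under a size element follows the usual PASCAL VOC layout and is an assumption here, as are the sample values.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation file. The <size> wrapper follows the usual
# PASCAL VOC layout and is an assumption; all values are made up.
voc_xml = """
<annotation>
  <folder>images</folder>
  <filename>img_001.jpg</filename>
  <path>images/img_001.jpg</path>
  <size>
    <width>2000</width>
    <height>1500</height>
    <depth>3</depth>
  </size>
  <segmented>0</segmented>
  <object>
    <name>car</name>
    <pose>Unspecified</pose>
    <truncated>0</truncated>
    <difficult>0</difficult>
    <occluded>0</occluded>
    <bndbox>
      <xmin>261</xmin>
      <xmax>1617</xmax>
      <ymin>557</ymin>
      <ymax>1185</ymax>
    </bndbox>
  </object>
</annotation>
"""


def parse_voc(xml_text):
    """Return (image info, list of objects) from one VOC annotation file."""
    root = ET.fromstring(xml_text.strip())
    size = root.find("size")
    image = {
        "filename": root.findtext("filename"),
        "width": int(size.findtext("width")),
        "height": int(size.findtext("height")),
    }
    objects = []
    for obj in root.iter("object"):
        box = obj.find("bndbox")
        corners = tuple(
            int(box.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax")
        )
        objects.append({"name": obj.findtext("name"), "bbox": corners})
    return image, objects


image, objects = parse_voc(voc_xml)
print(image, objects)
```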


Each image has its own XML file. Therefore, to upload VOC annotations for an image dataset, put the XML files of all the images in one folder, zip that folder, and upload the resulting zip file using the "Choose file" field.
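The zipping step can also be scripted. The sketch below is one possible way to do it with Python's standard library; the folder and archive names are hypothetical, and storing the files under the folder name inside the archive is an assumption meant to mirror zipping the folder by hand.

```python
import tempfile
import zipfile
from pathlib import Path


def zip_annotation_folder(folder, archive_path):
    """Zip every .xml file in `folder` and return the file names added."""
    folder = Path(folder)
    xml_files = sorted(folder.glob("*.xml"))
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for xml_file in xml_files:
            # Keep the folder name as a prefix, as a zipped folder would.
            archive.write(xml_file, arcname=f"{folder.name}/{xml_file.name}")
    return [f.name for f in xml_files]


# Demonstration on a throwaway folder with two placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "voc_annotations"
    folder.mkdir()
    (folder / "img_001.xml").write_text("<annotation/>")
    (folder / "img_002.xml").write_text("<annotation/>")
    archive_path = Path(tmp) / "voc_annotations.zip"
    added = zip_annotation_folder(folder, archive_path)
    with zipfile.ZipFile(archive_path) as archive:
        names = archive.namelist()
    print(names)
```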




In the PASCAL VOC annotation format, one XML file is created for each image in the dataset, whereas in MS-COCO only one JSON file is created for the entire dataset.



E. NeuralMarker Rectangle


NeuralMarker Rectangle stores image annotations in a JSON file. The information stored in the JSON file for the rectangle category type is given below:


  • dataset id: the numeric ID of the dataset


  • organization id: the unique alphanumeric ID allocated to the organization


  • category type: the type of annotation drawn on the images of the dataset


  • images: contains information such as:

      • image name
      • image URL
      • height, width: dimensions of the image


  • annotations: contains information such as:

      • id: the unique alphanumeric ID allocated to the instance annotation
      • category name: name of the class
      • area: area covered by the bounding box/rectangle
      • creator: name of the labeler who created the instance annotation
      • bbox: coordinates of the object of interest in the image [x1, y1, bbox_w, bbox_h]


[Figure: a sample of the annotations based on the NeuralMarker Rectangle data format]



NeuralMarker also supports image tagging alongside bounding boxes. Therefore, the "image tags":[] field is visible under the image section, and the "annotation tags":[] field is visible under the annotation section.






2. Polygon



A. TensorFlow COCO / COCO


The data structure of the JSON file displayed above for the rectangle category type under the TensorFlow COCO / COCO section is identical for the polygon category type.


However, the segmentation field of the annotations section holds polygon values (a flat list of alternating x and y vertex coordinates) instead of RLE.
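To make the polygon representation concrete, the sketch below turns a flat COCO-style coordinate list into vertices, then derives the enclosed area (shoelace formula) and the enclosing bbox. The polygon values are made up.

```python
def polygon_points(flat):
    """Split a flat [x1, y1, x2, y2, ...] list into (x, y) vertices."""
    if len(flat) % 2 != 0:
        raise ValueError("expected an even number of coordinates")
    return list(zip(flat[0::2], flat[1::2]))


def polygon_area(points):
    """Area enclosed by the polygon, computed with the shoelace formula."""
    total = 0.0
    # Pair every vertex with the next one, wrapping around at the end.
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def polygon_bbox(points):
    """Enclosing box as [x-top left, y-top left, width, height]."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]


# Hypothetical polygon: a 40 x 30 rectangle.
flat = [10.0, 10.0, 50.0, 10.0, 50.0, 40.0, 10.0, 40.0]
points = polygon_points(flat)
print(polygon_area(points), polygon_bbox(points))
```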




3. Optical Character Recognition (OCR)



A. EAST


For the OCR category type, the annotation file is a text file: each image with text boxes has its corresponding annotation stored in a .txt file.


The data structure of the .txt file is displayed in the image below:
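As a rough illustration, EAST training data is commonly distributed in the ICDAR 2015 text layout, where each line holds the eight corner coordinates of a text box followed by its transcription. Whether NeuralMarker uses exactly this layout is an assumption; the sketch below parses one such line, and the sample line is made up.

```python
def parse_east_line(line):
    """Parse 'x1,y1,x2,y2,x3,y3,x4,y4,text' into (corners, text).

    This ICDAR 2015 style layout is an assumption, not a confirmed
    description of the NeuralMarker file.
    """
    # Split on the first eight commas only, so commas in the text survive.
    parts = line.strip().split(",", 8)
    if len(parts) < 9:
        raise ValueError("expected eight coordinates followed by the text")
    coords = [float(value) for value in parts[:8]]
    corners = list(zip(coords[0::2], coords[1::2]))
    return corners, parts[8]


corners, text = parse_east_line("10,20,110,20,110,60,10,60,Hello, world")
print(corners, text)
```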




4. Segmentation




A. TensorFlow COCO / COCO


The data structure of the JSON file displayed above for the rectangle category type under the TensorFlow COCO / COCO section is identical to, and fully compatible with, the Segmentation format, except for the "iscrowd" field, which is unnecessary and set to 0 by default.


Additionally, for the segmentation category type, each category present in the image is encoded in the JSON file with a single RLE (Run Length Encoding) annotation instead of a polygon.
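The sketch below illustrates the idea behind RLE using the uncompressed COCO convention: the mask is read column by column, and the counts alternate between runs of background (0) and runs of foreground (1), always starting with background. COCO also defines a compressed string form of the counts; which of the two appears in a NeuralMarker export is not stated here, so treat the layout as an assumption.

```python
def rle_encode(mask):
    """Encode a binary mask (list of rows) as uncompressed COCO-style RLE."""
    height, width = len(mask), len(mask[0])
    # Flatten in column-major order: walk down each column in turn.
    flat = [mask[r][c] for c in range(width) for r in range(height)]
    counts = []
    current, run = 0, 0  # the first run always counts background pixels
    for value in flat:
        if value == current:
            run += 1
        else:
            counts.append(run)
            current, run = value, 1
    counts.append(run)
    return {"size": [height, width], "counts": counts}


def rle_decode(rle):
    """Rebuild the binary mask (list of rows) from the run lengths."""
    height, width = rle["size"]
    flat = []
    value = 0
    for run in rle["counts"]:
        flat.extend([value] * run)
        value = 1 - value
    return [[flat[c * height + r] for c in range(width)] for r in range(height)]


# Hypothetical 2 x 3 mask.
mask = [
    [0, 1, 1],
    [0, 1, 0],
]
rle = rle_encode(mask)
print(rle)
```

A mask whose first pixel is foreground simply starts with a zero-length background run.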



B. Mask RCNN 


The Mask RCNN format is represented in a JSON file, which contains information in the form of key-value pairs.


The top-level keys present in the JSON file are as follows:


  • "_via_settings"


  • "_via_img_metadata"


The fields included in "_via_img_metadata" are as follows:


filename - name of the image


size - size of the image in bytes


regions - list of the annotated objects


shape_attributes - contains the lists of x and y coordinates of an annotated object


all_points_x - contains the list of x coordinates


all_points_y - contains the list of y coordinates


region_attributes - contains the category information


name (region_attributes) - category name


  • "_via_attributes"
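A minimal sketch of reading such a file is given below. The sample mirrors the keys listed above; the per-image key ("img_001.jpg24680"), the "name": "polygon" entry inside shape_attributes, and all values are assumptions based on the usual VGG Image Annotator (VIA) project layout.

```python
import json

# Hypothetical file content. The per-image key and the "polygon" shape
# name follow the usual VIA project layout and are assumptions.
via_text = """
{
  "_via_settings": {},
  "_via_img_metadata": {
    "img_001.jpg24680": {
      "filename": "img_001.jpg",
      "size": 24680,
      "regions": [
        {
          "shape_attributes": {
            "name": "polygon",
            "all_points_x": [10, 50, 50, 10],
            "all_points_y": [10, 10, 40, 40]
          },
          "region_attributes": {"name": "car"}
        }
      ]
    }
  },
  "_via_attributes": {}
}
"""


def extract_regions(via):
    """Return (filename, category name, polygon vertices) for every region."""
    results = []
    for entry in via["_via_img_metadata"].values():
        for region in entry["regions"]:
            shape = region["shape_attributes"]
            points = list(zip(shape["all_points_x"], shape["all_points_y"]))
            category = region["region_attributes"]["name"]
            results.append((entry["filename"], category, points))
    return results


regions = extract_regions(json.loads(via_text))
print(regions)
```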



The JSON file in the Mask RCNN format, comprising these key-value pairs, is shown below: