# Getting Started

## Downloading MyFoodRepo data

Before getting started with the experiments, the first step is to download the data. To keep things simple, we download it using the AICrowd CLI.

1. To use the CLI, first create an account on [AICrowd](http://aicrowd.com/) and get your API key from your profile.
2. Open a terminal and run:

```shell
bash download_data.sh <YOUR_API_KEY>
```

After executing the above command, you'll see a `data` folder in your current directory. It contains the training and validation images along with their respective `annotations.json` files.
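
As a quick sanity check that the download worked, you can summarize an annotation file. This sketch assumes the COCO-style layout with top-level `images` and `annotations` keys, and the `data/train/` path is an assumption about how the folder is laid out:

```python
import json

# Sanity check after downloading: count images and annotations. Assumes
# COCO-style JSON and that the training split lives in data/train/.
with open("data/train/annotations.json") as f:
    coco = json.load(f)

print(len(coco["images"]), "images,", len(coco["annotations"]), "annotations")
```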

## Preprocessing data

Datasets in real-world settings are rarely clean. We have performed some rudimentary pre-processing for you:

1. Checking and correcting the image dimensions stored in the annotation file so that they match the actual image sizes in the training and validation sets.
2. Some annotations contain pictures that are rotated with respect to their segmentations. We currently remove these pictures, although we encourage you to correct them if possible.
3. A polygon must have at least 3 coordinates; we remove polygons that do not satisfy this criterion.
4. Bounding boxes were drawn manually, so to improve data quality we redraw them from the polygon segments (sketched below).

To perform the pre-processing, the data must be placed in the `data` folder. Run `python pre-processing.py`; it will create `annotations_new.json` in the training and validation data folders.
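
As an illustration of steps 3 and 4 above, the cleanup can be sketched as follows for COCO-style annotation dicts (a hypothetical sketch; the actual logic lives in `pre-processing.py`):

```python
# Hypothetical sketch of pre-processing steps 3 and 4, assuming COCO-style
# annotations where "segmentation" is a list of flat [x1, y1, x2, y2, ...]
# polygons. The real implementation is in pre-processing.py.

def clean_annotation(ann):
    # Step 3: a valid polygon needs at least 3 (x, y) pairs, i.e. 6 numbers.
    polygons = [poly for poly in ann["segmentation"] if len(poly) >= 6]
    if not polygons:
        return None  # nothing valid left to segment

    # Step 4: redraw the bounding box as the tight extent of the polygons.
    xs = [x for poly in polygons for x in poly[0::2]]
    ys = [y for poly in polygons for y in poly[1::2]]
    x_min, y_min = min(xs), min(ys)
    ann["segmentation"] = polygons
    ann["bbox"] = [x_min, y_min, max(xs) - x_min, max(ys) - y_min]  # COCO [x, y, w, h]
    return ann
```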

## Downloading COCO pre-trained models

Transfer learning helps downstream deep learning tasks converge better with less training data. We use COCO pre-trained models as a starting point for training on the MyFoodRepo dataset, and we have made them easy to download: you just have to specify which model you need. Below are the models used in our experiments:

- `mask-r50` = Download the Mask R-CNN ResNet50 model
- `mask-r101` = Download the Mask R-CNN ResNet101 model
- `mask-rx101` = Download the Mask R-CNN ResNeXt101 model
- `htc-r50` = Download the HTC ResNet50 model
- `htc-r101` = Download the HTC ResNet101 model
- `htc-rx101` = Download the HTC ResNeXt101 model
- `detectors-r50` = Download the DetectoRS ResNet50 model
- `all` = Download all of the models

In the terminal, run:

```shell
bash download_base_models.sh mask-r50
```

This will download the model and place it in the `./base_models/mask_rcnn` directory.
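
If you want to verify a downloaded checkpoint, loading it with PyTorch is a quick sanity check (the filename below is a placeholder; use whatever file the script actually saved):

```python
import torch

# Quick sanity check on a downloaded checkpoint. The filename is a
# placeholder; point it at the file download_base_models.sh saved.
ckpt = torch.load("./base_models/mask_rcnn/<MODEL_FILE>.pth", map_location="cpu")
print(list(ckpt.keys()))  # MMDetection checkpoints usually hold "meta" and "state_dict"
print(len(ckpt["state_dict"]), "parameter tensors")
```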

## Building a Model

This part of the README covers the available MMDetection training scripts that can be used to train a model.

### Experimental Scripts

The experimental scripts are set up so you can quickly get started with model training. We have created models with three different instance segmentation algorithms (Mask R-CNN, HTC, and DetectoRS). The motivation for using these three is that Mask R-CNN is among the first methods proposed to solve instance segmentation, and all of them follow the two-stage anatomy of instance segmentation, where the first stage does the feature extraction and the second produces the region proposals. The table below briefly summarizes the differences between these algorithms.

| Algorithm  | Stage 1 | Stage 2 |
|------------|:-------:|--------:|
| Mask R-CNN |   FPN   |   1 RPN |
| HTC        |   FPN   |  3 RPNs |
| DetectoRS  |   RFP   |  3 RPNs |

- FPN - Feature Pyramid Network
- RFP - Recursive Feature Pyramid
- RPN - Region Proposal Network

**Scripts** - These use the methods described above.

| Method     | Backbone     | Version       | Description                                                    |
|------------|--------------|---------------|----------------------------------------------------------------|
| Mask R-CNN | ResNet50     | Baseline      | Contains the baseline code provided by original authors        |
|            |              | Weighted Loss | Baseline + change in loss function of Mask R-CNN               |
|            |              | Augmentation  | Baseline + Albumentations augmentations                        |
|            |              | Multi-Scale   | Baseline + Multi-scale training on 3 image resolutions         |
|            |              | Combined      | Baseline + Augmentation + Multi-Scale + Tuned Hyper-parameters |
|            | ResNet101    | Baseline      | Contains the baseline code provided by original authors        |
|            |              | Combined      | Baseline + Augmentation + Multi-Scale + Tuned Hyper-parameters |
|            | ResNeXt101   | Baseline      | Contains the baseline code provided by original authors        |
|            |              | Combined      | Baseline + Augmentation + Multi-Scale + Tuned Hyper-parameters |
| HTC        | ResNet50     | Baseline      | Contains the baseline code provided by original authors        |
|            |              | Combined      | Baseline + Augmentation + Multi-Scale + Tuned Hyper-parameters |
|            | ResNet101    | Baseline      | Contains the baseline code provided by original authors        |
|            |              | Combined      | Baseline + Augmentation + Multi-Scale + Tuned Hyper-parameters |
|            | ResNeXt101   | Baseline      | Contains the baseline code provided by original authors        |
|            |              | Combined      | Baseline + Augmentation + Multi-Scale + Tuned Hyper-parameters |
| DetectoRS  | HTC+ResNet50 | Baseline      | Contains the baseline code provided by original authors        |
|            |              | Combined      | Baseline + Augmentation + Multi-Scale + Tuned Hyper-parameters |

The training scripts are located in the `experimental-scripts/` directory.
Feel free to change them as per your requirements; a sketch of what such a script can look like follows.
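
As a hedged illustration, a variant script in MMDetection's config style could extend the baseline roughly like this. This is a sketch only: the scales, paths, and pipeline below are assumptions, and the actual files under `experimental-scripts/` may differ.

```python
# Hypothetical sketch of a "Multi-Scale" experimental script in
# MMDetection's config style; the real files in experimental-scripts/
# may differ. Paths and scales here are illustrative.
_base_ = './baseline.py'

train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True, with_mask=True),
    # Multi-scale training: each iteration randomly picks one of three
    # target resolutions.
    dict(type='Resize',
         img_scale=[(1333, 800), (1333, 640), (1333, 480)],
         multiscale_mode='value',
         keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='Normalize',
         mean=[123.675, 116.28, 103.53],
         std=[58.395, 57.12, 57.375],
         to_rgb=True),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels', 'gt_masks']),
]

data = dict(train=dict(pipeline=train_pipeline))
```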

### Training Models

The training scripts contain Weights and Biases (WandB) support. The nice thing about WandB is that you only have to configure it once with your API key; our scripts will then help organize all the experiments you perform for your research or product development.
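
For that one-time setup, you can log in from Python (assuming the `wandb` package is installed; the key below is a placeholder), or equivalently run `wandb login` once in the terminal:

```python
import wandb

# One-time setup: stores the API key locally so subsequent training runs
# can log metrics automatically. Replace the placeholder with your key.
wandb.login(key="<YOUR_WANDB_API_KEY>")
```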
To start the model training, we assume that you have completed all the above steps and that you have the right CUDA version for your GPU. If so, run the following command in the terminal:

```shell
CUDA_VISIBLE_DEVICES=<DEVICE_ID> python mmdetection/tools/train.py ./experimental-scripts/mask_rcnn/baseline.py
```

This will save the model under `./work-dir/mask_rcnn/resnet50/baseline`.

Easy, huh?

## Generating Predictions (Inference)

Now that you have trained models, it's time to get inferences from the learned model. We have prepared an inference script for you that will save all the predictions in `json` format.

1. Getting inferences from a single model

The script below will help you generate predictions for a given model and configuration. The variables are set according to the current layout of this repository; of course, you can and should change them for other experiments.

```shell
config=./experimental-scripts/mask_rcnn/baseline.py
checkpoint=./work-dir/mask_rcnn/resnet50/baseline/epoch_1.pth
out_file=./work-dir/mask_rcnn/resnet50/baseline/predictions_1.json
data=./data/val/images

CUDA_VISIBLE_DEVICES=<DEVICE_ID> python inference/predict_model.py \
    --config $config \
    --checkpoint $checkpoint \
    --data $data \
    --format-only --eval-options jsonfile_prefix=predictions \
    --out_file $out_file
```

Warning: if the relative paths don't work, try using absolute paths instead.

2. Ensembling different models

At this point we assume that you have at least two trained models. If so, you can improve the score even further by using our ensembling approach.  
Update `./inference/models.py` as per your requirements: you need to specify the configuration file path, checkpoint path, and output file path for each model. We have set it up for the basic case, i.e. ensembling epochs 1 and 2 of the Mask R-CNN baseline model.
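
For reference, `./inference/models.py` could take a shape like the following (hypothetical; consult the actual file for the exact structure the ensembling script expects):

```python
# Hypothetical sketch of ./inference/models.py -- consult the actual file
# for the exact structure the ensembling script expects. Each entry pairs
# a config with a checkpoint and an output file.
models = [
    {
        "config": "./experimental-scripts/mask_rcnn/baseline.py",
        "checkpoint": "./work-dir/mask_rcnn/resnet50/baseline/epoch_1.pth",
        "out_file": "./work-dir/mask_rcnn/resnet50/baseline/predictions_1.json",
    },
    {
        "config": "./experimental-scripts/mask_rcnn/baseline.py",
        "checkpoint": "./work-dir/mask_rcnn/resnet50/baseline/epoch_2.pth",
        "out_file": "./work-dir/mask_rcnn/resnet50/baseline/predictions_2.json",
    },
]
```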

```shell
data=./data/val/images
out_file=./work-dir/mask_rcnn/resnet50/baseline/predictions_ensemble.json

CUDA_VISIBLE_DEVICES=<DEVICE_ID> python inference/infer.py \
    --out_file $out_file \
    --data $data
```

## Evaluation

We use the evaluation metrics defined by MS COCO, i.e. mAP (mean Average Precision) and mAR (mean Average Recall). The evaluation scripts can be found under `./eval/`.  
To evaluate a model's performance, you need the ground truth annotations and the predictions. The paths below follow this repository's directory structure, although you can change them as needed.

```shell
annotations=./data/val/annotations_new.json
out_file=./work-dir/mask_rcnn/resnet50/baseline/predictions_1.json
CUDA_VISIBLE_DEVICES=0 python eval/eval.py $out_file --ann $annotations
```
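
Under the hood, COCO-style evaluation comes down to pycocotools' `COCOeval`. Here is a minimal standalone sketch, assuming the predictions file is in the standard COCO results format:

```python
from pycocotools.coco import COCO
from pycocotools.cocoeval import COCOeval

# Load ground truth and predictions, then score segmentation quality.
coco_gt = COCO("./data/val/annotations_new.json")
coco_dt = coco_gt.loadRes("./work-dir/mask_rcnn/resnet50/baseline/predictions_1.json")

coco_eval = COCOeval(coco_gt, coco_dt, iouType="segm")
coco_eval.evaluate()
coco_eval.accumulate()
coco_eval.summarize()  # prints AP/AR at the standard COCO thresholds
```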


Done! We hope you had a nice time working with our repository.