From 0613aedb7c4ee05b65893343fbea6a34b8aac77e Mon Sep 17 00:00:00 2001 From: Kai Chen <chenkaidev@gmail.com> Date: Sun, 21 Oct 2018 01:43:47 +0800 Subject: [PATCH] update readme and version --- README.md | 46 ++++++++++++++++++++++++++++++++++++++++ mmdet/datasets/custom.py | 2 +- setup.py | 2 +- 3 files changed, 48 insertions(+), 2 deletions(-) diff --git a/README.md b/README.md index 4eb1f1b..47df2d4 100644 --- a/README.md +++ b/README.md @@ -34,6 +34,10 @@ This project is released under the [Apache 2.0 license](LICENSE). ## Updates +v0.5.2 (21/10/2018) +- Add support for custom datasets. +- Add a script to convert PASCAL VOC annotations to the expected format. + v0.5.1 (20/10/2018) - Add BBoxAssigner and BBoxSampler, the `train_cfg` field in config files are restructured. - `ConvFCRoIHead` / `SharedFCRoIHead` are renamed to `ConvFCBBoxHead` / `SharedFCBBoxHead` for consistency. @@ -209,6 +213,48 @@ Expected results in WORK_DIR: > 1. We recommend using distributed training with NCCL2 even on a single machine, which is faster. Non-distributed training is for debugging or other purposes. > 2. The default learning rate is for 8 GPUs. If you use less or more than 8 GPUs, you need to set the learning rate proportional to the GPU num. E.g., modify lr to 0.01 for 4 GPUs or 0.04 for 16 GPUs. +### Train on custom datasets + +We define a simple annotation format. + +The annotation of a dataset is a list of dict, each dict corresponds to an image. +There are 3 fields `filename` (relative path), `width`, `height` for testing, +and an additional field `ann` for training. `ann` is also a dict containing at least 2 fields: +`bboxes` and `labels`, both of which are numpy arrays. Some datasets may provide +annotations like crowd/difficult/ignored bboxes, we use `bboxes_ignore` and `labels_ignore` +to cover them. + +Here is an example.
+``` +[ + { + 'filename': 'a.jpg', + 'width': 1280, + 'height': 720, + 'ann': { + 'bboxes': <np.ndarray> (n, 4), + 'labels': <np.ndarray> (n, ), + 'bboxes_ignore': <np.ndarray> (k, 4), + 'labels_ignore': <np.ndarray> (k, ) (optional field) + } + }, + ... +] +``` + +There are two ways to work with custom datasets. + +- online conversion + + You can write a new Dataset class inherited from `CustomDataset`, and overwrite two methods + `load_annotations(self, ann_file)` and `get_ann_info(self, idx)`, like [CocoDataset](mmdet/datasets/coco.py). + +- offline conversion + + You can convert the annotation format to the expected format above and save it to + a pickle file, like [pascal_voc.py](tools/convert_datasets/pascal_voc.py). + Then you can simply use `CustomDataset`. + ## Technical details Some implementation details and project structures are described in the [technical details](TECHNICAL_DETAILS.md). diff --git a/mmdet/datasets/custom.py b/mmdet/datasets/custom.py index 3ae4704..3640a83 100644 --- a/mmdet/datasets/custom.py +++ b/mmdet/datasets/custom.py @@ -271,4 +271,4 @@ class CustomDataset(Dataset): data = dict(img=imgs, img_meta=img_metas) if self.proposals is not None: data['proposals'] = proposals - return data \ No newline at end of file + return data diff --git a/setup.py b/setup.py index 1803b73..9115195 100644 --- a/setup.py +++ b/setup.py @@ -12,7 +12,7 @@ def readme(): MAJOR = 0 MINOR = 5 -PATCH = 1 +PATCH = 2 SUFFIX = '' SHORT_VERSION = '{}.{}.{}{}'.format(MAJOR, MINOR, PATCH, SUFFIX) -- GitLab