Commit bbb699f3 authored by Kai Chen, committed by GitHub
Merge pull request #580 from hellock/docs

Update readme and tutorials
# Getting Started
This page provides basic tutorials about the usage of mmdetection.
For installation instructions, please see [INSTALL.md](INSTALL.md).
## Inference with pretrained models
We provide testing scripts to evaluate a whole dataset (COCO, PASCAL VOC, etc.),
as well as some high-level APIs for easier integration into other projects.
### Test a dataset
- [x] single GPU testing
- [x] multiple GPU testing
- [x] visualize detection results
You can use the following command to test a dataset.
```shell
python tools/test.py ${CONFIG_FILE} ${CHECKPOINT_FILE} [--gpus ${GPU_NUM}] [--proc_per_gpu ${PROC_NUM}] [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}] [--show]
```
Positional arguments:
- `CONFIG_FILE`: Path to the config file of the corresponding model.
- `CHECKPOINT_FILE`: Path to the checkpoint file.
Optional arguments:
- `GPU_NUM`: Number of GPUs used for testing. (default: 1)
- `PROC_NUM`: Number of processes on each GPU. (default: 1)
- `RESULT_FILE`: Filename of the output results in pickle format. If not specified, the results will not be saved to a file.
- `EVAL_METRICS`: Items to be evaluated on the results. Allowed values are: `proposal_fast`, `proposal`, `bbox`, `segm`, `keypoints`.
- `--show`: If specified, detection results will be plotted on the images and shown in a new window. Only applicable for single GPU testing.
Examples:
Assume that you have already downloaded the checkpoints to `checkpoints/`.
1. Test Faster R-CNN and show the results.
```shell
python tools/test.py configs/faster_rcnn_r50_fpn_1x.py \
checkpoints/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth \
--show
```
2. Test Mask R-CNN and evaluate the bbox and mask AP.
```shell
python tools/test.py configs/mask_rcnn_r50_fpn_1x.py \
checkpoints/mask_rcnn_r50_fpn_1x_20181010-069fa190.pth \
--out results.pkl --eval bbox segm
```
3. Test Mask R-CNN with 8 GPUs and 2 processes per GPU, and evaluate the bbox and mask AP.
```shell
python tools/test.py configs/mask_rcnn_r50_fpn_1x.py \
checkpoints/mask_rcnn_r50_fpn_1x_20181010-069fa190.pth \
--gpus 8 --proc_per_gpu 2 --out results.pkl --eval bbox segm
```
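The file written by `--out` can be inspected directly with mmcv. Below is a minimal sketch, assuming `results.pkl` was produced as in examples 2 or 3 above; the exact per-image structure depends on the model (e.g., detection-only models typically give a list of per-class bbox arrays, while mask models pair bbox and mask results), so check it for your own checkpoint.

```python
import mmcv

# Load the pickle produced by `tools/test.py ... --out results.pkl`.
results = mmcv.load('results.pkl')

# One entry per image, in the same order as the test dataset.
print('number of images:', len(results))
print('type of the first entry:', type(results[0]))
```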
### High-level APIs for testing images
Here is an example of building the model and testing given images.
```python
import mmcv
from mmcv.runner import load_checkpoint
from mmdet.models import build_detector
from mmdet.apis import inference_detector, show_result
cfg = mmcv.Config.fromfile('configs/faster_rcnn_r50_fpn_1x.py')
cfg.model.pretrained = None
# construct the model and load checkpoint
model = build_detector(cfg.model, test_cfg=cfg.test_cfg)
_ = load_checkpoint(model, 'https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth')
# test a single image
img = mmcv.imread('test.jpg')
result = inference_detector(model, img, cfg)
show_result(img, result)
# test a list of images
imgs = ['test1.jpg', 'test2.jpg']
for i, result in enumerate(inference_detector(model, imgs, cfg, device='cuda:0')):
    print(i, imgs[i])
    show_result(imgs[i], result)
```
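As a follow-up to the snippet above, here is a small sketch of post-processing a single `result`. It assumes the common output format of two-stage detectors such as Faster R-CNN, a list with one `(n, 5)` array per class in `[x1, y1, x2, y2, score]` order; verify this against your own model before relying on it.

```python
# Continues from the example above: `result` comes from inference_detector.
score_thr = 0.5
for class_id, dets in enumerate(result):
    # dets is assumed to be an (n, 5) float array: x1, y1, x2, y2, score
    kept = dets[dets[:, 4] > score_thr]
    for x1, y1, x2, y2, score in kept:
        print('class {}: score {:.2f}, box ({:.0f}, {:.0f}, {:.0f}, {:.0f})'.format(
            class_id, score, x1, y1, x2, y2))
```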
## Train a model
mmdetection implements distributed training and non-distributed training,
which uses `MMDistributedDataParallel` and `MMDataParallel` respectively.
All outputs (log files and checkpoints) will be saved to the working directory,
which is specified by `work_dir` in the config file.
**\*Important\***: The default learning rate in config files is for 8 GPUs.
If you use fewer or more than 8 GPUs, you need to scale the learning rate proportionally
to the number of GPUs, e.g., 0.01 for 4 GPUs and 0.04 for 16 GPUs.
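For example, the linear scaling rule above amounts to the following. The base value 0.02 is what 8-GPU configs such as `faster_rcnn_r50_fpn_1x.py` typically use; take the value from your own config.

```python
base_lr = 0.02   # learning rate in the config, assumed to target 8 GPUs
num_gpus = 4     # GPUs you actually train with

scaled_lr = base_lr * num_gpus / 8
print(scaled_lr)  # 0.01 for 4 GPUs, 0.04 for 16 GPUs
```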
### Train with a single GPU
```shell
python tools/train.py ${CONFIG_FILE}
```
If you want to specify the working directory in the command, you can add an argument `--work_dir ${YOUR_WORK_DIR}`.
### Train with multiple GPUs
```shell
./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
```
Optional arguments are:
- `--validate` (recommended): Perform evaluation every k epochs during training (k defaults to 1).
- `--work_dir ${WORK_DIR}`: Override the working directory specified in the config file.
- `--resume_from ${CHECKPOINT_FILE}`: Resume from a previous checkpoint file.
### Train with multiple machines
If you run mmdetection on a cluster managed with [slurm](https://slurm.schedmd.com/), you can just use the script `slurm_train.sh`.
```shell
./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR} [${GPUS}]
```
Here is an example of using 16 GPUs to train Mask R-CNN on the dev partition.
```shell
./tools/slurm_train.sh dev mask_r50_1x configs/mask_rcnn_r50_fpn_1x.py /nfs/xxxx/mask_rcnn_r50_fpn_1x 16
```
You can check [slurm_train.sh](tools/slurm_train.sh) for full arguments and environment variables.
If you have multiple machines connected only via Ethernet, you can refer to the
PyTorch [launch utility](https://pytorch.org/docs/stable/distributed_deprecated.html#launch-utility).
Training is usually slow if you do not have high-speed networking such as InfiniBand.
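For reference, a script started with PyTorch's launch utility usually just reads the `--local_rank` argument injected by the launcher and joins the process group. The sketch below is generic PyTorch, not mmdetection's actual entry point.

```python
import argparse

import torch
import torch.distributed as dist

parser = argparse.ArgumentParser()
parser.add_argument('--local_rank', type=int, default=0)  # set by the launcher
args = parser.parse_args()

# MASTER_ADDR, MASTER_PORT, RANK and WORLD_SIZE are exported by the launcher.
dist.init_process_group(backend='nccl', init_method='env://')
torch.cuda.set_device(args.local_rank)
print('rank {} / world size {}'.format(dist.get_rank(), dist.get_world_size()))
```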
## How-to
### Use my own datasets
The simplest way is to convert your dataset to existing dataset formats (COCO or PASCAL VOC).
Here we show an example of adding a custom dataset of 5 classes, assuming it is also in COCO format.
In `mmdet/datasets/my_dataset.py`:
```python
from .coco import CocoDataset
class MyDataset(CocoDataset):

    CLASSES = ('a', 'b', 'c', 'd', 'e')
```
In `mmdet/datasets/__init__.py`:
```python
from .my_dataset import MyDataset
```
Then you can use `MyDataset` in config files, with the same API as CocoDataset.
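For instance, a config could point the data section at the new class like this. This is a hedged sketch; the actual fields such as `ann_file`, `img_prefix` and image scales should follow whatever config you start from, and the paths below are hypothetical.

```python
dataset_type = 'MyDataset'
data_root = 'data/my_dataset/'       # hypothetical layout
data = dict(
    train=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/train.json',
        img_prefix=data_root + 'images/'),
    val=dict(
        type=dataset_type,
        ann_file=data_root + 'annotations/val.json',
        img_prefix=data_root + 'images/'))
```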
It is also fine if you do not want to convert the annotation format to COCO or PASCAL VOC.
We define a simple annotation format, and all existing datasets are
processed to be compatible with it, either online or offline.
The annotation of a dataset is a list of dicts; each dict corresponds to an image.
There are three fields used for testing: `filename` (relative path), `width`, and `height`,
plus an additional field `ann` used for training. `ann` is also a dict containing at least two fields:
`bboxes` and `labels`, both of which are numpy arrays. Some datasets may provide
annotations such as crowd/difficult/ignored bboxes; we use `bboxes_ignore` and `labels_ignore`
to cover them.
Here is an example.
```
[
    {
        'filename': 'a.jpg',
        'width': 1280,
        'height': 720,
        'ann': {
            'bboxes': <np.ndarray, float32> (n, 4),
            'labels': <np.ndarray, int64> (n, ),
            'bboxes_ignore': <np.ndarray, float32> (k, 4),
            'labels_ignore': <np.ndarray, int64> (k, ) (optional field)
        }
    },
    ...
]
```
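As a toy illustration (not part of mmdetection itself), the middle format above can be produced from any label source and saved with mmcv. The `raw_labels` structure here is invented purely for the example.

```python
import mmcv
import numpy as np

# Hypothetical parsed labels from your own source; replace with real parsing.
raw_labels = [
    dict(file='a.jpg', w=1280, h=720, boxes=[[10, 20, 110, 220]], classes=[1]),
]

annotations = []
for item in raw_labels:
    annotations.append(dict(
        filename=item['file'],
        width=item['w'],
        height=item['h'],
        ann=dict(
            bboxes=np.array(item['boxes'], dtype=np.float32).reshape(-1, 4),
            labels=np.array(item['classes'], dtype=np.int64),
            bboxes_ignore=np.zeros((0, 4), dtype=np.float32),
            labels_ignore=np.zeros((0, ), dtype=np.int64))))

mmcv.dump(annotations, 'my_train_annotations.pkl')
```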
There are two ways to work with custom datasets.

- online conversion

  You can write a new Dataset class inherited from `CustomDataset`, and overwrite two methods
  `load_annotations(self, ann_file)` and `get_ann_info(self, idx)`,
  like [CocoDataset](mmdet/datasets/coco.py) and [VOCDataset](mmdet/datasets/voc.py).
  A rough sketch is shown after this list.

- offline conversion

  You can convert the annotation format to the expected format above and save it to
  a pickle or json file, like [pascal_voc.py](tools/convert_datasets/pascal_voc.py)
  (one way to produce such a file is sketched above).
  Then you can simply use `CustomDataset`.
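Below is the rough sketch of the online-conversion route mentioned above. It is not the actual CocoDataset code: the comma-separated annotation layout is invented for illustration, and it assumes the file lives in `mmdet/datasets/` (like `my_dataset.py` above) so the relative import resolves.

```python
import numpy as np

from .custom import CustomDataset


class MyTxtDataset(CustomDataset):
    """Each annotation line: filename,width,height,x1,y1,x2,y2,label (invented layout)."""

    def load_annotations(self, ann_file):
        img_infos = []
        self.ann_infos = []
        for line in open(ann_file):
            f = line.strip().split(',')
            img_infos.append(
                dict(filename=f[0], width=int(f[1]), height=int(f[2])))
            self.ann_infos.append(dict(
                bboxes=np.array([[float(v) for v in f[3:7]]], dtype=np.float32),
                labels=np.array([int(f[7])], dtype=np.int64)))
        return img_infos

    def get_ann_info(self, idx):
        return self.ann_infos[idx]
```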
### Develop new components
We basically categorize model components into 4 types.
- backbone: usually an FCN network to extract feature maps, e.g., ResNet, MobileNet.
- neck: the component between backbones and heads, e.g., FPN, PAFPN.
- head: the component for specific tasks, e.g., bbox prediction and mask prediction.
- roi extractor: the part for extracting RoI features from feature maps, e.g., RoI Align.
Here we show how to develop new components with an example of MobileNet.
1. Create a new file `mmdet/models/backbones/mobilenet.py`.
```python
import torch.nn as nn

from ..registry import BACKBONES


@BACKBONES.register_module
class MobileNet(nn.Module):

    def __init__(self, arg1, arg2):
        pass

    def forward(self, x):  # should return a tuple
        pass
```
2. Import the module in `mmdet/models/backbones/__init__.py`.
```python
from .mobilenet import MobileNet
```
3. Use it in your config file.
```python
model = dict(
    ...
    backbone=dict(
        type='MobileNet',
        arg1=xxx,
        arg2=xxx),
    ...
```
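For intuition, the registry pattern used above boils down to roughly the following. This is a simplified stand-in for illustration, not mmdetection's actual implementation: `type` selects a registered class and the remaining keys become constructor keyword arguments.

```python
# Simplified stand-in for the registry mechanism; for illustration only.
registry = {}

def register(cls):
    registry[cls.__name__] = cls
    return cls

def build_from_cfg(cfg):
    cfg = dict(cfg)                  # copy so pop() does not mutate the config
    cls = registry[cfg.pop('type')]
    return cls(**cfg)                # e.g. MobileNet(arg1=..., arg2=...)

@register
class MobileNet:
    def __init__(self, arg1, arg2):
        self.arg1, self.arg2 = arg1, arg2

backbone = build_from_cfg(dict(type='MobileNet', arg1=1, arg2=2))
```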
For more information on how it works, you can refer to [TECHNICAL_DETAILS.md](TECHNICAL_DETAILS.md) (TODO).
@@ -3,7 +3,7 @@
## Introduction
-The master branch works with **PyTorch 1.0**. If you would like to use PyTorch 0.4.1,
+The master branch works with **PyTorch 1.0** or higher. If you would like to use PyTorch 0.4.1,
please check out the [pytorch-0.4.1](https://github.com/open-mmlab/mmdetection/tree/pytorch-0.4.1) branch.
mmdetection is an open source object detection toolbox based on PyTorch. It is
@@ -24,7 +24,7 @@ a part of the open-mmlab project developed by [Multimedia Laboratory, CUHK](http
- **Efficient**
All basic bbox and mask operations run on GPUs now.
-The training speed is about 5% ~ 20% faster than Detectron for different models.
+The training speed is nearly 2x faster than Detectron and comparable to maskrcnn-benchmark.
- **State of the art**
@@ -108,149 +108,8 @@ Please refer to [INSTALL.md](INSTALL.md) for installation and dataset preparation.
## Inference with pretrained models
### Test a dataset
Please see [GETTING_STARTED.md](GETTING_STARTED.md) for the basic usage of mmdetection.
- [x] single GPU testing
- [x] multiple GPU testing
- [x] visualize detection results
We allow running one or multiple processes on each GPU, e.g., 8 processes on 8 GPUs
or 16 processes on 8 GPUs. When the GPU workload is not very heavy for a single
process, running multiple processes will accelerate the testing, which is specified
with the argument `--proc_per_gpu <PROCESS_NUM>`.
To test a dataset and save the results:
```shell
python tools/test.py <CONFIG_FILE> <CHECKPOINT_FILE> --gpus <GPU_NUM> --out <OUT_FILE>
```
To perform evaluation after testing, add `--eval <EVAL_TYPES>`. Supported types are:
`[proposal_fast, proposal, bbox, segm, keypoints]`.
`proposal_fast` denotes evaluating proposal recalls with our own implementation;
the others denote evaluating the corresponding metrics with the official COCO API.
For example, to evaluate Mask R-CNN with 8 GPUs and save the result as `results.pkl`.
```shell
python tools/test.py configs/mask_rcnn_r50_fpn_1x.py <CHECKPOINT_FILE> --gpus 8 --out results.pkl --eval bbox segm
```
It is also convenient to visualize the results during testing by adding an argument `--show`.
```shell
python tools/test.py <CONFIG_FILE> <CHECKPOINT_FILE> --show
```
### Test image(s)
We provide some high-level APIs (experimental) to test an image.
```python
import mmcv
from mmcv.runner import load_checkpoint
from mmdet.models import build_detector
from mmdet.apis import inference_detector, show_result
cfg = mmcv.Config.fromfile('configs/faster_rcnn_r50_fpn_1x.py')
cfg.model.pretrained = None
# construct the model and load checkpoint
model = build_detector(cfg.model, test_cfg=cfg.test_cfg)
_ = load_checkpoint(model, 'https://s3.ap-northeast-2.amazonaws.com/open-mmlab/mmdetection/models/faster_rcnn_r50_fpn_1x_20181010-3d1b3351.pth')
# test a single image
img = mmcv.imread('test.jpg')
result = inference_detector(model, img, cfg)
show_result(img, result)
# test a list of images
imgs = ['test1.jpg', 'test2.jpg']
for i, result in enumerate(inference_detector(model, imgs, cfg, device='cuda:0')):
    print(i, imgs[i])
    show_result(imgs[i], result)
```
## Train a model
mmdetection implements distributed training and non-distributed training,
which uses `MMDistributedDataParallel` and `MMDataParallel` respectively.
### Distributed training (single or multiple machines)
mmdetection potentially supports multiple launch methods, e.g., PyTorch’s built-in launch utility, slurm and MPI.
We provide a training script using the launch utility provided by PyTorch.
```shell
./tools/dist_train.sh <CONFIG_FILE> <GPU_NUM> [optional arguments]
```
Supported arguments are:
- `--validate`: perform evaluation every k epochs during training (default: k=1).
- `--work_dir <WORK_DIR>`: if specified, the path in the config file will be replaced.
Expected results in WORK_DIR:
- log file
- saved checkpoints (every k epochs, default: 1)
- a symbolic link to the latest checkpoint
**Important**: The default learning rate is for 8 GPUs. If you use fewer or more than 8 GPUs, you need to scale the learning rate proportionally to the number of GPUs, e.g., modify lr to 0.01 for 4 GPUs or 0.04 for 16 GPUs.
### Non-distributed training
Please refer to `tools/train.py` for non-distributed training, which is not recommended
and left for debugging. Even on a single machine, distributed training is preferred.
### Train on custom datasets
We define a simple annotation format.
The annotation of a dataset is a list of dicts; each dict corresponds to an image.
There are three fields used for testing: `filename` (relative path), `width`, and `height`,
plus an additional field `ann` used for training. `ann` is also a dict containing at least two fields:
`bboxes` and `labels`, both of which are numpy arrays. Some datasets may provide
annotations such as crowd/difficult/ignored bboxes; we use `bboxes_ignore` and `labels_ignore`
to cover them.
Here is an example.
```
[
    {
        'filename': 'a.jpg',
        'width': 1280,
        'height': 720,
        'ann': {
            'bboxes': <np.ndarray> (n, 4),
            'labels': <np.ndarray> (n, ),
            'bboxes_ignore': <np.ndarray> (k, 4),
            'labels_ignore': <np.ndarray> (k, ) (optional field)
        }
    },
    ...
]
```
There are two ways to work with custom datasets.

- online conversion

  You can write a new Dataset class inherited from `CustomDataset`, and overwrite two methods
  `load_annotations(self, ann_file)` and `get_ann_info(self, idx)`, like [CocoDataset](mmdet/datasets/coco.py) and [VOCDataset](mmdet/datasets/voc.py).

- offline conversion

  You can convert the annotation format to the expected format above and save it to
  a pickle or json file, like [pascal_voc.py](tools/convert_datasets/pascal_voc.py).
  Then you can simply use `CustomDataset`.
## Technical details
Some implementation details and project structures are described in the [technical details](TECHNICAL_DETAILS.md).
## Citation