From 251c429005eccbbc1b372ad458a3fc7dc21895d6 Mon Sep 17 00:00:00 2001
From: Kai Chen <chenkaidev@gmail.com>
Date: Tue, 16 Oct 2018 14:39:05 +0800
Subject: [PATCH] update readme with some training notes

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index bef3ad5..0d18f53 100644
--- a/README.md
+++ b/README.md
@@ -172,9 +172,6 @@ for i, result in enumerate(inference_detector(model, imgs, cfg, device='cuda:0')
 mmdetection implements distributed training and non-distributed training,
 which uses `MMDistributedDataParallel` and `MMDataParallel` respectively.
 
-We suggest using distributed training even on a single machine, which is faster,
-and non-distributed training are left for debugging or other purposes.
-
 ### Distributed training
 
 mmdetection potentially supports multiple launch methods, e.g., PyTorch’s built-in launch utility, slurm and MPI.
@@ -202,6 +199,9 @@ Expected results in WORK_DIR:
 - saved checkpoints (every k epochs, defaults=1)
 - a symbol link to the latest checkpoint
 
+> **Note**
+> 1. We recommend using distributed training with NCCL2 even on a single machine, which is faster. Non-distributed training is for debugging or other purposes.
+> 2. The default learning rate is for 8 GPUs. If you use less or more than 8 GPUs, you need to set the learning rate proportional to the GPU num. E.g., modify lr to 0.01 for 4 GPUs or 0.04 for 16 GPUs.
 
 ## Technical details
 
-- 
GitLab
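The linear scaling rule in note 2 of the patch implies a base learning rate of 0.02 for 8 GPUs (halved to 0.01 for 4 GPUs, doubled to 0.04 for 16 GPUs). Below is a minimal sketch of that arithmetic; `scale_lr` is a hypothetical helper for illustration, not part of the patch or of mmdetection's API.

```python
# Sketch of the linear learning-rate scaling rule from note 2 above.
# `scale_lr` is a hypothetical helper, not an mmdetection function; the base
# lr of 0.02 for 8 GPUs is implied by the 0.01 (4 GPUs) / 0.04 (16 GPUs) examples.
def scale_lr(num_gpus, base_lr=0.02, base_gpus=8):
    """Scale the learning rate proportionally to the number of GPUs used."""
    return base_lr * num_gpus / base_gpus

assert scale_lr(4) == 0.01   # matches the 4-GPU example in the note
assert scale_lr(16) == 0.04  # matches the 16-GPU example in the note
```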