yolo in pytorch


a quick look at conda, pytorch, and yolo


linux version

(base) chenfh5@herbie:~$ cat /etc/lsb-release


wget https://mirrors.tuna.tsinghua.edu.cn/anaconda/archive/Anaconda3-2021.11-Linux-x86_64.sh --no-check-certificate
bash Anaconda3-2021.11-Linux-x86_64.sh

(base) chenfh5@herbie:~$ conda -V
conda 4.10.3

(base) chenfh5@herbie:~$ anaconda -V
anaconda Command line client (version 1.9.0)


conda create -n my_deeplearning python=3.9 numpy matplotlib pandas jupyter notebook

conda activate my_deeplearning
conda info -e
conda deactivate
conda remove -n my_deeplearning --all

(my_deeplearning) chenfh5@herbie:~$ nvidia-smi
Mon Nov 22 17:32:24 2021
| NVIDIA-SMI 455.32.00    Driver Version: 455.32.00    CUDA Version: 11.1     |

conda install pytorch torchvision torchaudio cudatoolkit=11.3

# verification
import torch as t
x = t.rand(5,3)
y = t.rand(5,3)
if t.cuda.is_available():
    x = x.cuda()
    y = y.cuda()
    print('cuda x+y={}'.format(x+y))

print('torch version={}'.format(t.__version__)) # 1.10.0
print('cuda is_available={}'.format(t.cuda.is_available())) # True


jupyter notebook --generate-config

(my_deeplearning) chenfh5@herbie:~/deployment/my_deeplearning/src$ ipython
Python 3.9.7 | packaged by conda-forge | (default, Sep 29 2021, 19:20:46)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.29.0 -- An enhanced Interactive Python. Type '?' for help.
In [1]: from notebook.auth import passwd
In [2]: passwd()
Enter password:
Verify password:
Out[2]: '***'

vim ~/.jupyter/jupyter_notebook_config.py
# append to jupyter_notebook_config.py
c.NotebookApp.ip = '*'
c.NotebookApp.port = 9091
c.NotebookApp.open_browser = False
c.NotebookApp.password = '???'  # paste the hash string returned by passwd() above
c.NotebookApp.notebook_dir = '/home/chenfh5/deployment/my_deeplearning/jupyter_data'

# start the jupyter server
conda activate my_deeplearning
jupyter notebook # adhoc
nohup jupyter notebook --allow-root > error.log 2>&1 & echo $! > jupyter_pid.txt # background, log stdout+stderr, save pid

(my_deeplearning) chenfh5@herbie:~/deployment/my_deeplearning/jupyter_data$ vim ~/.jupyter/jupyter_notebook_config.py
(my_deeplearning) chenfh5@herbie:~/deployment/my_deeplearning/jupyter_data$ jupyter notebook
[W 09:40:59.582 NotebookApp] WARNING: The notebook server is listening on all IP addresses and not using encryption. This is not recommended.
[I 09:40:59.586 NotebookApp] Serving notebooks from local directory: /home/chenfh5/deployment/my_deeplearning/jupyter_data
[I 09:40:59.586 NotebookApp] Jupyter Notebook 6.4.6 is running at:
[I 09:40:59.586 NotebookApp] http://herbie:9091/
[I 09:40:59.586 NotebookApp] Use Control-C to stop this server and shut down all kernels (twice to skip confirmation).


jupyter pytorch verification snapshot



netstat -tulnp | grep 9093

conda create --name imagenet_simple python=3.7
conda activate imagenet_simple

/home/chenfh5/anaconda3/envs/imagenet_simple/bin/python3.7 -m pip install Flask torch torchvision
git clone --depth=1 https://github.com/avinassh/pytorch-flask-api.git
FLASK_ENV=development FLASK_APP=app flask run --host= --port=9093

curl -X POST -F file=@lena.png http://localhost:9093/predict
  "class_id": "n02869837",
  "class_name": "bonnet"


conda create --name imagenet_web python=3.9
conda activate imagenet_web

git clone --depth=1 https://github.com/avinassh/pytorch-flask-api-heroku.git
pip3 install Flask torch torchvision numpy Pillow gunicorn
FLASK_ENV=development FLASK_APP=app flask run --host= --port=9094



conda create --name imagenet_batch python=3.9
conda activate imagenet_batch

git clone --depth=1 https://github.com/ShannonAI/service-streamer.git
pip3 install torchvision pillow flask service_streamer

FLASK_ENV=development FLASK_APP=app FLASK_DEBUG=0 flask run --host= --port=9095

curl -F "file=@cat.jpg"
  "class_id": "n02123159",
  "class_name": "tiger_cat"

curl -F "file=@cat.jpg"
  "class_id": "n02123159",
  "class_name": "tiger_cat"

wrk -c 64 -d 20s --timeout=20s -s file.lua
Running 20s test @
  2 threads and 64 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency     1.93s   414.93ms   3.31s    88.07%
    Req/Sec    23.96     15.19    80.00     67.60%
  637 requests in 20.03s, 127.52KB read
Requests/sec:     31.81
Transfer/sec:      6.37KB

wrk -c 64 -d 20s --timeout=20s -s file.lua
Running 20s test @
  2 threads and 64 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency   754.17ms  372.05ms   2.21s    69.94%
    Req/Sec    82.83     93.43   310.00     79.70%
  1731 requests in 20.03s, 346.54KB read
Requests/sec:     86.42
Transfer/sec:     17.30KB
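
The two wrk runs above can be compared with a few lines of arithmetic. This is only a sketch: the figures are copied from the output above, and the variable names are made up for illustration.

```python
# Figures copied from the two wrk runs above (first run, second run).
first = {"requests_per_sec": 31.81, "avg_latency_s": 1.93}
second = {"requests_per_sec": 86.42, "avg_latency_s": 0.75417}

# How many times more requests per second the second run served.
throughput_ratio = second["requests_per_sec"] / first["requests_per_sec"]

# How many times lower the average latency was in the second run.
latency_ratio = first["avg_latency_s"] / second["avg_latency_s"]

print(f"throughput: x{throughput_ratio:.2f}")
print(f"avg latency: x{latency_ratio:.2f} lower")
```

Both ratios come out a bit above 2.5, so the second run served more requests and answered each one faster.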


yolov5 custom training (ref8, ref9, ref10, ref11)

!git clone https://github.com/ultralytics/yolov5
!python train.py --img 640 --cfg yolov5s.yaml --hyp hyp.scratch.yaml --batch 32 --epochs 4 --data road_sign_data.yaml --weights yolov5s.pt --workers 24 --name my_yolov5_road_det --device 0,1,2,3
!python detect.py --source ../Road_Sign_Dataset/images/test/ --weights runs/train/my_yolov5_road_det/weights/best.pt --conf 0.25 --name my_yolov5_road_det --device 0

class Model(nn.Module) in yolo.py:
    1. load `yolov5s.yaml`
    2. parse yaml to model
    3. build stride, anchor
    4. init weight, bias

Overriding model.yaml nc=80 with nc=4
                 from  n    params  module                                  arguments                     
  0                -1  1      3520  models.common.Conv                      [3, 32, 6, 2, 2]              
  1                -1  1     18560  models.common.Conv                      [32, 64, 3, 2]                
  2                -1  1     18816  models.common.C3                        [64, 64, 1]                   
  3                -1  1     73984  models.common.Conv                      [64, 128, 3, 2]               
  4                -1  2    115712  models.common.C3                        [128, 128, 2]                 
  5                -1  1    295424  models.common.Conv                      [128, 256, 3, 2]              
  6                -1  3    625152  models.common.C3                        [256, 256, 3]                 
  7                -1  1   1180672  models.common.Conv                      [256, 512, 3, 2]              
  8                -1  1   1182720  models.common.C3                        [512, 512, 1]                 
  9                -1  1    656896  models.common.SPPF                      [512, 512, 5]                 
 10                -1  1    131584  models.common.Conv                      [512, 256, 1, 1]              
 11                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 12           [-1, 6]  1         0  models.common.Concat                    [1]                           
 13                -1  1    361984  models.common.C3                        [512, 256, 1, False]          
 14                -1  1     33024  models.common.Conv                      [256, 128, 1, 1]              
 15                -1  1         0  torch.nn.modules.upsampling.Upsample    [None, 2, 'nearest']          
 16           [-1, 4]  1         0  models.common.Concat                    [1]                           
 17                -1  1     90880  models.common.C3                        [256, 128, 1, False]          
 18                -1  1    147712  models.common.Conv                      [128, 128, 3, 2]              
 19          [-1, 14]  1         0  models.common.Concat                    [1]                           
 20                -1  1    296448  models.common.C3                        [256, 256, 1, False]          
 21                -1  1    590336  models.common.Conv                      [256, 256, 3, 2]              
 22          [-1, 10]  1         0  models.common.Concat                    [1]                           
 23                -1  1   1182720  models.common.C3                        [512, 512, 1, False]          
 24      [17, 20, 23]  1     24273  models.yolo.Detect                      [4, [[10, 13, 16, 30, 33, 23], [30, 61, 62, 45, 59, 119], [116, 90, 156, 198, 373, 326]], [128, 256, 512]]
Model Summary: 270 layers, 7030417 parameters, 7030417 gradients, 15.9 GFLOPs
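
The params column can be checked by hand. The sketch below assumes that models.common.Conv is a bias-free convolution followed by batch norm (one weight and one bias per output channel), and that Detect uses one 1x1 convolution with bias per scale, producing anchors * (nc + 5) output channels. The helper names are hypothetical.

```python
def conv_bn_params(c_in, c_out, k):
    """Parameters of a bias-free k x k convolution plus batch norm (weight + bias)."""
    return c_in * c_out * k * k + 2 * c_out


def detect_params(num_classes, anchors_per_scale, in_channels):
    """Parameters of one 1x1 convolution (with bias) per detection scale."""
    # Each anchor predicts 4 box coordinates + 1 confidence + num_classes scores.
    outputs = anchors_per_scale * (num_classes + 5)
    return sum(c * outputs + outputs for c in in_channels)


layer0 = conv_bn_params(3, 32, 6)                 # row 0:  [3, 32, 6, 2, 2]
layer1 = conv_bn_params(32, 64, 3)                # row 1:  [32, 64, 3, 2]
layer24 = detect_params(4, 3, [128, 256, 512])    # row 24: nc=4, 3 anchors per scale

print(layer0, layer1, layer24)
```

Under these assumptions the three values agree with rows 0, 1 and 24 of the table (3520, 18560, 24273).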


class YoloNetV3(nn.Module) in model.py:

class YOLOv5(nn.Module) in yolo.py:

class Exp(BaseExp): in yolox_base.py:


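  • yolov1: 7 * 7 = 49 predictions
    • 7 = grid size
    • each prediction is a (2*(4+1)+20) = 30-dimensional vector
    • 30 = 2 bounding boxes (predicted without anchor priors) * (4 box coordinates (xywh) + 1 box confidence) + 20 object classes (PASCAL VOC)

      credit ref15

  • yolov2: 13 * 13 * 5 = 845 predictions
    • 13 = 416*416 image / 32x downsampling; 5 = anchor priors from k-means
    • each prediction is a (4+1+20) = 25-dimensional vector
    • 25 = 4 anchor box coordinates (xywh) + 1 box confidence + 20 object classes (PASCAL VOC)

      credit ref16

  • yolov3: 13 * 13 * 3 + 26 * 26 * 3 + 52 * 52 * 3 = 10647 predictions
    • each prediction is a (4+1+80) = 85-dimensional vector
    • 85 = 4 anchor box coordinates (xywh) + 1 box confidence + 80 object classes (COCO)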

      credit ref17
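
The prediction counts and vector sizes listed above can be reproduced with simple arithmetic. A minimal sketch, assuming a 416*416 input for yolov2 and yolov3; grid_predictions is a hypothetical helper name.

```python
def grid_predictions(grid, boxes_per_cell):
    """Number of predictions for a square grid with a fixed number of boxes per cell."""
    return grid * grid * boxes_per_cell


# yolov1: 7x7 grid, one 30-dim vector per cell (2 boxes * 5 values + 20 classes)
v1_count = grid_predictions(7, 1)
v1_dim = 2 * (4 + 1) + 20

# yolov2: 416 / 32 = 13 grid, 5 anchors per cell, one 25-dim vector per anchor
v2_count = grid_predictions(416 // 32, 5)
v2_dim = 4 + 1 + 20

# yolov3: three scales (downsampling 32, 16, 8 gives grids 13, 26, 52), 3 anchors per cell
v3_count = sum(grid_predictions(416 // stride, 3) for stride in (32, 16, 8))
v3_dim = 4 + 1 + 80

print(v1_count, v1_dim)   # 49 30
print(v2_count, v2_dim)   # 845 25
print(v3_count, v3_dim)   # 10647 85
```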



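An image can carry many kinds of features (e.g., color, texture, edge), so the network may need to learn and apply several convolution kernels to extract them.

The figure shows a 4*4*4 conv; the different colors stand for different features / channels / kernel weights.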


credit ref18
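
To make the "one kernel per feature" idea concrete, here is a small pure-Python sketch. It is not the 4*4*4 case from the figure: it assumes a single-channel 4*4 image and two hand-written 2*2 kernels (a learned network would fit these weights itself), and conv2d is a hypothetical helper.

```python
def conv2d(image, kernel):
    """Valid (no padding, stride 1) 2D cross-correlation on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b] for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# 4x4 grayscale image with a vertical edge between columns 1 and 2
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Each kernel responds to a different feature and yields its own output channel.
kernels = {
    "vertical_edge": [[-1, 1], [-1, 1]],
    "horizontal_edge": [[-1, -1], [1, 1]],
}

feature_maps = {name: conv2d(image, kernel) for name, kernel in kernels.items()}
for name, fmap in feature_maps.items():
    print(name, fmap)
```

The vertical-edge kernel fires only in the middle column of its 3*3 output, while the horizontal-edge kernel stays at zero because this image has no horizontal edge.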

resnet (ref24) vs. yolo (ref13)

# resnet
model = models.resnet18(pretrained=True).eval()
result = model.forward(image)
  -> class ResNet(nn.Module):
    -> ResNet.fc():
      -> fc = nn.Linear.forward()
        -> F.linear(input, self.weight, self.bias)
          -> `y = xA^T + b` (so return without box)
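
The last step of the resnet trace, `y = xA^T + b`, can be sketched in plain Python. The feature vector, weights and bias below are made-up numbers for a 3-class toy case, not values from resnet18.

```python
def linear(x, weight, bias):
    """y = x A^T + b for one input vector; weight has shape (out_features, in_features)."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


features = [1.0, 2.0, 3.0]        # hypothetical pooled feature vector
weight = [
    [0.1, 0.0, 0.2],
    [0.0, 0.5, 0.0],
    [0.3, 0.1, 0.1],
]                                 # hypothetical weights, one row per class
bias = [0.0, 0.1, -0.2]

logits = linear(features, weight, bias)

# The classifier only answers "which class": the index of the largest logit.
class_id = max(range(len(logits)), key=logits.__getitem__)
print(logits, class_id)
```

The output is one score per class and nothing else, which is why the classification path returns no box.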

# yolo 
model = yolo.YOLOv5(80, img_sizes=672, score_thresh=0.3)
checkpoint = torch.load('yolov5s.pth')

results, losses = model.forward(images)
  -> class YOLOv5(nn.Module):
    -> yolov5.forward()
        -> features = yolov5.backbone()
        -> results = yolov5.head(features)
            -> class Head(nn.Module):
              -> head.forward()
                -> results = head.inference(features)
                    -> head.predictor()
                        -> class Predictor(nn.Module):
                          -> mlp
                    -> results.append(dict(boxes=box, labels=label, scores=score))
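
For contrast with the classifier, the detection path returns a dict of boxes, labels and scores per image. The sketch below shows how a threshold such as score_thresh=0.3 might be applied to that dict. The detections are invented, filter_by_score is a hypothetical helper, and the real head may filter differently (for example with NMS as well).

```python
def filter_by_score(result, score_thresh):
    """Keep only detections whose score is at least score_thresh."""
    keep = [i for i, score in enumerate(result["scores"]) if score >= score_thresh]
    return {key: [values[i] for i in keep] for key, values in result.items()}


# Hypothetical detections for one image, in the same dict layout as the trace above.
result = dict(
    boxes=[[10, 20, 110, 220], [30, 40, 60, 80], [5, 5, 50, 50]],  # x1, y1, x2, y2
    labels=[0, 2, 1],
    scores=[0.92, 0.12, 0.45],
)

kept = filter_by_score(result, 0.3)
print(kept)
```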

Custom training model (ref25)

  1. download my_face_plate.zip (yolo format) from google drive
  2. git clone draw-YOLO-box to validate the annotation information
  3. git clone yolov5
  4. add ./yolov5/data/my_face_plate.yaml
  5. run train.py with proper parameters (ref26)
    time python train.py --img 1120 --batch 64 --epochs 900 --data my_face_plate.yaml --weights yolov5s.pt --cache --project _mytrain/my_face_plate
    a higher --epochs tends to improve accuracy until the model starts to overfit; one epoch = one full pass of the dataset forward and backward through the network
    a higher --batch may improve accuracy up to a point; it is the number of images processed in one training step (one weight update)
    a higher --img helps avoid missing small objects; a warning is raised when many objects are missed
  6. run model.forward() to verify
  7. inference server setup
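
The relation between epoch, batch and iteration in step 5 can be worked out numerically. A minimal sketch: the dataset size of 1000 images is an assumption (the source does not give it), and gradient accumulation is ignored.

```python
import math


def iterations_per_epoch(num_images, batch_size):
    """Number of batches (weight updates) needed to see every image once."""
    return math.ceil(num_images / batch_size)


num_images = 1000   # hypothetical dataset size, not from the source
batch_size = 64     # --batch 64
epochs = 900        # --epochs 900

per_epoch = iterations_per_epoch(num_images, batch_size)
total_updates = per_epoch * epochs

print(per_epoch, total_updates)   # 16 14400
```

The last batch of each epoch is smaller than the rest (1000 = 15 * 64 + 40), which is why the division is rounded up.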


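references

  1. Download and install Anaconda in two lines of code
  2. Configuring the Tsinghua mirror for conda
  3. Installing PyTorch on Mac
  4. Installing the deep learning framework PyTorch on Linux
  5. How to use multiple GPUs in pytorch?
  6. Setting up a remote Jupyter Notebook cloud server
  7. Configuring jupyter notebook on a linux server, with different kernels and environments
  8. How to Train YOLO v5 on a Custom Dataset
  9. How to Train a YOLO v5 Object Detector on a Custom dataset
  10. YOLOX: training on a custom dataset
  11. my-yolov5-road.ipynb in colab
  12. yolov3 network
  13. yolov5 network
  14. yolox network
  15. YOLOv1 in depth
  16. YOLOv2 in depth
  17. YOLOv3 in depth
  18. Reducing and increasing dimensionality with convolution kernels
  19. What is an anchor in object detection?
  20. Non-maximum suppression (NMS) explained
  21. Explanation of deep learning terms such as backbone, head, and neck
  22. Data augmentation, backbone, head, neck, loss function
  23. backbone, head, neck summary
  24. Inference_PyTorch
  25. yolov5 Train-Custom-Data
  26. [Deep learning, three concepts: Epoch, Batch, Iteration](https://www.jianshu.com/p/22c50ded4cf7)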