2020년 9월 3일 목요일

Xavier NX - Remote Python Development using VSCode

What kind of development tools do you use?
Do you connect the keyboard and monitor directly to Jetson series, or do you develop the solution from a remote computer?

I develop it through remote access on my Windows 10 laptop unless it's a special occasion.
In addition, I use development tools sometimes cli editors such as vi, nano, etc., but use Notepad++ or VSCode more frequently.
These GUI development tools provide an expansion pack for editing and managing files on remote computers. In particular, VSCode, Microsoft's free editing framework, allows you to modify source code as well as remote GUI debugging. I think many people felt the need for debugging while developing machine learning tools such as OpenCV, Tensorflow, PyTroch, and TensorRT. I also often felt that I needed the ability to test and debug while editing the source code on a remote computer. I think VSCode is probably the best development tool for this purpose.

Install VSCode on your laptop

VSCode can be installed on Windows, Mac, and Linux. It can even be installed in Jetson Nano. I use Windows 10, but Mac and Linux users will be able to use it the same way.

Laptop with VSCode (Windows or Mac)

Install VSCode Extensions on your laptop

Install Python Extensions on your laptop

Run VSCode first, then press the Extensions search button.

Then search for Python and install it. Python extension is essential for developing Python in VSCode.

Install Remote Extensions on your laptop

Then install 'Remote Development Extension' to edit the source code of the remote Jetson series. When Microsoft first introduced the Remote Development Extension, it supported only the ssh key exchange method for remote access. I once posted on October 17, 2019 under the title of Remote Python Debugging with VSCode. This article explains how to connect to a remote computer using the ssh key exchange method. This method is still valid. However, not only ssh key exchange but also id/ password authentication is possible now.

Tips : In October 17, 2019 writings, VSCode Insider was used. VSCode Insider is an experimental version of the latest features. We used VSCode Insider because it was only a short time after the Remote Development Extensions came out. However, the Remote Development Extensions are now fully stabilized and can be used in normal VSCodes, not in the Insider version.

Search for and install Remote Development. This extension installs Remote -WSL, Remote-Contains, and Remote-SSH.

Setup Remote Jetson SSH connection

On the October 17, 2019 blog, the only way to access a remote server using ssh is to use an ssh key. However, this method has several drawbacks.

If the remote server information is changed, a new ssh key must be created. I mainly work on the Raspberry Pi and NVidia Jetson series. These single board computers use SD cards as storage space. Therefore, the SD card image is frequently replaced. When the SD card is replaced, the previously created ssh keys are no longer valid. Therefore, id/password authentication is often convenient.

Since VSCode is now able to access ssh in the id/password method, this disadvantage is eliminated. Now, the config file for ssh connection in VSCode was created as follows.

First, boot the Jetson Xavier to make it accessible.

Old Method (ssh key exchange)

Run VSCode and press Ctrl + Shift + P to search for the extension command. Then select Remote-SSH: Connect to Host ... Then select C: \ Users \ Users.ssh \ config. And add the following line: Modify the IP, User, and Private Key routes as appropriate. Refer to https://tipspiggy.blogspot.com/2019/10/remote-python-debugging-with-vscode.html for key generation method.

# Read more about SSH config files: https://linux.die.net/man/5/ssh_config
Host JetsonXavierNX
    HostName 192.168.11.96
    User spypiggy
    IdentityFile ~/.ssh/jumpbox

For instructions on how to create a Jumpbox file, see https://tipspiggy.blogspot.com/2019/10/remote-python-debugging-with-vscode.html.

New Method (id/password)

# Read more about SSH config files: https://linux.die.net/man/5/ssh_config
Host JetsonXavierNX
    HostName 192.168.11.96
    User spypiggy

Now let's try new method.

press Ctrl + Shift + P to search for the "Remote-SSH:Connect to host ..." command. Then select your host that you just configured.

Then you will probably be prompted for the user's password. If you enter the correct password, you can connect to the remote host.

If you press the Explorer button and then the Open Folder button, you can specify the working directory of the remote Jetson Xavier.

Install extensions on the remote host

Remote debugging requires that you install an extension that enables remote debugging on the remote computer . Press the Extensions button or Ctrl + Shift + P and search Python again. Earlier we installed the Python module. This time we see a new "Install in SSH: JetsonXavierNX" button. This means installing the VSCode Python extension on the remote host Xavier NX. Clicking this button installs a remote extension for Python debugging on the Remote Xavier NX.

Tips : Installation of Python extensions on remote Xavier NX requires a Remote-SSH connection first..

Install Python Extensions on remote Xavier NX

If you have never installed the VSCode Python extension on a remote Xavier NX, the following screen will appear:

If you select the Python extension in the VSCode window connected to the remote host, options for the remote host appear. Click "Install in SSH: JetsonXavierNX" to install the Python extension on the remote host.

At the end of the installation, you can verify that the Python extension is installed on the remote Xavier NX as follows:

After installation, reload the VSCode workspace. You are now ready for Python remote debugging.

Perhaps you will be asked to select the Python version to be used by the remote host as follows:

Do not use Python 2. I am using Python 3 virtual environment in Xavier NX in ~/python directory. Therefore, you will choose the last ~/ python/bin.

Then select New Terminal from the terminal menu.

When the terminal opens, you can see that the Python virtual environment is automatically run.

Code completion with IntelliSense

Most users who are familiar with IDE environments such as MS Visual Studio love to use code completion features. Code auto-completion not only speeds up development, but also greatly improves productivity by dramatically reducing the amount of typographical errors. Although the Linux development environment has some of these code autocompletion features, unlike the IDE on Windows and Mac, they often have limited functionality or require a paid product. In particular, remote development environments rarely provide this functionality. Using a remote control tool such as VNC is far from the remote development environment described in this article.

After installing the remote Python debugging module, VSCode can now write Python code that works on the remote computer using autocomplete. It is a new and powerful feature that you will never experience using the VI editor on a SSH terminal.

To use the code autocomplete feature in OpenCV, ...

I'm using Python virtual environment on Xavier NX. Packages such as Tensorflow installed in a virtual environment do not have problems using autocomplete feature. But OpenCV has some problems.
JetPack 4.4 is provided with OpenCV 4.1 as standard. The location where this package is installed is as follows.

(python) spypiggy@XavierNX:~/src$ python
Python 3.6.9 (default, Jul 17 2020, 12:50:27)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
cv>>> cv2.__file__
'/usr/lib/python3.6/dist-packages/cv2/python-3.6/cv2.cpython-36m-aarch64-linux-gnu.so'

>>> import tensorflow as tf
2020-09-02 10:30:14.846743: I tensorflow/stream_executor/platform/default/dso_loader.cc:48] Successfully opened dynamic library libcudart.so.10.2
WARNING:tensorflow:Deprecation warnings have been disabled. Set TF_ENABLE_DEPRECATION_WARNINGS=1 to re-enable them.
>>> tf.__file__
'/home/spypiggy/python/lib/python3.6/site-packages/tensorflow/__init__.py'

Since Tensorflow was installed in a virtual environment, packages exist in /home/spypyggy/python/lib/python3.6/site-packages.
However, the OpenCV provided by JetPack4.4 has an installation location of /usr/lib/python3.6/dist-packages/cv2/python-3.6/.

To apply autocomplete to packages installed outside of a virtual environment, some of the configuration files in VSCode need to be modified. The .vscode directory is created in the remote host's connection directory, which is accessed using Remote-SSH in VSCode, as illustrated. And the settings.json file is automatically created.

<settings.json file in the .vscode directory>

To enable IntelliSense for packages that are installed in other, non-standard locations, add those locations to the python.autoComplete.extraPaths collection in the settings file (the default collection is empty). Modify json file as follows. As you can see from the json file, Python.autoComplete.extraPaths can add multiple paths.

{
    "python.pythonPath": "/home/spypiggy/python/bin/python",
    "python.autoComplete.extraPaths": [
        "/usr/lib/python3.6/dist-packages/cv2/python-3.6"]
}

If you run VSCode again and test it, you can see that the automatic code completion function works in OpenCV.

Details can be found in the Editing Python in Visual Studio Code.

Tips : The above is the case when Python virtual environment is used. If you do not use a virtual environment and use /usr/bin/python3 as the default Python interpreter, you do not need to modify the settings.json file described earlier. However, there will be times when it will be useful to know the autocomplete features of VSCode through settings.json file modifications.

Remote Debugging

You can now set breakpoints by hovering your mouse next to the left number in the Python source code. Create a breakpoint at the desired location. Then press the debugger button and then the start debugger button. When asked what to debug, select Python File. This will automatically run the Python file in the current window. The program automatically stops at the breakpoint you specify. At this point, you can check the variable values with the mouse and monitor the variable values step by step using the left window.

Wrapping Up

The Python debugging function provided by VSCode is fantastic. Especially on your laptop, you can connect the remote Jetson series to do Python debugging using a GUI.
In particular, it can dramatically improve productivity because it can check the value of objects at any time while executing machine learning Python code.
VSCode's remote Python code autocomplete feature and GUI debugging are available on both Jetson Nano, TX2, and Xavier NX with the JetPack series installed.

2020년 8월 27일 목요일

Xavier NX - OpenCV 4.5

Xavier NX only works with JetPack 4.4 or later. And JetPack 4.4 is provided with OpenCV 4.1.1 installed. In most cases, there is no problem with using OpenCV 4.1.1. However, from OpenCV 4.2+, Super Resolution function is provided as C/C++ API. And from 4.3, it provides Python API. And finally, in 4.4, CUDA GPU acceleration is available. Therefore, it is recommended to use version 4.4 or higher to fully use the Super Resolution function provided by OpenCV. Therefore, to use OpenCV's SuperResolution, you need to delete the OpenCV 4.1.1 version of JetPack 4.4 and install 4.4 newly.

And from OpenCV 4.2, the dnn module started supporting Nvidia GPUs. In previous versions, only the CPU was available. The Jetson series has NVidia GPUs, so if you are using OpenCV's dnn, it is recommended to upgrade to version 4.2 or higher. OpenCV version 4.2 or higher uses NVidia GPU to speed up the inference.

If you are not interested in OpenCV's SuperResolution feature or not using OpenCV dnn module, you do not necessarily need to upgrade to OpenCV 4.4 or 4.5.

Increase swap memory

When you build a large software packages like OpenCV on Jetson Nano, you may experience an out of memory phenomenon. Increasing the swap file size can prevent this malfunction. If you are using Jetson Xavier NX, you don't have to do this.

git clone https://github.com/JetsonHacksNano/installSwapfile
cd installSwapfile
./installSwapfile.sh

Above script file will increase 6GB swap files. You can change the swap file size by modifying the scripts. If you want to uninstall the swap setting, open the fstab file and delete the swap file line and reboot.

Installing OpenCV 4.5

The description from now on can also be applied to Jetson Nano and Jetson TX2 using JetPack 4.5.

One thing to note is -D CUDA_ARCH_BIN="7.2", which specifies CUDA Architecture in the process of creating Makefile using cmake. Nano, TX2, and Xavier have different values. These values are described in JetsonNano-Useful tips before using Nano.

Compute capability (version)	Micro-architecture	GPUs	Tegra,Jetson
3.2	Kepler	GK20A	Tegra K1, Jetson TK1
5.3	Maxwell	GM20B	Tegra X1, Jetson TX1, Jetson Nano
6.2	Pascal	GP10B	Tegra X2, Jetson TX2
7.2	Volta	GV10B	Tegra Xavier, Jetson Xavier NX, Jetson AGX Xavier

The following is a script file that can be easily installed.

#!/bin/bash
#
# Copyright (c) 2018, NVIDIA CORPORATION.  All rights reserved.
#
# NVIDIA Corporation and its licensors retain all intellectual property
# and proprietary rights in and to this software, related documentation
# and any modifications thereto.  Any use, reproduction, disclosure or
# distribution of this software and related documentation without an express
# license agreement from NVIDIA Corporation is strictly prohibited.
#

if [ "$#" -ne 1 ]; then
    echo "Usage: $0 <Install Folder>"
    exit
fi
folder="$1"
user="nvidia"
passwd="nvidia"

echo "** Remove OpenCV4.1 first"
sudo apt-get purge *libopencv*

echo "** Install requirement"
sudo apt-get update
sudo apt-get install -y build-essential cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev
sudo apt-get install -y libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev
sudo apt-get install -y python2.7-dev python3.6-dev python-dev python-numpy python3-numpy
sudo apt-get install -y libtbb2 libtbb-dev libjpeg-dev libpng-dev libtiff-dev libjasper-dev libdc1394-22-dev
sudo apt-get install -y libv4l-dev v4l-utils qv4l2 v4l2ucp
sudo apt-get install -y curl
sudo apt-get update

echo "** Download opencv-4.5.1"
cd $folder
curl -L https://github.com/opencv/opencv/archive/4.5.1.zip -o opencv-4.5.1.zip
curl -L https://github.com/opencv/opencv_contrib/archive/4.5.1.zip -o opencv_contrib-4.5.1.zip
unzip opencv-4.5.1.zip 
unzip opencv_contrib-4.5.1.zip 
cd opencv-4.5.1/

echo "** Building..."
mkdir release
cd release/
cmake -D WITH_CUDA=ON -D ENABLE_PRECOMPILED_HEADERS=OFF  -D CUDA_ARCH_BIN="7.2" -D CUDA_ARCH_PTX="" -D OPENCV_EXTRA_MODULES_PATH=../../opencv_contrib-4.4.0/modules -D WITH_GSTREAMER=ON -D WITH_LIBV4L=ON -D BUILD_opencv_python2=ON -D BUILD_opencv_python3=ON -D BUILD_TESTS=OFF -D BUILD_PERF_TESTS=OFF -D BUILD_EXAMPLES=OFF -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local ..
make -j6
sudo make install

echo "** Install opencv-4.5.1 successfully"
echo "** Bye :)"

<opencv4.5_xavier_nx.sh>

I put the script file on my github. And since the script that changed CUDA_ARCH_BIN value for Jetson Nano and TX2 is also uploaded, you can use the script suitable for the Jetson series you are using.

Now run the script

spypiggy@XavierNX:~/src$ ./opencv4.5_xavier_nx.sh /home/spypiggy/src/

The script file needs a parameter for source code download directory. The script file takes a lot of time to build OpenCV 4.5. If you are not the root user, you may be asked to enter a password to obtain sudo privileges in the middle.

spypiggy@XavierNX:~/src$ python3
Python 3.6.9 (default, Jul 17 2020, 12:50:27)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> from cv2 import dnn_superres
>>> cv2.__version__
'4.5.1'

If the installation is finished without error, you can check OpenCV 4.5 as follows. It can be seen that dnn_superres for Super Resolution is also normally imported.

Wrapping Up

It is not difficult to upgrade OpenCV to OpenCV 4.5 on JetPack 4.5. Upgrade is possible only with the script file introduced earlier. If you want to use the new features of Ver 4.5 or need to use CUDA, you can upgrade OpenCV to 4.5.

To install OpenCV 4.5, refer to "JetPack 4.5-OpenCV 4.1 vs. OpenCV 4.5".

2020년 8월 17일 월요일

Xavier NX-DeepStream 5.0 #5 - Running YOLOv4 on DeepStream 5.0(Experimental test)

I do not recommend applying YOLOv4 to DeepStream 5.0. The following article is an experimental test of YOLOv4. The YOLO versions currently officially supported by DeepStream 5.0 are 2 and 3. Using VOLOV3 is the easiest.

So far, I have mainly used primary detector using resnet10, caffe model. This model recognizes four things: people, cars, bicycles, and road signs. YoloV4 is claims to have state-of-the-art accuracy while maintains a high processing frame rate. It achieves an accuracy of 43.5% AP (65.7% AP₅₀) for the MS COCO with an approximately 65 FPS inference speed on Tesla V100. You can find more information on YOLOv4 on YOLOv4 github .

I have described in my previous blog the installation and implementation of YOLOv4 in Jetson Nano. These contents can be applied without change to Xavier NX.

Prerequisites

Several blogs have explained how to implement DeepStream using Python in Xavier NX. Since DeepStream can be used in all Jetson series using JetPack, the contents described here can be used in Jetson Nano, TX2, and Xavier. This time, we will cover the necessary contents to apply the DeepStream installed and tested earlier to your application. Please be sure to read the preceding articles first.

DeepStream 5.0 only supports VOLOV3. Therefore, supporting YOLOV4 in DeepStream 5.0 requires some work.

To create the TensorRT model required by DeepStream, the following steps are required.

Convert the Yolov4 Darknet model to standard ONNX.
Convert ONNX model to TensorRT model.

Create YOLOV4 TensorRT Model

First, download YOLOv4 from Tianxiaomo' github, which is implemented as a PyTorch model and provides conversion scripts.

#First activate python virtual environment
spypiggy@XavierNX:~$ source /home/spypiggy/python/bin/activate
(python) spypiggy@XavierNX:~$ cd src
#Download PyTorch version of YOLOv4 to converting ONNX model, TensorRT model
(python) spypiggy@XavierNX:~/src$ git clone https://github.com/Tianxiaomo/pytorch-YOLOv4.git
(python) spypiggy@XavierNX:~/src$ cd pytorch-YOLOv4
(python) spypiggy@XavierNX:~/src/pytorch-YOLOv4$ pip3 install onnxruntime
(python) spypiggy@XavierNX:~/src/pytorch-YOLOv4$ wget https://github.com/AlexeyAB/darknet/releases/rknet_yolo_v3_optimal/yolov4.weights

Convert DarkNet Model to ONNX Model

Demo_darknet2onnx in the phytorch-YOLOV4 directory.Convert to ONNX model using py file. This command creates a new yolov4_1_3_608_608_static.onnx file.

(python) spypiggy@XavierNX:~/src/pytorch-YOLOv4$ python3 demo_darknet2onnx.py ./cfg/yolov4.cfg yolov4.weights ./data/giraffe.jpg 1

Convert ONNX Model to TensorRT Model

If the yolov4_1_3_608_608_608_static.onnx file is created properly, the next step is to convert this model to a TensorRT model. This process takes a lot of time. You can have a coffee break.

(python) spypiggy@XavierNX:~/src/pytorch-YOLOv4$ /usr/src/tensorrt/bin/trtexec --onnx=yolov4_1_3_608_608_static.onnx --explicitBatch --saveEngine=yolov4_1_3_608_608_fp16.engine --workspace=4096 --fp16

(python) spypiggy@XavierNX:~/src/pytorch-YOLOv4$ ls -al
total 639520
drwxrwxr-x  7 spypiggy spypiggy      4096 Aug 16 22:49 .
drwxr-xr-x 13 spypiggy spypiggy      4096 Aug 16 10:44 ..
drwxrwxr-x  2 spypiggy spypiggy      4096 Aug 16 21:50 cfg
-rw-rw-r--  1 spypiggy spypiggy      1573 Aug 16 10:44 cfg.py
drwxrwxr-x  2 spypiggy spypiggy      4096 Aug 16 10:44 data
-rw-rw-r--  1 spypiggy spypiggy     16988 Aug 16 10:44 dataset.py
drwxrwxr-x  3 spypiggy spypiggy      4096 Aug 16 11:42 DeepStream
-rw-rw-r--  1 spypiggy spypiggy      2223 Aug 16 10:44 demo_darknet2onnx.py
-rw-rw-r--  1 spypiggy spypiggy      4588 Aug 16 10:44 demo.py
-rw-rw-r--  1 spypiggy spypiggy      3456 Aug 16 10:44 demo_pytorch2onnx.py
-rw-rw-r--  1 spypiggy spypiggy      2974 Aug 16 10:44 demo_tensorflow.py
-rw-rw-r--  1 spypiggy spypiggy      7191 Aug 16 10:44 demo_trt.py
-rw-rw-r--  1 spypiggy spypiggy     11814 Aug 16 10:44 evaluate_on_coco.py
drwxrwxr-x  8 spypiggy spypiggy      4096 Aug 16 10:44 .git
-rw-rw-r--  1 spypiggy spypiggy       131 Aug 16 10:44 .gitignore
-rw-rw-r--  1 spypiggy spypiggy     11560 Aug 16 10:44 License.txt
-rw-rw-r--  1 spypiggy spypiggy     17061 Aug 16 10:44 models.py
-rw-rw-r--  1 spypiggy spypiggy    237646 Aug 16 10:57 predictions_onnx.jpg
-rw-rw-r--  1 spypiggy spypiggy     10566 Aug 16 10:44 README.md
-rw-rw-r--  1 spypiggy spypiggy       158 Aug 16 10:44 requirements.txt
drwxrwxr-x  4 spypiggy spypiggy      4096 Aug 16 10:55 tool
-rw-rw-r--  1 spypiggy spypiggy     28042 Aug 16 10:44 train.py
-rw-rw-r--  1 spypiggy spypiggy      3111 Aug 16 10:44 Use_yolov4_to_train_your_own_data.md
-rw-rw-r--  1 spypiggy spypiggy 138693567 Aug 16 12:58 yolov4_1_3_608_608_fp16.engine
-rw-rw-r--  1 spypiggy spypiggy 258030612 Aug 16 10:57 yolov4_1_3_608_608_static.onnx
-rw-rw-r--  1 spypiggy spypiggy 257717640 Apr 27 08:35 yolov4.weights

If no errors have occurred, you can check the newly created yolov4_1_3_608_608_fp16. engine, yolov4_1_3_608_608_static.onnx file. I will use the model yolov4_1_3_608_608_fp16. engine model made for TensorRT.

Rebuild objectDetector_Yolo

DeepStream provides plug-ins that can use YOLOv2, v3. We need to rebuild this code after some modification to use YOLOv4. The source code location is /opt/nvidia/deepstream/sources/objectDetector_Yolo. The previously downloaded https://github.com/Tianxiaomo/pytorch-YOLOv4 also provides a source code that has modified YOLOV4 to be used in DeepStream 5.0. Anything is fine, but I will modify the source code of the directory where DeepStream 5.0 was installed. The two source codes are essentially the same and I used most of the Tianxiao code. However, only a few changes have been made to maintain the coding style with the existing file.

If you wan to use the original source code, modify the /opt/nvidia/deepstream/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/nvdsparsebbox_Yolo.cpp.

If you want to use the Tianxiaomo's github code, build without changes in the /home/spypiggy/src/pytorch-YOLOv4/DeepStream/nvdsinfer_custom_impl_Yolo/ directory.

You can download the source code(nvdsparsebbox_Yolo.cpp) at https://github.com/raspberry-pi-maker/NVIDIA-Jetson/tree/master/DeepStream 5.0.

Makefile to be used for build requires environment variable CUDA_VER. Because we are using JetPack 4.4, the CUDA version is 10.2.
Therefore, export the environment variable CUDA_VER before building as follows.

spypiggy@XavierNX:~$cd /opt/nvidia/deepstream/deepstream/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo
spypiggy@XavierNX:/opt/nvidia/deepstream/deepstream/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo$ export CUDA_VER=10.2
#Before build, copy the nvdsparsebbox_Yolo.cpp from https://github.com/raspberry-pi-maker/NVIDIA-Jetson/tree/master/DeepStream 5.0
spypiggy@XavierNX:/opt/nvidia/deepstream/deepstream/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo$ sudo make clean
spypiggy@XavierNX:/opt/nvidia/deepstream/deepstream/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo$ sudo make
spypiggy@XavierNX:/opt/nvidia/deepstream/deepstream/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo$ ls -al
total 1912
drwxr-xr-x 2 root root   4096 Aug 16 21:21 .
drwxr-xr-x 3 root root   4096 Aug  6 08:12 ..
-rw-r--r-- 1 root root   3373 Jul 27 04:19 kernels.cu
-rw-r--r-- 1 root root  16760 Aug 16 11:16 kernels.o
-rwxr-xr-x 1 root root 786176 Aug 16 21:17 libnvdsinfer_custom_impl_Yolo.so
-rw-r--r-- 1 root root   2319 Jul 27 04:19 Makefile
-rw-r--r-- 1 root root   4101 Jul 27 04:19 nvdsinfer_yolo_engine.cpp
-rw-r--r-- 1 root root  14600 Aug 16 11:16 nvdsinfer_yolo_engine.o
-rw-r--r-- 1 root root  20867 Aug 16 21:16 nvdsparsebbox_Yolo.cpp
-rw-r--r-- 1 root root 270344 Aug 16 21:17 nvdsparsebbox_Yolo.o
-rw-r--r-- 1 root root  16571 Jul 27 04:19 trt_utils.cpp
-rw-r--r-- 1 root root   3449 Jul 27 04:19 trt_utils.h
-rw-r--r-- 1 root root 208176 Aug 16 11:16 trt_utils.o
-rw-r--r-- 1 root root  20099 Jul 27 04:19 yolo.cpp
-rw-r--r-- 1 root root   3242 Jul 27 04:19 yolo.h
-rw-r--r-- 1 root root 498632 Aug 16 11:16 yolo.o
-rw-r--r-- 1 root root   3961 Jul 27 04:19 yoloPlugins.cpp
-rw-r--r-- 1 root root   5345 Jul 27 04:19 yoloPlugins.h
-rw-r--r-- 1 root root  38024 Aug 16 11:16 yoloPlugins.o

If the libnvdsinfer_custom_impl_Yolo.so file is newly created, it is a success. This file will later be used critically in the pipeline of DeepStream 5.0.

Configuration Files

/home/spypiggy/src/pytorch-YOLOV4/DeepStream
You can find two configuration files in the directory. It is necessary to modify the path of the model file in this file.

spypiggy@XavierNX:~/src/pytorch-YOLOv4/DeepStream$ pwd
/home/spypiggy/src/pytorch-YOLOv4/DeepStream
spypiggy@XavierNX:~/src/pytorch-YOLOv4/DeepStream$ ls -al
total 28
drwxrwxr-x 3 spypiggy spypiggy 4096 Aug 16 11:42 .
drwxrwxr-x 7 spypiggy spypiggy 4096 Aug 16 22:49 ..
-rw-rw-r-- 1 spypiggy spypiggy 3680 Aug 16 21:25 config_infer_primary_yoloV4.txt
-rw-rw-r-- 1 spypiggy spypiggy 4095 Aug 16 21:48 deepstream_app_config_yoloV4.txt
-rw-rw-r-- 1 spypiggy spypiggy  621 Aug 16 10:44 labels.txt
drwxrwxr-x 2 spypiggy spypiggy 4096 Aug 16 22:50 nvdsinfer_custom_impl_Yolo
-rw-rw-r-- 1 spypiggy spypiggy  504 Aug 16 10:44 Readme.md

The following setup files have been routed to suit my environment. If your installation path is different, please correct it accordingly. I used sink0, sink1 in the dipstream_app_config_yoloV4.txt file. sink0 is for screen output, sink1 is for file output.

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
#0=RGB, 1=BGR
model-color-format=0
#custom-network-config=/home/spypiggy/src/pytorch-YOLOv4/cfg/yolov4.cfg
# model-file=yolov3-tiny.weights
model-engine-file=/home/spypiggy/src/pytorch-YOLOv4/yolov4_1_3_608_608_fp16.engine
labelfile-path=/opt/nvidia/deepstream/deepstream/sources/objectDetector_Yolo/labels.txt
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=80
gie-unique-id=1
network-type=0
#is-classifier=0
## 0=Group Rectangles, 1=DBSCAN, 2=NMS, 3= DBSCAN+NMS Hybrid, 4 = None(No clustering)
cluster-mode=4
maintain-aspect-ratio=1
parse-bbox-func-name=NvDsInferParseCustomYoloV4
custom-lib-path=/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet
#scaling-filter=0
#scaling-compute-hw=0

[class-attrs-all]
nms-iou-threshold=0.6
pre-cluster-threshold=0.4

<config_infer_primary_yoloV4.txt>

[application]
enable-perf-measurement=1
perf-measurement-interval-sec=5
#gie-kitti-output-dir=streamscl

[tiled-display]
enable=0
rows=1
columns=1
width=1280
height=720
gpu-id=0
#(0): nvbuf-mem-default - Default memory allocated, specific to particular platform
#(1): nvbuf-mem-cuda-pinned - Allocate Pinned/Host cuda memory, applicable for Tesla
#(2): nvbuf-mem-cuda-device - Allocate Device cuda memory, applicable for Tesla
#(3): nvbuf-mem-cuda-unified - Allocate Unified cuda memory, applicable for Tesla
#(4): nvbuf-mem-surface-array - Allocate Surface Array memory, applicable for Jetson
nvbuf-memory-type=0

[source0]
enable=1
#Type - 1=CameraV4L2 2=URI 3=MultiURI
type=3
uri=file:/opt/nvidia/deepstream/deepstream-5.0/samples/streams/sample_1080p_h264.mp4
#uri=file:/opt/nvidia/deepstream/deepstream-5.0/samples/streams/sample_720p.h264
num-sources=1
gpu-id=0
# (0): memtype_device   - Memory type Device
# (1): memtype_pinned   - Memory type Host Pinned
# (2): memtype_unified  - Memory type Unified
cudadec-memtype=0

#For Screen Output
[sink0]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=2
sync=0
source-id=0
gpu-id=0
nvbuf-memory-type=0

#For File Output
[sink1]
enable=1
#Type - 1=FakeSink 2=EglSink 3=File
type=3
sync=0
source-id=0
gpu-id=0
nvbuf-memory-type=0
#1=mp4 2=mkv
container=1
#1=h264 2=h265
codec=1
output-file=yolov4.mp4

[osd]
enable=1
gpu-id=0
border-width=1
text-size=12
text-color=1;1;1;1;
text-bg-color=0.3;0.3;0.3;1
font=Serif
show-clock=0
clock-x-offset=800
clock-y-offset=820
clock-text-size=12
clock-color=1;0;0;0
nvbuf-memory-type=0

[streammux]
gpu-id=0
##Boolean property to inform muxer that sources are live
live-source=0
batch-size=1
##time out in usec, to wait after the first buffer is available
##to push the batch even if the complete batch is not formed
batched-push-timeout=40000
## Set muxer output width and height
width=1280
height=720
##Enable to maintain aspect ratio wrt source, and allow black borders, works
##along with width, height properties
enable-padding=0
nvbuf-memory-type=0

# config-file property is mandatory for any gie section.
# Other properties are optional and if set will override the properties set in
# the infer config file.
[primary-gie]
enable=1
gpu-id=0
model-engine-file=/home/spypiggy/src/pytorch-YOLOv4/yolov4_1_3_608_608_fp16.engine
labelfile-path=labels.txt
#batch-size=1
#Required by the app for OSD, not a plugin property
bbox-border-color0=1;0;0;1
bbox-border-color1=0;1;1;1
bbox-border-color2=0;0;1;1
bbox-border-color3=0;1;0;1
interval=0
gie-unique-id=1
nvbuf-memory-type=0
config-file=/home/spypiggy/src/pytorch-YOLOv4/DeepStream/config_infer_primary_yoloV4.txt

[tracker]
enable=0
tracker-width=512
tracker-height=320
ll-lib-file=/opt/nvidia/deepstream/deepstream-5.0/lib/libnvds_mot_klt.so

[tests]
file-loop=0

<deepstream_app_config_yoloV4.txt>

Test YOLOV4 in DeepStream

All the preparations for testing YOLOV4 in DeepStream are complete. You can now test it to work properly with the dipstream-app command.

Run the YOLOv4 in DeepStream 5.0.

spypiggy@XavierNX:~/src/pytorch-YOLOv4$ deepstream-app -c ./Deepstream/deepStream_app_config_yoloV4.txt

I can see that it works properly. And when the program ends, you will see that the yolov4.mp4 file is also created.

Tips : If you look at the picture above, duplicate recognition problems occur. This problem can be solved by properly raising the threshold values in the configuration file. Threshold values should be determined by testing, but it is easy to determine between 0.6 and 0.9.

[class-attrs-all]
nms-iou-threshold=0.7
pre-cluster-threshold=0.7

Applying YOLOV4 tiny model in DeepStream

After changing the YOLOV4 DarkNet model to the ONNX model, work on the tiny model as the sequence in which the YOLOV4 DarkNet model was changed back to the TensorRT model.

(python) spypiggy@XavierNX:~/src/pytorch-YOLOv4$ wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v4_pre/yolov4-tiny.weights

(python) spypiggy@XavierNX:~/src/pytorch-YOLOv4$ python3 demo_darknet2onnx.py ./cfg/yolov4-tiny.cfg yolov4-tiny.weights ./data/giraffe.jpg 1

The above command will probably generate the file yolov4_1_3_416_416_static.onnx. Now convert yolov4_1_3_416_416_static.onnx file to TensorRT model.

(python) spypiggy@XavierNX:~/src/pytorch-YOLOv4$ /usr/src/tensorrt/bin/trtexec --onnx=yolov4_1_3_416_416_static.onnx --explicitBatch --saveEngine=yolov4_1_3_416_416_fp16.engine --workspace=4096 --fp16

The above command will probably generate the file yolov4_1_3_416_416_static.engine.

Then create confguration file for tiny model. The python programs make their own pipelines. Therefore, the previously used dipstream_app_config_yoloV4.txt file is no longer required.

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
#0=RGB, 1=BGR
model-color-format=0
#custom-network-config=/home/spypiggy/src/pytorch-YOLOv4/cfg/yolov4-tiny.cfg
# model-file=yolov3-tiny.weights
model-engine-file=/home/spypiggy/src/pytorch-YOLOv4/yolov4_1_3_416_416_fp16.engine
labelfile-path=/opt/nvidia/deepstream/deepstream/sources/objectDetector_Yolo/labels.txt
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=2
num-detected-classes=80
gie-unique-id=1
network-type=0
#is-classifier=0
## 0=Group Rectangles, 1=DBSCAN, 2=NMS, 3= DBSCAN+NMS Hybrid, 4 = None(No clustering)
cluster-mode=4
maintain-aspect-ratio=1
parse-bbox-func-name=NvDsInferParseCustomYoloV4
custom-lib-path=/opt/nvidia/deepstream/deepstream-5.0/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet
#scaling-filter=0
#scaling-compute-hw=0

[class-attrs-all]
nms-iou-threshold=0.6
pre-cluster-threshold=0.4

<config_infer_primary_yoloV4_tiny.txt>

Run the YOLOv4-tiny in DeepStream 5.0.

spypiggy@XavierNX:~/src/pytorch-YOLOv4$ deepstream-app -c ./Deepstream/deepStream_app_config_yoloV4_tiny.txt

You will feel much faster processing speed than YOLOV4. Instead, the accuracy is slightly lower than that of YOLOv4.

DeepStream YOLOv4 Python implementation

Earlier I tested YOLOV4 using 'deepstream-app' program provided by DeepStream. But my ultimate goal is to implement YOLOV4 in the Python program. And as in previous examples, the goal is to implement the perceived outcome using the probe function.

In the previous blog Xavier NX-DeepStream 5.0 #1-Installation, if you look at the directory of deepstream_python_apps that we installed, there is deepstream_python_apps/apps/common. You need to copy this directory or add the deepstream_python_apps/apps/common directory to your path in your Python code using the sys.path.append() function. The example introduced below is a modified example of /home/spypiggy/src/deepstream_python_apps/apps/deepstream-test1/deepstream_test_1.py.

#!/usr/bin/env python3

################################################################################
# Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
# THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
# DEALINGS IN THE SOFTWARE.
################################################################################

import sys, time
#To use common functions, you should add this path.  
sys.path.append('/home/spypiggy/src/deepstream_python_apps/apps')
import gi
gi.require_version('Gst', '1.0')
from gi.repository import GObject, Gst
from common.is_aarch_64 import is_aarch64
from common.bus_call import bus_call
import pyds

start = time.time()

def osd_sink_pad_buffer_probe(pad,info,u_data):
    global start
    frame_number=0

    num_rects=0

    gst_buffer = info.get_buffer()
    if not gst_buffer:
        print("Unable to get GstBuffer ")
        return

    batch_meta = pyds.gst_buffer_get_nvds_batch_meta(hash(gst_buffer))
    l_frame = batch_meta.frame_meta_list
    while l_frame is not None:
        now = time.time()
        try:
            frame_meta = pyds.NvDsFrameMeta.cast(l_frame.data)
        except StopIteration:
            break

        frame_number=frame_meta.frame_num
        num_rects = frame_meta.num_obj_meta
        l_obj=frame_meta.obj_meta_list
        while l_obj is not None:
            try:
                obj_meta=pyds.NvDsObjectMeta.cast(l_obj.data)
                print('class_id={}'.format(obj_meta.class_id))
                print('object_id={}'.format(obj_meta.object_id))
                print('obj_label={}'.format(obj_meta.obj_label))

                print('    rect_params height={}'.format(obj_meta.rect_params.height))
                print('    rect_params left={}'.format(obj_meta.rect_params.left))
                print('    rect_params top={}'.format(obj_meta.rect_params.top))
                print('    rect_params width={}'.format(obj_meta.rect_params.width))

            except StopIteration:
                break

            obj_meta.rect_params.border_color.set(0.0, 1.0, 1.0, 0.0)    # It seems that only the alpha channel is not working. (red, green, blue , alpha)
            try: 
                l_obj=l_obj.next
            except StopIteration:
                break

        display_meta=pyds.nvds_acquire_display_meta_from_pool(batch_meta)
        display_meta.num_labels = 1
        py_nvosd_text_params = display_meta.text_params[0]
        py_nvosd_text_params.display_text = "Frame Number={} Number of Objects={} FPS={}".format(frame_number, num_rects, (1 / (now - start)))

        py_nvosd_text_params.x_offset = 10
        py_nvosd_text_params.y_offset = 12

        py_nvosd_text_params.font_params.font_name = "/usr/share/fonts/truetype/dejavu/DejaVuSans-Bold.ttf"
        py_nvosd_text_params.font_params.font_size = 20
        py_nvosd_text_params.font_params.font_color.set(0.2, 0.2, 1.0, 1) # (red, green, blue , alpha)
        py_nvosd_text_params.set_bg_clr = 1
        py_nvosd_text_params.text_bg_clr.set(0.2, 0.2, 0.2, 0.3)

        pyds.nvds_add_display_meta_to_frame(frame_meta, display_meta)
        try:
            l_frame=l_frame.next
        except StopIteration:
            break
   
        start = now     
        
    return Gst.PadProbeReturn.OK    #DROP, HANDLED, OK, PASS, REMOVE 	


def main(args):
    # Check input arguments
    if len(args) != 2:
        sys.stderr.write("usage: %s <media file or uri>\n" % args[0])
        sys.exit(1)

    GObject.threads_init()
    Gst.init(None)
    print("Creating Pipeline \n ")
    pipeline = Gst.Pipeline()

    if not pipeline:
        sys.stderr.write(" Unable to create Pipeline \n")

    source = Gst.ElementFactory.make("filesrc", "file-source")
    if not source:
        sys.stderr.write(" Unable to create Source \n")
    h264parser = Gst.ElementFactory.make("h264parse", "h264-parser")
    if not h264parser:
        sys.stderr.write(" Unable to create h264 parser \n")
    decoder = Gst.ElementFactory.make("nvv4l2decoder", "nvv4l2-decoder")
    if not decoder:
        sys.stderr.write(" Unable to create Nvv4l2 Decoder \n")
    streammux = Gst.ElementFactory.make("nvstreammux", "Stream-muxer")
    if not streammux:
        sys.stderr.write(" Unable to create NvStreamMux \n")
    pgie = Gst.ElementFactory.make("nvinfer", "primary-inference")
    if not pgie:
        sys.stderr.write(" Unable to create pgie \n")

    nvvidconv = Gst.ElementFactory.make("nvvideoconvert", "convertor")
    if not nvvidconv:
        sys.stderr.write(" Unable to create nvvidconv \n")
    nvosd = Gst.ElementFactory.make("nvdsosd", "onscreendisplay")

    if not nvosd:
        sys.stderr.write(" Unable to create nvosd \n")
    if is_aarch64():
        transform = Gst.ElementFactory.make("nvegltransform", "nvegl-transform")
    sink = Gst.ElementFactory.make("nveglglessink", "nvvideo-renderer")
    if not sink:
        sys.stderr.write(" Unable to create egl sink \n")
    source.set_property('location', args[1])
    streammux.set_property('width', 1920)
    streammux.set_property('height', 1080)
    streammux.set_property('batch-size', 1)
    streammux.set_property('batched-push-timeout', 4000000)
    pgie.set_property('config-file-path', "DeepStream/config_infer_primary_yoloV4_tiny.txt")

    pipeline.add(source)
    pipeline.add(h264parser)
    pipeline.add(decoder)
    pipeline.add(streammux)
    pipeline.add(pgie)
    pipeline.add(nvvidconv)
    pipeline.add(nvosd)
    pipeline.add(sink)
    if is_aarch64():
        pipeline.add(transform)

    print("Linking elements in the Pipeline \n")
    source.link(h264parser)
    h264parser.link(decoder)

    sinkpad = streammux.get_request_pad("sink_0")
    if not sinkpad:
        sys.stderr.write(" Unable to get the sink pad of streammux \n")
    srcpad = decoder.get_static_pad("src")
    if not srcpad:
        sys.stderr.write(" Unable to get source pad of decoder \n")
    srcpad.link(sinkpad)
    streammux.link(pgie)
    pgie.link(nvvidconv)
    nvvidconv.link(nvosd)
    if is_aarch64():
        nvosd.link(transform)
        transform.link(sink)
    else:
        nvosd.link(sink)
    loop = GObject.MainLoop()
    bus = pipeline.get_bus()
    bus.add_signal_watch()
    bus.connect ("message", bus_call, loop)
    osdsinkpad = nvosd.get_static_pad("sink")
    if not osdsinkpad:
        sys.stderr.write(" Unable to get sink pad of nvosd \n")

    osdsinkpad.add_probe(Gst.PadProbeType.BUFFER, osd_sink_pad_buffer_probe, 0)

    pipeline.set_state(Gst.State.PLAYING)
    try:
        loop.run()
    except:
        pass
    # cleanup
    pipeline.set_state(Gst.State.NULL)

if __name__ == '__main__':
    sys.exit(main(sys.argv))

<demo_yolo.py>

If you run the code:

(python) spypiggy@XavierNX:~/src/pytorch-YOLOv4$ python demo_yolo.py /opt/nvidia/deepstream/deepstream/samples/streams/sample_720p.h264

......
......

class_id=2
object_id=18446744073709551615
obj_label=car
    rect_params height=29.0625
    rect_params left=553.125
    rect_params top=471.5625
    rect_params width=33.75
class_id=0
object_id=18446744073709551615
obj_label=person
    rect_params height=62.34375
    rect_params left=446.953125
    rect_params top=488.4375
    rect_params width=23.4375
class_id=0
object_id=18446744073709551615
obj_label=person
    rect_params height=70.078125
    rect_params left=413.90625
    rect_params top=478.828125
    rect_params width=20.15625
class_id=0
object_id=18446744073709551615
obj_label=person
    rect_params height=63.75
    rect_params left=427.5
    rect_params top=486.5625
    rect_params width=24.375
class_id=2
object_id=18446744073709551615
obj_label=car
    rect_params height=107.0877914428711
    rect_params left=634.0028686523438
    rect_params top=477.3771667480469
    rect_params width=134.88897705078125

......
......

class_id is the value of label.txt. Therefore, it is noted that it is different from the class_id used by the primary detector resnet10.caffemodel tested in the previous blog. While resnet10.caffemodel recognizes 4 classes, YOLO recognizes 80 classes.

At first glance, it seems to work well without any problems. However, the output box coordinate values including the class_id value in the probe function are not updated at all. For this reason, I also excluded the object statistics from the upper text display.

Under the Hood

If you intend to use the YOLOv3 model, do it in the following order:

Download the YOLOv3 models

I will use the previous /home/spypiggy/src/pyritorch-YOLOV4 as the working directory.
Download the weight file and cfg file for YOLOv3.

spypiggy@XavierNX:~/src/pytorch-YOLOv4$ wget https://pjreddie.com/media/files/yolov3.weights
spypiggy@XavierNX:~/src/pytorch-YOLOv4$ wget https://pjreddie.com/media/files/yolov3-tiny.weights
spypiggy@XavierNX:~/src/pytorch-YOLOv4$ cd cfg
spypiggy@XavierNX:~/src/pytorch-YOLOv4/cfg$ wget https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov3.cfg 
spypiggy@XavierNX:~/src/pytorch-YOLOv4/cfg$ wget https://raw.githubusercontent.com/AlexeyAB/darknet/master/cfg/yolov3-tiny.cfg

Prepare the configuration files

Copy the config file from /opt/nvidia/deepstream/deepstream-5.0/source/objectDetector_Yolo directory to your working directory. If you use Python to construct a pipeline, you don't need the file dipstream_app_config_XXX.txt. Copy only config_infer_primary_yoloV3.txt, config_infer_primary_tyoloV3_txt file. Correct the path name of the weight and cfg files, and the label files.

These are my configuration files.

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
#0=RGB, 1=BGR
model-color-format=0
custom-network-config=/home/spypiggy/src/pytorch-YOLOv4/cfg/yolov3.cfg
model-file=/home/spypiggy/src/pytorch-YOLOv4/yolov3.weights
#model-engine-file=yolov3_b1_gpu0_int8.engine
labelfile-path=/opt/nvidia/deepstream/deepstream/sources/objectDetector_Yolo/labels.txt
int8-calib-file=/opt/nvidia/deepstream/deepstream/sources/objectDetector_Yolo/yolov3-calibration.table.trt7.0
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
num-detected-classes=80
gie-unique-id=1
# 0:Detector, 1:Classifier, 2:Segmentation
network-type=0
#is-classifier=0
## 0=Group Rectangles, 1=DBSCAN, 2=NMS, 3= DBSCAN+NMS Hybrid, 4 = None(No clustering)
cluster-mode=2
maintain-aspect-ratio=1
parse-bbox-func-name=NvDsInferParseCustomYoloV3
custom-lib-path=/opt/nvidia/deepstream/deepstream/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet
#scaling-filter=0
#scaling-compute-hw=0

[class-attrs-all]
nms-iou-threshold=0.7
threshold=0.7

<config_infer_primary_tyoloV3_txt>

[property]
gpu-id=0
net-scale-factor=0.0039215697906911373
#0=RGB, 1=BGR
model-color-format=0
custom-network-config=/home/spypiggy/src/pytorch-YOLOv4/cfg/yolov3-tiny.cfg
model-file=/home/spypiggy/src/pytorch-YOLOv4/yolov3-tiny.weights
#model-engine-file=yolov3-tiny_b1_gpu0_fp32.engine
labelfile-path=/opt/nvidia/deepstream/deepstream/sources/objectDetector_Yolo/labels.txt
## 0=FP32, 1=INT8, 2=FP16 mode
network-mode=0
num-detected-classes=80
gie-unique-id=1
# 0:Detector, 1:Classifier, 2:Segmentation
network-type=0
#is-classifier=0
## 0=Group Rectangles, 1=DBSCAN, 2=NMS, 3= DBSCAN+NMS Hybrid, 4 = None(No clustering)
cluster-mode=2
maintain-aspect-ratio=1
parse-bbox-func-name=NvDsInferParseCustomYoloV3Tiny
custom-lib-path=/opt/nvidia/deepstream/deepstream/sources/objectDetector_Yolo/nvdsinfer_custom_impl_Yolo/libnvdsinfer_custom_impl_Yolo.so
engine-create-func-name=NvDsInferYoloCudaEngineGet
#scaling-filter=0
#scaling-compute-hw=0

[class-attrs-all]
nms-iou-threshold=0.7
threshold=0.7

<config_infer_primary_tyoloV3_tiny_txt>

Now, in the demo_yolo.py Python code, you can replace the configuration file of the nvinfer element with the desired YOLOv3 file. In YOLOv3, the model file was used without converting to the ONNX -> TensorRT model. However, DeepStream internally undergoes a tensorRT conversion process. Because of this process, the initial loading time is a little longer. When you run YOLOV3, there is no significant difference in execution speed and accuracy compared to YOLOV4.

Wrapping UP

It is recommended to use YOLOv3, which is officially supported by DeepStream 5.0. I know that NVidia will support YOLOv4 in the future.