Friday, July 24, 2020

Jetson Xavier NX - Human Pose estimation using OpenPose


OpenPose (https://github.com/CMU-Perceptual-Computing-Lab/openpose) is one of the most popular pose estimation frameworks. I explained in a previous blog post how to install and use OpenPose on the Jetson Nano, but its performance there was not satisfactory: 0.6 to 0.8 FPS is insufficient for commercialization. I use 10 FPS as my criterion; I think pose estimation models are practical if they can run near 10 FPS. The Jetson Xavier NX outperforms the Jetson Nano, and performs much better than the Jetson TX2 as well. So as soon as I got the Xavier NX a few days ago, I decided to try OpenPose again.

Warning (July 20, 2020): To use OpenPose on the Xavier NX, use the JetPack 4.4 DP (Developer Preview) version instead of the JetPack 4.4 Production Release. The reason is explained below.

cmake version check

To build OpenPose on the Jetson Xavier NX, you need cmake version 3.12.2 or higher. First, check the cmake version. A Xavier NX running JetPack 4.4 ships with cmake 3.10.2.
spypiggy@XavierNX:~/src$ cmake --version
cmake version 3.10.2

If your Jetson Xavier NX's cmake version is lower than 3.12.2, remove the old cmake and rebuild it from source. Check the latest version at https://github.com/Kitware/CMake/releases . At the time of writing (2020.07.23), 3.18.0 is the latest cmake version.
Change the power mode to 2 to use all six of the Xavier NX's cores during the build. Details on the Xavier NX's power modes are explained at https://spyjetson.blogspot.com/2020/07/jetson-xavier-nx-jetpack-44production.html.

spypiggy@XavierNX:~$ sudo apt-get install libssl-dev libcurl4-openssl-dev
spypiggy@XavierNX:~$ sudo apt-get remove cmake
spypiggy@XavierNX:~$ cd ~/src
spypiggy@XavierNX:~/src$ wget https://github.com/Kitware/CMake/releases/download/v3.18.0/cmake-3.18.0.tar.gz
spypiggy@XavierNX:~/src$ tar -xvzf cmake-3.18.0.tar.gz
spypiggy@XavierNX:~/src$ cd cmake-3.18.0
spypiggy@XavierNX:~/src/cmake-3.18.0$ sudo nvpmodel -m 2
spypiggy@XavierNX:~/src/cmake-3.18.0$ ./bootstrap
spypiggy@XavierNX:~/src/cmake-3.18.0$ make -j6
spypiggy@XavierNX:~/src/cmake-3.18.0$ sudo make install

Then restart your ssh session.
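After the restart, `cmake --version` should report 3.18.0. For illustration, here is a small sketch in Python of the numeric version comparison the OpenPose build performs (the version strings are hard-coded; this is not part of the build itself):

```python
# Compare dotted version strings numerically, the way cmake's
# minimum-version check does (a simplified sketch).
def version_tuple(v):
    return tuple(int(x) for x in v.split("."))

def new_enough(current, required="3.12.2"):
    return version_tuple(current) >= version_tuple(required)

print(new_enough("3.18.0"))  # the freshly built cmake: True
print(new_enough("3.10.2"))  # JetPack 4.4's stock cmake: False
```

Note that a plain string comparison would get this wrong ("3.9" > "3.12" lexically), which is why the components must be compared as integers.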

Install OpenPose

Installation is not difficult; just follow these steps. OpenPose uses Caffe as its deep learning framework, and these steps will install Caffe as well. Don't forget the final steps needed to use the Python bindings.

JetPack 4.4(Developer Preview) Build

If you built OpenPose on JetPack 4.3 on the Jetson Nano as in the previous article, there was no problem, but on JetPack 4.4 the cmake step fails with an error: it cannot find the cuDNN version. This is because a cuDNN version upgrade changed which header file must be checked.

Warning: In previous versions, the CUDNN_MAJOR value was defined in the /usr/include/cudnn.h file. In JetPack 4.4 this definition has moved to the /usr/include/cudnn_version.h file, which is why the error occurs. OpenPose (to be precise, Caffe) should fix this, but for now we have to patch it ourselves.
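The failing cmake check simply greps the header for these macros. The same idea in Python, run against a sample string rather than the real /usr/include/cudnn_version.h (the values below are the ones JetPack 4.4 reports):

```python
import re

# What /usr/include/cudnn_version.h defines on JetPack 4.4 (sample text;
# on older JetPacks the same macros lived in cudnn.h instead).
header = """
#define CUDNN_MAJOR 8
#define CUDNN_MINOR 0
#define CUDNN_PATCHLEVEL 0
"""

def cudnn_version(text):
    """Extract (major, minor, patch) by grepping the header text,
    the way Caffe's cuDNN detection macro does."""
    return tuple(int(re.search(r"#define\s+%s\s+(\d+)" % name, text).group(1))
                 for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"))

print(cudnn_version(header))  # (8, 0, 0)
```

Since the macros moved files, pointing the grep at cudnn.h finds nothing, and cmake reports the "ver. ???" error shown below.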

On a Jetson Nano, change the CUDA_ARCH_BIN value to 5.3 in the cmake options below. Details on CUDA compute architectures are available at https://en.wikipedia.org/wiki/CUDA.
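For reference, here is a small lookup of CUDA_ARCH_BIN values for common Jetson boards; the compute capabilities come from NVIDIA's published list, and the helper function is just for illustration:

```python
# CUDA compute capability (CUDA_ARCH_BIN) per Jetson board.
CUDA_ARCH_BIN = {
    "Jetson Nano": "5.3",
    "Jetson TX2": "6.2",
    "Jetson Xavier NX": "7.2",
    "Jetson AGX Xavier": "7.2",
}

def arch_option(board):
    """Build the cmake option string for the given board."""
    return "-D CUDA_ARCH_BIN=%s" % CUDA_ARCH_BIN[board]

print(arch_option("Jetson Xavier NX"))  # -D CUDA_ARCH_BIN=7.2
```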

spypiggy@XavierNX:~$ cd ~/src
spypiggy@XavierNX:~/src$ git clone https://github.com/CMU-Perceptual-Computing-Lab/openpose.git
spypiggy@XavierNX:~/src$ cd openpose
spypiggy@XavierNX:~/src/openpose$ sudo bash ./scripts/ubuntu/install_deps.sh
spypiggy@XavierNX:~/src/openpose$ mkdir build
spypiggy@XavierNX:~/src/openpose$ cd build
spypiggy@XavierNX:~/src/openpose/build$ sudo cmake -D CMAKE_INSTALL_PREFIX=/usr/local \
-D CUDA_HOST_COMPILER=/usr/bin/cc \
-D CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda \
-D CUDA_USE_STATIC_CUDA_RUNTIME=ON \
-D CUDA_rt_LIBRARY=/usr/lib/aarch64-linux-gnu/librt.so \
-D CUDA_ARCH_BIN=7.2 \
-D GPU_MODE=CUDA \
-D DOWNLOAD_FACE_MODEL=ON \
-D DOWNLOAD_COCO_MODEL=ON \
-D USE_OPENCV=ON \
-D BUILD_PYTHON=ON \
-D BUILD_EXAMPLES=ON \
-D BUILD_DOCS=OFF \
-D DOWNLOAD_HAND_MODEL=ON ..

...
...
-- Building with CUDA.
-- CUDA detected: 10.2
-- Found cuDNN: ver. ??? found (include: /usr/include, library: /usr/lib/aarch64-linux-gnu/libcudnn.so)
CMake Error at cmake/Cuda.cmake:263 (message):
  cuDNN version >3 is required.
Call Stack (most recent call first):
  cmake/Cuda.cmake:291 (detect_cuDNN)
  CMakeLists.txt:422 (include)

Replace the string "cudnn.h" with "cudnn_version.h" in the files Cuda.cmake and FindCuDNN.cmake using the following sed commands:

spypiggy@XavierNX:~/src/openpose/build$ sed -i -e 's/cudnn.h/cudnn_version.h/g' ../cmake/Cuda.cmake
spypiggy@XavierNX:~/src/openpose/build$ sed -i -e 's/cudnn.h/cudnn_version.h/g' ../cmake/Modules/FindCuDNN.cmake

Now run the cmake command again. This time it completes successfully.


spypiggy@XavierNX:~/src/openpose/build$ sudo cmake -D CMAKE_INSTALL_PREFIX=/usr/local \
-D CUDA_HOST_COMPILER=/usr/bin/cc \
-D CUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda \
-D CUDA_USE_STATIC_CUDA_RUNTIME=ON \
-D CUDA_rt_LIBRARY=/usr/lib/aarch64-linux-gnu/librt.so \
-D GPU_MODE=CUDA \
-D DOWNLOAD_FACE_MODEL=ON \
-D DOWNLOAD_COCO_MODEL=ON \
-D USE_OPENCV=ON \
-D BUILD_PYTHON=ON \
-D BUILD_EXAMPLES=ON \
-D BUILD_DOCS=OFF \
-D DOWNLOAD_HAND_MODEL=ON ..
...
...

-- Building with CUDA.
-- CUDA detected: 10.2
-- Found cuDNN: ver. 8.0.0 found (include: /usr/include, library: /usr/lib/aarch64-linux-gnu/libcudnn.so)
-- Added CUDA NVCC flags for: sm_72
-- Found cuDNN: ver. 8.0.0 found (include: /usr/include, library: /usr/lib/aarch64-linux-gnu/libcudnn.so)
-- Found GFlags: /usr/include
-- Found gflags (include: /usr/include, library: /usr/lib/aarch64-linux-gnu/libgflags.so)
-- Found Glog: /usr/include
-- Found glog (include: /usr/include, library: /usr/lib/aarch64-linux-gnu/libglog.so)
-- Found Protobuf: /usr/lib/aarch64-linux-gnu/libprotobuf.so;-lpthread (found version "3.0.0")
-- Found OpenCV: /usr (found version "4.1.1")
...
...
-- Models Downloaded.
-- Configuring done
-- Generating done
-- Build files have been written to: /home/spypiggy/src/openpose/build


spypiggy@XavierNX:~/src/openpose/build$ sudo sed -i -e 's/cudnn.h/cudnn_version.h/g' ../3rdparty/caffe/cmake/Cuda.cmake
spypiggy@XavierNX:~/src/openpose/build$ sudo make -j6
spypiggy@XavierNX:~/src/openpose/build$ sudo make install

#==== python build ====
# don't run "make install" here, because it would install the openpose module into the Python 2.7 directory
spypiggy@XavierNX:~/src/openpose/build$ cd python
spypiggy@XavierNX:~/src/openpose/build/python$ make -j6


JetPack 4.4(Production Release) Build

In the JetPack 4.4 Production Release's cuDNN, some existing functions and definitions are no longer supported. This was not abrupt; NVIDIA announced it in advance. The Caffe framework, which is installed during the OpenPose build, uses some functions and definitions from older cuDNN versions. Therefore, if you follow the JetPack 4.4 DP instructions above, errors like the following occur. JetPack 4.4 DP and the Production Release both report cuDNN version 8.0.0, but the actual cuDNN libraries appear to differ.

/home/spypiggy/src/openpose/3rdparty/caffe/src/caffe/layers/cudnn_conv_layer.cpp:136:7: error: CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT was not declared in this scope
       CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT,
/home/spypiggy/src/openpose/3rdparty/caffe/src/caffe/layers/cudnn_conv_layer.cpp:131:17: error: there are no arguments to cudnnGetConvolutionForwardAlgorithm that depend on a template parameter, so a declaration of cudnnGetConvolutionForwardAlgorithm must be available [-fpermissive]
     CUDNN_CHECK(cudnnGetConvolutionForwardAlgorithm(handle_[0],
/home/spypiggy/src/openpose/3rdparty/caffe/src/caffe/layers/cudnn_conv_layer.cpp:151:11: error: CUDNN_CONVOLUTION_BWD_FILTER_SPECIFY_WORKSPACE_LIMIT was not declared in this scope
           CUDNN_CONVOLUTION_BWD_FILTER_SPECIFY_WORKSPACE_LIMIT,
......
<Error messages when building OpenPose on the JetPack 4.4 Production Release>

This is described at https://forums.developer.nvidia.com/t/jetpack-4-4-l4t-r32-4-3-production-release/140870/4. Therefore, until Caffe supports the cuDNN 8.0.0 shipped with the JetPack 4.4 Production Release, you have to build OpenPose on the JetPack 4.4 DP version. If you want to build OpenPose on the Production Release, you may need to downgrade its cuDNN or use a Caffe fork that supports it; neither is easy.

Python import path Problem

Before you run the Python samples, first check the OpenPose Python module. It is under "<openpose installation directory>/build/python/openpose".


(python) spypiggy@XavierNX:~/src/openpose/build/python/openpose$ ls -al
total 360
drwxr-xr-x 3 spypiggy spypiggy   4096 Jul 24 04:02 .
drwxr-xr-x 4 spypiggy spypiggy   4096 Jul 24 04:02 ..
drwxr-xr-x 3 spypiggy spypiggy   4096 Jul 24 04:02 CMakeFiles
-rw-r--r-- 1 spypiggy spypiggy   3140 Jul 24 03:15 cmake_install.cmake
-rw-rw-r-- 1 spypiggy spypiggy     39 Jul 24 03:15 __init__.py
-rw-r--r-- 1 spypiggy spypiggy   8248 Jul 24 03:15 Makefile
-rwxr-xr-x 1 spypiggy spypiggy 332296 Jul 24 04:02 pyopenpose.cpython-36m-aarch64-linux-gnu.so


You can check the Python import path with sys.path. As you can see, the openpose Python directory is not there, so you must add it to a directory on the path.


(python) spypiggy@XavierNX:~/src/openpose/build/python/openpose$ python3
Python 3.6.9 (default, Jul 17 2020, 12:50:27)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> sys.path
['', '/usr/lib/python36.zip', '/usr/lib/python3.6', '/usr/lib/python3.6/lib-dynload', '/home/spypiggy/python/lib/python3.6/site-packages', '/home/spypiggy/.local/lib/python3.6/site-packages', '/usr/lib/python3.6/site-packages', '/usr/local/lib/python3.6/dist-packages', '/usr/lib/python3/dist-packages', '/usr/lib/python3.6/dist-packages']

It's time to copy our OpenPose Python module to a directory on the Python 3.6 path (/usr/lib/python3.6/dist-packages). Then you can import openpose without path problems.

sudo cp -r ~/src/openpose/build/python/openpose/ /usr/lib/python3.6/dist-packages
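Alternatively, if you prefer not to copy files into the system directory, appending the build path at the top of each script also works. A minimal sketch, assuming the clone location used in this article:

```python
import sys

# Where the OpenPose build placed the Python package; adjust this path
# if you cloned the repository somewhere else.
OPENPOSE_PYTHON = "/home/spypiggy/src/openpose/build/python"

if OPENPOSE_PYTHON not in sys.path:
    sys.path.append(OPENPOSE_PYTHON)

# With the path set, the usual import works:
# from openpose import pyopenpose as op
```

The copy approach is more convenient for scripts run from arbitrary directories; the sys.path approach keeps the system Python untouched.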

Let's test your openpose python library.

(python) spypiggy@XavierNX:~/src/openpose/build/python/openpose$ python3
Python 3.6.9 (default, Jul 17 2020, 12:50:27)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import openpose
>>> from openpose import pyopenpose as op
>>>

It's a successful installation.

Under the hood

Now let's dig deeper.

Run a sample program

Let's run a sample program to test whether the OpenPose is properly installed.

spypiggy@XavierNX:~/src/openpose$ ./build/examples/openpose/openpose.bin --video ./examples/media/video.avi
Starting OpenPose demo...
Configuring OpenPose...
Starting thread(s)...
Auto-detecting all available GPUs... Detected 1 GPU(s), using 1 of them starting at GPU 0.

If you see this output, the installation is successful.


This sample program records 1.5 FPS with the video.avi clip. Although this is higher than the Jetson Nano's 0.6 FPS, it is still not satisfactory. However, since this run used X11 forwarding, you may see better results by connecting a monitor directly to the Xavier NX and testing again.


Video file Test

I prefer working remotely over ssh. Connecting a keyboard, mouse, and monitor to the Jetson Xavier NX is cumbersome. I use X11 forwarding for simple GUI output, but I avoid it where possible because streaming video output is a heavy load. If streaming takes a long time, the FPS values measured by OpenPose while processing video frames may be distorted. Therefore, when working over ssh, I prefer writing to a video file over on-screen video output.
This example reads every frame of the video file, draws the FPS value on each frame, and stores the result in a new video file. If you replace the video file with a camera, it will process real-time video frames.
The program takes the video file name and inference image size as input parameters.

import sys
import time
import math
import cv2
import numpy as np
from openpose import pyopenpose as op
#import pyopenpose as op
import argparse

parser = argparse.ArgumentParser(description="OpenPose Example")
parser.add_argument("--video", type=str, required = True, help="video file name")
parser.add_argument("--res", type=str, default = "640x480", help="video file resolution")
args = parser.parse_args()
res = args.res.split('x')
res[0], res[1] = int(res[0]), int(res[1])
 

if __name__ == '__main__':
    fps_time = 0

    params = dict()
    params["model_folder"] = "/home/spypiggy/src/openpose/models/"
    params["net_resolution"] = args.res

    # Starting OpenPose
    opWrapper = op.WrapperPython()
    opWrapper.configure(params)
    opWrapper.start()


    print("OpenPose start")
    cap = cv2.VideoCapture(args.video)
    # cv2.VideoCapture never returns None; check isOpened() to detect failure
    if not cap.isOpened():
        print("Video[%s] Open Error"%(args.video))
        sys.exit(0)

    fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
    # write the output at the same FPS as the source, resized to the inference resolution
    out_video = cv2.VideoWriter('/tmp/output.mp4', fourcc, cap.get(cv2.CAP_PROP_FPS), (res[0], res[1]))

    count = 0
    t_netfps_time = 0
    t_fps_time = 0
    while cap.isOpened():
        ret_val, dst = cap.read()
        if ret_val == False:
            print("Frame read End")
            break
        dst = cv2.resize(dst, dsize=(res[0], res[1]), interpolation=cv2.INTER_AREA)    

        datum = op.Datum()
        datum.cvInputData = dst
        net_fps = time.time()
        opWrapper.emplaceAndPop([datum])
        fps = 1.0 / (time.time() - fps_time)
        netfps = 1.0 / (time.time() - net_fps)
        t_netfps_time += netfps
        t_fps_time += fps
        
        fps_time = time.time()
        newImage = datum.cvOutputData[:, :, :]
        cv2.putText(newImage , "FPS: %f" % (fps), (20, 40),  cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        out_video.write(newImage)

        print("captured fps[%f] net_fps[%f]"%(fps, netfps))
        #cv2.imshow("OpenPose 1.5.1 - Tutorial Python API", newImage)
        count += 1
    
    print("==== Summary ====")
    print("Inference Size : %s"%(args.res))
    if count:
        print("avg fps[%f] avg net_fps[%f]"%(t_fps_time / count, t_netfps_time / count))

    cv2.destroyAllWindows()
    out_video.release()
    cap.release()
<video_openpose.py>

In the Python virtual environment, I ran the script as follows.
You can see that OpenPose's performance decreases as the inference size increases.


(python) spypiggy@XavierNX:~/src/openpose/build/examples/tutorial_api_python$ python3 video_openpose.py --video='../../../examples/media/video.avi' --res='160x128'
....
....
captured fps[3.666109] net_fps[3.869309]
captured fps[3.616160] net_fps[3.809790]
captured fps[3.650942] net_fps[3.851781]
captured fps[3.646393] net_fps[3.845392]
Frame read End
==== Summary ====
Inference Size : 160x128
avg fps[3.567190] avg net_fps[3.790678]


(python) spypiggy@XavierNX:~/src/openpose/build/examples/tutorial_api_python$ python3 video_openpose.py --video='../../../examples/media/video.avi' --res='320x256'
....
....
captured fps[3.139003] net_fps[3.349005]
captured fps[3.115451] net_fps[3.318620]
captured fps[3.142029] net_fps[3.355768]
Frame read End
==== Summary ====
Inference Size : 320x256
avg fps[3.097217] avg net_fps[3.306334]

(python) spypiggy@XavierNX:~/src/openpose/build/examples/tutorial_api_python$ python3 video_openpose.py --video='../../../examples/media/video.avi' --res='640x480'
....
....
captured fps[1.524550] net_fps[1.592164]
captured fps[1.523916] net_fps[1.598005]
captured fps[1.529618] net_fps[1.599179]
Frame read End
==== Summary ====
Inference Size : 640x480
avg fps[1.514937] avg net_fps[1.590585]

The following figure is a frame extracted from the /tmp/output.mp4 file produced by video_openpose.py.

</tmp/output.mp4 using 640x480 inference size >

Tips: There are many Python example files in the /home/spypiggy/src/openpose/build/examples/tutorial_api_python directory, covering faces, hands, and heatmaps.

How to interpret Keypoints recognized by OpenPose is explained at https://spyjetson.blogspot.com/2019/10/jetsonnano-human-pose-estimation-using.html.
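As a quick reminder of the keypoint layout: after emplaceAndPop(), datum.poseKeypoints is a NumPy array of shape (people, keypoints, 3), with 25 keypoints per person for the default BODY_25 model and (x, y, confidence) per keypoint. The sketch below uses a fabricated array of the same shape so it runs without OpenPose installed:

```python
import numpy as np

# Fabricated stand-in for datum.poseKeypoints with one detected person.
# Each keypoint is (x, y, confidence); index 0 is the nose in BODY_25.
keypoints = np.zeros((1, 25, 3), dtype=np.float32)
keypoints[0, 0] = [320.0, 120.0, 0.92]   # nose at (320, 120), confidence 0.92

for person in keypoints:
    nose_x, nose_y, conf = person[0]
    if conf > 0.5:  # skip low-confidence detections
        print("nose at (%.0f, %.0f), confidence %.2f" % (nose_x, nose_y, conf))
```

When no person is detected, poseKeypoints can be empty, so real code should check the array's size before indexing into it.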

Wrapping up

OpenPose is a highly accurate keypoint detection framework and can be installed on any platform where Caffe can be installed.
I previously tested OpenPose on the Jetson Nano, and the FPS was below 1. Despite its high accuracy, it is difficult to apply because the processing speed is too low.
If you adjust the inference size properly, you can achieve about 4 FPS with OpenPose on the Xavier NX.

The biggest problem is that it doesn't compile on the JetPack 4.4 Production Release. Currently, it only builds properly on the JetPack 4.4 DP version.















3 comments:

  1. Amazing stuff! Please update us when Openpose is able to compile on Jetpack 4.4 production release!

  2. Nice to meet you. Thanks for the great idea.
    I've just installed OpenPose on my Jetson Xavier NX using your article as a reference. However, when I start OpenPose, the USB camera does not start and the PC freezes. Any idea what could be causing this?


    1. First, check whether the USB webcam is recognized properly.
      The lsusb command can show the webcam status.
      In my other blog post (https://spyjetson.blogspot.com/2020/02/webcam-search-for-supported-resolutions.html), I explained this topic.
      I also think it's a good idea to test the webcam alone, without OpenPose.
      And be sure to use the JetPack 4.4 DP version, not the Production Release, to use OpenPose.
