Thursday, November 7, 2019

JetsonTX2 - Human Pose estimation using OpenPose

I used a Jetson TX2 with the official Ubuntu 18.04 image, logged in as root. I explained using OpenPose on the Jetson Nano in my previous article (https://spyjetson.blogspot.com/2019/10/jetsonnano-human-pose-estimation-using.html).
This article is quite similar to that one, because both devices have the same CPU architecture and the same OS (Ubuntu 18.04).
On the Jetson Nano, I used the /usr/local/src directory as my main work directory.
On the Jetson TX2, however, I'm using /work as my main working directory, because I've installed an SSD on my Jetson TX2 and mounted it to /work. The SSD installation guide for the TX2 is here (https://spyjetson.blogspot.com/2019/11/jetson-tx2-install-ssd.html).



 

Prerequisites

Before you build OpenPose, you must install these packages on the Jetson TX2 first.


apt-get update -y
apt-get upgrade -y
apt-get install libboost-dev libboost-all-dev
apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev libatlas-base-dev liblmdb-dev libblas-dev libatlas-base-dev libprotobuf-dev libleveldb-dev libsnappy-dev libhdf5-serial-dev protobuf-compiler


Next, upgrade OpenCV to 4.1.x.

OpenCV installation page: https://spyjetson.blogspot.com/2019/11/jetson-tx2-opencv-411-upgrade.html
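After the upgrade, you can confirm the version with a quick check (assuming the Python bindings were installed as part of the build):

python3 -c "import cv2; print(cv2.__version__)"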

OpenCV header file location check

If you upgraded OpenCV to 4.1.1, check the location of the header files. In my case, they were in "/usr/local/include/opencv4/opencv2", which is not where the OpenPose build expects them. I moved the opencv2 directory to "/usr/local/include".

mv /usr/local/include/opencv4/opencv2 /usr/local/include/

cmake version check  

To build OpenPose on the Jetson TX2, you need cmake version 3.12.2 or higher. First, check the cmake version.

cmake --version

If your Jetson TX2's cmake version is lower than 3.12.2, remove the old cmake and build a new one from source. Visit https://github.com/Kitware/CMake/releases/ to check the latest version, then download it (latest version as of 2019.11: 3.15.5).
For cmake to support HTTPS, you have to build it with the HTTPS support option:
run bootstrap with the --system-curl option.


apt-get purge cmake
apt-get install libcurl4 libcurl4-openssl-dev -y
cd /work/src
wget https://github.com/Kitware/CMake/releases/download/v3.15.5/cmake-3.15.5.tar.gz
tar -xvzf cmake-3.15.5.tar.gz
cd cmake-3.15.5
./bootstrap --system-curl
make -j4
make install 

Then restart your ssh session.
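Alternatively, if you use bash, clearing the shell's cached command paths is enough to pick up the new binary without reconnecting:

hash -r
cmake --version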

 

Install OpenPose

Follow these steps. OpenPose uses the Caffe framework for its deep learning, so these steps install Caffe too. Don't forget the final steps for Python support.


cd /work/src
git clone https://github.com/CMU-Perceptual-Computing-Lab/openpose
cd openpose
bash ./scripts/ubuntu/install_deps.sh
mkdir build
cd build
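# CUDA_ARCH_BIN=62 targets the TX2's GPU (compute capability 6.2)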
cmake -D CUDA_ARCH_BIN=62 -D BUILD_PYTHON=ON -D BUILD_EXAMPLES=OFF  ..
make -j4
make install
#==== python build ==== 
# do not run "make install" here; it installs the openpose module into the Python 2.7 directory
cd python
make -j4


Error occurred


During the build, I got an error message like this from the bundled Caffe.

./include/caffe/util/cudnn.hpp:15:2: error: #error "NVCaffe 0.16 and higher requires CuDNN version 6.0.0 or higher"

After spending a lot of time, I found the cause: the cuDNN version was not defined. Why?
I traced the cudnn.h file and found that the size of the cudnn_v7.h file is zero. I don't know the reason. I'm using JetPack 4.2.2 and used SDK Manager to install the libraries.

These commands trace the symbolic links:
spytx@spytx-desktop:/usr/include$ ls -al cudnn*
lrwxrwxrwx 1 root root 26 11월  4 11:18 cudnn.h -> /etc/alternatives/libcudnn
spytx@spytx-desktop:/usr/include$ ls -al /etc/alternatives/libcudnn
lrwxrwxrwx 1 root root 41 11월  4 11:18 /etc/alternatives/libcudnn -> /usr/include/aarch64-linux-gnu/cudnn_v7.h
spytx@spytx-desktop:/usr/include$ ls -al /usr/include/aarch64-linux-gnu/cudnn_v7.h
-rw-r--r-- 1 root root 0 11월  4 11:18 /usr/include/aarch64-linux-gnu/cudnn_v7.h


Working around the error

I had to find the correct file, but I couldn't find an ARM version of the libcudnn7-dev package, because this library is probably managed by SDK Manager.

Instead, I downloaded the libcudnn7-dev_7.5.0.56-1+cuda10.0_amd64.deb file from https://developer.nvidia.com/rdp/cudnn-archive (cuDNN v7.5.0 (Feb 21, 2019), for CUDA 10.0).

I didn't install the deb file; I only extracted the files from it. An ugly method, but it works.


>ar x libcudnn7-dev_7.5.0.56-1+cuda10.0_amd64.deb
>ls -al
-rw-r--r--  1 root root       852 11월  7 22:32 control.tar.xz
-rw-r--r--  1 root root 140166104 11월  7 22:32 data.tar.xz

>tar -xvf data.tar.xz
./
./usr/
./usr/include/
./usr/include/x86_64-linux-gnu/
./usr/include/x86_64-linux-gnu/cudnn_v7.h
./usr/lib/
./usr/lib/x86_64-linux-gnu/
./usr/lib/x86_64-linux-gnu/libcudnn_static_v7.a
./usr/share/
./usr/share/doc/
./usr/share/doc/libcudnn7-dev/
./usr/share/doc/libcudnn7-dev/changelog.Debian.gz
./usr/share/doc/libcudnn7-dev/copyright
./usr/share/lintian/
./usr/share/lintian/overrides/
./usr/share/lintian/overrides/libcudnn7-dev

Then I checked whether the cudnn_v7.h file contains any CPU-dependent code. Fortunately, it does not. I copied the header file to the /usr/include/aarch64-linux-gnu directory on my TX2. This cudnn_v7.h file is in my repo (https://github.com/raspberry-pi-maker/NVIDIA-Jetson/tree/master/openpose-TX2/missing%20cudnn%20header).
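For reference, here is a sketch of the check and the copy (run from the directory where the deb file was extracted; the grep pattern is just my guess at the usual architecture guards):

# look for architecture-specific preprocessor guards; no output means none were found
grep -nE '__x86_64__|__aarch64__|_M_X64' ./usr/include/x86_64-linux-gnu/cudnn_v7.h
# replace the empty header with the extracted one
cp ./usr/include/x86_64-linux-gnu/cudnn_v7.h /usr/include/aarch64-linux-gnu/cudnn_v7.h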

Then I restarted the build process. This time it succeeded.



Under the hood

I explained how to avoid problems with OpenPose in Python, how to interpret keypoints, etc. in my article (JetsonNano - Human Pose estimation using OpenPose). Therefore, I will focus here on the performance comparison between the Nano and the TX2.


To avoid the import path problem, first copy the openpose Python package to a known path (/usr/lib/python3.6/dist-packages).

cp -r /work/src/openpose/build/python/openpose/ /usr/lib/python3.6/dist-packages
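A quick sanity check that the package is importable (assuming the copy above succeeded):

python3 -c "from openpose import pyopenpose as op; print(op)"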


Run a sample program

Let's run a sample program to test whether OpenPose is properly installed.


root@spytx-desktop:/work/src/openpose# ./build/examples/openpose/openpose.bin --video ./examples/media/video.avi
Starting OpenPose demo...
Configuring OpenPose...
Starting thread(s)...
Auto-detecting all available GPUs... Detected 1 GPU(s), using 1 of them starting at GPU 0.

If you see this output, the installation was successful.

TX2 vs Nano

When I tested on the Nano, the fps (frames per second) value was 0.6, so the TX2 shows about 2.5 times that performance.


Camera test

Now let's test the TX2 to compare its performance with the Nano's. Because the Jetson TX2 has a built-in camera, you don't need a webcam. In Python code, the Jetson TX2's built-in CSI camera is used in a slightly different way. There are two ways to capture: using v4l2, or using nvcamerasrc (nvarguscamerasrc). The nvarguscamerasrc plugin was created by NVIDIA and has access to the ISP, which helps convert from Bayer to YUV suitable for the video encoders. Starting with L4T R23.2 there is a /dev/video0 node to capture from; however, this node gives you frames in Bayer format, which are NOT suitable for encoding, because it grabs frames directly from the ov5693 camera without using the ISP.
Therefore, in most cases, the nvcamerasrc (nvarguscamerasrc) method is used.

The nvcamerasrc plugin is deprecated as of r31.1. On r31.1 and later, use the nvarguscamerasrc plugin instead.

Run this command to check the release version.


root@spytx-desktop:~# head -n 1 /etc/nv_tegra_release
# R32 (release), REVISION: 2.1, GCID: 16294929, BOARD: t186ref, EABI: aarch64, DATE: Tue Aug 13 04:45:36 UTC 2019

The output above shows release R32.2.1, which is later than r31.1, so nvarguscamerasrc is the plugin to use.
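Before running the Python code, you can verify the CSI camera and the nvarguscamerasrc plugin with a standalone pipeline (a minimal sketch; nvoverlaysink renders directly to the local display):

gst-launch-1.0 nvarguscamerasrc ! 'video/x-raw(memory:NVMM), width=640, height=480, format=NV12, framerate=24/1' ! nvvidconv ! nvoverlaysink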



import sys
import time
import cv2
from openpose import pyopenpose as op

if __name__ == '__main__':
    fps_time = 0

    params = dict()
    params["model_folder"] = "../../models/"

    # Starting OpenPose
    opWrapper = op.WrapperPython()
    opWrapper.configure(params)
    opWrapper.start()


    print("OpenPose start")
    # Capture from the CSI camera through the ISP (NV12), then convert to BGR for OpenCV
    cap = cv2.VideoCapture("nvarguscamerasrc ! video/x-raw(memory:NVMM), width=(int)640, height=(int)480,format=(string)NV12, framerate=(fraction)24/1 ! nvvidconv flip-method=0 ! video/x-raw, format=(string)BGRx ! videoconvert ! video/x-raw, format=(string)BGR ! appsink")
    if not cap.isOpened():
        print("Camera Open Error")
        sys.exit(0)

    ret_val, img = cap.read()   # warm-up read
    fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
    fps_in = cap.get(cv2.CAP_PROP_FPS)
    if fps_in <= 0:
        fps_in = 24.0           # fall back to the pipeline's frame rate if the property is unavailable
    out_video = cv2.VideoWriter('/tmp/output.mp4', fourcc, fps_in, (640, 480))

    count = 0
    while cap.isOpened() and count < 30:
        ret_val, dst = cap.read()
        if ret_val == False:
            print("Camera read Error")
            break
        #dst = cv2.resize(image, dsize=(320, 240), interpolation=cv2.INTER_AREA)
        #cv2.imshow("OpenPose 1.5.1 - Tutorial Python API", dst)
        #continue

        datum = op.Datum()
        datum.cvInputData = dst
        opWrapper.emplaceAndPop([datum])
        fps = 1.0 / (time.time() - fps_time)
        fps_time = time.time()
        newImage = datum.cvOutputData[:, :, :]
        cv2.putText(newImage , "FPS: %f" % (fps), (20, 40),  cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        out_video.write(newImage)

        print("captured fps %f"%(fps))
        cv2.imshow("OpenPose 1.5.1 - Tutorial Python API", newImage)
        count += 1


    cv2.destroyAllWindows()        
    out_video.release()
    cap.release()


You can get the source with the following command:


git clone https://github.com/raspberry-pi-maker/NVIDIA-Jetson.git

Copy the Python file run_cam_tx2.py to the "/work/src/openpose/examples/tutorial_api_python" directory and run it.


cd /work/src/openpose/examples/tutorial_api_python
python3 run_cam_tx2.py



TX2 vs Nano

When I tested on the Nano, the fps (frames per second) value was 0.8, so the TX2 again shows about 2.5 times that performance.



Using keypoints

See my other article on analyzing keypoints and predicting actual behavior from them in Python (https://spyjetson.blogspot.com/2019/10/jetsonnano-human-pose-estimation-using.html).
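As a minimal illustration (assuming the default BODY_25 model, where datum.poseKeypoints is a numpy array of shape (num_people, 25, 3) holding x, y and a confidence score), the keypoints from the sample above can be read like this:

# reuse opWrapper and a captured frame (dst) from the sample above
datum = op.Datum()
datum.cvInputData = dst
opWrapper.emplaceAndPop([datum])
if datum.poseKeypoints is not None and datum.poseKeypoints.ndim == 3:
    for person in datum.poseKeypoints:      # one (25, 3) array per detected person
        x, y, score = person[0]             # index 0 is the nose in BODY_25
        print("Nose: (%.0f, %.0f) score %.2f" % (x, y, score))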


Wrapping up

With OpenPose, the TX2 showed about 2.5x the performance of the Nano. The results are not fully satisfactory, but if you lower the input resolution, you can push the frame rate above 2 fps, so I think it can be used to some extent in actual projects.
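For example, lowering the network input resolution via the net_resolution parameter (a standard OpenPose option; values must be multiples of 16, and the default is -1x368) trades accuracy for speed in the params dict of the sample above:

params["net_resolution"] = "256x144"  # smaller network input: faster inference, lower accuracy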

If you want the best human pose estimation performance on the Jetson series, see the following article (https://spyjetson.blogspot.com/2019/12/jetsonnano-human-pose-estimation-using.html). The NVIDIA team introduces human pose estimation using models optimized for TensorRT.





8 comments:

  1. Hello, I enjoy reading your posts.
    I'm a third-year student working on a project that uses OpenPose on a TX2 board. I've run into some difficulties along the way.
    Could I get some help from your knowledge of this area?

  2. I've only been working with Jetson for a short time myself, so I'm not sure how much help I can be.
    How can I help you?
    If you email me the part you need help with, I'll answer as far as I can.

    1. Thank you so much~!
      I've emailed you at the address in your profile!

  3. Hello. I'm a student working on a project that uses OpenPose on a TX2 board. I need a certain speed for real-time processing, but after installing it I only get about 1.5 fps. Could I get some help?

    1. In my tests, the numbers depend on the inference image size, but the FPS values were roughly at these levels:
      Xavier NX: 3.5 ~ 4
      TX2: ~2
      Nano: ~1

      OpenPose gives quite accurate results, but speed is always a problem when using it on a Jetson.
      Although OpenPose uses CUDA, the model itself is so heavy that there is a fundamental limit to how much the speed can improve.
      If you must use CMU's OpenPose, improving the speed will not be easy.
      HyperPose (https://github.com/tensorlayer/hyperpose) is said to have optimized OpenPose for TensorRT, but I haven't tested it yet.
      If you have to use CMU OpenPose, I recommend taking a look at HyperPose.

      If your goal is pose estimation, the most recommendable option on the Jetson series is a model that uses TensorRT.
      It is fast with a reasonable level of accuracy (though not as good as OpenPose).
      I described how to use it on the TX2 at https://spyjetson.blogspot.com/2019/12/jetsontx2-human-pose-estimation-using.html.



  4. Hi, after following your tutorial, I can't run the example because openpose.bin is missing.
    More people are having this issue: https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/719
    Can you please fix that?
    Thanks in advance.

    1. Hi Salva.
      The cmake command only generates the Makefile.
      So after running cmake, you should run the "make && make install" commands.
      Please see my other blog post about installing OpenPose on JetPack 4.5: "https://spyjetson.blogspot.com/2021/02/jetpack-45-install-latest-version-of_13.html"
