JetsonTX2 - Human Pose estimation using OpenPose

I used a Jetson TX2 with the official Ubuntu 18.04 image, logged in as root. I explained how to use OpenPose on the Jetson Nano in my previous article (https://spyjetson.blogspot.com/2019/10/jetsonnano-human-pose-estimation-using.html).
This article is quite similar to that one, because both devices have the same CPU architecture and the same OS (Ubuntu 18.04).
On the Jetson Nano, I used /usr/local/src as my main working directory.
On the Jetson TX2, I use /work instead, because I installed an SSD on my TX2 and mounted it at /work. The guide for installing an SSD on the TX2 is here (https://spyjetson.blogspot.com/2019/11/jetson-tx2-install-ssd.html).



Before you build OpenPose, you must install the following packages on the Jetson TX2.

apt-get update -y
apt-get upgrade -y
apt-get install libboost-dev libboost-all-dev
apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev libatlas-base-dev libblas-dev libprotobuf-dev libleveldb-dev libsnappy-dev libhdf5-serial-dev protobuf-compiler

Next, upgrade OpenCV to 4.1.x.

OpenCV installation page: https://spyjetson.blogspot.com/2019/11/jetson-tx2-opencv-411-upgrade.html

OpenCV header file location check

If you upgraded OpenCV to 4.1.1, check the location of the header files. In my case, they were in "/usr/local/include/opencv4/opencv2". That is not where the build looks for them, so I moved the opencv2 directory to "/usr/local/include".

mv /usr/local/include/opencv4/opencv2 /usr/local/include/
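
Before moving anything, it can help to confirm where the headers actually are. The following is a minimal sketch, not part of the original procedure: the helper name find_opencv_headers is hypothetical, and it only assumes that an OpenCV install ships opencv2/opencv.hpp either directly under an include root or under its opencv4 subdirectory.

```python
from pathlib import Path


def find_opencv_headers(include_roots):
    """Return each directory under the given roots that holds opencv2/opencv.hpp."""
    found = []
    for root in include_roots:
        root = Path(root)
        # Check both <root>/opencv2 and <root>/opencv4/opencv2.
        for candidate in (root / "opencv2", root / "opencv4" / "opencv2"):
            if (candidate / "opencv.hpp").is_file():
                found.append(candidate)
    return found
```

Calling find_opencv_headers(["/usr/include", "/usr/local/include"]) on the TX2 should list the directory (or directories) that currently contain the headers.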

cmake version check  

To build OpenPose on the Jetson TX2, you need cmake version 3.12.2 or higher. First, check the cmake version.

cmake --version
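
If you want to script this check, here is a small sketch that parses the output of "cmake --version" and compares it with the 3.12.2 minimum mentioned above. The function names are hypothetical, and it assumes the first line of the output has the usual "cmake version X.Y.Z" form.

```python
import re
import subprocess

MINIMUM_CMAKE = (3, 12, 2)  # minimum version stated in this article


def parse_cmake_version(text):
    """Extract (major, minor, patch) from `cmake --version` output."""
    match = re.search(r"cmake version (\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no cmake version found in: %r" % text)
    return tuple(int(part) for part in match.groups())


def cmake_is_too_old(text, minimum=MINIMUM_CMAKE):
    """True when the reported version is lower than the required minimum."""
    return parse_cmake_version(text) < minimum


def installed_cmake_output():
    """Run `cmake --version` and return its text (requires cmake on PATH)."""
    result = subprocess.run(["cmake", "--version"], stdout=subprocess.PIPE,
                            universal_newlines=True, check=True)
    return result.stdout
```

On the TX2 you would call cmake_is_too_old(installed_cmake_output()); a True result means the rebuild described next is needed.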

If your Jetson TX2's cmake version is lower than 3.12.2, remove the old cmake and rebuild it from source. Visit https://github.com/Kitware/CMake/releases/ to check the latest version first, then download it (latest version as of 2019.11: 3.15.5).
For cmake to support HTTPS, you have to build it with HTTPS support enabled.
To do so, run bootstrap with the --system-curl option.

apt-get purge cmake
apt-get install libcurl4 libcurl4-openssl-dev -y
cd /work/src
wget https://github.com/Kitware/CMake/releases/download/v3.15.5/cmake-3.15.5.tar.gz
tar -xvzf cmake-3.15.5.tar.gz
cd cmake-3.15.5
./bootstrap --system-curl
make -j4
make install 

Then restart your ssh session.


Install OpenPose

Follow these steps. OpenPose uses the Caffe framework for its deep learning, and these steps install Caffe too. Don't forget the final steps, which are needed to use OpenPose from Python.

cd /work/src
git clone https://github.com/CMU-Perceptual-Computing-Lab/openpose
cd openpose
bash ./scripts/ubuntu/install_deps.sh
mkdir build
cd build
cmake -D BUILD_PYTHON=ON ..
make -j4
make install
#==== python build ====
# Note: make install installs the openpose module to the Python 2.7 directory,
# so the Python 3 package is copied manually in a later step.
cd python
make -j4
make install

Error occurred

When cmake ran, I got an error message like this.

./include/caffe/util/cudnn.hpp:15:2: error: #error "NVCaffe 0.16 and higher requires CuDNN version 6.0.0 or higher"

After spending a lot of time, I found the cause: the cuDNN version was not defined. Why?
I traced the cudnn.h file and found that the size of the cudnn_v7.h file is zero. I don't know the reason. I'm using JetPack 4.2.2 and used SDK Manager to install the libraries.

These commands trace the symbolic links.
spytx@spytx-desktop:/usr/include$ ls -al cudnn*
lrwxrwxrwx 1 root root 26 Nov  4 11:18 cudnn.h -> /etc/alternatives/libcudnn
spytx@spytx-desktop:/usr/include$ ls -al /etc/alternatives/libcudnn
lrwxrwxrwx 1 root root 41 Nov  4 11:18 /etc/alternatives/libcudnn -> /usr/include/aarch64-linux-gnu/cudnn_v7.h
spytx@spytx-desktop:/usr/include$ ls -al /usr/include/aarch64-linux-gnu/cudnn_v7.h
-rw-r--r-- 1 root root 0 Nov  4 11:18 /usr/include/aarch64-linux-gnu/cudnn_v7.h
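
The same tracing can be done in Python. This is only a sketch of the idea: the helper names trace_symlinks and is_empty_header are hypothetical, and it follows the link chain one hop at a time, the way the ls commands above do, then checks whether the final file is empty.

```python
import os


def trace_symlinks(path, max_hops=16):
    """Return the chain of paths from `path` to the final non-link target."""
    chain = [path]
    while os.path.islink(chain[-1]) and len(chain) <= max_hops:
        target = os.readlink(chain[-1])
        # Relative link targets are resolved against the link's own directory.
        if not os.path.isabs(target):
            target = os.path.join(os.path.dirname(chain[-1]), target)
        chain.append(target)
    return chain


def is_empty_header(path):
    """True when the file at the end of the link chain exists and has size zero."""
    final = trace_symlinks(path)[-1]
    return os.path.isfile(final) and os.path.getsize(final) == 0
```

On my TX2, is_empty_header("/usr/include/cudnn.h") would have reported the problem shown above.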

Avoiding errors

I had to find the right file, but I couldn't find an ARM version of the libcudnn7-dev package, probably because this library is managed by SDK Manager.

Instead, I downloaded the libcudnn7-dev_7.5.0.56-1+cuda10.0_amd64.deb file from https://developer.nvidia.com/rdp/cudnn-archive (Download cuDNN v7.5.0 (Feb 21, 2019), for CUDA 10.0).

I didn't install the deb file; I only extracted the files inside it. An ugly method....

>ar x libcudnn7-dev_7.5.0.56-1+cuda10.0_amd64.deb
>ls -al
-rw-r--r--  1 root root       852 Nov  7 22:32 control.tar.xz
-rw-r--r--  1 root root 140166104 Nov  7 22:32 data.tar.xz

>tar -xvf data.tar.xz

Then I checked whether the cudnn_v7.h file contains any CPU-dependent code. Fortunately, it does not, so I copied the header file to the /usr/include/aarch64-linux-gnu directory on my TX2. This cudnn_v7.h file is also in my repo (https://github.com/raspberry-pi-maker/NVIDIA-Jetson/tree/master/openpose-TX2/missing%20cudnn%20header).
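
After copying the header, you can confirm that it now defines a version. The sketch below assumes the header uses the CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL macros, as cuDNN 7 headers do; the function name cudnn_version is hypothetical. An empty header, like the one I started with, gives None.

```python
import re


def cudnn_version(header_text):
    """Return (major, minor, patch) from cudnn.h text, or None if undefined."""
    values = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        pattern = r"^\s*#\s*define\s+%s\s+(\d+)" % name
        match = re.search(pattern, header_text, re.MULTILINE)
        if match is None:
            return None  # the macro is missing, so the version is undefined
        values.append(int(match.group(1)))
    return tuple(values)
```

To use it, read /usr/include/aarch64-linux-gnu/cudnn_v7.h as text and pass the contents to cudnn_version.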

Then I restarted the build process, and this time it succeeded.

Under the hood

I explained how to avoid problems with OpenPose in Python, how to interpret keypoints, and so on in my article "JetsonNano - Human Pose estimation using OpenPose". Therefore, I will focus here on the performance comparison between the Nano and the TX2.

To avoid the import path problem, first copy the openpose Python package to a known path (/usr/lib/python3.6/dist-packages).

cp -r /work/src/openpose/build/python/openpose/ /usr/lib/python3.6/dist-packages
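
If you would rather not copy the package, another option is to add the build output directory to sys.path before importing. This is only a sketch under the assumption that the build leaves an openpose package directory under /work/src/openpose/build/python, which is the directory copied above; the helper name add_openpose_to_path is hypothetical.

```python
import os
import sys


def add_openpose_to_path(package_parent):
    """Put the directory that contains the `openpose` package on sys.path."""
    package_dir = os.path.join(package_parent, "openpose")
    if not os.path.isdir(package_dir):
        raise FileNotFoundError("no openpose package under %s" % package_parent)
    if package_parent not in sys.path:
        sys.path.insert(0, package_parent)
    return package_dir
```

After add_openpose_to_path("/work/src/openpose/build/python"), the usual "from openpose import pyopenpose as op" import should resolve against the build directory.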

Run a sample program

Let's run a sample program to test whether OpenPose is properly installed.

root@spytx-desktop:/work/src/openpose# ./build/examples/openpose/openpose.bin --video ./examples/media/video.avi
Starting OpenPose demo...
Configuring OpenPose...
Starting thread(s)...
Auto-detecting all available GPUs... Detected 1 GPU(s), using 1 of them starting at GPU 0.

If you see this output, the installation was successful.

TX2 vs Nano

When I tested on the Nano, the fps (frames per second) value was 0.6, so the TX2 shows about 2.5 times the performance.

Camera test

Let's test the camera on the TX2 and compare the performance with the Nano. Because the Jetson TX2 has a built-in camera, you don't have to use a webcam. In Python code, the TX2's built-in CSI camera is used in a slightly different way. There are two ways to capture: v4l2 or nvcamerasrc (nvarguscamerasrc). The nvarguscamerasrc plugin was created by NVIDIA, and it has access to the ISP, which helps convert Bayer to YUV suitable for the video encoders. Starting with L4T R23.2 there is a /dev/video0 node to capture from; however, this node gives you frames in Bayer format, which are NOT suitable for encoding, because it grabs frames directly from the ov5693 camera without using the ISP.
Therefore, in most cases, the nvcamerasrc (nvarguscamerasrc) method is used.

The nvcamerasrc plugin is deprecated as of r31.1. From r31.1 onward, use the nvarguscamerasrc plugin instead.
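
The camera script in this article passes one long GStreamer pipeline string to cv2.VideoCapture. As a sketch, the same string can be assembled by a small helper so that the resolution and frame rate are easy to change. The function name argus_pipeline and its parameters are hypothetical; the pipeline elements are the ones used in the script.

```python
def argus_pipeline(width=640, height=480, framerate=24, flip_method=0):
    """Build the nvarguscamerasrc pipeline string for cv2.VideoCapture."""
    return (
        "nvarguscamerasrc ! "
        "video/x-raw(memory:NVMM), width=(int)%d, height=(int)%d, "
        "format=(string)NV12, framerate=(fraction)%d/1 ! "
        "nvvidconv flip-method=%d ! "
        "video/x-raw, format=(string)BGRx ! videoconvert ! "
        "video/x-raw, format=(string)BGR ! appsink"
        % (width, height, framerate, flip_method)
    )
```

The defaults reproduce the 640x480, 24 fps settings used here. Whether the camera accepts other sizes depends on its sensor modes, which this sketch does not check.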

Run this command to check the release version.

root@spytx-desktop:~# head -n 1 /etc/nv_tegra_release
# R32 (release), REVISION: 2.1, GCID: 16294929, BOARD: t186ref, EABI: aarch64, DATE: Tue Aug 13 04:45:36 UTC 2019

The output of the above command shows that the release is "R32.2.1".
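
This check can also be scripted. The sketch below parses the first line of /etc/nv_tegra_release in the format shown above and picks the plugin name using the r31.1 boundary described in this article. The function names are hypothetical, and other L4T releases may format the line differently.

```python
import re


def parse_l4t_release(first_line):
    """Return (major, minor, patch) from the first line of /etc/nv_tegra_release."""
    match = re.search(r"R(\d+) \(release\), REVISION: (\d+)(?:\.(\d+))?", first_line)
    if match is None:
        raise ValueError("unrecognized release line: %r" % first_line)
    major, minor, patch = match.groups()
    return (int(major), int(minor), int(patch or 0))


def camera_plugin(release):
    """Choose the GStreamer camera source, using the r31.1 boundary from this article."""
    return "nvarguscamerasrc" if release[:2] >= (31, 1) else "nvcamerasrc"
```

For the line shown above, parse_l4t_release returns (32, 2, 1), so camera_plugin selects nvarguscamerasrc.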

import time
import cv2
from openpose import pyopenpose as op

if __name__ == '__main__':
    params = dict()
    params["model_folder"] = "../../models/"

    # Starting OpenPose: the wrapper must be configured and started before use
    opWrapper = op.WrapperPython()
    opWrapper.configure(params)
    opWrapper.start()

    print("OpenPose start")
    # Open the built-in CSI camera through the nvarguscamerasrc GStreamer pipeline
    cap = cv2.VideoCapture("nvarguscamerasrc ! video/x-raw(memory:NVMM), width=(int)640, height=(int)480,format=(string)NV12, framerate=(fraction)24/1 ! nvvidconv flip-method=0 ! video/x-raw, format=(string)BGRx ! videoconvert ! video/x-raw, format=(string)BGR ! appsink", cv2.CAP_GSTREAMER)
    if not cap.isOpened():
        print("Camera Open Error")

    # Fall back to the 24 fps requested in the pipeline if the backend reports 0
    out_fps = cap.get(cv2.CAP_PROP_FPS) or 24.0
    fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
    out_video = cv2.VideoWriter('/tmp/output.mp4', fourcc, out_fps, (640, 480))
    count = 0
    fps_time = time.time()

    # Process 30 frames, then stop
    while cap.isOpened() and count < 30:
        ret_val, dst = cap.read()
        if not ret_val:
            print("Camera read Error")
            break

        # Run pose estimation on the frame (list argument as in the OpenPose 1.5.x Python API)
        datum = op.Datum()
        datum.cvInputData = dst
        opWrapper.emplaceAndPop([datum])

        # fps is measured from the time between two consecutive processed frames
        fps = 1.0 / (time.time() - fps_time)
        fps_time = time.time()
        newImage = datum.cvOutputData[:, :, :]
        cv2.putText(newImage , "FPS: %f" % (fps), (20, 40),  cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        out_video.write(newImage)

        print("captured fps %f"%(fps))
        cv2.imshow("OpenPose 1.5.1 - Tutorial Python API", newImage)
        cv2.waitKey(1)  # imshow needs waitKey to refresh the window
        count += 1

    # Release the camera and finish writing the output file
    out_video.release()
    cap.release()
    cv2.destroyAllWindows()
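
The script prints an fps value computed from each single frame, which can jump around. If you want a steadier number for comparing boards, one option is to average over all processed frames. This is a sketch only; the FpsMeter class is hypothetical and is not part of run_cam_tx2.py.

```python
import time


class FpsMeter:
    """Average frames per second over all frames seen since the first tick."""

    def __init__(self, clock=time.time):
        self._clock = clock      # injectable so the meter can be tested
        self._start = None
        self._frames = 0

    def tick(self):
        """Call once per processed frame; returns the running average fps."""
        now = self._clock()
        if self._start is None:
            # The first call only starts the timer.
            self._start = now
            return 0.0
        self._frames += 1
        elapsed = now - self._start
        return self._frames / elapsed if elapsed > 0 else 0.0
```

Create one FpsMeter before the capture loop and call tick() after each frame has been processed.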


You can get the source with the following command:

git clone https://github.com/raspberry-pi-maker/NVIDIA-Jetson.git

Copy the Python file run_cam_tx2.py to the "/work/src/openpose/examples/tutorial_api_python" directory and run it.

cd /work/src/openpose/examples/tutorial_api_python
python3 run_cam_tx2.py

TX2 vs Nano

When I tested on the Nano, the fps (frames per second) value was 0.8, so the TX2 shows about 2.5 times the performance.
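
Both comparisons in this article quote only the Nano's fps and the 2.5x ratio. As a rough cross-check, the TX2 rates they imply follow from a simple multiplication. The values below are derived from those quoted numbers, not separate measurements, and the function name implied_fps is hypothetical.

```python
def implied_fps(nano_fps, speedup):
    """TX2 frame rate implied by the Nano frame rate and the quoted speedup."""
    return nano_fps * speedup


video_test = implied_fps(0.6, 2.5)   # sample video test
camera_test = implied_fps(0.8, 2.5)  # camera test
print("video: about %.1f fps, camera: about %.1f fps" % (video_test, camera_test))
```

This gives roughly 1.5 fps for the sample video and roughly 2 fps for the camera.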

Using keypoints

See my other article on analyzing keypoints and predicting their actual behavior in Python. (https://spyjetson.blogspot.com/2019/10/jetsonnano-human-pose-estimation-using.html)

Wrapping up

When using OpenPose, the TX2 showed a 2.5x performance improvement over the Nano. The result is not satisfactory, but if you lower the input video resolution, you can raise the frame rate above 2 fps, so I think it can be used to some extent in actual projects.

If you want the best human pose estimation performance on the Jetson series, see the following article (https://spyjetson.blogspot.com/2019/12/jetsonnano-human-pose-estimation-using.html). In it, the NVIDIA team introduces human pose estimation using models optimized for TensorRT.

