Thursday, November 7, 2019

JetsonTX2 - Human Pose estimation using OpenPose

I used a Jetson TX2 with the official Ubuntu 18.04 image, logged in as root. I explained using OpenPose on the Jetson Nano in my previous article (https://spyjetson.blogspot.com/2019/10/jetsonnano-human-pose-estimation-using.html).
This article is quite similar to that one, because both devices have the same CPU architecture and the same OS (Ubuntu 18.04).
On the Jetson Nano, I used the /usr/local/src directory as my main work directory.
On the Jetson TX2, however, I'm using /work as my main working directory, because I've installed an SSD on my Jetson TX2 and mounted it to /work. The SSD installation guide for the TX2 is here (https://spyjetson.blogspot.com/2019/11/jetson-tx2-install-ssd.html).



 

Prerequisites

Before you build OpenPose, you must install these packages on the Jetson TX2 first.


apt-get update -y
apt-get upgrade -y
apt-get install libboost-dev libboost-all-dev
apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev libatlas-base-dev liblmdb-dev libblas-dev libatlas-base-dev libprotobuf-dev libleveldb-dev libsnappy-dev libhdf5-serial-dev protobuf-compiler


Next, upgrade OpenCV to 4.1.x.

OpenCV installation page: https://spyjetson.blogspot.com/2019/11/jetson-tx2-opencv-411-upgrade.html
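After the upgrade, you can confirm the version with a quick check (assuming the Python bindings were installed as part of the build):

python3 -c "import cv2; print(cv2.__version__)"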

OpenCV header file location check

If you upgraded OpenCV to 4.1.1, check the location of the header files. In my case, they were in "/usr/local/include/opencv4/opencv2", which is not where the OpenPose build expects them. I moved the opencv2 directory to "/usr/local/include".

mv /usr/local/include/opencv4/opencv2 /usr/local/include/

cmake version check  

To build OpenPose on the Jetson TX2, you need cmake version 3.12.2 or higher. First, check the cmake version.

cmake --version

If your Jetson TX2's cmake version is lower than 3.12.2, remove the old cmake and build a new one from source. Visit https://github.com/Kitware/CMake/releases/ to check the latest version, then download it (latest version as of 2019.11: 3.15.5).
For cmake to support HTTPS, you have to build it with the HTTPS support option:
run bootstrap with the --system-curl option.


apt-get purge cmake
apt-get install libcurl4 libcurl4-openssl-dev -y
cd /work/src
wget https://github.com/Kitware/CMake/releases/download/v3.15.5/cmake-3.15.5.tar.gz
tar -xvzf cmake-3.15.5.tar.gz
cd cmake-3.15.5
./bootstrap --system-curl
make -j4
make install 

Then restart your ssh session.
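Alternatively, if you use bash, clearing the shell's cached command paths is enough to pick up the new binary without reconnecting:

hash -r
cmake --version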

 

Install OpenPose

Follow these steps. OpenPose uses the Caffe framework for its deep learning, so these steps install Caffe too. Don't forget the final steps for Python support.


cd /work/src
git clone https://github.com/CMU-Perceptual-Computing-Lab/openpose
cd openpose
bash ./scripts/ubuntu/install_deps.sh
mkdir build
cd build
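# CUDA_ARCH_BIN=62 targets the TX2's GPU (compute capability 6.2)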
cmake -D CUDA_ARCH_BIN=62 -D BUILD_PYTHON=ON -D BUILD_EXAMPLES=OFF  ..
make -j4
make install
#==== python build ==== 
# do not run "make install" here; it installs the openpose module into the Python 2.7 directory
cd python
make -j4


Error occurred


During the build, I got an error message like this from the bundled Caffe.

./include/caffe/util/cudnn.hpp:15:2: error: #error "NVCaffe 0.16 and higher requires CuDNN version 6.0.0 or higher"

After spending a lot of time, I found the cause: the cuDNN version was not defined. Why?
I traced the cudnn.h file and found that the size of the cudnn_v7.h file is zero. I don't know the reason. I'm using JetPack 4.2.2 and used SDK Manager to install the libraries.

These commands trace the symbolic links:
spytx@spytx-desktop:/usr/include$ ls -al cudnn*
lrwxrwxrwx 1 root root 26 11월  4 11:18 cudnn.h -> /etc/alternatives/libcudnn
spytx@spytx-desktop:/usr/include$ ls -al /etc/alternatives/libcudnn
lrwxrwxrwx 1 root root 41 11월  4 11:18 /etc/alternatives/libcudnn -> /usr/include/aarch64-linux-gnu/cudnn_v7.h
spytx@spytx-desktop:/usr/include$ ls -al /usr/include/aarch64-linux-gnu/cudnn_v7.h
-rw-r--r-- 1 root root 0 11월  4 11:18 /usr/include/aarch64-linux-gnu/cudnn_v7.h


Working around the error

I had to find the correct file, but I couldn't find an ARM version of the libcudnn7-dev package, because this library is probably managed by SDK Manager.

Instead, I downloaded the libcudnn7-dev_7.5.0.56-1+cuda10.0_amd64.deb file from https://developer.nvidia.com/rdp/cudnn-archive (cuDNN v7.5.0 (Feb 21, 2019), for CUDA 10.0).

I didn't install the deb file; I only extracted the files from it. An ugly method, but it works.


>ar x libcudnn7-dev_7.5.0.56-1+cuda10.0_amd64.deb
>ls -al
-rw-r--r--  1 root root       852 11월  7 22:32 control.tar.xz
-rw-r--r--  1 root root 140166104 11월  7 22:32 data.tar.xz

>tar -xvf data.tar.xz
./
./usr/
./usr/include/
./usr/include/x86_64-linux-gnu/
./usr/include/x86_64-linux-gnu/cudnn_v7.h
./usr/lib/
./usr/lib/x86_64-linux-gnu/
./usr/lib/x86_64-linux-gnu/libcudnn_static_v7.a
./usr/share/
./usr/share/doc/
./usr/share/doc/libcudnn7-dev/
./usr/share/doc/libcudnn7-dev/changelog.Debian.gz
./usr/share/doc/libcudnn7-dev/copyright
./usr/share/lintian/
./usr/share/lintian/overrides/
./usr/share/lintian/overrides/libcudnn7-dev

Then I checked whether the cudnn_v7.h file contains any CPU-dependent code. Fortunately, it does not. I copied the header file to the /usr/include/aarch64-linux-gnu directory on my TX2. This cudnn_v7.h file is in my repo (https://github.com/raspberry-pi-maker/NVIDIA-Jetson/tree/master/openpose-TX2/missing%20cudnn%20header).
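For reference, here is a sketch of the check and the copy (run from the directory where the deb file was extracted; the grep pattern is just my guess at the usual architecture guards):

# look for architecture-specific preprocessor guards; no output means none were found
grep -nE '__x86_64__|__aarch64__|_M_X64' ./usr/include/x86_64-linux-gnu/cudnn_v7.h
# replace the empty header with the extracted one
cp ./usr/include/x86_64-linux-gnu/cudnn_v7.h /usr/include/aarch64-linux-gnu/cudnn_v7.h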

Then I restarted the build process. This time it succeeded.



Under the hood

I explained how to avoid problems with OpenPose in Python, how to interpret keypoints, etc. in my article (JetsonNano - Human Pose estimation using OpenPose). Therefore, I will focus here on the performance comparison between the Nano and the TX2.


To avoid the import path problem, first copy the openpose Python package to a known path (/usr/lib/python3.6/dist-packages).

cp -r /work/src/openpose/build/python/openpose/ /usr/lib/python3.6/dist-packages
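A quick sanity check that the package is importable (assuming the copy above succeeded):

python3 -c "from openpose import pyopenpose as op; print(op)"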


Run a sample program

Let's run a sample program to test whether OpenPose is properly installed.


root@spytx-desktop:/work/src/openpose# ./build/examples/openpose/openpose.bin --video ./examples/media/video.avi
Starting OpenPose demo...
Configuring OpenPose...
Starting thread(s)...
Auto-detecting all available GPUs... Detected 1 GPU(s), using 1 of them starting at GPU 0.

If you see this output, the installation was successful.

TX2 vs Nano

When I tested on the Nano, the fps (frames per second) value was 0.6, so the TX2 shows about 2.5 times that performance.


Camera test

Now let's test the TX2 to compare its performance with the Nano's. Because the Jetson TX2 has a built-in camera, you don't need a webcam. In Python code, the Jetson TX2's built-in CSI camera is used in a slightly different way. There are two ways to capture: using v4l2, or using nvcamerasrc (nvarguscamerasrc). The nvarguscamerasrc plugin was created by NVIDIA and has access to the ISP, which helps convert from Bayer to YUV suitable for the video encoders. Starting with L4T R23.2 there is a /dev/video0 node to capture from; however, this node gives you frames in Bayer format, which are NOT suitable for encoding, because it grabs frames directly from the ov5693 camera without using the ISP.
Therefore, in most cases, the nvcamerasrc (nvarguscamerasrc) method is used.

The nvcamerasrc plugin is deprecated as of r31.1. On r31.1 and later, use the nvarguscamerasrc plugin instead.

Run this command to check the release version.


root@spytx-desktop:~# head -n 1 /etc/nv_tegra_release
# R32 (release), REVISION: 2.1, GCID: 16294929, BOARD: t186ref, EABI: aarch64, DATE: Tue Aug 13 04:45:36 UTC 2019

The output above shows release R32.2.1, which is later than r31.1, so nvarguscamerasrc is the plugin to use.
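Before running the Python code, you can verify the CSI camera and the nvarguscamerasrc plugin with a standalone pipeline (a minimal sketch; nvoverlaysink renders directly to the local display):

gst-launch-1.0 nvarguscamerasrc ! 'video/x-raw(memory:NVMM), width=640, height=480, format=NV12, framerate=24/1' ! nvvidconv ! nvoverlaysink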



import sys
import time
import cv2
from openpose import pyopenpose as op

if __name__ == '__main__':
    fps_time = 0

    params = dict()
    params["model_folder"] = "../../models/"

    # Starting OpenPose
    opWrapper = op.WrapperPython()
    opWrapper.configure(params)
    opWrapper.start()


    print("OpenPose start")
    # Capture from the CSI camera through the ISP (NV12), then convert to BGR for OpenCV
    cap = cv2.VideoCapture("nvarguscamerasrc ! video/x-raw(memory:NVMM), width=(int)640, height=(int)480,format=(string)NV12, framerate=(fraction)24/1 ! nvvidconv flip-method=0 ! video/x-raw, format=(string)BGRx ! videoconvert ! video/x-raw, format=(string)BGR ! appsink")
    if not cap.isOpened():
        print("Camera Open Error")
        sys.exit(0)

    ret_val, img = cap.read()   # warm-up read
    fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
    fps_in = cap.get(cv2.CAP_PROP_FPS)
    if fps_in <= 0:
        fps_in = 24.0           # fall back to the pipeline's frame rate if the property is unavailable
    out_video = cv2.VideoWriter('/tmp/output.mp4', fourcc, fps_in, (640, 480))

    count = 0
    while cap.isOpened() and count < 30:
        ret_val, dst = cap.read()
        if ret_val == False:
            print("Camera read Error")
            break
        #dst = cv2.resize(image, dsize=(320, 240), interpolation=cv2.INTER_AREA)
        #cv2.imshow("OpenPose 1.5.1 - Tutorial Python API", dst)
        #continue

        datum = op.Datum()
        datum.cvInputData = dst
        opWrapper.emplaceAndPop([datum])
        fps = 1.0 / (time.time() - fps_time)
        fps_time = time.time()
        newImage = datum.cvOutputData[:, :, :]
        cv2.putText(newImage , "FPS: %f" % (fps), (20, 40),  cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)
        out_video.write(newImage)

        print("captured fps %f"%(fps))
        cv2.imshow("OpenPose 1.5.1 - Tutorial Python API", newImage)
        count += 1


    cv2.destroyAllWindows()        
    out_video.release()
    cap.release()


You can get the source with the following command:


git clone https://github.com/raspberry-pi-maker/NVIDIA-Jetson.git

Copy the Python file run_cam_tx2.py to the "/work/src/openpose/examples/tutorial_api_python" directory and run it.


cd /work/src/openpose/examples/tutorial_api_python
python3 run_cam_tx2.py



TX2 vs Nano

When I tested on the Nano, the fps (frames per second) value was 0.8, so the TX2 again shows about 2.5 times that performance.



Using keypoints

See my other article on analyzing keypoints and predicting actual behavior from them in Python (https://spyjetson.blogspot.com/2019/10/jetsonnano-human-pose-estimation-using.html).
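As a minimal illustration (assuming the default BODY_25 model, where datum.poseKeypoints is a numpy array of shape (num_people, 25, 3) holding x, y and a confidence score), the keypoints from the sample above can be read like this:

# reuse opWrapper and a captured frame (dst) from the sample above
datum = op.Datum()
datum.cvInputData = dst
opWrapper.emplaceAndPop([datum])
if datum.poseKeypoints is not None and datum.poseKeypoints.ndim == 3:
    for person in datum.poseKeypoints:      # one (25, 3) array per detected person
        x, y, score = person[0]             # index 0 is the nose in BODY_25
        print("Nose: (%.0f, %.0f) score %.2f" % (x, y, score))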


Wrapping up

With OpenPose, the TX2 showed about 2.5x the performance of the Nano. The results are not fully satisfactory, but if you lower the input resolution, you can push the frame rate above 2 fps, so I think it can be used to some extent in actual projects.
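For example, lowering the network input resolution via the net_resolution parameter (a standard OpenPose option; values must be multiples of 16, and the default is -1x368) trades accuracy for speed in the params dict of the sample above:

params["net_resolution"] = "256x144"  # smaller network input: faster inference, lower accuracy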

If you want the best human pose estimation performance on the Jetson series, see the following article (https://spyjetson.blogspot.com/2019/12/jetsonnano-human-pose-estimation-using.html). The NVIDIA team introduces human pose estimation using models optimized for TensorRT.





8 comments:

  1. Hello, I enjoy reading your posts.
    I'm a third-year student working on a project that uses OpenPose on a TX2 board. I've run into some difficulties along the way.
    Could I get some help from your knowledge of this area?

  2. I've only been working with Jetson for a short time myself, so I'm not sure how much help I can be.
    How can I help you?
    If you email me the part you need help with, I'll answer as far as I can.

    1. Thank you so much~!
      I've emailed you at the address in your profile!

  3. Hello. I'm a student working on a project that uses OpenPose on a TX2 board. I need a certain speed for real-time processing, but after installing it I only get about 1.5 fps. Could I get some help?

    1. In my tests, the numbers depend on the inference image size, but the FPS values were roughly at these levels:
      Xavier NX: 3.5 ~ 4
      TX2: ~2
      Nano: ~1

      OpenPose gives quite accurate results, but speed is always a problem when using it on a Jetson.
      Although OpenPose uses CUDA, the model itself is so heavy that there is a fundamental limit to how much the speed can improve.
      If you must use CMU's OpenPose, improving the speed will not be easy.
      HyperPose (https://github.com/tensorlayer/hyperpose) is said to have optimized OpenPose for TensorRT, but I haven't tested it yet.
      If you have to use CMU OpenPose, I recommend taking a look at HyperPose.

      If your goal is pose estimation, the most recommendable option on the Jetson series is a model that uses TensorRT.
      It is fast with a reasonable level of accuracy (though not as good as OpenPose).
      I described how to use it on the TX2 at https://spyjetson.blogspot.com/2019/12/jetsontx2-human-pose-estimation-using.html.



  4. Hi, after following your tutorial, I can't run the example because openpose.bin is missing.
    More people are having this issue: https://github.com/CMU-Perceptual-Computing-Lab/openpose/issues/719
    Can you please fix that?
    Thanks in advance.

    1. Hi Salva.
      The cmake command only generates the Makefile.
      So after running cmake, you should run the "make && make install" commands.
      Please see my other blog post about installing OpenPose on JetPack 4.5: "https://spyjetson.blogspot.com/2021/02/jetpack-45-install-latest-version-of_13.html"
