Saturday, June 10, 2023

Xavier NX - YOLOv8 TensorRT model Object Tracking (JetPack 5.1)

In my previous article Xavier NX - YOLOv8 Built-in Object Tracking and Vehicle Counting (JetPack 5.1), I explained the object tracking feature newly added to YOLOv8.

And in Xavier NX - YOLOv8 to TensorRT (JetPack 5.1), I explained how to convert a YOLOv8 model to TensorRT. On the Jetson series, which uses NVIDIA GPUs, TensorRT models perform exceptionally well, making them a good choice if you need a speed boost.

Unfortunately, the tracking function provided by YOLOv8 cannot be used with TensorRT models, so a standalone tracking model must be used alongside them.

In this article, we will learn how to add tracking to a TensorRT model created from YOLOv8.


Prerequisites


Tracking Model

ByteTrack, BoT-SORT, and StrongSORT provided with YOLOv8 are commonly used tracking models. Among them, ByteTrack is the lightest model, and its performance is still good.

In this article, I will use ByteTrack with TensorRT.


Downloading ByteTrack

ByteTrack is maintained at https://github.com/ifzhang/ByteTrack.git, but many developers have released modified versions of it.

I will download mikel-brostrom's version from GitHub and use it with some modifications.

First, clone the repository.

git clone https://github.com/mikel-brostrom/yolo_tracking.git


The source code is structured as follows. Among these, the parts we will use are the tracking models included in boxmot.

(base) spypiggy@spypiggy-NX:~/src$ conda activate yolov8
(yolov8) spypiggy@spypiggy-NX:~/src$ ls -al yolo_tracking/
total 100
drwxrwxr-x  8 spypiggy spypiggy  4096  6월  9 19:46 .
drwxrwxr-x 10 spypiggy spypiggy  4096  6월  9 19:46 ..
drwxrwxr-x  3 spypiggy spypiggy  4096  6월  9 19:46 assets
drwxrwxr-x  9 spypiggy spypiggy  4096  6월  9 19:46 boxmot
-rw-rw-r--  1 spypiggy spypiggy   459  6월  9 19:46 CITATION.cff
-rw-rw-r--  1 spypiggy spypiggy  1720  6월  9 19:46 Dockerfile
drwxrwxr-x  2 spypiggy spypiggy  4096  6월  9 19:46 examples
drwxrwxr-x  8 spypiggy spypiggy  4096  6월  9 19:46 .git
drwxrwxr-x  4 spypiggy spypiggy  4096  6월  9 19:46 .github
-rw-rw-r--  1 spypiggy spypiggy   340  6월  9 19:46 .gitignore
-rw-rw-r--  1 spypiggy spypiggy 34523  6월  9 19:46 LICENSE
-rwxrwxr-x  1 spypiggy spypiggy 12194  6월  9 19:46 README.md
-rwxrwxr-x  1 spypiggy spypiggy  1062  6월  9 19:46 requirements.txt
-rw-rw-r--  1 spypiggy spypiggy  2270  6월  9 19:46 setup.py
drwxrwxr-x  2 spypiggy spypiggy  4096  6월  9 19:46 tests
(yolov8) spypiggy@spypiggy-NX:~/src$ ls -al yolo_tracking/boxmot/
total 44
drwxrwxr-x 9 spypiggy spypiggy 4096  6월  9 19:46 .
drwxrwxr-x 8 spypiggy spypiggy 4096  6월  9 19:46 ..
drwxrwxr-x 3 spypiggy spypiggy 4096  6월  9 19:46 botsort
drwxrwxr-x 3 spypiggy spypiggy 4096  6월  9 19:46 bytetrack
drwxrwxr-x 3 spypiggy spypiggy 4096  6월  9 19:46 deep
drwxrwxr-x 3 spypiggy spypiggy 4096  6월  9 19:46 deepocsort
-rw-rw-r-- 1 spypiggy spypiggy  543  6월  9 19:46 __init__.py
drwxrwxr-x 3 spypiggy spypiggy 4096  6월  9 19:46 ocsort
drwxrwxr-x 6 spypiggy spypiggy 4096  6월  9 19:46 strongsort
-rw-rw-r-- 1 spypiggy spypiggy 3027  6월  9 19:46 tracker_zoo.py
drwxrwxr-x 2 spypiggy spypiggy 4096  6월  9 19:46 utils

Of these, I will copy and use only the bytetrack directory. 

Copy bytetrack to my working directory ~/src/yolov8 as follows.

cp -r yolo_tracking/boxmot/bytetrack ~/src/yolov8

Caution: Some modifications to the code are required to use this directory on its own. The modified code is on my GitHub, so use that version rather than mikel-brostrom's original code directly.


Using the tracker is not difficult. You only need to go through the following steps:

  • Import the required packages.
  • Create a tracker object.
  • Pass the YOLOv8 detection results to the tracker's update() function.

The return value of update() includes the bounding box coordinates, tracker ID, confidence score, and class ID.
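Here is a minimal sketch of that flow, assuming the modified bytetrack module from my GitHub is on your Python path (the detection values below are made up purely to show the expected input shape):

import numpy as np
from bytetrack.byte_tracker import BYTETracker

tracker = BYTETracker()

# One row per detection: [x1, y1, x2, y2, confidence, class_id]
# (hypothetical values, just for illustration)
dets = np.array([[100, 150, 300, 400, 0.91, 2],
                 [420, 160, 560, 380, 0.84, 2]])

# update() matches detections against existing tracks; each returned row
# contains [x1, y1, x2, y2, track_id, confidence, class_id]
outputs = tracker.update(dets, None)
for row in outputs:
    track_id, conf, cls_id = row[4], row[5], row[6]
    print("track %d: class %d, conf %.2f" % (int(track_id), int(cls_id), conf))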


Implementing TensorRT and ByteTrack together


The following code adds ByteTrack to the video_detect_cv_trt.py code previously used in YOLOv8 to TensorRT. I used the yolov8s.engine TensorRT model.


'''
from https://github.com/mikel-brostrom/yolo_tracking/tree/master/boxmot
copy boxmot (where trackers resides)
'''
from ultralytics import YOLO
import cv2
import numpy as np
import time, sys
from models import TRTModule  # isort:skip
from models.torch_utils import det_postprocess
from models.utils import blob, letterbox, path_to_list
from config import CLASSES, COLORS
import argparse
import torch

from bytetrack.byte_tracker import BYTETracker

parser = argparse.ArgumentParser()
parser.add_argument('--track', type=str, default="bytetrack")  # only bytetrack is supported at this point
args = parser.parse_args()

colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
font = cv2.FONT_HERSHEY_SIMPLEX
fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')

device = 'cuda:0'
engine = "yolov8s.engine"
# Load a model
Engine = TRTModule(engine, device)
H, W = Engine.inp_info[0].shape[-2:]
Engine.set_desired(['num_dets', 'bboxes', 'scores', 'labels'])

tracker = BYTETracker()
#label_map = model.names

f = 0
net_total = 0.0
total = 0.0

# Helper that draws raw detection boxes and writes the frame.
# It is not called in the main loop below, which draws boxes inline.
def draw_boxes(img, boxes, out_video):
    index = 0
    for box in boxes.data:
        p1 = (int(box[0].item()), int(box[1].item()))
        p2 = (int(box[2].item()), int(box[3].item()))
        img = cv2.rectangle(img, p1, p2, colors[index % len(colors)], 3)
        text = CLASSES[int(box[5].item())] + " %4.2f" % (box[4].item())
        cv2.putText(img, text, (p1[0], p1[1] - 10), font, fontScale=1, color=colors[index % len(colors)], thickness=2)
        index += 1
    # cv2.imshow("draw", img)
    # cv2.waitKey(1)
    out_video.write(img)



def main():
    global f, net_total, total
    cap = cv2.VideoCapture("./highway_traffic.mp4")
    # Warm up: run one inference on the first frame and discard the result
    ret, img = cap.read()
    h, w, c = img.shape
    img, ratio, dwdh = letterbox(img, (W, H))
    rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    tensor = blob(rgb, return_seg=False)
    dwdh = torch.asarray(dwdh * 2, dtype=torch.float32, device=device)
    tensor = torch.asarray(tensor, device=device)
    data = Engine(tensor)

    fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
    out_video = cv2.VideoWriter('./trt_track_result.mp4', fourcc, cap.get(cv2.CAP_PROP_FPS), (w, h))



    while cap.isOpened():
        s = time.time()
        ret, img = cap.read()
        if ret == False:
            break
            
        draw = img.copy()
        img, ratio, dwdh = letterbox(img, (W, H))
        rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        tensor = blob(rgb, return_seg=False)
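        # dwdh is the (dw, dh) letterbox padding; multiplying the tuple by 2 repeats it to (dw, dh, dw, dh) so it can be subtracted from xyxy boxes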
        dwdh = torch.asarray(dwdh * 2, dtype=torch.float32, device=device)
        tensor = torch.asarray(tensor, device=device)


        net_s = time.time()
        data = Engine(tensor)
        net_e = time.time()

        bboxes, scores, labels = det_postprocess(data)
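        # Map boxes back to the original frame: subtract the letterbox padding, then undo the resize ratio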
        bboxes -= dwdh
        bboxes /= ratio
        track_data = []
        for (bbox, score, label) in zip(bboxes, scores, labels):
            bbox = bbox.round().int().tolist()
            cls_id = int(label)
            cls = CLASSES[cls_id]
            color = COLORS[cls]
            track_data.append([bbox[0], bbox[1], bbox[2], bbox[3], score.item(), cls_id])
            cv2.rectangle(draw, bbox[:2], bbox[2:], color, 2)
            cv2.putText(draw,
                        f'{cls}:{score:.3f}', (bbox[0], bbox[1] - 2),
                        cv2.FONT_HERSHEY_SIMPLEX,
                        0.75, [225, 255, 255],
                        thickness=2)


        np_track = np.array(track_data)
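        # Each np_track row is [x1, y1, x2, y2, confidence, class_id]; update() matches detections to existing tracks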
        outputs = tracker.update(np_track, None)
        if outputs.size:
            # Each output row is [x1, y1, x2, y2, track_id, confidence, class_id]
            for i in range(outputs.shape[0]):
                start = outputs[i][0:2]
                end = outputs[i][2:4]
                track_id = outputs[i][4]
                conf = outputs[i][5]
                cls_id = outputs[i][6]
                cv2.putText(draw,
                            f'{int(track_id)}', (int(start[0]) - 20, int(start[1]) - 2),
                            cv2.FONT_HERSHEY_SIMPLEX,
                            0.75, [225, 255, 0],
                            thickness=2)


        #cv2.imshow('result', draw)
        #cv2.waitKey(1)
        e = time.time()
        net_total += (net_e - net_s)
        total += (e - s)
        f += 1
        out_video.write(draw)
    
    fps = f / total 
    net_fps = f / net_total 

    print("Total processed frames:%d"%f)
    print("FPS:%4.2f"%fps)
    print("Net FPS:%4.2f"%net_fps)
    cv2.destroyAllWindows()
    cap.release()
    out_video.release()

if __name__ == "__main__":
    main()

<sample_track_trt.py>


Now let's run the code above and check trt_track_result.mp4, where the result is saved.


(yolov8) spypiggy@spypiggy-NX:~/src/yolov8$ python sample_track_trt.py
[06/10/2023-10:14:27] [TRT] [W] Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
Total processed frames:1548
FPS:17.03
Net FPS:46.28


Looking at the result, the net FPS is almost the same as in the TensorRT-only test, but the overall FPS has decreased. This is because the tracker processing takes additional time: at 17.03 FPS each frame takes about 59 ms, of which network inference accounts for only about 22 ms (46.28 net FPS), so the rest is spent on pre/post-processing and tracking.

and "Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors." You can ignore the warning message.

The TensorRT model I used was made by XavierNX. Sometimes TensorRT misrecognizes engine files as being created on a different GPU.

See https://github.com/dusty-nv/jetson-inference/issues/883 for a discussion on this.


As dusty-nv explains, you can safely ignore this warning.

However, keep in mind that you must run the engine on the same GPU model and with the same TensorRT version it was built with.
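If you are not sure which environment an engine came from, a quick check like the following (my own snippet, not part of the sample code) prints the TensorRT version and GPU name of the current environment so you can compare them with the build environment:

import tensorrt as trt
import torch

print("TensorRT version:", trt.__version__)
print("GPU:", torch.cuda.get_device_name(0))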


If you look at the resulting video, you can see that the Tracker ID was created properly.



Wrapping up

The best way to use YOLOv8 on the Jetson series is to convert it to TensorRT first. Jetson products, including the Xavier NX, include GPUs and ship with JetPack for ease of use, but their performance is by no means comparable to a PC with an RTX GPU. Limited memory and the relatively low performance of the ARM CPU compared to x86 make large models a considerable burden in practice.

Therefore, the high-accuracy YOLOv8l and YOLOv8x models are too slow to run directly on the Xavier NX. Converting them to TensorRT, however, more than doubles the processing speed and makes up for this shortcoming.

Over several articles, I have explained how to use YOLOv8, how to convert models to TensorRT, and how to track and count objects.

The source code can be downloaded from my GitHub.

