In my previous article, Xavier NX - YOLOv8 Built-in Object Tracking and Vehicle Counting (JetPack 5.1), I explained the object tracking feature newly added to YOLOv8.
Another article, Xavier NX - YOLOv8 to TensorRT (JetPack 5.1), explained how to convert a YOLOv8 model to TensorRT. On the Jetson series, which uses NVIDIA GPUs, TensorRT models perform exceptionally well, making them a good choice if you need a speed boost.
Unfortunately, the tracking function provided by YOLOv8 cannot be used with TensorRT models, so a separate tracker has to be run alongside the model.
In this article, we will learn how to add tracking to a TensorRT model created from YOLOv8.
Prerequisites
- Xavier NX - YOLOv8 Video Object Detection (JetPack 5.1)
- Xavier NX - YOLOv8 Built-in Object Tracking and Vehicle Counting (JetPack 5.1)
- Xavier NX - YOLOv8 to TensorRT (JetPack 5.1)
Tracking Model
ByteTrack, BotSORT, and StrongSORT provided by YOLOv8 are commonly used tracking models. Among them, ByteTrack is the lightest, and its tracking performance is also good.
In this article, I will use ByteTrack with TensorRT.
Downloading ByteTrack
ByteTrack is maintained at https://github.com/ifzhang/ByteTrack.git, but many developers have released modified versions of it.
I will download mikel-brostrom's GitHub repository and use it with some modifications.
First, clone the repository.
git clone https://github.com/mikel-brostrom/yolo_tracking.git
The source code is structured as follows. Among these, the parts we will use are the tracking models included in boxmot.
(base) spypiggy@spypiggy-NX:~/src$ conda activate yolov8
(yolov8) spypiggy@spypiggy-NX:~/src$ ls -al yolo_tracking/
total 100
drwxrwxr-x  8 spypiggy spypiggy  4096 Jun  9 19:46 .
drwxrwxr-x 10 spypiggy spypiggy  4096 Jun  9 19:46 ..
drwxrwxr-x  3 spypiggy spypiggy  4096 Jun  9 19:46 assets
drwxrwxr-x  9 spypiggy spypiggy  4096 Jun  9 19:46 boxmot
-rw-rw-r--  1 spypiggy spypiggy   459 Jun  9 19:46 CITATION.cff
-rw-rw-r--  1 spypiggy spypiggy  1720 Jun  9 19:46 Dockerfile
drwxrwxr-x  2 spypiggy spypiggy  4096 Jun  9 19:46 examples
drwxrwxr-x  8 spypiggy spypiggy  4096 Jun  9 19:46 .git
drwxrwxr-x  4 spypiggy spypiggy  4096 Jun  9 19:46 .github
-rw-rw-r--  1 spypiggy spypiggy   340 Jun  9 19:46 .gitignore
-rw-rw-r--  1 spypiggy spypiggy 34523 Jun  9 19:46 LICENSE
-rwxrwxr-x  1 spypiggy spypiggy 12194 Jun  9 19:46 README.md
-rwxrwxr-x  1 spypiggy spypiggy  1062 Jun  9 19:46 requirements.txt
-rw-rw-r--  1 spypiggy spypiggy  2270 Jun  9 19:46 setup.py
drwxrwxr-x  2 spypiggy spypiggy  4096 Jun  9 19:46 tests
(yolov8) spypiggy@spypiggy-NX:~/src$ ls -al yolo_tracking/boxmot/
total 44
drwxrwxr-x 9 spypiggy spypiggy 4096 Jun  9 19:46 .
drwxrwxr-x 8 spypiggy spypiggy 4096 Jun  9 19:46 ..
drwxrwxr-x 3 spypiggy spypiggy 4096 Jun  9 19:46 botsort
drwxrwxr-x 3 spypiggy spypiggy 4096 Jun  9 19:46 bytetrack
drwxrwxr-x 3 spypiggy spypiggy 4096 Jun  9 19:46 deep
drwxrwxr-x 3 spypiggy spypiggy 4096 Jun  9 19:46 deepocsort
-rw-rw-r-- 1 spypiggy spypiggy  543 Jun  9 19:46 __init__.py
drwxrwxr-x 3 spypiggy spypiggy 4096 Jun  9 19:46 ocsort
drwxrwxr-x 6 spypiggy spypiggy 4096 Jun  9 19:46 strongsort
-rw-rw-r-- 1 spypiggy spypiggy 3027 Jun  9 19:46 tracker_zoo.py
drwxrwxr-x 2 spypiggy spypiggy 4096 Jun  9 19:46 utils
Of these, I will copy and use only the bytetrack directory.
Copy bytetrack to my working directory, ~/src/yolov8, as follows.
cp -r yolo_tracking/boxmot/bytetrack ~/src/yolov8
Caution: The bytetrack directory needs some code modifications before it can be used on its own. The modified code is on my GitHub, so use that version rather than mikel-brostrom's code directly.
Using the tracker is not difficult. You have to go through the following process:
- Import the required packages.
- Create a tracker object.
- Pass the YOLOv8 detection results to the tracker's update function.
The value returned by update includes the bounding box, track ID, confidence, and class ID of each tracked object.
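The data going into and out of update can be sketched as follows. This is a minimal illustration, not the tracker's own code: the helper names build_detections and unpack_tracks are hypothetical, the column order is taken from how sample_track_trt.py later in this article indexes the arrays, and the sample values are made up. The real tracker is not called, so the snippet runs on its own.

```python
import numpy as np


def build_detections(bboxes, scores, labels):
    """Pack detections into an (N, 6) array: x1, y1, x2, y2, score, class id."""
    rows = [[*box, score, cls_id] for box, score, cls_id in zip(bboxes, scores, labels)]
    # reshape(-1, 6) keeps the array two-dimensional even when nothing was detected
    return np.array(rows, dtype=float).reshape(-1, 6)


def unpack_tracks(outputs):
    """Split rows laid out as x1, y1, x2, y2, track id, confidence, class id."""
    tracks = []
    for row in np.asarray(outputs, dtype=float).reshape(-1, 7):
        tracks.append({
            "box": tuple(int(v) for v in row[0:4]),
            "track_id": int(row[4]),
            "conf": float(row[5]),
            "cls_id": int(row[6]),
        })
    return tracks


# One made-up detection: a box, a score, and a class id
dets = build_detections([[10, 20, 110, 220]], [0.9], [2])
print(dets.shape)  # (1, 6)

# Made-up tracker output for one tracked object (illustration only)
sample = np.array([[10, 20, 110, 220, 1, 0.9, 2]])
print(unpack_tracks(sample))
```

In the real script, the array from build_detections would be what is handed to tracker.update, and the array it returns would be what unpack_tracks reads.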
Implementing TensorRT and ByteTrack together
The following code adds ByteTrack to the video_detect_cv_trt.py script used earlier in the YOLOv8 to TensorRT article. I used the yolov8s.engine TensorRT model.
'''
from https://github.com/mikel-brostrom/yolo_tracking/tree/master/boxmot
copy boxmot (where trackers resides)
'''
import cv2
import numpy as np
import time
from models import TRTModule  # isort:skip
from models.torch_utils import det_postprocess
from models.utils import blob, letterbox
from config import CLASSES, COLORS
import argparse
import torch
from bytetrack.byte_tracker import BYTETracker

parser = argparse.ArgumentParser()
parser.add_argument('--track', type=str, default="bytetrack")  # At this point, only supports bytetrack
args = parser.parse_args()

device = 'cuda:0'
engine = "yolov8s.engine"

# Load the TensorRT engine
Engine = TRTModule(engine, device)
H, W = Engine.inp_info[0].shape[-2:]
Engine.set_desired(['num_dets', 'bboxes', 'scores', 'labels'])

tracker = BYTETracker()

f = 0
net_total = 0.0
total = 0.0


def preprocess(img):
    # Letterbox the frame to the engine input size and move it to the GPU
    img, ratio, dwdh = letterbox(img, (W, H))
    rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    tensor = blob(rgb, return_seg=False)
    dwdh = torch.asarray(dwdh * 2, dtype=torch.float32, device=device)
    tensor = torch.asarray(tensor, device=device)
    return tensor, ratio, dwdh


def main():
    global f, net_total, total

    cap = cv2.VideoCapture("./highway_traffic.mp4")

    # Skip first frame result (warm-up inference)
    ret, img = cap.read()
    h, w, c = img.shape
    tensor, ratio, dwdh = preprocess(img)
    data = Engine(tensor)

    fourcc = cv2.VideoWriter_fourcc('m', 'p', '4', 'v')
    out_video = cv2.VideoWriter('./trt_track_result.mp4', fourcc, cap.get(cv2.CAP_PROP_FPS), (w, h))

    while cap.isOpened():
        s = time.time()
        ret, img = cap.read()
        if not ret:
            break
        draw = img.copy()
        tensor, ratio, dwdh = preprocess(img)

        net_s = time.time()
        data = Engine(tensor)
        net_e = time.time()

        # Map the boxes from letterboxed coordinates back to the original frame
        bboxes, scores, labels = det_postprocess(data)
        bboxes -= dwdh
        bboxes /= ratio

        track_data = []
        for (bbox, score, label) in zip(bboxes, scores, labels):
            bbox = bbox.round().int().tolist()
            cls_id = int(label)
            cls = CLASSES[cls_id]
            color = COLORS[cls]
            track_data.append([bbox[0], bbox[1], bbox[2], bbox[3], score.item(), cls_id])
            cv2.rectangle(draw, bbox[:2], bbox[2:], color, 2)
            cv2.putText(draw, f'{cls}:{score.item():.3f}', (bbox[0], bbox[1] - 2),
                        cv2.FONT_HERSHEY_SIMPLEX, 0.75, [225, 255, 255], thickness=2)

        # Keep the array two-dimensional even when a frame has no detections
        np_track = np.array(track_data, dtype=float).reshape(-1, 6)
        outputs = tracker.update(np_track, None)
        if outputs.size:
            for i in range(outputs.shape[0]):
                # Each row: x1, y1, x2, y2, track id, confidence, class id
                start = outputs[i][0:2]
                end = outputs[i][2:4]
                track_id = outputs[i][4]
                conf = outputs[i][5]
                cls_id = outputs[i][6]
                cv2.putText(draw, f'{int(track_id)}', (int(start[0]) - 20, int(start[1]) - 2),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.75, [225, 255, 0], thickness=2)

        # cv2.imshow('result', draw)
        # cv2.waitKey(1)
        e = time.time()
        net_total += (net_e - net_s)
        total += (e - s)
        f += 1
        out_video.write(draw)

    fps = f / total
    net_fps = f / net_total
    print("Total processed frames:%d" % f)
    print("FPS:%4.2f" % fps)
    print("Net FPS:%4.2f" % net_fps)
    cv2.destroyAllWindows()
    cap.release()
    out_video.release()


if __name__ == "__main__":
    main()
<sample_track_trt.py>
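The two lines bboxes -= dwdh and bboxes /= ratio in the script undo the letterbox resize so that the boxes handed to the tracker are in original-frame pixels. The sketch below reproduces only that arithmetic in plain NumPy. It is not the letterbox helper from the models package, and the 1280x720 frame size and 640x640 input size are assumptions chosen for the example.

```python
import numpy as np


def letterbox_params(frame_wh, input_wh):
    """Return the scale ratio and the (dw, dh) padding added on each side."""
    fw, fh = frame_wh
    iw, ih = input_wh
    ratio = min(iw / fw, ih / fh)
    dw = (iw - fw * ratio) / 2
    dh = (ih - fh * ratio) / 2
    return ratio, (dw, dh)


def to_frame_coords(bboxes, ratio, dwdh):
    """Map x1, y1, x2, y2 boxes from letterboxed input space back to the frame."""
    offset = np.array(dwdh * 2, dtype=float)  # tuple repeated: (dw, dh, dw, dh)
    return (np.asarray(bboxes, dtype=float) - offset) / ratio


# Assumed sizes: a 1280x720 video frame and a 640x640 network input
ratio, dwdh = letterbox_params((1280, 720), (640, 640))
print(ratio, dwdh)  # 0.5 (0.0, 140.0)
print(to_frame_coords([[100, 240, 200, 340]], ratio, dwdh))  # [[200. 200. 400. 400.]]
```

The padding is subtracted first and the scale is divided out second, which is the same order the script uses.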
Now, let's run the code above and check trt_track_result.mp4, where the result is saved.
(yolov8) spypiggy@spypiggy-NX:~/src/yolov8$ python sample_track_trt.py
[06/10/2023-10:14:27] [TRT] [W] Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors.
Total processed frames:1548
FPS:17.03
Net FPS:46.28
Compared with the run without tracking in the previous article, the net FPS (inference only) is almost the same, but the overall FPS has decreased. This is because the tracker adds processing time to every frame.
You can ignore the warning message "Using an engine plan file across different models of devices is not recommended and is likely to affect performance or even cause errors."
The TensorRT model I used was built on the Xavier NX itself. Sometimes TensorRT mistakenly reports an engine file as having been created on a different GPU.
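To put the two figures from the log in per-frame terms, the short calculation below converts them to milliseconds. It is only arithmetic on the numbers printed above. Note that in sample_track_trt.py the "everything else" portion covers frame reading, pre-processing, post-processing, and drawing as well as the tracker update, so it is an upper bound on the tracker's cost rather than a measurement of it.

```python
def per_frame_ms(fps):
    """Convert frames per second into milliseconds per frame."""
    return 1000.0 / fps


total_ms = per_frame_ms(17.03)   # FPS from the log above
net_ms = per_frame_ms(46.28)     # Net FPS (inference only) from the log above
other_ms = total_ms - net_ms     # time per frame spent outside inference

print(f"total {total_ms:.1f} ms, inference {net_ms:.1f} ms, everything else {other_ms:.1f} ms")
# total 58.7 ms, inference 21.6 ms, everything else 37.1 ms
```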
See https://github.com/dusty-nv/jetson-inference/issues/883 for a discussion on this.
As dusty-nv explains, you can safely ignore this warning.
However, keep in mind that you must use the engine on the same GPU and TensorRT version that the engine was built on.
If you look at the resulting video, you can see that the track IDs are assigned correctly.
Wrapping up
The best way to use YOLOv8 on the Jetson series is to convert it to TensorRT first. Jetson products, including the Xavier NX, have a built-in GPU and ship with JetPack, which makes them easy to use. However, a Jetson is by no means faster than a regular PC with an RTX GPU. Its small memory makes large models difficult to run, and its ARM CPU is relatively slow compared to x86, both of which are a real burden in practice.
As a result, the high-accuracy YOLOv8l and YOLOv8x models run too slowly on the Xavier NX. Converting them to TensorRT makes up for this shortcoming, because the processing speed more than doubles.
Over several articles, I have explained how to use YOLOv8, how to convert a model to TensorRT, and how to track and count objects.
The source code can be downloaded from my GitHub.