[Object Detection] YOLOv8 Hands-On Introduction (5): Model Prediction

August 5, 2024
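Predict mode is used to run a trained YOLOv8 model on new images or videos. In this mode, the model is loaded from a checkpoint file, and the user supplies images or videos for inference. The model predicts the classes and locations of the objects in the input images or videos.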

from ultralytics import YOLO
from PIL import Image
import cv2

model = YOLO("model.pt")
# Accepts all source formats: image/dir/path/URL/video/PIL/ndarray. "0" selects the webcam
results = model.predict(source="0")
results = model.predict(source="folder", show=True)  # display the predictions

# From PIL
im1 = Image.open("bus.jpg")
results = model.predict(source=im1, save=True)  # save the plotted images

# From ndarray
im2 = cv2.imread("bus.jpg")
results = model.predict(source=im2, save=True, save_txt=True)  # save predictions as labels

# From a list of PIL/ndarray
results = model.predict(source=[im1, im2])

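YOLOv8 predict mode can generate predictions for a variety of tasks. It returns a list of Results objects, or a memory-efficient generator of Results objects when stream mode is used. Enable stream mode by passing stream=True in the predictor's call method. Stream mode should be used for long videos or large prediction sources; otherwise the results accumulate in memory and eventually cause an out-of-memory error.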

img = cv2.imread("bus.jpg")  # reuses cv2 and model from the previous snippet
inputs = [img, img]  # list of numpy arrays
results = model(inputs, stream=True)  # generator of Results objects

for result in results:
    boxes = result.boxes  # Boxes object for bounding-box outputs
    masks = result.masks  # Masks object for segmentation-mask outputs
    probs = result.probs  # class probabilities for classification outputs
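To illustrate why stream mode keeps memory usage flat, here is a minimal sketch in plain Python that does not use Ultralytics at all. `FakeResult` and `fake_predict` are hypothetical stand-ins invented for this illustration; they only mimic the "list versus generator" behaviour described above, under the assumption that each frame produces one result object.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Union


@dataclass
class FakeResult:
    """Hypothetical stand-in for a per-frame result object."""
    frame_index: int


def _infer(frames: Iterable[int]) -> Iterator[FakeResult]:
    # Lazily produce one result per frame; nothing is stored here.
    for index in frames:
        yield FakeResult(frame_index=index)


def fake_predict(
    frames: Iterable[int], stream: bool = False
) -> Union[List[FakeResult], Iterator[FakeResult]]:
    """Hypothetical predictor: a list when stream=False, a generator when stream=True."""
    results = _infer(frames)
    return results if stream else list(results)


# Without streaming, every result is held in memory at the same time.
all_results = fake_predict(range(1000))
print(len(all_results))

# With streaming, results are consumed one at a time and can be discarded.
count = 0
for result in fake_predict(range(1000), stream=True):
    count += 1  # process the result here, then let it go out of scope
print(count)
```

With the list, memory grows with the number of frames; with the generator, only the result currently being processed needs to stay alive, which is the property that matters for long videos.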

The relevant parameters are as follows:

| Key | Value | Description |
| --- | --- | --- |
| source | 'ultralytics/assets' | source directory for images or videos |
| conf | 0.25 | object confidence threshold for detection |
| iou | 0.7 | intersection over union (IoU) threshold for NMS |
| half | False | use half precision (FP16) |
| device | None | device to run on, i.e. cuda device=0/1/2/3 or device=cpu |
| show | False | show results if possible |
| save | False | save images with results |
| save_txt | False | save results as .txt file |
| save_conf | False | save results with confidence scores |
| save_crop | False | save cropped images with results |
| hide_labels | False | hide labels |
| hide_conf | False | hide confidence scores |
| max_det | 300 | maximum number of detections per image |
| vid_stride | False | video frame-rate stride |
| line_thickness | 3 | bounding box thickness (pixels) |
| visualize | False | visualize model features |
| augment | False | apply image augmentation to prediction sources |
| agnostic_nms | False | class-agnostic NMS |
| retina_masks | False | use high-resolution segmentation masks |
| classes | None | filter results by class, i.e. class=0, or class=[0,2,3] |
| boxes | True | show boxes in segmentation predictions |
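To make the roles of conf, iou, and max_det more concrete, the following is a simplified sketch of greedy non-maximum suppression in plain Python. It is not the Ultralytics implementation: `iou`, `simple_nms`, and the sample boxes are hypothetical names and values chosen for illustration, and the sketch ignores class labels, so it behaves roughly like a class-agnostic NMS.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float, float]  # x1, y1, x2, y2, confidence


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes in xyxy format."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def simple_nms(
    boxes: List[Box], conf: float = 0.25, iou_thres: float = 0.7, max_det: int = 300
) -> List[Box]:
    """Drop low-confidence boxes, then greedily suppress heavy overlaps."""
    candidates = sorted(
        (b for b in boxes if b[4] >= conf), key=lambda b: b[4], reverse=True
    )
    kept: List[Box] = []
    for box in candidates:
        # Keep a box only if it does not overlap a stronger kept box too much.
        if all(iou(box, k) <= iou_thres for k in kept):
            kept.append(box)
        if len(kept) == max_det:
            break
    return kept


detections = [
    (10.0, 10.0, 110.0, 110.0, 0.90),    # strong detection
    (12.0, 12.0, 112.0, 112.0, 0.60),    # near-duplicate of the first box
    (200.0, 200.0, 260.0, 260.0, 0.40),  # separate object
    (300.0, 300.0, 320.0, 320.0, 0.10),  # below the confidence threshold
]
for box in simple_nms(detections):
    print(box)
```

Raising conf removes weak detections before any overlap check, while raising iou_thres makes the suppression more permissive, so more overlapping boxes survive.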

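YOLOv8 accepts a variety of input sources, as shown in the table below. These include images, URLs, PIL images, OpenCV, NumPy arrays, Torch tensors, CSV files, videos, directories, globs, YouTube videos, and streams. The table indicates whether each source can be used in stream mode with stream=True (marked ✅) and gives an example argument for each source.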

| Source | model(arg) | Type | Notes |
| --- | --- | --- | --- |
| image | 'im.jpg' | str, Path | |
| URL | 'https://ultralytics.com/images/bus.jpg' | str | |
| screenshot | 'screen' | str | |
| PIL | Image.open('im.jpg') | PIL.Image | HWC, RGB |
| OpenCV | cv2.imread('im.jpg')[:,:,::-1] | np.ndarray | HWC, BGR to RGB |
| numpy | np.zeros((640,1280,3)) | np.ndarray | HWC |
| torch | torch.zeros(16,3,320,640) | torch.Tensor | BCHW, RGB |
| CSV | 'sources.csv' | str, Path | RTSP, RTMP, HTTP |
| video ✅ | 'vid.mp4' | str, Path | |
| directory ✅ | 'path/' | str, Path | |
| glob ✅ | 'path/*.jpg' | str | Use * operator |
| YouTube ✅ | 'https://youtu.be/zgi9g1ksqhc' | str | |
| stream ✅ | 'rtsp://example.com/media.mp4' | str | RTSP, RTMP, HTTP |

Image types

| Image suffix | Example predict command | Reference |
| --- | --- | --- |
| .bmp | yolo predict source=image.bmp | Microsoft BMP File Format |
| .dng | yolo predict source=image.dng | Adobe DNG |
| .jpeg | yolo predict source=image.jpeg | JPEG |
| .jpg | yolo predict source=image.jpg | JPEG |
| .mpo | yolo predict source=image.mpo | Multi Picture Object |
| .png | yolo predict source=image.png | Portable Network Graphics |
| .tif | yolo predict source=image.tif | Tag Image File Format |
| .tiff | yolo predict source=image.tiff | Tag Image File Format |
| .webp | yolo predict source=image.webp | WebP |
| .pfm | yolo predict source=image.pfm | Portable FloatMap |

Video types

| Video suffix | Example predict command | Reference |
| --- | --- | --- |
| .asf | yolo predict source=video.asf | Advanced Systems Format |
| .avi | yolo predict source=video.avi | Audio Video Interleave |
| .gif | yolo predict source=video.gif | Graphics Interchange Format |
| .m4v | yolo predict source=video.m4v | MPEG-4 Part 14 |
| .mkv | yolo predict source=video.mkv | Matroska |
| .mov | yolo predict source=video.mov | QuickTime File Format |
| .mp4 | yolo predict source=video.mp4 | MPEG-4 Part 14 |
| .mpeg | yolo predict source=video.mpeg | MPEG-1 Part 2 |
| .mpg | yolo predict source=video.mpg | MPEG-1 Part 2 |
| .ts | yolo predict source=video.ts | MPEG Transport Stream |
| .wmv | yolo predict source=video.wmv | Windows Media Video |
| .webm | yolo predict source=video.webm | WebM Project |
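As a rough illustration of how a file source could be sorted into an image or a video by its suffix, here is a small sketch using only the standard library. The two suffix sets are copied from the tables above; `classify_source` is a hypothetical helper written for this example and is not part of the Ultralytics API.

```python
from pathlib import Path

# Suffixes taken from the image and video tables above.
IMAGE_SUFFIXES = {
    ".bmp", ".dng", ".jpeg", ".jpg", ".mpo", ".png", ".tif", ".tiff", ".webp", ".pfm",
}
VIDEO_SUFFIXES = {
    ".asf", ".avi", ".gif", ".m4v", ".mkv", ".mov", ".mp4", ".mpeg", ".mpg", ".ts",
    ".wmv", ".webm",
}


def classify_source(path: str) -> str:
    """Return 'image', 'video', or 'unknown' based on the file suffix."""
    suffix = Path(path).suffix.lower()  # compare case-insensitively
    if suffix in IMAGE_SUFFIXES:
        return "image"
    if suffix in VIDEO_SUFFIXES:
        return "video"
    return "unknown"


for name in ("bus.jpg", "sample.MP4", "notes.txt"):
    print(name, "->", classify_source(name))
```

A check like this can be useful before calling predict, for example to decide whether stream=True is worth enabling for a given file.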

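The prediction Results object contains the following components:

results.boxes: a Boxes object with attributes and methods for manipulating bounding boxes

results.masks: a Masks object for indexing masks or getting segment coordinates

results.probs: contains the class probabilities or logits

results.orig_img: the original image loaded in memory

results.path: the path of the input image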

By default, each result is composed of torch.Tensor objects, which makes it easy to manipulate:

result = results[0]  # these methods belong to a single Results object, not to the list
result = result.cuda()
result = result.cpu()
result = result.to('cpu')
result = result.numpy()

from ultralytics import YOLO
import cv2

# Build the segmentation model from its YAML config, then load the pretrained weights
model = YOLO("yolov8-seg.yaml").load('yolov8n-seg.pt')
results = model.predict(r'e:\cs\dl\yolo\yolov8study\bus.jpg')
boxes = results[0].boxes  # bounding boxes of the first image
masks = results[0].masks  # segmentation masks of the first image
probs = results[0].probs  # class probabilities (None for non-classification models)
print(f"boxes:{boxes[0]}")
print(f"masks:{masks.xy}")
print(f"probs:{probs}")

Output:

image 1/1 e:\cs\dl\yolo\yolov8study\bus.jpg: 640x480 4 0s, 1 5, 1 36, 25.9ms
Speed: 4.0ms preprocess, 25.9ms inference, 10.0ms postprocess per image at shape (1, 3, 640, 640)
WARNING 'Boxes.boxes' is deprecated. Use 'Boxes.data' instead.
boxes:ultralytics.yolo.engine.results.Boxes object with attributes:

boxes: tensor([[670.1221, 389.6674, 809.4929, 876.5032,   0.8875,   0.0000]], device='cuda:0')   
cls: tensor([0.], device='cuda:0')
conf: tensor([0.8875], device='cuda:0')
data: tensor([[670.1221, 389.6674, 809.4929, 876.5032,   0.8875,   0.0000]], device='cuda:0')    
id: None
is_track: False
orig_shape: tensor([1080,  810], device='cuda:0')
shape: torch.Size([1, 6])
xywh: tensor([[739.8075, 633.0853, 139.3708, 486.8358]], device='cuda:0')
xywhn: tensor([[0.9133, 0.5862, 0.1721, 0.4508]], device='cuda:0')
xyxy: tensor([[670.1221, 389.6674, 809.4929, 876.5032]], device='cuda:0')
xyxyn: tensor([[0.8273, 0.3608, 0.9994, 0.8116]], device='cuda:0')
masks:[array([[     804.94,       391.5],
       [     794.81,      401.62],
       [     794.81,      403.31],
       [     791.44,      406.69],
       ......
probs:None
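The xyxy, xywh, xywhn, and xyxyn attributes in the output above describe the same box in different formats. The sketch below reproduces those values in plain Python from the printed xyxy box and orig_shape, assuming that xywh means (center x, center y, width, height) and that the normalized variants divide x values by the image width and y values by the image height. `xyxy_to_xywh` and `normalize` are hypothetical helper names, not Ultralytics functions.

```python
from typing import Tuple

Quad = Tuple[float, float, float, float]


def xyxy_to_xywh(box: Quad) -> Quad:
    """Convert (x1, y1, x2, y2) corners to (center x, center y, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def normalize(box: Quad, width: float, height: float) -> Quad:
    """Divide x-like values by the image width and y-like values by the height."""
    return (box[0] / width, box[1] / height, box[2] / width, box[3] / height)


# Values copied from the printed output above.
xyxy = (670.1221, 389.6674, 809.4929, 876.5032)
height, width = 1080, 810  # orig_shape is (height, width)

xywh = xyxy_to_xywh(xyxy)
xywhn = normalize(xywh, width, height)
xyxyn = normalize(xyxy, width, height)

print("xywh: ", [round(v, 4) for v in xywh])
print("xywhn:", [round(v, 4) for v in xywhn])
print("xyxyn:", [round(v, 4) for v in xyxyn])
```

The printed values should agree with the xywh, xywhn, and xyxyn tensors shown in the output, up to rounding.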

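We can use the plot() function of the Result object to draw the results onto an image. It draws all components found in the Results object (boxes, masks, classification logits, etc.).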

annotated_frame = results[0].plot()
# Display the annotated frame
cv2.imshow("YOLOv8 Inference", annotated_frame)
cv2.waitKey()
cv2.destroyAllWindows()


A Python script that uses OpenCV (cv2) and YOLOv8 to run inference on video frames:

import cv2
from ultralytics import YOLO

# Load the YOLOv8 model
model = YOLO("yolov8-seg.yaml").load('yolov8n-seg.pt')

# Open the video file
video_path = "sample.mp4"
cap = cv2.VideoCapture(video_path)

# Loop through the video frames
while cap.isOpened():
    # Read a frame from the video
    success, frame = cap.read()

    if success:
        # Run YOLOv8 inference on the frame
        results = model(frame)

        # Visualize the results on the frame
        annotated_frame = results[0].plot()

        # Display the annotated frame
        cv2.imshow("YOLOv8 Inference", annotated_frame)

        # Break the loop if 'q' is pressed
        if cv2.waitKey(1) & 0xFF == ord("q"):
            break
    else:
        # Break the loop if the end of the video is reached
        break

# Release the video capture object and close the display window
cap.release()
cv2.destroyAllWindows()
