
2. Training models using transfer learning

In this lesson, we will learn how to train an object detection model using transfer learning, a technique that reduces computational effort and improves model performance, even with small datasets.

First, we add the help module:

addpath('help-module');

Training

Required YOLO Parameters

The following parameters are mandatory when configuring a YOLO training session:

  • configFile (str): Path to the configuration file (usually in .yaml format) that defines the dataset structure and classes. This file should include paths to the training, validation, and optional test sets, as well as the list of class names. The file we will use in this tutorial, data.yaml, already defines all of this structure.
  • baseModel (str): Specifies which model will be used for training. It allows you either to continue training a previously trained model or to start from a pre-trained YOLO model. You can choose among different YOLO versions, tasks (e.g., object detection, segmentation, classification), and model sizes. In these tutorials, we will use YOLOv8 for object detection. The available pre-trained models, ordered from smallest to largest, are: yolov8n.pt, yolov8s.pt, yolov8m.pt, yolov8l.pt, yolov8x.pt. In these lessons we recommend the smaller models (e.g., yolov8n.pt or yolov8s.pt) for faster training and inference.
  • MaxEpochs (int): This parameter sets the maximum number of epochs for training, where an epoch is one complete pass over all the images in the training dataset. Increasing this value usually improves model performance but also extends training time.
  • ImageSize (list[int]): The size (in pixels) that all images will be resized to before training. Larger sizes may improve accuracy but require more computational resources.

In the following code, we define values for these required parameters, chosen to keep execution fast:

configFile = fullfile(pwd, 'datasets', 'fruits_3_4998', 'data.yaml');
baseModel = 'yolov8s-oiv7.pt';
options.MaxEpochs = 1;
options.ImageSize = [256 256 3];

YOLO layers visualization

The following code allows you to visualize the architecture and parameters of a YOLOv8 model:

det = yolov8ObjectDetector2('yolov8s');
Pretrained yolov8s network already exists.
% Analyze loaded model
analyzeNetwork(det.Network);

This command opens a new window displaying all the layers and parameters of the YOLO model being used. Keep in mind that different YOLOv8 variants (e.g., yolov8s, yolov8m, etc.) differ not only in the number of parameters but also slightly in their architectural complexity.

In the case of YOLOv8, the model typically consists of 23 main blocks. These can be grouped into three main components:

  • Backbone (first 10 blocks): Responsible for extracting features from the input image.
  • Neck (next 12 blocks): Combines features at different scales to enhance detection performance.
  • Head (last block): Responsible for predicting bounding boxes, class labels, and confidence scores.
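
If you prefer to inspect these layers programmatically rather than in the analyzeNetwork window, the short sketch below lists them from MATLAB. It assumes det.Network is a dlnetwork object (the same object passed to analyzeNetwork above) whose Layers property exposes every layer by name:

% Sketch: inspect the network layers programmatically.
% Assumes det.Network is a dlnetwork, as used with analyzeNetwork above.
layerNames = {det.Network.Layers.Name}';
fprintf('Total layers: %d\n', numel(layerNames));
disp(layerNames(1:10));   % show the first few layer names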

Freezing Layer Parameter for Transfer Learning

One of the main advantages of YOLO models is that they come pre-trained. Training an object detection model from scratch usually requires millions of images and considerable computational power. This problem can be solved using transfer learning, which involves taking a model that has already been trained on a large dataset and slightly retraining it using a much smaller set of new data.

Transfer learning is particularly effective in computer vision because convolutional neural networks (CNNs), which are the foundation of YOLO (and all object detection models), learn features in a hierarchical manner. The initial layers detect low-level features such as edges and textures, while deeper layers capture more complex patterns and object structures. This hierarchy makes the early layers highly reusable across different tasks.

image_0.png

Typically, the early layers responsible for feature extraction (the backbone) are frozen during training, meaning their weights are not updated. This allows the model to retain the general visual features learned from large datasets. Meanwhile, the later layers are fine-tuned on the new dataset to adapt to the specific task at hand.

In some scenarios, the entire model may be fine-tuned using a lower learning rate, allowing it to gradually adapt to the new data while minimizing the risk of overwriting valuable pre-trained knowledge.
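
As an illustration only, and assuming the optional name-value pairs shown later for freezing are forwarded unchanged to the underlying Ultralytics trainer, whole-model fine-tuning with a reduced learning rate could be requested through the lr0 option (the Ultralytics initial learning rate, which also appears in the training log below):

% Sketch only, not used in this lesson: fine-tune every layer (freeze nothing)
% with a smaller initial learning rate. Assumes 'freeze' and 'lr0' are
% forwarded unchanged to the Ultralytics trainer.
optionalArgsFullFineTune = {'freeze', 0, 'lr0', 1e-4};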

This is why we use YOLOv8. It comes pre-trained on large datasets such as COCO (e.g., yolov8n.pt, yolov8s.pt) or Open Images V7 (e.g., yolov8n-oiv7.pt, yolov8s-oiv7.pt). Using these models significantly reduces training time and improves performance, even with smaller datasets.

Choosing which layers to freeze

The choice of how many YOLOv8 layers to freeze depends on how similar your target dataset is to the pretrained data. YOLOv8 has 22 freezable layers: the first 10 (the backbone) extract generic features such as edges, textures, and shapes, while the later layers specialize in detection. Since our classes (apple, orange, pear) are well represented in COCO and Open Images V7, the domains closely match. This allows us to freeze most layers to keep the general features intact and fine-tune only the higher-level layers for our fruits, which speeds up training and lowers the risk of overfitting.

To freeze specific layers during training, we use the freeze parameter as follows:

optionalArgs = {'freeze', 15};

There are additional parameters available to configure the model, which will be explained in Lesson 4: Experimentation.

Starting training

Next, we train the YOLO model with the parameters we defined earlier. This may take approximately 2 minutes.

[yolov8Det, results] = trainYOLOv8ObjectDetector( ...
    configFile, ...
    baseModel, ...
    options, ...
    optionalArgs{:});
ans = 
  PythonEnvironment with properties:

          Version: "3.11"
       Executable: "C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\win64\python\python.exe"
          Library: "C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\win64\python\python311.dll"
             Home: "C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\win64\python"
           Status: Terminated
    ExecutionMode: OutOfProcess

False
is instance freeze
New https://pypi.org/project/ultralytics/8.3.174 available  Update with 'pip install -U ultralytics'
Ultralytics YOLOv8.2.66  Python-3.11.5 torch-2.7.0+cpu CPU (11th Gen Intel Core(TM) i5-1135G7 2.40GHz)
engine\trainer: task=detect, mode=train, model=yolov8s-oiv7.pt, data=C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\datasets\fruits_3_4998\data.yaml, epochs=1, time=None, patience=100, batch=16, imgsz=256, save=True, save_period=-1, cache=False, device=None, workers=8, project=None, name=first, exist_ok=False, pretrained=True, optimizer=auto, verbose=True, seed=0, deterministic=True, single_cls=False, rect=False, cos_lr=False, close_mosaic=10, resume=False, amp=True, fraction=1.0, profile=False, freeze=15, multi_scale=False, overlap_mask=True, mask_ratio=4, dropout=0.0, val=True, split=val, save_json=False, save_hybrid=False, conf=None, iou=0.7, max_det=300, half=False, dnn=False, plots=True, source=None, vid_stride=1, stream_buffer=False, visualize=False, augment=False, agnostic_nms=False, classes=None, retina_masks=False, embed=None, show=False, save_frames=False, save_txt=False, save_conf=False, save_crop=False, show_labels=True, show_conf=True, show_boxes=True, line_width=None, format=torchscript, keras=False, optimize=False, int8=False, dynamic=False, simplify=False, opset=None, workspace=4, nms=False, lr0=0.01, lrf=0.01, momentum=0.937, weight_decay=0.0005, warmup_epochs=3.0, warmup_momentum=0.8, warmup_bias_lr=0.1, box=7.5, cls=0.5, dfl=1.5, pose=12.0, kobj=1.0, label_smoothing=0.0, nbs=64, hsv_h=0.015, hsv_s=0.7, hsv_v=0.4, degrees=0.0, translate=0.1, scale=0.5, shear=0.0, perspective=0.0, flipud=0.0, fliplr=0.5, bgr=0.0, mosaic=1.0, mixup=0.0, copy_paste=0.0, auto_augment=randaugment, erasing=0.4, crop_fraction=1.0, cfg=None, tracker=botsort.yaml, save_dir=C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\runs\detect\first
Overriding model.yaml nc=601 with nc=3

                   from  n    params  module                                       arguments                     
  0                  -1  1       928  ultralytics.nn.modules.conv.Conv             [3, 32, 3, 2]                 
  1                  -1  1     18560  ultralytics.nn.modules.conv.Conv             [32, 64, 3, 2]                
  2                  -1  1     29056  ultralytics.nn.modules.block.C2f             [64, 64, 1, True]             
  3                  -1  1     73984  ultralytics.nn.modules.conv.Conv             [64, 128, 3, 2]               
  4                  -1  2    197632  ultralytics.nn.modules.block.C2f             [128, 128, 2, True]           
  5                  -1  1    295424  ultralytics.nn.modules.conv.Conv             [128, 256, 3, 2]              
  6                  -1  2    788480  ultralytics.nn.modules.block.C2f             [256, 256, 2, True]           
  7                  -1  1   1180672  ultralytics.nn.modules.conv.Conv             [256, 512, 3, 2]              
  8                  -1  1   1838080  ultralytics.nn.modules.block.C2f             [512, 512, 1, True]           
  9                  -1  1    656896  ultralytics.nn.modules.block.SPPF            [512, 512, 5]                 
 10                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 11             [-1, 6]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 12                  -1  1    591360  ultralytics.nn.modules.block.C2f             [768, 256, 1]                 
 13                  -1  1         0  torch.nn.modules.upsampling.Upsample         [None, 2, 'nearest']          
 14             [-1, 4]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 15                  -1  1    148224  ultralytics.nn.modules.block.C2f             [384, 128, 1]                 
 16                  -1  1    147712  ultralytics.nn.modules.conv.Conv             [128, 128, 3, 2]              
 17            [-1, 12]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 18                  -1  1    493056  ultralytics.nn.modules.block.C2f             [384, 256, 1]                 
 19                  -1  1    590336  ultralytics.nn.modules.conv.Conv             [256, 256, 3, 2]              
 20             [-1, 9]  1         0  ultralytics.nn.modules.conv.Concat           [1]                           
 21                  -1  1   1969152  ultralytics.nn.modules.block.C2f             [768, 512, 1]                 
 22        [15, 18, 21]  1   2117209  ultralytics.nn.modules.head.Detect           [3, [128, 256, 512]]          
YOLOv8s summary: 225 layers, 11,136,761 parameters, 11,136,745 gradients, 28.7 GFLOPs

Transferred 349/355 items from pretrained weights
TensorBoard: Start with 'tensorboard --logdir C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\runs\detect\first', view at http://localhost:6006/
Freezing layer 'model.0.conv.weight'
Freezing layer 'model.0.bn.weight'
Freezing layer 'model.0.bn.bias'
Freezing layer 'model.1.conv.weight'
Freezing layer 'model.1.bn.weight'
Freezing layer 'model.1.bn.bias'
Freezing layer 'model.2.cv1.conv.weight'
Freezing layer 'model.2.cv1.bn.weight'
Freezing layer 'model.2.cv1.bn.bias'
Freezing layer 'model.2.cv2.conv.weight'
Freezing layer 'model.2.cv2.bn.weight'
Freezing layer 'model.2.cv2.bn.bias'
Freezing layer 'model.2.m.0.cv1.conv.weight'
Freezing layer 'model.2.m.0.cv1.bn.weight'
Freezing layer 'model.2.m.0.cv1.bn.bias'
Freezing layer 'model.2.m.0.cv2.conv.weight'
Freezing layer 'model.2.m.0.cv2.bn.weight'
Freezing layer 'model.2.m.0.cv2.bn.bias'
Freezing layer 'model.3.conv.weight'
Freezing layer 'model.3.bn.weight'
Freezing layer 'model.3.bn.bias'
Freezing layer 'model.4.cv1.conv.weight'
Freezing layer 'model.4.cv1.bn.weight'
Freezing layer 'model.4.cv1.bn.bias'
Freezing layer 'model.4.cv2.conv.weight'
Freezing layer 'model.4.cv2.bn.weight'
Freezing layer 'model.4.cv2.bn.bias'
Freezing layer 'model.4.m.0.cv1.conv.weight'
Freezing layer 'model.4.m.0.cv1.bn.weight'
Freezing layer 'model.4.m.0.cv1.bn.bias'
Freezing layer 'model.4.m.0.cv2.conv.weight'
Freezing layer 'model.4.m.0.cv2.bn.weight'
Freezing layer 'model.4.m.0.cv2.bn.bias'
Freezing layer 'model.4.m.1.cv1.conv.weight'
Freezing layer 'model.4.m.1.cv1.bn.weight'
Freezing layer 'model.4.m.1.cv1.bn.bias'
Freezing layer 'model.4.m.1.cv2.conv.weight'
Freezing layer 'model.4.m.1.cv2.bn.weight'
Freezing layer 'model.4.m.1.cv2.bn.bias'
Freezing layer 'model.5.conv.weight'
Freezing layer 'model.5.bn.weight'
Freezing layer 'model.5.bn.bias'
Freezing layer 'model.6.cv1.conv.weight'
Freezing layer 'model.6.cv1.bn.weight'
Freezing layer 'model.6.cv1.bn.bias'
Freezing layer 'model.6.cv2.conv.weight'
Freezing layer 'model.6.cv2.bn.weight'
Freezing layer 'model.6.cv2.bn.bias'
Freezing layer 'model.6.m.0.cv1.conv.weight'
Freezing layer 'model.6.m.0.cv1.bn.weight'
Freezing layer 'model.6.m.0.cv1.bn.bias'
Freezing layer 'model.6.m.0.cv2.conv.weight'
Freezing layer 'model.6.m.0.cv2.bn.weight'
Freezing layer 'model.6.m.0.cv2.bn.bias'
Freezing layer 'model.6.m.1.cv1.conv.weight'
Freezing layer 'model.6.m.1.cv1.bn.weight'
Freezing layer 'model.6.m.1.cv1.bn.bias'
Freezing layer 'model.6.m.1.cv2.conv.weight'
Freezing layer 'model.6.m.1.cv2.bn.weight'
Freezing layer 'model.6.m.1.cv2.bn.bias'
Freezing layer 'model.7.conv.weight'
Freezing layer 'model.7.bn.weight'
Freezing layer 'model.7.bn.bias'
Freezing layer 'model.8.cv1.conv.weight'
Freezing layer 'model.8.cv1.bn.weight'
Freezing layer 'model.8.cv1.bn.bias'
Freezing layer 'model.8.cv2.conv.weight'
Freezing layer 'model.8.cv2.bn.weight'
Freezing layer 'model.8.cv2.bn.bias'
Freezing layer 'model.8.m.0.cv1.conv.weight'
Freezing layer 'model.8.m.0.cv1.bn.weight'
Freezing layer 'model.8.m.0.cv1.bn.bias'
Freezing layer 'model.8.m.0.cv2.conv.weight'
Freezing layer 'model.8.m.0.cv2.bn.weight'
Freezing layer 'model.8.m.0.cv2.bn.bias'
Freezing layer 'model.9.cv1.conv.weight'
Freezing layer 'model.9.cv1.bn.weight'
Freezing layer 'model.9.cv1.bn.bias'
Freezing layer 'model.9.cv2.conv.weight'
Freezing layer 'model.9.cv2.bn.weight'
Freezing layer 'model.9.cv2.bn.bias'
Freezing layer 'model.12.cv1.conv.weight'
Freezing layer 'model.12.cv1.bn.weight'
Freezing layer 'model.12.cv1.bn.bias'
Freezing layer 'model.12.cv2.conv.weight'
Freezing layer 'model.12.cv2.bn.weight'
Freezing layer 'model.12.cv2.bn.bias'
Freezing layer 'model.12.m.0.cv1.conv.weight'
Freezing layer 'model.12.m.0.cv1.bn.weight'
Freezing layer 'model.12.m.0.cv1.bn.bias'
Freezing layer 'model.12.m.0.cv2.conv.weight'
Freezing layer 'model.12.m.0.cv2.bn.weight'
Freezing layer 'model.12.m.0.cv2.bn.bias'
Freezing layer 'model.22.dfl.conv.weight'
C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\win64\python\Lib\site-packages\ultralytics\engine\trainer.py:268: FutureWarning: `torch.cuda.amp.GradScaler(args...)` is deprecated. Please use `torch.amp.GradScaler('cuda', args...)` instead.
  self.scaler = torch.cuda.amp.GradScaler(enabled=self.amp)

train: Scanning C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\datasets\fruits_3_4998\train\labels.cache... 1019 images, 136 backgrounds, 13 corrupt: 100%|##########| 1019/1019 [00:00<?, ?it/s]
train: WARNING ⚠️ C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\datasets\fruits_3_4998\train\images\0510fe79-68bd-4b4a-951f-dab86920adeb.png: ignoring corrupt image/label: negative label values [  -0.047368]
train: WARNING ⚠️ C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\datasets\fruits_3_4998\train\images\0eaaca45-2f14-4cef-87e4-25c02c6bf3b2.png: ignoring corrupt image/label: negative label values [  -0.018722]
train: WARNING ⚠️ C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\datasets\fruits_3_4998\train\images\15aef5fc-3df2-4f0e-8dd8-95ea8c730407.png: ignoring corrupt image/label: negative label values [  -0.022958]
train: WARNING ⚠️ C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\datasets\fruits_3_4998\train\images\175a13d3-1a18-428a-a577-48807a5871fe.png: ignoring corrupt image/label: negative label values [  -0.083108]
train: WARNING ⚠️ C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\datasets\fruits_3_4998\train\images\23232dd4-2cb4-4229-b335-98862a76323a.png: ignoring corrupt image/label: negative label values [  -0.049029]
train: WARNING ⚠️ C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\datasets\fruits_3_4998\train\images\3d14a064-5447-4257-a19d-ab909a56da01.png: ignoring corrupt image/label: negative label values [  -0.016604]
train: WARNING ⚠️ C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\datasets\fruits_3_4998\train\images\55c845f7-6a6e-4b05-9ad6-c50cb78e9874.png: ignoring corrupt image/label: negative label values [  -0.063713]
train: WARNING ⚠️ C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\datasets\fruits_3_4998\train\images\757484cc-6ad5-4ad9-b03b-b0b4ce7ef7ee.png: ignoring corrupt image/label: negative label values [  -0.054816]
train: WARNING ⚠️ C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\datasets\fruits_3_4998\train\images\7bbc8b1d-26fb-4268-8d72-6c10bdedce90.png: ignoring corrupt image/label: negative label values [  -0.046767]
train: WARNING ⚠️ C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\datasets\fruits_3_4998\train\images\8b961909-3110-4bc8-b08f-f6bb6eee123a.png: ignoring corrupt image/label: negative label values [   -0.02139]
train: WARNING ⚠️ C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\datasets\fruits_3_4998\train\images\d402070d-cb8b-4109-aea2-7b33f01cadad.png: ignoring corrupt image/label: negative label values [  -0.015335]
train: WARNING ⚠️ C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\datasets\fruits_3_4998\train\images\f8891e69-28dd-4fc8-8a2e-4d1ad88145b8.png: ignoring corrupt image/label: negative label values [  -0.059821]
train: WARNING ⚠️ C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\datasets\fruits_3_4998\train\images\fef31132-3b8d-48f2-9493-1e522dde091c.png: ignoring corrupt image/label: negative label values [  -0.075973]
C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\win64\python\Lib\site-packages\torch\utils\data\dataloader.py:665: UserWarning: 'pin_memory' argument is set as true but no accelerator is found, then device pinned memory won't be used.
  warnings.warn(warn_msg)

val: Scanning C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\datasets\fruits_3_4998\val\labels.cache... 144 images, 0 backgrounds, 0 corrupt: 100%|##########| 144/144 [00:00<?, ?it/s]
Plotting labels to C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\runs\detect\first\labels.jpg... 
optimizer: 'optimizer=auto' found, ignoring 'lr0=0.01' and 'momentum=0.937' and determining best 'optimizer', 'lr0' and 'momentum' automatically... 
optimizer: AdamW(lr=0.001429, momentum=0.9) with parameter groups 57 weight(decay=0.0), 64 weight(decay=0.0005), 63 bias(decay=0.0)
TensorBoard: model graph visualization added ✅
Image sizes 256 train, 256 val
Using 0 dataloader workers
Logging results to C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\runs\detect\first
Starting training for 1 epochs...

      Epoch    GPU_mem   box_loss   cls_loss   dfl_loss  Instances       Size

        1/1         0G     0.7312      1.277      0.946         70        256: 100%|##########| 63/63 [02:06<00:00,  2.00s/it]

                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|##########| 5/5 [00:13<00:00,  2.73s/it]
                   all        144        426      0.856      0.824      0.905      0.722

1 epochs completed in 0.041 hours.
Optimizer stripped from C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\runs\detect\first\weights\last.pt, 22.5MB
Optimizer stripped from C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\runs\detect\first\weights\best.pt, 22.5MB

Validating C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\runs\detect\first\weights\best.pt...
Ultralytics YOLOv8.2.66  Python-3.11.5 torch-2.7.0+cpu CPU (11th Gen Intel Core(TM) i5-1135G7 2.40GHz)
YOLOv8s summary (fused): 168 layers, 11,126,745 parameters, 0 gradients, 28.4 GFLOPs

                 Class     Images  Instances      Box(P          R      mAP50  mAP50-95): 100%|##########| 5/5 [00:10<00:00,  2.01s/it]
                   all        144        426      0.856      0.823      0.905      0.722
                 apple         69        153      0.792      0.944      0.933      0.829
                orange         51         84      0.947      0.833      0.948      0.777
                  pear         72        189      0.829      0.692      0.834      0.561
Speed: 0.5ms preprocess, 58.1ms inference, 0.0ms loss, 1.0ms postprocess per image
Results saved to C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\runs\detect\first
Ultralytics YOLOv8.2.66  Python-3.11.5 torch-2.7.0+cpu CPU (11th Gen Intel Core(TM) i5-1135G7 2.40GHz)
YOLOv8s summary (fused): 168 layers, 11,126,745 parameters, 0 gradients, 28.4 GFLOPs

PyTorch: starting from 'C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\runs\detect\first\weights\best.pt' with input shape (1, 3, 256, 256) BCHW and output shape(s) (1, 7, 1344) (21.4 MB)

ONNX: starting export with onnx 1.16.1 opset 14...
ONNX: export success ✅ 1.2s, saved as 'C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\runs\detect\first\weights\best.onnx' (42.5 MB)

Export complete (3.3s)
Results saved to C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\runs\detect\first\weights
Predict:         yolo predict task=detect model=C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\runs\detect\first\weights\best.onnx imgsz=256  
Validate:        yolo val task=detect model=C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\runs\detect\first\weights\best.onnx imgsz=256 data=C:\Users\Noel Nathan\Desktop\Universidad\8tQuadrimestre\tfg\from-code-to-robot\computer_vision\datasets\fruits_3_4998\data.yaml  
Visualize:       https://netron.app

Resume training

If the training process is interrupted for any reason, it can be resumed using the resume option. This feature loads the weights from the last saved model and also restores the optimizer state, learning rate scheduler, and epoch number. This allows you to continue the training process seamlessly from where it was left off.

modelFile = fullfile(pwd, 'runs', 'detect', 'train3', 'weights', 'best.pt');
yolov8Det = trainYOLOv8ObjectDetector( ...
    configFile, ...
    modelFile, ...
    options, ...
    'resume', true);

For more details, refer to the Resume Training section in the official documentation.

Retrain a model

Alternatively, you can continue training an already trained model by using it as a starting point instead of a base model like yolov8n.pt or yolov8s.pt. In this case, you simply load the weights from the previously trained model.

However, unlike the resume option, this approach does not restore the optimizer state, learning rate schedule, or epoch number. These parameters are reset, and training starts from epoch 1 with the current training options.

This method is useful when you want to fine-tune a model further or adapt it to a new dataset, but without continuing exactly where the previous training left off.

modelFile = fullfile(pwd, 'runs', 'detect', 'train1', 'weights', 'best.pt');
yolov8Det = trainYOLOv8ObjectDetector( ...
    configFile, ...
    modelFile, ...
    options, ...
    optionalArgs{:});

Load a model

When training a YOLO model, the trained weights are typically saved in the runs/detect/train@/weights directory, where @ is an incrementing counter (e.g., train1, train2, etc.) to avoid overwriting previous runs.

If this folder was not created, it is likely that the YOLO settings were not properly configured during the installation step (see 1. Installation.md). To resolve this, re-run that section and make sure everything is correctly set up.

Inside the weights folder, you’ll find several model files, including last.pt, best.pt, and best.onnx. The function utils.loadModel always loads the best.onnx model from the selected folder.

Therefore, in the following code, you should load the model from the first training run:

modelPath = fullfile(pwd, 'runs', 'detect', 'first', 'weights');
configFile = fullfile(pwd, 'datasets', 'fruits_3_4998', 'data.yaml');

yolov8Det = utils.loadModel(modelPath, configFile);

Prediction

In this section, we use the model we trained earlier to detect fruits in the following images.

utils.detectAndDisplayImage(yolov8Det, 'pear.jpg'); 

figure_0.png

utils.detectAndDisplayImage(yolov8Det, 'apple.jpg');

figure_1.png

utils.detectAndDisplayImage(yolov8Det, 'orange.jpg');

figure_2.png

As observed in the results, the model produces an excessive number of detections, leading to a low-precision output.
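
As a quick preview, one common remedy is to raise the confidence threshold used at inference time. The snippet below is only a sketch: it assumes the loaded detector exposes the standard detect method with a 'Threshold' name-value option.

% Sketch only: assumes yolov8Det supports detect(...) with a 'Threshold' option.
img = imread('pear.jpg');
[bboxes, scores, labels] = detect(yolov8Det, img, 'Threshold', 0.5);  % keep boxes with score > 0.5
annotated = insertObjectAnnotation(img, 'rectangle', bboxes, string(labels));
figure; imshow(annotated);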

We will discuss this topic in detail in the next lesson.