The nice thing about ChatGPT and similar systems is that the complexity of AI/ML functionality is hidden behind a friendly natural language interface, which makes it accessible to the masses. But behind this easy-to-use facade is a lot of advanced functionality that involves a sequence of data processing steps called a pipeline. An AI-powered business card reader, for example, would first detect text and then recognize the individual letters within the context of the words they belong to. A license plate reader would be similar. Detection is an important step that you will often need in your AI/ML projects, and that’s why we will be looking at YOLO.
YOLO, or You Only Look Once, is a family of object detection models designed for speed and accuracy. It was introduced in 2015 in the paper “You Only Look Once: Unified, Real-Time Object Detection” by Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. YOLO’s core innovation was to treat object detection as a single regression problem. Instead of using a sliding window or region proposals, YOLO divided an image into a grid and predicted bounding boxes and class probabilities for every grid cell in a single pass. This radically improved speed, making it suitable even for real-time detection.
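To make the grid idea concrete, here is a minimal sketch of how the original formulation assigns an object to a grid cell: the cell that contains the center of the box is the one responsible for predicting it. The values S=7, B=2, and C=20 are the ones used in the 2015 paper. The function names are made up for illustration, and this is not how the current Ultralytics models work internally.

```python
def responsible_cell(x_center, y_center, img_w, img_h, s=7):
    """Return (row, col) of the grid cell that contains the box center."""
    # Scale the pixel position to grid units, clamping points on the far edge.
    col = min(int(x_center / img_w * s), s - 1)
    row = min(int(y_center / img_h * s), s - 1)
    return row, col


def output_shape(s=7, b=2, c=20):
    """Shape of the prediction tensor for an s x s grid.

    Each cell predicts b boxes (x, y, w, h, confidence) plus c class
    probabilities, all in one forward pass.
    """
    return (s, s, b * 5 + c)


print(responsible_cell(320, 240, 640, 480))  # (3, 3)
print(output_shape())  # (7, 7, 30)
```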
One of the best-known implementations of YOLO is from Ultralytics, a software company and research group that specializes in deep learning models for computer vision, particularly in the field of object detection. Their implementation makes it very easy to get started and has become one of the most widely used tools for real-time object detection, which is why we will be using it.
First install dependencies:
pip3 install ultralytics opencv-python
Then the code:
from ultralytics import YOLO
import cv2
import sys
# Load a pre-trained YOLOv8 model
model = YOLO('yolov8n.pt')
# Get the image path from the command line
if len(sys.argv) < 2:
    print("Usage: python yolo-test.py <image_path>")
    sys.exit(1)
image_path = sys.argv[1]
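# Load an image (cv2.imread returns None if the file cannot be read)
image = cv2.imread(image_path)
if image is None:
    print(f"Could not read image: {image_path}")
    sys.exit(1)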
# Run YOLO object detection on the image
results = model(image)
# Show the results (with bounding boxes, labels, and confidence scores)
results[0].show()
# Optionally save the output with bounding boxes
results[0].save() # Saves the output image
cv2.destroyAllWindows()
sys.exit(0)
Running this:
python3 yolo-test.py car.jpg
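If you want the detections as data rather than as a picture, the result object exposes them through its boxes attribute. Here is a minimal sketch, assuming the attribute names used by recent Ultralytics releases (boxes.xyxy, boxes.conf, boxes.cls, and names). The helper names describe_detections and detections_from_result are my own, and the 0.25 cutoff is just an arbitrary default.

```python
def describe_detections(boxes_xyxy, confidences, class_ids, names, min_conf=0.25):
    """Turn raw detection lists into readable lines, skipping weak detections."""
    lines = []
    for (x1, y1, x2, y2), conf, cls_id in zip(boxes_xyxy, confidences, class_ids):
        if conf < min_conf:
            continue
        label = names[int(cls_id)]  # class ids arrive as floats
        lines.append(f"{label} {conf:.2f} at ({x1:.0f}, {y1:.0f}, {x2:.0f}, {y2:.0f})")
    return lines


def detections_from_result(result, min_conf=0.25):
    """Adapter for one Ultralytics result; converts tensors to plain lists.

    Assumes result.boxes has xyxy, conf and cls tensors and result.names
    maps class ids to labels.
    """
    boxes = result.boxes
    return describe_detections(
        boxes.xyxy.tolist(),
        boxes.conf.tolist(),
        boxes.cls.tolist(),
        result.names,
        min_conf,
    )
```

In the script above, you could call print("\n".join(detections_from_result(results[0]))) right after results = model(image).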
You can also skip the code and run yolo directly:
yolo predict model=yolov8n.pt source=car.jpg
And you will see that it detected the cars:
However, the default models cannot detect license plates, which is what I need. I therefore have to train the model to detect them.
Training these advanced models requires significant computational resources, so running the training on hardware with sufficient GPU power is recommended. If you don’t have a GPU or need more GPU power, check out my post on renting GPUs with Vast.ai.
Step 1. Prepare the dataset
Normally you would need to gather images of license plates, then preprocess and label them. Luckily, I found a big license plate dataset on Roboflow that’s already preprocessed and labeled. I downloaded it and used it for training.
I did need to change the path in data.yaml a bit:
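In case you are wondering what “labeled” means here: in the YOLO text format, each image has a matching .txt file with one line per object, written as class x_center y_center width height, with the coordinates normalized to the 0 to 1 range. Below is a small sketch that converts such a line back to pixel coordinates. The function name is my own, and I am assuming the dataset was exported in this YOLO format.

```python
def yolo_label_to_pixels(line, img_w, img_h):
    """Convert 'class xc yc w h' (normalized) to (class_id, x1, y1, x2, y2) in pixels."""
    parts = line.split()
    class_id = int(parts[0])
    xc, yc, w, h = (float(v) for v in parts[1:5])
    # The label stores the box center and size; corners are half a size away.
    x1 = (xc - w / 2) * img_w
    y1 = (yc - h / 2) * img_h
    x2 = (xc + w / 2) * img_w
    y2 = (yc + h / 2) * img_h
    return class_id, round(x1), round(y1), round(x2), round(y2)


print(yolo_label_to_pixels("0 0.5 0.5 0.25 0.125", 640, 640))  # (0, 240, 280, 400, 360)
```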
path: License Plate Recognition
train: train/images
val: valid/images
test: test/images
names:
  0: License_Plate
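Before starting a long training run, it can save time to check that the folders the config points to actually exist. This is a minimal sketch, assuming the usual layout where every split has an images folder and a sibling labels folder (Ultralytics looks up labels by swapping images for labels in the path). The function name and the default split names are my own choices based on the config above.

```python
from pathlib import Path


def missing_split_dirs(dataset_root, splits=("train", "valid", "test")):
    """Return the expected image/label folders that do not exist under the root."""
    root = Path(dataset_root)
    missing = []
    for split in splits:
        for kind in ("images", "labels"):
            folder = root / split / kind
            if not folder.is_dir():
                # Report paths relative to the root, with forward slashes.
                missing.append(folder.relative_to(root).as_posix())
    return missing
```

For the dataset used here, missing_split_dirs("datasets/License Plate Recognition") should return an empty list if everything is in place.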
Step 2. Start training
We train on the dataset described in data.yaml for 50 epochs, using yolov8n.pt as the starting point. Image size is 640×640.
sudo yolo train data="datasets/License Plate Recognition/data.yaml" model=yolov8n.pt epochs=50 imgsz=640
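If you prefer to stay in Python, the same training run can be started through the Ultralytics API. A minimal sketch, assuming model.train() accepts the same data, epochs, and imgsz settings as the command line. The wrapper function train_plate_detector is a name I made up.

```python
def train_plate_detector(data_yaml, epochs=50, imgsz=640, weights="yolov8n.pt"):
    """Fine-tune a pretrained YOLO checkpoint on the dataset described by data_yaml."""
    # Imported lazily so this module can be loaded without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)  # start from the pretrained nano checkpoint
    # The keyword arguments mirror the key=value arguments of the yolo CLI.
    return model.train(data=data_yaml, epochs=epochs, imgsz=imgsz)
```

You would call it as train_plate_detector("datasets/License Plate Recognition/data.yaml").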
If you get interrupted, you can resume the training. Instead of using yolov8n.pt as the starting point, we point to last.pt, the checkpoint saved after the most recent epoch. (best.pt holds the best-scoring weights so far, which is not necessarily where training stopped.)
yolo train data="datasets/License Plate Recognition/data.yaml" model=runs/detect/train/weights/last.pt epochs=50 imgsz=640 resume=True
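One thing to watch out for: each training run gets its own folder under the runs directory (train, train2, train3, and so on), so the path to the weights changes from run to run. Here is a small sketch for finding the most recently written checkpoint. It assumes the runs/detect/<run>/weights layout, where Ultralytics writes both last.pt and best.pt. The function name is my own.

```python
from pathlib import Path


def latest_weights(runs_dir="runs/detect", filename="last.pt"):
    """Return the most recently modified <run>/weights/<filename>, or None."""
    candidates = list(Path(runs_dir).glob(f"*/weights/{filename}"))
    if not candidates:
        return None
    # Pick by modification time rather than by folder name.
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Pass filename="best.pt" to locate the best weights of the latest run instead.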
You should see the training progress as it runs. This can take from minutes to hours depending on your GPU power.
Once the training is completed, you will see the following.
Step 3. Detect
I modified the sample code above to use the new model:
# model = YOLO('yolov8n.pt')
model = YOLO('runs/detect/train/weights/best.pt') # Our trained model
And then ran it.
python3 yolo-test.py car.jpg
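Since detection is only the first stage of the pipeline described at the top, you will probably want to cut the plate out of the image for a later recognition step. A hedged sketch of the box arithmetic: it grows the detected box slightly and clamps it to the image bounds. The function name and the 10% padding are arbitrary choices of mine, not anything Ultralytics prescribes.

```python
def padded_crop_box(x1, y1, x2, y2, img_w, img_h, pad=0.1):
    """Grow a box by a fraction of its size and clamp it to the image bounds."""
    dx = (x2 - x1) * pad
    dy = (y2 - y1) * pad
    left = max(0, int(x1 - dx))
    top = max(0, int(y1 - dy))
    right = min(img_w, int(x2 + dx))
    bottom = min(img_h, int(y2 + dy))
    return left, top, right, bottom
```

With an OpenCV image, where image.shape is (height, width, channels), you would compute left, top, right, bottom = padded_crop_box(x1, y1, x2, y2, image.shape[1], image.shape[0]) and then slice the plate out with image[top:bottom, left:right].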
Or, again, you can just use yolo directly:
yolo predict model=runs/detect/train/weights/best.pt source=car.jpg
And now it is able to detect the license plates:
That’s it!