ONNX Runtime Engine for swatahVision

This code creates a custom inference engine that allows models in ONNX format to run inside the swatahVision framework.

In simple terms, it helps the system:

Load an ONNX model
Prepare an image so the model can understand it
Run the model on the image
Return the prediction results

What is ONNX?

ONNX (Open Neural Network Exchange) is an open format for representing machine learning models.

Models trained in frameworks like:

PyTorch
TensorFlow
Keras

can be exported to ONNX and then run in many different environments.

This code uses ONNX Runtime, an inference engine that executes ONNX models on different hardware backends.

Main Purpose of This Code

This file creates a class called:

OnnxRuntimeEngine

This class connects:

swatahVision → ONNX Runtime

This lets swatahVision run ONNX models easily.

Libraries Used

The code uses the following Python libraries:

onnxruntime → runs ONNX models
numpy → handles numerical data
OpenCV (cv2) → processes images

Class: OnnxRuntimeEngine

This class inherits from the base engine:

RuntimeEngine

It defines how:

the model is loaded
the input image is prepared
the model prediction is executed

1. Loading the Model

Function:

load()

This function loads the ONNX model into memory.

It also selects the hardware used to run the model.

Possible hardware options:

CPU
GPU

Each option maps to an ONNX Runtime execution provider:

CPU → CPUExecutionProvider
GPU → CUDAExecutionProvider

After loading the model, the function also collects information about:

model inputs
input shapes
input data types
model outputs
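
The loading step described above could be sketched roughly as follows. This is a minimal sketch, not the engine's actual code: select_providers and load_session are hypothetical helper names, and listing the CPU provider as a fallback after CUDA is an assumption. onnxruntime.InferenceSession and its providers argument are part of ONNX Runtime's Python API.

```python
from __future__ import annotations


def select_providers(device: str) -> list[str]:
    """Map a device name to ONNX Runtime execution providers (hypothetical helper)."""
    if device.lower() == "gpu":
        # Assumption: keep the CPU provider as a fallback when CUDA is unavailable.
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]


def load_session(model_path: str, device: str = "cpu"):
    """Create an ONNX Runtime session for the given model file."""
    import onnxruntime as ort  # imported lazily so select_providers works without it

    return ort.InferenceSession(model_path, providers=select_providers(device))
```

With this sketch, load_session("model.onnx", device="gpu") would request CUDA first; the file name is only a placeholder.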

2. Running the Model (Inference)

Function:

infer()

This function performs the main prediction.

Steps performed:

Take the input image
Prepare the image using preprocessing
Send the image to the model
Get the prediction result

The function then returns:

raw output
metadata information
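
The inference steps above might look something like the sketch below. The function signature is an assumption made for illustration (the real infer() is a method on the engine class and may differ); session.run(output_names, input_feed) is the ONNX Runtime call that executes the model.

```python
def infer(session, input_name, image, preprocess):
    """Preprocess one image, run the model, and return (raw_output, meta).

    `session` is expected to behave like an onnxruntime.InferenceSession,
    and `preprocess` is any callable returning (tensor, meta).
    """
    tensor, meta = preprocess(image)
    # Passing None as the first argument asks ONNX Runtime for all model outputs.
    raw_output = session.run(None, {input_name: tensor})
    return raw_output, meta
```

Keeping the metadata next to the raw output means later steps can relate predictions to the original image without repeating the preprocessing.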

3. Getting Model Information

Function:

get_model_info()

This function reads details from the ONNX model.

It extracts information such as:

Input names
Input shapes
Input data types
Output names

This helps the engine understand what format the model expects.
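
As a rough illustration, the same details can be read from an ONNX Runtime session through get_inputs() and get_outputs(), whose entries expose name, shape, and type attributes. The dictionary keys below are assumptions chosen for readability, not necessarily what the engine returns.

```python
def get_model_info(session) -> dict:
    """Collect input/output metadata from an ONNX Runtime session."""
    inputs = session.get_inputs()    # each entry exposes .name, .shape, .type
    outputs = session.get_outputs()
    return {
        "input_names": [i.name for i in inputs],
        "input_shapes": [i.shape for i in inputs],
        "input_types": [i.type for i in inputs],  # strings such as "tensor(float)"
        "output_names": [o.name for o in outputs],
    }
```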

4. Image Preprocessing

Function:

preprocess()

Before sending an image to the model, it must be prepared correctly.

This function performs that preparation.

The steps include:

1. Resize the image

The image is resized to match the model's expected input size.

2. Maintain aspect ratio (Letterbox)

Instead of stretching the image, the code keeps the original aspect ratio.

It does this by:

resizing the image
adding padding around it

This process is called letterboxing.

3. Change image format

Images loaded with OpenCV are laid out like this:

(H, W, C)
Height, Width, Channels

But many ONNX vision models expect:

(C, H, W)
Channels, Height, Width

So the code rearranges the image format.
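
The rearrangement is a single axis permutation in NumPy. The helper name hwc_to_chw is hypothetical; the sketch only shows the layout change and makes the result contiguous in memory.

```python
import numpy as np


def hwc_to_chw(image: np.ndarray) -> np.ndarray:
    """Reorder an (H, W, C) image to (C, H, W)."""
    # transpose returns a view; ascontiguousarray copies it into C order.
    return np.ascontiguousarray(image.transpose(2, 0, 1))
```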

4. Handle batch inputs

The code can process:

single image
multiple images at once

If a single image is passed, the code automatically adds a batch dimension.
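
One possible way to handle both cases is sketched below. The helper name ensure_batch is hypothetical, and the sketch assumes every image has already been preprocessed to the same (C, H, W) shape.

```python
import numpy as np


def ensure_batch(images) -> np.ndarray:
    """Return an (N, C, H, W) array from one image or a sequence of images."""
    if isinstance(images, (list, tuple)):
        return np.stack(images, axis=0)   # several (C, H, W) -> (N, C, H, W)
    if images.ndim == 3:
        return images[np.newaxis, ...]    # one (C, H, W) -> (1, C, H, W)
    return images                         # already has a batch dimension
```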

5. Convert data type

The model may expect input as:

float32
uint8

The code converts the image to the correct type.
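
A sketch of the conversion, assuming the expected type comes from the type string that ONNX Runtime reports for the model input (for example "tensor(float)" for float32). The mapping and the helper name cast_for_model are illustrative. The sketch only changes the data type; whether the engine also rescales pixel values is not stated here, so no scaling is applied.

```python
import numpy as np

# ONNX Runtime reports input types as strings; map them to NumPy dtypes.
ONNX_TO_NUMPY = {
    "tensor(float)": np.float32,
    "tensor(uint8)": np.uint8,
}


def cast_for_model(tensor: np.ndarray, onnx_type: str) -> np.ndarray:
    """Cast a tensor to the dtype named by an ONNX Runtime type string."""
    try:
        dtype = ONNX_TO_NUMPY[onnx_type]
    except KeyError:
        raise ValueError(f"Unsupported input type: {onnx_type}") from None
    return tensor.astype(dtype, copy=False)
```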

Letterbox Function (Important)

The letterbox() function resizes an image without distortion.

Steps:

Calculate scaling factor
Resize image
Add padding to reach required size

Example:

Original Image

400 × 300

Model Input

640 × 640

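Taking 400 × 300 as width × height, the code:

resizes the image by a factor of min(640/400, 640/300) = 1.6, giving 640 × 480
adds black padding to fill the remaining 160 rows (80 above and 80 below if the padding is centered)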

This keeps the image proportions correct.
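
The three steps can be sketched as below. This is an illustrative version, not the engine's letterbox(): the helper names, the centered padding, the bilinear interpolation, and the keys of the returned metadata are all assumptions. The resize step is passed in as a callable so the geometry can be checked without OpenCV; by default it uses cv2.resize.

```python
import numpy as np


def letterbox_geometry(src_hw, dst_hw):
    """Return (scale, (new_h, new_w), (top, bottom, left, right))."""
    src_h, src_w = src_hw
    dst_h, dst_w = dst_hw
    scale = min(dst_h / src_h, dst_w / src_w)   # largest scale that still fits
    new_h, new_w = round(src_h * scale), round(src_w * scale)
    pad_h, pad_w = dst_h - new_h, dst_w - new_w
    top, left = pad_h // 2, pad_w // 2          # assumption: padding is centered
    return scale, (new_h, new_w), (top, pad_h - top, left, pad_w - left)


def _cv2_resize(img, new_h, new_w):
    import cv2  # imported lazily so the geometry helper works without OpenCV

    return cv2.resize(img, (new_w, new_h), interpolation=cv2.INTER_LINEAR)


def letterbox(image, dst_hw, resize=_cv2_resize):
    """Resize with the aspect ratio preserved, then pad with black to dst_hw."""
    scale, (new_h, new_w), (top, _, left, _) = letterbox_geometry(image.shape[:2], dst_hw)
    resized = resize(image, new_h, new_w)
    # Black canvas of the target size; keeps the channel axis if there is one.
    canvas = np.zeros(tuple(dst_hw) + image.shape[2:], dtype=image.dtype)
    canvas[top:top + new_h, left:left + new_w] = resized
    return canvas, {"scale": scale, "pad": (left, top)}
```

For a 400 × 300 (width × height) image and a 640 × 640 target, letterbox_geometry((300, 400), (640, 640)) gives a scale of 1.6, a resized size of 480 rows by 640 columns, and 80 rows of padding above and below.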

Output of Preprocessing

The preprocessing function returns two things:

processed_image
meta_information

The metadata contains:

scale factor
padding values

This information can later be used to adjust predictions back to the original image size.
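
As an example of that adjustment, the sketch below maps bounding boxes from the padded model input back to the original image. It assumes boxes are given as (x1, y1, x2, y2) in model-input pixels and that the metadata holds a single scale factor and a (pad_x, pad_y) offset for the left and top padding; the function name and these conventions are hypothetical.

```python
import numpy as np


def boxes_to_original(boxes_xyxy, scale, pad):
    """Undo letterboxing for (x1, y1, x2, y2) boxes in model-input pixels."""
    pad_x, pad_y = pad
    boxes = np.array(boxes_xyxy, dtype=np.float64)  # np.array copies the input
    boxes[:, [0, 2]] -= pad_x   # remove the left padding from x coordinates
    boxes[:, [1, 3]] -= pad_y   # remove the top padding from y coordinates
    boxes /= scale              # return to the original image's scale
    return boxes
```

With a scale of 1.6 and 80 rows of top padding, a box covering the whole resized area, (0, 80, 640, 560), maps back to (0, 0, 400, 300).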

Overall Workflow

The complete process looks like this:

Load Model
      ↓
Receive Image
      ↓
Preprocess Image
      ↓
Run Model
      ↓
Return Prediction

Why This Code Is Useful

This engine allows:

Running ONNX models inside swatahVision
Supporting both CPU and GPU
Automatically handling image preprocessing
Supporting single and batch inputs

It simplifies the process of deploying ONNX models in computer vision applications.

Summary

This code builds a bridge between:

swatahVision Framework
        ↓
ONNX Runtime Engine
        ↓
Machine Learning Model

It makes it easier to:

load models
prepare images
run predictions
retrieve results