swatahVision

An open-source Vision AI stack for real-world applications

swatahVision is an open-source Vision AI stack that brings together models, runtimes, post-processing, tracking, and visualization into a clean, reusable Python package.

It's built to make Vision AI practical: load a model, run inference, get structured outputs, visualize results, and ship pipelines faster, without rewriting glue code every time.

Part of the VisionAI4Bhārat initiative.


What is swatahVision?

swatahVision provides a unified interface for:

  • Inference across multiple runtimes (e.g., ONNX Runtime, OpenVINO)
  • Vision tasks
      • Object Detection
      • Image Classification
      • (Extensible for OCR / segmentation / multimodal pipelines)
  • Post-processing adapters
      • YOLO-style decoders + NMS
      • SSD-style decoders
      • RetinaNet-style decoders
  • Tracking
      • Integrated ByteTrack support to assign stable tracker_id across frames
  • Visualization
      • Built-in drawing utilities for boxes, labels, and overlays
  • Video processing utilities
      • Frame generators and pipeline helpers

Features

  • Unified model wrapper: same API for different backends/runtimes
  • Structured outputs:
      • Detections: boxes, confidence, class_id, tracker_id (and more)
      • Classification: class_id, confidence, top-k
  • Production-friendly utilities: FPS monitor, video readers/writers, batching-ready patterns
  • Lightweight & composable: designed as a stack, not a monolith
  • Real-world examples and analytics
  • Built-in annotation utilities
  • Minimal and clean API

Core Components

Model

sv.Model(model, engine, hardware)

Handles model loading and inference execution.

Image API

image = sv.Image.load_from_file(path)
sv.Image.show(image)

Post Processing

sv.Classification.from_mobilenet(...)
sv.Detections.from_ssd(...)
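
As a rough illustration of what a classification adapter typically has to produce (class_id, confidence, top-k), the sketch below turns raw scores into ranked predictions in plain Python. It is not swatahVision's implementation: `softmax` and `top_k` are hypothetical helpers, the scores are made up, and it assumes the model emits unnormalized logits.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, k=3):
    """Return the k most likely (class_id, confidence) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


logits = [1.0, 3.0, 0.5, 2.0]  # made-up scores for 4 classes
for class_id, confidence in top_k(logits, k=2):
    print(class_id, round(confidence, 3))
```

The highest-scoring class comes first, so the first pair corresponds to the single-label prediction and the remaining pairs are the runner-up classes.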

Annotation

  • Bounding boxes
  • Labels
  • Custom colors
  • Text positioning

Design Philosophy

  • Simplicity
  • Lightweight deployment
  • Fast prototyping
  • Developer-friendly workflows

Install

From source

git clone https://github.com/VisionAI4Bharat/swatahVision.git
cd swatahVision
pip install -e .

Quickstart

Load a model and run inference

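import swatahVision as sv

# Load an ONNX model and run it on the CPU
model = sv.Model(
    model="path/to/model.onnx",
    engine=sv.Engine.ONNX,
    hardware=sv.Hardware.CPU,
)

# Load the input image (see the Image API above)
image = sv.Image.load_from_file("path/to/image.jpg")

# Run inference; the result is the raw model output, not yet decoded
outputs = model(image, input_size=(640, 640))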

Detection

Convert raw outputs to Detections

import swatahVision as sv

detections = sv.Detections.from_yolo(
    outputs,
    conf_threshold=0.3,
    nms_threshold=0.5,
    class_agnostic=False
)

print(len(detections))
print(detections.xyxy[:3])
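
To make the three parameters above concrete, here is a minimal, standalone sketch of confidence filtering followed by greedy non-maximum suppression in plain Python. It is not the library's decoder: `iou` and `nms` are hypothetical helpers, the boxes are made up, and the parameter names are reused only to show the role each one conventionally plays.

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, class_ids,
        conf_threshold=0.3, nms_threshold=0.5, class_agnostic=False):
    """Return indices of boxes kept after confidence filtering and greedy NMS."""
    # 1. Drop low-confidence candidates, then visit the rest best-first.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= conf_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = []
    for i in order:
        # 2. Suppress box i if it overlaps too much with a box already kept.
        #    When class_agnostic is False, only same-class boxes compete.
        suppressed = any(
            (class_agnostic or class_ids[i] == class_ids[j])
            and iou(boxes[i], boxes[j]) > nms_threshold
            for j in kept
        )
        if not suppressed:
            kept.append(i)
    return kept


# Made-up candidates: boxes 1 and 2 both overlap box 0; box 2 has another class.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (1, 1, 11, 11), (50, 50, 60, 60), (0, 0, 5, 5)]
scores = [0.9, 0.8, 0.7, 0.6, 0.1]
class_ids = [0, 0, 1, 0, 0]

print(nms(boxes, scores, class_ids))                       # per-class suppression
print(nms(boxes, scores, class_ids, class_agnostic=True))  # all classes compete
```

With per-class suppression, the overlapping box of a different class survives. With class-agnostic suppression, it is removed along with the same-class duplicate.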

Filter / slice detections

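# Keep only detections with confidence above 0.6
high_conf = detections[detections.confidence > 0.6]
# class_id 0 is "person" for COCO-trained models; adjust for your label map
persons = detections[detections.class_id == 0]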
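
The indexing above looks like NumPy-style boolean masking. Assuming the Detections fields behave like NumPy arrays (an assumption, not something confirmed here), conditions can be combined into a single mask. The sketch below shows the pattern on plain, made-up NumPy arrays rather than on a real Detections object.

```python
import numpy as np

# Made-up per-detection arrays, aligned by index (one row per detection)
xyxy = np.array([[10, 10, 50, 80], [20, 30, 60, 90], [5, 5, 15, 15]], dtype=float)
confidence = np.array([0.9, 0.4, 0.7])
class_id = np.array([0, 0, 2])

# Each comparison yields a boolean array; combine them with & (and) or | (or).
# The parentheses are required because & binds tighter than > and ==.
mask = (confidence > 0.6) & (class_id == 0)

# Indexing with the mask keeps only the rows where it is True
confident_persons = xyxy[mask]
print(mask)
print(confident_persons)
```

If the assumption holds, the same mask could be passed to `detections[...]` to select confident detections of a single class in one step.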

Draw boxes

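import cv2
import swatahVision as sv

# cv2.imread returns None instead of raising when the file cannot be read
frame = cv2.imread("image.jpg")
if frame is None:
    raise FileNotFoundError("image.jpg could not be read")

# Draw on a copy so the original frame stays untouched;
# `detections` comes from the Detection step above
annotated = sv.UI.draw_bboxes(
    image=frame.copy(),
    detections=detections,
    conf=0.3
)

cv2.imwrite("out.jpg", annotated)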

License

This work is licensed under the GNU Lesser General Public License v3.0 (LGPL 3.0).