swatahVision
An open-source Vision AI stack for real-world applications
swatahVision is an open-source Vision AI stack that brings together models, runtimes, post-processing, tracking, and visualization into a clean, reusable Python package.
It’s built to make Vision AI practical: load a model, run inference, get structured outputs, visualize results, and ship pipelines faster — without reinventing glue-code every time.
Part of the VisionAI4Bhārat initiative.
What is swatahVision?
swatahVision provides a unified interface for:
- Inference across multiple runtimes (e.g., ONNX Runtime, OpenVINO)
- Vision tasks
- Object Detection
- Image Classification
- (Extensible for OCR / segmentation / multimodal pipelines)
- Post-processing adapters
- YOLO-style decoders + NMS
- SSD-style decoders
- RetinaNet-style decoders
- Tracking
- Integrated ByteTrack support to assign stable
tracker_idacross frames - Visualization
- Built-in drawing utilities for boxes, labels, and overlays
- Video processing utilities
- Frame generators and pipeline helpers
Features
- ✅ Unified model wrapper: same API for different backends/runtimes
- ✅ Structured outputs:
Detections: boxes, confidence, class_id, tracker_id (and more)Classification: class_id, confidence, top-k- ✅ Production-friendly utilities: FPS monitor, video readers/writers, batching-ready patterns
- ✅ Lightweight & composable: designed as a stack, not a monolith
- ✅ Real world examples and analytics
- ✅ **Built-in annotation utilities
- ✅ **Minimal and clean API
Core Components
Model
sv.Model(model, engine, hardware)
Handles model loading and inference execution.
Image API
image = sv.Image.load_from_file(path)
sv.Image.show(image)
Post Processing
sv.Classification.from_mobilenet(...)
sv.Detections.from_ssd(...)
Annotation
- Bounding boxes
- Labels
- Custom colors
- Text positioning
Design Philosophy
- Simplicity
- Lightweight deployment
- Fast prototyping
- Developer-friendly workflows
Install
From source
git clone https://github.com/VisionAI4Bharat/swatahVision.git
cd swatahVision
pip install -e .
Quickstart
Load a model and run inference
import swatahVision as sv
model = sv.Model(
model="path/to/model.onnx",
engine=sv.Engine.ONNX,
hardware=sv.Hardware.CPU
)
outputs = model(image, input_size=(640, 640))
Detection
Convert raw outputs to Detections
import swatahVision as sv
detections = sv.Detections.from_yolo(
outputs,
conf_threshold=0.3,
nms_threshold=0.5,
class_agnostic=False
)
print(len(detections))
print(detections.xyxy[:3])
Filter / slice detections
high_conf = detections[detections.confidence > 0.6]
persons = detections[detections.class_id == 0]
Draw boxes
import cv2
import swatahVision as sv
frame = cv2.imread("image.jpg")
annotated = sv.UI.draw_bboxes(
image=frame.copy(),
detections=detections,
conf=0.3
)
cv2.imwrite("out.jpg", annotated)
License
This work is licensed under LGPL 3.0