Pose Estimation using MoveNet (ONNX)

This project demonstrates human pose estimation using a MoveNet ONNX model. It supports image and video inference and provides structured pose outputs (keypoints + confidence) along with visualization.

🚀 Features

MoveNet ONNX inference (CPU)
Image and video pose estimation
Keypoint parsing (17 COCO joints)
Skeleton visualization
Clean class-based architecture
Compatible with swatahVision-style parsers

📥 Model Download

Pretrained models for swatahVision are available in the Model Zoo.

🔗 https://visionai4bharat.github.io/swatahVision/model_zoo/

📁 Project Structure

pose_estimation/
│
├── Movenet.onnx                # Pose estimation model
├── movenet_pose.py             # Inference class (video / image)
├── pose.py                     # Pose parser (from_movenet)
├── assets/
│   └── sample.jpg              # Example input image
└── README.md

📦 Requirements

Install dependencies:

pip install onnxruntime opencv-python numpy

(Optional: needed only for the swatahVision image example)

pip install swatahVision

⬇️ Download the MoveNet ONNX Model

Option 1 — Download pre-converted ONNX (recommended)

Download a MoveNet ONNX model from a model hub. Example sources:
ONNX model zoo mirrors
TensorFlow → ONNX community conversions
Your internal model storage
Rename the file to:

Movenet.onnx

Place it inside:

pose_estimation/

Option 2 — Convert MoveNet to ONNX yourself

If you have a TensorFlow MoveNet SavedModel:

Install:

pip install tf2onnx tensorflow

Convert:

python -m tf2onnx.convert \
--saved-model movenet_saved_model \
--output Movenet.onnx \
--opset 13

Move the generated file into the project folder.

▶️ Run Pose Estimation (Image)

Example:

import swatahVision as sv
from pose import Pose

model = sv.Model(
    model="pose_estimation/Movenet.onnx",
    engine=sv.Engine.ONNX,
    hardware=sv.Hardware.CPU
)

image = sv.Image.load_from_file("pose_estimation/assets/sample.jpg")

outs = model(image)

pose = Pose.from_movenet(outs)

print(pose.keypoints)
print(pose.confidence)

▶️ Run Pose Estimation (Video)

python movenet_pose.py

Press q to exit.

🧠 Output Format

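MoveNet returns 17 keypoints. Each keypoint has:

[x, y] coordinates
a confidence score

Keypoints follow COCO order:

nose
left eye
right eye
left ear
right ear
left shoulder
right shoulder
left elbow
right elbow
left wrist
right wrist
left hip
right hip
left knee
right knee
left ankle
right ankle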
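As an illustration of the output format, here is a minimal sketch of decoding a raw MoveNet tensor into named pixel coordinates. It assumes the single-pose layout of shape [1, 1, 17, 3] with each row stored as (y, x, score) normalized to [0, 1], and that the frame was resized without padding. `decode_movenet` and `KEYPOINT_NAMES` are hypothetical names for this example, not the project's `Pose.from_movenet` implementation.

```python
import numpy as np

# COCO keypoint order used by MoveNet (17 joints).
KEYPOINT_NAMES = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]


def decode_movenet(raw, image_width, image_height):
    """Convert a raw [1, 1, 17, 3] tensor into pixel keypoints and scores.

    Assumption: each row is (y, x, score), normalized to [0, 1].
    Returns a (17, 2) array of [x, y] pixels and a (17,) array of scores.
    """
    arr = np.asarray(raw, dtype=np.float32).reshape(17, 3)
    xs = arr[:, 1] * image_width   # column 1 holds normalized x
    ys = arr[:, 0] * image_height  # column 0 holds normalized y
    keypoints = np.stack([xs, ys], axis=1)
    confidence = arr[:, 2]
    return keypoints, confidence
```

Pairing the result with the names, for example `dict(zip(KEYPOINT_NAMES, keypoints.tolist()))`, gives a per-joint lookup that is easier to read when debugging.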

🛠️ How it works

Pipeline:

Load ONNX model
Preprocess frame (resize → normalize)
Run ONNX inference
Parse keypoints via Pose.from_movenet
Draw skeleton
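The preprocessing step might look like the sketch below. It is not the code in movenet_pose.py: `preprocess` is a hypothetical helper, the nearest-neighbour resize is written in plain NumPy only to keep the example dependency-free (`cv2.resize` would be the usual choice), and the input dtype is an assumption. Many MoveNet exports take int32 pixel values in 0 to 255, while others expect floats scaled to [0, 1], so check your model's input signature first.

```python
import numpy as np


def preprocess(frame, size=192, normalize=False):
    """Resize an HxWx3 uint8 frame to size x size and add a batch axis.

    normalize=False -> int32 tensor with values in 0..255 (assumed default)
    normalize=True  -> float32 tensor scaled to [0, 1]
    """
    height, width = frame.shape[:2]
    # Source row/column index for every output pixel (nearest neighbour).
    rows = np.arange(size) * height // size
    cols = np.arange(size) * width // size
    resized = frame[rows[:, None], cols[None, :]]
    if normalize:
        tensor = resized.astype(np.float32) / 255.0
    else:
        tensor = resized.astype(np.int32)
    return tensor[np.newaxis, ...]  # shape: (1, size, size, 3)
```

Frames read with OpenCV are in BGR channel order, so a BGR to RGB conversion before this step is likely needed.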

✅ Supported Tasks

Pose estimation
Real-time video pose estimation
Visualization
Framework integration (swatahVision)

📌 Notes

Model expects 192×192 input (the MoveNet Lightning size; the Thunder variant uses 256×256)
Works on CPU
Confidence threshold can be adjusted
Compatible with other ONNX pose models that produce a similar output layout
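One way the confidence threshold could be applied when drawing the skeleton is sketched below: a limb is kept only when both of its endpoints pass the threshold. `SKELETON_EDGES`, `visible_segments`, and the 0.3 default are assumptions for illustration, and the edge list is one common layout for the 17 COCO joints rather than the project's own.

```python
# Index pairs into the 17-keypoint COCO order (one common skeleton layout).
SKELETON_EDGES = [
    (0, 1), (0, 2), (1, 3), (2, 4),           # head
    (5, 6), (5, 7), (7, 9), (6, 8), (8, 10),  # shoulders and arms
    (5, 11), (6, 12), (11, 12),               # torso
    (11, 13), (13, 15), (12, 14), (14, 16),   # legs
]


def visible_segments(keypoints, confidence, threshold=0.3):
    """Return ((x1, y1), (x2, y2)) pairs whose endpoints both pass the threshold."""
    segments = []
    for a, b in SKELETON_EDGES:
        if confidence[a] >= threshold and confidence[b] >= threshold:
            segments.append((tuple(keypoints[a]), tuple(keypoints[b])))
    return segments
```

Each returned segment can then be passed to a line-drawing call such as `cv2.line` after rounding the coordinates to integers. Raising the threshold hides uncertain limbs, and lowering it shows more of the skeleton at the cost of jitter.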

🤝 Contributing

Fork the repository
Create a branch
Make changes
Open a pull request