PyTorch Archives - Page 3 of 43

Semantic Segmentation using Web-DINO

In this article, we modify the Web-DINO 300M architecture for semantic segmentation. We add a simple segmentation decoder head and train the model for person segmentation. ...

Image Classification with Web-DINO

Sovit Ranjan Rath, June 23, 2025

In this article, we use the Web-DINO model for image classification. We modify the Web-DINO 300M model by adding a classification head on top, freezing the backbone, and training on a cotton disease classification task. ...

Getting Started with SmolVLM2 – Code Inference

Sovit Ranjan Rath, June 9, 2025

In this article, we cover inference code for SmolVLM2. We carry out image and video inference experiments using SmolVLM2-2.2B-Instruct and SmolVLM2-256M-Instruct. ...

Fine-Tuning SmolVLM for Receipt OCR

Sovit Ranjan Rath, May 26, 2025

In this article, we fine-tune the SmolVLM-256M model for receipt OCR on the SROIE v2 dataset, after generating the ground truth data using the QwenVL-2B model. ...

Gemma 3 – Advancing Open, Lightweight, Multimodal AI

Sovit Ranjan Rath, May 19, 2025

In this article, we explore Gemma 3. We start with the need for Gemma 3, then cover its architecture and multimodal capabilities, and finally carry out inference using Hugging Face. ...
