Deep Learning Archives - DebuggerCafe

Getting Started with Molmo2

In this article, we get started with Molmo2. We begin by discussing the important aspects of the technical article and report. Then we move to a simple inference pipeline for image VQA, video VQA, and image pointing. ...

Getting Started with GLM-4.6V

Sovit Ranjan Rath April 20, 2026 0 Comment

In this article, we cover the GLM-4.6V model. Specifically, we cover the technical capabilities of the model along with inference for image description, OCR, and image-to-HTML code generation. ...

Fine-Tuning DeepSeek-OCR 2

Sovit Ranjan Rath April 13, 2026 0 Comment

In this article, we fine-tune the DeepSeek-OCR 2 model using Unsloth for Hindi language OCR. We create a simple Gradio application to run inference and check the diff between the original and the inference result. ...

SAM 3 UI – Image, Video, and Multi-Object Inference

Sovit Ranjan Rath February 23, 2026 0 Comment

In this article, we create a simple SAM 3 Gradio UI for image and video segmentation. The SAM 3 UI supports segmenting objects from different categories while using less than 10GB VRAM. ...

SAM 3 Inference and Paper Explanation

Sovit Ranjan Rath February 9, 2026 2 Comments

In this article, we cover the SAM 3 model. We briefly discuss the SAM 3 paper, including the motivation, the architecture, and the data engine. Next, we move on to image and video inference using SAM 3. ...
