DebuggerCafe - Deep Learning, Machine Learning, Artificial Intelligence

Creating a Sketch to HTML Application with Qwen3-VL

In this article, we build a simple sketch-to-HTML application using Qwen3-VL. Users upload an image or screenshot of a potential website, and the Qwen3-VL model returns the HTML for it. ...

Introduction to Qwen3-VL

Sovit Ranjan Rath December 15, 2025 2 Comments

In this article, we explore the Qwen3-VL model, the latest iteration of the Qwen-VL series. We start with the model architecture and benchmarks, and then move to hands-on inference for object detection, OCR, video understanding, and sketch-to-HTML generation. ...

Fine-Tuning Phi-3.5 Vision Instruct

Sovit Ranjan Rath December 8, 2025 0 Comment

In this article, we fine-tune the Phi-3.5 Vision Instruct model on a receipt OCR dataset, using Hugging Face libraries to train a LoRA adapter. ...

Object Detection with DEIMv2

Sovit Ranjan Rath December 1, 2025 0 Comment

In this article, we explore the DEIMv2 object detection model, which is based on the DINOv3 and HGNetv2 backbones, and carry out inference on images and videos. ...

Introduction to Moondream3 and Tasks

Sovit Ranjan Rath November 24, 2025 0 Comment

In this article, we cover Moondream3, the latest iteration in the Moondream VLM family. We cover the model architecture and carry out inference on the different tasks that it supports. ...