DebuggerCafe - Deep Learning, Machine Learning, Artificial Intelligence

Video Summarizer Using Qwen2.5-Omni

In this article, we build a simple video summarizer application using the Qwen2.5-Omni 3B model, with the UI powered by Gradio. ...

Introduction to BAGEL: An Unified Multimodal Model

Sovit Ranjan Rath, July 28, 2025

In this article, we introduce BAGEL, a unified multimodal model for image generation, image editing, and free-form image manipulation, with both non-thinking and thinking capabilities. ...

Fine-Tuning SmolLM2

Sovit Ranjan Rath, July 21, 2025

Fine-tuning the SmolLM2-135M Instruct model, a small language model, on the WMT14 French-to-English subset for machine translation. ...

LitGPT – Getting Started

Sovit Ranjan Rath, July 14, 2025

In this article, we explore LitGPT. We cover chatting with pretrained models, fine-tuning on a custom dataset, and evaluating the model after fine-tuning. ...

Qwen3 – Unified Models for Thinking and Non-Thinking

Sovit Ranjan Rath, July 7, 2025

In this article, we discuss Qwen3, the latest iteration in the Qwen family of models. We cover the need for Qwen3, its architecture, and its training strategy. ...