Getting Started with GLM-4.6V
In this article, we cover the GLM-4.6V model: its technical capabilities, along with inference for image description, OCR, and image-to-HTML code generation. ...
In this article, we fine-tune the DeepSeek-OCR 2 model using Unsloth for Hindi language OCR. We create a simple Gradio application to run inference and check the diff between the original and the inference result. ...
In this article, we discuss the DeepSeek-OCR 2 paper. We start with DeepEncoder V2, move on to the architecture, and finally discuss the code from Hugging Face. ...
In this article, we cover inference using DeepSeek-OCR 2. We create a simple pipeline where we can provide either the path to a PDF or an image for processing. We also create a Gradio application for a better experience with DeepSeek-OCR 2. ...
In this article, we tackle multi-turn tool calling with LLM assistants using the gpt-oss-chat library. We create a simple system where the assistant can call multiple tools for a single user query depending on the requirements. ...