Introduction to Qwen3.5 – Overview, vLLM, and llama.cpp
In this article, we introduce Qwen3.5 by going through the important aspects of the official article, along with image and video inference using vLLM and llama.cpp. ...
In this article, we tackle multi-turn tool calling for LLM assistants with the gpt-oss-chat library. We create a simple system in which the assistant can call multiple tools for a single user query, depending on the requirements. ...
In this article, we add RAG as a tool call to gpt-oss-chat, letting the assistant decide when to search a user-provided document via a Qdrant in-memory DB. ...
In this article, we add a web search tool call to the gpt-oss-chat CLI mode. We cover the definition of tools, how to handle streaming with tool calls, and other caveats. ...
In this article, we work on gpt-oss-chat, a local, user-friendly chat interface powered by gpt-oss-20b, with local RAG and web search capabilities. ...