Local LLM Apps
Tool | Category | Segment | Platform / Tool | Plan / License | Monthly Price USD | Pricing Model | Free Tier / OSS | Included Usage / Limits | Local Runtime / Model Sources | App UX / Knowledge Features | API / Extensibility | Integrations / Frameworks | Deployment / Hosting | Security / Privacy | Team / Governance | Best Fit | Main Limits / Caveats |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
No tagline | Local LLM Apps | Low-level local LLM engine | llama.cpp | MIT / open source | $0 software; hardware/model costs separate | Local inference engine, CLI and server | ✓ | Repository provides C/C++ LLM inference with GGUF support and local server tooling | GGUF models, quantized open-weight models, CPU/GPU backends and many architectures supported by project | CLI tools, simple server/web UI, benchmarks, quantization and low-level runtime controls | C/C++ library, local HTTP server, OpenAI-compatible endpoints, bindings and downstream apps | Ollama, llamafile, KoboldCpp, LM Studio-style stacks, Python bindings and embedded apps | Local binary/library, embedded runtime or self-hosted local server | No cloud by default; security depends how the server is bound and exposed | Library/engine-level governance only; no app team layer | Developers needing maximum control over local inference internals | Not a polished end-user app; model setup and flags can be technical |
No tagline | Local LLM Apps | Local OpenAI-compatible stack | LocalAI | MIT / open source | $0 software; hardware/model costs separate | Self-hosted OpenAI/Anthropic-compatible local AI stack | ✓ | Official site describes LocalAI as a free OpenAI and Anthropic alternative running locally on user hardware | LLMs, embeddings, image, audio, agents and document intelligence through modular local backends | API-first workflow rather than a consumer chat UI; companion LocalAGI/LocalRecall expand agent/search flows | OpenAI-compatible API, Anthropic-style compatibility, modular backends and Docker deployment | OpenAI-compatible clients, LangChain, LlamaIndex, local agents, RAG and multimodal backends | Docker, local server, workstation or on-prem host | No cloud required; endpoint exposure, model files and plugins need operator security review | No SaaS governance by default; suitable for self-managed internal endpoints | Developers replacing cloud API endpoints with a local/private stack | More backend-oriented than desktop-app UX; setup can be more technical |
No tagline | Local LLM Apps | Desktop local LLM workstation | LM Studio | Proprietary desktop app | $0 for personal/approved use; business terms should be checked | Free desktop app with separate terms for work/business use | Yes, free desktop entry | Docs describe local/offline use and a localhost OpenAI-style server; model downloads and hardware are separate | Hugging Face/open-weight models, GGUF models, MLX on Apple Silicon and Ollama/local endpoints | Model discovery, download, chat, local server, playground and parameter controls | OpenAI-like local server, structured output support, tool/MCP features and local network mode | OpenAI-compatible clients, MCP servers, local apps, coding tools and custom endpoints | Windows, macOS and Linux desktop; localhost or local-network serving | Requests to local server stay local; cloud endpoints or remote providers follow their own terms | Desktop app is individual-first; business compliance depends current LM Studio terms | Users who want a polished GUI for downloading and testing local models | Not open source; enterprise/commercial usage terms must be verified before rollout |
No tagline | Local LLM Apps | All-in-one local AI workspace | AnythingLLM | MIT core / self-hosted terms | $0 Desktop/self-hosted entry; hosted plans separate | Free local desktop and self-hosted app; hosted/business packaging separate | ✓ | Official site promotes a free Desktop app and local/enterprise model provider support | Ollama, LM Studio, LocalAI, OpenAI, Azure, AWS, Anthropic and other local or enterprise LLM providers | Workspaces, document chat, RAG, agents, MCP, plugins, desktop OS workflow and multimodal chat | API, MCP compatibility, workspace agents, embeddings/vector stores and Docker/self-hosted configuration | Local LLM runtimes, cloud providers, vector DBs, document sources, MCP and enterprise connectors | Desktop app, Docker/self-hosted server or hosted AnythingLLM | Desktop/self-hosted data can stay local; provider calls follow configured backend privacy | Single-user desktop and multi-user self-hosted/hosted options; licensing terms should be reviewed | Private document chat and agent workflows without building a full RAG stack | Hosted pricing and enterprise controls differ from free desktop; RAG quality depends ingestion and embeddings |
No tagline | Local LLM Apps | Desktop AI client | Chatbox | GPL-3.0 community edition | $0 community software; Pro/cloud/provider costs separate | Desktop client with community and commercial packaging | Yes, community edition | Official repo describes a desktop client for ChatGPT, Claude and other LLMs with Ollama local model support | Ollama local models, OpenAI, Azure OpenAI, Claude, Gemini and custom providers | Desktop chat, local data storage, prompts, multi-provider switching and cross-platform app UX | Custom providers, local Ollama connection, image generation provider support and desktop settings | Ollama, OpenAI, Azure, Claude, Gemini, ChatGLM and custom API endpoints | Windows, macOS, Linux desktop | Local data storage; provider calls leave device according to selected model backend | Community app is individual-first; commercial/team controls depend paid product | Users wanting a simple desktop AI client that can point at local and cloud models | GPL obligations for community edition; exact paid feature split can change |
No tagline | Local LLM Apps | Portable local LLM runner | KoboldCpp | AGPL-3.0 / open source | $0 software; hardware/model costs separate | One-file local LLM runner and UI | ✓ | Repository describes easy GGML/GGUF local text generation with a KoboldAI-style UI and no complex install | GGUF/GGML models, llama.cpp-derived backends, selected image/audio/TTS capabilities depending build | Built-in web UI, story/chat modes, sampler controls, context controls and local model loading | Kobold API compatibility, OpenAI-style endpoints in recent builds, CLI flags and frontend integrations | SillyTavern, KoboldAI ecosystem, local GGUF models and llama.cpp tooling | Single local executable on desktop/server; optional LAN host | Fully local when used with local models; LAN exposure must be controlled | No team governance; single-user/hobbyist workflow | Creative writing and roleplay users who want a self-contained local executable | Advanced configuration can be dense; large models still require strong RAM/VRAM |
No tagline | Local LLM Apps | Desktop local AI studio | Msty | Proprietary desktop app | $0 Free; Aurum Annual $129/user/year | Free personal plan plus paid commercial/support license | Yes, free plan | Pricing page lists Free forever with local and online models; professional/business use requires paid license | Ollama-backed local models, MLX models on Apple Silicon, llama.cpp models and online providers | Split chats, knowledge stacks/RAG, web search, attachments, prompts library, branching and flowchat | Local model management, custom endpoints, advanced export/search in paid tier and desktop/web studio access | Ollama, MLX, llama.cpp, Azure OpenAI and online/local model providers | Desktop app and Msty Studio web/desktop access depending plan | Local models run on the user's machine; online providers and web search may transmit data | Free individual plan; paid license covers commercial use and more advanced features | Power users who want a polished desktop workspace with local models and RAG | Proprietary; commercial usage requires paid license; pricing may change |
No tagline | Local LLM Apps | Single-file local LLM package | llamafile | Apache-2.0 / MIT components | $0 software; hardware/model costs separate | Single executable model runtime and server | ✓ | Mozilla page describes bundling model weights, inference engine and runtime into one executable file | GGUF/open-weight models packaged with llama.cpp/Cosmopolitan runtime | Download-run local chat/server workflow with no separate install in supported builds | OpenAI-compatible local server mode, CLI, packaged model distribution and embedded runtime | Mozilla AI tooling, llama.cpp, local apps and single-file distribution workflows | Single local executable on desktop/server across supported OS targets | Runs locally and offline after download; file provenance and model license should be verified | No team governance; distribution and update governance are external | Portable demos, classrooms, offline field work and low-friction local AI experiments | Model files can be large; packaging reduces setup but not hardware requirements |
No tagline | Local LLM Apps | Advanced local model web UI | text-generation-webui | AGPL-3.0 / open source | $0 software; hardware/model costs separate | Open-source local web UI for many model loaders | ✓ | Repository describes a Gradio web UI for LLMs with transformers, GPTQ, AWQ, EXL2, llama.cpp and GGUF support | Transformers, llama.cpp/GGUF, ExLlama, GPTQ, AWQ, EXL2 and other local model formats/loaders | Chat/notebook modes, model loader controls, extensions, character prompts and generation parameters | OpenAI-compatible API extension, Gradio extensions, model loaders and community plugins | Hugging Face models, llama.cpp, ExLlama, Transformers, SillyTavern and local tooling | Local Python environment, one-click installers or server-style deployment | Runs locally; extensions and remote model downloads need trust review | No native enterprise governance; operator controls users/networking | Experimenters who need broad loader support and fine-grained generation controls | More setup and dependency complexity than Ollama or LM Studio |
No tagline | Local LLM Apps | Self-hosted chat framework | LobeChat | MIT / open source | $0 self-hosted software; hosted/provider costs separate | Open-source self-hosted UI with optional cloud/provider costs | ✓ | Repository describes an open-source modern AI chat framework supporting Ollama, Qwen, DeepSeek and major providers | Ollama, OpenAI, Claude, Gemini, Qwen, DeepSeek, OpenAI-compatible endpoints and multimodal models | Modern chat UI, knowledge base, plugins, assistants, artifacts, TTS/vision and one-click deployments | Plugin system, server database mode, auth integrations, OpenAI-compatible providers and deployment templates | Vercel, Docker, serverless, local models, cloud providers and knowledge base backends | Self-hosted web app, Vercel/serverless or private server | Self-hosted control; provider and storage choices determine data flow | Individual and team deployment possible; full governance depends deployment/auth setup | Builders who want a polished self-hostable AI chat product with plugin UX | More app-framework complexity than a simple desktop local runner |
No tagline | Local LLM Apps | Private desktop chatbot | GPT4All | Open source / commercial use allowed per repo | $0 software; hardware and model licenses separate | Free desktop app and local inference ecosystem | ✓ | Official docs describe private local desktop chat, LocalDocs and no required API calls or GPU for basic use | GPT4All model ecosystem, local GGUF-style models and API/provider connections | Desktop chat, local model browser, LocalDocs private document Q&A and settings for local inference | Python SDK, local API server, bindings and desktop integrations | Nomic ecosystem, local documents, Python apps and OpenAI-compatible workflows | Windows, macOS, Linux desktop and local API server | Private local operation when using local models; document data remains on device unless external providers are configured | No built-in enterprise team layer in desktop app | Non-technical users who want private local document chat on everyday computers | Model quality is constrained by local hardware; larger models still need significant RAM/VRAM |
No tagline | Local LLM Apps | Offline desktop assistant | Jan | Open source desktop app | $0 local use; provider/model costs separate | Free local desktop app plus optional external provider usage | ✓ | Docs state local use is always free and Jan can work offline after models are downloaded | Local models via built-in runtimes/Hugging Face plus remote providers such as OpenAI-compatible APIs | Desktop ChatGPT-style interface, model hub, assistants, local API server and file/chat workflows | OpenAI-compatible local server, extensions, provider routing and model management | Hugging Face models, local engines, OpenAI-compatible apps and desktop OS integrations | Windows, macOS and Linux desktop | Offline operation possible; cloud model calls send data to selected provider | Mostly individual/local workflow; team governance depends external deployment choices | Users who want an open-source offline ChatGPT replacement with a desktop feel | Local model performance depends hardware; optional cloud services and roadmap should be checked |
No tagline | Local LLM Apps | Local model runtime and app | Ollama | MIT / open source | $0 software; hardware and model licenses separate | Local runtime, model manager and desktop/CLI workflow | ✓ | Official project runs models locally and exposes a local API; model downloads and compute are user-provided | Ollama model library, GGUF-derived quantized models, Llama, Qwen, Gemma, Mistral, DeepSeek and custom Modelfiles | CLI, desktop installer, model pull/run workflow and simple chat loop | Local REST API, OpenAI-compatible endpoints, Docker image, Python/JS clients and Modelfile customization | Open WebUI, AnythingLLM, Page Assist, LangChain, LlamaIndex, Continue, Cline and many OpenAI-compatible tools | Local desktop, local server, Docker or LAN-accessible self-hosted runtime | Runs locally by default; exposed LAN/public servers need explicit network hardening | No SaaS governance in OSS runtime; model and endpoint access are operator-managed | Fastest path to local model execution for developers and hobbyists | Quality and speed depend on local hardware; model licenses vary; misconfigured servers can expose local endpoints |
No tagline | Local LLM Apps | Self-hosted AI chat platform | Open WebUI | Open WebUI License / source available | $0 self-hosted software; hosting and model costs separate | Self-hosted web UI and platform with optional external model costs | Yes, source-available self-hosting | Official repo positions Open WebUI as a user-friendly AI interface supporting Ollama and OpenAI-compatible APIs | Ollama, OpenAI-compatible APIs, external providers, RAG stores and tool servers | ChatGPT-like UI, users, workspaces, files/RAG, tools, functions, admin controls and model switching | Pipelines, functions, OpenAPI tools, REST/WebSocket APIs, OAuth/LDAP/SCIM options and Docker/Helm deployment | Ollama, OpenAI APIs, Kubernetes, Docker, Helm, LDAP/OAuth/OIDC and vector/RAG backends | Docker, pip, Kubernetes/Helm, local machine or private server | Self-hosted data control; auth, network exposure and provider routing must be configured carefully | Admin panel, users, groups and enterprise-style auth integrations | Teams that want a self-hosted internal AI portal over local and hosted models | License has branding/trademark requirements; operating it securely requires admin work |
No tagline | Local LLM Apps | Self-hosted multi-provider chat | LibreChat | MIT / open source | $0 software; model/API/hosting costs separate | Free self-hosted web application | ✓ | Official site describes LibreChat as free, open source, self-hosted and no subscription | Ollama/local endpoints plus OpenAI, Anthropic, Google, Azure, AWS Bedrock and other providers | Unified chat UI, agents, files, code interpreter-style workflows, plugins and multi-model switching | MCP support, custom endpoints, plugins, OAuth/SAML/LDAP, moderation and rate limiting | Docker, MongoDB, Redis, OAuth providers, local LLM endpoints and major cloud LLM APIs | Self-hosted Docker/server deployment | Self-hosted control; conversations still go to configured model providers unless local endpoints are used | Multi-user auth, SSO options, rate limiting and admin-oriented controls | Organizations wanting a self-hosted ChatGPT-style portal with provider flexibility | Requires operating databases and auth; MongoDB licensing/compliance should be reviewed for commercial deployments |
No tagline | Local LLM Apps | Local AI app launcher | Pinokio | Free desktop app / source available | $0 software; app/model/hardware costs separate | Local AI app launcher and localhost cloud | ✓ | Official docs describe Pinokio as a local platform to install, run and automate AI apps on a user's own machine | Runs local AI apps and servers rather than one model format; apps may include Ollama, ComfyUI, Whisper and web UIs | One-click install/run, built-in browser, dependency management, app discovery and local automation scripts | JSON app scripts, local runtimes, app recipes, local web servers and agent/app control features | Python, Node.js, Bun, Git, Conda, local AI apps and web UI stacks | Desktop app hosting local web apps on the user's machine | Local-first by design; scripts can execute code and must be trusted before install | No enterprise governance by default; app/script trust is user-managed | Non-specialists who want one-click local AI app installation | Security depends heavily on script provenance; not just an LLM chat application |
No tagline | Local LLM Apps | Personal AI second brain | Khoj | AGPL-3.0 / open source | $0 self-hosted software; cloud/provider costs separate | Self-hostable app plus optional cloud service | ✓ | Official repo describes Khoj as open-source and self-hostable, scaling from on-device personal AI to cloud-scale enterprise AI | Local LLMs such as llama/qwen/mistral via local providers plus online models and Khoj cloud | Chat with docs/web, semantic search, custom agents, automations, newsletters and apps/plugins | APIs, agents, browser/desktop/mobile, Obsidian/Emacs integrations and local/online model routing | Obsidian, Emacs, browser, desktop, WhatsApp, local LLMs, web search and document stores | Self-hosted on local machine/server or hosted Khoj cloud | Self-hosting can keep docs local; cloud app and online models change data flow | Self-hosted governance is operator-managed; enterprise/cloud governance depends plan | Personal knowledge bases that need local/private model options plus search | Broader second-brain/RAG app, not a minimal local model runner |
No tagline | Local LLM Apps | Browser sidebar for local models | Page Assist | Open source browser extension | $0 software; backend/provider costs separate | Browser extension and web UI for local AI models | ✓ | Official site describes Page Assist as an open-source browser extension with sidebar and web UI for local AI models | Ollama, Chrome AI/Gemini Nano beta, OpenAI-compatible providers such as llama.cpp, LM Studio, Llamafile and vLLM | Browser sidebar, page-aware chat, web UI, PDF/document chat and search while browsing | Browser extension APIs, local provider URLs, OpenAI-compatible endpoints and page context tools | Chrome/Edge-style browsers, Ollama, LM Studio, llama.cpp, Llamafile, vLLM and local documents | Browser extension plus local model provider on desktop | Page context can be sent to configured local or remote provider; browser permissions need review | Individual browser workflow; no enterprise governance unless managed by browser policy | Users who want local AI assistance inside the browser | Depends on a separate local model backend; browser extension permissions must be understood |
No tagline | Local LLM Apps | Local computer-use assistant | Open Interpreter | AGPL-3.0 / open source | $0 software; provider/model/runtime costs separate | Local CLI/agent that can use local or hosted models | ✓ | Official repo says Open Interpreter lets LLMs run code locally and provides a ChatGPT-like terminal interface | Local models through Ollama, LM Studio, Jan, Llamafile and provider-agnostic model configuration | Terminal chat, code execution, local file/media/data workflows and approval before running code | Provider-agnostic model picker, local model guides, shell/Python/JS execution and MCP-like local workflow | Ollama, LM Studio, Jan, Llamafile, OpenAI-compatible APIs, local shell and browser tools | Local CLI and desktop/agent workflows | Powerful local code execution; user approval and sandboxing discipline are critical | No team governance by default; local permissions and review process are user-managed | Technical users who want a local AI agent operating on their computer | High risk if users approve unsafe code; local model quality may be insufficient for complex tasks |
No tagline | Local LLM Apps | Local chat interface | Serge | Apache-2.0 / open source | $0 software; hardware/model costs separate | Dockerized local chat app over llama.cpp | ✓ | Repository describes Serge as a web interface for chatting with Alpaca through llama.cpp, fully dockerized with an API | llama.cpp-compatible local models, originally Alpaca/LLaMA-style local models | Simple web chat UI, local conversations and dockerized app/API stack | Docker API, llama.cpp backend and local web interface | Docker, llama.cpp and local model files | Self-hosted Docker on local machine/server | No API keys or cloud required when using local models; Docker/network exposure must be controlled | No team governance; hobbyist/self-hosted app | Users wanting a simple historical self-hosted llama.cpp chat UI | Older project and model assumptions; verify maintenance before relying on it |
No tagline | Local LLM Apps | Offline desktop/mobile chat | Atomic Chat | Open source desktop/mobile app | $0 software | Free local AI chat app | ✓ | Official terms say Atomic Chat is free, open source, local, requires no account/subscription and runs models on device | Bundled/local models and Atomic local inference stack; OpenAI-compatible local server on desktop builds | Private offline chat, desktop/mobile app UX and simple local model operation | OpenAI-compatible localhost server, GitHub source and local inference components | Desktop apps, mobile apps, local tools and OpenAI-compatible clients | macOS, Windows, iPhone and other app targets depending release availability | No account/server data for local use; uninstall removes local app data per terms | Individual-first app; no team governance captured | Users who want a zero-account local chat client with mobile/desktop reach | Newer project; model catalog, platform parity and maturity should be validated |
No tagline | Local LLM Apps | Private document assistant | PrivateGPT | Apache-2.0 / open source | $0 software; hardware/model costs separate | Self-hosted private document Q&A app/framework | ✓ | Project describes private document interaction where data does not leave the execution environment | Local LLMs, local embeddings, document ingestion and configurable model backends | Document ingest, private Q&A, local RAG pipeline and API/server modes | Python APIs, local vector stores, embeddings, LLM backends and Docker/deployment templates | llama.cpp, Ollama-compatible/local models, embeddings models and document pipelines | Local Python app, Docker or private server | Designed for private local documents; first-time model downloads and configured providers need review | No broad team governance unless deployed behind internal auth controls | Users who want local document Q&A without sending files to SaaS | More RAG-focused than general chat; maintaining model/storage dependencies takes work |
No tagline | Local LLM Apps | Power-user LLM frontend | SillyTavern | AGPL-3.0 / open source | $0 software; backend/model/provider costs separate | Locally installed LLM frontend for many backends | ✓ | Official repo says SillyTavern is locally installed and provides no hosted service or user tracking | KoboldAI/CPP, Ooba, Tabby, Ollama/OpenAI-compatible APIs, Claude, OpenRouter, NovelAI and image/TTS backends | Roleplay/story UI, character cards, lorebooks, prompt controls, extensions, Visual Novel mode and mobile-friendly layout | Extension system, backend adapters, image/TTS APIs, prompt macros and community content | KoboldCpp, text-generation-webui, Tabby, OpenAI-compatible APIs, ComfyUI, Automatic1111 and TTS tools | Local Node.js app, Docker or local network front end | No hosted service; privacy depends selected backend and whether endpoints are remote | Community/hobbyist governance; no enterprise admin layer | Power users building local creative writing and character chat workflows | Steep learning curve; some use cases require careful content and safety policies |
No tagline | Local LLM Apps | Offline ChatGPT-like app | LlamaGPT | MIT / open source | $0 software; hardware/model costs separate | Self-hosted offline chatbot | ✓ | Repository describes a self-hosted offline ChatGPT-like chatbot with no data leaving the device | Llama 2 era local models and llama.cpp-python style local inference stack | Simple ChatGPT-like UI, Docker compose deployment and private local chat | Docker Compose, local model files and web app/API stack | Umbrel, Docker, local server and llama.cpp-style local inference | Self-hosted Docker or umbrelOS home server | Offline/private by design once installed; model/source downloads require trust review | No team governance; personal/home-server orientation | Home lab users wanting a simple private chatbot appliance | Model stack is older; verify maintenance and model support before new production use |
No tagline | Local LLM Apps | Native macOS local chat | LlamaChat | Open source macOS app | $0 software; hardware/model costs separate | Free local macOS chat app | ✓ | Official site says LlamaChat lets users chat with LLaMA, Alpaca and GPT4All models running locally on Mac | Local LLaMA, Alpaca and GPT4All-style models on macOS | Native macOS chat UI focused on local model interaction | GitHub source, local model loading and app-level integration with macOS | macOS, GPT4All-era local models and local model files | Native macOS desktop app | Local-only model operation when using local files; model provenance and app maintenance should be checked | No team governance; individual desktop workflow | Mac users wanting a lightweight native local LLM chat client | Narrow platform scope and older model assumptions compared with Jan or LM Studio |