Local LLM Apps

Tool	Category	Segment	Platform / Tool	Plan / License	Monthly Price USD	Pricing Model	Free Tier / OSS	Included Usage / Limits	Local Runtime / Model Sources	App UX / Knowledge Features	API / Extensibility	Integrations / Frameworks	Deployment / Hosting	Security / Privacy	Team / Governance	Best Fit	Main Limits / Caveats
llama.cpp OSS No tagline	Local LLM Apps	Low-level local LLM engine	llama.cpp	MIT / open source	$0 software; hardware/model costs separate	Local inference engine, CLI and server	✓	Repository provides C/C++ LLM inference with GGUF support and local server tooling	GGUF models, quantized open-weight models, CPU/GPU backends and many architectures supported by project	CLI tools, simple server/web UI, benchmarks, quantization and low-level runtime controls	C/C++ library, local HTTP server, OpenAI-compatible endpoints, bindings and downstream apps	Ollama, llamafile, KoboldCpp, LM Studio-style stacks, Python bindings and embedded apps	Local binary/library, embedded runtime or self-hosted local server	No cloud by default; security depends how the server is bound and exposed	Library/engine-level governance only; no app team layer	Developers needing maximum control over local inference internals	Not a polished end-user app; model setup and flags can be technical
LocalAI OSS No tagline	Local LLM Apps	Local OpenAI-compatible stack	LocalAI	MIT / open source	$0 software; hardware/model costs separate	Self-hosted OpenAI/Anthropic-compatible local AI stack	✓	Official site describes LocalAI as a free OpenAI and Anthropic alternative running locally on user hardware	LLMs, embeddings, image, audio, agents and document intelligence through modular local backends	API-first workflow rather than a consumer chat UI; companion LocalAGI/LocalRecall expand agent/search flows	OpenAI-compatible API, Anthropic-style compatibility, modular backends and Docker deployment	OpenAI-compatible clients, LangChain, LlamaIndex, local agents, RAG and multimodal backends	Docker, local server, workstation or on-prem host	No cloud required; endpoint exposure, model files and plugins need operator security review	No SaaS governance by default; suitable for self-managed internal endpoints	Developers replacing cloud API endpoints with a local/private stack	More backend-oriented than desktop-app UX; setup can be more technical
LM Studio Desktop No tagline	Local LLM Apps	Desktop local LLM workstation	LM Studio	Proprietary desktop app	$0 for personal/approved use; business terms should be checked	Free desktop app with separate terms for work/business use	Yes, free desktop entry	Docs describe local/offline use and a localhost OpenAI-style server; model downloads and hardware are separate	Hugging Face/open-weight models, GGUF models, MLX on Apple Silicon and Ollama/local endpoints	Model discovery, download, chat, local server, playground and parameter controls	OpenAI-like local server, structured output support, tool/MCP features and local network mode	OpenAI-compatible clients, MCP servers, local apps, coding tools and custom endpoints	Windows, macOS and Linux desktop; localhost or local-network serving	Requests to local server stay local; cloud endpoints or remote providers follow their own terms	Desktop app is individual-first; business compliance depends current LM Studio terms	Users who want a polished GUI for downloading and testing local models	Not open source; enterprise/commercial usage terms must be verified before rollout
AnythingLLM Desktop / Self-hosted No tagline	Local LLM Apps	All-in-one local AI workspace	AnythingLLM	MIT core / self-hosted terms	$0 Desktop/self-hosted entry; hosted plans separate	Free local desktop and self-hosted app; hosted/business packaging separate	✓	Official site promotes a free Desktop app and local/enterprise model provider support	Ollama, LM Studio, LocalAI, OpenAI, Azure, AWS, Anthropic and other local or enterprise LLM providers	Workspaces, document chat, RAG, agents, MCP, plugins, desktop OS workflow and multimodal chat	API, MCP compatibility, workspace agents, embeddings/vector stores and Docker/self-hosted configuration	Local LLM runtimes, cloud providers, vector DBs, document sources, MCP and enterprise connectors	Desktop app, Docker/self-hosted server or hosted AnythingLLM	Desktop/self-hosted data can stay local; provider calls follow configured backend privacy	Single-user desktop and multi-user self-hosted/hosted options; licensing terms should be reviewed	Private document chat and agent workflows without building a full RAG stack	Hosted pricing and enterprise controls differ from free desktop; RAG quality depends ingestion and embeddings
Chatbox Community Edition No tagline	Local LLM Apps	Desktop AI client	Chatbox	GPL-3.0 community edition	$0 community software; Pro/cloud/provider costs separate	Desktop client with community and commercial packaging	Yes, community edition	Official repo describes a desktop client for ChatGPT, Claude and other LLMs with Ollama local model support	Ollama local models, OpenAI, Azure OpenAI, Claude, Gemini and custom providers	Desktop chat, local data storage, prompts, multi-provider switching and cross-platform app UX	Custom providers, local Ollama connection, image generation provider support and desktop settings	Ollama, OpenAI, Azure, Claude, Gemini, ChatGLM and custom API endpoints	Windows, macOS, Linux desktop	Local data storage; provider calls leave device according to selected model backend	Community app is individual-first; commercial/team controls depend paid product	Users wanting a simple desktop AI client that can point at local and cloud models	GPL obligations for community edition; exact paid feature split can change
KoboldCpp OSS No tagline	Local LLM Apps	Portable local LLM runner	KoboldCpp	AGPL-3.0 / open source	$0 software; hardware/model costs separate	One-file local LLM runner and UI	✓	Repository describes easy GGML/GGUF local text generation with a KoboldAI-style UI and no complex install	GGUF/GGML models, llama.cpp-derived backends, selected image/audio/TTS capabilities depending build	Built-in web UI, story/chat modes, sampler controls, context controls and local model loading	Kobold API compatibility, OpenAI-style endpoints in recent builds, CLI flags and frontend integrations	SillyTavern, KoboldAI ecosystem, local GGUF models and llama.cpp tooling	Single local executable on desktop/server; optional LAN host	Fully local when used with local models; LAN exposure must be controlled	No team governance; single-user/hobbyist workflow	Creative writing and roleplay users who want a self-contained local executable	Advanced configuration can be dense; large models still require strong RAM/VRAM
Msty Free No tagline	Local LLM Apps	Desktop local AI studio	Msty	Proprietary desktop app	$0 Free; Aurum Annual $129/user/year	Free personal plan plus paid commercial/support license	Yes, free plan	Pricing page lists Free forever with local and online models; professional/business use requires paid license	Ollama-backed local models, MLX models on Apple Silicon, llama.cpp models and online providers	Split chats, knowledge stacks/RAG, web search, attachments, prompts library, branching and flowchat	Local model management, custom endpoints, advanced export/search in paid tier and desktop/web studio access	Ollama, MLX, llama.cpp, Azure OpenAI and online/local model providers	Desktop app and Msty Studio web/desktop access depending plan	Local models run on the user's machine; online providers and web search may transmit data	Free individual plan; paid license covers commercial use and more advanced features	Power users who want a polished desktop workspace with local models and RAG	Proprietary; commercial usage requires paid license; pricing may change
llamafile OSS No tagline	Local LLM Apps	Single-file local LLM package	llamafile	Apache-2.0 / MIT components	$0 software; hardware/model costs separate	Single executable model runtime and server	✓	Mozilla page describes bundling model weights, inference engine and runtime into one executable file	GGUF/open-weight models packaged with llama.cpp/Cosmopolitan runtime	Download-run local chat/server workflow with no separate install in supported builds	OpenAI-compatible local server mode, CLI, packaged model distribution and embedded runtime	Mozilla AI tooling, llama.cpp, local apps and single-file distribution workflows	Single local executable on desktop/server across supported OS targets	Runs locally and offline after download; file provenance and model license should be verified	No team governance; distribution and update governance are external	Portable demos, classrooms, offline field work and low-friction local AI experiments	Model files can be large; packaging reduces setup but not hardware requirements
text-generation-webui OSS No tagline	Local LLM Apps	Advanced local model web UI	text-generation-webui	AGPL-3.0 / open source	$0 software; hardware/model costs separate	Open-source local web UI for many model loaders	✓	Repository describes a Gradio web UI for LLMs with transformers, GPTQ, AWQ, EXL2, llama.cpp and GGUF support	Transformers, llama.cpp/GGUF, ExLlama, GPTQ, AWQ, EXL2 and other local model formats/loaders	Chat/notebook modes, model loader controls, extensions, character prompts and generation parameters	OpenAI-compatible API extension, Gradio extensions, model loaders and community plugins	Hugging Face models, llama.cpp, ExLlama, Transformers, SillyTavern and local tooling	Local Python environment, one-click installers or server-style deployment	Runs locally; extensions and remote model downloads need trust review	No native enterprise governance; operator controls users/networking	Experimenters who need broad loader support and fine-grained generation controls	More setup and dependency complexity than Ollama or LM Studio
LobeChat OSS No tagline	Local LLM Apps	Self-hosted chat framework	LobeChat	MIT / open source	$0 self-hosted software; hosted/provider costs separate	Open-source self-hosted UI with optional cloud/provider costs	✓	Repository describes an open-source modern AI chat framework supporting Ollama, Qwen, DeepSeek and major providers	Ollama, OpenAI, Claude, Gemini, Qwen, DeepSeek, OpenAI-compatible endpoints and multimodal models	Modern chat UI, knowledge base, plugins, assistants, artifacts, TTS/vision and one-click deployments	Plugin system, server database mode, auth integrations, OpenAI-compatible providers and deployment templates	Vercel, Docker, serverless, local models, cloud providers and knowledge base backends	Self-hosted web app, Vercel/serverless or private server	Self-hosted control; provider and storage choices determine data flow	Individual and team deployment possible; full governance depends deployment/auth setup	Builders who want a polished self-hostable AI chat product with plugin UX	More app-framework complexity than a simple desktop local runner
GPT4All Desktop No tagline	Local LLM Apps	Private desktop chatbot	GPT4All	Open source / commercial use allowed per repo	$0 software; hardware and model licenses separate	Free desktop app and local inference ecosystem	✓	Official docs describe private local desktop chat, LocalDocs and no required API calls or GPU for basic use	GPT4All model ecosystem, local GGUF-style models and API/provider connections	Desktop chat, local model browser, LocalDocs private document Q&A and settings for local inference	Python SDK, local API server, bindings and desktop integrations	Nomic ecosystem, local documents, Python apps and OpenAI-compatible workflows	Windows, macOS, Linux desktop and local API server	Private local operation when using local models; document data remains on device unless external providers are configured	No built-in enterprise team layer in desktop app	Non-technical users who want private local document chat on everyday computers	Model quality is constrained by local hardware; larger models still need significant RAM/VRAM
Jan Local No tagline	Local LLM Apps	Offline desktop assistant	Jan	Open source desktop app	$0 local use; provider/model costs separate	Free local desktop app plus optional external provider usage	✓	Docs state local use is always free and Jan can work offline after models are downloaded	Local models via built-in runtimes/Hugging Face plus remote providers such as OpenAI-compatible APIs	Desktop ChatGPT-style interface, model hub, assistants, local API server and file/chat workflows	OpenAI-compatible local server, extensions, provider routing and model management	Hugging Face models, local engines, OpenAI-compatible apps and desktop OS integrations	Windows, macOS and Linux desktop	Offline operation possible; cloud model calls send data to selected provider	Mostly individual/local workflow; team governance depends external deployment choices	Users who want an open-source offline ChatGPT replacement with a desktop feel	Local model performance depends hardware; optional cloud services and roadmap should be checked
Ollama OSS No tagline	Local LLM Apps	Local model runtime and app	Ollama	MIT / open source	$0 software; hardware and model licenses separate	Local runtime, model manager and desktop/CLI workflow	✓	Official project runs models locally and exposes a local API; model downloads and compute are user-provided	Ollama model library, GGUF-derived quantized models, Llama, Qwen, Gemma, Mistral, DeepSeek and custom Modelfiles	CLI, desktop installer, model pull/run workflow and simple chat loop	Local REST API, OpenAI-compatible endpoints, Docker image, Python/JS clients and Modelfile customization	Open WebUI, AnythingLLM, Page Assist, LangChain, LlamaIndex, Continue, Cline and many OpenAI-compatible tools	Local desktop, local server, Docker or LAN-accessible self-hosted runtime	Runs locally by default; exposed LAN/public servers need explicit network hardening	No SaaS governance in OSS runtime; model and endpoint access are operator-managed	Fastest path to local model execution for developers and hobbyists	Quality and speed depend on local hardware; model licenses vary; misconfigured servers can expose local endpoints
Open WebUI Self-hosted No tagline	Local LLM Apps	Self-hosted AI chat platform	Open WebUI	Open WebUI License / source available	$0 self-hosted software; hosting and model costs separate	Self-hosted web UI and platform with optional external model costs	Yes, source-available self-hosting	Official repo positions Open WebUI as a user-friendly AI interface supporting Ollama and OpenAI-compatible APIs	Ollama, OpenAI-compatible APIs, external providers, RAG stores and tool servers	ChatGPT-like UI, users, workspaces, files/RAG, tools, functions, admin controls and model switching	Pipelines, functions, OpenAPI tools, REST/WebSocket APIs, OAuth/LDAP/SCIM options and Docker/Helm deployment	Ollama, OpenAI APIs, Kubernetes, Docker, Helm, LDAP/OAuth/OIDC and vector/RAG backends	Docker, pip, Kubernetes/Helm, local machine or private server	Self-hosted data control; auth, network exposure and provider routing must be configured carefully	Admin panel, users, groups and enterprise-style auth integrations	Teams that want a self-hosted internal AI portal over local and hosted models	License has branding/trademark requirements; operating it securely requires admin work
LibreChat OSS No tagline	Local LLM Apps	Self-hosted multi-provider chat	LibreChat	MIT / open source	$0 software; model/API/hosting costs separate	Free self-hosted web application	✓	Official site describes LibreChat as free, open source, self-hosted and no subscription	Ollama/local endpoints plus OpenAI, Anthropic, Google, Azure, AWS Bedrock and other providers	Unified chat UI, agents, files, code interpreter-style workflows, plugins and multi-model switching	MCP support, custom endpoints, plugins, OAuth/SAML/LDAP, moderation and rate limiting	Docker, MongoDB, Redis, OAuth providers, local LLM endpoints and major cloud LLM APIs	Self-hosted Docker/server deployment	Self-hosted control; conversations still go to configured model providers unless local endpoints are used	Multi-user auth, SSO options, rate limiting and admin-oriented controls	Organizations wanting a self-hosted ChatGPT-style portal with provider flexibility	Requires operating databases and auth; MongoDB licensing/compliance should be reviewed for commercial deployments
Pinokio Desktop No tagline	Local LLM Apps	Local AI app launcher	Pinokio	Free desktop app / source available	$0 software; app/model/hardware costs separate	Local AI app launcher and localhost cloud	✓	Official docs describe Pinokio as a local platform to install, run and automate AI apps on a user's own machine	Runs local AI apps and servers rather than one model format; apps may include Ollama, ComfyUI, Whisper and web UIs	One-click install/run, built-in browser, dependency management, app discovery and local automation scripts	JSON app scripts, local runtimes, app recipes, local web servers and agent/app control features	Python, Node.js, Bun, Git, Conda, local AI apps and web UI stacks	Desktop app hosting local web apps on the user's machine	Local-first by design; scripts can execute code and must be trusted before install	No enterprise governance by default; app/script trust is user-managed	Non-specialists who want one-click local AI app installation	Security depends heavily on script provenance; not just an LLM chat application
Khoj Self-hosted No tagline	Local LLM Apps	Personal AI second brain	Khoj	AGPL-3.0 / open source	$0 self-hosted software; cloud/provider costs separate	Self-hostable app plus optional cloud service	✓	Official repo describes Khoj as open-source and self-hostable, scaling from on-device personal AI to cloud-scale enterprise AI	Local LLMs such as llama/qwen/mistral via local providers plus online models and Khoj cloud	Chat with docs/web, semantic search, custom agents, automations, newsletters and apps/plugins	APIs, agents, browser/desktop/mobile, Obsidian/Emacs integrations and local/online model routing	Obsidian, Emacs, browser, desktop, WhatsApp, local LLMs, web search and document stores	Self-hosted on local machine/server or hosted Khoj cloud	Self-hosting can keep docs local; cloud app and online models change data flow	Self-hosted governance is operator-managed; enterprise/cloud governance depends plan	Personal knowledge bases that need local/private model options plus search	Broader second-brain/RAG app, not a minimal local model runner
Page Assist OSS No tagline	Local LLM Apps	Browser sidebar for local models	Page Assist	Open source browser extension	$0 software; backend/provider costs separate	Browser extension and web UI for local AI models	✓	Official site describes Page Assist as an open-source browser extension with sidebar and web UI for local AI models	Ollama, Chrome AI/Gemini Nano beta, OpenAI-compatible providers such as llama.cpp, LM Studio, Llamafile and vLLM	Browser sidebar, page-aware chat, web UI, PDF/document chat and search while browsing	Browser extension APIs, local provider URLs, OpenAI-compatible endpoints and page context tools	Chrome/Edge-style browsers, Ollama, LM Studio, llama.cpp, Llamafile, vLLM and local documents	Browser extension plus local model provider on desktop	Page context can be sent to configured local or remote provider; browser permissions need review	Individual browser workflow; no enterprise governance unless managed by browser policy	Users who want local AI assistance inside the browser	Depends on a separate local model backend; browser extension permissions must be understood
Open Interpreter CLI No tagline	Local LLM Apps	Local computer-use assistant	Open Interpreter	AGPL-3.0 / open source	$0 software; provider/model/runtime costs separate	Local CLI/agent that can use local or hosted models	✓	Official repo says Open Interpreter lets LLMs run code locally and provides a ChatGPT-like terminal interface	Local models through Ollama, LM Studio, Jan, Llamafile and provider-agnostic model configuration	Terminal chat, code execution, local file/media/data workflows and approval before running code	Provider-agnostic model picker, local model guides, shell/Python/JS execution and MCP-like local workflow	Ollama, LM Studio, Jan, Llamafile, OpenAI-compatible APIs, local shell and browser tools	Local CLI and desktop/agent workflows	Powerful local code execution; user approval and sandboxing discipline are critical	No team governance by default; local permissions and review process are user-managed	Technical users who want a local AI agent operating on their computer	High risk if users approve unsafe code; local model quality may be insufficient for complex tasks
Serge OSS No tagline	Local LLM Apps	Local chat interface	Serge	Apache-2.0 / open source	$0 software; hardware/model costs separate	Dockerized local chat app over llama.cpp	✓	Repository describes Serge as a web interface for chatting with Alpaca through llama.cpp, fully dockerized with an API	llama.cpp-compatible local models, originally Alpaca/LLaMA-style local models	Simple web chat UI, local conversations and dockerized app/API stack	Docker API, llama.cpp backend and local web interface	Docker, llama.cpp and local model files	Self-hosted Docker on local machine/server	No API keys or cloud required when using local models; Docker/network exposure must be controlled	No team governance; hobbyist/self-hosted app	Users wanting a simple historical self-hosted llama.cpp chat UI	Older project and model assumptions; verify maintenance before relying on it
Atomic Chat OSS No tagline	Local LLM Apps	Offline desktop/mobile chat	Atomic Chat	Open source desktop/mobile app	$0 software	Free local AI chat app	✓	Official terms say Atomic Chat is free, open source, local, requires no account/subscription and runs models on device	Bundled/local models and Atomic local inference stack; OpenAI-compatible local server on desktop builds	Private offline chat, desktop/mobile app UX and simple local model operation	OpenAI-compatible localhost server, GitHub source and local inference components	Desktop apps, mobile apps, local tools and OpenAI-compatible clients	macOS, Windows, iPhone and other app targets depending release availability	No account/server data for local use; uninstall removes local app data per terms	Individual-first app; no team governance captured	Users who want a zero-account local chat client with mobile/desktop reach	Newer project; model catalog, platform parity and maturity should be validated
PrivateGPT OSS No tagline	Local LLM Apps	Private document assistant	PrivateGPT	Apache-2.0 / open source	$0 software; hardware/model costs separate	Self-hosted private document Q&A app/framework	✓	Project describes private document interaction where data does not leave the execution environment	Local LLMs, local embeddings, document ingestion and configurable model backends	Document ingest, private Q&A, local RAG pipeline and API/server modes	Python APIs, local vector stores, embeddings, LLM backends and Docker/deployment templates	llama.cpp, Ollama-compatible/local models, embeddings models and document pipelines	Local Python app, Docker or private server	Designed for private local documents; first-time model downloads and configured providers need review	No broad team governance unless deployed behind internal auth controls	Users who want local document Q&A without sending files to SaaS	More RAG-focused than general chat; maintaining model/storage dependencies takes work
SillyTavern OSS No tagline	Local LLM Apps	Power-user LLM frontend	SillyTavern	AGPL-3.0 / open source	$0 software; backend/model/provider costs separate	Locally installed LLM frontend for many backends	✓	Official repo says SillyTavern is locally installed and provides no hosted service or user tracking	KoboldAI/CPP, Ooba, Tabby, Ollama/OpenAI-compatible APIs, Claude, OpenRouter, NovelAI and image/TTS backends	Roleplay/story UI, character cards, lorebooks, prompt controls, extensions, Visual Novel mode and mobile-friendly layout	Extension system, backend adapters, image/TTS APIs, prompt macros and community content	KoboldCpp, text-generation-webui, Tabby, OpenAI-compatible APIs, ComfyUI, Automatic1111 and TTS tools	Local Node.js app, Docker or local network front end	No hosted service; privacy depends selected backend and whether endpoints are remote	Community/hobbyist governance; no enterprise admin layer	Power users building local creative writing and character chat workflows	Steep learning curve; some use cases require careful content and safety policies
LlamaGPT OSS No tagline	Local LLM Apps	Offline ChatGPT-like app	LlamaGPT	MIT / open source	$0 software; hardware/model costs separate	Self-hosted offline chatbot	✓	Repository describes a self-hosted offline ChatGPT-like chatbot with no data leaving the device	Llama 2 era local models and llama.cpp-python style local inference stack	Simple ChatGPT-like UI, Docker compose deployment and private local chat	Docker Compose, local model files and web app/API stack	Umbrel, Docker, local server and llama.cpp-style local inference	Self-hosted Docker or umbrelOS home server	Offline/private by design once installed; model/source downloads require trust review	No team governance; personal/home-server orientation	Home lab users wanting a simple private chatbot appliance	Model stack is older; verify maintenance and model support before new production use
LlamaChat App No tagline	Local LLM Apps	Native macOS local chat	LlamaChat	Open source macOS app	$0 software; hardware/model costs separate	Free local macOS chat app	✓	Official site says LlamaChat lets users chat with LLaMA, Alpaca and GPT4All models running locally on Mac	Local LLaMA, Alpaca and GPT4All-style models on macOS	Native macOS chat UI focused on local model interaction	GitHub source, local model loading and app-level integration with macOS	macOS, GPT4All-era local models and local model files	Native macOS desktop app	Local-only model operation when using local files; model provenance and app maintenance should be checked	No team governance; individual desktop workflow	Mac users wanting a lightweight native local LLM chat client	Narrow platform scope and older model assumptions compared with Jan or LM Studio