Show HN: VimLM – A Local, Offline Coding Assistant for Vim
VimLM is a local, offline coding assistant for Vim. It’s like Copilot but runs entirely on your machine—no APIs, no tracking, no cloud.
- Deep Context: Understands your codebase (current file, selections, references).
- Conversational: Iterate with follow-ups like "Add error handling".
- Vim-Native: Keybindings like `Ctrl-l` for prompts, `Ctrl-p` to replace code.
- Inline Commands: `!include` files, `!deploy` code, `!continue` long responses.
Perfect for privacy-conscious devs or air-gapped environments.
Try it:
```
pip install vimlm
vimlm
```
[GitHub](https://github.com/JosefAlbers/VimLM)
Awesome. AI isn't making Vim less relevant; it's more relevant now than ever. When every editor can have maximum magic with the same model and LSP, why not use the tool that also lets you review AI-generated diffs and navigate at lightning speed? Vim is a tool that can actually keep up with how fast AI accelerates the dev cycle.
Also love to see these local solutions. Coding shouldn't just be for the rich who can afford to pay for cloud solutions. We need open, local models and plugins.
Thanks! I totally agree. I’m looking at ways to further tighten the pairing between Vim’s native tools and LLMs (like using :diff to review changes, and :make/:copen to run the code, feed errors back to the LLM, then apply the fixes, etc.). The catch is model variability—what works for Llama doesn’t always work with R1 because of formatting/behavior quirks, and vice versa. Finding a common ground for all models is proving tricky.
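(For the curious, the loop described above is roughly the following. This is only a sketch: `query_llm` and the build command are hypothetical stand-ins, not VimLM's actual API.)

```python
# Sketch of a run -> collect errors -> ask the model -> apply -> re-run loop.
# query_llm is a hypothetical callable standing in for the completion call.
import subprocess

def build_errors(cmd):
    """Run the build/test command; return its stderr if it failed, else ''."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    return "" if proc.returncode == 0 else proc.stderr

def repair_loop(cmd, source_path, query_llm, max_rounds=3):
    for _ in range(max_rounds):
        errors = build_errors(cmd)
        if not errors:
            return True                       # clean run, nothing to fix
        code = open(source_path).read()
        prompt = f"Fix these errors:\n{errors}\n\nCode:\n{code}"
        open(source_path, "w").write(query_llm(prompt))  # user reviews via :diff
    return False
```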
Why does it need an Apple M-series chip? Any hope of it running on an Intel chip and on Linux?
It uses MLX (https://github.com/ml-explore/mlx), Apple’s ML framework, for running LLMs.
Why do people tend to nail stuff like this into their products?
We have been talking about the AI revolution for several years already, and yet there is no IDE or plugin for VS Code that supports multiple OpenAI-compatible endpoints. Some, like Cody, don't even support "private" LLMs other than an Ollama endpoint on localhost. Cursor supports only one endpoint for OpenAI-API-compatible models.
I made a custom version of ChatGPT.nvim for myself so I could use the models I like (mostly by removing the hardcoded gpt-3), but I dropped it because I then had to invest time into maintaining and improving it instead of doing my job.
I'd like to run several specialized models with a vLLM engine and serve them at different endpoints, and then I'd like an IDE to be able to use these specialized LLMs for different purposes. Does anyone know a vim/neovim/vscode plugin that supports several OPENAI_API_HOST endpoints?
For now, this is only possible with agent frameworks, but that's not really what I need.
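(For reference, the client side of this is straightforward to sketch with the openai Python client; per-task plugin support is the missing piece. The vLLM base URLs and model names below are hypothetical.)

```python
# Route requests to different OpenAI-compatible servers (e.g. vLLM instances)
# depending on the task. Base URLs and model names are hypothetical.
from openai import OpenAI

ENDPOINTS = {
    "code": {"base_url": "http://localhost:8001/v1", "model": "my-coder-model"},
    "chat": {"base_url": "http://localhost:8002/v1", "model": "my-chat-model"},
}

def complete(task, prompt):
    cfg = ENDPOINTS[task]
    client = OpenAI(base_url=cfg["base_url"], api_key="unused-for-local")
    resp = client.chat.completions.create(
        model=cfg["model"],
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

print(complete("code", "Add error handling to this function: ..."))
```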
Not OP but it presumably uses an open LLM that won't run in a timely manner without being on a faster computer.
What is a good method for sandboxing models? I would like to trust these projects, but downloading hard-to-analyze arbitrary code and running it seems problematic.
Probably nspawn[0]. Think of it like chroot on steroids, but not as heavy as Docker. You can run these containers in ephemeral mode, so modifications are not permanent. As is typical with systemd, you can also limit read/write access, networking, and anything else you want. This can even include limiting which commands are available. So you can make the program only able to run within its own scope, only read, and only use a very limited command set.
Not the most secure thing, but you can move up to a VM, and then you'd probably want an air-gapped second machine if you're seriously concerned but not enough to go offsite.
[0] https://wiki.archlinux.org/title/Systemd-nspawn
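(For example, a throwaway, network-isolated session could look roughly like this. The paths are hypothetical, and it assumes a minimal container rootfs already exists at the given directory.)

```
# Spawn a shell in a throwaway snapshot of a minimal container rootfs:
#   --ephemeral        changes are discarded on exit
#   --private-network  no network access from inside the container
#   --bind-ro          expose model weights read-only
sudo systemd-nspawn -D /var/lib/machines/llm-sandbox \
  --ephemeral --private-network --bind-ro=/home/user/models
# ...then install and run the tool from the shell that opens inside the container.
```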
The attack surface for a local LLM is much smaller than for almost any other program you would download. Make sure you trust whatever LLM execution stack is being used (apparently MLX here? I'm not familiar with that one specifically), and then the amount of additional code associated with a given LLM should be tiny; most of it is a weight blob that may be tough to understand but can't really do anything nefarious, since data just passes through it.
Again, not sure what MLX does, but cf. the files for DeepSeek-R1 on Hugging Face: https://huggingface.co/deepseek-ai/DeepSeek-R1/tree/main
Two files contain arbitrary executable code - one defines a simple config on top of a common config class, the other defines the model architecture. Even if you can't verify yourself that nothing sneaky is happening, it's easy for the community to do so, because the structure of valid config + model definition files is so tightly constrained - no network calls, no filesystem access, just definitions of (usually PyTorch) model layers that get assembled into a computation graph. Anything deviating from that form is going to stand out. It's quite easy to analyze.
> and then the amount of additional code associated with a given LLM should be tiny
What about this reporting (which is a deserialization issue, it seems like)?
- https://www.wiz.io/blog/wiz-and-hugging-face-address-risks-t...
- https://jfrog.com/blog/data-scientists-targeted-by-malicious...
This project apparently uses MLX, Apple’s ML framework, which doesn’t use Python’s pickle library that’s behind the safety issue. There are several options for storing models/tensors in MLX, none of which I think have such (de-)serialization issues: https://ml-explore.github.io/mlx/build/html/usage/saving_and...
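(To illustrate, a minimal sketch of what loading looks like with MLX; the file name is hypothetical. Safetensors/npz files are just named arrays plus a small header, so there is no pickle-style hook for running arbitrary code.)

```python
# Weights in safetensors/npz form are named arrays plus a small header
# describing shapes and dtypes; nothing is unpickled, so nothing can execute.
# The file name is hypothetical.
import mlx.core as mx

weights = mx.load("model.safetensors")      # dict of parameter name -> mx.array
for name, tensor in weights.items():
    print(name, tensor.shape, tensor.dtype)

# Pickle-based checkpoints, by contrast, can run attacker-controlled code at
# load time, which is the class of issue described in the linked reports.
```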
Running it in a podman/docker container would be more than sufficient and is probably the easiest approach.
Consider exposing commands that the user can then assign to their own preferred keybindings instead of choosing for them
Thanks for the suggestion! The plugin currently supports toggling between <Leader>/<C-*> via the USE_LEADER config flag. I will add a field to the config file for more customizability (e.g., "KEYBINDINGS": {"mapl":"<C-a>", "mapj":"<Leader>o", ...} in cfg.json).
https://github.com/tpope/vim-fugitive/blob/b068eaf1e6cbe35d1... for reference, an example from a tpope plugin
Whoa, thanks! Will definitely look into that
A good update for an editor that can't handle indenting out of the box!
Doesn’t `=` handle indenting?