Run LLMs entirely in the browser with a simple headless React hook, useLLM().

Live demo: http://chat.matt-rickard.com
GitHub: https://github.com/r2d4/react-llm

react-llm/headless lets you customize everything from the system prompt to the user/assistant role names. It manages a WebGPU-powered background worker.
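To make the customization concrete, here is a minimal sketch of how a system prompt and custom role names could be assembled into the text a model sees. The names (`buildPrompt`, `ChatConfig`, `Message`) are illustrative assumptions, not react-llm's actual API.

```typescript
// Hypothetical prompt-assembly helper: combines a system prompt with
// role-tagged conversation turns. Names are illustrative only.
interface Message {
  role: "user" | "assistant";
  content: string;
}

interface ChatConfig {
  systemPrompt: string;
  userRoleName: string;      // e.g. "USER"
  assistantRoleName: string; // e.g. "ASSISTANT"
}

function buildPrompt(config: ChatConfig, messages: Message[]): string {
  const lines = [config.systemPrompt];
  for (const m of messages) {
    const name =
      m.role === "user" ? config.userRoleName : config.assistantRoleName;
    lines.push(`${name}: ${m.content}`);
  }
  // Leave an open assistant turn for the model to complete.
  lines.push(`${config.assistantRoleName}:`);
  return lines.join("\n");
}
```

Because role names are plain strings in this sketch, swapping "USER"/"ASSISTANT" for any other labels changes the prompt format without touching inference code.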

react-llm sets everything up for you — an off-the-main-thread worker that fetches the model from a CDN (Hugging Face), loads the cross-compiled WebAssembly components (such as the tokenizer and model bindings), and manages model state (the attention KV cache, and more).
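The main thread and the inference worker need a small message protocol between them. The shapes below are an assumed sketch of what such a protocol might look like, not react-llm's actual wire format; `describeRequest` is a hypothetical helper showing how a discriminated union keeps dispatch exhaustive.

```typescript
// Hypothetical main-thread -> worker message protocol for an
// in-browser inference worker. Shapes are assumptions for illustration.
type WorkerRequest =
  | { type: "init"; modelUrl: string }   // fetch and instantiate the model
  | { type: "generate"; prompt: string } // run inference on a prompt
  | { type: "reset" };                   // clear model state (e.g. KV cache)

function describeRequest(req: WorkerRequest): string {
  switch (req.type) {
    case "init":
      return `initialize model from ${req.modelUrl}`;
    case "generate":
      return `generate completion for ${req.prompt.length}-char prompt`;
    case "reset":
      return "reset model state";
  }
}
```

In a browser you would `postMessage` these objects to a `Worker`; TypeScript's narrowing on the `type` tag ensures every request kind is handled.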

Everything runs client-side — the model is cached and inference runs in the browser. Conversations are stored in session storage.
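A sketch of what persisting a conversation to session storage might look like. The `StringStore` interface lets the same code run against `window.sessionStorage` in the browser or an in-memory map in tests; the storage key and turn shape are assumptions, not react-llm's internals.

```typescript
// Hypothetical conversation persistence over a Storage-like interface.
interface ConversationTurn {
  role: string;
  content: string;
}

interface StringStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const KEY = "react-llm:conversation"; // hypothetical storage key

function saveConversation(store: StringStore, turns: ConversationTurn[]): void {
  store.setItem(KEY, JSON.stringify(turns));
}

function loadConversation(store: StringStore): ConversationTurn[] {
  const raw = store.getItem(KEY);
  return raw ? (JSON.parse(raw) as ConversationTurn[]) : [];
}
```

In the browser you would pass `window.sessionStorage` directly, since it satisfies the `StringStore` shape.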

Under the hood, it’s powered by Apache TVM Unity and MLC.
