#python #deep_learning #inference #llm #nlp #pytorch #transformer
Nano-vLLM is a small, readable engine for offline large language model inference. The project reports inference speeds comparable to vLLM while keeping the codebase to about 1,200 lines of Python, which makes it easy to read and modify. Its optimizations include prefix caching and tensor parallelism. It is easy to install and can run models such as Qwen3-0.6B on your own GPU. It suits learning, research, and small deployments on limited hardware where you want fast inference without a complex setup.
https://github.com/GeeeekExplorer/nano-vllm
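For readers new to prefix caching, here is a toy sketch of the general idea: prompts are split into fixed-size token blocks, each block gets a hash chained to the previous block's hash, and a new prompt can reuse cached blocks for as long as its leading blocks match. This is an illustration only, not Nano-vLLM's actual code; `BLOCK_SIZE`, `block_hashes`, and `PrefixCache` are hypothetical names, and a real engine would store KV-cache tensors rather than integer block ids.

```python
import hashlib

BLOCK_SIZE = 4  # tokens per cache block (illustrative value, an assumption)


def block_hashes(token_ids, block_size=BLOCK_SIZE):
    """Return one chained hash per full block of tokens.

    Each hash covers the block's tokens plus the previous block's hash,
    so equal hashes imply an equal prefix up to that block.
    """
    hashes = []
    prev = b""
    for start in range(0, len(token_ids) - block_size + 1, block_size):
        block = token_ids[start:start + block_size]
        digest = hashlib.sha256(prev + repr(block).encode()).digest()
        hashes.append(digest)
        prev = digest
    return hashes


class PrefixCache:
    """Toy cache mapping chained block hashes to block ids."""

    def __init__(self):
        self.blocks = {}  # chained hash -> block id

    def match_and_insert(self, token_ids):
        """Return how many leading tokens hit the cache, then cache the rest."""
        cached_tokens = 0
        still_matching = True
        for digest in block_hashes(token_ids):
            if still_matching and digest in self.blocks:
                cached_tokens += BLOCK_SIZE
            else:
                still_matching = False
                self.blocks.setdefault(digest, len(self.blocks))
        return cached_tokens


if __name__ == "__main__":
    cache = PrefixCache()
    system_prompt = [1, 2, 3, 4, 5, 6, 7, 8]
    print(cache.match_and_insert(system_prompt + [9]))           # first request
    print(cache.match_and_insert(system_prompt + [10, 11, 12]))  # shares the prefix
```

The chained hash is what makes this a prefix cache rather than a plain block cache: a block with identical tokens but a different preceding context gets a different hash, so it is never reused by mistake.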
RsS iS dEaD LOL: discover RSS Feeds of your follows on Mastodon
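This channel previously mentioned a service called FeedsMage, which looks for links in the bios of the Twitter users you follow, looks for feeds behind those links, and can then generate an #OPML file. RsS iS dEaD LOL is the Mastodon counterpart of FeedsMage: it looks for links in the bios of the Fediverse users you follow, discovers their RSS feeds, and can then generate an #OPML file: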
https://rss-is-dead.lol/
For example, mine:
https://rss-is-dead.lol/user?profileUrl=https%3A%2F%2Fmastodon.social%2Fusers%2FAboutRSS
Found via the author's toot:
https://mastodon.social/@paulcuth/112178886374464145
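The bio link → feed → OPML pipeline described above can be sketched roughly as follows. How RsS iS dEaD LOL actually works internally is not known here; this sketch assumes the common feed autodiscovery convention (`<link rel="alternate">` tags with an RSS or Atom type in the page's HTML), and `discover_feeds` and `to_opml` are hypothetical helper names. Fetching the pages and reading the follow list are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
from xml.sax.saxutils import escape, quoteattr

# MIME types commonly used for feed autodiscovery.
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collect feed URLs advertised via <link rel="alternate"> tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel = (attr.get("rel") or "").lower().split()
        mime = (attr.get("type") or "").lower()
        href = attr.get("href")
        if "alternate" in rel and mime in FEED_TYPES and href:
            # Resolve relative hrefs against the page URL.
            self.feeds.append(urljoin(self.base_url, href))


def discover_feeds(html, base_url):
    """Return the feed URLs advertised in one page's HTML."""
    parser = FeedLinkParser(base_url)
    parser.feed(html)
    return parser.feeds


def to_opml(feed_urls, title="Discovered feeds"):
    """Render feed URLs as a minimal OPML 2.0 document."""
    outlines = "\n".join(
        f"    <outline type=\"rss\" text={quoteattr(url)} xmlUrl={quoteattr(url)}/>"
        for url in feed_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<opml version="2.0">\n'
        f"  <head><title>{escape(title)}</title></head>\n"
        "  <body>\n"
        f"{outlines}\n"
        "  </body>\n"
        "</opml>\n"
    )


if __name__ == "__main__":
    page = (
        '<html><head>'
        '<link rel="alternate" type="application/rss+xml" href="/feed.xml">'
        '</head></html>'
    )
    print(to_opml(discover_feeds(page, "https://example.com/blog/")))
```

The resulting OPML text can be saved to a file and imported into most feed readers.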