#cplusplus
ik_llama.cpp is a fork of llama.cpp that aims for faster inference on CPUs and hybrid GPU/CPU setups. It adds many new quantization methods, which reduce the memory a model needs and help it run more efficiently. It also improves prompt processing and token generation speed for DeepSeek and other mixture-of-experts (MoE) models. It runs on a range of hardware, including Android, and has options to control whether model data is stored on the CPU or the GPU. In practice this means quicker responses and the ability to run larger or more complex models on your own computer or device.
https://github.com/ikawrakow/ik_llama.cpp
https://github.com/aio-libs/aiohttp-mako
#mako template renderer for #aiohttp.web, based on aiohttp_jinja2. The library has almost the same API and supports Python 3.5 async/await (PEP 492) syntax. It is used in aiohttp_debugtoolbar.
#Mako is a #template library written in Python. It provides a familiar, non-XML syntax which compiles into Python modules for maximum performance. Mako's syntax and #API borrow from the best ideas of many others, including #Django and #Jinja2 templates, #Cheetah, #Myghty, and #Genshi. Conceptually, Mako is an embedded Python (i.e. Python Server Page) language, which refines the familiar ideas of componentized layout and inheritance to produce one of the most straightforward and flexible models available, while also maintaining close ties to Python calling and scoping semantics.
http://www.makotemplates.org/