minipasila

joined 2 years ago

[email protected] 1 points 2 years ago

I don't know about that, but you could try GGML (llama.cpp). It supports quantization down to 2 bits per weight, so the model might end up small enough.
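To get a rough feel for what "small enough" could mean, here's a back-of-the-envelope sketch that just multiplies parameter count by bits per weight. The 7-billion-parameter figure and the function name are made up for illustration, not taken from any particular model, and real quantized files come out somewhat larger because the formats also store per-block scaling data.

```python
def approx_model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Rough lower bound on weight storage: params * bits / 8 bytes, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Hypothetical 7-billion-parameter model (assumption, for illustration only).
N_PARAMS = 7e9

for bits in (16, 8, 4, 2):
    print(f"{bits:>2}-bit: ~{approx_model_size_gb(N_PARAMS, bits):.2f} GB")
```

So under these assumptions the weights go from roughly 14 GB at 16 bits to under 2 GB at 2 bits, though quality usually drops noticeably at the lowest bit widths.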