High-throughput local LLM inference and serving engine.
Adapt an open-weight model to your domain with a small dataset.