FreeLLMAPI
Stacks the free tiers of 16 LLM providers (Google Gemini, Groq, Cerebras, Mistral, Cohere, NVIDIA, OpenRouter, GitHub Models, Cloudflare, Z.ai, HuggingFace, and more) behind one OpenAI-compatible endpoint with a prioritized fallback chain that auto-switches when a provider hits its daily cap. Self-hosted via Docker (port 3001); a single unified key fronts your encrypted provider keys. Combined free allowances reach a theoretical ~1.7B tokens/month. Self-labeled personal experimentation only: top models have the smallest daily caps, so effective quality drops late in the day and resets at midnight UTC.
Alternatives
2 workflows use FreeLLMAPI
FreeLLMAPI: one socket, sixteen free model tiers with auto-fallback
Front the free tiers of many providers with a single OpenAI-compatible endpoint and a prioritized fallback chain, so your apps point at one key and the router switches providers automatically when one runs out for the day.
Text your own AI assistant on WhatsApp: Hermes wired to FreeLLMAPI
Point Hermes Agent at a FreeLLMAPI backend and connect it to WhatsApp, so a memory-keeping assistant runs 24/7 on a free always-on server and costs nothing per message, with the wiring validated before you link a number.