Maybe-Ray

This model is currently experiencing high demand

Recently, I have been running into the "This model is currently experiencing high demand" message when using the Gemini API. This is frustrating because my side project, ikka, relies on Gemini to summarise and rank the day's news articles. On some days I end up waiting for an opportune moment to run my scripts just so that I can give my users some articles to read.

The worst stretch was when the Gemini API was experiencing high demand for over four days. My workaround was to switch to Gemini 2.5 Flash, but even then the Flash model sometimes experienced high demand too. So, out of pure necessity, I decided to switch to OpenRouter. This move has come with two advantages:

  1. I can now compare and experiment with different models, which lets me test which model works best for my particular task. It also removes my project's dependency on Gemini and the vendor lock-in to Google services. Along the way, I also discovered that Google was extracting quite a bit of telemetry from their Google Agent SDK.
  2. I now have comparable fallback models for when my main model is down. When I was using 2.5 Flash as a fallback, the results were usually not as good as my main model's. Now I can fall back to other LLM providers and models, such as DeepSeek or Claude, and know that the results may not be on par but will be close enough to the output I wanted.
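
To make the fallback idea concrete, here is a minimal sketch in Python of how an ordered list of models could be tried one after another. It is not the actual code from ikka. The names `complete_with_fallback`, `ModelUnavailable`, and `call_model` are all hypothetical, and `call_model` stands in for whatever client function sends a prompt to a given model and raises `ModelUnavailable` when the provider reports high demand.

```python
import time
from typing import Callable, Optional, Sequence, Tuple


class ModelUnavailable(Exception):
    """Hypothetical error that call_model raises when a model is overloaded."""


def complete_with_fallback(
    prompt: str,
    models: Sequence[str],
    call_model: Callable[[str, str], str],
    retries_per_model: int = 2,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, str]:
    """Try each model in order and return (model_used, response_text)."""
    last_error: Optional[Exception] = None
    for model in models:
        for attempt in range(retries_per_model):
            try:
                return model, call_model(model, prompt)
            except ModelUnavailable as err:
                last_error = err
                # Back off exponentially before retrying the same model;
                # after the final attempt, move straight on to the next model.
                if attempt < retries_per_model - 1:
                    sleep(base_delay * 2 ** attempt)
    raise RuntimeError("all models were unavailable") from last_error
```

In practice, `models` would list the main model first, followed by the fallbacks in order of preference. The returned model name makes it easy to log which model actually produced a given day's summaries.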

My experience with OpenRouter has been pretty great so far, and I have no real complaints. I've been experimenting with DeepSeek 4 Flash, and it is one of the best models I have used. My only frustration is that it takes a long time to return a response. If it were as fast as the Gemini models, it would definitely be my main model.