My framework uses the Python API for GCE to create instances on the fly as needed. All the instances are based on a single machine image. There is a limit on the rate of the instance-creation operation. However, I noticed that GCE sometimes allows several instances to be created in a row without much delay. Hence, I do not want to hard-code a fixed delay between instance creations.
For now, I use exponentially increasing delays between attempts at instance creation. Although I could not find anything related in the documentation, I decided to try asking: is there a way to use the API to find out directly when the next instance creation will be allowed?
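For reference, the backoff approach described above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual code: `create_instance` is a hypothetical callable standing in for whatever wraps the insert request, and the attempt count, base delay, and cap are assumed values. Random jitter is added so that parallel workers do not retry in lockstep.

```python
import random
import time


def create_with_backoff(create_instance, max_attempts=6, base_delay=1.0,
                        max_delay=60.0, sleep=time.sleep):
    """Call create_instance() until it succeeds, backing off exponentially.

    create_instance is a hypothetical zero-argument callable that raises
    when the request is rejected (for example, because of a rate limit).
    """
    for attempt in range(max_attempts):
        try:
            return create_instance()
        except Exception:  # assumption: narrow this to the client's rate-limit error
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            # The upper bound doubles each attempt and is capped at max_delay;
            # the actual wait is drawn uniformly below that bound (full jitter).
            bound = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, bound))
```

Because the first attempt runs immediately, this only pays a delay when a request is actually rejected, which matches the observation that several creations in a row often succeed without waiting.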
It would be helpful if your question included a minimal repro of your code, its scope (what rate of instance creation are you attempting?), and details of the error that you're receiving.
Although not surprising (!), I was unaware that there is a rate limit on instance creation (instances.insert). The Compute Engine API rate limits documentation does not appear (!?) to include instances.insert.
If you're hitting a rate limit you can:
It's possible that you're encountering service capacity limits (I've seen these recently in us-east1), meaning that there is insufficient capacity to fulfill your requests in the region|zone. In this case, the only alternative is to look in other regions|zones for capacity.
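A rough sketch of that fallback, under stated assumptions: `create_instance(zone)` is a hypothetical callable wrapping your own insert request, and it is assumed to raise when the zone cannot fulfill the request. The zone names in the usage note are only illustrative.

```python
def create_in_first_available_zone(create_instance, zones):
    """Try each candidate zone in order; return (zone, result) on first success.

    create_instance is a hypothetical callable taking a zone name and raising
    when that zone cannot fulfill the request.
    """
    last_error = None
    for zone in zones:
        try:
            return zone, create_instance(zone)
        except Exception as err:  # assumption: narrow this to capacity errors
            last_error = err  # remember the failure and move on to the next zone
    raise RuntimeError("no capacity in any candidate zone") from last_error
```

For example, `create_in_first_available_zone(my_insert, ["us-east1-b", "us-central1-a"])` would try us-east1-b first and fall back to us-central1-a, where `my_insert` is your own wrapper. In practice you would catch only the specific capacity error rather than every exception, so that unrelated failures are not silently retried elsewhere.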