On November 24, 2023, Olo's Omnivore API experienced a disruption between 21:17 UTC and 22:12 UTC. During this time, all API operations except Add Payment, Open Ticket, and Submit Order were failing, and 25% of Omnivore-related webhooks experienced delayed delivery.
On November 24, 2023, Olo experienced a disruption to the Omnivore API and related webhook delivery, caused by a failure in the automated process for creating new Omnivore API instances. As traffic to the Omnivore API increased, its auto-scaling system was unable to add the capacity needed to meet demand. As a result, at 21:17 UTC all API operations except Add Payment, Open Ticket, and Submit Order began to fail, and 25% of Omnivore-related webhooks began to experience delayed delivery.
We discovered that some of our package dependencies had been updated by their maintainers to require a newer runtime version than the one available in our deployment pipeline. This caused the bootstrapping process to fail for the new instances needed to handle current traffic levels. With this identified, we implemented and deployed a fix to remove the failing dependencies from the API's critical path, allowing the system to resume scaling out additional API instances and restoring service at 22:12 UTC.
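As an illustration only, the kind of mismatch described above (a dependency declaring a minimum runtime version newer than the one in the deployment pipeline) could be caught before instances are bootstrapped with a check along the following lines. This is a minimal sketch and not Olo's actual tooling: the function names, the package names, and the version numbers are all hypothetical, and it assumes versions are simple dotted integers.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Parse a dotted version such as "3.8" into a 3-part tuple, e.g. (3, 8, 0)."""
    parts = [int(part) for part in text.split(".")]
    # Pad to three components so "3.8" and "3.8.0" compare as equal.
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def incompatible_dependencies(
    runtime: str, minimum_required: Dict[str, str]
) -> List[str]:
    """Return the names of dependencies whose minimum runtime exceeds `runtime`.

    `minimum_required` maps a (hypothetical) package name to the minimum
    runtime version that package declares.
    """
    runtime_version = parse_version(runtime)
    return sorted(
        name
        for name, minimum in minimum_required.items()
        if parse_version(minimum) > runtime_version
    )


if __name__ == "__main__":
    # Hypothetical data: the pipeline runtime and each package's declared minimum.
    declared = {"pkg_a": "3.10", "pkg_b": "3.7", "pkg_c": "3.8.0"}
    blocked = incompatible_dependencies("3.8", declared)
    if blocked:
        print("Dependencies requiring a newer runtime:", ", ".join(blocked))
```

Run as a step ahead of instance creation, a non-empty result would signal that new instances are likely to fail bootstrapping, which could then block the rollout or trigger an alert rather than surfacing only when the auto-scaler needs capacity.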