On Mar 18, from 1:22 PM PDT (20:22 UTC) to 3:20 PM PDT (22:20 UTC), when a candidate visited a Direct Booking Link, selected a time, and clicked “Book,” Ashby showed the candidate an error stating that the time could not be booked.
Our infrastructure runs snapshots of code built for the specific computer architectures we provision for our servers on Amazon Web Services. Last week, we inadvertently changed the target architecture of our direct booking code snapshots to one that is incompatible with our servers. Because we keep backups of snapshots, direct booking continued to work until the old snapshot expired today. Once it expired, our infrastructure switched to the newer, incompatible snapshots and failed.
Our on-call engineer identified the issue internally within 30 minutes, updated status.ashbyhq.com immediately, and resolved the issue approximately 2 hours later.
As part of remediation, we contacted affected candidates and customers directly.
That day, we added monitoring to detect this specific issue immediately (versus minutes later).
We also performed a post-mortem and determined that the root cause was the lack of developer tooling and safeguards on some of our infrastructure. We have started adding the necessary hardening features to prevent this situation from occurring again.
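To illustrate the kind of safeguard that can catch an architecture mismatch before a snapshot is used, here is a minimal sketch. It is an assumption-laden example, not a description of the actual tooling: the function names `normalize_arch` and `check_snapshot_arch` and the alias table are hypothetical, and a real check would read the architecture from the build metadata and the server configuration.

```python
import platform
from typing import Optional

# Hypothetical alias table: different tools report the same CPU
# architecture under different names.
_ALIASES = {
    "amd64": "x86_64",
    "x64": "x86_64",
    "x86_64": "x86_64",
    "aarch64": "arm64",
    "arm64": "arm64",
}


def normalize_arch(name: str) -> str:
    """Map an architecture name to a canonical form (unknown names pass through)."""
    key = name.strip().lower()
    return _ALIASES.get(key, key)


def check_snapshot_arch(snapshot_arch: str, server_arch: Optional[str] = None) -> None:
    """Raise ValueError if a snapshot was built for a different architecture.

    If server_arch is omitted, the architecture of the machine running
    this check is used.
    """
    if server_arch is None:
        server_arch = platform.machine()
    built_for = normalize_arch(snapshot_arch)
    runs_on = normalize_arch(server_arch)
    if built_for != runs_on:
        raise ValueError(
            f"snapshot built for {built_for!r} cannot run on {runs_on!r} servers"
        )


if __name__ == "__main__":
    check_snapshot_arch("amd64", "x86_64")  # compatible: returns without error
    try:
        check_snapshot_arch("arm64", "x86_64")  # incompatible: raises
    except ValueError as err:
        print(f"blocked: {err}")
```

Running a check like this at build or publish time means a mismatch is rejected when the change is made, instead of surfacing later when an older backup snapshot expires.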