Outage of Global instance, May 8, 2026

Hello,

We regret that the Global instance of KoboToolbox, kf.kobotoolbox.org, is down as of 00:32 UTC due to an AWS infrastructure problem. The EU instance is not affected.

Amazon is posting public updates about the situation at https://health.aws.amazon.com/health/status#ec2-us-east-1_1778199926. As of now, 01:38 UTC, the current status is as follows:

May 07 5:53 PM PDT We continue to investigate instance impairments to a single Availability Zone (use1-az4) in the US-EAST-1 Region. We have experienced an increase in temperatures within a single data center, which in some cases has caused impairments for instances in the Availability Zone. EC2 instances and EBS volumes hosted on impacted hardware are affected by the loss of power during the thermal event. Other AWS services that depend on the affected EC2 instances and EBS volumes in this Availability Zone, may also experience impairments. We will continue to provide updates as recovery continues.

Kobo staff will continue to monitor the situation and bring the Global instance back online as soon as Amazon corrects the underlying infrastructure failure.

We apologize for this disruption and thank you for your patience.

Amazon has provided this update:

May 07 6:47 PM PDT We continue to work towards mitigating the increased temperatures to its normal levels in the affected Availability Zone (use1-az4) in the US-EAST-1 Region. Other AWS services that depend on the affected EC2 instances and EBS volumes in this Availability Zone, may also experience impairments. We have weighed away traffic for most services at this time. We recommend customers utilize one of the other Availability Zones in the US-EAST-1 Region at this time, as existing instances in other AZ’s remain unaffected by this issue. Customers may experience longer than usual provisioning times. We will provide an update by 7:45 PM PDT, or sooner if we have additional information to share.

Further update from Amazon:

May 07 8:06 PM PDT We are actively working to restore temperatures to normal levels in the affected Availability Zone (use1-az4) in the US-EAST-1 Region, though progress is slower than originally anticipated. Since our last update we have made incremental progress to restore cooling systems within the affected AZ, which will not be visible to external customers but are required for the restoration of affected services. In the impacted Availability Zone, EC2 Instances, EBS Volumes, and other AWS Services are also experiencing elevated error rates and latencies for some workflows. As part of our recovery effort, we have shifted traffic away from the impacted Availability Zone for most services. We recommend customers utilize one of the other Availability Zones in the US-EAST-1 Region, as existing instances in other AZs remain unaffected by this issue. If immediate recovery is required, we recommend customers restore from EBS Snapshots and/or replace affected resources by launching new replacement resources in one of the unaffected zones. We will provide an update by 10:00 PM PDT, or sooner if we have additional information to share.

We are pursuing their recommendations about snapshots, but all operations are currently stuck in a “pending” state. We are also following up with their technical support staff for further guidance.

Will there be data loss?

We do not anticipate any data loss (other than possibly data that was still in flight at the moment of the outage), and we always maintain regular backups. We’ll report back when AWS is back up and our service is restored.

Kindly advise on the anticipated timeline for resolution, as I have a data collection session scheduled within the next three hours. Please let us know whether we should wait or consider alternative options.

Welcome to the community, @imukoko! We will update you as soon as the issue gets resolved! :folded_hands:

@imukoko, we only have the information provided to us by AWS, but I would consider alternative options just to be safe.

Update from Amazon:

May 07 10:11 PM PDT We are observing early signs of recovery. We continue to work towards restoring temperatures to normal levels and bring impacted racks back online in the affected Availability Zone (use1-az4) in the US-EAST-1 Region. We have been able to get additional cooling system capacity online, which has allowed us to recover some affected racks and are actively working to recover additional racks in a controlled and safe manner. In the impacted Availability Zone, EC2 Instances, EBS Volumes, and other AWS Services may continue to experience elevated error rates and latencies for some workflows until full recovery is achieved. We will provide an update by 11:30 PM PDT, or sooner if we have additional information to share.

We currently have data collection ongoing. If a form is saved in our surveyors’ drafts, will they be able to submit it after services are restored, or will that data be lost? We can pause the survey if the latter is the case.

Same question.

Yes, they will be able to submit it once service is restored.

How long does it usually take to recover the server? Our team has travelled a long way and has now run into this problem. We have switched to the EU server for now, but we hope the Global server will be available again soon.

Sorry I don’t have more information for you. This is an extremely unusual event. I really have no precedent to cite to tell you how long it will take Amazon to deal with their overheated data center.

Data collection was ongoing. There is also data received from the data collectors and data saved as a draft. Is there a chance of losing the survey form, collected data, draft data, and the project in general? Do you advise us to stop data collection or continue and save as a draft?

Many thanks.

I equally have teams in the field.

@jnm Could you kindly advise how to change to the EU server in the meantime?

If you do not already have an account on the Kobo EU server, you can go to that server and register as a new user, using the same email address if you wish. Having done so, you can upload and redeploy your existing XLSForm to the EU server and collect data; please note that this requires configuring a different server name in KoboCollect.

Once you have collected your data/submissions, there are ways to transfer your data (back) to a different server (i.e., the Global server) if you need to; we can show you how to do that.

Thanks, @Xiphware :folded_hands:

It seems that Amazon has restored access to the machines housing our database volumes, and we are in the process of bringing the Global service back up right now.

Many thanks! Does this mean I can access our stored data even if I change servers? I ask in case this happens again, now that @jnm has said the servers are coming back online.

AWS has fixed their us-east-1 cooling issue and our Kobo Global server should now be back online and behaving normally.

Please note that you may be prompted to log in again, as your previous Kobo session may have been invalidated.

Because the Global server essentially had to be restarted, if you shared your project’s submission URL for Enketo web submissions during the 1.5-hour window between 11:00pm Thur UTC and 12:30am Fri UTC (11:00pm was our previous backup), that URL is now invalid and will not accept web submissions. If so, please go back to your project’s FORM page and re-share the new URL provided by Copy/Open.

Note: KoboCollect is unaffected by this URL change, so if you had previously fetched your form to KoboCollect then your submission(s) should now submit fine.
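
If you are unsure whether a link you shared falls inside that window, here is a minimal Python sketch for checking a timestamp against it. It assumes the window runs from 23:00 UTC on Thursday, May 7, 2026 to 00:30 UTC on Friday, May 8, 2026 (dates inferred from the thread title), and that both ends are inclusive. The function name `shared_in_affected_window` and the UTC+3 offset in the usage lines are purely illustrative, not part of any Kobo tool.

```python
from datetime import datetime, timedelta, timezone

# Assumed window boundaries, read from the post above: from the previous backup
# at 11:00pm Thursday UTC (May 7, 2026) to 12:30am Friday UTC (May 8, 2026).
WINDOW_START = datetime(2026, 5, 7, 23, 0, tzinfo=timezone.utc)
WINDOW_END = datetime(2026, 5, 8, 0, 30, tzinfo=timezone.utc)


def shared_in_affected_window(shared_at: datetime) -> bool:
    """Return True if a link shared at `shared_at` falls inside the window.

    `shared_at` must be timezone-aware so it can be compared in UTC.
    Treating both boundaries as inclusive is an assumption.
    """
    if shared_at.tzinfo is None:
        raise ValueError("shared_at must be timezone-aware")
    return WINDOW_START <= shared_at.astimezone(timezone.utc) <= WINDOW_END


# Hypothetical usage: a link shared at 02:15 local time on May 8 in a UTC+3
# zone corresponds to 23:15 UTC on May 7, which is inside the window.
local = timezone(timedelta(hours=3))
print(shared_in_affected_window(datetime(2026, 5, 8, 2, 15, tzinfo=local)))  # True
```

If the check returns True for the time you shared your web form link, re-share the new URL as described above.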

Thank you for your patience. As @jnm said, AWS going down for this length of time is highly unusual.
