Backup configuration during Kobo Install is not working

Hello,

I have KoBoToolbox installed on my own server (an AWS EC2 instance) using https://github.com/kobotoolbox/kobo-install. During the installation, the setup asked for the backup configuration, which I completed as shown below, but I don't see any backups being created in my AWS bucket.

Do you want to activate backups?
	1) Yes
	2) No
[1]: 
╔═════════════════════════════════════════════════════════════════╗
║ Schedules use linux cron syntax with UTC datetimes.             ║
║ For example, schedule at 12:00 AM E.S.T every Sunday would be:  ║
║ 0 5 * * 0                                                       ║
║                                                                 ║
║ Please visit https://crontab.guru/ to generate a cron schedule. ║
╚═════════════════════════════════════════════════════════════════╝
PostgreSQL backup schedule?
[*/2 * * * *]: 
MongoDB backup schedule?
[*/4 * * * *]: 
Redis backup schedule?
[*/6 * * * *]: 
AWS Backups bucket name [bucket-name]: 
How many yearly backups to keep?
[2]: 
How many monthly backups to keep?
[12]: 
How many weekly backups to keep?
[4]: 
How many daily backups to keep?
[10]: 
MongoDB backup minimum size (in MB)?
Files below this size will be ignored when rotating backups.
[5]: 
PostgresSQL backup minimum size (in MB)?
Files below this size will be ignored when rotating backups.
[5]: 
Redis backup minimum size (in MB)?
Files below this size will be ignored when rotating backups.
[5]: 
Chunk size of multipart uploads (in MB)?
[5]: 
Use AWS LifeCycle deletion rule?
	1) Yes
	2) No
[1]: 

Also, while the setup asked for the AWS bucket name for backups, it did not ask for the AWS Key and Secret for that bucket, so how is it going to store the backups on S3? Can someone please help and let me know where I am going wrong?

Hello,

Is there any resolution on this? Can someone from the KoboToolbox team or from the community reply?

Thanks

Hi
When I look at the code, I suspect the issue is here:
[screenshot]

The instructions seem to indicate this clearly:
[screenshot]

I went to the link provided, and I assume you did so as well, but I feel that you are not following the approach.

Stephane

Hi,

Thank you for the reply. I don't think this is the actual problem. I also tried a less aggressive schedule, 0 5 * * * (which runs daily at 05:00 UTC), and that also didn't work.

The backups are scheduled to run every 2 minutes for Postgres, every 4 minutes for Mongo, and so on. I have double-checked the expressions on the crontab.guru website and it parses them exactly as intended (screenshot attached).
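
For reference, this is how those expressions read in standard cron syntax (minute, hour, day of month, month, day of week; all UTC, as the installer notes); the annotations are my own:

*/2 * * * *    every 2 minutes (PostgreSQL)
*/4 * * * *    every 4 minutes (MongoDB)
*/6 * * * *    every 6 minutes (Redis)
0 5 * * *      once a day at 05:00 UTC (midnight EST)
0 5 * * 0      once a week, on Sundays at 05:00 UTC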

Is there any restriction in the code that the schedule must be monthly, or at least bi-weekly, so that anything more frequent is ignored? Also, my other question: the setup asks for the AWS bucket but not its Key and Secret, so how is it going to put the backup files on S3?

Regards,
Danish

Hello @dhakim,

As I said on GitHub, the Key and Secret are shared with the S3 storage settings.
Be sure to activate S3 storage first and to give your AWS user the correct permissions (write access to both the storage bucket and the backup bucket).
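
For example, a minimal inline IAM policy along these lines should cover it (the bucket names and the user name below are placeholders):

cat > /tmp/kobo-s3-policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    { "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": ["arn:aws:s3:::my-media-bucket", "arn:aws:s3:::my-backup-bucket"] },
    { "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": ["arn:aws:s3:::my-media-bucket/*", "arn:aws:s3:::my-backup-bucket/*"] }
  ]
}
EOF
aws iam put-user-policy --user-name my-kobo-user --policy-name kobo-s3-access --policy-document file:///tmp/kobo-s3-policy.json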

You can also have a look at the logs in kobo-docker/log/. You should see a backup.log file in each DB folder (i.e. postgres, mongo, redis).
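
For example, from the host (assuming kobo-docker is checked out in your home directory):

ubuntu@server:~$ sudo tail -n 50 ~/kobo-docker/log/postgres/backup.log
ubuntu@server:~$ sudo tail -n 50 ~/kobo-docker/log/mongo/backup.log
ubuntu@server:~$ sudo tail -n 50 ~/kobo-docker/log/redis/backup.log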

IMHO, a backup every 2 minutes is pretty aggressive. As soon as your DB grows, I'm not sure the task will be able to complete before the next one starts.

Hi @nolive,

Thank you for your reply above. As per your suggestions, I checked the user settings and they are fine; I can use the AWS CLI with the same S3 access key and secret that I provided to kobo-install.

Regarding the crontab of every 2 minutes, I only did that to check whether the backups were really working; I have since changed it to every 2 days.

So far, backups are not working through kobo-install, so for the time being I have written a shell script that takes the Postgres DB backup and pushes it to my S3 bucket. I will post here once the backup through kobo-install is configured properly.
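
For illustration, a minimal version of such a script looks roughly like this (the container name, DB user/name and bucket are placeholders for my own values):

#!/bin/bash
# Stop-gap manual backup: dump Postgres from the kobo-docker container and push it to S3.
# "postgres" (container), "kobo" (DB user), "kobotoolbox" (DB name) and the bucket are placeholders.
set -e
STAMP=$(date -u +%Y%m%d_%H%M%S)
DUMP=/tmp/postgres-${STAMP}.pg_dump
docker exec postgres pg_dump -U kobo -Fc kobotoolbox > "${DUMP}"
aws s3 cp "${DUMP}" "s3://my-backup-bucket/postgres/manual/${STAMP}.pg_dump"
rm -f "${DUMP}"

It relies on the same AWS CLI credentials that already work on the host.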

Hi @nolive,

As per your suggestions above, I checked the logs and they show that it is unable to connect to the S3 bucket, although I am able to connect using the AWS CLI.

ubuntu@server:~/kobo-docker/log/postgres$ sudo cat backup.log 
Traceback (most recent call last):
  File "/kobo-docker-scripts/backup-to-s3.py", line 62, in <module>
    s3bucket = s3connection.get_bucket(AWS_BUCKET)
  File "/tmp/backup-virtualenv/local/lib/python2.7/site-packages/boto/s3/connection.py", line 509, in get_bucket
    return self.head_bucket(bucket_name, headers=headers)
  File "/tmp/backup-virtualenv/local/lib/python2.7/site-packages/boto/s3/connection.py", line 556, in head_bucket
    response.status, response.reason, body)
boto.exception.S3ResponseError: S3ResponseError: 400 Bad Request

The same errors appear in the mongo and redis logs too.
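
For reference, the closest AWS CLI equivalent to the get_bucket() call that fails above is (the bucket name here is a placeholder):

ubuntu@server:~$ aws s3api head-bucket --bucket my-backup-bucket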

@dhakim,

Let me try to reproduce it locally and I'll let you know.
In the meantime, can you confirm which branch/commit you are using for:

  • kobo-docker
  • kobo-install

Thank you

Thank you @nolive, I am using the master branch of kobo-install.

Hello @dhakim,

I've run a fresh install of kobo-install with the master branch.
I forced the backup (for postgres) to run within the next 15 minutes and it worked.

root@postgres:/# tail -f /srv/logs/backup.log
...
Backing up to "postgres/yearly/postgres-9.5-kobo.local-20200416_201002.pg_dump"...
Wrote 1 chunks; 262.1 MB
Finished! Wrote 1 chunks; 262.1 MB

I have also tested it manually by running these commands:

ubuntu@server:~/kobo-install/$ ./run.py -cb exec postgres bash
root@postgres:/# /tmp/backup-virtualenv/bin/python /kobo-docker-scripts/backup-to-s3.py

It did work too.
To be sure, I've also tested in the MongoDB and Redis containers. The backups ran flawlessly too.

To validate that we are using the same versions, can you confirm you have the same commits as me?

ubuntu@server:~/kobo-install/$ ./run.py --version
KoBoInstall Version: 2aea483
ubuntu@server:~/kobo-install/$ git log --format="%H" -n 1
2aea483520dd6de95cc2d8e8e99f639bcc4b1ce2

In kobo-docker:

ubuntu@server:~/kobo-docker/$ git log --format="%H" -n 1 
feb2fd6f48444460e8c885afd79e9b5523a53aaa

If you don’t, please run ./run.py --upgrade.
It’s recommended to run ./run.py --setup after upgrading.

You said you could log in with the AWS CLI using the same credentials, but are you sure you can list, read and write in the backup bucket, not only the media bucket?
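
For example (replace the bucket name with your backup bucket):

ubuntu@server:~$ aws s3 ls s3://my-backup-bucket/
ubuntu@server:~$ echo "permission check" | aws s3 cp - s3://my-backup-bucket/permission-check.txt
ubuntu@server:~$ aws s3 rm s3://my-backup-bucket/permission-check.txt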

Hi @nolive,

I have tried the above steps but the issue persists. I have investigated further and found the KoBoCAT logs below regarding S3 bucket connectivity.

ERROR 2020-04-24 07:02:17,388 base 229 140446137800512 Internal Server Error: /api/v1/forms
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py", line 132, in get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/usr/local/lib/python2.7/dist-packages/django/utils/decorators.py", line 145, in inner
    return func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/django/views/decorators/csrf.py", line 58, in wrapped_view
    return view_func(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/rest_framework/viewsets.py", line 87, in view
    return self.dispatch(request, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/rest_framework/views.py", line 466, in dispatch
    response = self.handle_exception(exc)
  File "/usr/local/lib/python2.7/dist-packages/rest_framework/views.py", line 463, in dispatch
    response = handler(request, *args, **kwargs)
  File "./onadata/apps/api/viewsets/xform_viewset.py", line 740, in create
    survey = utils.publish_xlsform(request, owner)
  File "./onadata/apps/api/tools.py", line 266, in publish_xlsform
    return publish_form(set_form)
  File "./onadata/libs/utils/logger_tools.py", line 455, in publish_form
    return callback()
  File "./onadata/apps/api/tools.py", line 264, in set_form
    return form.publish(user)
  File "./onadata/apps/main/forms.py", line 326, in publish
    return publish_xls_form(cleaned_xls_file, user, id_string)
  File "./onadata/libs/utils/logger_tools.py", line 515, in publish_xls_form
    xls=xls_file
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/manager.py", line 127, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/query.py", line 348, in create
    obj.save(force_insert=True, using=self.db)
  File "./onadata/apps/viewer/models/data_dictionary.py", line 159, in save
    super(DataDictionary, self).save(*args, **kwargs)
  File "./onadata/apps/logger/models/xform.py", line 211, in save
    super(XForm, self).save(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/base.py", line 734, in save
    force_update=force_update, update_fields=update_fields)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/base.py", line 762, in save_base
    updated = self._save_table(raw, cls, force_insert, force_update, using, update_fields)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/base.py", line 846, in _save_table
    result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/base.py", line 885, in _do_insert
    using=using, raw=raw)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/manager.py", line 127, in manager_method
    return getattr(self.get_queryset(), name)(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/query.py", line 920, in _insert
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/sql/compiler.py", line 973, in execute_sql
    for sql, params in self.as_sql():
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/sql/compiler.py", line 931, in as_sql
    for obj in self.query.objs
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/fields/files.py", line 314, in pre_save
    file.save(file.name, file, save=False)
  File "/usr/local/lib/python2.7/dist-packages/django/db/models/fields/files.py", line 93, in save
    self.name = self.storage.save(name, content, max_length=self.field.max_length)
  File "/usr/local/lib/python2.7/dist-packages/django/core/files/storage.py", line 63, in save
    name = self._save(name, content)
  File "/srv/pip_editable_packages/django-storages/storages/backends/s3boto.py", line 409, in _save
    key = self.bucket.get_key(encoded_name)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/bucket.py", line 193, in get_key
    key, resp = self._get_key_internal(key_name, headers, query_args_l)
  File "/usr/local/lib/python2.7/dist-packages/boto/s3/bucket.py", line 232, in _get_key_internal
    response.status, response.reason, '')
S3ResponseError: S3ResponseError: 400 Bad Request

{address space usage: 915894272 bytes/873MB} {rss usage: 117383168 bytes/111MB} [pid: 229|app: 0|req: 17/17] 172.22.0.3 () {38 vars in 656 bytes} [Fri Apr 24 07:02:16 2020] POST /api/v1/forms => generated 4269 bytes in 443 msecs (HTTP/1.1 500) 4 headers in 156 bytes (2 switches on core 0)

This issue was also causing the form deployment problem, so after I disabled the AWS S3 storage and backup settings by running python3 run.py -s, my form deployments work again. I am not quite sure how to fix it properly; can you please help?

@dhakim,

TBH, I don't know what else to tell you except to please double-check your credentials and your S3 bucket policy.

I know you said you tested your credentials with the AWS CLI, but are you sure the correct policy is applied to the bucket?

As I said, everything works fine on my side: backups and deployments.
We've just released a new version with several bug fixes. You can give it a try.

**ATTENTION:** Be sure to read this BEFORE.

I had a similar issue. When I set up using the AWS S3 option I was not able to deploy forms and was getting similar errors: S3ResponseError: S3ResponseError: 400 Bad Request

I confirmed that my bucket and AWS Key and Secret were setup properly and wrote my own script that successfully uploaded a test file. Everything seemed like it should work, but it did not.

After some more digging I believe the problem was related to the region in which I created the bucket (Ohio). KoBoCAT is using the boto library (not boto3), and I believe it defaults to an older signature version that newer regions such as Ohio (us-east-2) do not support. I created a new bucket in N. Virginia and it worked properly.

While the AWS Console header lists "Global" and says "S3 does not require region selection", you can still select the region when you create a new bucket.
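
If it helps, this is roughly how to check which region an existing bucket is in, and how to create a replacement in N. Virginia (the bucket names are placeholders):

ubuntu@server:~$ aws s3api get-bucket-location --bucket my-old-backup-bucket
{
    "LocationConstraint": "us-east-2"
}
ubuntu@server:~$ aws s3api create-bucket --bucket my-new-backup-bucket --region us-east-1

A null LocationConstraint means the bucket is already in us-east-1.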

Hope this can help someone save a few hours.

Hi @mattlangeman,

Welcome to the community! Thank you for sharing your experience and the solution to the issue you had. This should benefit all community users facing a similar issue. We look forward to more of the same in the upcoming days.

Have a great day!

God bless you @mattlangeman. I think this issue should be documented so it doesn't cost people so many hours in the future.
Thanks again!

I know this thread is old, but I have a related question: is it possible to set AWS S3 storage to "No" while still activating backups to AWS S3?

  1. If AWS credentials are provided, backups are sent to the configured bucket.
    [This works fine, but XLS form files are also stored on S3.]

  2. If AWS storage is selected, credentials must be provided if backups are activated.
    [I provided the access keys and it works fine; backups go to S3, but so do the form files. Can they be separated so that ONLY backups are stored in S3?]

I haven't tried turning off S3 storage and then watching for backups in S3 yet, as this instance is already in use and I am trying to advise on the best configuration going forward.

Thank you!

Thanks so much @mattlangeman! I was having the exact same issue and didn't know what else to try after verifying that everything else was working well. Creating a new bucket in US East (N. Virginia), us-east-1, fixed the problem.

I agree with Rodrigo; the above should be mentioned during setup so everyone is aware of it. @Kal_Lam

Thanks for sharing the solution, Matt. I owe you one 🙂

Best Regards