Upgrading to separate databases for KPI and KoBoCAT

Background

KoBoToolbox previously used a shared Postgres database for its two Django applications, KPI and KoBoCAT. Since this has some disadvantages—including the requirement that Django be updated in lockstep across both apps—we’ve decided to give KPI and KoBoCAT their own, separate Postgres databases.

KoBoCAT has responsibility for storing all form submissions and, as a result, generally has a larger amount of data than KPI. For this reason, our approach is to copy the KPI tables from the original shared database into a new database. We leave the KoBoCAT tables where they are: if your original, shared database is called kobotoolbox (the default), then, when the migration is finished, KoBoCAT will use kobotoolbox and KPI will use a new database, called koboform by default.

Warnings

Upgrading to two databases requires downtime, because all the KPI data must be copied from one Postgres database to another. If your installation is small, this will be negligible. If you run a large server, however, please plan carefully: expect at least one minute of downtime for every 1,500 KPI assets. Assets are surveys and library items: questions, blocks, and templates. Anticipate an additional 30 minutes for the systems administration work itself: running commands, waiting for Docker images to download and containers to start, etc.

Sessions will be reset, meaning that all users will need to log in again after the upgrade. This happens because we now use Redis to share sessions between KPI and KoBoCAT instead of relying on both applications accessing a shared Postgres table.

Make sure you have enough disk space before proceeding. You’ll be creating a backup of your entire KoBo installation (configuration as well as data), and you’ll be copying the entire KPI database. For safety, we don’t yet delete anything from the original database after the copy completes, but if you’re an experienced database administrator, you can remove all the KPI tables from the KoBoCAT database once you’re satisfied with the upgrade.

It’s difficult to roll back to a single-database setup once you’ve upgraded and started making changes in the KPI application, because you would need to merge, manually, the two databases back into a single database. You can easily restore from a backup, but that would lose any work you may have done in KPI after the database split. If you manage an instance for multiple users, make sure to test the application thoroughly after upgrading before inviting everyone else to continue their work with KoBoToolbox.

Performing the upgrade

This procedure assumes you are using kobo-install to run KoBoToolbox on a single server. If you have a different setup and need assistance, please post a message below.

  • Please make sure you have Python 3 installed in the environment where you run kobo-install/run.py.
  • Stop all KoBo containers:
    $ python3 kobo-install/run.py --stop
    
  • Back up everything related to your KoBo installation using rsync. For a standalone server, this is usually four directories: kobo-deployments, kobo-docker, kobo-install, and nginx-certbot. If your containers are not stopped, your backup will be corrupt.
    $ sudo rsync -avP \
        kobo-deployments \
        kobo-docker \
        kobo-install \
        nginx-certbot \
        backup-20200430/
    
  • Run the same rsync command again; no files should be listed, indicating that everything was already copied. Example output:
    sending incremental file list
    
    sent 93,627 bytes  received 428 bytes  188,110.00 bytes/sec
    total size is 492,598,759  speedup is 5,237.35
    
  • Enter the kobo-install directory:
    $ cd kobo-install
    
  • Fetch new tags from GitHub:
    $ git fetch --tags
    
  • Check out the 2.020.24b release:
    $ git checkout 2.020.24b
    
  • Update the installation:
    $ python3 run.py --update
    
  • Choose Yes to proceed when prompted that it's strongly recommended to run `./run.py --setup` to regenerate environment files.

    :information_source: At each prompt, you may press enter to accept your previously configured value without re-typing it, but take care not to skip over the next step.

  • After you are asked Do you want to (re)start containers?, to which you should answer Yes, you will be asked about Upgrading from single-database setup to separate databases for KPI and KoBoCAT. Read the notice about downtime and choose Yes to proceed if acceptable. This is the last opportunity to abort the upgrade without restoring from a backup.

    :warning: If you decide not to proceed at this point, you have two options:

    1. Follow the on-screen instructions to downgrade to the shared-database-obsolete branch;
    2. Or, roll back using the rsync backup you made at the beginning of this procedure. See the section titled “Rolling back a failed upgrade” below.
  • If you elected to proceed, you should see messages related database creation and copying tables, followed by a notice that The database upgrade finished successfully! Thanks for using KoBoToolbox.

    :information_source: The message WARNING: Found orphan containers can safely be ignored.

  • Your containers should now start as usual, which may take several minutes.

    :information_source: It is not uncommon to see `KoBoToolbox` has not started yet. This is can be normal with low CPU/RAM computers. On a VPS with 2 GB RAM, 512 MB swap, and a single core of an AMD EPYC 7601, starting up the first time takes over 10 minutes.
    If you want to get a sense of what’s happening, open a separate terminal and run top. You should see grunt, node, or npm consuming lots of CPU time (as Enketo completes its initial build).

    • If the CPU is idle but the application still does not load, there’s likely a problem. Answer No at the next prompt to Wait for another 600 seconds? and select Yes to try Restarting the frontend containers.
    • If startup continues to fail, inspect:
      • The output of python3 run.py --logs;
      • The uWSGI logs:
        • kobo-docker/log/kpi/uwsgi.log;
        • kobo-docker/log/kobocat/uwsgi.log;
      • NGINX logs in kobo-docker/log/nginx, particularly *.error.log.
    • If necessary, roll back using the rsync backup you made at the beginning of this procedure. See the section titled “Rolling back a failed upgrade” below.
  • Once your containers start successfully, update the KoBoCAT database so that REST Services continues to function correctly:
    1. Enter the KoBoCAT Docker container:
      $ python3 run.py -cf exec kobocat bash
      
    2. Run the update_kpi_hooks_endpoint management command:
      # ./manage.py update_kpi_hooks_endpoint
      
    3. Once you see Done!, type exit to leave the KoBoCAT Docker container.
  • Test to make sure the application is functioning well. Check as much as your time budget allows, but at a minimum, make sure that you can:
    • Access old projects and their submissions;
    • Access previously-created Enketo survey URLs and submit data through them;
    • Create, deploy, and submit data to new projects;
    • Export data to XLS.
  • If everything looks good, you’re likely done! :raised_hands: Some installations will require a few extra steps; continue reading the “Less common scenarios” section below to see if this applies to you.

Less common scenarios

  • If your users have previously uploaded forms directly to KoBoCAT, a.k.a. “Projects (legacy)”, and now want to use KPI REST Services to push submissions from those forms to external servers, you’ll need to run a management command:
    1. Enter the KPI Docker container—from your kobo-install directory, execute:
      $ python3 run.py -cf exec kpi bash
      
    2. Run the populate_kc_xform_kpi_asset_uid management command:
      # ./manage.py populate_kc_xform_kpi_asset_uid
      
    3. Done! will appear on the screen when the process is finished. You can then exit the container and close your terminal.
  • KPI allows you to configure CORS headers using the Django admin interface, i.e. at https://[your KPI host]/admin/external_integrations/corsmodel/. The underlying library we use for CORS has changed their logic to require that the scheme (http or https) be specified in each origin. If you previously had, for instance, example.com as an allowed CORS origin, simply change it to https://example.com.

Rolling back a failed upgrade

  • If you discover problems or were unable to start your containers, roll back using the rsync backup you made earlier:
    • Ensure all containers are stopped:
      $ python3 kobo-install/run.py --stop
      
    • Move all directories pertaining to the failed upgrade into a new directory, e.g.
      $ sudo mv \
          kobo-deployments \
          kobo-docker \
          kobo-install \
          nginx-certbot \
          failed-update-20200430/
      
    • Restore the backup, e.g.
      $ sudo rsync -a backup-20200430/* ./
      
    • Start your instance as you would have before attempting this procedure:
      $ python3 kobo-install/run.py --start
      
    • Report error messages or trouble symptoms here by posting a message below.
1 Like

Thanks for the detailed instructions. This all worked very well when doing an upgrade.

I ran into trouble when forking KPI master to make some customization. As I have done in previous versions, I cloned the KPI and made some small changes, deployed/built that code in my own docker-hub repo, and then used that to build the docker container used when run.py runs. When I use docker-hub image kobotoolbox/kpi:2.020.18 in docker-compose.frontend.yml, all works perfectly well. As soon as I try to use another docker image for KPI, I get "kpi_1 /bin/bash: line 0: exec: /srv/src/kpi/docker/init.bash: cannot execute: Permission denied " in the logs and of course fails to start. It would seem to perhaps be an issue with the Dockerfile? I have tried different dockerfiles and to chmod that folder in dockerfile, to no avail. At this point I have tried using any number of releases and commits to try to find one that works, but nothing does except the default. Any guidance to what simple thing I may be missing would be super appreciated!!

Hi @tolexy, could you get inside the failing container, using kobo-install$ ./run.py -cf run --rm kpi bash or perhaps kobo-docker$ docker-compose -f docker-compose.frontend.yml -f docker-compose.frontend.override.yml run --rm kpi bash and see what /srv/src/kpi/docker/init.bash looks like? Some things that might be helpful to know: is it executable; can you cat it; can you start it with bash /srv/src/kpi/docker/init.bash; can you start it with /srv/src/kpi/docker/init.bash?

thanks for the reply. I was eventually able to get kpi running but honestly not sure why and dont like that. I will make some tests with KPI starting init.bash manually. I am thinking the below may have clues as well.

When trying to use this same process of building a new image for kobocat (to increase the file upload limit going into our S3 bucket) I got these errors flowing through logs after every other container started.

kobocat_1         | runsv celery: fatal: unable to start ./run: access denied
kobocat_1         | runsv uwsgi: fatal: unable to start ./run: access denied
kobocat_1         | runsv celery_beat: fatal: unable to start ./run: access denied

so I go poking into the container in etc/service, follow the symlinks to srv/src/kobocat/docker and come to find out they werent executable by anyone.

root@kobocat:/srv/src/kobocat/docker# ls -l
total 44
-rw-r--r-- 1 root root  414 May 22 17:46 backup_media.bash
-rw-r--r-- 1 root root  202 May 22 17:46 backup_media_crontab.envsubst
drwxr-xr-x 2 root root 4096 May 22 17:46 cron
-rw-r--r-- 1 root root 1059 May 22 17:46 init.bash
-rw-r--r-- 1 root root 1531 May 22 14:22 kobocat.ini   
-rw-r--r-- 1 root root  295 May 22 17:46 run_celery.bash  <---
-rw-r--r-- 1 root root  327 May 22 17:46 run_celery_beat.bash  <---
-rw-r--r-- 1 root root  123 May 22 17:46 run_tests.bash
-rw-r--r-- 1 root root  750 May 22 17:46 run_uwsgi.bash  <----

The symlink exists in /etc/service and were all executable except for uwsgi (had to chmod x). This all worked fine, though this of course isnt persistent when restarting.

in the stock kobotoolbox/kobocat:2.020.18 image everything looks fine:
-rwxrwxr-x 1 root root 295 Apr 30 06:40 run_celery.bash

I can probably modify the Dockerfile to add these permissions on build, but it strikes me that the stock img works and then mine doesn’t even when it is the zipped 2.020.18 release. Makes me think there is something fishy with the user permissions. Currently this is running on AWS Lightsail created new user for kobo. Any way, thanks for any insights if the above was at all helpful

1 Like

Hi @jnm,

after following the steps, update fails like this after DB backup questions:

Traceback (most recent call last):
File “run.py”, line 59, in
Command.help()
File “/home/kobo/kobo-install/helpers/command.py”, line 524, in update
File “/home/kobo/kobo-install/helpers/template.py”, line 54, in render
# Environment
File “/home/kobo/kobo-install/helpers/template.py”, line 233, in __write_templates
“aws_backup_bucket_deletion_rule_enabled”, “True”, “False”),
File “/usr/lib/python3.5/string.py”, line 129, in substitute
return self.pattern.sub(convert, self.template)
File “/usr/lib/python3.5/string.py”, line 119, in convert
val = mapping[named]
KeyError: ‘REDIS_PASSWORD_JS_ENCODED’

Any ideas how to get past this? Python3 is 3.5.3, and a working KoBo installation to start with ( kpi 2.019.35b-hotfix2)

Thank You,

Hi @maric.vladimir, could you upgrade your kobo-install to 2.020.19 and try again?

1 Like

@tolexy, thanks for the info. What OS are you using to build your custom images (and what kind of file system do the KoBoCAT and KPI source files reside on)? I think that Docker’s COPY preserves permissions when copying files from the working directory to the container, but maybe it fails in certain environments:

Thanks @jnm, we did and it worked flawlessly. 2.020.19 is in fact currently the last commit in this repo?

Thanks for the hint. Maybe you should update the doc above? Or is it something installation-specific?

Thank you,

1 Like

Hi @jnm,

there is another problem we are facing. Access to legacy projects (second icon from the bottom in the left menu bar) opens KoBo login screen. We did populate_kc_xform_kpi_asset_uid in the KPI container as advised above, and it will try to access user’s path on kobocat, but will redirect instantly to the login screen.

Another thing that I noticed does not work on the current installation, as well as the new one, is “Logout” in the user menu - it does nothing.

Any ideas how to get the kobocat and legacy projects running?

Thanks in advance,

Good point; I’ve updated the instructions to specify 2.020.23, which is now the latest.

1 Like

Redis is now used to share sessions between KPI and KoBoCAT, and it seems like that’s not working properly for you. A session cookie problem could also cause this issue. I can’t compile detailed steps for you to try right now, but perhaps you can uncover something out of place by looking in these areas.

The command populate_kc_xform_kpi_asset_uid is not actually related to this: it’s only for making sure REST Services work with forms whose KoBoCAT id_string is distinct from the KPI asset.uid.

1 Like

Thanks @jnm,

can you please give me some pointers, things to try out, some direction? It might prove useful for others too.

I cannot find anything out of the ordinary, unfortunately.

Thanks in advance,

Sorry for the late reply, I got it all working by indeed mod-ing the dockerfile to add permissions on build. I build using docker-hub built in resources on push to git. One system is Ubuntu 18 (2.020.23) and one is Ubuntu 16 (shared-database) on Lightsail and Azure respectively, and a dev/sandbox copy in azure as well. I am sure it is the environments that are causing some issues but have been able to sort it all out eventually.

1 Like

Hi, we have installed the Kobtoool when there was no ‘kobo-install’ repository, so can you please guide or provide a link where I can get the steps to upgrade to the latest version?
Thanks.

Hi @ryanend
Have you ever installed this before and it worked? If Not, I would suggest you recheck your installation process.

Stephane

Hi @stephanealoo yes it worked and we are using it.

1 Like

Hi I’ve upgraded my existing database to Postgre 9.5 and MongoDb 3.4, how do I use the kobo-install? I dont have this folder, because we did notuse it for installation.
Also I dont want to do a fresh installation of kobo using kobo-install and later dump the data, I just want to upgrade.
Any possible solutions? Thanks.

You could update your kobo-docker setup manually and then split the Postgres database using the same script that kobo-install executes: https://github.com/kobotoolbox/kobo-docker/blob/master/postgres/primary/clone_data_from_kc_to_kpi.sh. You’d need to first pull the latest updates to kobo-docker from GitHub, then update the configuration (crucially, specifying a new database name for KPI), then run that script.

Alternatively, you could:

  1. clone the shared-database-obsolete branch of kobo-install;
    • to avoid conflicts with your existing installation, do not put clone kobo-install alongside your existing kobo-docker directory;
  2. set up kobo-install “from scratch” (setting everything to match your current install);
  3. let it start up empty of data, and make sure it works;
  4. STOP ALL CONTAINERS (both new and old);
  5. kobo-install will have created a new kobo-docker directory. inside it, rename .vols to something else;
  6. copy your old kobo-docker/.vols to the new kobo-docker directory;
  7. start kobo-install again; make sure it works;
  8. follow the instructions in the first post of this thread, beginning with

    Enter the kobo-install directory:

Whatever you do, make sure you take backups first. We’d like to help but are very busy with running our own instances and maintaining the software, so we cannot guarantee timely responses if you have trouble.

2 Likes

2 posts were split to a new topic: Manually set address space for Docker bridge

Hi @jnm thanks for the quick reply and perfectly detailed solution.
However I am facing another problem after upgrading, I can now only see the forms and the count of total submissions, but I am not able to view any previous submissions in the form. I can view the new submissions though.

I followed the exact same steps you mentioned also the ones mentioned by @nolive (thanks a lot @nolive for your answer on other post), but I could not see the old submissions/data in the newer updates.

Any views regarding this will be a great help, thanks. Also can you let us know where are submissions stored in the database? Many thanks again.