Introducing kobo2stata (new on SSC)


Dear Kobo users (and especially the Stata users among you),

I am pleased to announce that a new Stata module - kobo2stata - is now available from the SSC server. Many thanks to Kit Baum who maintains the SSC.

kobo2stata creates labelled Stata datasets from KoboToolbox. It combines the information contained in KoboToolbox’s raw data file and the XLSForm. The main focus is on applying the variable labels and value labels from the XLSForm to the data. There are also some secondary functions, e.g. the removal of HTML tags from labels, or the (optional) removal of note-type variables.
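
For readers who want a feel for what this labelling step involves, here is a minimal Python sketch of the idea. It is not the kobo2stata implementation (which is written for Stata). The helper names `strip_html` and `build_value_labels` and the sample rows are hypothetical; only the `list_name`, `name` and `label` columns come from the XLSForm "choices" sheet.

```python
import re

# Naive tag matcher: good enough for simple markup such as <b>...</b> or
# <span style="...">...</span>; it is not a full HTML parser.
TAG_PATTERN = re.compile(r"<[^>]+>")


def strip_html(label: str) -> str:
    """Remove HTML tags from a label and collapse leftover whitespace."""
    without_tags = TAG_PATTERN.sub("", label)
    return " ".join(without_tags.split())


def build_value_labels(choices_rows):
    """Group rows of the XLSForm 'choices' sheet into {list_name: {code: label}}."""
    labels = {}
    for row in choices_rows:
        cleaned = strip_html(row["label"])
        labels.setdefault(row["list_name"], {})[row["name"]] = cleaned
    return labels


# Hypothetical sample rows standing in for a real 'choices' sheet.
choices = [
    {"list_name": "yesno", "name": "1", "label": "<b>Yes</b>"},
    {"list_name": "yesno", "name": "0", "label": "No"},
]
value_labels = build_value_labels(choices)
```

The resulting dictionary corresponds conceptually to a set of Stata value labels, one per choice list, which can then be attached to the matching variables in the raw data.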

To try it out, just type “ssc install kobo2stata” in Stata. Once installed, you can type “help kobo2stata” for detailed explanations.

This is a new module, so if you experience any problems or have suggestions for improvement (or just want to say that it works great), I would love to hear from you via email at

All the best,


Thanks for this. This package is really helpful in automating the analysis in Stata. However, I have noted the following limitations with the package:

  1. The package does not run on Stata 13 and earlier versions.
  2. Choice lists with more than 9 options are not fully labeled. Only options 1 to 9 are labeled.
  3. Variable names longer than 32 characters are truncated (no problem with this), but their variable and value labels are not applied (which means additional work recoding them manually).
  4. The note variables are not removed from the variable list despite explicitly adding the option "dropnotes" to the command.
  5. Finally, this is not a limitation but a suggested functionality. Is it possible to provide login and file credentials for the Kobo account on the fly, so that Stata can connect to the Kobo account and get both the Excel data file and the XLSForm? This would really save time and enable full automation of the real-time analysis of collected data.

Otherwise, thanks for the effort you are putting in to ease the lives of people doing data analysis work!

Dear Stephen,

Thank you so much for trying out kobo2stata and providing these very thoughtful comments and suggestions.

A few first thoughts on the important points you raise:

  1. It is true that I have limited kobo2stata to Stata 14 and above. However, this is not because of a confirmed incompatibility with earlier versions but only due to my inability to validate compatibility with them, since I don’t have access to earlier versions of Stata. In fact, I believe kobo2stata might work with all versions down to Stata 12 (but not below, due to Stata handling Excel imports differently up to version 11). There are two possible ways of overcoming this limitation: (a) I can open kobo2stata for use with Stata 12+, but add a cautionary message that compatibility with older versions of Stata has not been confirmed, or (b) if someone could test run kobo2stata on Stata 12 and 13, I could open it up without warning messages.

  2. This is odd. The behaviour you describe was neither observed in my own test runs nor by any of the beta testers. I would be happy to look into your specific case and identify the source of this problem. If interested, please send your XLSForm and raw data file to the email address stated above (NB: if you are concerned about data confidentiality, please note I don’t need your full dataset. Just the first two lines - one with the column headers and one with a single sample observation - is usually sufficient to identify problems).

  3. This is true, and a result of Stata’s limitation of variable names to 32 characters. I have noted this down for fixing in future versions of kobo2stata.

  4. This is odd. The behaviour you describe was neither observed in my own test runs nor by any of the beta testers. As mentioned above, happy to look into your specific case if you can send me your input files.

  5. This is an excellent idea, but I am not sure whether it is possible since I am unfamiliar with the backend of KoboToolbox for exporting data files. Essentially, if there is a permanent URL (of the form http://… or https://…) from which kobo2stata can pull the input files, then I can program kobo2stata to access them directly from the web. However, my own impression as a Kobo user is that the export files are not generated “live” but rather are generated only “upon user request” - in which case I don’t think kobo2stata can trigger the export of the required files. Perhaps one of the Kobo developers active in this forum can advise?


I’ve linked a Power BI dashboard to KoBo toolbox as shown here;
I believe that the API essentially queries the KoBo server to create a new CSV each time I refresh my dashboard, and indeed the URL is stable. It might be possible to use this for this purpose as well!


Many thanks, Jonathan, for the very interesting pointer! Perhaps @tinok - who authored that page - can shed some light on whether there is a way to get URL-based access to the raw data Excel file without triggering the “refresh/update to latest version” from within PowerBI?


Following the useful feedback from Stephen above, I have just submitted an updated version (v1.04) of kobo2stata to the SSC. This should go online in a few days; please make sure to run “adoupdate” in Stata to get the latest version.

Updates (v1.04):

  • Moved the “multi” option to the core functionality of kobo2stata. It is no longer necessary to specify this option in order for select_multiple items to get labelled - this is now default behaviour.
  • Fixed a bug that caused kobo2stata to misbehave where the Kobo data contained variable names or value label names exceeding 32 characters (NB: this was also the cause of Stephen’s issues 2&4). Also made the consequences of overly long variable and value label names explicit in the help file.
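
To illustrate why names longer than 32 characters are troublesome, here is a small Python sketch of one way such a mismatch could arise. The thread does not describe the actual cause inside kobo2stata, so this is an assumption for illustration only; the variable name and label are made up.

```python
STATA_NAME_LIMIT = 32  # Stata variable names may be at most 32 characters long


def truncate(name: str) -> str:
    """Shorten a name the way a 32-character limit would."""
    return name[:STATA_NAME_LIMIT]


# Hypothetical XLSForm entry whose name is 40 characters long.
long_name = "household_member_education_level_highest"
form_labels = {long_name: "Highest education level"}

# After import, the dataset only knows the truncated name.
name_in_dataset = truncate(long_name)

# Naive lookup: the truncated name is not a key of the form's label table.
naive_label = form_labels.get(name_in_dataset)

# Possible remedy: truncate the form's names the same way before matching.
labels_by_truncated_name = {truncate(k): v for k, v in form_labels.items()}
matched_label = labels_by_truncated_name.get(name_in_dataset)
```

Note that truncating on both sides can still go wrong when two long names share their first 32 characters, which is one reason to keep names short in the form itself.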

Finally, please note that further testing (thanks Stephen!) has found that kobo2stata is not compatible with older Stata versions. The minimum requirement will therefore remain Stata 14.


I’m not sure I understand the requirement completely. The PowerBI instructions request a CSV file that is generated on each request. An XLS file can be generated the same way simply by changing the last three letters of the URL from csv to xls. This uses the kobocat API, which doesn’t give access to versioned data or data labeled with whatever languages were used in the form design. The newer kpi API doesn’t use such live links, only asynchronous downloads that are requested through the API (see this small sample script).
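
As a rough illustration of the csv-to-xls switch described above, here is a short Python sketch. The function name `csv_url_to_xls` and the example URL are hypothetical placeholders, not a real KoboToolbox endpoint; substitute the stable link from your own account.

```python
def csv_url_to_xls(url: str) -> str:
    """Swap the trailing 'csv' of an export link for 'xls'."""
    if not url.endswith("csv"):
        raise ValueError("expected a URL ending in 'csv'")
    return url[:-3] + "xls"


# Hypothetical placeholder URL, for illustration only.
csv_link = "https://example.org/hypothetical/export/data.csv"
xls_link = csv_url_to_xls(csv_link)
```

Only the last three characters change; the rest of the link, including any project identifiers, stays exactly as it was.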


Only a few clicks and I was done redoing my coding for Stata. This is truly helpful; thank you for this wonderful add-on.