From KoBo to Stata (new solution)

“odkmeta” can also be used as an alternative to kobo2stata.

Hi @bernieseville,

Could you kindly share more information on odkmeta? The community would love to hear and learn more about it.

Have a great day!

Thanks for suggesting this. I believe users can have a look at this and then see if it helps on their end.

Thanks, @stephanealoo, for sharing the link. odkmeta is a straightforward command. To install it in Stata, run “ssc install odkmeta”.
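
For anyone who wants to try it, here is a minimal sketch of the odkmeta workflow in a do-file. All file names are hypothetical, and it assumes the survey and choices sheets of the XLSForm have each been saved as a CSV file. Check “help odkmeta” for the full list of options.

```stata
* Install odkmeta from SSC (one-time step)
ssc install odkmeta

* Hypothetical file names: kobo_data.csv is the exported data,
* survey.csv and choices.csv are the XLSForm sheets saved as CSV
odkmeta using "import_kobo.do", ///
    csv("kobo_data.csv") survey("survey.csv") choices("choices.csv") replace

* odkmeta writes a do-file; running it imports and labels the data
do "import_kobo.do"
```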

You can also use Stata’s “import delimited” command together with your Kobo API URL, e.g. https://kc.xxxxxxxxxxxxxxx/api/v1/data/xxxxxxx.csv
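
A rough sketch of what that could look like. The URL is the placeholder from this post, so the x's need to be replaced with your own server and form ID. It also assumes the CSV can be read without logging in. A private project will probably need authentication, which this sketch does not handle.

```stata
* Placeholder URL from the post above; replace the x's with your own values
import delimited "https://kc.xxxxxxxxxxxxxxx/api/v1/data/xxxxxxx.csv", clear varnames(1)

* Quick check of what was imported
describe
```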


I’m facing an error with this command.

I downloaded my data in Excel and used the kobo2stata command. I want to use the “name” column for the variable names in Stata. In the choices tab this column only has numbers, but in the survey tab I have alphanumeric variable names, for example s1_q1.

When I run the kobo2stata command with surveylabel(“name”) choicelabel(“name”) it gives an error “type mismatch”.

Another problem is that my data is in the form of strings! Even yes/no questions are appearing as strings even though these are coded as 1/0 in my choices tab.

Any help in this regard would be appreciated.

Sounds like you may have selected the wrong value/header format when exporting your data from Kobo. Please make sure to follow exactly the instructions provided in the “Important remarks on generating input files in KoboToolbox” section of the kobo2stata help file (type “help kobo2stata” in Stata to access the help file).
Also, the surveylabel and choiceslabel options should refer to the “label” columns of your xlsform, not the “name” columns.


Thanks for the quick response.

I downloaded the raw data and the XLS form according to the steps stated in the Stata help file. I do not want my Stata variable name to be the “label” because those are very long. Instead I would like my Stata variable name to be the “name” which is short for example s1_q1, s1_q2 and so on.

The categorical variables (multiple choice questions) are appearing as strings in my Stata dataset. For example in my choices tab I have 1 for Yes, 2 for No, 3 for Don’t Know. I want these numbers 1,2,3 in my dataset rather than having “Yes” “No” “Don’t Know” appearing as strings.

Any idea what I’m doing wrong here?

Kobo2stata will by default consider the name column of your xlsform as the variable names in Stata - exactly as you want it. The surveylabel option of the command relates to variable labels. I suggest you just follow the standard approach described in the help file, and you should get the result you want.
On string variables: are you sure that what you are seeing in Stata’s data browser are actually string variables (red text), and not labelled numerical variables (blue text)?
If you require further help, feel free to reach out to me via the email stated at the bottom of the kobo2stata help file, providing me with your XLSForm and dataset, and I’ll be happy to look into it. (NB: if you are concerned about data confidentiality, please note I don’t need your full dataset. Just the first two lines, one with the column headers and one with a single sample observation, are usually sufficient to identify problems.)
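
To check the string-versus-labelled question without relying on the colours in the data browser, something like the following should work. The variable name s1_q1 is just the example from the post above.

```stata
* Storage type str# means a string; byte/int/long/float/double means numeric
describe s1_q1

* Value labels defined in the dataset, if any
label list

* For a labelled numeric variable, nolabel shows the underlying codes (1, 2, 3)
tabulate s1_q1, nolabel
```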

Thank you for the response.

I tried the same procedure again after removing the second language labels (removed label::urdu column) and it is working perfectly fine.

Does this mean that kobo2stata doesn’t work when the form has multiple languages? I have the default language English and an added label column for Urdu.

Update: I downloaded the data (XML values and headers) and the XLS form from Kobo. After downloading, I removed the second language label column and only kept 1 label column in my XLS form (“label”). After that I ran the kobo2stata command and it’s working fine for me now.

I suppose the error is due to the second language.

Glad everything worked well in the end.
Kobo2stata is capable of handling xlsforms with multiple languages. That is exactly what the (optional) surveylabel and choiceslabel are there for. If your xlsform contains only one label column, you don’t need them at all. If you have multiple languages, you can tell kobo2stata which one to focus on. For example (and as specified in the help file): surveylabel(“Label::English”) choiceslabel(“Label::English”)

Thank you for the help. kobo2stata is a fantastic package.

Hi Felix,

First, thank you so much for this command; it is amazing. I used it recently and it worked. However, I am now trying to use it again with a similar database and XLSForm, but Stata gives me the following error:

“= replace label=subinstr(label,char(10)," ",.)
- local numofchoices=_N
- foreach num of numlist 1/`numofchoices' {
= foreach num of numlist 1/1617 {
invalid numlist has too many elements”

I don’t know how to fix it. I have tried changing the file, but it does not work. I would appreciate your help.


Hi Estefania,
It looks like on the choices tab of your xlsform, you have one or several value sets that are very long - exceeding 1600 values? If you remove this value set (or at least the 17 rows in excess of 1600) before running kobo2stata, it should run through.
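
One way to see how many values each value set has is to count the rows per list on the choices sheet. This is a sketch with a hypothetical file name, and it assumes the list column header is list_name (adjust the variable name if your form uses a different header).

```stata
* Hypothetical file name; reads the choices sheet of the XLSForm
import excel using "kobo_form.xlsx", sheet("choices") firstrow clear

* Ignore blank separator rows
drop if missing(list_name)

* Number of values in each value set
bysort list_name: generate n_values = _N

* Show one row per value set, largest first
egen byte first_row = tag(list_name)
gsort -n_values
list list_name n_values if first_row
```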


I did what you suggested and it worked!
I would love for the command to handle more than 1600 options in choices in the future. I have occasionally worked with very long surveys, so it would be fabulous.
Do you think this would be possible in the future?

Thanks again for your help!

Glad to hear it worked.
Just to be clear, kobo2stata can handle a choices tab with more than 1600 rows - it is only when a single value set in the choices tab contains more than 1600 values that the error occurs.
This 1600 values limit is not so much a limitation of kobo2stata but rather of Stata itself (to be more precise, of Stata’s “numlist” function which kobo2stata depends on). A workaround would be to depend on Stata’s “forvalues” instead, but it would require a major rewrite of kobo2stata, which I’m afraid I won’t have the time for at the moment. In any case, I will make this limitation explicit in the help file when I next update it, to make sure users know how to solve the issue where it occurs.
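
To illustrate the difference, here is a small standalone sketch (not the actual kobo2stata code). The count 1617 is taken from the error message above. forvalues steps through the range directly instead of expanding a numlist first.

```stata
* Illustrative only: 1617 is the count from the error message above
local numofchoices = 1617

* This is the pattern that failed with "invalid numlist has too many elements":
*   foreach num of numlist 1/`numofchoices' { ... }

* forvalues loops over the same range without building a numlist
local counter = 0
forvalues num = 1/`numofchoices' {
    local ++counter
}
display "iterations: `counter'"
```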

Oh, that is good to know! But maybe I missed something when running the code, because in my case no single value set in the choices tab contains more than 1600 values; the largest has 1100. So I counted: the number the code reports in the numlist (1617) is the same as the total number of values on the choices sheet, that is, the sum of all value sets. Am I missing something when I run the code?


I have Stata 13 and am unable to use the kobo2stata command (see attached screenshot). Is there anything I can do to make it work?

Thanks a lot!

Hi Zara,
Unfortunately kobo2stata will only run on Stata 14 or newer, as it relies on some core Stata functions that were only introduced in that version. FYI some discussion around this here: Introducing kobo2stata (new on SSC)
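
If you are not sure which version you are running, a quick check along these lines can be put at the top of a do-file. The error message text is only an example.

```stata
* c(stata_version) holds the version number of the running Stata
display c(stata_version)

if c(stata_version) < 14 {
    display as error "kobo2stata needs Stata 14 or newer"
}
```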