For capturing baseline data and recurrent updates, we have created a questionnaire (xlsform) which has become quite complex by now. There is a lot of skip logic and repeats – for instance, capturing either data on a school or data on a ECD centre, as well as asking for the number of preprimary classes and then for enrolment numbers of each of them.
However, as a bloody beginner in Power BI, I wonder whether the following problem is just due to lack of Power BI skills or whether there is a problem in the way Power BI is communicating with kobotoolbox when the power query is done.
Two symptoms bother me most:
Missing fields: Due to the many skip logic patterns (“tree” with many “branches”), I would have to fill out many, many forms to ensure that each and every question is answered at least once (and doesn’t remain skipped/empty in ALL questionnaires). For testing purposes, I filled only a few and it seems that only those fields/columns appear in Power BI which were not empty in the moment when I did the first power query (=when I connected via API as per youtube video). If I fill out another form and answer so far unused questions, the counter of questionnaires increases, but the new questions don’t appear in the Power BI table. Do I overlook some refresh function or something like that?
Just one table: If I understand the Power BI tutorials correctly, repeated questions from kobotoolbox are shown in separate tables on the modelling page. But this is not the case for our questionnaire: The API connection brings all the columns into just ONE table. While I believe that the reference feature allows to extract columns into new tables, it looks like an awful lot of work – which by the way is meaningless if problem #1 (missing columns) is not solved beforehand. Any idea how to do the “import” differently?
…looks like my theory is wrong (which was that those fields which were not filled with data in the moment of setting up the connection are not included even if filled with data later). It is rather that this applies on group level: If I create a new Power BI file in a moment that a certain group has been skipped in all completed questionnaires so far, all the questions in that group will not be added once I complete a questionnaire which answers these questions…
So the problem IS indeed the not fully functioning API connection.
Any hints how to further analyse and eventually solve this problem?
It seems that when connecting to the data on the kobotoolbox server via API (for instance from Google Data Studio, MS Power BI, etc.), only those fields/columns will be “synchronized” which contain data. If a skipped question will be answered only in a later questionnaire, is there any chance to include it without modelling the whole data again and again?
@york_rff
To better understand if this issue is an API issue or a PowerBi issue, we need you to do the following checks:
Using the API link, could you please get your data i.e. post the link on a browser and see which data comes out. Inspect the data to see if the skipped question actually has some data.
If the skipped question has some data, then the issue is with PowerBI which would mean you have to work on a workaround i.e. ensuring that you only set up with a complete form in all its response-filled scenario. You can always do a dummy with no question skipped ensuring that all the questions are included in the very first setup.
If the inspection in 1 above shows that the data is missing, then this is a definite bug which you should notify us then we look at it in detail.
Very interesting approach to find out more, thanks a lot!
I copied my xlsform, changed its form_title, deployed it, filled in one questionnaire, avoiding the “enrolment” section, and sent if off. Then I created a PowerBI file and in parallel entered the same URL directly into a browser. Then I filled in a second questionnaire, this time entering enrolment data, and sent it off again. As observed earlier, nothing happened in PowerBI after refresh. So I entered the same URL in a second browser window and – same result (= no enrolment fields)!
This feels like a discovery, although I can’t really imagine what this means. If I enter the URL in a browser, how can something else be shown as it is actually on the server in that moment?
Hi @york_rff
Thanks for that further check, in essence, the behavior you are describing indicates that this is a potential issue with the API. Let us have a look at this and we will get back to you. If we confirm that this is a bug, we will send you a link to the bug report and you can monitor the progress of the resolution.
Sorry, something went wrong yesterday: I just discovered that the second questionnaire (the one with the additionally used fields) didn’t get through. After I now succeeded to sent such a second questionnaire, typing the URL directly into a browser resulted in a much longer list of fields/data. So, the issue might not be as clear as it seemed for a moment…!?
In the meantime would you mind trying out the following:
Step 1: login to your account. Either kc.humanitarianresponse.info or kc.kobotoolbox.org Step 2: Share Data Publicly: Yes Step 3: View data in table
Ensure, value and header format: Labels
Step 4: Copy URL and use it in excel
or, copy and edit the following link:
Where,
“USERNAME” is your login username
“FORMID” is the 22 character (it should see something like: aL7auvitUbKpdznfJuQi5U) that you should see in your survey link when you open the survey project.
Load your Power BI with the above API and then you should be able to get the data updated to your Power BI every time you refresh your data.
Kindly please let us know, this should solve your issue.
Thanks for this suggestion, I tried to follow it diligently.
Still from that test with the two filled forms (the first avoiding the enrolment section, the second using it), I created a csv file using this URL: https://kc.kobotoolbox.org/york_rff/reports/aBGSuQsyd2zdnv8mUQSxmx/export.csv – and after opening it in Excel, I found lots and lots of unused fields, BUT NOT any of the enrolment data…!?
Is it maybe that I have too many sub-sub-groups (third: level: group in a group in a group)? Puzzled…
I tried to confirm. And indeed, it exactly looks like this: whatever questions are between repeat loops, are not shown.
Here are the lines from what I see after the csv export (as you described it above), followed by the respective lines from the xlsform (the term “Grade” comes through a variable for country-specific terminology):
Number of ${term_grade} 3 classes:
Number of ${term_grade} 2 classes:
Number of ${term_grade} 1 classes:
integer school_classes_g3 Number of ${term_grade} 3 classes:
begin repeat school_class_names_g3 ${term_grade} 3 classes names
calculate counter_names_g3
text school_class_name_g3 Name of ${term_grade} 3 class #${counter_names_g3}
end repeat
integer school_classes_g2 Number of ${term_grade} 2 classes:
begin repeat school_class_names_g2 ${term_grade} 2 classes names
calculate counter_names_g2
text school_class_name_g2 Name of ${term_grade} 2 class #${counter_names_g2}
end repeat
integer school_classes_g1 Number of ${term_grade} 1 classes:
begin repeat school_class_names_g1 ${term_grade} 1 classes names
calculate counter_names_g1
text school_class_name_g1 Name of ${term_grade} 1 class #${counter_names_g1}
end repeat
So your concern is that you are able to update/refresh the data if it does not have a repeat group but in cases where you have a repeat group data it is not showing in Power BI?
Yes, the problem is that if questions inside a repeat group get not filled before using the API for the first time, later use of these questions will be ignored in the power query (or even in a csv export).
Maybe if you wish, you could make some dummy entries as well:
https://ee.kobotoolbox.org/x/w2cfGvN8
I could update the records (for both the household level as well as the roster information) once i press the refresh button.
Maybe this video tutorial prepared by @janna would help you solve your issue. Though the video pulls the data from an Excel file, you could still connect your data with the KoBoToolbox server:
Thanks. Are you sure that you tested it with AT FIRST no entries in the repeated questions, then established the API and THEN enter data in the repeated questions?
By the way, if I follow the recommended video (which I consulted several times before), during the get data procedure no “Navigator” windows pops up with SEPARATE tables.
I am not sure what this exercise/test was about, but I found out that the csv-file is only containing the “surface level” (but not the repeats).
Anyway, I created a completely new questionnaire, asking for numbers of four types of household members and then asking one or two repeated questions on each of them (four times “begin repeat” to “end repeat”.)
In the first two questionnaires, I left the number of two types blank (so no additional questions on them). Then I did the Power BI query and saw only data on those two used groups.
Then I filled out another questionnaire, this time saying 2 for one of the previously unused types and answered the repeated questions.
Again:
the additionally used types (and the repeated follow-up data) don’t appear, evon not when starting a completely new Power BI file and establishing the API again
the data of the repeated questions are hidden in “Lists” and there is no automatic split into main table and sub-tables
By the way: If I export the same data to excel and get data from excel in Power BI, everything is there (even in separate tables).