Using Pull Data to Pull Data from Repeat Groups Workaround

dwaimah · October 25, 2020, 6:08pm

I have a survey form that collects data, one of my questions is to fill data about family related to the person for example, entering some informarion about family members and status, if you add one you can add another family member and so on …
When I export the data as an excel file I get those information in another sheet “second sheet”, and I can not preload them with pulldata, -I did not find a way to pull them actually- , so can anyone help me with that.

Thanks in advance

Kal_Lam · October 26, 2020, 4:56am

Maybe this post discussed previously should help you understand on merging a repeat group dataset to the main sheet:

stephanealoo · October 26, 2020, 5:03am

Hi @dwaimah

The first part of your concern is actually happening by design. Since you are using repeat groups, the questions in every repeat group would appear on their own separate sheet and linked up to the sheet containing the primary data with an index.

On your main query

You need to provide more information on how you are pulling the data so that we best understand your situation. Can you send your XLS form, on how you had tried pulling the data?

Also, you should first note that you may need to convert repeat data into a long format or conduct additional manipulations before you try using the data as the main data. Such manipulations are totally dependant on your data processing needs.

Stephane

dwaimah · October 26, 2020, 6:45am

The idea here that I want to pull the repeated questions data, as I understood so far, I need toanipulate the second sheet data which are the repeated questions and filter them first, that will make it possible to pull these data.
I used the pulldata(filename, column_name, id_key, ${key to search}) it worked for multiple choice and other data
I just do not know how to use it to get the repeated questions data
Or at least if there is any other way

stephanealoo · October 26, 2020, 7:17pm

Hi,
This is a form design issue when you need to pull data. As I indicated, you need to restructure the data in the repeat sheet to get back to the main sheet before you run pull data. The following shows how the same will look like when you convert the data.

Stephane

dwaimah · November 10, 2020, 1:21pm

Thanks for your answer, I got your point here and thanks for the structure I totally understand this idea now, still I have only 1 problem
in this example we have 3 indices with value “1” which means 3 entries after we restructure our data.
I just don’t know what kind of function I need to pull them into 3 repeats so that name_1 will go in first repeat along with Gender_1, Name_2 will go in second repeat along with Gender_2 and so on …
How you do that, I mean what functions to use, and if you can show me an example of how you manage to do that, I will be very thankful that’s all I need.
Thanks in advance

stephanealoo · November 10, 2020, 6:34pm

Hi @dwaimah
I am glad that we are on the same page as regards the conversion from long to wide format prior to pulling the data as explained in post number 4.

To specifically execute the calculation

I would provide an overall logic that should guide you in the process. I suggest that you try the implementation and then send in the XLS form with dummy data so that we can perfect it.
Logic 1: You can always get what repeat position you are in using the position(…) function
Logic 2: Using the if command within calculation you can actually change the column being pulled.

With the above 2 logics, I have created a formula for putting in the Gender and the Name for up to 20 repeat iterations. This should work for you as long as your repeats are less than 20. Please note, I have a slight workaroundCalculationGenerator_PullingDataFromRepeatGroupsinWideFormat.xlsx (11.5 KB) excel that helps me generate some of these repetitive formulae which I have attached here

The formula for gender will be
if(position(…)=1,(pulldata(‘csvfilename’,‘Gender_1’,‘uniqueID’,${householdID})),(if(position(…)=2,(pulldata(‘csvfilename’,‘Gender_2’,‘uniqueID’,${householdID})),(if(position(…)=3,(pulldata(‘csvfilename’,‘Gender_3’,‘uniqueID’,${householdID})),(if(position(…)=4,(pulldata(‘csvfilename’,‘Gender_4’,‘uniqueID’,${householdID})),(if(position(…)=5,(pulldata(‘csvfilename’,‘Gender_5’,‘uniqueID’,${householdID})),(if(position(…)=6,(pulldata(‘csvfilename’,‘Gender_6’,‘uniqueID’,${householdID})),(if(position(…)=7,(pulldata(‘csvfilename’,‘Gender_7’,‘uniqueID’,${householdID})),(if(position(…)=8,(pulldata(‘csvfilename’,‘Gender_8’,‘uniqueID’,${householdID})),(if(position(…)=9,(pulldata(‘csvfilename’,‘Gender_9’,‘uniqueID’,${householdID})),(if(position(…)=10,(pulldata(‘csvfilename’,‘Gender_10’,‘uniqueID’,${householdID})),(if(position(…)=11,(pulldata(‘csvfilename’,‘Gender_11’,‘uniqueID’,${householdID})),(if(position(…)=12,(pulldata(‘csvfilename’,‘Gender_12’,‘uniqueID’,${householdID})),(if(position(…)=13,(pulldata(‘csvfilename’,‘Gender_13’,‘uniqueID’,${householdID})),(if(position(…)=14,(pulldata(‘csvfilename’,‘Gender_14’,‘uniqueID’,${householdID})),(if(position(…)=15,(pulldata(‘csvfilename’,‘Gender_15’,‘uniqueID’,${householdID})),(if(position(…)=16,(pulldata(‘csvfilename’,‘Gender_16’,‘uniqueID’,${householdID})),(if(position(…)=17,(pulldata(‘csvfilename’,‘Gender_17’,‘uniqueID’,${householdID})),(if(position(…)=18,(pulldata(‘csvfilename’,‘Gender_18’,‘uniqueID’,${householdID})),(if(position(…)=19,(pulldata(‘csvfilename’,‘Gender_19’,‘uniqueID’,${householdID})),(if(position(…)=20,(pulldata(‘csvfilename’,‘Gender_20’,‘uniqueID’,${householdID})),‘’)))))))))))))))))))))))))))))))))))))))

FOR NAME
if(position(…)=1,(pulldata(‘csvfilename’,‘Name_1’,‘uniqueID’,${householdID})),(if(position(…)=2,(pulldata(‘csvfilename’,‘Name_2’,‘uniqueID’,${householdID})),(if(position(…)=3,(pulldata(‘csvfilename’,‘Name_3’,‘uniqueID’,${householdID})),(if(position(…)=4,(pulldata(‘csvfilename’,‘Name_4’,‘uniqueID’,${householdID})),(if(position(…)=5,(pulldata(‘csvfilename’,‘Name_5’,‘uniqueID’,${householdID})),(if(position(…)=6,(pulldata(‘csvfilename’,‘Name_6’,‘uniqueID’,${householdID})),(if(position(…)=7,(pulldata(‘csvfilename’,‘Name_7’,‘uniqueID’,${householdID})),(if(position(…)=8,(pulldata(‘csvfilename’,‘Name_8’,‘uniqueID’,${householdID})),(if(position(…)=9,(pulldata(‘csvfilename’,‘Name_9’,‘uniqueID’,${householdID})),(if(position(…)=10,(pulldata(‘csvfilename’,‘Name_10’,‘uniqueID’,${householdID})),(if(position(…)=11,(pulldata(‘csvfilename’,‘Name_11’,‘uniqueID’,${householdID})),(if(position(…)=12,(pulldata(‘csvfilename’,‘Name_12’,‘uniqueID’,${householdID})),(if(position(…)=13,(pulldata(‘csvfilename’,‘Name_13’,‘uniqueID’,${householdID})),(if(position(…)=14,(pulldata(‘csvfilename’,‘Name_14’,‘uniqueID’,${householdID})),(if(position(…)=15,(pulldata(‘csvfilename’,‘Name_15’,‘uniqueID’,${householdID})),(if(position(…)=16,(pulldata(‘csvfilename’,‘Name_16’,‘uniqueID’,${householdID})),(if(position(…)=17,(pulldata(‘csvfilename’,‘Name_17’,‘uniqueID’,${householdID})),(if(position(…)=18,(pulldata(‘csvfilename’,‘Name_18’,‘uniqueID’,${householdID})),(if(position(…)=19,(pulldata(‘csvfilename’,‘Name_19’,‘uniqueID’,${householdID})),(if(position(…)=20,(pulldata(‘csvfilename’,‘Name_20’,‘uniqueID’,${householdID})),‘’)))))))))))))))))))))))))))))))))))))))

I do hope the above helps you in the process.

Stephane

wroos · November 11, 2020, 9:15pm

Hello @stephanealoo,
Is there really no other option to pull the related data?? For ex. renaming the sheet, using index and parent_index, creating a combined ID …
Every row (of the repeat) represents all needed variables for the roster member. So there should be a way to pull it out, without transfering the matrix.
(I think index value equals position.)
Kind regards
Wolfgang

stephanealoo · November 13, 2020, 9:47pm

Hi @wroos
Definitely, that would be one approach. However, remember you still get to the situation with the other sheet in a wide format. Since there is not a standard documented approach for this advanced form manipulation, we tend to provides what has worked previously. In most of the longitudinal household surveys, I prefer to use the wide-format which works. I would suggest that @dwaimah tries the approach as you are indicating and if he finds an issue he can reach out for more support.

Stephane

wroos · November 13, 2020, 10:39pm

Dear @stephanealoo,
As far as we evaluated, a wide-format approach creates sever disadvantages for statistical analysis (e.g. with SPSS), like aggregates, means, sums etc. for (individual) household members, The more different household sizes and the more individual questions/variable woe have to treat with, the more heavy becomes a wide-format. (And we should also void to organise standard statistics methodology too much on tool restrictions.)

Side note: I would appreciate if the community and the guidelines would discuss more about such structural design aspects (and lessons learnt) for surveys with KoBo.
Kind regards

dwaimah · November 17, 2020, 10:16am

Dear @stephanealoo,
It seems like I have a problem each time I do the pulling or past the code
I have set the names of variables and csv file and so on, made sure of everything.
Still as soon as I upload the form I, it uploads well, but can’t deploy it I get this message
063

When I remove the pulldata from your code above it works and deploys, I don’t know maybe something with the brackets ? I tried many solutions nothing worked for me.
Another simple question should I do the position(…) formula as in this pic to use along with the pulldata formula you have illustrated? or the place is worng?

Thank you again

stephanealoo · November 17, 2020, 6:27pm

Hi @wroos

You are definitely right when it comes to analysis especially for individual members. When you focus on longitudinal data collection then wide format would make sense for storage. But once again, this is purely dependent on the data management capability of the user. I also think that we will not be able to avoid wide and long format discussions. It should definitely be a good approach to have user experience in managing the wide and long format data.

@dwaimah on the issue you had indicated, this is an error with your formular. Can you test the form here

Stephane

dwaimah · November 17, 2020, 6:59pm

I understand your point, but the error just disappears as I remove the new if(pull…) I mean it causes the problem but why the thing is why!

stephanealoo · November 17, 2020, 7:06pm

Can you paste the entire formula here so that I look at it. Also I noticed that the columns on your cvsfile have the period mark i.e. fullstop. Could you try doing this with underscore instead and see if this sorts the issue out?

wroos · November 17, 2020, 9:12pm

Dear @stephanealoo,
Thanks!
Additional question is still open, please see above).

Solution ideas would be appreciated.
Kind regards

dwaimah · November 18, 2020, 11:54am

@stephanealoo I used the same formula you pasted here which is
if(position(…)=1,(pulldata(‘csvfilename’,‘Name_1’,‘uniqueID’,${householdID})),(if(position(…)=2,(pulldata(‘csvfilename’,‘Name_2’,‘uniqueID’,${householdID})),(if(position(…)=3,(pulldata(‘csvfilename’,‘Name_3’,‘uniqueID’,${householdID})),(if(position(…)=4,(pulldata(‘csvfilename’,‘Name_4’,‘uniqueID’,${householdID})),(if(position(…)=5,(pulldata(‘csvfilename’,‘Name_5’,‘uniqueID’,${householdID})),(if(position(…)=6,(pulldata(‘csvfilename’,‘Name_6’,‘uniqueID’,${householdID})),(if(position(…)=7,(pulldata(‘csvfilename’,‘Name_7’,‘uniqueID’,${householdID})),(if(position(…)=8,(pulldata(‘csvfilename’,‘Name_8’,‘uniqueID’,${householdID})),(if(position(…)=9,(pulldata(‘csvfilename’,‘Name_9’,‘uniqueID’,${householdID})),(if(position(…)=10,(pulldata(‘csvfilename’,‘Name_10’,‘uniqueID’,${householdID})),(if(position(…)=11,(pulldata(‘csvfilename’,‘Name_11’,‘uniqueID’,${householdID})),(if(position(…)=12,(pulldata(‘csvfilename’,‘Name_12’,‘uniqueID’,${householdID})),(if(position(…)=13,(pulldata(‘csvfilename’,‘Name_13’,‘uniqueID’,${householdID})),(if(position(…)=14,(pulldata(‘csvfilename’,‘Name_14’,‘uniqueID’,${householdID})),(if(position(…)=15,(pulldata(‘csvfilename’,‘Name_15’,‘uniqueID’,${householdID})),(if(position(…)=16,(pulldata(‘csvfilename’,‘Name_16’,‘uniqueID’,${householdID})),(if(position(…)=17,(pulldata(‘csvfilename’,‘Name_17’,‘uniqueID’,${householdID})),(if(position(…)=18,(pulldata(‘csvfilename’,‘Name_18’,‘uniqueID’,${householdID})),(if(position(…)=19,(pulldata(‘csvfilename’,‘Name_19’,‘uniqueID’,${householdID})),(if(position(…)=20,(pulldata(‘csvfilename’,‘Name_20’,‘uniqueID’,${householdID})),’’)))))))))))))))))))))))))))))))))))))))

I didnt even change anything in it, I have just changed my csv file name ,uniqueID and householdID accordingly to fit the formula for testing purpose.
I have change the “,” in the csv file in case it was a reason and replaces it with “_”, but the problem that KOBO do not even deploy it duo to the error I showed you earlier.
But whenever I remove the pull you showed me, it works and deploys well, which means the error occurs from the formula you gave me I don’t know I tried to make sure of the brackets and so on, found nothing wrong
Looks like KOBO is not taking it in the form
Anything I can do ?

stephanealoo · November 18, 2020, 8:28pm

Hi @dwaimah
I will have a look at it and see what is the issue.

@wroos I agree this is still open and I will be looking at an alternative approach to pull the data.’

Stephane

dwaimah · November 22, 2020, 6:22am

Hi @stephanealoo,
Did you find anything so far?
Could it be the syntax? or might it be something else

stephanealoo · November 23, 2020, 7:35pm

Hi @dwaimah
After a previous attempt by one of the users @caneeraj, I believe this is an issue brought by one of the syntax errors. I suggest that you try working with the formula below

CalculationGenerator_PullingDataFromRepeatGroupsinWideFormat_Amended.xlsx (11.5 KB)

Stephane

wroos · November 23, 2020, 7:59pm

Hello,
Pay attention: It must be position( 2 dots ). Seems that the editor system.here it producing 3 dots.
This editor behavior should better be changed in general.