Problem with index values changing with new submissions

Dear All,

I have a problem with index values changing when enumerators send new submissions every day during the survey. What should I do about this, and how can I solve it? I rely on both the index and the parent index, so please help me with this issue.

Hi @botani,

Welcome back to the community! Would you mind sharing a screenshot of the issue so that the community can better understand your situation?

Have a great day!

Hi Kal_lam,

Here is the data collected on 29-2-2020 and 1-3-2020. You can see in the screenshot that I downloaded the data on the 29th, when each record had an index, and then downloaded the data again the next day (1-3-2020). The index values appear to have changed because of the new submissions. I do not want the index values to change when new interviews are submitted; I want the new interviews to appear at the end of the Excel file with new index values.

Hi @botani,

In this case, you could always use the _id or _uuid, which remain constant once a submission has been submitted to the server. I personally prefer the _id, as it's shorter and much more convenient than the _uuid.

As a reference, please see the _id and _uuid from my dummy project:

Have a great day!

Thanks Kal_Lam,
I know that we can use the _id, and thanks for reminding me about that. But is there any way to fix the index value, or stop it from changing, when new submissions are sent?

Hi @botani
Unfortunately there is no way to stop the index value from changing.

Stephane

Thanks a lot for your reply.

Hi All,

What is the best way to deal with the changing index field and related tables for ongoing surveys?

If you are downloading data periodically and importing it into a database, the index fields in the related tables change on each download. You therefore cannot use them in a relational database unless you clear all previous data and import the complete dataset each time (which is not ideal), or assign your own index/parent index values on each download before importing into the database…

Is this correct, or is there a way around this?

Thanks

Hi @Rusti,

Welcome back to the community! I would advise using the _id or the _uuid instead of the _index, as the _id and _uuid are constant.

Have a great day!

Thanks Kal!

I get that the _id or _uuid is constant, but those fields do not pull through into the linked tables (repeats), and in the downloaded data the _index and _parent index are the linking fields.

Is my form not set up correctly if I am not getting the _id and _uuid fields on the linked tables? They are always blank.

Alternatively, do I use VLOOKUP to get the _uuid field manually in Excel once downloaded? But this would rely on the index/parent_index field, which would be changing each time anyway?

Thanks

Hi @Rusti,

Yes, you got that correct! You should use the _index and the _parent index for merging/linking datasets with repeated groups. So, as said earlier, if you wish to use these, it's always safest to use them at the end of the survey (i.e. when the data collection is over).

Have a great day!

Thanks again Kal,
I am working with resource use surveys where data is collected monthly and imported into a database each month for overview analysis, so I need to link the tables each month and ensure the links are correct.
Is the only way around this, then, to create my own index field each time to link the related tables? Each month I would need to take the 'new records' and create an index field which is based on the ODK index/parent index but unique to my database. I could probably do this by adding the month/time to the index field.

Any other easier ideas for ongoing survey data collection?

Thanks

Hi @Rusti,

Let's see if the community has any new ideas for this.

Have a great day!

Hi @Rusti
To start with, the index will always change, as I indicated earlier.

Now to the issues you raised, which I will split into specific steps:

It is important to understand the relational nature of your database and a few principles about how the data in a repeat group sits relative to your database. To provide more clarity: in a previous discussion I indicated the workarounds that allow you to change the data. You will probably have two or more files, one for the main data and one for every repeat. My approach has been to use commands in SPSS, where I do the following:

  1. Transform the repeat data from long to wide format using the cases-to-variables (CASESTOVARS) command, with the parent key as the identifier variable in the transformation.
  2. Then merge the two sets using the primary key in the main file and the parent key in the wide format of the repeat. I use the Add Variables command under Merge Files.
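
For readers without SPSS, the same two steps can be sketched in plain Python. This is only an illustration and not the SPSS syntax described above; the column names (`_index`, `_parent_index`, `member_name`, `village`) and the sample rows are assumptions made up for the example.

```python
from collections import defaultdict


def long_to_wide(repeat_rows, parent_key, value_fields):
    """Collapse repeat rows into one dict per parent, numbering the columns."""
    grouped = defaultdict(list)
    for row in repeat_rows:
        grouped[row[parent_key]].append(row)

    wide = {}
    for parent, rows in grouped.items():
        flat = {}
        for n, row in enumerate(rows, start=1):
            for field in value_fields:
                # e.g. member_name.1, member_name.2, ...
                flat[f"{field}.{n}"] = row[field]
        wide[parent] = flat
    return wide


def merge_main_with_wide(main_rows, wide, primary_key):
    """Attach the wide repeat columns to each main row via its primary key."""
    merged = []
    for row in main_rows:
        combined = dict(row)
        combined.update(wide.get(row[primary_key], {}))
        merged.append(combined)
    return merged


# Hypothetical sample data standing in for one download.
main_rows = [
    {"_index": 1, "_id": 501, "village": "A"},
    {"_index": 2, "_id": 502, "village": "B"},
]
repeat_rows = [
    {"_parent_index": 1, "member_name": "Sam"},
    {"_parent_index": 1, "member_name": "Lee"},
    {"_parent_index": 2, "member_name": "Kim"},
]

wide = long_to_wide(repeat_rows, "_parent_index", ["member_name"])
merged = merge_main_with_wide(main_rows, wide, "_index")
```

Both files must come from the same download, because the index values are only consistent within a single export.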

Action needed from you: share a schematic of your data management flow within the app so that I can look at it.

You do not need to create your own index field; that is cumbersome, and it has already been done. I would suggest the following workaround, which should be done before you import the data into your system:

  1. Download your XLS as usual with all the sheets in it.
  2. You will find the _id or _uuid in the main sheet but probably not on the sheets representing the repeat data.
  3. Create the columns for _id or _uuid in the repeat data sheets and use VLOOKUP in Excel to fill these columns using data from the main sheet.
  4. You can record this as a macro and save it so that you can repeat the process every time you download the data.
  5. Build your data management protocol based on this data set. Note that the IDs won't change…
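
If you prefer a script to an Excel macro, the VLOOKUP step above could look roughly like this in plain Python. It is a minimal sketch under assumptions: the column names `_index`, `_parent_index`, `_id` and `_uuid` are taken to match the export, `parent_id` and `parent_uuid` are hypothetical names for the new columns, and the two downloads are invented sample data showing indexes that shifted between exports.

```python
def add_parent_ids(main_rows, repeat_rows):
    """Copy _id and _uuid from the main sheet onto each repeat row.

    The lookup goes through _index / _parent_index, so both sheets must
    come from the same download.
    """
    lookup = {row["_index"]: row for row in main_rows}
    enriched = []
    for row in repeat_rows:
        parent = lookup[row["_parent_index"]]
        enriched.append(
            {**row, "parent_id": parent["_id"], "parent_uuid": parent["_uuid"]}
        )
    return enriched


# First download (hypothetical data).
first_main = [
    {"_index": 1, "_id": 501, "_uuid": "aaa"},
    {"_index": 2, "_id": 502, "_uuid": "bbb"},
]
first_repeat = [
    {"_parent_index": 1, "item": "maize"},
    {"_parent_index": 2, "item": "beans"},
]

# Second download: a new submission arrived and the indexes shifted.
second_main = [
    {"_index": 1, "_id": 503, "_uuid": "ccc"},
    {"_index": 2, "_id": 501, "_uuid": "aaa"},
    {"_index": 3, "_id": 502, "_uuid": "bbb"},
]
second_repeat = [
    {"_parent_index": 1, "item": "rice"},
    {"_parent_index": 2, "item": "maize"},
    {"_parent_index": 3, "item": "beans"},
]

first = add_parent_ids(first_main, first_repeat)
second = add_parent_ids(second_main, second_repeat)
```

In this sample the "maize" row has parent index 1 in the first download and 2 in the second, but its `parent_id` is 501 both times, so `parent_id` can serve as the link in the database.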

Stephane

Hi Stephane,

Thanks for your input, that really helps. I do not have SPSS, and that is probably beyond my level at this stage.

Your second approach will work, and is similar to what I was thinking in my badly written post above (sorry). I will look up the _id values using the index each time and use them as the primary key in the database.

I somehow seem to have overlooked the changing index field when I started with this, but each time I have done a full import, so it has not affected my data. Going forward I would like to import only the latest data, so I will follow your suggestion to do this.

Thanks again, the support on this forum is great. Regards

Hi Rusti and everyone,

I think it would be better for Kobo staff to fix this problem. Why should we take the long way around if Kobo staff can solve it?

Delshad Botani

Hi Kal.

I am facing the same issue. I need to download the database every week, correct the submitted data, and include it in a "master" database.

I understand I can't rely on the index number for this, hence my question: is the submission _id progressive? Meaning, can I sort by _id from "smallest" to largest and expect next week's submissions to appear "after" the last (largest) _id that I had in the previous week's database?

Welcome to the community, @mealcoopiiraq! Yes, you got that correct.
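
A weekly routine built on that could be sketched as follows. It assumes, as confirmed above, that newer submissions receive larger _id values; the helper name `rows_after` and the sample rows are hypothetical.

```python
def rows_after(rows, last_imported_id):
    """Return the rows whose _id is larger than the last one imported."""
    fresh = [row for row in rows if row["_id"] > last_imported_id]
    return sorted(fresh, key=lambda row: row["_id"])


# Hypothetical download for this week; the order in the file does not matter.
this_week = [
    {"_id": 503, "village": "C"},
    {"_id": 501, "village": "A"},
    {"_id": 502, "village": "B"},
    {"_id": 504, "village": "D"},
]

# Largest _id already present in the master database.
last_imported_id = 502

to_import = rows_after(this_week, last_imported_id)
```

Here `to_import` holds only the submissions with _id 503 and 504, which can then be appended to the master database.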