Case Sequence (_index) changed on Data Download - Different to Table View

Hello KoBo Team,
We are encountering a problem (a bug?) with data download. In all file formats (XLS, XML, CSV), the case order has changed and now differs from the order in the table view.

End of table view

Beginning of export file

Normally the cases in the export files appear in the reverse order of the KoBo table view. But now we have a case which appears at index position 1. As a consequence, all other indexes are shifted by one. (This differs from our previous exports!)

As far as we can see, there is no reason for this at the data level. For example, the misplaced case at position 1 was started, ended and sent later than the cases that follow it. Also, no editing has been done (so far) at the server level.

Can anyone explain this? It might have arisen after the server problems in January (but we didn’t lose any projects or data at that time).
What should be the KoBo case order in the export file?

The order in the table view seems to follow the end (datetime) metadata.

In the table view, this case (appearing at position 1 in the export) is somewhere in the middle (page 3 of 7). In the download, however, it shows up at position 1. (Screenshot updated):

The recent change in the export order / _index is a significant problem for us, as we used previous downloads to specify cleaning issues.

@wroos, could you also share the last screenshot in the same way as the first one, so that we can see the entire page as in your first image? This would be helpful for us.

Hi @wroos
Is this issue localized to one project, or does it affect all your projects?

Stephane

Hello @stephanealoo, @Kal_Lam
I updated the screenshot (above).

I think the first question is to understand the algorithm in the code: How does KoBo sort the cases in the download?

A test with a second project (same account) was OK (but its submitted data are somewhat older).
One difference from the second project is that, in the first project, _submission_time is not strictly sequential, especially around the misplaced case (see table screenshot). The table order in the first project, which is based on the end time, therefore partly differs from an order based on _submission_time.

We also repeated the download several times at different times to rule out internet connectivity effects. All of them showed the same issue, i.e. the mixed-up order in the export reported above.

We don’t think the problem is related to the server data losses, as they were announced for a later time: “projects that were lost after 07:00 UTC on 22 January. More specifically, our logs show the first effects of this problem on … OCHA at 07:32.”

Thank you for bringing this to our attention, @wroos. I will make a random check with some of my dummy entries and see if I can reproduce your issue. In the meantime, could you also let us know the server you are using?

Could you check whether _submission_time and _id are related? The sole purpose of _index is to link repeating groups with their parent submission.
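
One way to run this check on a downloaded export is sketched below. It is only an illustration: the helper name `out_of_order` and the sample rows are made up, and it assumes that _id and _index are whole numbers and that _submission_time is an ISO-style timestamp such as `2021-02-01T09:00:00`.

```python
from datetime import datetime


def out_of_order(rows, sort_field, check_field, parse=int):
    """Sort rows ascending by sort_field (read as an integer) and return the
    _id of every row whose check_field is lower than the previous row's."""
    ordered = sorted(rows, key=lambda r: int(r[sort_field]))
    flagged = []
    for prev, cur in zip(ordered, ordered[1:]):
        if parse(cur[check_field]) < parse(prev[check_field]):
            flagged.append(int(cur["_id"]))
    return flagged


# Made-up sample rows; in practice they could be read with csv.DictReader.
rows = [
    {"_id": "101", "_index": "2", "_submission_time": "2021-02-01T09:00:00"},
    {"_id": "102", "_index": "3", "_submission_time": "2021-02-01T09:05:00"},
    {"_id": "103", "_index": "1", "_submission_time": "2021-02-01T09:10:00"},
]

# Does _submission_time rise together with _id?
time_breaks = out_of_order(rows, "_id", "_submission_time", datetime.fromisoformat)
# Does _index rise together with _id?
index_breaks = out_of_order(rows, "_id", "_index")
```

With these sample rows, `time_breaks` is empty while `index_breaks` contains 103, i.e. the submission whose _index jumped to the front even though its _id and _submission_time are the latest.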

Hello @Kal_Lam, @stephanealoo
The server is OCHA. (We are working in a refugee context.)

Could you tell us, please:

I will send you a data extract of the reported example by private email.
Findings, as far as we can see so far:

  • _submission_time and _id are related. But there are 2 cases where the index sequence is different to this.
  • The download order (based on _index) is totally different to the order in the table view, which follows the end (metadata) time, in descending order.
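
To compare the two orders side by side, the export rows can be re-sorted the way the table view appears to sort them. The sketch below assumes, as observed above, that the table view is ordered by the end metadata in descending order; the function name `table_view_order` and the sample rows are invented for illustration, and the timestamps are assumed to be ISO-style with a UTC offset.

```python
from datetime import datetime


def table_view_order(rows):
    """Return the rows sorted newest-first by their `end` timestamp."""
    return sorted(rows, key=lambda r: datetime.fromisoformat(r["end"]), reverse=True)


# Made-up export rows in download order (_index 1, 2, 3).
rows = [
    {"_id": "201", "_index": "1", "end": "2021-02-03T10:00:00+01:00"},
    {"_id": "202", "_index": "2", "end": "2021-02-01T08:00:00+01:00"},
    {"_id": "203", "_index": "3", "end": "2021-02-05T12:00:00+01:00"},
]

# Map each _id to its 1-based position in the assumed table-view order.
table_position = {
    row["_id"]: pos for pos, row in enumerate(table_view_order(rows), start=1)
}
```

Here the case with _index 1 in the download (_id 201) lands at position 2 of the re-sorted list, which is the kind of mismatch between download and table view described in the findings.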

In general, we do not understand how KoBo creates this _index (parent_index) and the download case sequence. Because KoBo uses two different orders (table view vs. download), it is difficult to map rows directly from the download to the server table view for data cleaning work.

Kind regards

Hello @Kal_Lam, @stephanealoo,
Do you have any news on this, please?
(Is there even a GitHub issue for it?)
The issue has not changed after the recent KoBo update.

While waiting, we would appreciate it if someone could explain how KoBo creates this _index (parent_index) and the download case sequence.
Thanks and kind regards

@wroos, FYI, we have taken this issue to our developers. We will update you as soon as we hear back from them.

Hello @Kal_Lam
We have meanwhile found the reported problem of reordered cases (and _index) on export in further places. The cases marked in yellow have now been mixed up. It seems that the index and ordering of cases in the export can vary between exports of the same project (at least since February 2021). Cases submitted later can get mixed in with old cases. (We did not edit or delete the raw data on the server beforehand. Only new (later) submissions came in after the first download.)
Bug02 Order.xlsx (11.3 KB)

Unfortunately, this means that our links from previous data cleaning to the exported data now need to be updated manually.

Thank you for providing additional details @wroos! This should be helpful for our verifications.

Hello, in one of my projects the _index values also change for the same entries between different downloads. I know there are other unique ID fields, but keeping this simple incremental numbering as a unique ID would make the data easier to understand. Any update on this issue? Thanks!

Welcome back to the community, @simblanco! You may need to use the _id, which is always fixed.
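
If cleaning notes were recorded against the _index of an earlier download, they can in principle be carried over to a newer download through _id. The following is a minimal sketch of that idea, assuming both exports contain the _id and _index columns; the function name `remap_notes`, the notes structure and the sample values are hypothetical.

```python
def remap_notes(old_rows, new_rows, notes_by_old_index):
    """Re-key cleaning notes from the old export's _index to the new export's
    _index, using _id as the stable link between the two downloads."""
    old_index_to_id = {r["_index"]: r["_id"] for r in old_rows}
    id_to_new_index = {r["_id"]: r["_index"] for r in new_rows}
    remapped, unresolved = {}, []
    for old_index, note in notes_by_old_index.items():
        submission_id = old_index_to_id.get(old_index)
        new_index = id_to_new_index.get(submission_id)
        if new_index is None:
            # The old _index is unknown, or the submission is missing
            # from the new export.
            unresolved.append(old_index)
        else:
            remapped[new_index] = note
    return remapped, unresolved


# Made-up example: a later submission (_id 303) now appears at _index 1.
old_export = [{"_id": "301", "_index": "1"}, {"_id": "302", "_index": "2"}]
new_export = [
    {"_id": "303", "_index": "1"},
    {"_id": "301", "_index": "2"},
    {"_id": "302", "_index": "3"},
]
notes = {"1": "check age", "2": "duplicate household?", "9": "unknown row"}

remapped, unresolved = remap_notes(old_export, new_export, notes)
```

In this example the notes for old _index 1 and 2 move to _index 2 and 3, and the note for _index 9 is returned in `unresolved` because no matching submission exists. Recording _id alongside each cleaning note from the start would avoid the remapping step altogether.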

Dear @Kal_Lam

Any news on this?

Here is a current example from another user (also with Enketo): https://community.kobotoolbox.org/t/date-time-accuracy/22702