Submissions being duplicated with same uuid

Hello,

We’ve noticed that one of our forms has produced some duplicate submissions. The data was collected using Enketo on tablets, on the researcher server. The duplicates have exactly the same data and metadata, and even the same _id and _uuid (which should be impossible, right?).

Do you know what’s happening? I’m worried that deleting a duplicate will delete both submissions.

Best

Diane

Would you mind sharing a screenshot of the issue? It would be very helpful. Could you also let us know whether your survey project has any image questions?

Sure, here is a screenshot showing the duplicate submissions:

No image should be collected in this form.

Could you help us by providing your username and project name through a private message? It would help us better understand the situation. TIA!

Sure, will send you the details in private. Thank you!!

Hi @dianedetoeuf
Thanks for sending the information. We have reviewed it with our developers and found that this is a bug, which the team will be working on. We apologize for any inconvenience caused. For now, the only option is to delete the duplicate; this will not delete the other copy. If you are worried about losing both, download your data as it is before deleting.

Stephane

Hello @stephanealoo,
As this seems to be a general issue, would you mind providing more details:

  1. When does/can it happen?
  2. Does it happen on both servers (OCHA & HHI)?
  3. Does it happen for KoBoCollect, Enketo & ODK Collect?
  4. How can it best be detected?
  5. When is this likely to be fixed?

Kind regards

  1. We’ve never been able to reproduce this reliably, and that’s why it’s not fixed :frowning:
  2. Yes (with the caveat of us developers not having reproduced it)
  3. Yes (same caveat)
  4. There are a few scenarios that people report as “duplicate submissions”:
    • The first, which describes Diane’s case, is true duplication, where the XML submissions are completely identical. I consider it a bug that KoBoCAT does not reject a submission whose identical XML already exists in another submission belonging to the same project. This problem is best detected by, first, looking for duplicate UUIDs, and then comparing the XML for any submissions that share the same UUID. Note the _id of each suspicious submission and retrieve its XML from the API, e.g. at https://kf.kobotoolbox.org/api/v2/assets/aYourProjectUid/data/12345.xml, where 12345 is the _id of the submission you want to retrieve.
    • A different scenario consists of submissions that share the same UUID but have different XML contents. UUIDs are generated by the client (Collect, Enketo, someone posting XML to the API, etc.), not the KoBo server. Some OpenRosa implementations reject duplicate UUIDs, but we err on the side of never discarding legitimate data—and we do not plan to change this behavior. This situation can be detected in a similar manner to the previous one: look for the duplicate UUIDs, and then compare the XML. If the XML differs, then the problem lies with the client. Obviously, if you see different responses in submissions that share the same UUID, you don’t need to go to the trouble of comparing the XML.
  5. There’s a lot of KoBoCAT work in the queue ahead of this. Honestly, we probably won’t begin to address it until the first quarter of 2021.
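
The detection steps described in point 4 can be sketched roughly as follows. This is only an illustration, not KoBoToolbox code: it assumes you already have your submissions as a list of records carrying `_id` and `_uuid` (for example from a JSON export), and the helper names `submission_xml_url`, `group_by_uuid` and `classify_duplicates` are made up for this example. How you fetch the XML (authentication, HTTP client) is left to a function you supply, and a plain string comparison stands in for "comparing the XML".

```python
from collections import defaultdict


def submission_xml_url(asset_uid, submission_id, base="https://kf.kobotoolbox.org"):
    """Build the per-submission XML URL in the form quoted in point 4."""
    return f"{base}/api/v2/assets/{asset_uid}/data/{submission_id}.xml"


def group_by_uuid(submissions):
    """Return {_uuid: [_id, ...]} for every _uuid that occurs more than once."""
    ids_by_uuid = defaultdict(list)
    for sub in submissions:
        ids_by_uuid[sub["_uuid"]].append(sub["_id"])
    return {uuid: ids for uuid, ids in ids_by_uuid.items() if len(ids) > 1}


def classify_duplicates(submissions, fetch_xml):
    """Label each repeated _uuid as 'identical' or 'different'.

    fetch_xml is a caller-supplied function (hypothetical here) that takes
    an _id and returns that submission's XML as a string.
    """
    result = {}
    for uuid, ids in group_by_uuid(submissions).items():
        # Assumption: exact string equality (after trimming) is good enough.
        distinct_xml = {fetch_xml(submission_id).strip() for submission_id in ids}
        result[uuid] = "identical" if len(distinct_xml) == 1 else "different"
    return result
```

A UUID labelled "identical" corresponds to the first scenario (true duplication); "different" corresponds to the second, where the problem lies with the client.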

Hi @stephanealoo,

Thanks a lot to you and the team for looking into this! We’ll delete the duplicates and save the data beforehand, just in case.

Best

Diane

Hi all,

I have also seen this problem before; the only thing I could do was clean out the duplicated submissions. Now that it has happened a second time, your answer to Diane’s request will help us fix the problem.

Best regards
Bernard

Thank you, brother Kal_Lam.
Can I set _id to be a serial number starting from 1?
If it is possible, please help me.

Hi @mizanvai,

Could you please elaborate on your issue, so that we can understand your requirement and help you out if it’s possible through KoBoToolbox?

When I submit new KoBo data, the “_id” field is a 9-digit number such as “134598324”. Is it possible to have it start from “1”?
Thank you.

Hi @mizanvai,

Please note that this is not possible: _id is assigned by the KoBoToolbox system and is unique within each server, i.e. HHI, OCHA or a self-hosted server. This previously discussed post should also help you understand what _id is: