Sure, here is a screenshot showing the duplicate submissions:
No image should be collected in this form.
Could you help us by providing your username and project name through a private message? It would help us better understand the situation. TIA!
Sure, will send you the details in private. Thank you!!
Hi @dianedetoeuf
Thanks for sending the information. We have had a chance to review this with our developers, and we noted that there is a bug that the team will be working on. We apologize for any inconvenience caused. The only option for now is to delete the duplicate; this will not delete the other copy. It may be a good idea to download your data as it is first, in case you are worried about both copies being deleted.
Stephane
Hello @stephanealoo,
As this seems to be a general issue, would you mind providing more details:
Find the _id of each suspicious submission and retrieve its XML from the API, e.g. at https://kf.kobotoolbox.org/api/v2/assets/aYourProjectUid/data/12345.xml, where 12345 is the _id of the submission you want to retrieve.
Hi @stephanealoo,
Thanks a lot to you and the team for looking at this! We’ll delete the duplicates and save the data before just in case.
Best
Diane
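For anyone following the earlier suggestion to pull a submission's XML from the API, here is a minimal Python sketch of that URL pattern. The function names and the token header are assumptions for illustration (they are not from this thread); only the URL shape is taken from the example above, and aYourProjectUid and 12345 are the same placeholders.

```python
from urllib.request import Request, urlopen

# Server taken from the example URL in this thread; change it for other servers.
BASE_URL = "https://kf.kobotoolbox.org"


def submission_xml_url(asset_uid, submission_id, base_url=BASE_URL):
    """Build the XML URL for one submission, following the pattern quoted above."""
    return f"{base_url}/api/v2/assets/{asset_uid}/data/{submission_id}.xml"


def fetch_submission_xml(asset_uid, submission_id, api_token):
    """Download the XML of one submission (hypothetical helper).

    Assumption: the server accepts an "Authorization: Token <key>" header.
    """
    request = Request(
        submission_xml_url(asset_uid, submission_id),
        headers={"Authorization": f"Token {api_token}"},
    )
    with urlopen(request) as response:
        return response.read().decode("utf-8")


# Only builds and prints the URL; no network request is made here.
print(submission_xml_url("aYourProjectUid", 12345))
```

Calling fetch_submission_xml once per suspicious _id would give the XML files that were asked for.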
Hi all,
I have also seen this problem before; the only thing to do was to clean up the duplicated submissions. Now that it has happened a second time, your answer to Diane’s request will help us fix this problem.
Best regards
Bernard
Thank you, brother Kal_Lam.
Can I set _id to be a serial number starting from 1?
If possible, please help me.
Hi @mizanvai,
Could you please elaborate on your issue so that we can understand your requirement and help you out, if it is possible through KoBoToolbox?
When I submit new Kobo data, the “_id” field has 9 digits, like “134598324”. Is it possible to start it from "1"?
Thank you.
Hi @mizanvai,
Please be informed that this is not possible, as _id is the ID provided by the KoBoToolbox system, which is unique for each server, i.e. HHI, OCHA or a self-hosted server. Maybe this previously discussed post will also help you understand much better what _id is:
Hello,
We had another issue with duplicated uuid, affecting 1% of the data (which is quite high!).
Did someone manage to replicate the issue and try to solve it?
Best
Diane
Hello @dianedetoeuf,
Could you provide more details, please:
As mentioned above by jnm, there is an open github bug report (since July 2018).
I am afraid _uuid duplication is a crucial issue, meaning that the _uuid cannot be trusted as unique. This contradicts the existing KoBo and ODK documentation, e.g.:
ODK XForms Specification, Random Numbers for Questionnaire ID,
Form Operators and Functions — ODK Docs
https://community.kobotoolbox.org/t/what-are-the-relation-between-these-columns-you-get-while-exporting-data-in-excel/9523/2
“The UUID once received will never be duplicated i.e. i have received a UUID of e45577db-085d-47a0-b1d4-0d9799077b5a for my submission as shown in the image above. No one else in the internet should receive this UUID again.”
There is also another ODK thread on duplicates here:
https://docs.getodk.org/aggregate-data-access/#publishing
“Under certain failure conditions, the downstream service can receive multiple copies of a given submission. This is known, expected, behavior.
Duplicates typically occur if the downstream service is slow to respond or acknowledge a request. It is your responsibility to detect and eliminate these duplicates should they occur (they will always have exactly the same information in all fields).”
See also
https://forum.odk-x.org/t/is-it-possible-to-alert-the-user-of-the-duplicate-records/1054/2:
“If you are using ODK-X to create the identifiers (the uuid) you would never end up with two of the same uuid, so duplication would be avoided.”
Universally unique identifier - Wikipedia “The probability to find a duplicate within 103 trillion version-4 UUIDs is one in a billion.”
Hi all, I’ve got a similar problem as well. As @dianedetoeuf mentioned, ~2% (22 out of 1024) of my submitted forms have duplicated _uuids (11 duplicated uuids for 22 submissions). Their _ids are unique, and the submissions are not duplicated: they are genuine submissions.
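A check like this one (repeated _uuid, distinct _id) can be sketched in Python as follows. This is only a rough sketch, assuming the export has already been loaded as a list of dicts with _id and _uuid keys; the function name and the sample rows are made up for illustration and are not real data.

```python
from collections import defaultdict


def find_duplicate_uuids(submissions):
    """Return {_uuid: [_id, ...]} for every _uuid shared by more than one submission."""
    ids_by_uuid = defaultdict(list)
    for row in submissions:
        ids_by_uuid[row["_uuid"]].append(row["_id"])
    return {uuid: ids for uuid, ids in ids_by_uuid.items() if len(ids) > 1}


# Hypothetical sample rows (not taken from any real project).
sample = [
    {"_id": 101, "_uuid": "aaa"},
    {"_id": 102, "_uuid": "bbb"},
    {"_id": 103, "_uuid": "aaa"},
]

duplicates = find_duplicate_uuids(sample)
# Number of submissions that share their _uuid with at least one other submission.
affected = sum(len(ids) for ids in duplicates.values())
print(duplicates, affected)
```

Grouping by _uuid rather than comparing whole rows matters here, because the affected submissions carry different answers and different _id values.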
I don’t know if it helps but here are the answers to the questions that @wroos asked:
username of the person, most of the data is different. (Got 134 questions, some of which are expected to have the same answers; not a duplication issue.)
The data server for your form or the Enketo server is down. Please try again later or contact support@kobotoolbox.org. (500)
More information:
All the duplicated UUIDs occur within the relevant account. See screenshot for details:
(In other words, there is no duplicated UUID between different accounts)
(Usernames and most of the UUIDs are censored for security reasons.)
All of them are submitted via web.
One of the things I noticed is the submission time. In @dianedetoeuf’s, @Bernard_26’s and my cases, the submission times of the duplicated UUIDs are very close to each other.
Will test more and update here.
Best,
Hello @Bernard_26,
did you use Enketo (Webforms) when the duplicates problem happened? (Or Collect?)
Hello @hakan_cetinkaya,
thanks for the details and research!
Did I understand correctly?
The _uuid duplicates only happened with submissions from the same device and username.
Could someone from the Core Team please explain when exactly the _uuid is generated? (And how a duplicate might happen?)
@dianedetoeuf, @wroos, @hakan_cetinkaya, we do have a GitHub issue for this. You should be able to track it through this link:
Hello @Kal_Lam,
Thank you. This GitHub issue was also cited by jnm above. But there seems to be no news on this since 2020, despite it being classified as a “bug” since 2018. So, might further detailed examples from the community help the developers to fix it?
The new examples confirm that the duplication of the _uuid (documented as unique) is not random. We think the bug is very relevant, similar to a primary key in a database which can no longer be trusted as unique.
And do we know whether the known cases only came from Enketo, or also from Collect, please?
Update: See also another new posting: Unable to delete or edit some submissions