Submission id column missing from API V1 requests

mkmortera · November 8, 2021, 9:59am

Last month my Google scripts suddenly stopped working. It turned out that the response from https://kc.kobotoolbox.org/api/v1/data/‘number_here’.csv no longer includes the ‘_submission_id’ column. The _uuid column refers to the user ID. I have checked many times already and nothing was changed in the original script; the same thing happens with a complete rewrite, so I can rule out typos.

Oddly, the CSV from a manual download via the Data->Downloads menu does contain the column.

I don’t have a workaround for this. I can’t use the _uuid and time because there are submissions sent from the same computer at the same time.

Is this intended behavior by the devs? If so, how do I make sure that ‘_submission_id’ is included among the columns? Do I have to use the V2 API? I haven’t seen documentation for exporting CSV with it, and replacing data.format with data.csv from the documentation doesn’t work.

I find it especially hard to parse JSON/XML since they don’t readily import into a rectangular format.

mkmortera · November 22, 2021, 2:24am

Any updates on this? I need a way to determine unique submissions in the dataset. At the very least, I need to know which columns can be used as identifiers. The _uuid column doesn’t look unique to me.

I can send the CSV to the team so they can verify.

UPDATE: Even the JSON output of the V1 API does not return the _submission_id column.
UPDATE: Used the _uuid column instead; it’s unique.

jnm · December 7, 2021, 12:19am

It sounds like you’ve already figured this out (I’m glad) but I’ll add some notes here for future reference.

No changes to this part of the V1 API have occurred in approximately 9 years.

_submission_id does not appear in the entire Git history of the KoBoCAT code base. Could you be thinking of _submission__id (double underscore)? This only appears in KPI (non-legacy, kf.kobotoolbox.org) Excel exports of a form with a repeating group, and only then on one of the sheets for the repeating group. It allows cross-referencing the rows in the repeating-group sheet with rows in the main sheet. CSV exports don’t have sheets and therefore don’t include data from repeating groups.
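
For anyone who wants to do that cross-referencing in a script, here is a minimal sketch. It assumes that _submission__id on the repeating-group sheet holds the _id of the matching row on the main sheet, and that both sheets have already been loaded as lists of dicts. The function name attach_repeats and the sample columns (village, member) are made up for illustration.

```python
from collections import defaultdict


def attach_repeats(main_rows, repeat_rows, key="_submission__id"):
    """Group repeat-sheet rows under the main-sheet row they belong to.

    Assumption: repeat_rows[i][key] equals the `_id` of the parent row.
    """
    by_parent = defaultdict(list)
    for row in repeat_rows:
        # Compare as strings, since spreadsheet readers may return int or str.
        by_parent[str(row[key])].append(row)
    return [
        {**main, "repeats": by_parent.get(str(main["_id"]), [])}
        for main in main_rows
    ]


# Hypothetical sample data standing in for the two sheets.
main_rows = [{"_id": 101, "village": "A"}, {"_id": 102, "village": "B"}]
repeat_rows = [
    {"_submission__id": 101, "member": "x"},
    {"_submission__id": 101, "member": "y"},
]
merged = attach_repeats(main_rows, repeat_rows)
```

Each main row keeps its own columns and gains a "repeats" list, which is empty when no repeating-group rows point at it.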

_uuid does not refer to a user ID at all, sorry. _uuid is the “OpenRosa <instanceID>, which must be a universally unique string identifying this specific submission.” If you are seeing many duplicates of this, please report a bug. We know that a few have crept in, but they should be very uncommon. Some OpenRosa server implementations deal with this by rejecting any submission with a duplicate UUID outright. KoBoToolbox does not; in the interest of safeguarding data on a server ASAP, we only reject submissions whose XML content completely duplicates a previous submission. We may reconsider this in the future.
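
A quick way to check whether an export really contains duplicate _uuid values is to count them. This is only a sketch using the Python standard library; the function name duplicate_values and the sample CSV text are hypothetical, and in practice the text would come from your own downloaded file.

```python
import csv
import io
from collections import Counter


def duplicate_values(csv_text, column="_uuid"):
    """Return {value: count} for every value of `column` seen more than once."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = Counter(row[column] for row in reader)
    return {value: n for value, n in counts.items() if n > 1}


# Hypothetical export with one repeated _uuid.
sample = "_uuid,name\nabc,first\ndef,second\nabc,third\n"
dupes = duplicate_values(sample)
```

An empty result means the column is unique within that export and can serve as an identifier there.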

CSV from Data->Downloads does not include _submission_id, but it does include _id, which is absolutely unique on an individual KoBoToolbox server. It’s the primary key in the logger_instance (submission) SQL table.

https://kc.kobotoolbox.org/api/v1/data/<your form ID>.json includes the unique _id field, the database primary key, but nothing called _submission_id.
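
If the JSON output is hard to get into a rectangular shape, something along these lines may help. It is a sketch that assumes the endpoint returns a list of flat objects, one per submission, each with an _id field; fetching and authentication are left out, and the function name records_to_csv and the sample payload are invented for the example. Nested values such as repeating groups are simply stored as JSON text in a single cell.

```python
import csv
import io
import json


def records_to_csv(records):
    """Write a list of submission dicts as CSV text, with `_id` as first column."""
    # Use the union of all keys so rows with missing answers still line up.
    columns = ["_id"] + sorted({k for r in records for k in r} - {"_id"})
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=columns, restval="", lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        writer.writerow(
            {
                # Lists and dicts (e.g. repeating groups) become JSON strings.
                k: json.dumps(v) if isinstance(v, (list, dict)) else v
                for k, v in record.items()
            }
        )
    return out.getvalue()


# Hypothetical payload standing in for the API response body.
payload = '[{"_id": 7, "_uuid": "abc", "name": "x"}, {"_id": 8, "_uuid": "def"}]'
table = records_to_csv(json.loads(payload))
```

Putting _id first keeps the server-unique key in a predictable place for later lookups or de-duplication.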

Great