As an organization, we are quite new to Kobo. In the data collection tool we used previously, we relied on incremental loads to avoid loading too much data at once. To do this, we used a snapshot functionality that checked whether there was new data. We used the submission date to identify new entries, and the modification date (logged separately from the submission date) to identify entries that data providers had entered incorrectly and that were later corrected. A colleague would modify the data in the browser to ensure data quality. This modification would then be logged as the modified date, which we included in our incremental loads. I have checked the community forum and the web, but I’m unable to find anything about incremental loads or modification dates for Kobo. I created a test form, submitted some test entries, and then modified them. I then exported the data, but I can’t see any log showing that the data was modified. Does the dev team have any suggestions for a workaround for this functionality? Your help would be much appreciated. Thanks in advance.
Welcome to the community, @ckoyuncu! Could you share a sample of the incremental loads you have used in this post? A visual example would help us understand it.
Let me try to reformulate the question: Let’s imagine that my form has received 10,000 submissions. When I pull the submissions through an API, an incremental load would essentially allow me to check my existing records and fetch only what is new. In this scenario, there could be two types of new data: 1) New submissions 2) Modified submissions.
Currently, with Kobo, there is no issue in identifying new submissions. If a submission does not appear in the snapshot we have taken of the 10,000 submissions, we consider it a new submission. However, there is also the possibility that an existing record has changed. Let’s say we are keeping track of the number of cases of a given disease for a hospital, and a data provider entered one too many zeros, which is a clear data error. In our current setup, we would check whether the modification date (see attachment 2) of any of our previously saved 10,000 submissions has changed. If it has, we would replace the old submission (with the older modification date) with the new one (with the newer modification date), since the data has changed.
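To make the setup described above concrete, here is a minimal sketch of that incremental logic in Python. It describes the previous tool's behaviour, not Kobo's: the field names `id` and `modified_at` and the function `merge_incremental` are hypothetical and only illustrate the idea of comparing modification dates against a snapshot.

```python
from datetime import datetime


def merge_incremental(snapshot, fetched):
    """Merge fetched records into a snapshot keyed by record id.

    `id` and `modified_at` are hypothetical field names used for illustration.
    Returns two lists: ids of new records and ids of modified records.
    """
    new_ids, modified_ids = [], []
    for record in fetched:
        rid = record["id"]
        existing = snapshot.get(rid)
        if existing is None:
            # Not in the snapshot yet: treat it as a new submission.
            snapshot[rid] = record
            new_ids.append(rid)
            continue
        old_time = datetime.fromisoformat(existing["modified_at"])
        new_time = datetime.fromisoformat(record["modified_at"])
        if new_time > old_time:
            # Newer modification date: replace the old record with the new one.
            snapshot[rid] = record
            modified_ids.append(rid)
    return new_ids, modified_ids


# Example: record 1 was corrected (one zero too many), record 2 is new.
snapshot = {1: {"id": 1, "modified_at": "2023-01-01T10:00:00", "cases": 1000}}
fetched = [
    {"id": 1, "modified_at": "2023-01-02T09:00:00", "cases": 100},
    {"id": 2, "modified_at": "2023-01-02T09:30:00", "cases": 5},
]
new_ids, modified_ids = merge_incremental(snapshot, fetched)
```

This only works if every record carries a reliable modification timestamp, which is exactly the piece I cannot find in the Kobo export.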
So the question is: how do we identify modified cases in Kobo? Do you have any logs, retrievable through an API call, that show that a given submission to a form has been changed? At the moment, for the test form, if I change the content of a previous submission from the web and export it, the export is exactly the same as the prior one, except that a certain field has changed. Basically, there is no log of modifications. Since it is not efficient to check, on every API call, whether every single field in every single submission has changed, the question here is: Do you have a solution for this? How can we retrieve a log of the submissions (i.e. a column that keeps track of modifications for a single record)? Hope this makes it clear! Please feel free to reach out, and looking forward to your reply.
@ckoyuncu, maybe a good option would be to check the _id from the dataset. Whenever you have a new submission, the dataset receives a unique _id for that submission.
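As a rough illustration of the `_id` suggestion, the sketch below filters a list of exported submissions down to those whose `_id` is not yet in a stored set. The function name `find_new_submissions` and the sample records are made up for this example; only the `_id` field comes from the dataset.

```python
def find_new_submissions(known_ids, submissions):
    """Return the submissions whose `_id` is not in the set of known ids."""
    return [s for s in submissions if s["_id"] not in known_ids]


# Example: ids 101 and 102 were loaded earlier, 103 is new.
known_ids = {101, 102}
submissions = [
    {"_id": 101, "cases": 12},
    {"_id": 102, "cases": 7},
    {"_id": 103, "cases": 3},
]
new_submissions = find_new_submissions(known_ids, submissions)
known_ids.update(s["_id"] for s in new_submissions)
```

Note that this covers only the first case (new submissions). Whether an edited submission keeps the same `_id` is something to verify against your own export, so this check alone may not reveal modified records.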