Download via API limited to 100 observations

Dear community,

I have been using the robotoolbox package in R from @dickoa to download data via the API for a long time. Recently the same R scripts download only 100 observations, regardless of which KoboToolbox project I try to download. Can anyone assist? Has there been an API change?

What API(s) are you using to do this?

Note, /api/v2/assets/{uid}/data/ is now limited to 1000 records. See KoboToolbox Primary API

Hi @Tumaini ,

I hope all is well, and thanks for using robotoolbox. Indeed, the page size was capped at 100 max, so I had to change the package default recently. The current robotoolbox should work, but it'll be slightly slower because of the increased number of requests.

# install.packages("pak")
pak::pkg_install("dickoa/robotoolbox")

I’ll push it to CRAN soon.

I do have a question for @Xiphware and/or someone from the core team: is there a way to find out, via a request, the default page size and the maximum page size? I understand that it might vary from one server to the next, and I want to offer more flexibility instead of hard-coding 100/1000 as the default across different servers.

Hi @Xiphware

I also thought so, but someone using the kf server told me that the limit was set at 100. I wonder if I could safely set the default limit to 1000 (the maximum allowed).

Thanks for the response @dickoa. Indeed, after installing the latest version I could download all 800 rows as before. The only difference is that the new version renamed my select_multiple variables by adding _1 to each, something it never did before, but that is not an issue.

So is there a limit I will hit at some point? I am currently at 830 observations. Are there any surprises awaiting me at some threshold?

Thanks again for the brilliant package.

Is there a way, using robotoolbox, to download only the data from a fixed time interval, e.g. the last 14 days?

Yes, you can. I added a new 'query' parameter that you can use to pass a custom MongoDB query. In your case, compute a cutoff date 14 days back and filter on it:

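# Cutoff timestamp 14 days before now, in the ISO-like format used by _submission_time
days <- 14
cutoff <- format(Sys.time() - as.difftime(days, units = "days"), "%Y-%m-%dT%H:%M:%S")
# MongoDB-style filter: keep submissions at or after the cutoff
qry <- sprintf('{"_submission_time": {"$gte": "%s"}}', cutoff)
df <- kobo_data(uid, query = qry)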

In general, all the MongoDB queries supported by KPI should work.
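If you need a window with both a start and an end rather than "everything since the cutoff", the same idea can be extended. This is only a sketch: submission_window_query is a hypothetical helper, not part of robotoolbox, and I am assuming the "$lt" operator is among the MongoDB operators KPI accepts. The dates are made-up examples.

```r
# Hypothetical helper (not part of robotoolbox): builds a MongoDB-style
# query string restricting _submission_time to [from, to).
submission_window_query <- function(from, to = NULL) {
  fmt <- "%Y-%m-%dT%H:%M:%S"
  # Lower bound is always present
  cond <- sprintf('"$gte": "%s"', format(from, fmt))
  # Optional upper bound (assumes KPI accepts "$lt")
  if (!is.null(to)) {
    cond <- paste0(cond, sprintf(', "$lt": "%s"', format(to, fmt)))
  }
  sprintf('{"_submission_time": {%s}}', cond)
}

# Example dates, chosen only for illustration
from <- as.POSIXct("2024-01-01 00:00:00", tz = "UTC")
to <- as.POSIXct("2024-01-15 00:00:00", tz = "UTC")
qry <- submission_window_query(from, to)

# df <- kobo_data(uid, query = qry)  # needs robotoolbox and a configured token
```

Leaving out the second argument gives the same kind of open-ended query as the 14-day example.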


In theory, no limit. The package sends multiple requests if the dataset is larger than the page size (the default is 100 and the maximum is 1000; robotoolbox uses 1000). For 830 observations it's just one pass, but if you end up with 3,250 observations, robotoolbox will send 4 requests and combine them for you. I'll investigate the "_1" suffix in the select_multiple variables.
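To make the paging arithmetic concrete, here is a rough sketch of how the number of requests follows from the row count and the page size. n_requests and page_starts are hypothetical helpers written for illustration, not robotoolbox internals, and I am assuming the server pages with a zero-based start offset.

```r
# Hypothetical helpers (not robotoolbox internals) illustrating paging arithmetic.

# Number of requests needed to fetch n_rows with a given page size
n_requests <- function(n_rows, page_size = 1000) {
  ceiling(n_rows / page_size)
}

# Zero-based start offset of each page (assumes offset-style paging)
page_starts <- function(n_rows, page_size = 1000) {
  if (n_rows <= 0) return(numeric(0))
  seq(0, n_rows - 1, by = page_size)
}

n_requests(830)                    # a single page
n_requests(3250)                   # three full pages plus one partial page
page_starts(3250)                  # offsets at which each page begins
n_requests(830, page_size = 100)   # same data on a server capped at 100
```

On a server capped at 100 rows per page the same download simply takes more round trips, which is why the newer version can feel slower.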