API quotas and bulk record update

Good morning,
I wrote a script in Python to update about 1,000 records in one of our projects. The script is based on the post How to edit values using Kobo API? - #3 by Akratos. I can successfully run the script over 200-300 records, but I eventually get a server disconnect message ('Max retries exceeded with url'), which seems to be due to the script hitting an API request limit. Could you tell me what the limits are? I now also get a 403 error and seem to have lost access to the API altogether. Can my access be reinstated?
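In case it is rate limiting, I could slow the script down on my side, for example with a fixed pause before every request. A rough sketch (the one-second delay and the helper name are just guesses, not based on any documented limit):

import time
import json
import requests

def patch_with_pause(url, payload, params, headers, delay=1.0):
    # naive client-side throttling: wait before each request so the script
    # stays under whatever per-minute limit the server enforces
    time.sleep(delay)
    return requests.patch(url=url, data={'payload': json.dumps(payload)},
                          params=params, headers=headers)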

To improve efficiency, I rewrote the script to update several records at the same time, which reduces the number of requests to the server by a factor of about 10. I now pass a list of IDs in the 'submission_ids' variable, similar to the original script posted by Josh in the link above. Would that be a valid approach? I read in older posts that this option might not be supported yet…
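To illustrate, the only change is that 'submission_ids' carries a list of IDs instead of a single one, so one request applies the same values to every listed record. The IDs and values below are made up; the field paths and the PATCH call are the same as in my single-record script:

payload = {
    # every submission in this list receives the same species/local_name values
    'submission_ids': ['123401', '123402', '123403'],
    'data': {
        'boat/catch/catch_details[1]/species': 'Carcharhinus limbatus',
        'boat/catch/catch_details[1]/local_name': 'blacktip shark'
    }
}

# sent exactly like the single-record version:
# res = requests.patch(url=URL, data={'payload': json.dumps(payload)}, params=PARAMS, headers=HEADERS)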
Thanks for your help!

Welcome to the community, @wcs_tanzania! Could you also share your Python script with the community so that others can test it for you and help you solve your issue?

Sure, please find it below.


import pandas as pd
import requests
import json

fname = 'path/file.xlsx'
sharc = pd.read_excel(fname, engine='openpyxl', sheet_name=0)
catch = pd.read_excel(fname, engine='openpyxl', sheet_name='catch_details')

URL = 'https://kf.kobotoolbox.org/api/v2/assets/project_id/data/bulk/'
TOKEN = 'private_token'

PARAMS = {
    'format': 'json'
}

HEADERS = {
    'Authorization': 'Token {TOKEN}'
}


# this function is responsible for updating a single record; 'stri' carries the
# repeat-group index (e.g. '[2]') and decides which field path to use

def update_record(id, stri, species, local_name):
    if stri == '':
        # single entry: path without the 'catch' group
        payload = {
            'submission_ids': [str(id)],
            'data': {
                'boat/catch_details' + stri + '/species': species,
                'boat/catch_details' + stri + '/local_name': local_name
            }
        }
    else:
        # repeated entries: path includes the 'catch' group and the repeat index
        payload = {
            'submission_ids': [str(id)],
            'data': {
                'boat/catch/catch_details' + stri + '/species': species,
                'boat/catch/catch_details' + stri + '/local_name': local_name
            }
        }

    res = requests.patch(
        url=URL,
        data={'payload': json.dumps(payload)},
        params=PARAMS,
        headers=HEADERS
    )
    return res

# this version of the script runs over each row of the excel file,
# sending one request per record/entry to update

for id in sharc['_id']:
    # look up the row index for this Kobo record ID, then its matching catch rows
    parent_index = sharc[sharc['_id'] == id]['_index'].values[0]
    records = catch[catch['_parent_index'] == parent_index]

    # if the record has multiple catch entries, update each repeat instance
    if len(records) > 1:
        for i in range(len(records)):
            species = records.iloc[i]['Scientific name']
            local_name = records.iloc[i]['Common name']
            # NaN != NaN, so this skips rows with a missing species
            if species == species:
                print(str(id) + ': ' + species)
                update_record(id, '[' + str(i + 1) + ']', species, local_name)
    # if the record has only one catch entry
    elif len(records) == 1:
        rec = records
        species = rec['Scientific name'].values[0]
        local_name = rec['Common name'].values[0]
        if species == species:
            print(str(id) + ': ' + species)
            update_record(id, '', species, local_name)

Ah! I found the error. I forgot to put an 'f' in front of the string containing the TOKEN variable in HEADERS, so the token was never actually interpolated.
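That is, the header should be an f-string:

HEADERS = {
    'Authorization': f'Token {TOKEN}'
}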

Regardless, it would be good to know whether the API has a quota and, if so, what it is, since the script fails once the server blocks access.
Thanks,
j