In the humanitarian context, forms often come with attached CSV files containing the names, ID numbers, phone numbers, household composition, medical conditions, etc. of the families to be interviewed. These files feed dropdown lists and/or verification fields.
To scramble the link between names, ID numbers, addresses and any other personal data, I use a two-part strategy:
- to prevent a hostile person who grabs the tablet from an enumerator in the middle of an interview from reaching the sensitive data visible in some parts of the form, I require an extra password, entered in a text field at the beginning of the group of sensitive fields. Obviously, each password attempt must be hashed and checked against the hashed password stored in the form, never against a plain-text password.
- to prevent anyone from seeing the link between a given name, a given ID number, a given piece of sensitive data, etc. in the attached data, I split it into many CSV files, each with only a few columns (ideally a couple), including one that holds the hashes of the original IDs (to which I add some salt or pepper, different for each CSV file). Hence, even if someone gains access to those CSVs at any point in the workflow, they won’t be able to make sense of them.
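
To make the two points concrete, here is a minimal Python sketch of the preparation step done before the files are attached to the form. It is only an illustration under assumptions: the column names, the sample rows, the salt strings, the choice of SHA-256 and the helper names (`salted_hash`, `split_rows`, `to_csv`, `password_ok`) are all made up for the example and are not taken from any particular form or tool. The check inside the form itself is not shown; `password_ok` only mimics its logic.

```python
import csv
import hashlib
import io


def salted_hash(value: str, salt: str) -> str:
    """Return the hex SHA-256 digest of salt + value."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


def split_rows(rows, id_column, column_groups, salts):
    """Build one small table per column group.

    Each table keeps only its own columns plus an 'id_hash' column,
    computed with a salt specific to that table, so the same person
    gets a different key in every file.
    """
    tables = {}
    for name, columns in column_groups.items():
        salt = salts[name]
        tables[name] = [
            {"id_hash": salted_hash(row[id_column], salt),
             **{col: row[col] for col in columns}}
            for row in rows
        ]
    return tables


def to_csv(table) -> str:
    """Serialise a list of dicts to CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(table[0].keys()))
    writer.writeheader()
    writer.writerows(table)
    return buffer.getvalue()


# Invented sample data; a real list would come from the registration file.
rows = [
    {"id": "A001", "name": "Example One", "phone": "000-0001", "condition": "none"},
    {"id": "A002", "name": "Example Two", "phone": "000-0002", "condition": "asthma"},
]
# Hypothetical grouping: one sensitive column per file.
column_groups = {"names": ["name"], "phones": ["phone"], "medical": ["condition"]}
# Hypothetical salts, one per file (placeholders, not a recommendation).
salts = {"names": "salt-names", "phones": "salt-phones", "medical": "salt-medical"}

tables = split_rows(rows, "id", column_groups, salts)
names_csv = to_csv(tables["names"])

# Point 1: only the hash of the password is stored; attempts are hashed
# the same way and compared against it.
PEPPER = "form-pepper"  # placeholder value
stored_password_hash = salted_hash("correct horse", PEPPER)


def password_ok(attempt: str) -> bool:
    return salted_hash(attempt, PEPPER) == stored_password_hash
```

Because each file uses its own salt, the `id_hash` values for the same person differ from one CSV to the next, so the files cannot be joined back together on that column alone. To look a row up, the same salt has to be applied to the ID for that particular file.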