Take a certain part of a QR response with regex

segosal279 · August 19, 2024, 11:18am

Hi everyone,

I’m developing a form that collects data about family units. One section requires scanning an Argentinian document’s QR code.

The decoded text follows this format: Of.Ident@Surname@Name@Sex@Document@Date of birth@Date of expiry.

I’m aiming to extract the individual’s sex using regular expressions and then accumulate this data to determine the number of males and females within each family. Any insights or suggestions on how to effectively achieve this would be greatly appreciated.

Thanks in advance!

wroos · August 20, 2024, 7:58am

The sex seems to be coded as a single character between delimiters (@digit@ or @letter@), so this might be the only place in the whole string where such a token can occur. Can you confirm this?

Update: So you might use a calculation like this: if(contains(…, '@FemaleCode@'), 'F', if(contains(…, '@MaleCode@'), 'M', 'Diverse')). See the ODK XForms Specification.
Or see Xiphware's reply below for an even more flexible search solution.
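
For anyone who wants to check the logic outside the form first, here is a minimal Python sketch of the same nested test. It is not XLSForm syntax; the function name `sex_from_token` and its parameters are made up for illustration, and the actual codes have to be filled in once they are confirmed.

```python
def sex_from_token(decoded: str, female_code: str, male_code: str) -> str:
    """Mirror of the nested if(contains(...)) calculation above.

    Assumption: the delimited token '@<code>@' occurs nowhere else
    in the decoded string except in the sex field.
    """
    if f"@{female_code}@" in decoded:
        return "F"
    if f"@{male_code}@" in decoded:
        return "M"
    return "Diverse"
```

Note that this only works if no other field can ever hold the same single-character value, which is exactly what needs confirming.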

segosal279 · August 20, 2024, 10:31am

Hello
Yes, the sex is coded with one char (F/M).

Xiphware · August 20, 2024, 8:28pm

Can you also post a specific example of a decoded QR code string?

Of.Ident@Surname@Name@Sex@Document@Date of birth@Date of expiry

BTW, I note that your description of the format differs somewhat from what is described here (hence my asking for a specific example…): Documento Nacional de Identidad (Argentina) - Wikipedia

Xiphware · August 20, 2024, 8:50pm

You could possibly use a regex to pick out the specific character field and check whether it is, say, 'F', to determine the sex (presumably if it's not 'F' then it must be 'M'?)

However, it is probably more generally useful to extract specific fields from the '@'-delimited string (in this case you want the 4th), which you can do using substring functions. Have a play with this:

substring.xlsx (14.4 KB)

Note, if all the fields are actually fixed width, then you can simplify this expression a lot to extract the specific nth-index character itself, using substr().
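
The general idea of pulling out the nth field can be sketched in Python as below. This is only an illustration of the technique (skip n-1 delimiters, then take everything before the next one); it is not the content of the attached form, and the helper name `nth_field` is hypothetical.

```python
def nth_field(text: str, n: int, sep: str = "@") -> str:
    """Return the n-th (1-based) field of a delimited string, or "" if absent."""
    rest = text
    # Drop the first n-1 fields, one delimiter at a time.
    for _ in range(n - 1):
        _, found, rest = rest.partition(sep)
        if not found:
            return ""  # fewer than n fields in the string
    # Whatever precedes the next delimiter (or the end) is the n-th field.
    return rest.partition(sep)[0]


# Using the format description from the first post:
print(nth_field("Of.Ident@Surname@Name@Sex@Document@Date of birth@Date of expiry", 4))
# prints: Sex
```

Extracting by position avoids having to assume that the sex token is unique in the string.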

segosal279 · August 21, 2024, 11:53am

Hi there,

Yes, the format has changed due to the document’s recent issuance. A sample decoded QR code string might look like this: 00000010532@VILLAREAL@MARIA VICTORIA@F@99999999@A@01/11/1969@31/10/2009.
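
With that sample, the sex is the 4th '@'-separated field. As a rough check of the regex idea and the per-family count, here is a small Python sketch. It is not form logic; the names `FOURTH_FIELD` and `sex_code` are made up for illustration, and the second record in `family` is invented test data.

```python
import re
from collections import Counter

# Skip three '@'-terminated fields, then capture the fourth.
FOURTH_FIELD = re.compile(r"^(?:[^@]*@){3}([^@]*)")


def sex_code(decoded: str) -> str:
    """Return the 4th '@'-delimited field, or "" if the string is too short."""
    match = FOURTH_FIELD.match(decoded)
    return match.group(1) if match else ""


sample = "00000010532@VILLAREAL@MARIA VICTORIA@F@99999999@A@01/11/1969@31/10/2009"

# Hypothetical family: the sample above plus one invented record.
family = [
    sample,
    "00000010533@VILLAREAL@JUAN CARLOS@M@88888888@A@05/03/1965@04/03/2010",
]

counts = Counter(sex_code(s) for s in family)
print(sex_code(sample))        # prints: F
print(counts["F"], counts["M"])  # prints: 1 1
```

In the form itself, the equivalent would be one calculation per person that extracts the field, then a count over the repeat for each value.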

segosal279 · August 21, 2024, 12:01pm

Thank you so much for your detailed explanation. I’m excited to try it out and see how it works.