Reading API v2: relationship between submission values and attachments

Hello. I would like to know how the values stored in a submission relate to the contents of the "_attachments" index returned by the API.
For example, the submission field that stores an image returns "group_wu4ss89/_16_Fotografia_de_la_Vivienda": "complex - copia (2) ñ firma´ special ch&ars-11_3_57.png".

As we can see, the image name contains special characters, and Kobo removes them or replaces them with "_" or "-", leaving the "_attachments" entry like this:
{ "filename": "username/attachments/f027b0315cf146708d9883ccb147b578/3b89ba9e-af9d-42fb-81ed-b5d7ee285ac3/complex-_copia_2_ñ_firma_special_chars-11_3_57.png",
"instance": 297065537,
"xform": 1105231,
"id": 114683651 }

What criteria does Kobo use to look up a file within "_attachments" from the value stored in the submission field? Kobo itself is able to find the attachment by its specific id, independently of the stored filename.
Presumably either the attachment name is used directly, or some kind of regular-expression parsing is applied to the stored URLs.

Welcome to the community, @kltapias! Could you list the exact steps to reproduce this so that we can take a look as well?

  1. Upload an attachment with a filename such as "copy(2) ñ firma´ special ch&ars.png".
  2. Request "/api/v2/assets/{form_key}/data/{submission_id}/?format=json" from the API to get the submission data.
  3. In the API response, find the key of the question where that attachment is stored. It will return:
    {
    "_6_Fotograf_a_del_en_evistad_o_de_su_D_I": "copy(2) ñ firma´ special ch&ars-11_37_44.png"
    }
  4. In the "_attachments" section of the API response, it will return:
    {
    "download_url": "…",
    "download_large_url": "…",
    "download_medium_url": "…",
    "download_small_url": "…",
    "mimetype": "image/png",
    "filename": "username/attachments/f027b0315cf146708d9883ccb147b578/2f434709-74bf-4756-875f-1d4f16fe441a/copy2_ñ_firma_special_chars-11_37_44.png",
    "instance": 297065537,
    "xform": 1105231,
    "id": 115756865
    }

As we can see, the special characters were replaced, so the value stored under the question key (step 3) has to be formatted before it can be used to search within "_attachments".
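
As a minimal sketch of that formatting step, the function below mirrors what Django's `get_valid_filename` does (strip, replace spaces with underscores, drop anything that is not a word character, dash, or dot) without importing Django. Whether Kobo applies exactly this rule in every case is an assumption based on the example above, and `normalize_filename` is a hypothetical name.

```python
import re


def normalize_filename(name: str) -> str:
    """Approximate Django's get_valid_filename (assumed to match the backend)."""
    # Trim surrounding whitespace and turn inner spaces into underscores.
    s = str(name).strip().replace(" ", "_")
    # Drop every character that is not a word character, a dash, or a dot.
    return re.sub(r"[^-\w.]", "", s)


print(normalize_filename("copy(2) ñ firma´ special ch&ars-11_37_44.png"))
# prints: copy2_ñ_firma_special_chars-11_37_44.png
```

The result is the last path segment of the "filename" shown in step 4.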

Apparently the frontend also uses a regular expression to format the attachment name.

I would like to know whether there is a way to find the attachment without having to format the value from step 3, loop over every attachment, and pick the one whose "filename" matches, because I cannot find a direct relationship between the question name and the values contained in "_attachments".

Hi @kltapias, we use the method here (which Django uses in the background) to normalize the filename and then the method here to do the matching. This is how the export finds the attachment URL when you have the “Include media URLs” option checked.
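
For API users, that two-step approach can be sketched roughly as follows. This is not the export code itself. It assumes the normalization is equivalent to Django's `get_valid_filename` and that the last path segment of each "filename" in "_attachments" is the normalized name. `normalize_filename` and `find_attachment` are hypothetical helper names.

```python
import re
from typing import Optional


def normalize_filename(name: str) -> str:
    # Assumed equivalent of Django's get_valid_filename.
    return re.sub(r"[^-\w.]", "", str(name).strip().replace(" ", "_"))


def find_attachment(submission: dict, question_key: str) -> Optional[dict]:
    """Return the `_attachments` entry matching the value under question_key."""
    value = submission.get(question_key)
    if not value:
        return None
    wanted = normalize_filename(value)
    # Index attachments by the last path segment of "filename".
    by_basename = {
        att["filename"].rsplit("/", 1)[-1]: att
        for att in submission.get("_attachments", [])
    }
    return by_basename.get(wanted)


# Sample data taken from the steps above (other fields omitted).
submission = {
    "_6_Fotograf_a_del_en_evistad_o_de_su_D_I": "copy(2) ñ firma´ special ch&ars-11_37_44.png",
    "_attachments": [
        {
            "filename": "username/attachments/f027b0315cf146708d9883ccb147b578/"
                        "2f434709-74bf-4756-875f-1d4f16fe441a/"
                        "copy2_ñ_firma_special_chars-11_37_44.png",
            "id": 115756865,
        },
    ],
}

match = find_attachment(submission, "_6_Fotograf_a_del_en_evistad_o_de_su_D_I")
print(match["id"] if match else None)
```

Building the basename index once per submission avoids rescanning the attachment list when a form has several media questions.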

@Josh, thanks for replying. As you can see, I'm using the API to build custom documents from the data. It seems that both KPI and the backend use the same logic (which I also used) to find attachments: parse the name and go through each entry until a match is found.
You might consider including a unique key, such as the question name, in each entry of the attachments section; this would make the lookup from the API much easier.
