Internal server error when downloading encrypted forms

Thank you for confirming on this @freedim!

So it seems there is:

  • a server misconfiguration making it respond with an HTTP 403 error when clients with versions 1.16+ try to download encrypted data (that error has been here for years and does not occur with ODK-Central);
  • some problem in the new way KoBo server delivers XPath files, which makes clients with versions 1.15 or less miss the <label> node of attached CSV media files (that problem did not exist some months ago).

Would you, @Kal_Lam or @stephanealoo, have any visibility on the timeline to fix at least one of those problems, please? Depending on the answer, I (and maybe others) would have to make decisions, subscribe to an ODK plan, or start setting up a self-hosted ODK instance etc.

Thank you for your investigation, @freedim. I recognize that you have spent a lot of time on this already, but do you have an understanding as to what the “misconfiguration…for years” specifically is or what has changed in Briefcase versions 1.16 and above?

Given the very limited resources of everyone involved here, let’s not spend time on fixing support for old versions of Briefcase. KoBo actually uses pyxform, which is the standard tool in the ODK ecosystem for creating XML XForms, and my guess would be that a change there has altered this <label> behavior. If we are to avoid the fate you describe:

…then we must keep pace with changes to ODK Briefcase. Unfortunately, doing so seems like it will require the core developer team to somehow postpone contractually-obligated work for other humanitarian organizations to tackle this, since for over a year no one within the user community has been able to follow @lognaturel’s advice for further troubleshooting. If you or someone else would be able to do that kind of troubleshooting, it would be an immense help.

Of course, if you find your needs better met by ODK Central or another platform, please do use it. We are providing a free service in the interest of helping humanitarians worldwide, not achieving some kind of dominance amongst data-collection software. Thanks for your understanding.

Thanks for your response and the clarifications!

  • About what seems to be a “misconfiguration for years”, I don’t know at all. But since the error message is an HTTP 403 “Forbidden” error (for v1.17+) or an HTTP 502 “Bad Gateway” error (v1.16) and since it does not happen with ODK Central, this is the only idea I can come up with. And it first appeared with version 1.16.0, which was released years ago.

  • The focus on version 1.15 of ODK Briefcase is justified because it was the latest version able to download encrypted data (although it was not always able to decrypt and export it) from kc.humanitarianresponse.info. All later versions are unable to pull the encrypted collected data from that server (see previous bullet point), whereas they pull it smoothly from ODK servers. So when very sensitive data has been at stake over the last 2-3 years, many of us have been relying on v1.15 (and it was already quite hard for a simple user like me to figure out that downgrading to v1.15 was the solution).

  • Now we cannot rely on v1.15 any longer since it now crashes when pulling data from a form having an external CSV instance (whether the form is encrypted or not). Later versions can handle those CSV instances, but as said above, cannot pull encrypted data. I suppose this new problem comes from a legitimate update of KoBo’s server, but users are now left with a hard choice: either CSV instances but no encryption (and latest versions of Briefcase are fine), or encryption, which implies v1.15 or lower (at least for data pulling), which forbids CSV instances.

  • As you may see in the GitHub conversation that you point to, I discussed that issue with lognaturel. She then gave me temporary access to a public testing ODK server, where we could verify that the problem did not seem to come from ODK Briefcase. In that same conversation, she concludes:

It seems Kobo makes some kind of assumption that Briefcase was accidentally meeting previously but stopped meeting after v1.15.0. Two possible next steps for someone who needs this fixed would be to contact Kobo support and have them look at compatibility with more recent Briefcase versions and/or do a git bisect on Briefcase to track down which commit introduced the change that caused an incompatibility with Kobo.

I did the first half and reported here at the time that v1.15.0 was the last to work (v1.16.0-beta0 cannot connect to the KoBo server in a way that would allow a data pull to start, and v1.16.0 is already unable to pull encrypted data), but it seems it didn’t trigger any visible investigation. For the second half, I unfortunately have neither the time (like basically everybody, which is a shame) nor the skills to track down in which commit the bug first appeared. If you think that would be an immense help, though, I will consider spending a few hours learning what a git bisect is and how to do it, and try my luck some day in the coming months.

I must say I struggle to understand what and who is behind KoBo, ODK and the myriad of projects like XForm, XLSForm, Javarosa, Enketo etc., how it is run and the limitations or constraints on all that ecosystem. I don’t even understand what KoBo is in relation to ODK, as I initially thought KoBo was just maintaining a public (and re-faced) ODK Central instance, which obviously proved wrong.

Cheers

@freedim, we now have a GitHub issue for this. You could follow the same through this link:

Thanks again, @freedim.

We’ve found some time to investigate this, and with any luck, we can make a simple change to the way KoBo handles authentication from ODK Briefcase, and that will then allow new versions of Briefcase to retrieve encrypted submissions. I certainly do appreciate your dedication in following through on this. If another obstacle appears it would indeed be very helpful to have someone git bisect the ODK Briefcase code (this is a method of narrowing down a set of changes by half over and over again until the likely culprit is identified), but please don’t spend your time learning about Git and Java development just yet. If you have a software developer friend with free time (laughable, I know) who might like to do some investigation later should we hit a roadblock (and hopefully we won’t), I’d be eager to collaborate.
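
For readers unfamiliar with the idea, here is a minimal Python sketch of the halving search that git bisect performs. It is only an illustration: the commit names, the `first_bad_commit` helper and the "good/bad" test are hypothetical, and it assumes that once a commit is bad, every later commit is bad too.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Return the first commit for which is_good() is False.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and every commit after the first bad one is bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: index known good, hi: index known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid  # the regression was introduced after mid
        else:
            hi = mid  # the regression was introduced at or before mid
    return commits[hi]


# Hypothetical history c0..c9 in which the regression lands in c6.
history = [f"c{i}" for i in range(10)]
culprit = first_bad_commit(history, lambda c: int(c[1:]) < 6)
print(culprit)  # c6
```

With ten commits this needs three tests instead of up to eight. In a real repository the same search is driven by `git bisect start`, `git bisect good` and `git bisect bad`, with each "test" being a build of Briefcase and an attempted pull.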

Yes, it would be helpful for us to document this somewhere. Just quickly:

  • ODK, or “Open Data Kit”, in their own words “began with a vision to make open-source mobile data collection software for resource-limited settings. Over the last 13 years, the project has produced two tool suites, ODK and ODK-X, that have helped make the world a better place” (https://opendatakit.org/). ODK-X is effectively an entirely different tool suite, so let’s not worry about it here.
  • XLSForm began as a more user-friendly way to create forms (via spreadsheets) than writing XML directly. XLSForm and its Python reference implementation, pyxform, are now maintained by ODK as well as other stakeholders as described in the “History” section of XLSForm.org.
  • KoBoToolbox (since 2014) is both an open-source software package for designing, sharing, deploying, analyzing, and reporting data-collection projects as well as that same software running as a hosted service, free-of-charge to humanitarian users. KoBoToolbox is based on the XLSForm and ODK standards and certainly endeavors to keep up with them! KoBoToolbox leadership participates in the ODK Technical Advisory Board and is an XLSForm stakeholder. The KoBoToolbox graphical form builder creates XLSForm, which then is transformed into XML XForm by pyxform for consumption by other tools in the ODK ecosystem.
  • Enketo is an open-source tool developed by Enketo LLC to serve a similar purpose as the Android-exclusive ODK Collect, that is, to collect submissions, but to do so on any platform with a decent HTML5 browser. It uses the same ODK standards but is not concerned with XLSForm, because by the time a form reaches Enketo, it has already been transformed into XML XForm.

Thanks @Kal_Lam for creating this issue.

As per @jnm’s advice, I tried to troubleshoot further. I went to the getodk/briefcase GitHub repository and followed the instructions: installed IntelliJ, forked the code and cloned it on my machine.

Then I had to install the older Java 8 because older versions of Briefcase cannot be built with newer versions as they have now-deprecated dependencies.

There is a bug preventing any connection from ODK Briefcase to KoBo servers (and probably all the so-called “ona-like” servers) that appears at the second commit after release 1.15 and that is only fixed in the third commit after release 1.16.0-beta0 (commit ‘fc8b750455b6463228f32cadb6ebeaebaa881b87’). Therefore all builds between the two versions would crash even before we can see if/when/why they are not able to pull encrypted data. This also defeats, I believe, the git bisect thing. So I extracted a patch from that commit fixing the bug and I applied it to successive commits after release 1.15.0.

The last one still “working” (meaning able to download encrypted data from KoBo, even though the external CSV instances are broken) is commit ‘55fb9b804322893cde45c26a03fa5dce9acd46da’. That commit carries out all the submissions download logic in two classes called AggregateUtils and ServerFetcher, held in org.opendatakit.briefcase.util.

In the next commit and the following ones, none of which is “working”, the download logic is carried out by the class org.opendatakit.briefcase.pull.PullForm, whose method downloadForm stays blocked without throwing an exception right after clicking on the “Pull” button in the GUI, because it receives an HTTP 404 error that is not handled. The classes AggregateUtils and ServerFetcher, and the methods of theirs used in the previous commit, seem never to be called in that build (which is consistent with its description: “Use the new PullForm…”).
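
To illustrate the failure mode, here is a small Python sketch of a status check that turns an unexpected HTTP response into a visible error instead of a silent stall. This is not Briefcase’s actual Java code; `PullError` and `check_status` are hypothetical names used only for this example.

```python
from http import HTTPStatus


class PullError(Exception):
    """Raised when the server answers with a non-success status (hypothetical name)."""


def check_status(status_code: int) -> None:
    """Raise PullError for any status outside the 2xx range."""
    if not 200 <= status_code < 300:
        try:
            phrase = HTTPStatus(status_code).phrase
        except ValueError:
            # Status code not known to the standard library.
            phrase = "Unknown status"
        raise PullError(f"HTTP {status_code} {phrase}")


# The status codes mentioned in this thread, plus a successful one.
for code in (200, 403, 404, 500, 502):
    try:
        check_status(code)
        print(code, "ok")
    except PullError as err:
        print(code, "failed:", err)
```

A caller that checks every response this way would report the 404 to the user right away, which is the behaviour that seems to be missing in those builds.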

Judging from the descriptions of the following commits and from the code, as (not) far as I could read it, it seems ODK has been preparing the move to ODK Central for a while, shifting logic blocks commit after commit. And it seems KoBo has kept its communication protocol as managed by the ServerFetcher and AggregateUtils classes while ODK Briefcase was continuously drifting away from it. That drift, from the class shift described above to the HTTP 403 error witnessed much later (a few commits before release 1.16.0), spans more than 100 commits, including massive refactoring, package structure rework and logic shifts.

Going further appears overwhelming and unfeasible for me, but I hope it is not for you.

Good luck!

Thanks! I’m quite humbled by the thoroughness of your investigation efforts. I would clarify that I do not think that the ODK Central-related changes have intended to break compatibility with any other tools in the ecosystem, including those, like KoBoToolbox and Ona, that trace their lineage back to Columbia University’s Formhub project.

Thanks @jnm for the explanations and kind words.

Glad to read that some investigation will be carried out. I spent the whole day yesterday (Middle-East timezone…) on it and found it very hard to make even a little progress.

I didn’t mean ODK intended to make it hard or incompatible. Sorry, English is not my native language, so I may phrase certain things poorly. Actually the patch I extracted came from a commit precisely described as meant to maintain compatibility with “ona-like” servers while the way ODK-aimed requests are built was evolving. It seems requests and their building logic evolved steadily during that period between 1.15 and 1.16 (probably along with ODK Central development), including whether credentials are included, management of the “cursor”, and management of media files (which I believe encrypted data is). So maybe maintaining compatibility with non-ODK servers turned out to be a kind of goodwill effort with less testing capacity on their side.

These days, when data security is gaining a lot of traction as awareness and scandals both rise, including in the humanitarian sector, I think having a flawless encryption flow is a must. And by the way, I was very happily surprised when I discovered very recently that digest() now supports SHA-256. I rely a lot on hashes to obfuscate links between personal IDs and personal data, and between every piece of personal data, when I upload them in media CSVs. But so far, KoBo/Enketo had only supported MD5, which was cracked long ago. Now I can really claim my obfuscation scheme is robust.
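
As a rough sketch of that kind of obfuscation when preparing a media CSV outside the form, the Python below hashes an ID with SHA-256 from the standard library. The `obfuscate_id` helper, the salt and the sample IDs are hypothetical, and the hex output is an assumption: check which encoding your form’s digest() call produces before comparing values.

```python
import hashlib


def obfuscate_id(personal_id: str, salt: str = "") -> str:
    """Return the hex SHA-256 digest of salt + personal_id (UTF-8 encoded)."""
    return hashlib.sha256((salt + personal_id).encode("utf-8")).hexdigest()


# Hypothetical IDs; in practice these would come from the registration data.
for pid in ["ID-0001", "ID-0002"]:
    print(pid, "->", obfuscate_id(pid, salt="project-secret"))
```

The salt matters when IDs are short or guessable: without it, anyone can hash every candidate ID and match the results against the CSV.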

Good luck!

Hi @freedim, thank you for your patience as we sorted this out. Please note that a fix has been merged into the master branch and will be deployed to production soon.

Thanks a lot, @Josh , this is really exciting!

Now excuse my ignorance and impatience, but are you able to tell me how much time it will take to have the fix deployed in production?

Hi @freedim, currently there is no set date for the next release, but I will see if this can be pushed out ASAP.

Hello! I’m new to Kobo & ODK Briefcase, and I am currently facing an issue which I think is the same as you are describing here:

I have an encrypted Kobo form, which I try to download and decrypt using ODK Briefcase 1.15.0. Three days ago (25/10/2021) this worked without any problems. However, now it is no longer working (“Failed; error 500”, “success with errors”). Do you know if there is a quick fix for this issue? A different version of Briefcase I should try (if yes, which one)?

Welcome to the community, @MariekeM! Could you also let us know the server you are using? Are you on a self-hosted server or are you using one of the publicly hosted servers?

Hi @Kal_Lam, thanks for your message! I’m using the kobo server (https://kc.humanitarianresponse.info).

As a cross-check, would you mind uploading a dummy project, encrypting it, and then trying to download your encrypted project with ODK Briefcase to see if it works?

So this is very strange… I copied the exact same form as I am currently using (an encrypted one), uploaded it to Kobo, and ran a few test surveys. I can download and decrypt the data of this dummy project without any problems using ODK Briefcase 1.15.0. However, when performing the exact same steps, it still doesn’t work with the real form (“Submission not retrieved: org.opendatakit.briefcase.util.ServerFetcher$SubmissionDownloadException: Fetch of a submission failed. Detailed error: Internal Server Error (500) while accessing …”).

Thank you for confirming that it works!

Ah yes, but the problem is that it works with the test survey (= a copy of my form with a few test responses). But I still cannot download and decrypt my real data (which I could a few days ago…)… Do you know why this could be the case, or is there something else I should check?

Maybe you need to check the keys that you used when encrypting your survey form?