Thanks @Kal_Lam: We don’t have any problem downloading using the User Interface. The problem arises only when accessing the data through the API.
As for the curl commands:
-
curl -H "Authorization: Token XXXX" "https://kc.humanitarianresponse.info/api/v1/data/XXXX?format=csv"
works but only downloads the first 30,000 records -
curl -H "Authorization: Token XXXX" "https://kc.humanitarianresponse.info/api/v1/data/XXXX?format=csv&start=30000"
also works and downloads records from 30 to 60k, but as mentioned in the link posted by @stephanealoo there is no assurance that the records in this request will not overlap with those obtained in the first batch. To have that assurance we also need a sort parameter. -
curl -H "Authorization: Token XXXX" 'https://kc.humanitarianresponse.info/api/v1/data/XXXX?format=csv&start=30000&sort={"_id":1}'
as suggested in by @stephanealoo does not work and I get the bad gateway response shown in the first post.
I’ve taken care of encoding the url myself or letting curl doing it (using -G --data-urlencode
) to get a path like api/v1/data/XXXX?format=csv&start=30000&sort=%7B%22_id%22%3A1%7D
and it does not work either.
Below the verbose output of curl in case it’s helpful (I’ve edited out my survey id and token)
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Trying 63.34.253.220...
* TCP_NODELAY set
* Connected to kc.humanitarianresponse.info (63.34.253.220) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* successfully set certificate verify locations:
* CAfile: /etc/ssl/cert.pem
CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
} [242 bytes data]
* TLSv1.2 (IN), TLS handshake, Server hello (2):
{ [102 bytes data]
* TLSv1.2 (IN), TLS handshake, Certificate (11):
{ [4916 bytes data]
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
{ [333 bytes data]
* TLSv1.2 (IN), TLS handshake, Server finished (14):
{ [4 bytes data]
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
} [70 bytes data]
* TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
} [1 bytes data]
* TLSv1.2 (OUT), TLS handshake, Finished (20):
} [16 bytes data]
* TLSv1.2 (IN), TLS change cipher, Change cipher spec (1):
{ [1 bytes data]
* TLSv1.2 (IN), TLS handshake, Finished (20):
{ [16 bytes data]
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
* subject: CN=kobo.humanitarianresponse.info
* start date: Aug 14 00:00:00 2020 GMT
* expire date: Sep 13 12:00:00 2021 GMT
* subjectAltName: host "kc.humanitarianresponse.info" matched cert's "kc.humanitarianresponse.info"
* issuer: C=US; O=Amazon; OU=Server CA 1B; CN=Amazon
* SSL certificate verify ok.
0 0 0 0 0 0 0 0 --:--:-- 0:00:01 --:--:-- 0* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7fb0b2009600)
> GET /api/v1/data/XXX?format=csv&sort=%7B%22_id%22%3A1%7D&start=30000 HTTP/2
> Host: kc.humanitarianresponse.info
> User-Agent: curl/7.64.1
> Accept: */*
> Authorization: Token XXXX
>
* Connection state changed (MAX_CONCURRENT_STREAMS == 128)!
0 0 0 0 0 0 0 0 --:--:-- 0:02:01 --:--:-- 0< HTTP/2 502
< date: Mon, 07 Jun 2021 22:45:48 GMT
< content-type: text/html; charset=utf-8
< content-length: 158
< server: nginx/1.19.10
<
{ [158 bytes data]
100 158 100 158 0 0 1 0 0:02:38 0:02:02 0:00:36 45
* Connection #0 to host kc.humanitarianresponse.info left intact
* Closing connection 0
Thanks!