Kobo install process failing - connection refused

We have been running Kobo on our own server for almost a year now (we are very happy with and grateful for the project, btw). Yesterday it stopped working, so we ran python3 run.py --stop and then python3 run.py again.
The process starts normally without errors, but stays stuck at "Waiting for environment to be ready. It can take a few minutes".

We have been getting the following error in the log and have no idea how to address it. Any pointers would be much appreciated:

enketo_express_1  | 21:23:27 0|enketo  | Worker 130 sadly passed away. It will be reincarnated.
enketo_express_1  | 21:23:43 0|enketo  | Worker 10 ready for duty at port 8005! (environment: production)
enketo_express_1  | 21:23:43 0|enketo  | Error: Redis connection to redis-cache.kobo.nuup.private:6380 failed - write EPIPE
enketo_express_1  | 21:23:43 0|enketo  |     at afterWriteDispatched (internal/stream_base_commons.js:156:25)
enketo_express_1  | 21:23:43 0|enketo  |     at writeGeneric (internal/stream_base_commons.js:147:3)
enketo_express_1  | 21:23:43 0|enketo  |     at Socket._writeGeneric (net.js:787:11)
enketo_express_1  | 21:23:43 0|enketo  |     at Socket._write (net.js:799:8)
enketo_express_1  | 21:23:43 0|enketo  |     at doWrite (_stream_writable.js:403:12)
enketo_express_1  | 21:23:43 0|enketo  |     at clearBuffer (_stream_writable.js:542:7)
enketo_express_1  | 21:23:43 0|enketo  |     at onwrite (_stream_writable.js:454:7)
enketo_express_1  | 21:23:43 0|enketo  |     at afterWriteDispatched (internal/stream_base_commons.js:159:5)
enketo_express_1  | 21:23:43 0|enketo  |     at writeGeneric (internal/stream_base_commons.js:147:3)
enketo_express_1  | 21:23:43 0|enketo  |     at Socket._writeGeneric (net.js:787:11) {
enketo_express_1  | 21:23:43 0|enketo  |   errno: 'EPIPE',
enketo_express_1  | 21:23:43 0|enketo  |   code: 'EPIPE',
enketo_express_1  | 21:23:43 0|enketo  |   syscall: 'write'
enketo_express_1  | 21:23:43 0|enketo  | }

The error keeps repeating. Our setup is the default with our own domain.

If we disable Redis, we get a similar error, now ECONNREFUSED instead of EPIPE, and the environment still does not start.

(Our DNS setup hasn’t changed and seems to be ok)

Hello @mgonzalez, what do you get when you run this command?

ps -ef | grep redis

Thanks for the response. I get:

systemd+   50933   50911  0 15:41 ?        00:00:00 redis-server *:6379
systemd+   51021   50966  0 15:41 ?        00:00:00 redis-server *:6379
azureus+   54078   39032  0 15:42 pts/0    00:00:00 grep --color=auto redis

Both redis-server processes are listening on port 6379, so there is clearly an issue with the Redis containers.
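To make the check explicit: in the `ps` output above, both `redis-server` processes report `*:6379`, while the Enketo error in the first post expects redis-cache on port 6380. Below is a minimal Python sketch of that check, assuming the `ps -ef` output format shown above. The function name `redis_ports` is made up for illustration and is not part of kobo-install.

```python
import re
from collections import Counter

# Matches the "redis-server *:6379" part of a `ps -ef` line and captures the port.
REDIS_CMD = re.compile(r"redis-server\s+\S*:(\d+)")

def redis_ports(ps_output: str) -> Counter:
    """Count how many redis-server processes listen on each port."""
    ports = Counter()
    for line in ps_output.splitlines():
        match = REDIS_CMD.search(line)
        if match:
            ports[int(match.group(1))] += 1
    return ports

# The output posted above, pasted in as sample input.
sample = """\
systemd+   50933   50911  0 15:41 ?        00:00:00 redis-server *:6379
systemd+   51021   50966  0 15:41 ?        00:00:00 redis-server *:6379
azureus+   54078   39032  0 15:42 pts/0    00:00:00 grep --color=auto redis
"""

ports = redis_ports(sample)
# Ports used by more than one redis-server process.
duplicates = [port for port, count in ports.items() if count > 1]
print(duplicates)  # prints [6379]
```

A non-empty list here corresponds to the situation described in this thread: two Redis processes on the same port.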

Any suggestions on how to fix it? As I mentioned before, even if we disable Redis we still get an error.

This is the solution. If you are able to SSH into the server, make the changes below in your redis entrypoint.sh.

:partying_face: This worked!! We are extremely grateful. (our server supporting small farmers in Mexico is back online thanks to you).

:pray: I understand what it means when such servers are down, I am happy that this worked for you

Hello guys, I have been getting a lot of private messages about the error above, or anything along the lines of the one below, so I am adding this explanation/solution here to help the community:

redis-cache.domain.private:6380 failed … ECONNREFUSED 192.168.16.5:6380

  1. Check whether redis main and redis cache are running on the same port, i.e. port 6379.
    Run the command below:

ps -ef | grep redis

If redis main and redis cache are running on the same port, then the issue is caused by Redis.

  2. Stop KoboToolbox using the command below:

sudo python3 run.py --stop

  3. Replace the apt-get calls from redis/entrypoint.sh in your kobo-docker/redis/entrypoint.sh
    as in the screenshot below. This change drastically reduces the Redis start-up time and hence avoids the loop of Enketo trying to reconnect.

  4. Start KoboToolbox using the command below:

sudo python3 run.py

KoboToolbox should now start without any issue.
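The explanation above is that Enketo keeps trying to reconnect while Redis is still starting. One way to see whether a Redis port is accepting connections yet is a plain TCP probe. This is only a sketch: `wait_for_port` is a hypothetical helper, not part of kobo-install, and the host and port in the usage comment are placeholders to replace with the values from your own error message. It has to run from a machine or container that can actually reach that address.

```python
import socket
import time

def wait_for_port(host: str, port: int, timeout: float = 60.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            # create_connection raises OSError (e.g. connection refused)
            # while nothing is listening on the port yet.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)

# Example with placeholder values:
#   wait_for_port("127.0.0.1", 6379, timeout=30)
```

A successful probe only shows that something accepts TCP connections on that port; it does not prove that it is the Redis instance Enketo expects.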

It is sudo python3 run.py

Thank you!! Thank you!!! I was doing a routine service account password rotation that required me to stop and restart the containers to load the new env variable. I have done this at least a dozen times with no issue. Then, of course, last night during the midnight maintenance window… the containers wouldn't come back up and the Enketo workers kept passing away because they couldn't connect, and I had absolutely NO idea why! After some choice language, I found your post. The steps worked perfectly, but man, what a find, @stephenoduor!!

Any idea why Redis main and Redis cache were suddenly trying to use the same port? It is very strange that they just started conflicting…

Anyway, a late night saved again by the Kobo community!

I have almost the same problem. Can anyone help me with this?
I installed Kobo on a clean Ubuntu 20 and the environment won't start.
Waiting for environment to be ready. It can take a few minutes.

Wait another 600 sec?
Below is the contents of the logs

enketo_express_1 | Worker 21468 sadly passed away. It will be reincarnated.
enketo_express_1 | Worker 21462 sadly passed away. It will be reincarnated.
enketo_express_1 | Error: getaddrinfo ENOTFOUND redis-cache.suroo.private
enketo_express_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:71:26) {
enketo_express_1 | errno: -3008,
enketo_express_1 | code: 'ENOTFOUND',
enketo_express_1 | syscall: 'getaddrinfo',
enketo_express_1 | hostname: 'redis-cache.suroo.private'
enketo_express_1 | }
enketo_express_1 | Worker 21488 sadly passed away. It will be reincarnated.
enketo_express_1 | Error: getaddrinfo ENOTFOUND redis-main.suroo.private
enketo_express_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:71:26) {
enketo_express_1 | errno: -3008,
enketo_express_1 | code: 'ENOTFOUND',
enketo_express_1 | syscall: 'getaddrinfo',
enketo_express_1 | hostname: 'redis-main.suroo.private'
enketo_express_1 | }
enketo_express_1 | Worker 21530 sadly passed away. It will be reincarnated.
enketo_express_1 | Error: getaddrinfo ENOTFOUND redis-main.suroo.private
enketo_express_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:71:26) {
enketo_express_1 | errno: -3008,
enketo_express_1 | code: 'ENOTFOUND',
enketo_express_1 | syscall: 'getaddrinfo',
enketo_express_1 | hostname: 'redis-main.suroo.private'
enketo_express_1 | }
enketo_express_1 | Worker 21544 sadly passed away. It will be reincarnated.
enketo_express_1 | Error: getaddrinfo ENOTFOUND redis-cache.suroo.private
enketo_express_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:71:26) {
enketo_express_1 | errno: -3008,
enketo_express_1 | code: 'ENOTFOUND',
enketo_express_1 | syscall: 'getaddrinfo',
enketo_express_1 | hostname: 'redis-cache.suroo.private'
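Note that `getaddrinfo ENOTFOUND` means the hostname could not be resolved at all, which is a different failure from the `EPIPE` and `ECONNREFUSED` errors earlier in the thread, where the name resolved but the connection failed. Below is a small Python sketch for checking name resolution. `resolve` is a hypothetical helper, and this assumes the `.private` names are only resolvable from inside the Docker network that Enketo runs in, so the check would need to run there (for example inside a container).

```python
import socket
from typing import List

def resolve(hostname: str, lookup=socket.getaddrinfo) -> List[str]:
    """Return the addresses a hostname resolves to, or [] if the lookup fails."""
    try:
        results = lookup(hostname, None)
    except socket.gaierror:
        # Resolution failed; roughly the situation Node.js reports as ENOTFOUND.
        return []
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the address.
    return sorted({info[4][0] for info in results})

# Example, using the hostnames from the log above:
#   for name in ("redis-main.suroo.private", "redis-cache.suroo.private"):
#       print(name, resolve(name) or "does not resolve")
```

The `lookup` parameter exists only so the function can be exercised without touching real DNS; in normal use it can be left at its default.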

@mgonzalez There should be two instances of Redis, i.e. redis main and redis cache.

What do you get when you run

sudo python3 run.py --logs

I am facing the same issue. I tried the solution posted in this thread, but in vain. Any help would be great. The instance was running fine until yesterday and then this suddenly started happening. Data collection in the field has stopped because of this.

Please find the logs below:

enketo_express_1 | 13:34:21 0|enketo | Worker 4999 sadly passed away. It will be reincarnated.
enketo_express_1 | 13:34:21 0|enketo | Worker 383 ready for duty at port 8005! (environment: production)
enketo_express_1 | 13:34:21 0|enketo | Error: Redis connection to redis-main.odk-collect.digitalgreen.private:6379 failed - connect ECONNREFUSED 172.18.0.5:6379
enketo_express_1 | 13:34:21 0|enketo | at TCPConnectWrap.afterConnect [as oncomplete] (net.js:1141:16) {
enketo_express_1 | 13:34:21 0|enketo | errno: 'ECONNREFUSED',
enketo_express_1 | 13:34:21 0|enketo | code: 'ECONNREFUSED',
enketo_express_1 | 13:34:21 0|enketo | syscall: 'connect',
enketo_express_1 | 13:34:21 0|enketo | address: '172.18.0.5',
enketo_express_1 | 13:34:21 0|enketo | port: 6379
enketo_express_1 | 13:34:21 0|enketo | }

But I can see in the respective container logs that redis-main and redis-cache are ready to accept connections:

1:M 11 Jul 13:33:14.488 # Server started, Redis version 3.2.12
1:M 11 Jul 13:33:14.492 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:M 11 Jul 13:33:14.492 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:M 11 Jul 13:33:14.493 * DB loaded from disk: 0.001 seconds
1:M 11 Jul 13:33:14.493 * The server is now ready to accept connections on port 6379
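The Enketo error earlier in this post names the exact host and port that refused the connection, which is worth comparing with the port Redis reports in its own log. Below is a rough Python sketch of that comparison, assuming log lines shaped like the ones quoted here. The regular expressions and the function names `enketo_targets` and `redis_listen_ports` are illustrative only.

```python
import re

# "Redis connection to <host>:<port> failed - [connect|write] <ERROR CODE>"
ENKETO_ERR = re.compile(
    r"Redis connection to (?P<host>[\w.-]+):(?P<port>\d+) failed - "
    r"(?:connect |write )?(?P<code>E[A-Z]+)"
)
# "The server is now ready to accept connections on port <port>"
REDIS_READY = re.compile(r"ready to accept connections on port (?P<port>\d+)")

def enketo_targets(log_text):
    """Return the set of (host, port, error code) tuples Enketo failed on."""
    return {
        (m["host"], int(m["port"]), m["code"])
        for m in ENKETO_ERR.finditer(log_text)
    }

def redis_listen_ports(log_text):
    """Return the set of ports a Redis log reports as ready."""
    return {int(m["port"]) for m in REDIS_READY.finditer(log_text)}

# One line from each of the logs quoted in this post.
enketo_log = (
    "Error: Redis connection to redis-main.odk-collect.digitalgreen.private:6379 "
    "failed - connect ECONNREFUSED 172.18.0.5:6379"
)
redis_log = "1:M 11 Jul 13:33:14.493 * The server is now ready to accept connections on port 6379"

print(enketo_targets(enketo_log))
print(redis_listen_ports(redis_log))
```

For the lines quoted here, both sides report port 6379, so this comparison only rules out a port mismatch for redis-main; it does not by itself explain why the connection was refused.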

@stephenoduor Can you kindly take some time to help resolve the issue?

@stephenoduor

Hi friend,
When I run the command, I see that Redis cache and Redis main are working on different ports, and the problem is as shown below.

But when I run docker ps, I see the same port on the two containers.

Hello @Shadi, send me a private message.
If possible, I will try to help you troubleshoot remotely.

Did you resolve the issue? I have the same problem here.