We have been running Kobo on our own server for almost a year now (we are very happy and grateful for the project, by the way). Yesterday it stopped working, so we ran `python3 run.py --stop` and then `python3 run.py` again.
The process starts normally without errors, but stays stuck at "Waiting for environment to be ready. It can take a few minutes."
We have been getting the following error in the log and have no idea how to address it. Any pointers would be much appreciated:
enketo_express_1 | 21:23:27 0|enketo | Worker 130 sadly passed away. It will be reincarnated.
enketo_express_1 | 21:23:43 0|enketo | Worker 10 ready for duty at port 8005! (environment: production)
enketo_express_1 | 21:23:43 0|enketo | Error: Redis connection to redis-cache.kobo.nuup.private:6380 failed - write EPIPE
enketo_express_1 | 21:23:43 0|enketo | at afterWriteDispatched (internal/stream_base_commons.js:156:25)
enketo_express_1 | 21:23:43 0|enketo | at writeGeneric (internal/stream_base_commons.js:147:3)
enketo_express_1 | 21:23:43 0|enketo | at Socket._writeGeneric (net.js:787:11)
enketo_express_1 | 21:23:43 0|enketo | at Socket._write (net.js:799:8)
enketo_express_1 | 21:23:43 0|enketo | at doWrite (_stream_writable.js:403:12)
enketo_express_1 | 21:23:43 0|enketo | at clearBuffer (_stream_writable.js:542:7)
enketo_express_1 | 21:23:43 0|enketo | at onwrite (_stream_writable.js:454:7)
enketo_express_1 | 21:23:43 0|enketo | at afterWriteDispatched (internal/stream_base_commons.js:159:5)
enketo_express_1 | 21:23:43 0|enketo | at writeGeneric (internal/stream_base_commons.js:147:3)
enketo_express_1 | 21:23:43 0|enketo | at Socket._writeGeneric (net.js:787:11) {
enketo_express_1 | 21:23:43 0|enketo | errno: 'EPIPE',
enketo_express_1 | 21:23:43 0|enketo | code: 'EPIPE',
enketo_express_1 | 21:23:43 0|enketo | syscall: 'write'
enketo_express_1 | 21:23:43 0|enketo | }
The error keeps repeating. Our setup is the default with our own domain.
If we disable Redis, we get a similar error, only with ECONNREFUSED instead of EPIPE, and the environment still never comes up.
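One way to narrow this down is to check whether the redis_cache container is actually up and answering on port 6380 (the port shown in the log above). A minimal sketch, assuming a default kobo-docker compose naming where the container name contains `redis_cache`:

```sh
# Find the redis-cache container and check that it responds on port 6380
REDIS_CACHE=$(docker ps --format '{{.Names}}' | grep redis_cache | head -n1)
docker exec "$REDIS_CACHE" redis-cli -p 6380 ping   # expect: PONG
```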
Hello guys, I have been getting a lot of private messages about the error above (or anything along the lines below), so I am adding this explanation/solution here to help the community:
You need to check whether redis_main and redis_cache are running on the same port, i.e. port 6379.
Run the command below:
ps -ef | grep redis
If redis_main and redis_cache are both running on the same port, then the issue is caused by Redis.
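A complementary check on the host (assuming the default kobo-docker ports, 6379 for redis_main and 6380 for redis_cache) is to look at which ports the two redis-server processes are actually bound to:

```sh
# List the redis-server processes and the ports they listen on;
# two processes bound to the same port confirm the conflict described above.
ps -ef | grep '[r]edis-server'
sudo ss -lntp | grep -E ':6379|:6380'
```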
Stop KoBoToolbox using the command below:
sudo python3 run.py --stop
Replace the apt-get calls in your kobo-docker/redis/entrypoint.sh as in the screenshot below. This change drastically reduces the Redis start-up time and therefore avoids Enketo looping while trying to reconnect.
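After making that change and restarting, one way to confirm that Redis now comes up quickly enough is to watch the redis_main container log for its "ready to accept connections" line. A sketch, assuming a compose container name containing `redis_main`:

```sh
# Restart KoBoToolbox, then tail the redis_main container log and wait for
# "The server is now ready to accept connections" before Enketo starts reconnecting.
sudo python3 run.py
docker logs -f --tail 50 "$(docker ps --format '{{.Names}}' | grep redis_main | head -n1)"
```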
Thank you!! Thank you!!! I was doing a routine service-account password rotation that required me to stop and restart the containers to load the new env variable. I have done this at least a dozen times with no issue. Then, of course, last night during the midnight maintenance window… the containers wouldn't come back up, the Enketo workers kept passing away because they couldn't connect, and I had absolutely NO idea why! After some choice language, I found your post. The steps worked perfectly, but man, what a find @stephenoduor!!
Is there any idea why redis_main and redis_cache suddenly started trying to use the same port? Very strange that they just conflicted out of nowhere…
Anyway, a late night saved again by the Kobo community!
I have almost the same problem. Can anyone help me with this?
I installed Kobo on a clean Ubuntu 20 machine and the environment won't start: "Waiting for environment to be ready. It can take a few minutes. … Wait another 600 sec?"
Below are the contents of the logs:
enketo_express_1 | Worker 21468 sadly passed away. It will be reincarnated.
enketo_express_1 | Worker 21462 sadly passed away. It will be reincarnated.
enketo_express_1 | Error: getaddrinfo ENOTFOUND redis-cache.suroo.private
enketo_express_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:71:26) {
enketo_express_1 | errno: -3008,
enketo_express_1 | code: 'ENOTFOUND',
enketo_express_1 | syscall: 'getaddrinfo',
enketo_express_1 | hostname: 'redis-cache.suroo.private'
enketo_express_1 | }
enketo_express_1 | Worker 21488 sadly passed away. It will be reincarnated.
enketo_express_1 | Error: getaddrinfo ENOTFOUND redis-main.suroo.private
enketo_express_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:71:26) {
enketo_express_1 | errno: -3008,
enketo_express_1 | code: 'ENOTFOUND',
enketo_express_1 | syscall: 'getaddrinfo',
enketo_express_1 | hostname: 'redis-main.suroo.private'
enketo_express_1 | }
enketo_express_1 | Worker 21530 sadly passed away. It will be reincarnated.
enketo_express_1 | Error: getaddrinfo ENOTFOUND redis-main.suroo.private
enketo_express_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:71:26) {
enketo_express_1 | errno: -3008,
enketo_express_1 | code: 'ENOTFOUND',
enketo_express_1 | syscall: 'getaddrinfo',
enketo_express_1 | hostname: 'redis-main.suroo.private'
enketo_express_1 | }
enketo_express_1 | Worker 21544 sadly passed away. It will be reincarnated.
enketo_express_1 | Error: getaddrinfo ENOTFOUND redis-cache.suroo.private
enketo_express_1 | at GetAddrInfoReqWrap.onlookup [as oncomplete] (dns.js:71:26) {
enketo_express_1 | errno: -3008,
enketo_express_1 | code: 'ENOTFOUND',
enketo_express_1 | syscall: 'getaddrinfo',
enketo_express_1 | hostname: 'redis-cache.suroo.private'
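Because these are ENOTFOUND (DNS) errors rather than refused connections, it is worth checking whether the enketo_express container can resolve the private Redis hostnames at all. A minimal sketch, assuming the container name contains `enketo_express` and that `getent` is available in the image:

```sh
# Resolve the private Redis hostnames from inside the enketo_express container;
# no output (or an error) means Docker's internal DNS cannot resolve them.
ENKETO=$(docker ps --format '{{.Names}}' | grep enketo_express | head -n1)
docker exec "$ENKETO" getent hosts redis-main.suroo.private
docker exec "$ENKETO" getent hosts redis-cache.suroo.private
```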
I am facing the same issue. I tried the solution posted in this thread, but in vain. Any help would be great. The instance was running fine until yesterday, and suddenly this started happening. Data collection in the field has stopped because of this.
But I can see in the respective container logs that redis-main and redis-cache are ready to accept connections:
1:M 11 Jul 13:33:14.488 # Server started, Redis version 3.2.12
1:M 11 Jul 13:33:14.492 # WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.
1:M 11 Jul 13:33:14.492 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
1:M 11 Jul 13:33:14.493 * DB loaded from disk: 0.001 seconds
1:M 11 Jul 13:33:14.493 * The server is now ready to accept connections on port 6379
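For what it is worth, the two WARNING lines above come with their own suggested fixes, to be applied as root on the Docker host (they address the warnings only, not the Enketo reconnect loop):

```sh
# Kernel settings recommended by the Redis warnings above (run as root on the host)
sysctl vm.overcommit_memory=1
echo 'vm.overcommit_memory = 1' >> /etc/sysctl.conf
echo never > /sys/kernel/mm/transparent_hugepage/enabled   # add to /etc/rc.local to persist across reboots
```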