I recently installed a new Axigen 10.3 instance on Debian 10. This was a migration from a 10.2 instance running on Windows, but I used the “automatic migration” feature, so the original server is inconsequential for this discussion, and really could have been anything.
On my new Axigen 10.3/Debian 10 instance, I am seeing periodic problems with the IMAP service becoming unresponsive. It starts by affecting some users on some clients and then spreads to more users. The only solution I have found so far is a full restart of Axigen.
It appears potentially related to SSL/TLS termination. I suspect this because the clients get caught up on the connecting phase of IMAP and my Axigen server’s imap.txt log file starts showing log entries suggesting SSL problems. These appear as messages that look like the following (which are just some examples):
Once the problem manifests, I currently have to shut down Axigen manually and restart it. Another interesting wrinkle: when the server is in this problematic state, a shutdown takes ~30 seconds, whereas it normally shuts down in ~2 seconds.
Configuration details:
Axigen 10.3.
Debian 10.1. All up to date.
Deployed on a DigitalOcean VM.
Using Let’s Encrypt certificates.
Everything works fine normally, but eventually gets into this degraded state where IMAP is unresponsive.
Other services such as WebMail and WebAdmin respond just fine when IMAP is not responding correctly.
One further data point: when IMAP becomes unresponsive and I attempt to restart the IMAP service using the Service Management UI in WebAdmin, the restart request freezes and never completes.
The only thing that will restore IMAP is a full Axigen restart. And the issue will re-appear eventually.
We have received similar reports and we have found that the problem is related to the available entropy on the server.
You could check your current situation with the following command (values under 100
cat /proc/sys/kernel/random/entropy_avail
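If it helps, the same check can be wrapped in a small script that compares the reading against the threshold of 100 mentioned above. This is only a sketch: the function name `entropy_status` and the `ENTROPY_FILE` and `THRESHOLD` variables are illustrative and not part of Axigen.

```shell
#!/bin/sh
# Sketch: read the kernel's entropy estimate and flag low values.
# The threshold of 100 comes from the reply above; all names here are illustrative.
ENTROPY_FILE="${ENTROPY_FILE:-/proc/sys/kernel/random/entropy_avail}"
THRESHOLD="${THRESHOLD:-100}"

# Print "low" when the value ($1) is below the threshold ($2), "ok" otherwise.
entropy_status() {
    if [ "$1" -lt "$2" ]; then
        echo "low"
    else
        echo "ok"
    fi
}

if [ -r "$ENTROPY_FILE" ]; then
    avail="$(cat "$ENTROPY_FILE")"
    echo "entropy_avail=$avail status=$(entropy_status "$avail" "$THRESHOLD")"
else
    echo "cannot read $ENTROPY_FILE" >&2
fi
```

Running it a few times while IMAP is unresponsive, and again after a restart, should show whether low readings line up with the hangs.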
Because several libraries have been upgraded in Axigen 10.3, it seems that some of them switched to the blocking /dev/random source of randomness. If you are interested in this subject, you can read more about it here.
We plan to issue a fix for this behavior (for example, returning to /dev/urandom for the libraries we use internally). In the meantime, you should be able to use the following workarounds:
1/ increase the available entropy via haveged - details here
2/ disable “clients” that consume high levels of entropy (for example, temporarily disabling the cram-md5 and digest-md5 authentication methods on secure connections for the affected services)
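For workaround 1/ on Debian, the steps might look roughly like the sketch below. It assumes the Debian package and the systemd unit are both named `haveged`, and that the commands are run as root. The `run` and `enable_haveged` helpers and the `DRY_RUN` variable are illustrative; by default the script only prints what it would do.

```shell
#!/bin/sh
# Sketch: install and start haveged, then re-check available entropy.
# Assumptions: Debian package "haveged", systemd unit "haveged", root privileges
# when DRY_RUN=0. With the default DRY_RUN=1 the commands are printed, not run.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

enable_haveged() {
    run apt-get update
    run apt-get install -y haveged
    run systemctl enable --now haveged
    # Re-read the kernel's entropy estimate after the daemon has started.
    run cat /proc/sys/kernel/random/entropy_avail
}

enable_haveged
```

Invoking the script with `DRY_RUN=0` as root would execute the commands instead of printing them. The final `cat` is the same check shown earlier, so the reading can be compared with the value seen before haveged was started.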
As soon as the final fix is available, we will post an update on this topic as well.
We are very sorry for any inconvenience this may have caused you.
Thank you for the very detailed and targeted response. I will look into the workarounds; they sound very promising. And I am looking forward to the upcoming fix in a future Axigen build.