HedgeDoc not responding / node process consuming 100% CPU

Hello all, maybe you can give me some hints?

My version of HedgeDoc is: 1.7.2

What I expected to happen:

HedgeDoc working normally

What actually happened:

After 3 years of running HedgeDoc without any issues, it is no longer responding. (Once, I got the message in the browser: “I’m busy right now. Please try again later”. Now I only get timeouts.)

The node process is consuming 100% CPU.
Could you give me some hints on where to look for troubleshooting?

I already tried:

Restart

Additional Info:

Two days ago (before the restart) there were some of these messages in the log (not sure if they are connected to the issue):
Jun 01 01:09:36 yarn[3318013]: 2024-05-31T23:09:36.139Z error: Operation timeout
Jun 01 01:09:36 yarn[3318013]: 2024-05-31T23:09:36.139Z error: read history failed: SequelizeConnectionAcquireTimeoutError: Operation timeout

After the last restart, the log file shows many entries like this, which seem to be normal:
yarn[3478]: 2024-06-03T08:32:29.568Z info: deserializeUser: 16907837-22fe-4447-b148-86f55565ae81

But now these messages also appear sometimes:
MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 disconnect listeners added to [Socket]. Use emitter.setMaxListeners() to increase limit
MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 operation listeners added to [Socket]. Use emitter.setMaxListeners() to increase limit
MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 get_operations listeners added to [Socket]. Use emitter.setMaxListeners() to increase limit
MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 selection listeners added to [Socket]. Use emitter.setMaxListeners() to increase limit

Regards
Dirk

Dear all,

it turned out that client accesses from a specific IP address caused the server overload.

After I blocked that IP address in the firewall, the HedgeDoc service is functioning normally again.

What I don’t know at this time: was it an attack, or a malfunctioning client?

In case I get new information, I will let the forum know.

Regards
Dirk

Dear all,

now I know that it wasn’t an attack.

The accesses from one user somehow overloaded the server.

He said he had many notes open simultaneously, about 50.

But can this really be the reason? 50 doesn’t sound like that much to me.

Is there any configuration value that I could increase to allow more simultaneously open notes?

Regards
Dirk

Hi @dirk and welcome to the HedgeDoc community!

50 open notes usually shouldn’t be a problem; the demo instance, for example, always has about 200 notes opened simultaneously. Of course the load also depends on your hardware, but HedgeDoc in general has quite a low resource usage. Maybe the number of open file descriptors or other (u)limits in the system were reached, and this caused the malfunctions?
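
To check the file descriptor theory on a Linux host, something like the following could help. It is only a minimal sketch: it assumes `/proc` is available and that you know the PID of the node process, and the function names (`parse_nofile_limit`, `count_open_fds`, `fd_usage`) are made up for illustration.

```python
import os
from typing import Tuple


def parse_nofile_limit(limits_text: str) -> Tuple[str, str]:
    """Extract the (soft, hard) "Max open files" values from the text of
    /proc/<pid>/limits. Values stay strings because they may be "unlimited"."""
    prefix = "Max open files"
    for line in limits_text.splitlines():
        if line.startswith(prefix):
            fields = line[len(prefix):].split()
            return fields[0], fields[1]
    raise ValueError("no 'Max open files' line found")


def count_open_fds(pid: int) -> int:
    """Count the entries in /proc/<pid>/fd (one per open file descriptor).
    Needs to run as the process owner or as root."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def fd_usage(pid: int) -> str:
    """Summarize open descriptors against the limits of the given process."""
    with open(f"/proc/{pid}/limits") as fh:
        soft, hard = parse_nofile_limit(fh.read())
    return f"pid {pid}: {count_open_fds(pid)} open fds (soft limit {soft}, hard limit {hard})"
```

Calling `fd_usage(<pid of the node process>)` while the instance is under load would show whether the count gets close to the soft limit.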

Regarding your question about the HedgeDoc configuration: There’s the variable CMD_TOOBUSY_LAG which defines the maximum CPU time in milliseconds for one tick of the internal processing loop. It defaults to 70 ms and usually that’s fine. You can play around with the value a bit: higher values may make the “I’m busy right now” messages less frequent, but increase the overall lag of the application. If this was a one-off incident, however, and your instance is running fine otherwise, I wouldn’t change too much.
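
To illustrate what such a lag threshold means, here is a small sketch. HedgeDoc runs on Node.js, so this Python/asyncio version is only an analogy and not HedgeDoc's actual code: it measures how late a timer fires when something blocks the event loop and compares that against a threshold. The names `sample_loop_lag`, `is_too_busy` and `demo` are hypothetical, and the simple "any sample above the threshold" check is my own simplification.

```python
import asyncio
import time
from typing import List


async def sample_loop_lag(interval_s: float, samples: int) -> List[float]:
    """Sleep `interval_s` repeatedly and record how late (in ms) each wake-up was."""
    lags_ms = []
    for _ in range(samples):
        start = time.monotonic()
        await asyncio.sleep(interval_s)
        late_s = (time.monotonic() - start) - interval_s
        lags_ms.append(max(0.0, late_s * 1000.0))
    return lags_ms


def is_too_busy(lags_ms: List[float], max_lag_ms: float = 70.0) -> bool:
    """Hypothetical check: True if any sampled lag exceeds the threshold.
    The 70 ms default mirrors the CMD_TOOBUSY_LAG default mentioned above."""
    return max(lags_ms, default=0.0) > max_lag_ms


async def demo() -> List[float]:
    sampler = asyncio.create_task(sample_loop_lag(0.01, 5))
    await asyncio.sleep(0)   # let the sampler start its first timer
    time.sleep(0.2)          # simulate CPU-bound work that blocks the loop
    return await sampler
```

With `lags = asyncio.run(demo())`, the first sample should be far later than the others because of the blocking call, so `is_too_busy(lags)` reports the loop as busy. A higher threshold tolerates more of this lag before requests get rejected, which is the trade-off described above.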

Dear Erik,

thank you very much for your response!

The problem went away after the user closed all notes in their browser, and so far it hasn’t come back.

So at the moment I hope it stays this way. If not, I’m going to look into the OS limits or play with the CMD_TOOBUSY_LAG variable.

Kind Regards
Dirk