@rxgpt
Last active July 15, 2022 00:51
Sources of latency in Node.js, and how to diagnose them

I used to be responsible for several language agents at AppDynamics, including the Node.js agent. Here's what I learned. (Caveat: these notes came together in 2016-2017, and the Node world moves very quickly! The community is always adding new sources of telemetry and higher-level abstractions that make it easier to write good Node code.)

Diagnosing common performance problems in Node.js services

"Context" means the user request, route handler, middleware, 3rd-party module, helper, or callback where the problem occurs. Many of the functions involved in a user request are anonymous, which makes it a challenge to narrow down the context any further.

The "best" signal for a particular problem is the one that detects the problem and rules out the most other possibilities (e.g., event loop max tick length can indicate several different problems, and there are better signals that would more quickly narrow down these specific problems).

Each entry below names a Node.js performance problem, then lists the "best" signals for that problem, followed by what to ask next if these signals are observed.
Long-running external calls/DB calls
  • Best signals:
    • External/DB call timing
    • Number of external/DB calls
  • What to ask next:
    • What is the health of the external services/DBs?
    • Can I consolidate multiple exit calls into one?
    • What are the DB queries I'm using?
    • Where (i.e., in what context) are the calls occurring?
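
The two signals above (call timing and call count, grouped by context) can be collected with a thin wrapper around any promise-returning call. This is a minimal sketch, not how any particular agent does it: `exitCallStats`, `recordExitCall`, and `timedExitCall` are hypothetical names, and the context label is whatever string you choose (a route, a handler name).

```javascript
const { performance } = require('perf_hooks');

// Hypothetical in-memory registry: context label -> { count, totalMs, maxMs }.
const exitCallStats = new Map();

// Records one completed exit call against its context.
function recordExitCall(context, elapsedMs) {
  const stats = exitCallStats.get(context) || { count: 0, totalMs: 0, maxMs: 0 };
  stats.count += 1;
  stats.totalMs += elapsedMs;
  stats.maxMs = Math.max(stats.maxMs, elapsedMs);
  exitCallStats.set(context, stats);
  return stats;
}

// Wraps any promise-returning function so that each call is counted and timed,
// whether it resolves or rejects.
async function timedExitCall(context, fn) {
  const start = performance.now();
  try {
    return await fn();
  } finally {
    recordExitCall(context, performance.now() - start);
  }
}
```

A call site would look like `await timedExitCall('GET /orders', () => db.query(sql))`, where `db.query` stands in for whatever client your application uses. A high count with a low per-call time in one context suggests that the calls could be consolidated.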
Memory leak
  • Best signals:
    • Steady increase in heap size
    • Steady increase in GC frequency
  • What to ask next:
    • Which methods are contributing the most to heap growth?
    • Which objects have the most instances on the heap?
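
One rough way to watch for a steady increase in heap size is to sample `process.memoryUsage().heapUsed` periodically and fit a trend line to the samples. The sketch below assumes a fixed sampling interval; `slope` and `sampleHeap` are hypothetical helper names, and the threshold in the commented wiring is arbitrary. Heap usage naturally rises and falls between collections, so a short window can give false alarms.

```javascript
// Least-squares slope of equally spaced samples, in units per sample.
function slope(samples) {
  const n = samples.length;
  if (n < 2) return 0;
  const meanX = (n - 1) / 2;
  const meanY = samples.reduce((sum, y) => sum + y, 0) / n;
  let num = 0;
  let den = 0;
  samples.forEach((y, x) => {
    num += (x - meanX) * (y - meanY);
    den += (x - meanX) ** 2;
  });
  return num / den;
}

const heapSamples = [];

// Records the current heap size and returns the trend over the retained window.
function sampleHeap(maxSamples = 60) {
  heapSamples.push(process.memoryUsage().heapUsed);
  if (heapSamples.length > maxSamples) heapSamples.shift();
  return slope(heapSamples);
}

// Example wiring (assumed interval and threshold):
// setInterval(() => {
//   if (sampleHeap() > 1024 * 1024) console.warn('heap growing by more than 1 MiB per sample');
// }, 10000).unref();
```

To answer the follow-up questions about which objects dominate the heap, a heap snapshot (for example via `v8.writeHeapSnapshot()`) can be loaded into Chrome DevTools and compared against an earlier snapshot.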
CPU-intensive functions
  • Best signals:
    • CPU utilization spike
    • Event loop avg/max tick length spike
    • Steady increase in all libuv queue sizes
  • What to ask next:
    • Am I doing JSON parsing or serialization on the main thread (e.g., using JSON.parse() or JSON.stringify())?
    • Am I doing sorting/filtering/ranking/image processing on the main thread?
    • Am I doing other types of computation on the main thread?
    • If the answer to any of the three questions above is yes, in which context?
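
Newer Node releases ship an event loop delay histogram in `perf_hooks`, which gives a close proxy for the avg/max tick length signal above. The sketch below assumes a 20 ms sampling resolution and a periodic report-then-reset cycle; `readDelayMs` and `reportAndReset` are hypothetical helper names.

```javascript
const { monitorEventLoopDelay } = require('perf_hooks');

// Samples event loop delay every 20 ms; the histogram reports nanoseconds.
const histogram = monitorEventLoopDelay({ resolution: 20 });
histogram.enable();

// Converts the histogram's nanosecond readings to milliseconds.
function readDelayMs(h) {
  return {
    meanMs: h.mean / 1e6,
    maxMs: h.max / 1e6,
    p99Ms: h.percentile(99) / 1e6,
  };
}

// Hypothetical reporting hook: emit one snapshot, then start a fresh window.
function reportAndReset(h, log = console.log) {
  const snapshot = readDelayMs(h);
  log(snapshot);
  h.reset();
  return snapshot;
}
```

Calling `reportAndReset(histogram)` on a timer gives one reading per window. A spike in `maxMs` that coincides with a CPU utilization spike points toward synchronous work on the main thread rather than slow I/O.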
libuv threadpool saturation
  • Best signals:
    • process._getActiveRequests() spike
  • What to ask next:
    • Am I making many asynchronous fs calls? (These run on the libuv threadpool; synchronous fs methods block the main thread instead.)
    • Do I have a native module that is using uv_queue_work()?
    • Am I calling dns.lookup() (it runs getaddrinfo() on the libuv threadpool)?
    • Do I have a method that is making a lot of filesystem calls?
    • In what context are these calls occurring?
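
To watch the `process._getActiveRequests()` signal, it helps to break the in-flight requests down by type. Note that this is an undocumented internal API, so the sketch below feature-detects it and should be treated as best-effort; `activeRequestCounts` is a hypothetical helper name, and the type names you see depend on the Node version.

```javascript
// Counts in-flight libuv requests by constructor name.
// process._getActiveRequests() is undocumented and internal, so guard its use.
function activeRequestCounts() {
  const requests = typeof process._getActiveRequests === 'function'
    ? process._getActiveRequests()
    : [];
  const counts = {};
  for (const req of requests) {
    const type = req && req.constructor ? req.constructor.name : 'unknown';
    counts[type] = (counts[type] || 0) + 1;
  }
  return counts;
}

// Example (not executed here): issue more fs calls than there are pool threads,
// then inspect what is pending.
// for (let i = 0; i < 16; i++) require('fs').stat(__filename, () => {});
// console.log(activeRequestCounts());
```

A total that stays well above the threadpool size for long stretches suggests that work is queuing behind the pool rather than running.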
Large queue of timers, close events, or setImmediate() callbacks (these prevent the event loop from reaching the poll, timer, or close phases)
  • Best signals:
    • process._getActiveHandles() spike
    • All these events are stored in a single libuv queue and are distinguished by event type, so any more granular detection of this problem would need to be done in libuv or made available to the JS layer:
      • Timer callbacks: handles of type UV_TIMER
      • setImmediate() callbacks: handles of type UV_CHECK
      • Close events: handles for which uv_is_closing() returns non-zero
  • What to ask next:
    • Where am I calling setTimeout(), setInterval(), or setImmediate()? How frequently? Are these calls nested?
    • Are there modules or frameworks in my application that call these methods or emit close events?
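
Some of the per-type breakdown described above is now reachable from JavaScript. The sketch below assumes a Node release that provides `process.getActiveResourcesInfo()` and falls back to the internal `process._getActiveHandles()` otherwise; `activeResourceCounts` is a hypothetical helper name. The two APIs do not report identical sets of resources, so compare readings only within one Node version.

```javascript
// Counts active resources (handles, timers, and so on) by type name.
function activeResourceCounts() {
  let names = [];
  if (typeof process.getActiveResourcesInfo === 'function') {
    // Returns an array of type-name strings, one per active resource.
    names = process.getActiveResourcesInfo();
  } else if (typeof process._getActiveHandles === 'function') {
    // Internal fallback: derive a type name from each handle object.
    names = process._getActiveHandles().map((handle) =>
      handle && handle.constructor ? handle.constructor.name : 'unknown');
  }
  const counts = {};
  for (const name of names) {
    counts[name] = (counts[name] || 0) + 1;
  }
  return counts;
}
```

Sampling `activeResourceCounts()` at a fixed interval and diffing successive readings shows which resource type is growing, which narrows down where to look for nested or high-frequency timer calls.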
Recursive calls to process.nextTick() (these prevent the event loop from reaching the poll phase)
  • Best signals:
    • nextTickQueue size spike while process._getActiveHandles() stays roughly constant
  • What to ask next:
    • Does my code, or any included package, have callbacks registered with process.nextTick() that in turn register callbacks with process.nextTick()?
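
The starvation described above is easy to reproduce in isolation. In this small demonstration, a timer that is due almost immediately cannot fire until a chain of self-registering `process.nextTick()` callbacks has finished. The chain is bounded at 1000 iterations (an arbitrary number) so that the script terminates; `spin` and `order` are illustrative names.

```javascript
const order = [];

// A timer that is due almost immediately. It can only fire once the
// nextTick queue has drained completely.
setTimeout(() => order.push('timer'), 0);

// Each callback registers the next one, so the nextTick queue is never
// empty until `remaining` reaches zero.
function spin(remaining) {
  if (remaining === 0) {
    order.push('nextTick chain done');
    return;
  }
  process.nextTick(spin, remaining - 1);
}

spin(1000);

// Once both have run, `order` is ['nextTick chain done', 'timer'].
```

Replacing `process.nextTick(spin, remaining - 1)` with `setImmediate(spin, remaining - 1)` lets the event loop move through its other phases between iterations, so the timer fires long before the chain completes.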
Unused CPU cores
  • Best signals:
    • CPU utilization plateaus at some constant below 100% with increasing load.
  • What to ask next:
    • What is my libuv threadpool size?
    • How many cluster/pm2 child processes does my app use?
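
Both questions above can be answered from inside the process, and the built-in `cluster` module is one way to put idle cores to work. This is a minimal sketch assuming one worker per logical core; `workerCount` and `runClustered` are hypothetical names, and `startServer` stands in for whatever function starts your HTTP listener.

```javascript
const cluster = require('cluster');
const os = require('os');

// Number of workers to fork: one per logical core, never fewer than one.
function workerCount(cores = os.cpus().length) {
  return Math.max(1, Math.floor(cores));
}

// The libuv threadpool defaults to 4 threads unless UV_THREADPOOL_SIZE is set.
function threadpoolSize() {
  return Number(process.env.UV_THREADPOOL_SIZE) || 4;
}

// `startServer` is a hypothetical function that starts the app's listener.
function runClustered(startServer) {
  // cluster.isPrimary replaced cluster.isMaster in newer Node releases.
  const isPrimary = cluster.isPrimary !== undefined ? cluster.isPrimary : cluster.isMaster;
  if (isPrimary) {
    const count = workerCount();
    for (let i = 0; i < count; i++) cluster.fork();
    // Replace any worker that exits so that capacity stays constant.
    cluster.on('exit', () => cluster.fork());
  } else {
    startServer();
  }
}
```

If CPU utilization plateaus near 100% divided by the number of cores, the application is probably running as a single process, and forking workers (or using pm2 in cluster mode) is the first thing to try.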

