SRE Blog

Better living through reliability.

GoroutinesTooHigh Alert

2020-08-11

Our next alert (alphabetically) in our series on base alerts is GoroutinesTooHigh. This fires when a single instance of a server has many tens of thousands of goroutines open. This is very similar to the file descriptors alert.

"Goroutines are functions or methods that run concurrently with other functions or methods. Goroutines can be thought of as light weight threads." [1] They are much lighter weight than either threads or processes because they are multiplexed across a smaller number of threads. This means that you can run hundreds of thousands or even millions of goroutines simultaneously without necessarily seeing significant performance degradation (as you might with a similar number of threads).
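
To make the "many goroutines, few threads" point concrete, here is a minimal sketch that parks a large number of goroutines and reads the live count with runtime.NumGoroutine. The helper name countWhileParked and the figure of 100,000 are illustrative assumptions, not something taken from a real service.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// countWhileParked (hypothetical helper) starts n goroutines that wait on a
// shared channel, reads runtime.NumGoroutine while they are all still alive,
// then releases them and waits for every one of them to finish.
func countWhileParked(n int) int {
	var wg sync.WaitGroup
	release := make(chan struct{})

	wg.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer wg.Done()
			<-release // parked until release is closed
		}()
	}

	alive := runtime.NumGoroutine() // includes the calling goroutine

	close(release)
	wg.Wait()
	return alive
}

func main() {
	const n = 100000 // illustrative figure only

	alive := countWhileParked(n)
	fmt.Println("goroutines alive while parked:", alive)
	// GOMAXPROCS(0) reports the current setting without changing it: the
	// number of OS threads that may execute Go code at the same time.
	fmt.Println("threads running Go code at once (GOMAXPROCS):", runtime.GOMAXPROCS(0))
}
```

On a typical machine the second number is the CPU count, several orders of magnitude smaller than the first.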

However, most programs never run 100,000 or more goroutines, so an instance with tens of thousands of goroutines is more likely than not in an unexpected state (for example, it is not closing sockets properly and is leaking both goroutines and file descriptors).

The recommended threshold is a paging alert that fires when an instance has more than 20,000 goroutines for more than 5 minutes.

Batch jobs, cron jobs, and the like should not, as a general rule, have this alert configured, even if they run in production.