SRE Blog

Better living through reliability.

FileDescriptorsTooHigh Alert

2020-08-04

The next alert (alphabetically) in our series on base alerts is FileDescriptorsTooHigh. It fires when a single server instance is using more file descriptors than expected.

In Linux, a process uses a file descriptor for every file or socket it opens. Descriptors are a limited resource, since Linux caps the number a process may have open at one time. This matters for servers because a process that hits the limit cannot open new sockets or files, so it is likely returning errors to clients (it cannot accept new connections, cannot open critical files, etc.).
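To make the limit concrete, here is a minimal Go sketch of how a process might inspect its own descriptor usage on Linux. It assumes /proc is mounted; the helper names openFDCount and fdSoftLimit are made up for illustration and are not part of any library.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// openFDCount returns how many file descriptors this process currently has
// open, by listing /proc/self/fd (Linux-specific). The count includes the
// descriptor briefly used to read the directory itself.
func openFDCount() (int, error) {
	entries, err := os.ReadDir("/proc/self/fd")
	if err != nil {
		return 0, err
	}
	return len(entries), nil
}

// fdSoftLimit returns the soft RLIMIT_NOFILE limit, i.e. the per-process cap
// on open descriptors that is currently being enforced.
func fdSoftLimit() (uint64, error) {
	var lim syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &lim); err != nil {
		return 0, err
	}
	return uint64(lim.Cur), nil
}

func main() {
	n, err := openFDCount()
	if err != nil {
		fmt.Println("could not count descriptors:", err)
		return
	}
	limit, err := fdSoftLimit()
	if err != nil {
		fmt.Println("could not read limit:", err)
		return
	}
	fmt.Printf("open file descriptors: %d (soft limit %d)\n", n, limit)
}
```

Comparing the two numbers on a healthy instance is a quick way to see how much headroom a service really has before picking an alert threshold.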

A good example of this happening even in a recent language is the Go HTTP client. The caller must close the response body (and must not fumble any edge cases), or else the code will leak connections and, with them, file descriptors. [1]

The recommended threshold is a paging alert when more than 20,000 file descriptors have been open for more than 5 minutes. A service will usually have far fewer than that open, but check your graphs beforehand and adjust accordingly.

As a general rule, batch jobs, cron jobs, etc. should not have this alert configured, even if they run in production.