How to prevent Linux SSH client from disconnecting using ServerAliveInterval
Usually when I work on a remote server, I like having multiple ssh sessions to that server for multitasking. But sometimes a session becomes unresponsive to the ssh client, either because of network issues or because the server's keepalives do not reach the client and the session gets terminated. It is especially irritating when you need to re-initialize all ssh sessions to that specific server.
I compiled some information on how to use keepalives on the ssh client side, either to detect an unresponsive server (which leads to frozen ssh sessions) or to prevent either end from destroying an idle ssh session.
Prevent SSH Disconnect globally by using ServerAliveInterval in SSH configuration files
The first method to use ssh keepalives (ServerAliveInterval) is to edit the global configuration file /etc/ssh/ssh_config or /usr/local/etc/ssh/ssh_config and uncomment or add the option:
Code:
...
ServerAliveInterval 30
...
Here 30 is the number of seconds without data from the server after which the client sends a probe.
Prevent SSH Disconnect by using ServerAliveInterval in ~/.ssh/config file
The second method to use ssh keepalive is in the user's ssh client configuration file ~/.ssh/config. This method is good when you don't have permissions to change the global ssh configuration file:
Code:
Host *
ServerAliveInterval 30
This sets the keepalive interval to 30 seconds for all hosts. It can also be set on a per-host basis.
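A per-host setup might look like the sketch below. It writes the stanzas to a scratch file rather than to ~/.ssh/config, so it is safe to try. The host alias "build-box" and the address 192.0.2.10 are made up for illustration. The specific host is listed before "Host *" because ssh uses the first value it obtains for each option.

```shell
# Hypothetical host alias and address, used only for illustration.
cfg="$(mktemp)"
cat > "$cfg" <<'EOF'
# Specific hosts go first: ssh uses the first value it finds for each option.
Host build-box
    HostName 192.0.2.10
    ServerAliveInterval 15
    ServerAliveCountMax 4

# Fallback for every other host.
Host *
    ServerAliveInterval 30
EOF

# Print the effective values if the local ssh supports -G (OpenSSH 6.8 or later).
# Nothing is printed when ssh is missing or too old.
if command -v ssh >/dev/null 2>&1; then
    ssh -F "$cfg" -G build-box 2>/dev/null | grep -i '^serveralive' || true
fi
```

Once the values look right, the same stanzas can be copied into ~/.ssh/config.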
Prevent SSH Disconnect by using ServerAliveInterval option in SSH cli command
The third method to use ssh keepalive is by supplying the option to ssh command line:
Code:
$ ssh -o ServerAliveInterval=10 user@remote-ssh-server-ip
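If you use the command-line form often, a small shell function saves retyping the options. This is only a sketch: the name "ssh_keepalive" is made up, and the values 30 and 3 are example choices, not recommendations.

```shell
# Hypothetical wrapper: passes the keepalive options, then forwards
# every other argument to ssh unchanged.
ssh_keepalive() {
    ssh -o ServerAliveInterval=30 -o ServerAliveCountMax=3 "$@"
}

# Usage (not run here):
#   ssh_keepalive user@remote-ssh-server-ip
```

Options given with -o on the command line take precedence over the values in the configuration files.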
More information about ssh ServerAliveInterval option
This directive instructs the ssh client to use the ssh encrypted channel for sending the probes. It means that the client sends an encrypted message to the server, the server replies and then the client acknowledges. These messages are encrypted using parameters negotiated at the establishment of the ssh session.
ServerAliveInterval works in tandem with ServerAliveCountMax (default 3). The client sends a probe whenever "ServerAliveInterval" seconds pass without data from the server. If "ServerAliveCountMax" consecutive probes go unanswered, the client disconnects. As mentioned, this serves two purposes: it keeps the ssh session from going idle and being terminated by the client, the server, or a firewall in between, and it detects an unresponsive server sooner, so the ssh session does not freeze.
Both ssh options are described in "man ssh_config":
Quote:
ServerAliveCountMax
Sets the number of server alive messages (see below) which may be sent without ssh(1) receiving any messages back from the server. If this threshold
is reached while server alive messages are being sent, ssh will disconnect from the server, terminating the session. It is important to note that the
use of server alive messages is very different from TCPKeepAlive (below). The server alive messages are sent through the encrypted channel and therefore
will not be spoofable. The TCP keepalive option enabled by TCPKeepAlive is spoofable. The server alive mechanism is valuable when the client or
server depend on knowing when a connection has become inactive.
ServerAliveInterval
Sets a timeout interval in seconds after which if no data has been received from the server, ssh(1) will send a message through the encrypted channel
to request a response from the server. The default is 0, indicating that these messages will not be sent to the server, or 300 if the BatchMode
option is set. This option applies to protocol version 2 only. ProtocolKeepAlives and SetupTimeOut are Debian-specific compatibility aliases for
this option.
TCPKeepAlive
Specifies whether the system should send TCP keepalive messages
to the other side. If they are sent, death of the connection or
crash of one of the machines will be properly noticed. However,
this means that connections will die if the route is down tempo-
rarily, and some people find it annoying.
The default is ``yes'' (to send TCP keepalive messages), and the
client will notice if the network goes down or the remote host
dies. This is important in scripts, and many users want it too.
To disable TCP keepalive messages, the value should be set to
``no''.
The bottom line: with a ServerAliveInterval of 30 and the default ServerAliveCountMax of 3, the client terminates the ssh session after about 90 seconds of the server being unresponsive, instead of leaving it frozen until tcp keepalives kick in.
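The arithmetic behind that figure can be sketched as a small helper. The function name "alive_timeout" is made up for illustration, and the result is approximate, since ssh counts one interval per unanswered probe.

```shell
# Approximate seconds before the client gives up on a silent server:
# one interval per unanswered probe.
alive_timeout() {
    interval=$1
    count_max=${2:-3}   # 3 is the ssh_config default for ServerAliveCountMax
    echo $(( interval * count_max ))
}

alive_timeout 30      # prints 90
alive_timeout 15 4    # prints 60
```

A shorter interval detects a dead server faster, at the cost of more probe traffic on idle sessions.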
What is the difference between ssh keepalives and tcp keepalives?
As stated above, ssh server alive interval allows the ssh client to use the ssh encrypted channel to send probes and receive the response from the server.
TCP keepalive is a TCP mechanism in which the kernel sends keepalive probes once it detects that the tcp session (ssh runs over tcp) has been idle for a configurable number of seconds. It is a lower-level mechanism. The downside is that tcp keepalive, by default, kicks in only after a long idle time (the usual Linux default is 7200 seconds, i.e. two hours). Linux defines this under the "tcp_keepalive_time" kernel parameter and FreeBSD under "net.inet.tcp.keepidle".
Another disadvantage of the kernel's tcp keepalive mechanism is that the messages can be spoofed by an attacker, which could prevent the parties from terminating a faulty ssh session.
Below is a capture of tcp keepalive for an idle ssh session under FreeBSD, where "net.inet.tcp.keepidle: 600000" (milliseconds) equals 600 seconds.
Code:
...
00:40:41.130022 IP 10.1.1.1.22 > 192.168.1.4.39507: Flags [P.], seq 3298:3362, ack 4088, win 49, options [nop,nop,TS val 557287680 ecr 2501000757], length 64
00:40:41.229270 IP 192.168.1.4.39507 > 10.1.1.1.22: Flags [.], ack 3362, win 4107, options [nop,nop,TS val 2501001352 ecr 557287680], length 0
00:50:41.129352 IP 192.168.1.4.39507 > 10.1.1.1.22: Flags [.], ack 3362, win 4107, length 0
00:50:41.184584 IP 10.1.1.1.22 > 192.168.1.4.39507: Flags [.], ack 4088, win 49, options [nop,nop,TS val 557437692 ecr 2501001352], length 0
The above output shows the last data packets at the 00:40:41 timestamp. After 10 minutes, at 00:50:41, the tcp keepalive mechanism in the client's FreeBSD kernel kicks in.
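The 10-minute gap follows directly from the keepidle value, which FreeBSD stores in milliseconds. The sketch below does the conversion and, on a Linux machine, also prints the local idle time. The helper name "ms_to_seconds" is made up for illustration.

```shell
# Convert a FreeBSD-style millisecond value to seconds.
ms_to_seconds() {
    echo $(( $1 / 1000 ))
}

ms_to_seconds 600000   # prints 600

# On Linux the idle time is already in seconds and is exposed under /proc.
linux_keepalive=/proc/sys/net/ipv4/tcp_keepalive_time
if [ -r "$linux_keepalive" ]; then
    echo "tcp_keepalive_time: $(cat "$linux_keepalive") seconds"
fi
```

The same values can be read with "sysctl -n net.inet.tcp.keepidle" on FreeBSD and "sysctl -n net.ipv4.tcp_keepalive_time" on Linux.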
Now for an SSH serveralive output:
Code:
# ssh -o TCPKeepAlive=yes -o ServerAliveInterval=15 10.1.1.1
Code:
...
01:22:19.965455 IP 10.1.1.1.22 > 192.168.1.4.30178: Flags [P.], seq 3298:3362, ack 4088, win 49, options [nop,nop,TS val 557912384 ecr 2503499100], length 64
01:22:20.065124 IP 192.168.1.4.30178 > 10.1.1.1.22: Flags [.], ack 3362, win 4107, options [nop,nop,TS val 2503500188 ecr 557912384], length 0
01:22:35.211919 IP 192.168.1.4.30178 > 10.1.1.1.22: Flags [P.], seq 4088:4152, ack 3362, win 4107, options [nop,nop,TS val 2503515334 ecr 557912384], length 64
01:22:35.267361 IP 10.1.1.1.22 > 192.168.1.4.30178: Flags [P.], seq 3362:3394, ack 4152, win 49, options [nop,nop,TS val 557916210 ecr 2503515334], length 32
01:22:35.367115 IP 192.168.1.4.30178 > 10.1.1.1.22: Flags [.], ack 3394, win 4107, options [nop,nop,TS val 2503515490 ecr 557916210], length 0
01:22:20 is the timestamp of the last data packet. 15 seconds later, three more messages are exchanged: (1) the encrypted ssh probe from the ssh client to the server, (2) the response from the ssh server to the client, and (3) the TCP ACK with which the client acknowledges the server's response. This third message is sent by the client's kernel, not by ssh.