Today an alert in our monitoring channel reported a connection timeout to xxx.xxx.com, which is one of our production entrance servers, and then the story began …
1. Phenomenon and disk issue?
- It took me over 4 seconds to SSH into this production server, while other production servers connect in less than 1 second. I also noticed 50% packet loss to the target server.
- Since this entrance server was very short on disk space, I initially thought it was a disk issue, so I deleted some files and restarted the process. However, that didn't help, and I started to think it could be a network issue.
- I noticed the following error in kern.log, which confirmed that it must be a network issue:
nf_conntrack: table full, dropping packets
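To get a feel for how often the kernel has been dropping packets for this reason, the log can be searched for the message. This is only a minimal sketch: it assumes the kernel log lives at /var/log/kern.log (other distributions may log elsewhere), and count_conntrack_drops is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: count "table full" messages in a kernel log file.
count_conntrack_drops() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # nothing matches, so ignore the status; the count (0) is still printed.
    grep -c 'nf_conntrack: table full' "$1" || true
}

# Assumed log location; only read it if it exists and is readable.
LOG=/var/log/kern.log
if [ -r "$LOG" ]; then
    count_conntrack_drops "$LOG"
fi
```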
2. Solve the problem
After Googling it, I learned that conntrack is what makes a stateful firewall possible.
Please read Netfilter’s connection tracking system if you are interested. It also covers the basics of the Netfilter framework.
So, in a word, conntrack records the state of each connection so that traffic can be inspected and security issues such as DDoS can be mitigated.
2.1. Just tell me how to solve it
From the error above, we know the conntrack table is full.
How do we check how many entries the table currently holds? Read /proc/sys/net/netfilter/nf_conntrack_count:
root@localhost:/# cat /proc/sys/net/netfilter/nf_conntrack_count
76390
What’s the maximum size? You can get it from /proc/sys/net/netfilter/nf_conntrack_max:
root@localhost:/# cat /proc/sys/net/netfilter/nf_conntrack_max
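The two values can be compared to see how close the table is to its limit. A minimal sketch, assuming both /proc files are present (they only exist while the nf_conntrack module is loaded); conntrack_usage_pct is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: integer percentage of the conntrack table in use.
conntrack_usage_pct() {
    count=$1
    max=$2
    echo $(( $count * 100 / $max ))
}

COUNT_FILE=/proc/sys/net/netfilter/nf_conntrack_count
MAX_FILE=/proc/sys/net/netfilter/nf_conntrack_max

# Skip quietly on machines where connection tracking is not loaded.
if [ -r "$COUNT_FILE" ] && [ -r "$MAX_FILE" ]; then
    pct=$(conntrack_usage_pct "$(cat "$COUNT_FILE")" "$(cat "$MAX_FILE")")
    echo "conntrack table usage: ${pct}%"
fi
```

A value close to 100 means new connections are about to be dropped.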
Let’s just increase it. Recommended size: CONNTRACK_MAX = RAMSIZE (in bytes) / 16384 / (ARCH / 32). E.g., I have 8 GB of RAM on an x86_64 OS, so I set it to 8*1024^3/16384/2=262144, which is of course larger than the current nf_conntrack_count. The first command below applies the new limit immediately; the second persists it across reboots.
sysctl -w net.netfilter.nf_conntrack_max=262144
echo "net.netfilter.nf_conntrack_max=262144" >> /etc/sysctl.conf
Right after that, it worked. Network latency was good again and there was no more packet loss.
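The sizing rule above can be wrapped in a small function so the value does not have to be computed by hand. A minimal sketch; conntrack_max_for is a hypothetical helper name, and its arguments are the RAM size in bytes and the architecture width in bits.

```shell
#!/bin/sh
# CONNTRACK_MAX = RAMSIZE (in bytes) / 16384 / (ARCH / 32)
# Hypothetical helper: $1 = RAM in bytes, $2 = architecture width (32 or 64).
conntrack_max_for() {
    ram_bytes=$1
    arch_bits=$2
    echo $(( $ram_bytes / 16384 / ($arch_bits / 32) ))
}

# 8 GB of RAM on a 64-bit OS, as in the example above.
conntrack_max_for $(( 8 * 1024 * 1024 * 1024 )) 64
```

On a live machine the RAM size could be taken from MemTotal in /proc/meminfo (reported in kB), though that figure is slightly below the installed RAM, so the result would come out a bit lower than the round number used here.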
2.2. What if it really exceeds this max limit?
- Option 1. We can remove the state module, but then iptables no longer provides its full set of features.
- Option 2. We can use the raw table of iptables to bypass the conntrack feature.
The raw table only applies to the PREROUTING and OUTPUT chains. Since it has the highest priority (raw-->mangle-->nat-->filter), it can handle a connection before connection tracking takes place. Once a connection has been handled in the raw table, it skips the nat table and the ip_conntrack handler.
2.3. How to do without tracking state?
A review of iptables: iptables has 4 tables and 5 chains, as listed below.
- Tables: categorized by the operations they perform on data packets.
  - raw: highest priority, only applies to the PREROUTING and OUTPUT chains. When we don’t need NAT, we can use the raw table to increase performance.
  - mangle: modifies specific data packets
  - nat: NAT, port mapping, address mapping
  - filter: packet filtering
- Chains: categorized by hook.
  - PREROUTING: packets before they reach the routing table
  - INPUT: packets that have passed the routing table and are destined for the current machine
  - FORWARD: packets that have passed the routing table and are destined for another machine
  - OUTPUT: packets that originate from the current machine and go outside
  - POSTROUTING: packets just before they reach the network interface
Allow UNTRACKED connections to be accepted:
CentOS: edit the /etc/sysconfig/iptables file and append UNTRACKED to the state list on the RH-Firewall-1-INPUT line, so that it reads:
-A RH-Firewall-1-INPUT -m state --state RELATED,ESTABLISHED,UNTRACKED -j ACCEPT
Other Linux:
$ sudo iptables -A FORWARD -m state --state UNTRACKED -j ACCEPT
Use raw table rules on these ports.
# mark destination port and source port as NOTRACK
$ sudo iptables -t raw -A PREROUTING -p tcp -m multiport --dports 80,81,82 -j NOTRACK
$ sudo iptables -t raw -A PREROUTING -p tcp -m multiport --sports 80,81,82 -j NOTRACK
$ sudo iptables -t raw -A OUTPUT -p tcp -m multiport --dports 80,81,82 -j NOTRACK
$ sudo iptables -t raw -A OUTPUT -p tcp -m multiport --sports 80,81,82 -j NOTRACK
If you have only one port, use
$ sudo iptables -t raw -A PREROUTING -p tcp -m tcp --dport 80 -j NOTRACK
$ sudo iptables -t raw -A OUTPUT -p tcp -m tcp --sport 80 -j NOTRACK
$ sudo iptables -t raw -A PREROUTING -p tcp -m tcp --sport 80 -j NOTRACK
$ sudo iptables -t raw -A OUTPUT -p tcp -m tcp --dport 80 -j NOTRACK
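Since the same four rules are needed for every port list (both chains, both directions), they can be generated instead of typed. A minimal sketch that only prints the commands for review and does not change the firewall; notrack_rules is a hypothetical helper name, and the multiport form is used even for a single port.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) the four raw-table NOTRACK rules
# for a comma-separated list of TCP ports, covering both directions.
notrack_rules() {
    ports=$1
    for chain in PREROUTING OUTPUT; do
        for opt in --dports --sports; do
            echo "iptables -t raw -A $chain -p tcp -m multiport $opt $ports -j NOTRACK"
        done
    done
}

# Dry run: review the output before executing it as root.
notrack_rules 80,81,82
```

Printing first keeps the step reviewable; once the output looks right it can be piped to a root shell, for example notrack_rules 80,81,82 | sudo sh.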
3. Conclusion
- A connection timeout can’t be a disk issue. If it were a disk issue, the monitoring probe would report Server Internal Error.
- iptables has 4 tables and 5 chains. 4 tables: raw-->mangle-->nat-->filter. 5 chains: PREROUTING, INPUT, FORWARD, OUTPUT, POSTROUTING.
- When we don’t need NAT, we can use the raw table to increase performance (e.g. for web ports), but then we need an extra DDoS protection method. Remember that NOTRACK must be set up in both directions in the raw table.
- Use sysctl -w net.netfilter.nf_conntrack_max=262144 to solve it immediately. For the size calculation, please refer to the equation above.
Ref:
- http://www.pc-freak.net/blog/resolving-nf_conntrack-table-full-dropping-packet-flood-message-in-dmesg-linux-kernel-log/
- http://people.netfilter.org/pablo/docs/login.pdf
- https://wiki.mikejung.biz/Sysctl_tweaks#net.core.netdev_max_backlog
- http://blog.51cto.com/wushank/1171768
- http://www.361way.com/%E5%86%8D%E7%9C%8Bnf_conntrack-table-full%E9%97%AE%E9%A2%98/2404.html