Fwbuilder - firewall script hangs
The command ip neigh flush dev eth0 loops.
Ubuntu bug no. 9106 - solved in upstream version iproute2-050816. See iproute-flush for iproute version 20041019 (Ubuntu Hoary) and the complete package for Ubuntu Hoary.
Details: The firewall script built by fwbuilder "hangs" when I start it after compiling. The problem is in the line:
ip neigh flush dev eth0
The top command shows the looping process. The double -s option shows what is going on:
ip -s -s neigh flush dev eth0
10.165.166.155 lladdr 00:09:5b:ee:72:55 ref 1 used 60/60/60 nud stale
*** Round 1, deleting 1 entries ***
10.165.166.155 lladdr 00:09:5b:ee:72:55 ref 1 used 60/60/60 nud stale
*** Round 2, deleting 1 entries ***
10.165.166.155 lladdr 00:09:5b:ee:72:55 ref 1 used 60/60/60 nud stale
*** Round 3, deleting 1 entries ***
10.165.166.155 lladdr 00:09:5b:ee:72:55 ref 1 used 60/60/60 nud stale
and so on...
Workarounds:
The del command can remove the entry:
ip neigh del 10.165.166.155 dev eth0
RTNETLINK answers: Invalid argument
The arp -d command can also stop the looping ip process:
ip neigh flush dev eth0 &
[1] 27892
for a in `arp -n | awk '/^[0-9]/ { print $1; }'`; do arp -d $a; done
<RETURN>
[1]+ Done ip neigh flush dev eth0
Possible Solutions:
This error is documented in Debian bug no. 282492, and all the tips above come from there. Wilfried Weissmann created a patch, but I don't know whether it works.
Fwbuilder:
The flush functionality came with Firewall Builder 2.0.4. The firewall script created by fwbuilder contains the following code:
$IP -4 neigh flush dev eth0 >/dev/null 2>&1
$IP -4 addr flush dev eth0 secondary label "eth0:FWB*" >/dev/null 2>&1
$IP -4 neigh flush dev lo >/dev/null 2>&1
$IP -4 addr flush dev lo secondary label "lo:FWB*" >/dev/null 2>&1
Patch for iproute
Another possible solution is to change iproute so that it avoids the loop. The source code for flush looks like this:
    for (;;) {
        if (rtnl_wilddump_request(&rth, filter.family, RTM_GETNEIGH) < 0) {
            perror("Cannot send dump request");
            exit(1);
        }
        filter.flushed = 0;
        if (rtnl_dump_filter(&rth, print_neigh, stdout, NULL, NULL) < 0) {
            fprintf(stderr, "Flush terminated\n");
            exit(1);
        }
        if (filter.flushed == 0) {
            if (round == 0) {
                fprintf(stderr, "Nothing to flush.\n");
            } else if (show_stats)
                printf("*** Flush is complete after %d round%s ***\n", round, round>1?"s":"");
            fflush(stdout);
            return 0;
        }
        round++;
        if (flush_update() < 0)
            exit(1);
        if (show_stats) {
            printf("\n*** Round %d, deleting %d entries ***\n", round, filter.flushed);
            fflush(stdout);
        }
    }
}
There is no way out of the loop if the ARP entry cannot be flushed. The for (;;) construct can be found in other .c files as well. iproute.c has an exit condition like the one in the following patch:
--- ip/ipneigh.c.orig 2005-07-26 16:10:40.850647298 +0200
+++ ip/ipneigh.c 2005-07-26 16:11:09.302486025 +0200
@@ -410,6 +410,7 @@
     filter.flushe = sizeof(flushb);
     filter.rth = &rth;
     filter.state &= ~NUD_FAILED;
+    time_t start = time(0);
 
     for (;;) {
         if (rtnl_wilddump_request(&rth, filter.family, RTM_GETNEIGH) < 0) {
@@ -432,6 +433,12 @@
         round++;
         if (flush_update() < 0)
             exit(1);
+        if (time(0) - start > 30) {
+            printf("\n*** Flush not completed after %ld seconds, %d entries remain ***\n",
+                   time(0) - start, filter.flushed);
+            exit(1);
+        }
+
         if (show_stats) {
             printf("\n*** Round %d, deleting %d entries ***\n", round, filter.flushed);
             fflush(stdout);
This is not ideal, but it avoids an endless loop. The output of ip -s -4 neigh flush dev eth1 looks like:
...
*** Round 57934, deleting 1 entries ***
*** Round 57935, deleting 1 entries ***
*** Round 57936, deleting 1 entries ***
*** Flush not completed after 31 seconds, 1 entries remain ***
The disadvantage of this solution is 100% CPU consumption for 30 seconds.
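The deadline check from the patch can be looked at on its own, outside the iproute sources. The following is only a minimal sketch: flush_with_deadline, round_fn, clock_fn and the simulated helpers are hypothetical names, not part of iproute2, and the clock is passed in as a callback so the logic can be tried without waiting 30 seconds.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callbacks, not part of iproute2. */
typedef int (*round_fn)(void *ctx);    /* one flush round; returns entries it tried to delete */
typedef time_t (*clock_fn)(void *ctx); /* current time in seconds */

/* Real clock, for use outside of experiments. */
time_t wall_clock(void *ctx)
{
    (void)ctx;
    return time(NULL);
}

/* Fake clock: advances by one second on every call. */
time_t fake_clock(void *ctx)
{
    time_t *t = ctx;
    return ++*t;
}

/* Simulates the stale entry from the report: every round finds one entry again. */
int stuck_round(void *ctx)
{
    (void)ctx;
    return 1;
}

/* Simulates a healthy table with *ctx entries; one is removed per round. */
int draining_round(void *ctx)
{
    int *left = ctx;
    if (*left == 0)
        return 0;
    (*left)--;
    return 1;
}

/* Repeat flush rounds until a round finds nothing or more than `limit`
 * seconds have passed. Returns 0 when the flush completed, -1 when the
 * deadline expired. *rounds receives the number of rounds that found entries. */
int flush_with_deadline(round_fn do_round, void *round_ctx,
                        clock_fn now, void *clock_ctx,
                        time_t limit, int *rounds)
{
    assert(do_round != NULL && now != NULL && rounds != NULL);

    time_t start = now(clock_ctx);
    *rounds = 0;
    for (;;) {
        if (do_round(round_ctx) == 0)
            return 0;               /* nothing left to flush */
        (*rounds)++;
        if (now(clock_ctx) - start > limit)
            return -1;              /* same comparison as in the patch */
    }
}
```

With stuck_round and fake_clock the function gives up once more than `limit` fake seconds have elapsed. With wall_clock it would still run rounds back to back until the deadline, which is the CPU disadvantage described above.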
I wrote to the maintainer of iproute2, Stephen Hemminger at osdl.org. His answer:
Thanks, this usually shows up when someone tries to run flush
as non-root. Some vendors added a check for getuid() != 0, but that
fails in secure environments with capabilities and no root user.
I'll probably just change it to try 10 times and give up.
iproute2-050816 contains the following ChangeLog entry:
2005-08-16 Stephen Hemminger <shemminger@osdl.org>
* Limit ip route flush to 10 rounds.
* Cleanup ip rule flush error message
Piotr Roszatycki tested this new version on Debian (see Debian Bug #282492), and the solution seems to work. The output:
localhost~ # ip neigh flush dev eth1
*** Flush not complete bailing out after 10 rounds
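The "try 10 times and give up" idea can be sketched in a few lines. This is not the actual iproute2 code, only an illustration under assumptions: flush_bounded, flush_round_fn and the two simulated round functions are hypothetical names, and the message text is copied from the output above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical callback: one flush round; returns the number of entries
 * it tried to delete (0 means the table is clean). */
typedef int (*flush_round_fn)(void *ctx);

/* Simulated table with *ctx entries; one is removed per round. */
int drain_one(void *ctx)
{
    int *left = ctx;
    if (*left == 0)
        return 0;
    (*left)--;
    return 1;
}

/* Simulated stale entry that comes back in every round. */
int never_drains(void *ctx)
{
    (void)ctx;
    return 1;
}

/* Run at most max_rounds flush rounds.
 * Returns the number of rounds that found entries when the flush
 * completed, or -1 when it bailed out. */
int flush_bounded(flush_round_fn do_round, void *ctx, int max_rounds)
{
    assert(do_round != NULL && max_rounds > 0);

    for (int round = 0; round < max_rounds; round++) {
        if (do_round(ctx) == 0)
            return round;           /* flush is complete */
    }
    fprintf(stderr, "*** Flush not complete bailing out after %d rounds\n",
            max_rounds);
    return -1;
}
```

Compared with the 30 second deadline, a round limit ends almost immediately when an entry cannot be removed, so the CPU is not kept busy.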

