Fwbuilder - firewall script hangs

by Herbert Straub last modified 2008-03-23 11:30

The command ip neigh flush dev eth0 loops endlessly.

Ubuntu bug no. 9106. Solved in upstream version iproute2-050816. See the iproute-flush for iproute version 20041019 (Ubuntu Hoary) and the complete package for Ubuntu Hoary.

Details: The firewall script built by fwbuilder "hangs" if I start it after compiling. The problem is this line:

  ip neigh flush dev eth0

The top command shows the looping process. Passing the -s option twice shows what is happening:

  ip -s -s neigh flush dev eth0

        10.165.166.155 lladdr 00:09:5b:ee:72:55 ref 1 used 60/60/60 nud stale

        *** Round 1, deleting 1 entries ***
        10.165.166.155 lladdr 00:09:5b:ee:72:55 ref 1 used 60/60/60 nud stale

        *** Round 2, deleting 1 entries ***
        10.165.166.155 lladdr 00:09:5b:ee:72:55 ref 1 used 60/60/60 nud stale

        *** Round 3, deleting 1 entries ***
        10.165.166.155 lladdr 00:09:5b:ee:72:55 ref 1 used 60/60/60 nud stale

    and so on...

Workarounds:

The del command cannot remove the entry:

  ip neigh del 10.165.166.155 dev eth0
  RTNETLINK answers: Invalid argument

The arp -d command, however, can stop the looping ip process:

  ip neigh flush dev eth0 &
  [1] 27892
  for a in `arp -n | awk '/^[0-9]/ { print $1; }'`; do arp -d $a; done
  <RETURN>
  [1]+  Done                    ip neigh flush dev eth0

Possible Solutions:

This error is documented in Debian bug no. 282492, and all the tips above come from there. Wilfried Weissmann created a patch, but I don't know whether it works.

Fwbuilder:

The flush functionality came with Firewall Builder 2.0.4. The firewall script created by fwbuilder contains the following code:

  $IP -4 neigh flush dev eth0 >/dev/null 2>&1
  $IP -4 addr flush dev eth0 secondary label "eth0:FWB*" >/dev/null 2>&1
  $IP -4 neigh flush dev lo >/dev/null 2>&1
  $IP -4 addr flush dev lo secondary label "lo:FWB*" >/dev/null 2>&1
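One way to keep a generated script from hanging on such a flush is to run the command under a time limit. The following is only a minimal sketch of that idea. The helper name run_with_deadline and its return value of 124 when the limit is hit are assumptions of this example; neither fwbuilder nor iproute2 provides them.

```sh
#!/bin/sh
# run_with_deadline SECONDS COMMAND [ARGS...]
# Hypothetical helper (not part of fwbuilder or iproute2).
# Starts COMMAND in the background and polls once per second.
# If COMMAND is still running after SECONDS, it is killed and the
# function returns 124; otherwise the exit status of COMMAND is returned.
run_with_deadline() {
    rwd_limit=$1
    shift
    "$@" &
    rwd_pid=$!
    rwd_waited=0
    while kill -0 "$rwd_pid" 2>/dev/null; do
        if [ "$rwd_waited" -ge "$rwd_limit" ]; then
            # Deadline reached: stop the command and collect its status.
            kill "$rwd_pid" 2>/dev/null
            wait "$rwd_pid" 2>/dev/null || true
            return 124
        fi
        sleep 1
        rwd_waited=$((rwd_waited + 1))
    done
    # The command ended on its own; pass its exit status through.
    wait "$rwd_pid"
}

# Possible use in a script like the one above (assumes $IP is set there):
#   run_with_deadline 5 "$IP" -4 neigh flush dev eth0 >/dev/null 2>&1
```

The polling loop sleeps between checks, so the wrapper itself costs almost no CPU, but a looping flush still runs at full speed until the deadline kills it.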

Patch for iproute

Another possible solution is to change iproute so that it avoids the endless loop. The source code for flush looks like this:

              for (;;) {
                      if (rtnl_wilddump_request(&rth, filter.family, RTM_GETNEIGH) < 0) {
                              perror("Cannot send dump request");
                              exit(1);
                      }
                      filter.flushed = 0;
                      if (rtnl_dump_filter(&rth, print_neigh, stdout, NULL, NULL) < 0) {
                              fprintf(stderr, "Flush terminated\n");
                              exit(1);
                      }
                      if (filter.flushed == 0) {
                              if (round == 0) {
                                      fprintf(stderr, "Nothing to flush.\n");
                              } else if (show_stats)
                                      printf("*** Flush is complete after %d round%s ***\n", round, round>1?"s":"");
                              fflush(stdout);
                              return 0;
                      }
                      round++;
                      if (flush_update() < 0)
                              exit(1);
                      if (show_stats) {
                              printf("\n*** Round %d, deleting %d entries ***\n", round, filter.flushed);
                              fflush(stdout);
                      }
              }
      }

There is no way out of the loop if the ARP entry cannot be flushed. The for (;;) construction can also be found in other .c files. In iproute.c there is an exit construction similar to the one in the following patch:

    --- ip/ipneigh.c.orig   2005-07-26 16:10:40.850647298 +0200
    +++ ip/ipneigh.c        2005-07-26 16:11:09.302486025 +0200
    @@ -410,6 +410,7 @@
                    filter.flushe = sizeof(flushb);
                    filter.rth = &rth;
                    filter.state &= ~NUD_FAILED;
    +               time_t start = time(0);

                    for (;;) {
                            if (rtnl_wilddump_request(&rth, filter.family, RTM_GETNEIGH) < 0) {
    @@ -432,6 +433,12 @@
                            round++;
                            if (flush_update() < 0)
                                    exit(1);
    +                       if (time(0) - start > 30) {
    +                               printf("\n*** Flush not completed after %ld seconds, %d entries remain ***\n",
    +                              time(0) - start, filter.flushed);
    +                               exit(1);
    +                       }
    +
                            if (show_stats) {
                                    printf("\n*** Round %d, deleting %d entries ***\n", round, filter.flushed);
                                    fflush(stdout);

This is not ideal, but it avoids an endless loop. The output of ip -s -4 neigh flush dev eth1 looks like:

    ... 

    *** Round 57934, deleting 1 entries ***

    *** Round 57935, deleting 1 entries ***

    *** Round 57936, deleting 1 entries ***

    *** Flush not completed after 31 seconds, 1 entries remain ***

The disadvantage of this solution is 100% CPU consumption for 30 seconds.

I wrote to the maintainer of iproute2, Stephen Hemminger at osdl.org. His answer:

    Thanks, this usually shows up when someone tries to run flush
    as non-root.  Some vendors added a check for getuid() != 0, but that
    fails in secure environments with capabilities and no root user.

    I'll probably just change it to try 10 times and give up.

iproute2-050816 contains the following ChangeLog entry:

    2005-08-16  Stephen Hemminger  <shemminger@osdl.org>

        * Limit ip route flush to 10 rounds.
        * Cleanup ip rule flush error message

Piotr Roszatycki tested this new version with Debian (see Debian Bug #282492), and the solution seems to work. The output:

    localhost~ # ip neigh flush dev eth1
    *** Flush not complete bailing out after 10 rounds
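The round limit can be illustrated with a small shell analogue of the flush loop: list the entries, stop if none are left, otherwise try to delete them and start the next round, but give up after a fixed number of rounds. This is only a sketch of the idea, not the iproute2 code. The function flush_bounded and its two command arguments are hypothetical names chosen for this example.

```sh
#!/bin/sh
# flush_bounded MAX_ROUNDS LIST_CMD DELETE_CMD
# Hypothetical sketch, not taken from iproute2.
# LIST_CMD must print the remaining entries, one address per line.
# DELETE_CMD is called with one address and may fail.
# Returns 0 when nothing is left, 1 when MAX_ROUNDS is exhausted.
flush_bounded() {
    max_rounds=$1
    list_cmd=$2
    delete_cmd=$3
    round=0
    while :; do
        entries=$($list_cmd)
        if [ -z "$entries" ]; then
            return 0                      # nothing left to flush
        fi
        if [ "$round" -ge "$max_rounds" ]; then
            echo "*** Flush not complete bailing out after $round rounds" >&2
            return 1                      # give up instead of looping forever
        fi
        round=$((round + 1))
        # Addresses contain no whitespace, so word splitting is safe here.
        for entry in $entries; do
            $delete_cmd "$entry" || true  # a failed delete is retried next round
        done
    done
}
```

To try it, wrap the real listing and delete commands in two shell functions and pass their names, for example flush_bounded 10 my_list my_delete, where my_list and my_delete are placeholders.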
