OpenSolaris

Bug ID 6725139
Synopsis OOW problem still present after a patch 127888-09 has been applied
State 10-Fix Delivered (Fix available in build)
Category:Subcategory network:ipfilter
Keywords OOW | dropped | ipfilter | packet | rtiq_reviewed
Responsible Engineer Alexandr Nedvedicky
Reported Against 5.10
Duplicate Of
Introduced In solaris_nevada
Commit to Fix snv_100
Fixed In snv_100
Release Fixed solaris_nevada(snv_100) , solaris_10u7(s10u7_02) (Bug ID:2167515)
Related Bugs 6723612 , 6743637
Submit Date 11-July-2008
Last Update Date 14-October-2008
Description
After installing the ipfilter patch 127888-09, there are intermittent issues with packets being dropped and OOW messages being recorded in the ipfilter logs.
I was able to reproduce the bug in my lab by using the following ruleset:
      block in log on e1000g1 all
      pass in on e1000g1 proto tcp from any to any port=5001 keep state
      block out on e1000g1 all

and the iperf network benchmark tool. The iperf server was running on the box with
ipfilter.

The iperf client was run on the client host with the following command:
     ./iperf -w 1M -c 172.16.1.1 -t 10 -P 100
The options are as follows:
      -w 1M - use 1MB TCP window
      -c 172.16.1.1 - client mode, connect to 172.16.1.1
      -t 10 - each connection from client will last 10 seconds
      -P 100 - create 100 client threads (100 parallel connections)

After a while I could see OOW log entries in the IPF log. We already know that
OOW (out-of-window) packets are reported by the fr_tcpinwindow() function.

It contains a condition, which looks as follows:
        #define MAXACKWINDOW 66000
        if ((SEQ_GE(fdata->td_maxend, end)) &&
            (SEQ_GE(seq, fdata->td_end - maxwin)) &&
            (-ackskew <= (MAXACKWINDOW << fdata->td_winscale)) &&
            ( ackskew <= (MAXACKWINDOW << fdata->td_winscale))) {
                /* packet fits TCP window */
If the condition above is true, the packet fits into the TCP window; otherwise it is reported as OOW.
The condition can be broken into subterms as follows:
        (SEQ_GE(fdata->td_maxend, end)) [1]
              end = SeqNo + payload length
              maxend = the right TCP window edge (the upper bound)
              i.e. end <= maxend
        (SEQ_GE(seq, fdata->td_end - maxwin)) [2]
              seq = SeqNo
              fdata->td_end = the last ACKed octet
              maxwin = the window size
              fdata->td_end - maxwin = the left TCP window edge (the
              lower bound); SeqNo must be greater than or equal to it
        (-ackskew <= (MAXACKWINDOW << fdata->td_winscale)) [3]
        ( ackskew <= (MAXACKWINDOW << fdata->td_winscale)) [4]
              these define the lower and upper bounds for ACK numbers
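
The four terms can be exercised in isolation with a small C sketch. This is not the ipfilter source: struct tcpdata_sketch, in_window() and all sample values are hypothetical, and SEQ_GE is assumed here to be a wraparound-safe signed comparison of 32-bit sequence numbers. Only the shape of the condition is taken from the excerpt above.

```c
#include <stdint.h>

/* Assumed definition: wraparound-safe "a >= b" for 32-bit sequence numbers. */
#define SEQ_GE(a, b)	((int32_t)((uint32_t)(a) - (uint32_t)(b)) >= 0)
#define MAXACKWINDOW	66000

/* Hypothetical stand-in for the per-direction state used in the excerpt. */
struct tcpdata_sketch {
	uint32_t td_end;	/* last ACKed octet */
	uint32_t td_maxend;	/* right window edge */
	uint32_t td_winscale;	/* window scale shift */
};

/* Returns 1 when all four terms hold, 0 otherwise. */
int
in_window(const struct tcpdata_sketch *fdata, uint32_t seq, uint32_t len,
    uint32_t maxwin, int32_t ackskew)
{
	uint32_t end = seq + len;
	/* 64-bit here only so the shift cannot overflow in this sketch. */
	int64_t acklimit = (int64_t)MAXACKWINDOW << fdata->td_winscale;

	return (SEQ_GE(fdata->td_maxend, end) &&		/* [1] */
	    SEQ_GE(seq, fdata->td_end - maxwin) &&		/* [2] */
	    -(int64_t)ackskew <= acklimit &&			/* [3] */
	    (int64_t)ackskew <= acklimit);			/* [4] */
}
```

With td_end = 100000, td_maxend = 165535, td_winscale = 0 and maxwin = 65535, a 1000-byte segment at seq 100000 passes all four terms, while the same segment at seq 30000 fails term [2] because 30000 lies below the lower bound 100000 - 65535 = 34465.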

Terms [1] and [2] define the window for the sequence number: the sequence number
must fit into the interval they define. Analysis of the data I got from the test case
described at the top showed that the packet was dropped because term [2] was not
met.

Further analysis of the TCP stream showed that this always happens for a retransmitted
packet that has already been ACKed by the destination host. The chance that the
condition is triggered increases with the load on the IPF host.

The time diagram looks as follows:

T + 0:
Client:
Client Sends Data, sets retran timer to T + 5

T + 1:
IPF:
Data seen by IPF forwarded to Server

T + 2:
Server:
Server receives the data; it is busy, so sending the ACK will be delayed

T + 4:
Server:
Server ACKs the data received in T + 2

T + 5:
IPF:
ACK from server seen by IPF. IPF adjusts the left edge (fdata->td_end = ack) and
forwards packet.

Client:
Retransmission timer times out. The data packet sent in T + 0 is retransmitted.

T + 6:
IPF:
The retransmission is seen by IPF. It does not fit the TCP window, since IPF
already knows that the packet sent in T + 0 was ACKed in T + 5. The
packet's SeqNo is below the lower bound.

Client:
Receives ACK from server

As you can see, the OOW in this case is not harmful; the only harm is some log
noise. It is caused by the natural behaviour of TCP. The only way to 'fix
it' is to allow the user to distinguish what happened. IPF will log "OOW NEG"
(negative OOW) for the case described by the time diagram above (SeqNo below the
TCP window lower bound, term [2]); "OOW POS" (positive OOW) will be logged any time
a packet does not fit the window because it exceeds the right edge of the TCP window, term [1].
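
A minimal sketch of that classification, assuming the window edges have already been computed. The helper name classify_oow(), its parameters and the order of the two checks are hypothetical; this report does not show how the delivered fix implements the distinction, and the ACK terms [3] and [4] are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed wraparound-safe "a >= b" for 32-bit sequence numbers. */
#define SEQ_GE(a, b)	((int32_t)((uint32_t)(a) - (uint32_t)(b)) >= 0)

/*
 * Hypothetical helper: returns NULL when the sequence range fits the
 * window, otherwise the label described in the report.
 *   left  = td_end - maxwin	(lower bound, term [2])
 *   right = td_maxend		(upper bound, term [1])
 */
const char *
classify_oow(uint32_t seq, uint32_t len, uint32_t left, uint32_t right)
{
	uint32_t end = seq + len;

	if (!SEQ_GE(seq, left))
		return ("OOW NEG");	/* SeqNo below the lower bound */
	if (!SEQ_GE(right, end))
		return ("OOW POS");	/* end exceeds the right edge */
	return (NULL);			/* sequence range fits the window */
}
```

With left = 2000 and right = 9000, a 100-byte segment at seq 1000 is classified as "OOW NEG", one at seq 8950 as "OOW POS" (its end, 9050, is past the right edge), and one at seq 2000 fits the window.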
Work Around
N/A
Comments
N/A