OpenSolaris

Bug ID 4509869
Synopsis IPMP's address move mechanism needs to be transparent to applications
State 10-Fix Delivered (Fix available in build)
Category:Subcategory network:ipmp
Keywords clearview
Responsible Engineer Peter Memishian
Reported Against s81_47
Duplicate Of
Introduced In solaris_9
Commit to Fix snv_107
Fixed In snv_107
Release Fixed solaris_nevada(snv_107)
Related Bugs 4509788 , 6783149 , 6783150 , 6783151 , 6783152 , 6783153
Submit Date 2-October-2001
Last Update Date 28-January-2009
Description
IPMP's interface failover/failback is not transparent
to applications. Furthermore, the way in which applications are notified
about interface address transfer (one interface down, followed by another
one up) breaks normal routing protocols.  The goal of IPMP appears to be to
make interface switching transparent.  With static or passive routing, this
appears to work.  However, if we're advertising routes to other systems,
then we're obligated to notify those other systems of interface failures
that cause loss of connectivity.  Since IPMP advertises interface
failures to IP, we have to act on these, and our action will stop peers
from sending data to us for a potentially very long time (due to
common route flap hold-downs).

This problem will affect public-domain routing daemon applications like GateD and
Zebra running on Solaris with IPMP enabled.

As an example, let's consider the following setup with router A running a
RIP implementation:

     192.168.21.0                            192.168.25.0
   ------------  le0 router A --- qfe1    -------------- rtr B 
                              --- qfe2
                              --- qfe3

As you can see, the router's qfe1, qfe2, and qfe3 interfaces are all hooked
up to subnet 192.168.25.0:
qfe1 interface address is 192.168.25.1
qfe2 interface address is 192.168.25.2
qfe3 interface address is 192.168.25.3

Furthermore, let's say I have configured router A such that its qfe1,
qfe2, and qfe3 interfaces belong to a group called "test". This results in:
qfe1:1 test address is 192.168.25.101
qfe2:1 test address is 192.168.25.102
qfe3:1 test address is 192.168.25.103

Let's also assume that before failover, router A only uses qfe1 to
multicast RIP messages to routers on the 192.168.25 subnet. Thus the
(src,dst) address pair of these RIP messages is (src = 192.168.25.1, dst
= 224.0.0.2).

Let's assume there is a failover from qfe1 to qfe3. A RIP routing daemon implementation
first receives an "interface qfe1 is down" message, upon which the routing daemon
flushes all route entries relating to qfe1 and starts sending messages on
qfe2 (src = 192.168.25.2, dst = 224.0.0.2). This entails sending updated RIP messages
to neighboring routers. It then gets a "new address on
qfe3" routing message. I am assuming that RIP messages are now being sent.
But it ignores qfe3 since its on the same subnet as qfe3. The daemon
cannot reap the benefits of the failover (and continue to send with qfe1's IP
address as src address and have it sent out qfe3) because it does not get
an "interface qfe1 has changed to qfe3" notification.

The same problem occurs at failback from qfe3 to qfe1. Once again the router
first receives the "interface qfe1 is up" and "interface qfe3
has a new IP address" routing socket messages from the kernel. But it ignores both messages
(since qfe1 and qfe3 are redundant to the now active qfe2 interface) and continues to
send RIP messages via the qfe2 interface. Again it cannot take advantage of the failback,
because it does not receive an "interface qfe3 has changed to qfe1" notification.
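
The failover and failback sequence described above can be sketched as a toy
model. This is not GateD or Zebra code: the class ToyRoutingDaemon, its method
names, the /24 prefix length, and the "one interface per subnet, lowest name
first" policy are all assumptions made for illustration. The event order
follows the description in this report.

```python
import ipaddress


class ToyRoutingDaemon:
    """Hypothetical daemon that advertises on one interface per subnet."""

    def __init__(self, primary_addrs, prefix_len=24):
        self.primary = dict(primary_addrs)   # interface name -> primary address
        self.up = set(self.primary)          # interfaces believed to be up
        self.prefix_len = prefix_len         # assumed /24, not stated in the report
        self.active = sorted(self.up)[0]     # interface used for this subnet
        self.log = []

    def _subnet(self, addr):
        return ipaddress.ip_interface(f"{addr}/{self.prefix_len}").network

    def _served(self, addr):
        # True if the active interface already covers this address's subnet.
        return self._subnet(addr) == self._subnet(self.primary[self.active])

    def interface_down(self, name):
        self.up.discard(name)
        if name == self.active:
            # Flush and move to another interface on the same subnet.
            self.active = sorted(self.up)[0]
            self.log.append(f"switch to {self.active}")

    def interface_up(self, name):
        self.up.add(name)
        if self._served(self.primary[name]):
            self.log.append(f"ignore {name} up")

    def new_address(self, name, addr):
        if self._served(addr):
            self.log.append(f"ignore {addr} on {name}")

    def source(self):
        return self.active, self.primary[self.active]


def run_scenario():
    daemon = ToyRoutingDaemon({
        "qfe1": "192.168.25.1",
        "qfe2": "192.168.25.2",
        "qfe3": "192.168.25.3",
    })
    before = daemon.source()
    # Failover from qfe1 to qfe3, as seen by the daemon.
    daemon.interface_down("qfe1")
    daemon.new_address("qfe3", "192.168.25.1")
    after_failover = daemon.source()
    # Failback from qfe3 to qfe1, using the messages named in this report.
    daemon.interface_up("qfe1")
    daemon.new_address("qfe3", "192.168.25.1")
    after_failback = daemon.source()
    return before, after_failover, after_failback, daemon.log


if __name__ == "__main__":
    for item in run_scenario():
        print(item)
```

In this model the daemon ends up sending from qfe2 after both failover and
failback, and never resumes using qfe1's address, which is the behavior the
report complains about.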

Worse still, many common routing implementations (e.g., Cisco IOS)
will see our two suspiciously similar updates and decide to put us on
a black list for a few *minutes* in order to prevent propagation of
whatever bug we've got.  This will cause long-lived outages.

Please note that it would be best to make address transfer completely transparent to
applications, in which case applications don't need to do anything after the occurrence of
an address transfer (hence existing public-domain routing daemon implementations will be
able to take advantage of what IPMP has to offer, without requiring any change).
 
 xxxxx@xxxxx.com 2002-07-25

I presume that:

	"But it ignores qfe3 since its on the same subnet as qfe3"

... should actually read "... same subnet as qfe2".  Also, I'm confused: why
would failback cause a RTM_NEWADDR message for qfe3?  Seems like it should
cause an RTM_DELADDR for qfe3 and an RTM_NEWADDR for qfe1 (this is of course
assuming that the ipif associated with the address being failed back is IFF_UP).
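
To make the expectation in the comment above concrete, here is a minimal
sketch that derives address messages by comparing two per-interface address
tables. It only models what the commenter expects to see, not what the kernel
actually emits. The function address_events, the table layout, and the
"deletions before additions" ordering are assumptions for illustration.

```python
def address_events(before, after):
    """Return (message, interface, address) tuples for one table change.

    before/after map interface names to lists of addresses. Deletions are
    reported first, then additions (an assumed ordering).
    """
    names = sorted(set(before) | set(after))
    events = []
    for name in names:
        removed = set(before.get(name, ())) - set(after.get(name, ()))
        for addr in sorted(removed):
            events.append(("RTM_DELADDR", name, addr))
    for name in names:
        added = set(after.get(name, ())) - set(before.get(name, ()))
        for addr in sorted(added):
            events.append(("RTM_NEWADDR", name, addr))
    return events


# While qfe1 is failed, its address is hosted on qfe3.
during_failure = {"qfe1": [], "qfe3": ["192.168.25.3", "192.168.25.1"]}
# After failback, the address is on qfe1 again.
after_failback = {"qfe1": ["192.168.25.1"], "qfe3": ["192.168.25.3"]}

failback_events = address_events(during_failure, after_failback)
```

Under these assumptions the failback yields one RTM_DELADDR for qfe3 followed
by one RTM_NEWADDR for qfe1, both carrying 192.168.25.1.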
Work Around
N/A
Comments
[ Moved from Suggested Fix section ]

There has been some discussion on this issue, and the 2 contending candidate solutions are:

1. Have a virtual list of interfaces.  IPMP would then export the real interfaces
in the group as though they were interface aliases through this interface.

2. Have a new mechanism that allows in.mpathd to ping over interfaces that are
marked as ~IFF_UP.  This would allow in.mpathd to do its testing without having
the interface table lie to anyone -- we would see just the primary interfaces
(no aliases) coming and going as failures and recoveries are seen.
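
One possible reading of candidate 1 can be sketched as a small model in which
applications see a single group interface that owns the addresses, while the
binding of each address to a physical interface is hidden. This is a sketch
under stated assumptions and not the delivered design: the class IpmpGroupView,
the interface name "ipmp0", and the "rebind to the first healthy member" policy
are hypothetical.

```python
class IpmpGroupView:
    """Hypothetical application-visible view of an IPMP group."""

    def __init__(self, name, data_addrs, members):
        self.name = name                     # e.g. "ipmp0" (hypothetical name)
        self.members = list(members)         # underlying physical interfaces
        self.failed = set()
        # Hidden binding: address -> physical interface currently carrying it.
        self.binding = dict(zip(data_addrs, members))

    def visible(self):
        # What an application would see: addresses on the group interface only.
        return {self.name: sorted(self.binding)}

    def fail(self, member):
        # Rebind the failed member's addresses; the visible view is untouched.
        self.failed.add(member)
        healthy = [m for m in self.members if m not in self.failed]
        for addr, bound in list(self.binding.items()):
            if bound == member:
                self.binding[addr] = healthy[0]


group = IpmpGroupView(
    "ipmp0",
    ["192.168.25.1", "192.168.25.2", "192.168.25.3"],
    ["qfe1", "qfe2", "qfe3"],
)
view_before = group.visible()
group.fail("qfe1")
view_after = group.visible()
```

In this model view_before and view_after are identical, so a routing daemon
watching the group interface would see no address come or go when qfe1 fails.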
*** (#1 of 1): [ UNSAVED ]  xxxxx@xxxxx.com
"Fix affects docs" has been checked because the fix to this involves
adding an IPMP IP interface which in turn affects the documentation.
The associated manpage changes are part of the larger set of manpage
changes for 6783149.