OpenSolaris

Bug ID 6606991
Synopsis panic assertion failure !ill->ill_join_allmulti for multicast router
State 10-Fix Delivered (Fix available in build)
Category:Subcategory kernel:tcp-ip
Keywords clearview | rtiq_reviewed
Responsible Engineer Sebastien Roy
Reported Against snv_74
Duplicate Of
Introduced In solaris_10
Commit to Fix snv_103
Fixed In snv_103
Release Fixed solaris_nevada(snv_103)
Related Bugs 4502640
Submit Date 20-September-2007
Last Update Date 27-November-2009
Description
To reproduce, test multicast forwarding with mrouted (available at, e.g., /net/npt.sfbay/export/surya/test/mcast_forward/mrouted3.9), run as mrouted -d.

While mrouted is running, use ifconfig to bring a network interface down and/or unplumb it. The following assertion will most likely fire:

> $c
vpanic()
assfail+0x7e(fffffffff7b71908, fffffffff7b71c48, 454)
ip_join_allmulti+0x11b(ffffff00caeeb948)
ill_recover_multicast+0xab(ffffff00caadaae8)
ipif_up_done+0x750(ffffff00caeeb948)
ip_arp_done+0x1af(ffffff00ceea5d80, ffffff00ceea6e50, ffffff00c98c1780, 0)
qwriter_ip+0x7a(ffffff00caadaae8, ffffff00ceea6e50, ffffff00c98c1780, 
fffffffff7a2fac0, 0, 0)
ip_wput_nondata+0x984(0, ffffff00ceea6e50, ffffff00c98c1780, 0)
ip_output_options+0x26f(0, ffffff00c98c1780, ffffff00ceea6e50, 2, 
fffffffffbd31c50)
ip_output+0x24(0, ffffff00c98c1780, ffffff00ceea6e50, 2)
ip_wput+0x39(ffffff00ceea6e50, ffffff00c98c1780)
putnext+0x2f1(ffffff00ceea63b0, ffffff00c98c1780)
ar_cmd_done+0x150(ffffff00cf3708b8)
ar_dlpi_done+0x8e(ffffff00cf3708b8, 100)
ar_rput_dlpi+0x226(ffffff00ceea42a8, ffffff00d46a4640)
ar_rput+0x791(ffffff00ceea42a8, ffffff00d46a4640)
qdrain_syncq+0x1e3(ffffff00c9efdf20, ffffff00ceea42a8)
drain_syncq+0x395(ffffff00c9efdf20)
putnext_tail+0xeb(ffffff00c9efdf20, ffffff00ceea42a8, 11)
putnext+0x5d9(ffffff00ceea4550, ffffff00c98c1960)
qreply+0x4c(ffffff00ceea4648, ffffff00c98c1960)
> fffffffff7b71908/s                  
0xfffffffff7b71908:             !ill->ill_join_allmulti
Core file available at /home/nordmark/cores/6606991
Work Around
N/A
Comments
The panic occurs because, as part of ip_sioctl_netmask() or ip_sioctl_addr(), the ip_join_allmulti() call made from ill_recover_multicast() finds that ip_leave_allmulti() was never called when the interface was brought down.  The ip_leave_allmulti() call _should_ happen in ill_leave_multicast(), which is called by ill_dl_down(), which in turn is called by ipif_down_tail() only under a strict set of conditions:

        if (ill->ill_wq != NULL && !ill->ill_logical_down &&
            ill->ill_ipif_up_count == 0 && ill->ill_ipif_dup_count == 0 &&
            ill->ill_dl_up) {
                ill_dl_down(ill);
        }

ip_sioctl_addr() calls ipif_down_tail(), so why is ip_leave_allmulti() never called in this scenario?  It turns out that ill_logical_down is always set to 1 when ipif_down_tail() is called from ip_sioctl_netmask() and ip_sioctl_addr(), so the condition above is false and ill_dl_down() is never called.
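
The failure path can be illustrated with a small user-space model. This is only a sketch, not the kernel code: model_ill_t and every model_* function are hypothetical stand-ins that keep just the fields named in the condition above, and model_ill_dl_down() is assumed to do nothing except clear ill_join_allmulti (standing in for ill_leave_multicast() calling ip_leave_allmulti()).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy version of ill_t; only the fields tested above. */
typedef struct model_ill {
	void	*ill_wq;
	bool	ill_logical_down;
	int	ill_ipif_up_count;
	int	ill_ipif_dup_count;
	bool	ill_dl_up;
	bool	ill_join_allmulti;
} model_ill_t;

/*
 * Assumption: the only effect modeled here is leaving the allmulti
 * group, i.e. ill_leave_multicast() -> ip_leave_allmulti().
 */
static void
model_ill_dl_down(model_ill_t *ill)
{
	ill->ill_join_allmulti = false;
}

/* Same condition as the excerpt from ipif_down_tail() quoted above. */
static void
model_ipif_down_tail(model_ill_t *ill)
{
	if (ill->ill_wq != NULL && !ill->ill_logical_down &&
	    ill->ill_ipif_up_count == 0 && ill->ill_ipif_dup_count == 0 &&
	    ill->ill_dl_up) {
		model_ill_dl_down(ill);
	}
}

/*
 * Brings a multicast-router interface (allmulti already joined) down
 * and reports whether ASSERT(!ill->ill_join_allmulti) would hold when
 * ip_join_allmulti() runs again on the way back up.
 */
static bool
model_down_then_rejoin_ok(bool logical_down)
{
	static int dummy_wq;	/* any non-NULL write queue will do */
	model_ill_t ill = {
		.ill_wq = &dummy_wq,
		.ill_logical_down = logical_down,
		.ill_ipif_up_count = 0,
		.ill_ipif_dup_count = 0,
		.ill_dl_up = true,
		.ill_join_allmulti = true
	};

	model_ipif_down_tail(&ill);
	return (!ill.ill_join_allmulti);
}
```

In this model, model_down_then_rejoin_ok(false) returns true (the leave happened, so the assertion would hold), while model_down_then_rejoin_ok(true) returns false, which corresponds to the ill_logical_down case described above where the assertion fires.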