Despite all fragments of packets arriving in a timely fashion across the cluster private
interconnect, IP reassembly is failing: the fragmentation list is pruned erroneously when
ill->ill_frag_count underruns due to read-modify-write races between the various threads
which update it across multiple ill_frag_hash_tbl buckets.
The underlying syndrome is described in CR 6534479.
At the moment ill->ill_frag_count is a best-effort approximation, but an underrun must not
be allowed to trigger ill_frag_prune(), or it will cost us one or more packets for which
the fragments are all fully available and ready for reassembly. On a busy Oracle RAC cluster
interconnect these underruns are EXTREMELY frequent, and Oracle detects "lost blocks" which it
must try to recover. Oracle RAC uses UDP and performs timer-based recovery, so each such loss
severely impacts transaction performance on SunCluster. The need to recover the occasional
packet due to a checksum error is understood, and such errors are rare enough on the
interconnect not to be a significant performance penalty (modulo bad network hardware).
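
The underrun described above can be illustrated with a small user-space sketch. It is not
the kernel code: the counter is assumed to be an unsigned 32-bit byte count, FRAG_LEN and
FRAG_LIMIT are hypothetical values, and would_prune() is a hypothetical stand-in for the
check that leads to ill_frag_prune(). The race is replayed deterministically in a single
thread: two adders holding different bucket locks both read the same stale value, one add is
lost, and the later matching subtractions wrap the counter to a huge value. The second
function shows one possible remedy using C11 atomics; whether the fix for this CR takes that
form is an assumption.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define FRAG_LEN   1500u   /* hypothetical fragment size in bytes */
#define FRAG_LIMIT 65536u  /* hypothetical per-interface fragment budget */

/* Hypothetical stand-in for the test that triggers pruning. */
static bool
would_prune(uint32_t count)
{
	return (count > FRAG_LIMIT);
}

/*
 * Replay of the lost update. Threads A and B each hold a different
 * bucket lock, so nothing serializes their access to the shared count.
 */
static uint32_t
racy_counter_demo(void)
{
	uint32_t count = 0;

	uint32_t seen_a = count;	/* A loads 0 */
	uint32_t seen_b = count;	/* B loads the same stale 0 */
	count = seen_a + FRAG_LEN;	/* A stores 1500 */
	count = seen_b + FRAG_LEN;	/* B stores 1500; A's add is lost */

	count -= FRAG_LEN;		/* first packet reassembled: 0 */
	count -= FRAG_LEN;		/* second packet: wraps below zero */
	return (count);
}

/* The same sequence with atomic updates, so no add can be lost. */
static uint32_t
atomic_counter_demo(void)
{
	_Atomic uint32_t count = 0;

	atomic_fetch_add(&count, FRAG_LEN);
	atomic_fetch_add(&count, FRAG_LEN);
	atomic_fetch_sub(&count, FRAG_LEN);
	atomic_fetch_sub(&count, FRAG_LEN);
	return (atomic_load(&count));
}
```

In the racy replay the counter ends far above FRAG_LIMIT although no fragments remain
queued, which is exactly the condition that must not be allowed to prune. With atomic
updates the counter returns to zero.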