OpenSolaris

Bug ID 6686647
Synopsis smbsrv scalability impacted by memory management issues
State 10-Fix Delivered (Fix available in build)
Category:Subcategory kernel:cifs
Keywords spbc_s10uX
Responsible Engineer Jose Borrego
Reported Against
Duplicate Of
Introduced In solaris_nevada
Commit to Fix snv_89
Fixed In snv_89
Release Fixed solaris_nevada(snv_89)
Related Bugs 6499454 , 6583545 , 6617183
Submit Date 10-April-2008
Last Update Date 4-June-2008
Description
For each buffer it serves, smbsrv calls smb_net_txb_alloc and smb_net_txb_free,
which do a kmem_alloc/kmem_free of an smb_txbuf_t.

::sizeof smb_txbuf_t
sizeof (smb_txbuf_t) = 0x20020
::print -t smb_txbuf_t
{
    uint32_t tb_magic 
    list_node_t tb_lnd {
        struct list_node *list_next 
        struct list_node *list_prev 
    }
    int tb_len 
    uint8_t [131075] tb_data 
}


Since this is bigger than KMEM_MAX_BUF (32K), all allocations
come from the kmem_oversize_arena. This leads to a TLB shootdown on every
deallocation and hence an xc (cross-call) storm.
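
The size argument can be checked in userland. The following is a minimal sketch, assuming an LP64 (64-bit) build; list_node_t is redeclared here as a two-pointer stand-in, and SKETCH_KMEM_MAX_BUF and is_oversize() are hypothetical names that only mirror the 32K limit quoted above, not kernel symbols.

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the kernel's list_node_t as shown by ::print (two pointers). */
typedef struct list_node {
	struct list_node *list_next;
	struct list_node *list_prev;
} list_node_t;

/* Field layout copied from the ::print -t output in this report. */
typedef struct smb_txbuf {
	uint32_t	tb_magic;
	list_node_t	tb_lnd;
	int		tb_len;
	uint8_t		tb_data[131075];
} smb_txbuf_t;

/* Assumption: the 32K limit quoted in the report; hypothetical name. */
#define	SKETCH_KMEM_MAX_BUF	(32 * 1024)

/*
 * Hypothetical helper: returns 1 when a request of this size is too big
 * for the fixed-size caches and would fall through to the oversize arena.
 */
static int
is_oversize(size_t size)
{
	return (size > SKETCH_KMEM_MAX_BUF);
}
```

On LP64 the fields add up to 4 + 4 (padding) + 16 + 4 + 131075 = 131103 bytes, rounded up to 8-byte alignment: 131104 = 0x20020, so is_oversize(sizeof (smb_txbuf_t)) is true.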

On a 16-core AMD system this leads to a limit of less than 5000 CIFS/sec
(smaller systems might do more, as the TLB shootdown is faster),
and we see:

mpstat 1
...
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    0   0 86818 76651  137 2224    0  548 5680    0    86    0  61   0  39
  1    0   0 80594 72460    0 1950    0  462 5191    0     7    0  58   0  42
  2    0   0 81362 80449 1614 1335    2  303 5120    0     3    0  62   0  38
  3    0   0 66502 56961    1 1507    0  286 4435    0     2    0  46   0  54
  4    0   0 50150 46993    1 1386    0  209 3441    0    17    0  37   0  63
  5    0   0 28144 26409    0  900    0   99 2048    0     5    0  20   0  80
  6    0   0 23349 21449    0  673    0   54 1856    0     7    0  17   0  83
  7    0   0 14070 13182    0  269    0   19  999    0     5    0  10   0  90
  8    0   0 22922 26600  201 2593    0  150 1551    0   580    0  20   0  80
  9    0   0 13136 14848    0 1358    1   40  946    0   174    0  10   0  90
 10    0   0 29018 48037 6444 1113    6   98 1951    0     2    0  29   0  71
 11    0   0 31980 45129 1979  796    0   97 2297    0     3    0  33   0  67
 12    0   0 20144 20790    0 1879    0   50 1090    0    10    0  15   0  85
 13    0   0  4780  5686    0  630    0   22  306    0     2    0   4   0  96
 14    0   0 36031 44722 1991  809    0   82 1943    0     1    0  32   0  68
 15    0   0 30141 43479 1995  756    0   77 1895    0     3    0  31   0  69

Using 
dtrace -n 'profile-1ms{@a[stack(20)]=count();@c=count()} END{trunc(@a,20)} smb_net_txb_send:entry{@b[probefunc]=count()}'

I found my system 23% loaded by the cross calls (possibly more, since interrupts are off part of the time).

              unix`mutex_delay_default+0xa
              unix`lock_set_spl_spin+0xbf
              unix`mutex_vector_enter+0x46c
              unix`xc_do_call+0xdb
              unix`xc_call+0x2b
              unix`hat_tlb_inval+0x1c2
              unix`x86pte_inval+0xa9
              unix`hat_pte_unmap+0xfc
              unix`hat_unload_callback+0x148
              unix`hat_unload+0x41
              unix`segkmem_free_vn+0x73
              unix`segkmem_free+0x23
              genunix`vmem_xfree+0x10c
              genunix`vmem_free+0x25
              genunix`kmem_free+0x47
              smbsrv`smb_net_txb_free+0x1c
              smbsrv`smb_net_txb_send+0x15f
              smbsrv`smb_session_send+0x167
              smbsrv`smbsr_send_reply+0x133
              smbsrv`smb_dispatch_request+0x656
             6086

Using a kmem_cache should get rid of this.
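
To illustrate why a cache helps, here is a userland sketch of the idea, not the actual fix: freed buffers are parked on a free list and handed back on the next allocation, so the backing allocator (and, in the kernel, the unmap that triggers the TLB shootdown) is only hit when the list is empty. All names (txbuf_t, txbuf_cache_t, txbuf_cache_*) are hypothetical, malloc/free stand in for the kernel allocator, and the sketch is single-threaded; in the kernel the same role would be played by kmem_cache_create, kmem_cache_alloc and kmem_cache_free.

```c
#include <stdint.h>
#include <stdlib.h>

#define	TXB_DATA_LEN	131075	/* tb_data size from the ::print output */

/* Hypothetical userland stand-in for smb_txbuf_t. */
typedef struct txbuf {
	struct txbuf	*tb_next;	/* free-list link, used only while idle */
	int		tb_len;
	uint8_t		tb_data[TXB_DATA_LEN];
} txbuf_t;

/* Hypothetical cache: a free list plus a counter of backend allocations. */
typedef struct txbuf_cache {
	txbuf_t		*tc_free;		/* idle buffers ready for reuse */
	unsigned	tc_backend_allocs;	/* times malloc() was needed */
} txbuf_cache_t;

/* Reuse an idle buffer if there is one, otherwise go to the backend. */
static txbuf_t *
txbuf_cache_alloc(txbuf_cache_t *tc)
{
	txbuf_t *tb = tc->tc_free;

	if (tb != NULL) {
		tc->tc_free = tb->tb_next;
	} else {
		tb = malloc(sizeof (*tb));
		if (tb == NULL)
			return (NULL);
		tc->tc_backend_allocs++;
	}
	tb->tb_next = NULL;
	tb->tb_len = 0;
	return (tb);
}

/* Park the buffer on the free list instead of releasing its memory. */
static void
txbuf_cache_free(txbuf_cache_t *tc, txbuf_t *tb)
{
	tb->tb_next = tc->tc_free;
	tc->tc_free = tb;
}

/* Release every idle buffer back to the backend allocator. */
static void
txbuf_cache_destroy(txbuf_cache_t *tc)
{
	while (tc->tc_free != NULL) {
		txbuf_t *tb = tc->tc_free;
		tc->tc_free = tb->tb_next;
		free(tb);
	}
}
```

With this shape, an alloc/free/alloc sequence touches the backend once and returns the same buffer the second time. A real kmem cache additionally keeps per-CPU magazines, so the hot path avoids a global lock as well.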
Work Around
N/A
Comments
N/A