|
Description
|
When data is written to file and filling the file system (ufs), CU starts to delete older files which are fairly large (> 150GB). %sys time started to shot up, 'df -k' shown 100% full for hours.
ufs_delete() thread appeared to be busy reserving log space, repeatedly spin in ufs_trans_trunc_resv() in a loop calling ufs_log_amt():
ufs:ufs_log_amt+0x134()
ufs:ufs_trans_trunc_resv+0x80()
ufs:ufs_trans_itrunc+0xb4()
ufs:ufs_delete+0x39c()
ufs:ufs_thread_delete+0xc4()
unix:thread_start+0x4()
### Background:
### ----------
### This issue came about when CU was using NBU to backup to
### a UFS file system. When the file system was full, NBU will
### start deleting old files (These are large, a few hundred
### GBs). CU started to observe high %sys time on one CPU and
### 'df -k' showing disk space remaining 100% full for a long
### time, though the directory entry was no longer showing the
### old files.
### CU has offline all CPUs and left with one running. Under the
### same above scenerio, disk space got freed up very fast and no
### high %sys time was observed.
### Details
### -------
### This can be consistently reproduce by filling up the FS
### with a large file to, let's say 95% full. Then start
### writing a new file to it, when we hit ENOSPC, sleep for
### a short while, then retry again. When ENOSPC is reached,
### start delete the old large file.
### Immediately we will see the %sys time for the CPU where
### ufs_delete() thread for the FS is running shooting up to
### 100%. lockstat profiling will show that the system is
### calling ufs_log_amt() from the for() loop in ufs_trans_trunc_serv().
uts/common/fs/ufs/ufs_trans.c:
914 nchunks = 1;
915 for (; (resv = ufs_log_amt(ip, offset, resid, 1)) > ufs_trans_max_resv;
916 offset = length + (nchunks - 1) * resid) {
917 nchunks++;
918 resid = size / nchunks;
919 }
### Test program based on CU source, cu_cp.c:
/* cu_cp.c */
#include <stdlib.h>
#include <stdio.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/lock.h>
#include <unistd.h>
#include <fcntl.h>
#include <errno.h>
int main(int argc, char *argv[])
{
int f1,f2;
int i,n;
/*char buf[8192];*/
char buf[262144];
if (plock(PROCLOCK) < 0)
perror("plock error");
if (argc != 3)
printf("arg is not 2\n");
f1 = open(argv[1], O_RDONLY|O_LARGEFILE);
if (f1 < 0){
fprintf(stderr, "%s: open failed.\n", argv[1]);
exit(1);
}
f2 = open(argv[2], O_RDWR|O_CREAT|O_LARGEFILE);
if (f2 < 0){
fprintf(stderr, "%s: open failed.\n", argv[2]);
exit(1);
}
for (;;) {
n = read(f1, buf, sizeof (buf));
if (n == 0)
return(0);
if (n < 0) {
perror("Read error.");
(void) close(f1);
(void) close(f2);
return (1);
}
while (write(f2, buf, n) == -1) {
if (errno == ENOSPC) {
fprintf(stderr, "write() ENOSPC. sleep/retry.\n"
);
sleep (1);
continue;
}
perror("Write error.");
(void) close(f1);
(void) close(f2);
return (1);
} /* while() */
fprintf(stderr, "write() ok.\n");
} /* for() */
}
/* end of cu_cp.c */
### In my test scenario in the lab, I was deleting an old 130GB
### test_b.out file, while cu_cp was creating a test_a.out file.
### stack dump on ufs_delete() thread for the FS is showing
### in ufs_trans_trunc_resv() all the time.
-- ktl0 regs data rp: 0x2a1003856e0
pc: 0x11b061c ufs:ufs_log_amt+0x100: mulx %g3, %l3, %g3
npc: 0x11b0620 ufs:ufs_log_amt+0x104: divx %g3, %l3, %g3
global: %g1 0x11c0a4c
%g2 0x1fff %g3 0x1
%g4 0x7ff %g5 0x3d
%g6 0x10 %g7 0x2a100385d40
out: %o0 0x102bb0f %o1 0x800
%o2 0 %o3 0x2000
%o4 0x30002c64000 %o5 0x102bb0f
%sp 0x2a100384f81 %o7 0x3c
loc: %l0 0x1 %l1 0x2080
%l2 0x4600 %l3 0x2000
%l4 0x2 %l5 0xfffffffffffffca0
%l6 0 %l7 0x2a100385790
in: %i0 0x30004e72e80 %i1 0x2057605feb
%i2 0x76e71 %i3 0x1
%i4 0 %i5 0
%fp 0x2a100385031 %i7 0x11b0788
<trap>ufs:ufs_log_amt+0x100(0x30004e72e80, 0x2057605feb, 0x76e71, 0x1, 0x0, 0x0)
ufs:ufs_trans_trunc_resv+0x80(0x30004e72e80, 0x0, 0x2a100385998, 0x2a100385990, 0x8, 0x8)
ufs:ufs_trans_itrunc+0xb4(0x30004e72e80, 0x0, 0x1)
ufs:ufs_delete+0x39c(0x3000004f888, 0x30004e72e80, 0x1, 0x14a8308, 0x2a100385d40, 0x0)
ufs:ufs_thread_delete+0xc4(0x3000027d2c8, 0x0, 0x300010faa28, 0x1438788, 0x16, 0x0)
unix:thread_start+0x4()
-- end of kernel thread's stack --
### At the same time, df -k is showing that the FS is at 100% capacity,
### and %sys going 100%.
bash-2.05# df -k /b
Filesystem kbytes used avail capacity Mounted on
/dev/dsk/c1t13d0s2 141185027 141184553 0 100% /b
CPU minf mjf xcal intr ithr csw icsw migr smtx srw syscl usr sys wt idl
0 0 0 2630 60 56 124 3 0 0 0 62 0 4 0 96
1 0 0 3 215 112 0 0 0 0 0 0 0 100 0 0
bash-2.05# lockstat -D 20 -kgIW sleep 10
Profiling interrupt: 1939 events in 9.998 seconds (194 events/sec)
Count genr cuml rcnt nsec Hottest CPU+PIL Caller
-------------------------------------------------------------------------------
1939 100% ---- 1.00 496 cpu0 lockstat_intr
1939 100% ---- 1.00 496 cpu0 cyclic_fire
1939 100% ---- 1.00 496 cpu0 cbe_level14
1939 100% ---- 1.00 496 cpu0 current_thread
1206 62% ---- 1.00 547 cpu[0] idle
1059 55% ---- 1.00 442 cpu0 ufs_trans_trunc_resv
980 51% ---- 1.00 553 cpu[0] disp_getwork
968 50% ---- 1.00 444 cpu0 ufs_trans_itrunc
968 50% ---- 1.00 444 cpu0 ufs_thread_delete
968 50% ---- 1.00 444 cpu0 ufs_delete
875 45% ---- 1.00 445 cpu0 ufs_log_amt
217 11% ---- 1.00 548 cpu[0] i_ddi_splx
173 9% ---- 1.00 566 cpu[0] i_ddi_splhigh
### It was observed we were not entirely just spinning in the for() loop
### as from the test cu_cp.c program, we can see that disk space did
### get released slowly.
...
write() ok.
write() ok.
write() ok.
write() ok.
write() ENOSPC. sleep/retry.
write() ok.
write() ok.
write() ok.
write() ok.
write() ok.
write() ok.
write() ok.
write() ok.
write() ok.
write() ok.
write() ok.
write() ENOSPC. sleep/retry.
write() ok.
...
### I was not able to reproduce this on S10. And this issue
### can only be observed on a multi CPU machine. If the test
### is run on a single CPU machine, disk space got released
### very fast.
### A similar CR# to this CR#6483587. However, in our case,
### if waited long enough (a few hours for a 130GB file deletion),
### ufs_trans_itrunc() will eventually gone into completion.
|