Periodic stuck in find (unkillable)
Andrea Venturoli
2016-06-20 07:45:14 UTC
Hello.

On a 10.3/amd64 box periodic is not running properly: often the mail
just says "daily prior run still in progress".

I've checked and there are several processes belonging to periodic; I
was able to kill most of them (sometimes getting a partial report), but
some instances of "find" are unkillable.
# ps ax | grep find
10526 - D 0:34.62 find -sx / /usr /var /data /dev/null -type f ( ( ! -perm +010 -and -p
82967 - DN 0:40.19 find -s / ! ( -fstype ufs ) -prune -or -path /tmp -prune -or -path /u
92472 - D 1:14.42 find -sx / /usr /var /data /dev/null -type f ( -perm -u+x -or -perm -
97995 - D 0:57.10 find -sx / /usr /var /data /dev/null -type f ( ( ! -perm +010 -and -p
99128 0 S+ 0:00.00 grep find
These processes will only go away by rebooting.
# ps ax | grep clamscan
6554 - D 279:49.04 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
13442 - D 473:55.64 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
17923 - D 461:30.71 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
34296 - D 490:31.21 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
36875 - D 475:49.56 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
39928 - D 277:08.06 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
40056 - D 460:25.45 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
43023 - D 479:36.90 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
45475 - D< 461:35.17 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
48693 - D 278:08.96 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
51667 - D 276:28.51 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
65643 - D 276:25.96 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
66804 - D 477:15.09 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
69644 - D< 461:25.68 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
71984 - D 483:20.62 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
72315 - D 277:17.21 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
74601 - D 277:31.67 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
90563 - D 474:07.14 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
93797 - D 465:10.58 /usr/local/bin/clamscan -r -i --stdout --exclude=//proc --exclude=/us
I believe this started when I upgraded from 9.3 to 10.3.



In both cases, I've got no NFS shares mounted, so it cannot be a network
problem.

Any idea what's going on or how to check?
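
One way to check, as a minimal sketch: filter the "ps ax" listing down to the processes whose STAT column starts with D (uninterruptible wait), which is the state all of the stuck processes above are in. The function name stuck_procs is made up for illustration, and the column positions assume the default "ps ax" layout (PID TT STAT TIME COMMAND) shown above.

```sh
#!/bin/sh
# Hypothetical helper: read "ps ax" output on stdin and print
# PID, STAT and the command name for every process whose STAT
# field starts with D (uninterruptible wait).
# Assumes the default column order: PID TT STAT TIME COMMAND.
stuck_procs() {
    awk '$3 ~ /^D/ { print $1, $3, $5 }'
}

# Usage (not run here):
#   ps ax | stuck_procs
```

To also see what each process is sleeping on, "ps -ax -o pid,state,wchan,command" should print the wait channel next to the state; the filter above would then need its column numbers adjusted.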



bye & Thanks
av.
Karl Vogel
2016-06-20 22:31:54 UTC
...some instances of "find" are unkillable.
# ps ax | grep find
find -sx / /usr /var /data /dev/null -type f ( ( ! -perm +010 -and -p
I don't have a BSD box in front of me; can you use "truss -p pid" to
attach to one of the running processes and see if anything comes back?

If that doesn't work out, can you get the full "find" command line from
/proc and re-run it in a separate session? Don't redirect the output or
do anything else to change the buffering, and see if/as/when it quits.

I use the script below to run things under an environment similar to cron
(run "/usr/bin/env > /tmp/env$$" from cron to confirm); you could try the
"find" command using this environment and see if it still freezes.

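To confirm how close two environments are, the dumps can be compared by variable name. This is only a sketch: the function name env_only_in and the file names in the usage comment are invented for illustration, and it assumes each dump holds one NAME=value pair per line (multi-line values would confuse it).

```sh
#!/bin/sh
# Hypothetical helper: print the variable names that appear in env
# dump $1 but not in env dump $2. Both files are expected to be the
# output of /usr/bin/env, one NAME=value per line.
env_only_in() {
    a=$(mktemp /tmp/envcmp.XXXXXX) || return 1
    b=$(mktemp /tmp/envcmp.XXXXXX) || { rm -f "$a"; return 1; }
    cut -d= -f1 "$1" | sort -u > "$a"
    cut -d= -f1 "$2" | sort -u > "$b"
    comm -23 "$a" "$b"      # lines unique to the first file
    rm -f "$a" "$b"
}

# Usage (not run here; file names are assumptions):
#   /usr/bin/env > /tmp/env.shell        # from the interactive shell
#   env_only_in /tmp/env.shell /tmp/env.cron
```

Anything it prints is set in the interactive shell but missing under cron, which is a reasonable first place to look when a job behaves differently in the two.
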
Good luck.
--
Karl Vogel I don't speak for the USAF or my company

--------------------------------------------------------------------------
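#!/bin/ksh
#<ascron: run a job as if cron was doing it.

# With no arguments, default to running "env" to show the environment.
case "$#" in
0) set -- env ;;
*) ;;
esac

# Start from an empty environment and pass through a minimal set only.
# Quoting keeps values and arguments containing spaces intact.
/usr/bin/env -i \
    HOME="$HOME" \
    LOGNAME="$LOGNAME" \
    PATH=/bin:/usr/bin \
    PWD="$PWD" \
    SHELL="$SHELL" \
    SHLVL=1 \
    TZ="$TZ" \
    daemon "$@"

exit 0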
Andrea Venturoli
2016-06-23 10:53:58 UTC
Post by Karl Vogel
# ps ax | grep find
find -sx / /usr /var /data /dev/null -type f ( ( ! -perm +010 -and -p
I don't have a BSD box in front of me; can you use "truss -p pid" to
attach to one of the running processes and see if anything comes back?
On the box where I still have the "find"s running, I attached to all of
them and waited 10 minutes: nothing was output.

The box where clamscan was stuck was rebooted in the meantime, because
several clamscans brought it to its knees by consuming all RAM and swap.
Post by Karl Vogel
If that doesn't work out, can you get the full "find" command line from
/proc and re-run it in a separate session? Don't redirect the output or
do anything else to change the buffering, and see if/as/when it quits.
find -sx / /usr /var /data /mnt/backup /dev/null -type f \( -perm -u+x -or -perm -g+x -or -perm -o+x \) \( -perm -u+s -or -perm -g+s \) -exec ls -liTd {} +
(notice /mnt/backup is not currently mounted)
The process soon got stuck; it cannot be put in the background with ^Z or
interrupted with ^C.
load: 0.11 cmd: find 38500 [vodead] 2828.23r 8.63u 50.07s 0% 2448k
Maybe I'm hit by https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=204764 ?
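
For comparing against a bug report, the interesting part of the "load:" status line above is the bracketed field, which as far as I know is the wait channel the process is sleeping on ("vodead" here). A small sketch that pulls it out; the function name siginfo_wchan is made up, and the pattern assumes the line layout shown above.

```sh
#!/bin/sh
# Hypothetical helper: read "load: ... cmd: NAME PID [WCHAN] ..."
# status lines on stdin and print "PID NAME WCHAN" for each one.
# Lines that do not match this layout are silently skipped.
siginfo_wchan() {
    sed -n 's/^load: .* cmd: \([^ ]*\) \([0-9][0-9]*\) \[\([^]]*\)\].*/\2 \1 \3/p'
}

# Usage (not run here):
#   echo 'load: 0.11 cmd: find 38500 [vodead] 2828.23r 8.63u 50.07s 0% 2448k' | siginfo_wchan
```

For a process that is still around, "procstat -kk PID" should show the kernel stack it is blocked in, which is usually what a bug report asks for.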




On the other box the clamscan job ran to completion in 11h and didn't get
stuck, probably due to the reboot.



Thanks for your suggestions.

bye
av.
