I looked everywhere in the logs and see nothing wrong. There are some indications of corrupt MySQL tables, but when I checked those used by RAS after the first crash, they were fine. Maybe there are corrupt tables elsewhere. I have not yet run the checks; I'll try this evening.

I commented out the crontab entries in the root and aras accounts with '#CZ'. Let's see whether the machine survives the night. If so, and nobody else sees a problem, we should gradually get the service back. The first thing would be to get adrepec current. Then open the web server to users. Then get CitEc data back. Does this make sense?

On Mon, 28 Jan 2008, Christian Zimmermann wrote:
First thing I see: both crashes happened at exactly the same time:
Jan 17 23:09:01 nebka /USR/SBIN/CRON[14205]: (aras) CMD (cd /home/aras/acis && /home/aras/acis/bin/make-repec-per.sh )
Jan 17 23:10:01 nebka /USR/SBIN/CRON[14237]: (www-data) CMD ([ -x /usr/lib/cgi-bin/awstats.pl -a -f /etc/awstats/awstats.conf -a -r /var/log/apache/access.log ] && /usr/lib/cgi-bin/awstats.pl -config=awstats -update >/dev/null)
Jan 17 23:10:01 nebka /USR/SBIN/CRON[14238]: (root) CMD (test -x /usr/lib/atsar/atsa1 && /usr/lib/atsar/atsa1)
Jan 17 23:15:01 nebka /USR/SBIN/CRON[14474]: (root) CMD ([ -x /usr/lib/sysstat/sa1 ] && { [ -r "$DEFAULT" ] && . "$DEFAULT" ; [ "$ENABLED" = "true" ] && exec /usr/lib/sysstat/sa1 $SA1_OPTIONS 1 1 ; })
Jan 17 23:16:01 nebka /USR/SBIN/CRON[14476]: (aras) CMD (/home/aras/acis/bin/apu 7 >>/home/aras/apu-job.log 2>&1)
Jan 17 23:17:01 nebka /USR/SBIN/CRON[14489]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Jan 17 23:18:01 nebka /USR/SBIN/CRON[14492]: (aras) CMD (cd /home/aras/acis && /home/aras/acis/bin/make-repec-per.sh )
Jan 17 23:20:01 nebka /USR/SBIN/CRON[14547]: (www-data) CMD ([ -x /usr/lib/cgi-bin/awstats.pl -a -f /etc/awstats/awstats.conf -a -r /var/log/apache/access.log ] && /usr/lib/cgi-bin/awstats.pl -config=awstats -update >/dev/null)
Jan 17 23:20:01 nebka /USR/SBIN/CRON[14548]: (root) CMD (test -x /usr/lib/atsar/atsa1 && /usr/lib/atsar/atsa1)
Jan 17 23:22:01 nebka /USR/SBIN/CRON[14703]: (root) CMD (du -cs /* > du_slash_`date -I`)
Jan 18 14:04:08 nebka syslogd 1.4.1#18: restart.
...
Jan 17 23:09:01 nebka /USR/SBIN/CRON[14205]: (aras) CMD (cd /home/aras/acis && /home/aras/acis/bin/make-repec-per.sh )
Jan 17 23:10:01 nebka /USR/SBIN/CRON[14237]: (www-data) CMD ([ -x /usr/lib/cgi-bin/awstats.pl -a -f /etc/awstats/awstats.conf -a -r /var/log/apache/access.log ] && /usr/lib/cgi-bin/awstats.pl -config=awstats -update >/dev/null)
Jan 17 23:10:01 nebka /USR/SBIN/CRON[14238]: (root) CMD (test -x /usr/lib/atsar/atsa1 && /usr/lib/atsar/atsa1)
Jan 17 23:15:01 nebka /USR/SBIN/CRON[14474]: (root) CMD ([ -x /usr/lib/sysstat/sa1 ] && { [ -r "$DEFAULT" ] && . "$DEFAULT" ; [ "$ENABLED" = "true" ] && exec /usr/lib/sysstat/sa1 $SA1_OPTIONS 1 1 ; })
Jan 17 23:16:01 nebka /USR/SBIN/CRON[14476]: (aras) CMD (/home/aras/acis/bin/apu 7 >>/home/aras/apu-job.log 2>&1)
Jan 17 23:17:01 nebka /USR/SBIN/CRON[14489]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Jan 17 23:18:01 nebka /USR/SBIN/CRON[14492]: (aras) CMD (cd /home/aras/acis && /home/aras/acis/bin/make-repec-per.sh )
Jan 17 23:20:01 nebka /USR/SBIN/CRON[14547]: (www-data) CMD ([ -x /usr/lib/cgi-bin/awstats.pl -a -f /etc/awstats/awstats.conf -a -r /var/log/apache/access.log ] && /usr/lib/cgi-bin/awstats.pl -config=awstats -update >/dev/null)
Jan 17 23:20:01 nebka /USR/SBIN/CRON[14548]: (root) CMD (test -x /usr/lib/atsar/atsa1 && /usr/lib/atsar/atsa1)
Jan 17 23:22:01 nebka /USR/SBIN/CRON[14703]: (root) CMD (du -cs /* > du_slash_`date -I`)
Jan 18 14:04:08 nebka syslogd 1.4.1#18: restart.
The du -cs /* job run as root seems to be the tripping point: it is the last entry logged before the syslogd restart.
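To check this across a whole log rather than by eye, one could pull out the last cron command logged before each syslogd restart. The following is only a rough sketch: the function name last_jobs_before_restart and both regular expressions are illustrative and assume the line layout seen in the excerpts above, not any general syslog format.

```python
import re

# Assumed layout, taken from the excerpts in this thread:
#   Jan 17 23:22:01 nebka /USR/SBIN/CRON[14703]: (root) CMD (du -cs /* > ...)
CRON_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+[\d:]{8})\s+\S+\s+\S*CRON\[\d+\]:\s+"
    r"\((?P<user>[^)]+)\)\s+CMD\s+\((?P<cmd>.*)\)\s*$",
    re.IGNORECASE,
)
# Assumed restart marker: "... syslogd 1.4.1#18: restart."
RESTART_RE = re.compile(r"syslogd\b.*\brestart", re.IGNORECASE)


def last_jobs_before_restart(lines):
    """Return one (timestamp, user, command) tuple per syslogd restart.

    Each tuple is the last cron entry seen before that restart line.
    A restart with no preceding cron entry contributes nothing.
    """
    suspects = []
    last_cron = None
    for line in lines:
        line = line.rstrip("\n")
        match = CRON_RE.match(line)
        if match:
            last_cron = (match["stamp"], match["user"], match["cmd"].strip())
        elif RESTART_RE.search(line):
            if last_cron is not None:
                suspects.append(last_cron)
            last_cron = None  # start fresh for the next boot
    return suspects
```

Fed the lines of the syslog file, this returns one suspect per restart. If the same command shows up for both crashes, that supports the du theory; it does not prove it, since a job that hangs the machine may have started earlier and simply not been the last one logged.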
Christian Zimmermann                          FIGUGEGL!
Department of Economics
University of Connecticut
341 Mansfield Road, Unit 1063
Storrs, CT 06269-1063
http://ideas.repec.org/zimm/
christian.zimmermann@uconn.edu
http://ideas.repec.org/e/pzi1.html
On Mon, 28 Jan 2008, Christian Zimmermann wrote:
Tim seems to have put nebka back online, and it seems to be spewing out emails. I will comment out everything in crontab and kill whatever is running so that we can investigate the problems.
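For the record, disabling the jobs with a distinctive prefix such as the '#CZ' marker used in this thread makes it easy to tell them apart from lines that were already comments, and to re-enable them one at a time later. A minimal sketch of that idea follows; the function names disable_jobs and enable_jobs are made up for illustration and work on the crontab text only.

```python
MARKER = "#CZ"  # marker used in this thread; any distinctive prefix would do


def disable_jobs(crontab_text, marker=MARKER):
    """Prefix every active line with the marker.

    Blank lines and existing comments are left untouched, so they can
    still be told apart from the jobs that were switched off.
    """
    out = []
    for line in crontab_text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            out.append(marker + line)
        else:
            out.append(line)
    return "".join(entry + "\n" for entry in out)


def enable_jobs(crontab_text, marker=MARKER):
    """Remove the marker from every line that starts with it."""
    out = []
    for line in crontab_text.splitlines():
        out.append(line[len(marker):] if line.startswith(marker) else line)
    return "".join(entry + "\n" for entry in out)
```

In practice the text would come from "crontab -l" and be loaded back with "crontab -" for each account. Note that this sketch also comments out variable assignments such as MAILTO, which is harmless while everything is disabled. To bring the service back gradually, one would remove the marker by hand from one job at a time rather than call enable_jobs on the whole file.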
_______________________________________________ RAS-run mailing list RAS-run@lists.openlib.org http://lists.openlib.org/cgi-bin/mailman/listinfo/ras-run