[H-SASIG] [SysAdmin] #12: http://lists.humbug.org.au/mailman/admindb/mailman returns 500 Internal Server Error
HUMBUG System Administrators
sasig at lists.humbug.org.au
Sat May 1 21:26:13 EDT 2010
#12: http://lists.humbug.org.au/mailman/admindb/mailman returns 500 Internal
Server Error
---------------------------+------------------------------------------------
 Reporter:  russell        |       Owner:  russell
     Type:  defect         |      Status:  closed
 Priority:  major          |   Milestone:  Sysadmin
Component:  mailing-list   |     Version:  NA
Resolution:  fixed         |    Keywords:
---------------------------+------------------------------------------------
Changes (by russell):
* status: assigned => closed
* resolution: => fixed
Comment:
Fixed.
Chronology was thus:
1. I recall someone getting sick of mailman spam notifications for
general. It seems the solution was to send them to /dev/null.
2. Mailman still carefully filed away all spam messages in its pending
queue, waiting for someone who cared to look at them.
3. Three years later, someone who cared came along (me). The pending queue
was held in a single python pickle which by this stage contained some
16,000 messages. When I attempted to look at the queue, mailman died.
4. Turns out this initial failure was caused by python running out of
memory when trying to process this pickle. (This much I had guessed).
5. When mailman died, it did so so horribly that it didn't clean up its
locks.
6. The next time you went in via the web interface, mailman went into what
was effectively an infinite loop, waiting to acquire the lock.
Unfortunately, that loop contains a memory leak. So it again ran out of
memory, and to all appearances looked like the original problem.
7. Turns out that our VM at that point (around 10 PM last night) ran out
of memory. The Linux OOM killer fired up, but unfortunately chose the
wrong process to kill. The process it did choose refused to die, so it
went into an infinite loop trying to kill it.
8. As a consequence of that, we lost control of the VM.
9. Stephen Thorn wrested control of the VM back for us this morning.
10. I have now deleted /var/lib/mailman/lists/mailman/request.pck. This
file was the original trigger for the problem.
11. I have restored normal processing of SPAM messages. All notifications
are now being sent to list-bounces at humbug.org.au, which for now is aliased
to president at humbug.org.au.
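For anyone who hits something similar, here is a minimal Python sketch of
two guards that relate to steps 3 and 6 above: checking the size of a
pickle on disk before anything tries to load it, and waiting for a lock
with a deadline instead of forever. This is not Mailman's code or its
locking scheme. The function names, the 50 MB threshold and the timeout
values are all assumptions made up for illustration.

```python
import os
import time

# Hypothetical threshold; pick whatever suits the memory of the host.
MAX_PICKLE_BYTES = 50 * 1024 * 1024


def pickle_too_large(path, max_bytes=MAX_PICKLE_BYTES):
    """Return True if the file at `path` is bigger than `max_bytes`.

    Only stats the file, so the pickle is never loaded into memory.
    """
    return os.path.getsize(path) > max_bytes


def acquire_lock(lock_path, timeout=10.0, poll=0.1):
    """Try to create `lock_path` exclusively, giving up after `timeout` seconds.

    Returns True if the lock file was created, False if the deadline passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # O_EXCL makes the create fail if the lock file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            if time.monotonic() >= deadline:
                return False  # bounded wait: report failure, don't spin
            time.sleep(poll)
        else:
            # Record the owner's PID so a stale lock can be traced by hand.
            os.write(fd, str(os.getpid()).encode())
            os.close(fd)
            return True


def release_lock(lock_path):
    """Remove the lock file created by acquire_lock."""
    os.remove(lock_path)
```

A caller would check pickle_too_large() on the queue file and refuse to
load it when it returns True, and would treat a False from acquire_lock()
as "a lock is probably stale, go and look" rather than retrying forever.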
Thanks to Greg for loaning me his VM and time to help track down this
problem. Thanks to Stephen for being patient with me when I rang him this
morning.
--
Ticket URL: <http://trac.humbug.org.au/ticket/12#comment:6>
SysAdmin <http://trac.humbug.org.au>
HUMBUG System Administration