<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-GB">
	<id>https://t2bwiki.iihe.ac.be/index.php?action=history&amp;feed=atom&amp;title=CE_oveloaded</id>
	<title>CE oveloaded - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://t2bwiki.iihe.ac.be/index.php?action=history&amp;feed=atom&amp;title=CE_oveloaded"/>
	<link rel="alternate" type="text/html" href="https://t2bwiki.iihe.ac.be/index.php?title=CE_oveloaded&amp;action=history"/>
	<updated>2026-04-20T09:52:42Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.43.5</generator>
	<entry>
		<id>https://t2bwiki.iihe.ac.be/index.php?title=CE_oveloaded&amp;diff=46&amp;oldid=prev</id>
		<title>Maintenance script: Created page with &quot; === CE Oveloaded ===  When the CE is overloaded, this can cause issues with publishing BDII info (when the site BDII is running on the CE at least). When &lt;tt&gt;top&lt;/tt&gt; show...&quot;</title>
		<link rel="alternate" type="text/html" href="https://t2bwiki.iihe.ac.be/index.php?title=CE_oveloaded&amp;diff=46&amp;oldid=prev"/>
		<updated>2015-08-26T12:28:19Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot; === CE Oveloaded ===  When the CE is overloaded, this can cause issues with publishing BDII info (when the site BDII is running on the CE at least). When &amp;lt;tt&amp;gt;top&amp;lt;/tt&amp;gt; show...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;&lt;br /&gt;
=== CE Overloaded ===&lt;br /&gt;
&lt;br /&gt;
An overloaded CE can cause problems with publishing BDII information, at least when the site BDII runs on the CE itself.&lt;br /&gt;
When &amp;lt;tt&amp;gt;top&amp;lt;/tt&amp;gt; shows high system CPU usage and &amp;lt;tt&amp;gt;vmstat 1&amp;lt;/tt&amp;gt; shows a high rate of context switches (the cs column), there are probably too many globus-job-manager processes running.&lt;br /&gt;
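The two symptoms above can also be checked from a script. The following is only a minimal sketch, assuming a Linux host with /proc mounted and a pgrep that supports -c; the function names are hypothetical and not part of any grid middleware. It reads the same kernel counter that vmstat reports as cs and counts processes whose command line mentions globus-job-manager.

```shell
# Hypothetical helpers for a quick look at the symptoms; names are not from the wiki.

# Context switches during one second, from the kernel counter in /proc/stat (Linux only).
ctxt_per_sec() {
    before=$(awk '/^ctxt/ {print $2}' /proc/stat)
    sleep 1
    after=$(awk '/^ctxt/ {print $2}' /proc/stat)
    echo $((after - before))
}

# Number of processes whose command line mentions globus-job-manager.
# pgrep -c prints the count and exits non-zero on zero matches, hence "|| true".
count_job_managers() {
    pgrep -c -f globus-job-manager || true
}

echo "context switches/s: $(ctxt_per_sec)"
echo "globus-job-manager processes: $(count_job_managers)"
```

What counts as too many depends on the machine, so the numbers are best compared against a reading taken while the CE is healthy.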
==== Possible causes ====&lt;br /&gt;
*sendmail + torque: Torque sends an email to the pool user on job completion. These mails will (probably) fail to be delivered, and the repeated retries of these submissions can bring the system to a halt&lt;br /&gt;
**solution:&lt;br /&gt;
**disable sendmail completely&lt;br /&gt;
**clean out /var/spool/clientmqueue and /var/spool/mqueue&lt;br /&gt;
*globus-job-manager tracks jobs from the past: /opt/globus/tmp/gram_job_state contains lock files whose names are the ids of the jobs to be checked. Some of them can be quite old&lt;br /&gt;
**solution: clean it up by removing the lock files and corresponding regular files&lt;br /&gt;
&amp;lt;pre&amp;gt;&lt;br /&gt;
cd /opt/globus/tmp/gram_job_state&lt;br /&gt;
find "$PWD" -ctime +20 -regex &amp;#039;.*\.lock&amp;#039; | sed &amp;#039;s/\.lock$//&amp;#039; &amp;gt; list&lt;br /&gt;
for i in $(grep /opt/globus/tmp/gram_job_state list); do echo "$i"; rm -f "$i" "$i.lock"; done&lt;br /&gt;
rm -f list&lt;br /&gt;
&amp;lt;/pre&amp;gt;&lt;br /&gt;
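Before deleting anything, it can help to look at what would be removed. The following is a minimal dry-run sketch of the same cleanup; list_stale_state is a hypothetical helper name, and its second argument is passed unchanged to the -ctime test of find (for example +20 for lock files changed more than 20 days ago). It only prints each state file together with its lock file and does not delete anything.

```shell
# list_stale_state DIR AGE
# Print every state file and its lock file in DIR whose lock file matches
# "find -ctime AGE". Hypothetical helper: it only prints, nothing is removed.
list_stale_state() {
    dir=$1
    age=$2
    find "$dir" -type f -name '*.lock' -ctime "$age" |
    while IFS= read -r lock; do
        # Strip the .lock suffix to get the corresponding regular file.
        printf '%s\n' "${lock%.lock}" "$lock"
    done
}

# Example call, guarded so the sketch is harmless on hosts without the directory.
if [ -d /opt/globus/tmp/gram_job_state ]; then
    list_stale_state /opt/globus/tmp/gram_job_state +20
fi
```

Once the printed list looks right, the same paths can be passed to rm -f, which is what the loop above does in one step.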
  &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
{{TracNotice|{{PAGENAME}}}}&lt;/div&gt;</summary>
		<author><name>Maintenance script</name></author>
	</entry>
</feed>