MtopTroubleShooting


Troubleshooting the Mtop machine

Note: these instructions are for root users only!


Restarting TopDB Critical services after mTop reboot

Note: all of these commands must be run as root.

ssh -X username@mtop.iihe.ac.be
sudo su -

The webserver (apache)

- Start
/etc/init.d/httpd start
- Restart
/etc/init.d/httpd restart
- Checking if it works
ps U apache | grep httpd | wc -l (should be > 1)

MySQL DataBase Server

- Start
/etc/init.d/mysqld start
- Restart
/etc/init.d/mysqld restart

Glite Installation

Outdated Glite Installation

Possible error:

crab:  Checking the status of all jobs: please wait

crab: Status Query failed with message : error executing
GLiteStatusQuery
Problem loading python LB API.
Your default python is 2.4.3 (#1, Nov 10 2010, 16:40:04) 
[GCC 4.1.2 20080704 (Red Hat 4.1.2-48)]

Solution:


sudo su -

cd /opt/
tar -czvf /root/glite-backup-DD-MM-YYYY.tar.gz glite
rm -rfv glite
(Make sure your private SSH key for the M machines is installed under /root/.ssh/id_rsa_USER.)
scp -r -i /root/.ssh/id_rsa_USER username@m7:/opt/glite .
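
The DD-MM-YYYY placeholder in the backup file name can be filled in automatically. The following is a minimal sketch, not part of the original procedure; the function name glite_backup_path is hypothetical. The usage line is left commented out because it deletes /opt/glite, and it chains the commands with && so that the old installation is only removed if the backup succeeded.

```shell
# Hypothetical helper: build the backup path with today's date in DD-MM-YYYY form.
glite_backup_path() {
    echo "/root/glite-backup-$(date +%d-%m-%Y).tar.gz"
}

# Usage as root (commented out because it removes /opt/glite):
# cd /opt/ && tar -czvf "$(glite_backup_path)" glite && rm -rfv glite
```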

Check that the version is the same as on m7:

Compare glite-wms-job-status --version on mtop and m7
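
The comparison can also be scripted. This is a sketch under assumptions: same_version is a hypothetical helper (not a gLite tool), and the commented usage assumes that the key used for scp above also allows ssh logins to m7.

```shell
# Hypothetical helper: compare two version strings, ignoring differences in whitespace.
same_version() {
    a=$(printf '%s' "$1" | tr -s '[:space:]' ' ' | sed 's/^ //; s/ $//')
    b=$(printf '%s' "$2" | tr -s '[:space:]' ' ' | sed 's/^ //; s/ $//')
    if [ "$a" = "$b" ]; then
        echo "versions match"
    else
        echo "versions differ"
        return 1
    fi
}

# Usage on mtop (assumes ssh access to m7 with the key mentioned above):
# local_v=$(glite-wms-job-status --version 2>&1)
# remote_v=$(ssh -i /root/.ssh/id_rsa_USER username@m7 'glite-wms-job-status --version' 2>&1)
# same_version "$local_v" "$remote_v"
```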

CRL out of date

Possible error (also encountered when submitting crab jobs):

[dhondt@mtop ~/AutoMaticTopTreeProducer]$ voms-proxy-init --voms cms:/cms/becms --valid 190:00
Cannot find file or dir: /user/dhondt/.glite/vomses
Enter GRID pass phrase:
Your identity: /C=BE/O=BEGRID/OU=ELEM/OU=VUB/CN=Jorgen DHondt
Creating temporary proxy ................................................... Done
Contacting  lcg-voms.cern.ch:15002 [/DC=ch/DC=cern/OU=computers/CN=lcg-voms.cern.ch] "cms" Failed

Error: Could not establish authenticated connection with the server.
GSS Major Status: Authentication Failed
GSS Minor Status Error Chain:
globus_gss_assist: Error during context initialization
globus_gsi_callback_module: Could not verify credential
globus_gsi_callback_module: Could not verify credential
globus_gsi_callback_module: Invalid CRL: The available CRL has expired

Solution:

sudo su -
/opt/external/usr/sbin/fetch-crl -l /etc/grid-security/certificates/ -o /etc/grid-security/certificates/
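
To see which CRLs have actually expired, before or after running fetch-crl, something like the following may help. It is a sketch and not part of the original procedure: it assumes that the CRLs are stored as PEM files named *.r0 in the certificates directory, that openssl is installed, and that date is GNU date (needed for -d). Both function names are hypothetical.

```shell
# Hypothetical helper: succeeds if the given date lies in the past.
# Relies on GNU date's -d option to parse the date string.
crl_date_expired() {
    next_epoch=$(date -u -d "$1" +%s 2>/dev/null) || return 2
    now_epoch=$(date -u +%s)
    [ "$next_epoch" -lt "$now_epoch" ]
}

# Hypothetical helper: list CRLs whose nextUpdate date has passed.
# Assumes the CRLs are PEM files named *.r0 in the given directory.
list_expired_crls() {
    dir=${1:-/etc/grid-security/certificates}
    for crl in "$dir"/*.r0; do
        [ -e "$crl" ] || continue
        # openssl prints a line of the form "nextUpdate=<date>"
        next_update=$(openssl crl -in "$crl" -noout -nextupdate 2>/dev/null | cut -d= -f2)
        [ -n "$next_update" ] || continue
        if crl_date_expired "$next_update"; then
            echo "expired: $crl (nextUpdate $next_update)"
        fi
    done
}

# Usage as root:
# list_expired_crls /etc/grid-security/certificates
```

If the list is still non-empty after fetch-crl has run, the download for those CAs probably failed.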


Unable to create proxy

Error message:

[mmaes@mtop2 ~]$ voms-proxy-init --voms cms:/cms/becms --valid 190:00
Cannot find file or dir: /user/mmaes//.glite/vomses
Enter GRID pass phrase:
Your identity: /C=BE/O=BEGRID/OU=ELEM/OU=VUB/CN=Michael Maes
Creating temporary proxy ............................ Done
Contacting  lcg-voms.cern.ch:15002 [/DC=ch/DC=cern/OU=computers/CN=lcg-voms.cern.ch] "cms" Failed

Error: Could not establish authenticated connection with the server.
    globus_gss_assist token :-1: read failure: Operation not permitted

Possible solution (as root):

Update the system clock: ntpdate ntp.telenet.be

Connection to Grid Infrastructure

Connection problem to behar*.iihe.ac.be

Possible error:

Connections to behar* fail when using srmls, srmcp, etc.

Solution:

On mtop:
  sudo su -
  route
On m7:
  /sbin/route

Compare the two outputs to see whether any behar routes are missing on mtop. If so, add them like this:

  route add -host beharXXX.iihe.ac.be gw beharXXX.wn.iihe.ac.be

Then add the same route add line to /root/routes.
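
Comparing the two routing tables by eye is error-prone, so the comparison can be sketched as a small script. This is an illustration under assumptions: the output of route on each machine has been saved to a file, behar destinations appear in the first column, and missing_behar_routes is a hypothetical helper name.

```shell
# Hypothetical helper: print behar destinations present on m7 but missing on mtop.
# $1: file with the output of `route` on mtop
# $2: file with the output of `/sbin/route` on m7
missing_behar_routes() {
    mtop_list=$(mktemp) || return 1
    m7_list=$(mktemp) || return 1
    # Keep only the destination column of the behar entries, sorted for comm
    awk '$1 ~ /^behar/ { print $1 }' "$1" | sort -u > "$mtop_list"
    awk '$1 ~ /^behar/ { print $1 }' "$2" | sort -u > "$m7_list"
    # comm -13 prints the lines that occur only in the second file (m7)
    comm -13 "$mtop_list" "$m7_list"
    rm -f "$mtop_list" "$m7_list"
}

# Usage:
# route > /tmp/routes.mtop                 (on mtop)
# /sbin/route > /tmp/routes.m7             (on m7, then copy the file to mtop)
# missing_behar_routes /tmp/routes.mtop /tmp/routes.m7
```

Each destination printed is a candidate for a route add command as shown above. Note that route may truncate long host names in its output, so check the full name before adding the route.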


