XMM

Support for XMM Users

The following people currently run jobs on the Grid infrastructure:

  • Xavier Barcons
  • Francisco Carrera
  • Maite Ceballos
  • José Ramón Rodón
  • Francesca Panessa
  • Rodrigo Gil-Merino
  • Jacobo Enebro
  • Amalia Corral
  • Angel Ruiz


If you are a member of the XMM group and want to join these Grid users, first have a look at the [General Support Documentation], and then contact Isabel Campos.


Submission of Serial Batch Jobs

In what follows we describe the submission of serial batch jobs to the Grid resources at IFCA, whatever the project is. The example contains a Requirements line which forces the job to end up on our local machines, both for CPU and for storage.

Advanced users can try removing this restriction and go to the whole Grid.

  • Simple Job without Storage Elements (transferred data under 20 MB)

To submit a serial job to the Grid batch queues at IFCA, the job has to be described in the language of the Grid batch system, called Job Description Language (JDL). This is an example:

 # Mandatory attributes
 Executable = "myexe";
 StdOutput = "myexe.out";
 StdError = "myexe.err";
 # I/O files to be staged from/to the User Interface
 InputSandbox = {"myexe","input.dat"};
 OutputSandbox = {"myexe.out","myexe.err","myoutput.dat"};
 Requirements= other.GlueCEUniqueID == "egeece01.ifca.es:2119/jobmanager-lcgpbs-planck";


This is the description of the job:

1. Executes the binary myexe, which needs the file input.dat to work.

2. Writes STDOUT and STDERR to myexe.out and myexe.err.

3. Writes the output of the program to myoutput.dat.

The size of the data that can be carried in sandboxes is limited to a few MB. Evidently, few real scientific applications can work within such a limit. However, we strongly suggest that users first run simple tests like this one successfully before moving on to more complicated scenarios.
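
Since the sandbox limit is easy to exceed by accident, a small pre-submission check can help. The following is only a sketch: the function names and the SANDBOX_LIMIT_MB variable are hypothetical, and the default of 20 MB is taken from the heading of this example rather than from any documented limit of the resource broker.

```shell
#!/bin/sh
# Hypothetical size budget for the InputSandbox, in MB (20 comes from the text above).
SANDBOX_LIMIT_MB=${SANDBOX_LIMIT_MB:-20}

# Print the total size in bytes of the files given as arguments.
# Fails if any of the files does not exist.
sandbox_bytes() {
    total=0
    for f in "$@"; do
        [ -f "$f" ] || { echo "missing sandbox file: $f" >&2; return 1; }
        size=$(wc -c < "$f")
        total=$((total + size))
    done
    echo "$total"
}

# Succeed only if all the files together fit within SANDBOX_LIMIT_MB.
sandbox_fits() {
    bytes=$(sandbox_bytes "$@") || return 1
    [ "$bytes" -le $((SANDBOX_LIMIT_MB * 1024 * 1024)) ]
}
```

A possible use before submitting is `sandbox_fits myexe input.dat || echo "use a Storage Element instead"`.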


  • Using the Storage Elements for massive Input/Output

In most real applications one needs to deal with data sizes that do not fit within the sandbox limit. The way to proceed is then based on three steps:

 1. Store the input data our program needs in the appropriate place on the Grid

Every VO, and in particular planck, has a dedicated storage directory under /grid. To access that directory, first export the variable LFC_HOST so that it points to the central EGEE file catalog host, and then use the following commands:

 [rodon@egeeui01 i2g-test]$ export LFC_HOST=lfcserver.cnaf.infn.it
 [rodon@egeeui01 i2g-test]$ lfc-ls /grid/planck
 ceballos
 rodon
 corral

Create a directory to store the input and protect it according to your needs. For example, you might store all your maps in a directory like XMMDATA and give read access to every planck member:

 [rodon@egeeui01 i2g-test]$ lfc-mkdir /grid/planck/XMMDATA
 [rodon@egeeui01 i2g-test]$ lfc-chmod 744 /grid/planck/XMMDATA
 [rodon@egeeui01 i2g-test]$ lfc-ls -l /grid/planck/
 drwxr--r--   0 102      106                       0 Feb 15 16:15 XMMDATA
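
To decide which mode to pass to lfc-chmod, it can help to see what an octal mode means in the rwx notation printed by lfc-ls -l. The helper below is a hypothetical illustration and does not talk to the catalog. Note that under the usual POSIX rules a directory also needs the x bit to be entered; whether the catalog enforces this in the same way is not stated here.

```shell
#!/bin/sh
# Translate an octal mode such as 744 into the rwx string shown by "ls -l" style listings.
mode_to_rwx() {
    rest=$1
    out=""
    [ -n "$rest" ] || { echo "usage: mode_to_rwx OCTAL_MODE" >&2; return 1; }
    while [ -n "$rest" ]; do
        digit=${rest%"${rest#?}"}   # first character of what is left
        rest=${rest#?}              # drop that character
        case $digit in
            0) out="${out}---" ;;
            1) out="${out}--x" ;;
            2) out="${out}-w-" ;;
            3) out="${out}-wx" ;;
            4) out="${out}r--" ;;
            5) out="${out}r-x" ;;
            6) out="${out}rw-" ;;
            7) out="${out}rwx" ;;
            *) echo "not an octal digit: $digit" >&2; return 1 ;;
        esac
    done
    printf '%s\n' "$out"
}
```

For instance, `mode_to_rwx 744` prints `rwxr--r--`, which matches the permissions in the listing above apart from the leading d for "directory".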


Now let us copy a big data file from your home directory on the User Interface to the Grid and register it in the catalog:

 lcg-cr --vo planck -l lfn:/grid/planck/rodon/mytarball.tar file:///home/rodon/mytarball.tar

The file mytarball.tar will be accessible to jobs running on the Grid when referenced as lfn:/grid/planck/rodon/mytarball.tar
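
The two arguments of lcg-cr follow fixed patterns: a logical file name under /grid/<VO>/ and a file:// URL with an absolute local path. A minimal sketch of helpers that build both is given below; the function names are hypothetical, and the upload only calls lcg-cr when the command is actually installed.

```shell
#!/bin/sh
# Build a file:// URL from a local path (relative paths are resolved against $PWD).
file_url() {
    case $1 in
        /*) printf 'file://%s\n' "$1" ;;
        *)  printf 'file://%s/%s\n' "$PWD" "$1" ;;
    esac
}

# Build a logical file name: lfn_for VO SUBDIR LOCALFILE
lfn_for() {
    printf 'lfn:/grid/%s/%s/%s\n' "$1" "$2" "$(basename "$3")"
}

# upload VO SUBDIR LOCALFILE: copy and register a local file, as in the command above.
upload() {
    src=$(file_url "$3")
    dst=$(lfn_for "$1" "$2" "$3")
    if command -v lcg-cr >/dev/null 2>&1; then
        lcg-cr --vo "$1" -l "$dst" "$src"
    else
        echo "lcg-cr not found; would run: lcg-cr --vo $1 -l $dst $src"
    fi
}
```

With these helpers, `upload planck rodon /home/rodon/mytarball.tar` corresponds to the lcg-cr command shown above.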

 2. Design a shell script that, used inside a JDL script, makes the input available to
 the job and puts the output on a Storage Element once the job has completed.

In principle this means copying the files we need from a Storage Element to the node carrying out the calculation before the job executes. When the job completes, one option is to tar/gzip the result and copy it to a Storage Element. This is an example script which would do the job:

 cat my-script.sh
 #!/bin/sh
 # Debug info
 echo \| Execution start: `date` \| Host: `hostname` \| User: `whoami` \| Path: `pwd` \|
 # Remember the directory the job starts in; on a worker node it need not be $HOME
 WORKDIR=`pwd`
 # Let us first get the input of the simulation in place
 lcg-cp --vo planck lfn:/grid/planck/XMMDATA/data.tar.gz file:///tmp/map.tar.gz
 # Let us create a run directory (it is not mandatory!)
 mkdir RUN_febrero15
 cp /tmp/map.tar.gz RUN_febrero15/.
 cp executable RUN_febrero15/.
 cd RUN_febrero15/
 tar xzvf map.tar.gz
 rm map.tar.gz
 # Restore the execute bit in case the sandbox transfer dropped it
 chmod +x executable
 ./executable
 # Once here the program has ended: tar the result and put it in an accessible place
 cd "$WORKDIR"
 tar czvf run_febrero15.tgz RUN_febrero15/*
 lcg-cr --vo planck -l lfn:/grid/planck/rodon/run_febrero15.tgz file://$WORKDIR/run_febrero15.tgz
 # Debug info
 echo \| Execution end: `date` \|


The result is now accessible from the User Interface where the user is logged in, and can be retrieved with the command lcg-cp:

 lcg-cp --vo planck lfn:/grid/planck/rodon/run_febrero15.tgz file://$HOME/gridruns/run_febrero15.tgz
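
File transfers to and from Storage Elements can fail transiently, and the example script does not check their exit status. One possible way to harden the stage-in and stage-out steps is a small retry wrapper. This is a sketch only: the function name, the number of attempts and the RETRY_DELAY variable are hypothetical choices, not part of the Grid tools.

```shell
#!/bin/sh
# Seconds to wait between attempts (hypothetical default).
RETRY_DELAY=${RETRY_DELAY:-30}

# retry N COMMAND [ARGS...]: run COMMAND up to N times until it succeeds.
retry() {
    attempts=$1
    shift
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $n of $attempts failed: $*" >&2
        # Wait before the next attempt, but not after the last one
        if [ "$n" -lt "$attempts" ]; then
            sleep "$RETRY_DELAY"
        fi
        n=$((n + 1))
    done
    return 1
}
```

Inside the job script, the stage-in line could then read `retry 3 lcg-cp --vo planck lfn:/grid/planck/XMMDATA/data.tar.gz file:///tmp/map.tar.gz || exit 1`, so that the job stops early instead of running the executable without its input.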


Finally, this would be the JDL description of this job. Notice we have replaced the executable by the name of the script, my-script.sh

 # Mandatory attributes
 Executable = "my-script.sh";
 StdOutput = "my-script.out";
 StdError = "my-script.err";
 # Environment variables
 Environment = {"LFC_HOST=lfcserver.cnaf.infn.it"};
 # I/O files to be staged from/to the User Interface
 InputSandbox = {"my-script.sh","executable"};
 OutputSandbox = {"my-script.out","my-script.err"};
 Requirements= other.GlueCEUniqueID == "egeece01.ifca.es:2119/jobmanager-lcgpbs-ifusion";


The command edg-job-submit will submit this job to the IFCA resources and use our File Catalog as the reference for file staging.
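
When many similar jobs are needed, the JDL file can be generated instead of written by hand. The sketch below writes a description with the same attributes as the example above; write_jdl is a hypothetical helper, and the Requirements line is only added when a CE identifier is passed as the fourth argument.

```shell
#!/bin/sh
# write_jdl SCRIPT PAYLOAD JDLFILE [CE_ID]
# Writes a JDL description that runs SCRIPT and ships PAYLOAD along with it.
write_jdl() {
    script=$1
    payload=$2
    jdl=$3
    ce=${4:-}
    base=${script%.sh}
    cat > "$jdl" <<EOF
# Mandatory attributes
Executable = "$script";
StdOutput = "$base.out";
StdError = "$base.err";
# Environment variables
Environment = {"LFC_HOST=${LFC_HOST:-lfcserver.cnaf.infn.it}"};
# I/O files to be staged from/to the User Interface
InputSandbox = {"$script","$payload"};
OutputSandbox = {"$base.out","$base.err"};
EOF
    # Optional restriction to one Computing Element
    if [ -n "$ce" ]; then
        echo "Requirements = other.GlueCEUniqueID == \"$ce\";" >> "$jdl"
    fi
}
```

After `write_jdl my-script.sh executable my-script.jdl`, the job would be submitted with `edg-job-submit my-script.jdl`.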


Submitting long jobs

An important point to consider is that the proxy of the user who submitted the job has to remain valid for the whole period the job is expected to queue and run.

By default the proxy is initialized for 12 hours. It is possible, though, to initialize the proxy for a longer period.
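
Before submitting, it can be worth comparing the time left on the proxy with the time the job is expected to spend queuing and running. The following is a sketch under assumptions: the function names are hypothetical, and it assumes the Globus command grid-proxy-info is the tool available on the User Interface for querying the proxy.

```shell
#!/bin/sh
# proxy_covers SECONDS_LEFT HOURS_NEEDED
# Succeeds if a proxy with SECONDS_LEFT remaining outlives a job needing HOURS_NEEDED.
proxy_covers() {
    [ "$1" -ge $(( $2 * 3600 )) ]
}

# Print the seconds left on the current proxy, or 0 when this cannot be determined.
# Assumption: grid-proxy-info is installed; otherwise the function just prints 0.
proxy_timeleft() {
    if command -v grid-proxy-info >/dev/null 2>&1; then
        grid-proxy-info -timeleft 2>/dev/null || echo 0
    else
        echo 0
    fi
}
```

For a job expected to need 48 hours, `proxy_covers "$(proxy_timeleft)" 48 || echo "proxy too short"` would flag a default 12-hour proxy. With the Globus tools, a longer proxy can be requested with something like `grid-proxy-init -valid 48:00`; which initialization command is used at your site is an assumption here.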

However, the elegant way to keep the job from dying because the user proxy has expired is automatic proxy renewal through a proxy server. To use this facility, do the following:

  #Create and Store a long term proxy on the proxy server
  [rodon@egeeui01 i2g-test]$ myproxy-init -s i2gpx01.ifca.es -d -n
  Your identity: /C=ES/O=DATAGRID-ES/O=BIFI/CN=Jose_Ramon_Rodon_Ortiz
  Enter GRID pass phrase for this identity:
  Creating proxy .......................................... Done 
  Proxy Verify OK
  Your proxy is valid until: Thu Feb 22 17:12:25 2007
  A proxy valid for 168 hours (7.0 days) for user /C=ES/O=DATAGRID-ES/O=BIFI/CN=Jose_Ramon_Rodon_Ortiz now exists on i2gpx01.ifca.es.

The command myproxy-init has created a proxy valid for one week on the IFCA proxy server, i2gpx01.ifca.es. Next the user has to add a line to the JDL job description telling the job where the proxy is located:

  MyProxyServer = "i2gpx01.ifca.es";

Once the job is done, destroy that long proxy, because in general it is not safe to have long proxies impersonating us over the network.

  [rodon@egeeui01 i2g-test]$ myproxy-destroy -s i2gpx01.ifca.es -d
  Default MyProxy credential for user /C=ES/O=DATAGRID-ES/O=BIFI/CN=Jose_Ramon_Rodon_Ortiz was successfully removed.

Submission of Parallel MPI Batch Jobs

Dealing with Storage Elements

Currently the following Storage Elements are available to the planck Virtual Organization

  [rodon@egeeui01 XMM]$ lcg-infosites --vo planck se
  Avail Space(Kb) Used Space(Kb)  Type    SEs
  ----------------------------------------------------------
  105471484       4780676         n.a     se-ieg.bifi.unizar.es
  193471324       923487          n.a     se.i2g.cesga.es
  30540000000     7440000000      n.a     dpm.cyf-kr.edu.pl
  1131455168      122804          n.a     i2gse01.ifca.es
  1926285861      6972            n.a     dcache01.lip.pt
  72982596        32840           n.a     i2g-se01.lip.pt


By default the Storage Element employed is the one at IFCA. A different one can be specified using the flag -d:

 lcg-cr --vo planck -d se.i2g.cesga.es -l lfn:/grid/planck/rodon/testing  file:///home/rodon/FH/FeynHiggs
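
If the choice of Storage Element is to be automated, the table printed by lcg-infosites can be filtered with awk. The sketch below picks the entry with the most available space, which is just one possible criterion; best_se is a hypothetical name, and the column layout is assumed to be the one shown in the listing above.

```shell
#!/bin/sh
# Read an lcg-infosites style table on stdin and print the SE with the most free space.
# Header and separator lines are skipped because their first field is not a number.
best_se() {
    awk '$1 ~ /^[0-9]+$/ && $1 + 0 > max { max = $1 + 0; se = $4 }
         END { if (se != "") print se }'
}
```

It could be used as `SE=$(lcg-infosites --vo planck se | best_se)` followed by `lcg-cr --vo planck -d "$SE" ...`.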

Using SAS Software
