DUS: Submitting jobs to the Grid

Jobs are submitted to the Grid with dedicated tools, namely the EDG tools.

General scheme

The outline of the steps to follow would be:

  1. Create a JDL file that will describe the job:
     % vi test.jdl
     edit at will and save

  2. Submit it with the corresponding EDG tool:
     % edg-job-submit test.jdl

  3. Check the status of the job:
     % edg-job-status httpsID

  4. Retrieve output when finished:
     % edg-job-get-output httpsID

Generation of required files

JDL file for a simple test without I/O

Before submitting a job, you must write its JDL description (refer to the LCG User Guide for details). This is an example JDL file for such a job:

# Mandatory attributes
Executable = "test.sh";
StdOutput  = "test.out";
StdError   = "test.err";

# I/O files to be staged from/to the User Interface
InputSandbox  = {"test.sh"};
OutputSandbox = {"test.out","test.err"};

where the script test.sh could be, for example:

#!/bin/sh

# Debug info
echo \# Execution start: `date`
echo \# Host: `hostname`
echo \# User: `whoami`
echo \# Path: `pwd`
echo

# Program execution
cat /proc/cpuinfo

# Debug info
echo \# Execution end: `date`
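
When many similar jobs have to be prepared, the two files above can also be generated from the shell. The following is a minimal sketch, not part of the EDG tools: the function name write_test_job and its directory argument are made up for illustration, and the job script is a shortened version of the one shown above.

```sh
# Sketch: write test.jdl and test.sh into a directory (default: current one).
# write_test_job is a hypothetical helper, not an EDG command.
write_test_job() {
    dir=${1:-.}

    # JDL description, with the same attributes as the example above.
    cat > "$dir/test.jdl" <<'EOF'
# Mandatory attributes
Executable = "test.sh";
StdOutput  = "test.out";
StdError   = "test.err";

# I/O files to be staged from/to the User Interface
InputSandbox  = {"test.sh"};
OutputSandbox = {"test.out","test.err"};
EOF

    # Job script; the quoted 'EOF' keeps the backquotes from being
    # expanded while the file is written.
    cat > "$dir/test.sh" <<'EOF'
#!/bin/sh
echo \# Execution start: `date`
echo \# Host: `hostname`
cat /proc/cpuinfo
echo \# Execution end: `date`
EOF
    chmod u+x "$dir/test.sh"
}

# Possible usage on the UI:
#   write_test_job . && edg-job-submit test.jdl
```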

JDL file for reading input from a SE

In the previous example, all the files copied to the Input Sandbox were small (of the order of kilobytes). More often than not, however, a calculation needs large files (libraries, huge input data files, large executables, etc.). In that case the jobmanager will reject the job for being too large. To circumvent this problem, copy the necessary files onto the Grid filesystem before sending the job, as explained elsewhere.
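
To judge in advance whether a set of files is small enough for the Input Sandbox, their total size can be added up before submission. The sketch below assumes nothing about the actual limit, which this page does not state: sandbox_bytes is a hypothetical helper and LIMIT in the usage comment is only a placeholder value.

```sh
# Sketch: print the total size in bytes of the files planned for the
# InputSandbox. sandbox_bytes is a hypothetical helper, not an EDG command.
sandbox_bytes() {
    total=0
    for f in "$@"; do
        # Refuse to continue if a listed file does not exist.
        [ -f "$f" ] || { echo "missing file: $f" >&2; return 1; }
        total=$((total + $(wc -c < "$f")))
    done
    echo "$total"
}

# Possible usage (LIMIT is a placeholder; ask your site for the real value):
#   LIMIT=1000000
#   [ "$(sandbox_bytes test.sh)" -le "$LIMIT" ] || echo "stage the large files on a SE instead"
```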

This is a JDL script which submits a job that needs to read the input from a SE:

# Mandatory attributes
Executable = "installFusion-ifca.sh";
StdOutput  = "installFusion.out";
StdError   = "installFusion.err";

# Environment variables
Environment = {"LFC_HOST=lfc01.lip.pt"};

# I/O files to be staged from/to the User Interface
InputSandbox  = {"installFusion-ifca.sh"};
OutputSandbox = {"installFusion.out","installFusion.err"};
Requirements  = other.GlueCEUniqueID == "i2gce01.ifca.es:2119/jobmanager-lcgpbs-ifusion";

Notice the extra Requirements field: it requests that the job run at the IFCA site. The script includes the extra commands that copy the required files from the SE. The contents of installFusion-ifca.sh are:

#!/bin/sh

# Debug info:
echo \# Execution start: `date`
echo \# Host: `hostname`
echo \# User: `whoami`
echo \# Path: `pwd`
echo

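# Commands:
# Copy the tarball from the Grid (by its logical file name) to the local /tmp
lcg-cp -v --vo ifusion lfn:/grid/ifusion/username/fusionGLUT-0.2.tar.gz file:///tmp/myfile.tar.gz
# Place it in the software area and remove the temporary copy
cp /tmp/myfile.tar.gz /opt/exp_soft/ifusion/.
rm /tmp/myfile.tar.gz
# Unpack it there, listing the directory first for debugging
cd /opt/exp_soft/ifusion
ls -lrt /opt/exp_soft/ifusion
chmod u+x myfile.tar.gz
tar xzvf myfile.tar.gz
rm myfile.tar.gz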

# Debug info:
echo
echo \# Execution end: `date`
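
The script above runs every command regardless of whether the previous one succeeded, so a failed lcg-cp only shows up later as a confusing tar error. The following sketch shows one way to make the unpacking step stop at the first failure. The helper name unpack_into is hypothetical, and the -C option is assumed to be supported by the tar installed on the worker node (it is in GNU tar).

```sh
# Sketch: unpack a gzipped tarball into a destination directory and
# report failures. unpack_into is a hypothetical helper name.
unpack_into() {
    tarball=$1
    dest=$2

    # The tarball must exist and be non-empty (a failed copy leaves nothing).
    [ -s "$tarball" ] || { echo "missing or empty: $tarball" >&2; return 1; }

    # The destination must be an existing, writable directory.
    [ -d "$dest" ] && [ -w "$dest" ] || { echo "cannot write to: $dest" >&2; return 1; }

    # Extract directly into the destination (assumes tar supports -C).
    tar xzf "$tarball" -C "$dest" || { echo "unpack failed: $tarball" >&2; return 1; }
}

# Possible usage inside the job script:
#   lcg-cp -v --vo ifusion lfn:/grid/ifusion/username/fusionGLUT-0.2.tar.gz file:///tmp/myfile.tar.gz \
#       && unpack_into /tmp/myfile.tar.gz /opt/exp_soft/ifusion
```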

Submitting the job

You can submit the job with the edg-job-submit command. If the submission is successful, you will obtain output like the following:

% edg-job-submit test.jdl

Selected Virtual Organisation name (from proxy certificate extension): icompchem
Connecting to host i2grb01.ifca.es, port 7772
Logging to host i2grb01.ifca.es, port 9002
 

*********************************************************************************************
                               JOB SUBMIT OUTCOME
 The job has been successfully submitted to the Network Server.
 Use edg-job-status command to check job current status. Your job identifier (edg_jobId) is:

 - https://i2grb01.ifca.es:9000/MGQOSE5dnrjnb0s82UBdmQ


*********************************************************************************************
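
For scripting, the job identifier can be picked out of this output instead of being copied by hand. The sketch below assumes that the identifier is the only line consisting of an https:// URL, optionally preceded by spaces and a dash, as in the output above. extract_job_id is a hypothetical helper, not an EDG command.

```sh
# Sketch: read edg-job-submit output on stdin and print the job identifier.
# Assumes the identifier line looks like " - https://host:port/id".
extract_job_id() {
    sed -n 's|^[ -]*\(https://[^ ]*\).*$|\1|p'
}

# Possible usage:
#   id=$(edg-job-submit test.jdl | extract_job_id)
#   edg-job-status "$id"
```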

Managing submitted jobs

Checking the status of your submissions

The command edg-job-status httpsID checks the status of the job. When we submit a job:

% edg-job-submit installFusion-cesga.jdl
Selected Virtual Organisation name (from proxy certificate extension): ifusion
Connecting to host i2grb01.ifca.es, port 7772
Logging to host i2grb01.ifca.es, port 9002
*******************************************************************************
                             JOB SUBMIT OUTCOME
 The job has been successfully submitted to the Network Server.
 Use edg-job-status command to check job current status. Your job
 identifier (edg_jobId) is:
  - https://i2grb01.ifca.es:9000/1k_qksI5SCyVvW-c3UcK2A
*******************************************************************************

We check its status by typing:

% edg-job-status https://i2grb01.ifca.es:9000/1k_qksI5SCyVvW-c3UcK2A
*************************************************************
BOOKKEEPING INFORMATION:
Status info for the Job : https://i2grb01.ifca.es:9000/1k_qksI5SCyVvW-c3UcK2A
Current Status:     Scheduled
Status Reason:      Job successfully submitted to Globus
Destination:        ce.i2g.cesga.es:2119/jobmanager-lcgpbs-ifusiongrid
reached on:         Wed Dec 20 16:11:25 2006
*************************************************************
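
When the status is needed inside a script, the "Current Status" field can be extracted from this report. The following is a sketch under the assumption that the report always contains a line of the form shown above. job_state is a hypothetical helper, and the state names in the commented loop are assumptions that should be checked against the LCG User Guide.

```sh
# Sketch: read edg-job-status output on stdin and print the value of the
# "Current Status:" field, without surrounding whitespace.
job_state() {
    sed -n 's/^Current Status:[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p'
}

# Possible usage: poll until the job reaches a final state.
# The state names below are assumptions, not taken from this page.
#   while :; do
#       state=$(edg-job-status "$id" | job_state)
#       case $state in Done*|Aborted*|Cancelled*|Cleared*) break ;; esac
#       sleep 60
#   done
```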

Keeping track of many jobs may become cumbersome. The edg-job-submit command has an option (-o) to save the httpsID of the job in the specified file, for bookkeeping. For example, if we send two jobs:

% edg-job-submit -o logfilename test1.jdl
Selected Virtual Organisation name (from proxy certificate extension): ifusion
Connecting to host i2g-rb01.lip.pt, port 7772
Logging to host i2g-rb01.lip.pt, port 9002
============================ edg-job-submit Success ============================
The job has been successfully submitted to the Network Server.
Use edg-job-status command to check job current status. Your job identifier (edg_jobId) is:
- https://i2g-rb01.lip.pt:9000/mGbOWrPmX-v6haZsX4rEXg
The edg_jobId has been saved in the following file:
/home/username/logfilename
================================================================================

and a second job:

% edg-job-submit -o logfilename test2.jdl
Selected Virtual Organisation name (from proxy certificate extension): ifusion
Connecting to host i2g-rb01.lip.pt, port 7772
Logging to host i2g-rb01.lip.pt, port 9002
============================ edg-job-submit Success ============================
The job has been successfully submitted to the Network Server.
Use edg-job-status command to check job current status. Your job identifier (edg_jobId) is:
- https://i2g-rb01.lip.pt:9000/W7AGjCjKZPLcIHv1nsLPeg
The edg_jobId has been saved in the following file:
/home/username/logfilename
================================================================================

You can then check the status of your jobs using the command "edg-job-status -i logfilename":

% edg-job-status -i logfilename
------------------------------------------------------------------
1 : https://i2g-rb01.lip.pt:9000/mGbOWrPmX-v6haZsX4rEXg
2 : https://i2g-rb01.lip.pt:9000/W7AGjCjKZPLcIHv1nsLPeg
a : all
q : quit
------------------------------------------------------------------
Choose one or more edg_jobId(s) in the list - [1-5]all:

We can then press either "1", "2", "a" or "q" and Enter.
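
The saved identifiers can also be processed without the interactive menu. The sketch below assumes that the file written with -o holds one identifier per line, starting with https://, and that any other lines can be ignored. This file layout is an assumption and list_job_ids is a hypothetical helper.

```sh
# Sketch: print the job identifiers stored in a bookkeeping file.
# Assumes one "https://..." identifier per line; other lines are skipped.
list_job_ids() {
    grep '^https://' "$1"
}

# Possible usage: query every saved job in turn.
#   for id in $(list_job_ids logfilename); do
#       edg-job-status "$id"
#   done
```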

Retrieving your output

Main article: DUS: Managing files on the Grid#Retrieving output files of a finished job

The files you specified in the output Sandbox of the job can be retrieved by the command:

% edg-job-get-output httpsID

The rest of the output has to be written onto Storage Elements and retrieved using LCG commands.

Troubleshooting

EDG_WL_LOCATION not set

When first running edg-job-submit, we might encounter the following error:

% edg-job-submit test.jdl
Error: Please set the EDG_WL_LOCATION environment variable pointing to the userinterface installation path

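This error can also occur with other commands, for example when managing files. According to the message, EDG_WL_LOCATION should point to the User Interface installation path (possibly /opt/edg on this UI, judging from the traceback in the next section, although this has not been verified). In practice, setting it to any value appears to be enough. In bash: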

% export EDG_WL_LOCATION=/tmp

in tcsh:

% setenv EDG_WL_LOCATION /tmp

This seems to fix the issue.
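
Instead of an arbitrary directory, a more plausible value can be derived from the location of the EDG commands themselves. This is only a sketch resting on an unverified assumption: that the tools are installed as <prefix>/bin/<command> and that <prefix> is what the variable should hold. prefix_of is a hypothetical helper.

```sh
# Sketch: derive an installation prefix from the full path of an
# executable, assuming a <prefix>/bin/<command> layout.
prefix_of() {
    bindir=$(dirname "$1")
    dirname "$bindir"
}

# Possible usage in bash/sh: set the variable only if it is not defined yet.
#   if [ -z "$EDG_WL_LOCATION" ]; then
#       EDG_WL_LOCATION=$(prefix_of "$(command -v edg-job-submit)")
#       export EDG_WL_LOCATION
#   fi
```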

"ImportError: No module named edg_wl_userinterface_common_AdWrapper"

This error, and others related to this one, happen because the path for searching Python modules is not correctly set:

% edg-job-submit test.jdl
Traceback (most recent call last):
  File "/opt/edg/bin/edg-job-submit", line 15, in ?
    import UIchecks
  File "/opt/edg/bin/UIchecks.py", line 22, in ?
    from edg_wl_userinterface_common_AdWrapper import AdWrapper
ImportError: No module named edg_wl_userinterface_common_AdWrapper

To solve this, we have to add the following directories to the PYTHONPATH environment variable:

In bash:

% export PYTHONPATH=/opt/edg/lib/python:/opt/edg/lib/

in tcsh:

% setenv PYTHONPATH /opt/edg/lib/python:/opt/edg/lib/

We can avoid typing the above every time we log in to the UI by placing the lines in the corresponding login script (~/.bashrc or ~/.tcshrc).
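
Note that the export line shown above replaces any PYTHONPATH already defined. For ~/.bashrc, a variant that keeps existing entries and avoids duplicates could look like the sketch below. It is sh/bash syntax only (not tcsh), and prepend_pythonpath is a hypothetical helper name.

```sh
# Sketch: put a directory at the front of PYTHONPATH unless it is
# already listed, keeping whatever was there before.
prepend_pythonpath() {
    case ":${PYTHONPATH:-}:" in
        *":$1:"*) ;;   # already present, nothing to do
        *) PYTHONPATH="$1${PYTHONPATH:+:$PYTHONPATH}" ;;
    esac
    export PYTHONPATH
}

# Same directories and order as in the export line above.
prepend_pythonpath /opt/edg/lib/
prepend_pythonpath /opt/edg/lib/python
```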
