This wiki has been deprecated and will be removed soon.

The new Advanced Computing and e-Science wiki is located at http://grid.ifca.es/wiki.

Please update your bookmarks.

DUS: Managing files on the Grid

This article lacks information/accuracy, and requires contributions from knowledgeable editors. If you are such an editor, please remove this template after making the corrections/additions you consider appropriate.

Storage resources are abstracted by a special node type called a Storage Element (SE), behind which there may be a simple disk, a tape server, etc. When working on the Grid, we interact with the SEs through specialized tools called LCG tools, which let us move our files around with ease.

Setting up the file catalog location

The LCG tools need to know the location of the default file catalog. We can set it by exporting the environment variable LFC_HOST as follows. In bash (note that there must be no spaces around the "="):

% export LFC_HOST=lfc01.lip.pt

in tcsh/csh:

% setenv LFC_HOST lfc01.lip.pt


However, this variable should be set automatically if our account is configured as explained in the second point of DUS: Setting up the User Interface account. For example, for NGI you can execute:

% source /gpfs/csic_projects/grid/etc/env/ngi-env.sh

Information about available Storage Elements

The first thing to find out is how much storage space we have available and how to access it. We can do so by querying the Information Service (IS) with the command:

% lcg-infosites --vo ifusion se
Avail Space(Kb) Used Space(Kb)  Type    SEs
----------------------------------------------------------
105820688       4431472         n.a     se-ieg.bifi.unizar.es
8072312         1934324         n.a     se.i2g.cesga.es
16330000000     3670000000      n.a     dpm.cyf-kr.edu.pl
1131490256      87716           n.a     i2gse01.ifca.es
31400000        18580000        n.a     se1.egee.man.poznan.pl

This information refers to the SEs and the space available on them for users of the VO ifusion. On every SE there is a directory called /grid/<voname> with write permissions for all users belonging to the VO voname.
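To pick an SE automatically, a listing like the one above can be filtered with standard tools. The following is a minimal sketch, assuming the output keeps the column layout shown above (two header lines, then available space, used space, type and SE name). The function name best_se is hypothetical, and the sample data is copied from the listing rather than queried live.

```shell
# best_se: read lcg-infosites-style output on stdin and print the SE
# with the largest available space. Assumes the column layout shown above.
best_se() {
    awk 'NR > 2 && $1 ~ /^[0-9]+$/ {
             if ($1 + 0 > max) { max = $1 + 0; se = $4 }
         }
         END { if (se != "") print se }'
}

# Sample input copied from the listing above; on a real UI one would
# pipe the output of "lcg-infosites --vo ifusion se" into best_se instead.
sample='Avail Space(Kb) Used Space(Kb)  Type    SEs
----------------------------------------------------------
105820688       4431472         n.a     se-ieg.bifi.unizar.es
8072312         1934324         n.a     se.i2g.cesga.es
16330000000     3670000000      n.a     dpm.cyf-kr.edu.pl
1131490256      87716           n.a     i2gse01.ifca.es
31400000        18580000        n.a     se1.egee.man.poznan.pl'

printf '%s\n' "$sample" | best_se
```

With the sample data this prints dpm.cyf-kr.edu.pl. Free space alone does not guarantee write permission on that SE, so the result is only a first candidate.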

Manipulating files in the Grid filesystem

Listing existing files

We can list the contents of the Grid filesystem with the LFC tools:

% lfc-ls /grid
ibrain
ienvmod
ifusion
ihep
iplanck
iusct

The lfc-ls command accepts options much like the Unix ls command does:

% lfc-ls -l /grid/ifusion/
drwxrwxr-x   3 139      114                       0 Dec 16 17:35 isabel
drwxrwxr-x   0 101      114                       0 Dec 19 16:23 mdavid

Creating directories

We can create directories with lfc-mkdir:

% lfc-mkdir /grid/ifusion/username

Copying files into the Grid

There are different ways to address files in the Grid. Perhaps the most user-friendly one is the so-called Logical File Name (LFN). Let us suppose we want to copy a particular file to the SE at the CESGA site. We will use the directory created above to copy the file /home/username/test.dat, which resides on the local disk of the UI we are using, and register it in the Grid filesystem:

% lcg-cr --vo ifusion -d se.i2g.cesga.es -l lfn:/grid/ifusion/username/test.dat file:///home/username/test.dat
guid:b70533cf-201b-4f8d-8fed-a33563268fca

Notice that we have to specify the destination SE (se.i2g.cesga.es in the example above), the VO in whose name we store the file there (ifusion), and of course the source and destination paths. The output of the command (guid: b7...) is the Grid Unique Identifier (GUID) of the file. However, we don't have to carry that name along: from now on the file can be identified by its Logical File Name (LFN).
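Since the command line has several parts that are easy to mistype, it can help to generate it. The following is a minimal sketch of a helper that only prints the lcg-cr command instead of running it. The function name lcg_cr_cmd and its argument order are assumptions made for illustration, not part of the LCG tools.

```shell
# lcg_cr_cmd: print (not run) an lcg-cr command line for a local file.
# Arguments: VO, destination SE, LFN directory, absolute local path.
lcg_cr_cmd() {
    vo=$1; se=$2; lfn_dir=$3; src=$4
    case $src in
        /*) ;;   # absolute path: fine
        *)  echo "error: '$src' is not an absolute path" >&2; return 1 ;;
    esac
    name=${src##*/}   # file name without leading directories
    # "file://" followed by an absolute path gives the three slashes needed
    echo "lcg-cr --vo $vo -d $se -l lfn:${lfn_dir%/}/$name file://$src"
}

lcg_cr_cmd ifusion se.i2g.cesga.es /grid/ifusion/username /home/username/test.dat
```

The call above prints the same command used in the example. A relative source path is rejected, because the file:// URL needs an absolute path.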

Replicating files in the Grid

It is considered good practice to replicate the input and/or output files of your jobs. Replication consists of keeping copies of a given file on different SEs. To do so, execute:

% lcg-rep --vo ifusion -d storm.ifca.es lfn:/grid/ifusion/username/test.dat

Note that, unlike with lcg-cr, here the LFN is not preceded by a -l.
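To replicate a file to several SEs, the lcg-rep call can be repeated once per destination. The following is a minimal sketch that only prints the commands, so it can be checked before anything is run. The function name replicate_cmds is hypothetical, and the second SE in the call is taken from the lcg-infosites listing purely as an example.

```shell
# replicate_cmds: print one lcg-rep command per destination SE.
# Arguments: VO, LFN path, then one or more SE host names.
replicate_cmds() {
    vo=$1; lfn=$2; shift 2
    for se in "$@"; do
        echo "lcg-rep --vo $vo -d $se lfn:$lfn"
    done
}

replicate_cmds ifusion /grid/ifusion/username/test.dat storm.ifca.es dpm.cyf-kr.edu.pl
```

Once the printed lines look right, they can be executed by piping the output to sh.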

Copying files from the Grid

If we want to retrieve a file by its LFN, we can run the following command:

% lcg-cp -v lfn:LFN file://local_absolute_path

Don't forget that local_absolute_path must begin with a "/". It must also be the full destination file name, not just the destination directory.
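Both conditions can be checked before calling lcg-cp. The following is a minimal sketch, assuming a hypothetical helper named check_dest. It performs only local checks and does not contact the Grid.

```shell
# check_dest: succeed only if DEST is an absolute path that names a file,
# not an existing directory. Prints the matching file:// URL on success.
check_dest() {
    dest=$1
    case $dest in
        /*) ;;
        *)  echo "error: destination must start with /" >&2; return 1 ;;
    esac
    if [ -d "$dest" ]; then
        echo "error: '$dest' is a directory, give a file name" >&2
        return 1
    fi
    echo "file://$dest"
}

check_dest /home/username/test.dat
```

The printed URL can be used directly as the last argument of lcg-cp.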

Deleting files in the Grid

Other common operations on files are done through the lcg-* commands. For example, deleting a file and all its replicas is achieved via:

% lcg-del -a --vo ifusion lfn:/grid/ifusion/username/borrar

Deleting only the replica existing in a particular SE is done via:

% lcg-del --vo ifusion -s dpm.cyf-kr.edu.pl lfn:/grid/ifusion/username/borrar

Other

Check the lcg-* commands and the corresponding section in the LCG user guide for more options.

Some hints for application developers and Input/Output can be found at this link.

Retrieving output files of a finished job

Whenever a job run in the Grid finishes, we will want to retrieve its output files. After submission, the system returns an httpsID that identifies the job (see the job submission instructions for more information). We must use it to retrieve the files corresponding to the job with the following command:

% i2g-job-get-output httpsID

Troubleshooting

lcg-cr

Permission denied error

Sometimes an lcg-cr command will be aborted with the following error:

% lcg-cr -d SE --vo VO -l LFN file:///FILE_PATH
the server sent an error response: 550 550 /flatfiles/VO/generated/DATE: Permission denied.

The reason is not known for certain, but choosing another SE works around the problem (perhaps the user has write permission on some SEs and not on others).

"lcg_cr: Invalid argument"

When using the lcg-cr command, you may encounter this error message. It usually happens when one of the paths is written incorrectly. For example:

% lcg-cr --vo icompchem -d i2gse01.ifca.es -l lfn:/grid/icompchem/username/test file://home/username/test
lcg_cr: Invalid argument

The command looks correct, but notice the file:// prefix: it should have three slashes (file:///), because the file:// scheme must be followed by an absolute path beginning with "/".
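This mistake is easy to catch with a pattern match before the command is run. The following is a minimal sketch, where the function name is_valid_file_url is hypothetical.

```shell
# is_valid_file_url: succeed only for URLs of the form file:///absolute/path
is_valid_file_url() {
    case $1 in
        file:///*) return 0 ;;
        *)         return 1 ;;
    esac
}

for url in file://home/username/test file:///home/username/test; do
    if is_valid_file_url "$url"; then
        echo "ok:  $url"
    else
        echo "bad: $url"
    fi
done
```

The first URL, taken from the failing command above, is reported as bad. The second one is accepted.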

"the server sent an error response"

When using the lcg-cr command, you may receive this message for various reasons, usually because of permission problems. For example:

% lcg-cr --vo icompchem -d i2gse01.ifca.es -l lfn:/grid/icompchem/username/test file:///home/username/test
the server sent an error response: 550 550 /storage/icompchem/generated/2008-12-09: Permission denied.

This means that you (or your VO) do not have permission to write there (either in that directory, or on the SE at all). As far as I know, the only option is to look for another suitable SE with the aforementioned lcg-infosites command, and then try another SE from the list:

% lcg-cr --vo icompchem -d se05.lip.pt -l lfn:/grid/icompchem/username/test file:///home/username/test
guid:0421382c-0c18-4880-a7d6-2728f0a79da8
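Trying the SEs one after another can be scripted. The following is a minimal sketch of such a loop. The function names try_ses and fake_upload are hypothetical, and fake_upload is only a stand-in for the real lcg-cr call so that the example runs without Grid access.

```shell
# try_ses: run UPLOAD_CMD once per candidate SE until one succeeds.
# UPLOAD_CMD is any command or function that takes the SE name as its
# only argument and returns 0 on success.
try_ses() {
    upload_cmd=$1; shift
    for se in "$@"; do
        if "$upload_cmd" "$se"; then
            echo "stored on $se"
            return 0
        fi
        echo "failed on $se, trying next" >&2
    done
    return 1
}

# Stand-in for the real upload: it pretends that only se05.lip.pt
# accepts the file. On a real UI this function would wrap lcg-cr,
# passing "$1" as the argument of -d.
fake_upload() { [ "$1" = "se05.lip.pt" ]; }

try_ses fake_upload i2gse01.ifca.es se05.lip.pt
```

The loop stops at the first SE that accepts the file and returns a non-zero status if none does, so a calling script can detect the failure.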

edg-job-get-output

EDG_WL_LOCATION error

This section lacks information/accuracy. Please fix it and remove this template.

When retrieving output files of a finished job, we might get the following error:

% edg-job-get-output httpsID
Error: Please set the EDG_WL_LOCATION environment variable pointing to the userinterface installation path

However, this is unlikely, because the error should already have appeared when we submitted the job, and we should have fixed it then. According to the error message, EDG_WL_LOCATION should point to the User Interface installation path. The editor of these lines does not know the exact meaning of the variable, only that it must be set to something. In bash:

% export EDG_WL_LOCATION=/tmp

in tcsh:

% setenv EDG_WL_LOCATION /tmp

"Unable to create the directory"

We might get the following error:

% edg-job-get-output httpsID

**** Error: UI_CREATE_DIR ****
Unable to create the directory "/tmp/jobOutput/username_httpsID"

This means that the default output directory (/tmp/jobOutput/) is being used. You will seldom have write permission in that directory, so using a different output directory is recommended, e.g.:

% mkdir ~/finished_jobs
% edg-job-get-output --dir /home/username/finished_jobs/ httpsID
Retrieving files from host: i2grb01.ifca.es ( for httpsID )

*********************************************************************************
                        JOB GET OUTPUT OUTCOME                                   

 Output sandbox files for the job:
 - httpsID
 have been successfully retrieved and stored in the directory:
 /home/username/finished_jobs/username_httpsID                 

*********************************************************************************
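The choice of output directory can also be made by a script. The following is a minimal sketch that picks the first directory that exists (or can be created) and is writable. The function name pick_output_dir is hypothetical, and the final line only prints the command that would be run.

```shell
# pick_output_dir: print the first directory from the arguments that
# exists (or can be created) and is writable by the current user.
pick_output_dir() {
    for dir in "$@"; do
        if mkdir -p "$dir" 2>/dev/null && [ -w "$dir" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

# Prefer a directory under $HOME, fall back to the default location,
# and use the current directory as a last resort.
outdir=$(pick_output_dir "$HOME/finished_jobs" /tmp/jobOutput) || outdir=.
echo "would run: edg-job-get-output --dir $outdir httpsID"
```

Note that pick_output_dir creates the directory as a side effect, which mirrors the mkdir step shown above.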

lfc-*

"send2nsd: NS009 - fatal configuration error: Host unknown: UNUSED"

This error means that the LFC_HOST variable is not properly set. To fix it, give it a proper value, e.g.:

% export LFC_HOST=lfc01.lip.pt

This line should be in your .bashrc file, so that the variable is set in every new shell.
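If a site environment script may already set the variable, a guarded assignment avoids overriding it. The following is a minimal sketch for bash (it does not apply to tcsh). The host name is the one used in the example above and serves only as a fallback.

```shell
# Set LFC_HOST only if it is unset or empty, so that a value provided
# by a site environment script is not overridden.
: "${LFC_HOST:=lfc01.lip.pt}"
export LFC_HOST
echo "LFC_HOST=$LFC_HOST"
```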

"/grid/: Could not secure the connection"

This error means that some environment variables are not correctly set. To fix it, load the correct environment as explained in the second point of DUS: Setting up the User Interface account. For example, for NGI:

% source /gpfs/csic_projects/grid/etc/env/ngi-env.sh
