DUS: Managing files on the Grid
Storage resources are abstracted by a special node type called a Storage Element (SE), behind which there may be a simple disk, a tape server, etc. When working in the Grid, we interact with the SEs through specialized tools called LCG tools, which let us move our files around with ease.
Setting up the file catalog location
Using the LCG tools requires a default file catalog location. We can set it by exporting the environment variable LFC_HOST as follows (note that bash does not allow spaces around the = sign). In bash:
% export LFC_HOST=lfc01.lip.pt
in tcsh/csh:
% setenv LFC_HOST lfc01.lip.pt
However, this variable should be set automatically if our account is configured as explained in the second point of DUS: Setting up the User Interface account. For example, for NGI you can execute:
% source /gpfs/csic_projects/grid/etc/env/ngi-env.sh
Information about available Storage Elements
The first thing to find out is how much space we have available to work with, and how to access it. We can do so by querying the Information Service (IS) with the command:
% lcg-infosites --vo ifusion se
Avail Space(Kb)  Used Space(Kb)  Type  SEs
----------------------------------------------------------
105820688        4431472         n.a   se-ieg.bifi.unizar.es
8072312          1934324         n.a   se.i2g.cesga.es
16330000000      3670000000      n.a   dpm.cyf-kr.edu.pl
1131490256       87716           n.a   i2gse01.ifca.es
31400000         18580000        n.a   se1.egee.man.poznan.pl
This information refers to the SEs, and the space on them, available to the users of the VO ifusion. On every SE there is a directory called /grid/<voname> with write permissions for all users belonging to the VO voname.
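As a rough sketch, the listing can also be filtered automatically to pick the SE with the most free space. The function name largest_se is made up for this example, and the parsing assumes the four-column layout shown above (available space, used space, type, SE name); the real output of lcg-infosites may differ between versions. The sample input is copied from the listing above.

```shell
# Print the name of the SE with the largest "Avail Space" value.
# Assumption: data lines start with a number and end with the SE name;
# header and separator lines are skipped because their first field
# is not a plain integer.
largest_se() {
    awk '$1 ~ /^[0-9]+$/ && ($1 + 0) > max { max = $1 + 0; se = $NF }
         END { if (se != "") print se }'
}

# Sample input taken from the listing above. On a UI one would pipe the
# real command instead:  lcg-infosites --vo ifusion se | largest_se
largest_se <<'EOF'
Avail Space(Kb) Used Space(Kb) Type SEs
----------------------------------------------------------
105820688 4431472 n.a se-ieg.bifi.unizar.es
8072312 1934324 n.a se.i2g.cesga.es
16330000000 3670000000 n.a dpm.cyf-kr.edu.pl
1131490256 87716 n.a i2gse01.ifca.es
31400000 18580000 n.a se1.egee.man.poznan.pl
EOF
```

With the sample data this prints dpm.cyf-kr.edu.pl. Free space alone does not guarantee that we are allowed to write on that SE (see the Troubleshooting section).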
Manipulating files in the Grid filesystem
Listing existing files
We can list the contents of the Grid filesystem with the LFC tools:
% lfc-ls /grid
ibrain
ienvmod
ifusion
ihep
iplanck
iusct
lfc-ls accepts options much like the Unix ls command does:
% lfc-ls -l /grid/ifusion/
drwxrwxr-x 3 139 114 0 Dec 16 17:35 isabel
drwxrwxr-x 0 101 114 0 Dec 19 16:23 mdavid
Creating directories
We can create directories with lfc-mkdir:
% lfc-mkdir /grid/ifusion/username
Copying files into the Grid
There are different ways to address files in the Grid. Perhaps the most user-friendly one is the so-called Logical File Name (LFN). Let us suppose we want to copy a particular file to the SE at the CESGA site. We will use the directory we created above to copy the file /home/username/test.dat, which resides on the local disk of the UI we are using, and register it in the Grid filesystem:
% lcg-cr --vo ifusion -d se.i2g.cesga.es -l lfn:/grid/ifusion/username/test.dat file:///home/username/test.dat
guid:b70533cf-201b-4f8d-8fed-a33563268fca
Notice that we have to specify the destination SE (se.i2g.cesga.es in the example above), the VO in whose name we store the file there (ifusion), and of course the source and destination paths. The output of the command (guid:b7...) is the Grid Unique Identifier (GUID) of the file. However, we don't have to carry that name along: from now on the file can be identified by its Logical File Name (lfn).
Replicating files in the Grid
It is considered good practice to replicate the input and/or output files of your jobs. Replicating consists of keeping copies of a given file on different SEs. To do so, execute:
% lcg-rep --vo ifusion -d storm.ifca.es lfn:/grid/ifusion/username/test.dat
Note that, unlike with lcg-cr, here the LFN is not preceded by a -l.
Copying files from the Grid
If we want to retrieve a file by its LFN, we can run the following command:
% lcg-cp -v lfn:LFN file://local_absolute_path
Don't forget that local_absolute_path must begin with a "/". It must also be the full name of the destination file, not just the destination directory.
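Both rules can be checked before calling any lcg-* command. The following is a minimal sketch; the helper name to_file_url is made up for this example and is not part of the LCG tools.

```shell
# Turn an absolute local file path into a file:// URL.
# Rejects relative paths and paths that end in "/" (directories),
# following the two rules stated above.
to_file_url() {
    case "$1" in
        /|/*/)
            echo "to_file_url: need a file name, not a directory: $1" >&2
            return 1 ;;
        /*)
            # "file://" plus a path starting with "/" gives three slashes
            printf 'file://%s\n' "$1" ;;
        *)
            echo "to_file_url: path must be absolute: $1" >&2
            return 1 ;;
    esac
}
```

It could then be used as follows:
% lcg-cp -v lfn:LFN "$(to_file_url /home/username/test.dat)"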
Deleting files in the Grid
Other common operations on files are done through the lcg-* commands. For example, deleting a file and all its replicas is achieved via:
% lcg-del -a --vo ifusion lfn:/grid/ifusion/username/borrar
Deleting only the replica existing in a particular SE is done via:
% lcg-del --vo ifusion -s dpm.cyf-kr.edu.pl lfn:/grid/ifusion/username/borrar
Other
Check the lcg-* commands and the corresponding section in the LCG user guide for more options.
Some hints for application developers on Input/Output can be found at this link.
Retrieving output files of a finished job
Whenever a job run in the Grid finishes, we will want to retrieve its output files. After submission, the system returns an httpsID that identifies the job (see the job submission instructions for more information). We use it to retrieve the files corresponding to the job with the following command:
% i2g-job-get-output httpsID
Troubleshooting
lcg-cr
Permission denied error
Sometimes an lcg-cr command aborts with the following error:
% lcg-cr -d SE --vo VO -l LFN file:///FILE_PATH
the server sent an error response: 550 550 /flatfiles/VO/generated/DATE: Permission denied.
I don't know the reason for this, but choosing another SE fixes the problem (perhaps the user has write permission on some SEs and not on others).
"lcg_cr: Invalid argument"
You may encounter this error message when using the lcg-cr command. It usually happens when one of the paths is written incorrectly. For example:
% lcg-cr --vo icompchem -d i2gse01.ifca.es -l lfn:/grid/icompchem/username/test file://home/username/test
lcg_cr: Invalid argument
Although the command looks correct, notice the file://: it should have three slashes (file:///), that is, the file:// prefix followed by an absolute path beginning with "/".
"the server sent an error response"
You may receive this message from the lcg-cr command for various reasons, usually permission problems. For example:
% lcg-cr --vo icompchem -d i2gse01.ifca.es -l lfn:/grid/icompchem/username/test file:///home/username/test
the server sent an error response: 550 550 /storage/icompchem/generated/2008-12-09: Permission denied.
This means that you (or your VO) do not have permission to write there (either in that directory, or on the SE at all). As far as I know, the only option is to look for another suitable SE with the aforementioned lcg-infosites command, and then try another SE from the list:
% lcg-cr --vo icompchem -d se05.lip.pt -l lfn:/grid/icompchem/username/test file:///home/username/test
guid:0421382c-0c18-4880-a7d6-2728f0a79da8
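Trying one SE after another can be scripted. The sketch below is only an illustration: the function name upload_first_ok and the LCG_CR variable are invented here, and the only lcg-cr options used are the ones shown above (--vo, -d, -l). It assumes that lcg-cr returns a non-zero exit status on failure.

```shell
# Command used for the upload. Override it (for example LCG_CR=echo)
# to rehearse the loop without touching the Grid.
LCG_CR=${LCG_CR:-lcg-cr}

# usage: upload_first_ok VO LFN FILE_URL SE [SE ...]
# Tries each SE in the given order and stops at the first success.
upload_first_ok() {
    vo=$1
    lfn=$2
    url=$3
    shift 3
    for se in "$@"; do
        if "$LCG_CR" --vo "$vo" -d "$se" -l "$lfn" "$url"; then
            echo "stored on $se" >&2
            return 0
        fi
        echo "upload to $se failed, trying next SE" >&2
    done
    return 1   # no SE accepted the file
}
```

For example:
% upload_first_ok icompchem lfn:/grid/icompchem/username/test file:///home/username/test i2gse01.ifca.es se05.lip.pt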
edg-job-get-output
EDG_WL_LOCATION error
When retrieving output files of a finished job, we might get the following error:
% edg-job-get-output httpsID
Error: Please set the EDG_WL_LOCATION environment variable pointing to the userinterface installation path
However, this is unlikely, because the same error should have appeared when we submitted the job, and we should have fixed it then. According to the error message, the variable should point to the User Interface installation path; the editor of these lines does not know that path, but setting the variable to something is enough to get past the error. In bash:
% export EDG_WL_LOCATION=/tmp
in tcsh:
% setenv EDG_WL_LOCATION /tmp
"Unable to create the directory"
We might get the following error:
% edg-job-get-output httpsID
**** Error: UI_CREATE_DIR ****
Unable to create the directory "/tmp/jobOutput/username_httpsID"
This means that the default output directory (/tmp/jobOutput/) is being used. You will seldom have write permission in that directory, so using a different output directory is recommended, e.g.:
% mkdir ~/finished_jobs
% edg-job-get-output --dir /home/username/finished_jobs/ httpsID
Retrieving files from host: i2grb01.ifca.es ( for httpsID )
*********************************************************************************
JOB GET OUTPUT OUTCOME
Output sandbox files for the job:
- httpsID
have been successfully retrieved and stored in the directory:
/home/username/finished_jobs/username_httpsID
*********************************************************************************
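The two steps above (create a writable directory, then retrieve into it) can be wrapped in a small function. This is a sketch under assumptions: the names fetch_job_output and GET_OUTPUT are invented for the example, and the only option used is the --dir option shown above.

```shell
# Command used for retrieval. Override it (for example GET_OUTPUT=echo)
# to test the function without a Grid installation.
GET_OUTPUT=${GET_OUTPUT:-edg-job-get-output}

# usage: fetch_job_output JOB_ID [OUTPUT_DIR]
# OUTPUT_DIR defaults to ~/finished_jobs and is created if missing.
fetch_job_output() {
    job_id=$1
    out_dir=${2:-$HOME/finished_jobs}
    mkdir -p "$out_dir" || return 1
    if [ ! -w "$out_dir" ]; then
        echo "fetch_job_output: cannot write to $out_dir" >&2
        return 1
    fi
    "$GET_OUTPUT" --dir "$out_dir" "$job_id"
}
```

For example:
% fetch_job_output httpsID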
lfc-*
"send2nsd: NS009 - fatal configuration error: Host unknown: UNUSED"
This error means that the LFC_HOST variable is not properly set. To fix it, give it a proper value, e.g.:
% export LFC_HOST=lfc01.lip.pt
This line should be in your .bashrc file.
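A slightly more defensive variant for .bashrc, shown here only as a sketch, keeps any value of LFC_HOST that is already set (for instance by a site environment script) and falls back to the host used above otherwise:

```shell
# Keep LFC_HOST if it is already set and non-empty;
# otherwise assign the default host used in this page.
: "${LFC_HOST:=lfc01.lip.pt}"
export LFC_HOST
```

The ":=" parameter expansion is standard in POSIX shells, so the same two lines also work in sh-compatible login scripts; tcsh/csh users still need the setenv form.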
"/grid/: Could not secure the connection"
This error means that some environment variables are not set correctly. To fix it, load the correct environment as explained in the second point of DUS: Setting up the User Interface account. For example, for NGI:
% source /gpfs/csic_projects/grid/etc/env/ngi-env.sh
