I2G-EGEE Integration
This section describes the EGEE and int.eu.grid integration procedures at IFCA.
This document is based on the one written by Gonçalo Borges and Jorge Gomes; read the original for further explanations.
It is not a replacement for the original document, just a recipe for the IFCA people.
Ask before doing anything whose effect you are not sure about.
EGEE UI with I2G job submission enabled
To use the EGEE VOs in the int.eu.grid testbed and profit from the I2G middleware benefits (for example, parallelization), one workaround is to enable an EGEE UI to submit jobs to the I2G testbed through the I2G CrossBroker. In this example we enable the "swetest" VO on a working EGEE UI.
On the EGEE UI, add the I2G repository by putting the following line in /etc/apt/sources.list.d/i2g.list:
rpm http://savannah.fzk.de repository/i2g/production i386 noarch
Then install the I2G middleware:
# apt-get update
# apt-get install i2g-profile i2g-vomscerts i2g-yaim-sysconfig i2g-wl-services-common i2g-wl-logging-api-c i2g-wl-logging-api-cpp i2g-wl-logging-api-sh i2g-wl-bypass i2g-wl-chkpt-api i2g-wl-common-api i2g-wl-common-api-java i2g-wl-common-api-java-interface i2g-wl-ui-api-cpp i2g-wl-ui-api-java i2g-wl-ui-api-java-interface i2g-wl-ui-cli i2g-wl-ui-config i2g-wl-ui-gui i2g-wl-config i2g-yaim-workload_manager_client
Configure it:
/opt/glite/yaim/bin/yaim -r -s siteinfo.def -f config_i2g_sysconfig
/opt/glite/yaim/bin/yaim -r -s siteinfo.def -f config_i2g_workload_manager_client
Apply the following patches (save each one to a file and apply it with patch -p0 < somefile) so that logging information is sent to the CrossBroker:
--- /opt/i2g/bin/i2g-job-submit 2007-04-27 17:02:28.000000000 +0200
+++ /opt/i2g/bin/i2g-job-submit.new 2007-08-29 11:45:49.000000000 +0200
@@ -390,8 +390,13 @@
loggingHost, loggingPort = os.environ ["I2G_WL_LOG_DESTINATION"].split(":")
loggingPort=int(loggingPort)
except:
- try:
- logDest = UIutils.info.confAd.getStringValue ("LoggingDestination")[0]
+ try:
+ # Patch aloga<at>ifca.unican.es
+ logDest = UIutils.getLogging()
+ if not logDest:
+ logDest = UIutils.info.confAd.getStringValue ("LoggingDestination")[0]
+ # logDest = UIutils.info.confAd.getStringValue ("LoggingDestination")[0]
+ # Patch ends here
loggingHost, loggingPort = logDest.split(":")
loggingPort=int(loggingPort)
if logDest:
--- /opt/i2g/bin/UIutils.py 2007-04-27 17:02:28.000000000 +0200
+++ /opt/i2g/bin/UIutils.py.new 2007-08-29 11:48:35.000000000 +0200
@@ -419,6 +419,21 @@
#NS / LB METHODS:
+# Patch aloga<at>ifca.unican.es
+def getLogging(*nsNum):
+    """Return LoggingDestination from the VO config file, if not exists returns None"""
+    try:
+        nsNum = nsNum[0]
+    except:
+        nsNum = -1
+    try:
+        log = info.confAdVo.getStringList("LoggingDestination")[0]
+    except:
+        return None
+    return log
+#Patch ends here
+
+
"""
Retreve LB list from configuration file
"""
Edit the file /opt/i2g/etc/swetest/i2g_wl_ui.conf (create it if necessary) to configure job submission to use the CrossBroker (i2g-rb01.lip.pt):
[
  VirtualOrganisation = "swetest";
  NSAddresses = "i2g-rb02.lip.pt:7772";
  LBAddresses = "i2g-rb02.lip.pt:9000";
  MyProxyServer = "px01.ifca.es";
  LoggingDestination = "i2g-rb02.lip.pt:9002";
]
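Since a typo in this file only shows up at submission time, a quick sanity check can help. The following is a minimal sketch, not part of the I2G middleware: the helper names `parse_ui_conf` and `bad_endpoints` are hypothetical, and the regular expression only understands simple `Name = "value";` attributes, which is all the file above uses (real ClassAd syntax is richer).

```python
import re

# Matches simple `Name = "value";` attributes only (an assumption of this sketch).
_ATTR = re.compile(r'(\w+)\s*=\s*"([^"]*)"\s*;')


def parse_ui_conf(text):
    """Return a dict with the string attributes found in the file body."""
    return dict(_ATTR.findall(text))


def bad_endpoints(conf, keys=("NSAddresses", "LBAddresses", "LoggingDestination")):
    """Return the keys whose value is missing or not of the form host:port."""
    bad = []
    for key in keys:
        host, sep, port = conf.get(key, "").partition(":")
        if not (host and sep and port.isdigit()):
            bad.append(key)
    return bad


# The swetest configuration shown above.
EXAMPLE = '''
[
  VirtualOrganisation = "swetest";
  NSAddresses = "i2g-rb02.lip.pt:7772";
  LBAddresses = "i2g-rb02.lip.pt:9000";
  MyProxyServer = "px01.ifca.es";
  LoggingDestination = "i2g-rb02.lip.pt:9002";
]
'''

if __name__ == "__main__":
    conf = parse_ui_conf(EXAMPLE)
    print(conf["VirtualOrganisation"], bad_endpoints(conf))
```

To check the real file, pass the contents of /opt/i2g/etc/swetest/i2g_wl_ui.conf to `parse_ui_conf` instead of `EXAMPLE`.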
Now a user should be able to launch jobs and submit them to the I2G testbed, by using the "i2g" commands:
i2g-job-attach
i2g-job-cancel
i2g-job-get-chkpt
i2g-job-get-logging-info
i2g-job-get-output
i2g-job-list-match
i2g-job-status
i2g-job-submit
i2g-wl-grid-console-shadow
i2g-wl-logev
i2g-wl-ui-jdleditor.csh
i2g-wl-ui-jdleditor.sh
i2g-wl-ui-jobmonitor.csh
i2g-wl-ui-jobmonitor.sh
i2g-wl-ui-jobsubmitter.csh
i2g-wl-ui-jobsubmitter.sh
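With the patches applied, i2g-job-submit looks for the logging destination in three places: the I2G_WL_LOG_DESTINATION environment variable, then the per-VO LoggingDestination, then the global LoggingDestination. The sketch below only illustrates that order and is not the middleware code. The function name `resolve_logging_destination` and its plain-dict arguments are hypothetical stand-ins for the environment and the two configuration files, and it does not reproduce the original fallback for a malformed environment variable.

```python
def resolve_logging_destination(environ, vo_conf, global_conf):
    """Return (host, port), or None when no destination is configured.

    All three arguments are plain dicts (hypothetical stand-ins):
      environ     -> the process environment
      vo_conf     -> values from /opt/i2g/etc/<vo>/i2g_wl_ui.conf
      global_conf -> values from the global UI configuration
    """
    # 1. The environment variable wins when it is set.
    dest = environ.get("I2G_WL_LOG_DESTINATION")
    # 2. Otherwise use the per-VO value (what the getLogging() patch adds).
    if not dest:
        dest = vo_conf.get("LoggingDestination")
    # 3. Finally fall back to the global value.
    if not dest:
        dest = global_conf.get("LoggingDestination")
    if not dest:
        return None
    host, port = dest.split(":")
    return host, int(port)


if __name__ == "__main__":
    vo_conf = {"LoggingDestination": "i2g-rb02.lip.pt:9002"}
    print(resolve_logging_destination({}, vo_conf, {}))
```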
WNs Integration
Schema
A rough diagram of the deployed setup:
                                                   _
 --------------------      --------------------     |
|                    |    |                    |    |
|       I2G CE       |    |      EGEE CE       |    |- Other Grid components are also dedicated
|                    |    |                    |    |  (SE, UI, etc)
 --------------------      --------------------    _|
           |                         |
           ---------------------------
                        |
                        |                          _
              --------------------                  |
             |                    |                 |
             |    MAUI/TORQUE     |                 |- A unique Maui + Torque server.
             |                    |                 |
              --------------------                 _|
                        |
                        |
           ---------------------------
           |                         |             _
 --------------------      --------------------     |
|                    |    |                    |    |
|        WN 1        |    |        WN 2        |    |
|                    |    |                    |    |
 --------------------      --------------------     |
                                                    |
         ...                       ...              |- A unique pool of WNs enabled to run both I2G
                                                    |  and EGEE jobs.
 --------------------      --------------------     |
|                    |    |                    |    |
|       WN "m"       |    |       WN "n"       |    |
|                    |    |                    |    |
 --------------------      --------------------    _|
As shown, we are going to deploy a separate torque server in order to centralize the management of the queues. Both CEs communicate with it, so the WNs make no distinction between projects at all.
I2G repository
Where necessary (i.e. when installing I2G middleware), add the I2G repository in /etc/apt/sources.list.d/i2g.list:
rpm http://savannah.fzk.de repository/i2g/production i386 noarch
The integration takes place in two steps:
- Make the base installation of the EGEE Middleware (Instructions)
- Follow the particular instructions for each type of node listed below.
Please read these instructions carefully before installing the middleware, because in some places some "standard" variables must be redefined before the installation.
Prepare your new site-info.def
We are going to need three site-info.def files: an I2G one, an EGEE one and a mixed one.
I2G and EGEE
- Redefine the following variable:
TORQUE_SERVER="your.torque.server.$MY_DOMAIN"
Mixed one
This is the EGEE site-info.def file with the added I2G support:
- Add both CEs (as CE CE2) and add them to the BDII regions.
- Create or edit the users.conf file and add the users of both the I2G and EGEE VOs to it.
- Do the same for the groups in the groups.conf file.
- Add the supported VOs to the site-info.def file (VOS="...", QUEUES="...", VO_GROUP_ENABLE="...").
- Create the vo.d/imain file containing:
VOMS_SERVERS="'vomss://i2g-voms.lip.pt:8443/voms/imain?/imain/' 'vomss://i2gvoms01.ifca.es:8443/voms/imain?/imain/'"
SW_DIR=$VO_SW_DIR/imainsoft
DEFAULT_SE=$CLASSIC_HOST
STORAGE_DIR=$CLASSIC_STORAGE_DIR/imain
QUEUES="imaingridsdj"
VOMSES="'imain i2g-voms.lip.pt 20001 /C=PT/O=LIPCA/O=LIP/OU=Lisboa/CN=i2g-voms.lip.pt imain' 'imain i2gvoms01.ifca.es 20001 /C=ES/O=DATAGRID-ES/O=IFCA/CN=host/i2gvoms01.ifca.es imain'"
Worker Nodes Installation
Install them and configure them with the "mixed" site-info.def.
Maui + Torque installation
Install the node, and configure it with the "mixed" site-info.def.
/opt/glite/yaim/bin/yaim -i -s siteinfo/site-info_glite30_TORQUE_GRID_070726_164600.def -m glite-torque-server-config
/opt/glite/yaim/bin/yaim -c -s siteinfo/site-info_glite30_TORQUE_GRID_070726_164600.def -n TORQUE_server
Edit /var/spool/maui/maui.cfg and add both CEs to the ADMINHOSTS line, like this:
(...)
ADMINHOSTS torque.ifca.es i2gce01.ifca.es egeece01.ifca.es
(...)
Edit or create the files /var/spool/pbs/server_priv/acl_svr/operators and /var/spool/pbs/server_priv/acl_svr/managers, and add the hosts that should be able to administer the PBS server (i.e. the torque server and both CEs), one per line.
Restart the service:
/etc/init.d/pbs_server restart
EGEE CE installation
Install and configure it with the EGEE site-info.def, but using the lcg-CE metapackage and the CE target.
/opt/glite/yaim/bin/yaim -i -s siteinfo/EGEE/site-info.def -m lcg-CE -m glite-torque-client-config
/opt/glite/yaim/bin/yaim -c -s siteinfo/EGEE/site-info.def -n CE
I2G CE installation
Install and configure it with the I2G site-info.def.
/opt/glite/yaim/bin/yaim -i -s siteinfo/I2G/site-info.def -m i2g-CE-lcg -m glite-torque-client-config
/opt/glite/yaim/bin/yaim -c -s siteinfo/I2G/site-info.def -n CE
Jobmanager Patch
As explained in the document written by Gonçalo Borges and Jorge Gomes, the following patch must be applied to /opt/globus/lib/perl/Globus/GRAM/JobManager/lcgpbs.pm (as we are using lcgpbs):
. "export " . $tuple->[0] . "\n";
}
- $pbs_job_script->print("#PBS -v " . join(',', @new_env) . "\n");
+
+ ###
+ # [GBorges]
+ # Patch to guaranty interoperability between I2G WNs and EGEE WNs
+
+ # Read the VO from the user proxy
+ my $vo_user_line=`/opt/edg/bin/voms-proxy-info --file $local_x509 --all | grep VO`;
+ my ($vo_tag,$vo_user)=split(/:/,$vo_user_line);
+ $vo_user=~s/\s+//g;
+
+ # Read the vo_environment configuration file
+ my ( %VOCONFIG );
+ $VOCONFIG{'vo_environment'} = '/opt/globus/lib/perl/Globus/GRAM/JobManager/vo_environment';
+ my $voenv = new IO::File( $VOCONFIG{'vo_environment'}, "r" )
+ or die "Unable to open VO environment mapping configuration file.";
+
+ # For every line in the environment configuration map
+ while ( defined( $_ = $voenv->getline ) ) {
+ next if /^\s*\#/; # Skip if comment
+ next if /^\s*$/; # Skip if whitespace
+ # Several chars (\S+), white space (\s+)
+ if (/^(\S+)\s+((\S+)\s+)+/) {
+ # If it's a constraint entry, extract the values.
+ my $vo_name = $1;
+ s/^($vo_name)\s+//g;
+ my @vo_varenvs = split(/\s/);
+ foreach my $vo_varenv (@vo_varenvs) {
+ # Add @new_env with the VO specific env variables
+ push(@new_env,$vo_varenv) if $vo_name eq $vo_user;
+ }
+ } else {
+ warn "Unrecognised entry in ".$VOCONFIG{'vo_environment'}.": '$_'\n";
+ }
+ }
+
+ $voenv->close;
+
+ foreach my $env_line (@new_env) {
+ $pbs_job_script->print("export " . $env_line . "\n");
+ }
+
+# $pbs_job_script->print("#PBS -v " . join(',', @new_env) . "\n");
+
+# Patch ends here
if (defined $description->library_path() && $description->library_path() ne '')
{
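The patch above reads a vo_environment file and, for the VO found in the user proxy, exports every token that follows the VO name. The source does not show the file itself, so the layout assumed here (one line per VO: the VO name followed by whitespace-separated NAME=value tokens) is inferred from the Perl code. The Python sketch below mirrors the matching logic only to make that layout explicit. The function name `vo_env_for` and the sample contents are hypothetical.

```python
def vo_env_for(lines, vo_user):
    """Collect the environment assignments listed for `vo_user`.

    Each meaningful line is assumed to look like:
        <vo_name> <NAME=value> [<NAME=value> ...]
    """
    extra_env = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments, as the patch does
        fields = stripped.split()
        if len(fields) < 2:
            continue  # the Perl patch warns about such lines; here they are ignored
        vo_name, assignments = fields[0], fields[1:]
        if vo_name == vo_user:
            extra_env.extend(assignments)
    return extra_env


# Hypothetical file contents, for illustration only.
SAMPLE = [
    "# VO      variables",
    "",
    "imain     FOO=/opt/foo BAR=1",
    "swetest   BAZ=2",
    "broken-line-without-variables",
]

if __name__ == "__main__":
    for assignment in vo_env_for(SAMPLE, "imain"):
        print("export " + assignment)
```

Each returned token ends up after an `export` in the generated PBS job script, which is why it has to be a complete NAME=value assignment without embedded spaces.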
MPI configuration
For MPI to work, the Jobmanager must be patched and the batch system configured (in the case of PBS/Torque).
WNs configuration
Passwordless SSH is needed between the WNs. See our Grid Administration Guide.
I2G CE configuration
As we are using lcg-pbs we need to apply the following patch:
--- /opt/globus/lib/perl/Globus/GRAM/JobManager/lcgpbs.pm 2007-08-24 13:25:00.000000000 +0200
+++ lcgpbs.pm 2007-08-24 13:25:46.000000000 +0200
@@ -22,8 +22,8 @@
$qstat = '/usr/bin/qstat';
$qdel = '/usr/bin/qdel';
$qmsg = '/usr/bin/qmsg';
- $cluster = 0;
- $cpu_per_node = 0;
+ $cluster = 1;
+ $cpu_per_node = 1;
$remote_shell = '/usr/bin/ssh';
}
Maui + Torque configuration
The main instructions are on the I2G wiki and on the EGEE GOC wiki.
- Edit /var/spool/pbs/torque.cfg and add a line containing:
SUBMITFILTER /var/spool/pbs/submit_filter.pl
- Download the submit_filter.pl from [1] and put it in the above location.
- Check that the ENABLEMULTIREQJOBS TRUE variable is enabled on your maui configuration file (/var/spool/maui/maui.cfg).
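Both settings can be verified with a few lines of Python. This is a minimal sketch under two assumptions: that `#` starts a comment in both files, and that an exact, case-sensitive match of name and value is good enough. The helper name `has_directive` is hypothetical.

```python
def has_directive(cfg_text, name, value):
    """True if some non-comment line starts with `name` followed by `value`."""
    for line in cfg_text.splitlines():
        # Drop anything after '#' (assumed comment marker), then split on whitespace.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == name and fields[1] == value:
            return True
    return False


if __name__ == "__main__":
    checks = [
        ("/var/spool/pbs/torque.cfg", "SUBMITFILTER", "/var/spool/pbs/submit_filter.pl"),
        ("/var/spool/maui/maui.cfg", "ENABLEMULTIREQJOBS", "TRUE"),
    ]
    for path, name, value in checks:
        try:
            with open(path) as handle:
                ok = has_directive(handle.read(), name, value)
        except IOError:
            ok = False  # file missing or unreadable on this host
        print("%s: %s %s -> %s" % (path, name, value, "ok" if ok else "MISSING"))
```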
WATCH OUT
After making these changes, PBS needs to re-read its configuration. This is done by starting it with the option -t create, which overrides the previous settings, so first make a backup of your nodes file (/var/spool/pbs/server_priv/nodes) and of the queue configuration:
# qmgr -c "print server" > queues.cfg
# cp /var/spool/pbs/server_priv/nodes .
# /etc/init.d/pbs_server stop
# pbs_server -t create
# /etc/init.d/pbs_server stop
# qmgr < queues.cfg
# cp nodes /var/spool/pbs/server_priv/nodes
# /etc/init.d/pbs_server start
