I2G-EGEE Integration

This section describes the EGEE and int.eu.grid (I2G) integration procedures at IFCA.

This document is based on the one written by Gonçalo Borges and Jorge Gomes. Read that document for further explanations.

This is not a replacement for the original document, just a recipe for the IFCA people.

Ask before doing anything whose effects you do not understand.

EGEE UI with I2G job submission enabled

To use the EGEE VOs in the int.eu.grid testbed and benefit from the I2G middleware features (for example, parallelization), one workaround is to enable an EGEE UI to submit jobs to the I2G testbed through the I2G CrossBroker. In this example we enable the "swetest" VO on a working EGEE UI.

On the EGEE UI, add the I2G repository to /etc/apt/sources.list.d/i2g.list:

rpm http://savannah.fzk.de repository/i2g/production i386 noarch

Then install the I2G middleware:

# apt-get update
# apt-get install i2g-profile i2g-vomscerts i2g-yaim-sysconfig i2g-wl-services-common i2g-wl-logging-api-c i2g-wl-logging-api-cpp i2g-wl-logging-api-sh i2g-wl-bypass i2g-wl-chkpt-api i2g-wl-common-api i2g-wl-common-api-java i2g-wl-common-api-java-interface i2g-wl-ui-api-cpp i2g-wl-ui-api-java i2g-wl-ui-api-java-interface i2g-wl-ui-cli i2g-wl-ui-config i2g-wl-ui-gui i2g-wl-config i2g-yaim-workload_manager_client

Configure it:

/opt/glite/yaim/bin/yaim -r -s siteinfo.def -f config_i2g_sysconfig 
/opt/glite/yaim/bin/yaim -r -s siteinfo.def -f config_i2g_workload_manager_client 

Apply the following patches to allow sending logging information to the CrossBroker. Save each one to a file and apply it with patch -p0 < somefile:

--- /opt/i2g/bin/i2g-job-submit	2007-04-27 17:02:28.000000000 +0200
+++ /opt/i2g/bin/i2g-job-submit.new	2007-08-29 11:45:49.000000000 +0200
@@ -390,8 +390,13 @@
 	loggingHost, loggingPort = os.environ ["I2G_WL_LOG_DESTINATION"].split(":")
 	loggingPort=int(loggingPort)
 except:
-	try:
-		logDest = UIutils.info.confAd.getStringValue ("LoggingDestination")[0]
+	try:	
+		# Patch aloga<at>ifca.unican.es
+		logDest = UIutils.getLogging()
+		if not logDest:
+			logDest = UIutils.info.confAd.getStringValue ("LoggingDestination")[0]
+		# logDest = UIutils.info.confAd.getStringValue ("LoggingDestination")[0]
+		# Patch ends here
 		loggingHost, loggingPort = logDest.split(":")
 		loggingPort=int(loggingPort)
 		if logDest:
--- /opt/i2g/bin/UIutils.py	2007-04-27 17:02:28.000000000 +0200
+++ /opt/i2g/bin/UIutils.py.new	2007-08-29 11:48:35.000000000 +0200
@@ -419,6 +419,21 @@
 
 #NS / LB METHODS:
 
+# Patch aloga<at>ifca.unican.es
+def getLogging(*nsNum):
+    """Return LoggingDestination from the VO config file, if not exists returns None"""
+    try:
+        nsNum = nsNum[0]
+    except:
+        nsNum = -1
+    try:
+        log = info.confAdVo.getStringList("LoggingDestination")[0]
+    except:
+        return None
+    return log
+#Patch ends here
+
+
 """
 Retreve LB list from configuration file
 """

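Taken together, the two patches give the logging destination a lookup order: the I2G_WL_LOG_DESTINATION environment variable first, then the VO configuration file, then the global UI configuration. The Python sketch below only illustrates that order. It is not the code of i2g-job-submit, and the function name and the dictionary arguments are hypothetical stand-ins for the UIutils configuration objects.

```python
import os


def resolve_logging_destination(vo_conf, global_conf, environ=None):
    """Return (host, port) using the lookup order introduced by the patches.

    vo_conf and global_conf are plain dicts standing in for the VO-level and
    global UI configuration (hypothetical, for illustration only).
    """
    environ = os.environ if environ is None else environ
    dest = (
        environ.get("I2G_WL_LOG_DESTINATION")      # 1. environment variable
        or vo_conf.get("LoggingDestination")       # 2. VO configuration file
        or global_conf.get("LoggingDestination")   # 3. global UI configuration
    )
    if not dest:
        raise LookupError("no logging destination configured")
    host, port = dest.split(":")
    return host, int(port)


vo_conf = {"LoggingDestination": "i2g-rb02.lip.pt:9002"}
print(resolve_logging_destination(vo_conf, {}, environ={}))
```
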
Edit the file /opt/i2g/etc/swetest/i2g_wl_ui.conf (create it if necessary) to configure the job submission to use the CrossBroker (i2g-rb02.lip.pt):

[
VirtualOrganisation = "swetest";
NSAddresses = "i2g-rb02.lip.pt:7772";
LBAddresses = "i2g-rb02.lip.pt:9000";
MyProxyServer = "px01.ifca.es";
LoggingDestination = "i2g-rb02.lip.pt:9002";
]
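
A typo in one of the host:port values is easy to miss, so it can help to check the file before submitting jobs. The following Python sketch is only an illustration: the function names are hypothetical, and the parser handles only the simple Key = "value"; lines shown above, not the full configuration syntax.

```python
import re


def parse_ui_conf(text):
    """Parse simple 'Key = "value";' lines into a dict."""
    pairs = re.findall(r'^\s*(\w+)\s*=\s*"([^"]*)"\s*;', text, flags=re.MULTILINE)
    return dict(pairs)


def bad_endpoints(conf, keys=("NSAddresses", "LBAddresses", "LoggingDestination")):
    """Return the keys whose value is missing or not of the form host:port."""
    bad = []
    for key in keys:
        host, sep, port = conf.get(key, "").partition(":")
        if not (host and sep and port.isdigit()):
            bad.append(key)
    return bad


example = '''[
VirtualOrganisation = "swetest";
NSAddresses = "i2g-rb02.lip.pt:7772";
LBAddresses = "i2g-rb02.lip.pt:9000";
MyProxyServer = "px01.ifca.es";
LoggingDestination = "i2g-rb02.lip.pt:9002";
]'''
conf = parse_ui_conf(example)
print(bad_endpoints(conf))
```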

Now a user should be able to launch jobs and submit them to the I2G testbed, by using the "i2g" commands:

i2g-job-attach
i2g-job-cancel
i2g-job-get-chkpt
i2g-job-get-logging-info
i2g-job-get-output
i2g-job-list-match
i2g-job-status
i2g-job-submit
i2g-wl-grid-console-shadow
i2g-wl-logev
i2g-wl-ui-jdleditor.csh
i2g-wl-ui-jdleditor.sh
i2g-wl-ui-jobmonitor.csh
i2g-wl-ui-jobmonitor.sh
i2g-wl-ui-jobsubmitter.csh
i2g-wl-ui-jobsubmitter.sh
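
A quick way to confirm that the installation worked is to check that the commands can be found. The Python sketch below assumes that the I2G bin directory (/opt/i2g/bin in the patches above) is on the user's PATH. The helper name and the selection of commands are only illustrative.

```python
import shutil

# A few of the commands listed above; extend as needed.
I2G_COMMANDS = [
    "i2g-job-submit",
    "i2g-job-status",
    "i2g-job-get-output",
    "i2g-job-cancel",
    "i2g-job-list-match",
]


def missing_commands(commands, which=shutil.which):
    """Return the commands that cannot be found on the PATH."""
    return [cmd for cmd in commands if which(cmd) is None]


print(missing_commands(I2G_COMMANDS))
```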

WNs Integration

Schema

A rough diagram of the deployed setup:

                                                _
--------------------     --------------------    |
|                  |     |                  |    |
|      I2G CE      |     |     EGEE CE      |    |- Other Grid components are also dedicated
|                  |     |                  |    |               (SE, UI, etc)
--------------------     --------------------   _|
          |                       |
          -------------------------
                      |
                      |                         _
             --------------------                | 
             |                  |                |
             |   MAUI/TORQUE    |                |- A unique Maui + Torque server.
             |                  |                |
             --------------------               _|
                      |
                      |
          -------------------------              
          |                       |             _
--------------------     --------------------    |
|                  |     |                  |    |
|      WN 1        |     |       WN 2       |    |
|                  |     |                  |    |
--------------------     --------------------    |
                                                 |
        ...                      ...             |- A unique pool of WNs enabled to run both I2G
                                                 |                 and EGEE jobs.
--------------------     --------------------    |
|                  |     |                  |    |
|      WN "m"      |     |       WN "n"     |    |
|                  |     |                  |    |
--------------------     --------------------   _|

As shown, we deploy a separate Torque server in order to manage the queues centrally. Both CEs communicate with it, so the WNs make no distinction between projects.

I2G repository

Where necessary (i.e. when installing I2G middleware), add the I2G repository to /etc/apt/sources.list.d/i2g.list:

rpm http://savannah.fzk.de repository/i2g/production i386 noarch

The integration takes place in two steps:

  1. Make the base installation of the EGEE Middleware (Instructions)
  2. Follow the particular instructions for each type of node listed below.

Please read these instructions carefully before installing the middleware, since some "standard" variables must be redefined before the installation.

Prepare your new site-info.def

We need three site-info.def files: an I2G one, an EGEE one and a mixed one.

I2G and EGEE

  • Redefine the following variable:
TORQUE_SERVER="your.torque.server.$MY_DOMAIN"

Mixed one

This is the EGEE site-info.def file with the added I2G support:

  • Add both CEs (as CE CE2) and add them to the BDII regions.
  • Create or edit the users.conf file and add both I2G and EGEE VO's users to the file.
  • Perform the same operation for the groups in the file groups.conf.
  • Add on the site-info.def file the VOs supported (VOS="...", QUEUES="...", VO_GROUP_ENABLE="...")
  • Create the vo.d/imain file containing:
VOMS_SERVERS="'vomss://i2g-voms.lip.pt:8443/voms/imain?/imain/'
'vomss://i2gvoms01.ifca.es:8443/voms/imain?/imain/'"
SW_DIR=$VO_SW_DIR/imainsoft
DEFAULT_SE=$CLASSIC_HOST
STORAGE_DIR=$CLASSIC_STORAGE_DIR/imain
QUEUES="imaingridsdj"
VOMSES="'imain i2g-voms.lip.pt 20001 /C=PT/O=LIPCA/O=LIP/OU=Lisboa/CN=i2g-voms.lip.pt imain'
'imain i2gvoms01.ifca.es 20001 /C=ES/O=DATAGRID-ES/O=IFCA/CN=host/i2gvoms01.ifca.es imain'"
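
Each quoted VOMSES entry packs several fields into one string, and a missing field is hard to spot by eye. The Python sketch below splits the value into its entries so they can be checked. The field names (vo, host, port, dn, alias) are our reading of the entries above, and the function name is hypothetical.

```python
import shlex


def parse_vomses(value):
    """Split a VOMSES value into one dict per quoted entry."""
    entries = []
    for entry in shlex.split(value):  # each single-quoted entry becomes one token
        fields = entry.split()
        if len(fields) < 5:
            raise ValueError(f"malformed VOMSES entry: {entry!r}")
        entries.append({
            "vo": fields[0],
            "host": fields[1],
            "port": int(fields[2]),
            "dn": " ".join(fields[3:-1]),  # a DN may contain spaces
            "alias": fields[-1],
        })
    return entries


VOMSES = (
    "'imain i2g-voms.lip.pt 20001 /C=PT/O=LIPCA/O=LIP/OU=Lisboa/CN=i2g-voms.lip.pt imain' "
    "'imain i2gvoms01.ifca.es 20001 /C=ES/O=DATAGRID-ES/O=IFCA/CN=host/i2gvoms01.ifca.es imain'"
)
for item in parse_vomses(VOMSES):
    print(item["host"], item["port"], item["dn"])
```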

Worker Nodes Installation

Install them and configure them with the "mixed" site-info.def.

Maui + Torque installation

Install the node, and configure it with the "mixed" site-info.def.

/opt/glite/yaim/bin/yaim -i -s siteinfo/site-info_glite30_TORQUE_GRID_070726_164600.def -m glite-torque-server-config
/opt/glite/yaim/bin/yaim -c -s siteinfo/site-info_glite30_TORQUE_GRID_070726_164600.def -n TORQUE_server

Edit your /var/spool/maui/maui.cfg and add both CEs to the ADMINHOSTS line, like this:

(...)
ADMINHOSTS              torque.ifca.es i2gce01.ifca.es egeece01.ifca.es
(...)
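
To confirm that no host was left out of ADMINHOSTS, the line can be checked with a few lines of Python. This is a minimal sketch: the function name is hypothetical, and it assumes that the parameter name is written in upper case and that # starts a comment.

```python
def missing_admin_hosts(maui_cfg_text, required_hosts):
    """Return the required hosts that are absent from the ADMINHOSTS line(s)."""
    present = set()
    for line in maui_cfg_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if fields and fields[0] == "ADMINHOSTS":
            present.update(fields[1:])
    return [host for host in required_hosts if host not in present]


cfg = "ADMINHOSTS              torque.ifca.es i2gce01.ifca.es\n"
required = ["torque.ifca.es", "i2gce01.ifca.es", "egeece01.ifca.es"]
print(missing_admin_hosts(cfg, required))
```

On the real server the text would be read from /var/spool/maui/maui.cfg.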

Edit or create the files /var/spool/pbs/server_priv/acl_svr/operators and /var/spool/pbs/server_priv/acl_svr/managers, listing the hosts that should be able to administer the PBS server (i.e. the Torque server and both CEs), one per line.

Restart the service:

/etc/init.d/pbs_server restart

EGEE CE installation

Install and configure it with the EGEE site-info.def, using the lcg-CE metapackage and the CE target.

/opt/glite/yaim/bin/yaim -i -s siteinfo/EGEE/site-info.def -m lcg-CE -m glite-torque-client-config
/opt/glite/yaim/bin/yaim -c -s siteinfo/EGEE/site-info.def -n CE

I2G CE installation

Install and configure it with the I2G site-info.def.

/opt/glite/yaim/bin/yaim -i -s siteinfo/I2G/site-info.def -m i2g-CE-lcg -m glite-torque-client-config
/opt/glite/yaim/bin/yaim -c -s siteinfo/I2G/site-info.def -n CE

Jobmanager Patch

As explained in the document written by Gonçalo Borges and Jorge Gomes, the following patch must be applied to /opt/globus/lib/perl/Globus/GRAM/JobManager/lcgpbs.pm (as we are using lcgpbs):

            .  "export " . $tuple->[0] . "\n";
     }
-    $pbs_job_script->print("#PBS -v " .  join(',', @new_env) . "\n");
+
+    ###
+    # [GBorges]
+    # Patch to guaranty interoperability between I2G WNs and EGEE WNs
+
+    # Read the VO from the user proxy
+    my $vo_user_line=`/opt/edg/bin/voms-proxy-info --file $local_x509 --all | grep VO`;
+    my ($vo_tag,$vo_user)=split(/:/,$vo_user_line);
+    $vo_user=~s/\s+//g;
+
+    # Read the vo_environment configuration file
+    my ( %VOCONFIG );
+    $VOCONFIG{'vo_environment'} = '/opt/globus/lib/perl/Globus/GRAM/JobManager/vo_environment';
+    my $voenv = new IO::File( $VOCONFIG{'vo_environment'}, "r" )
+        or die "Unable to open VO environment mapping configuration file.";
+
+    # For every line in the environment configuration map
+    while ( defined( $_ = $voenv->getline ) ) {
+        next if /^\s*\#/;       # Skip if comment
+        next if /^\s*$/;        # Skip if whitespace
+        # Several chars (\S+), white space (\s+)
+        if (/^(\S+)\s+((\S+)\s+)+/) {
+            # If it's a constraint entry, extract the values.
+            my $vo_name = $1;
+            s/^($vo_name)\s+//g;
+            my @vo_varenvs = split(/\s/);
+            foreach my $vo_varenv (@vo_varenvs) {
+                # Add @new_env with the VO specific env variables
+                push(@new_env,$vo_varenv) if $vo_name eq $vo_user;
+            }
+        } else {
+            warn "Unrecognised entry in ".$VOCONFIG{'vo_environment'}.": '$_'\n";
+        }
+    }
+
+    $voenv->close;
+
+    foreach my $env_line (@new_env) {
+       $pbs_job_script->print("export " . $env_line . "\n");
+    }
+
+#    $pbs_job_script->print("#PBS -v " .  join(',', @new_env) . "\n");
+
+# Patch ends here

     if (defined $description->library_path() && $description->library_path() ne '')
     {
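
The patch reads a vo_environment file in which each line holds a VO name followed by one or more variable assignments, and it writes an export line for every assignment that belongs to the VO of the submitting user. The Python sketch below mirrors that selection logic for illustration. It is not part of the jobmanager, the function names are hypothetical, and the sample lines (FOO, BAR, BAZ) are invented.

```python
def vo_environment(lines, vo_user):
    """Return the VAR=value entries configured for vo_user."""
    extra_env = []
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 2:
            # the Perl patch emits a warning for such lines
            print(f"Unrecognised entry: {line!r}")
            continue
        vo_name, variables = fields[0], fields[1:]
        if vo_name == vo_user:
            extra_env.extend(variables)
    return extra_env


def export_lines(env_entries):
    """Render the entries as the export lines written to the job script."""
    return [f"export {entry}" for entry in env_entries]


sample = ["# VO   variables", "imain FOO=/opt/foo BAR=1", "swetest BAZ=2"]
print("\n".join(export_lines(vo_environment(sample, "imain"))))
```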

MPI configuration

For MPI to work, the Jobmanager must be patched and the batch system (PBS/Torque in our case) must be configured.

WNs configuration

Passwordless SSH must be set up between the WNs. See our Grid Administration Guide.

I2G CE configuration

As we are using lcgpbs, we need to apply the following patch:

--- /opt/globus/lib/perl/Globus/GRAM/JobManager/lcgpbs.pm	2007-08-24 13:25:00.000000000 +0200
+++ lcgpbs.pm	2007-08-24 13:25:46.000000000 +0200
@@ -22,8 +22,8 @@
     $qstat =  '/usr/bin/qstat';
     $qdel = '/usr/bin/qdel';
     $qmsg = '/usr/bin/qmsg';
-    $cluster = 0;
-    $cpu_per_node = 0;
+    $cluster = 1;
+    $cpu_per_node = 1;
     $remote_shell = '/usr/bin/ssh';
 }

Maui + Torque configuration

The main instructions are in the I2G wiki and in the EGEE GOC wiki.

  • Edit /var/spool/pbs/torque.cfg and add a line containing:
SUBMITFILTER /var/spool/pbs/submit_filter.pl 
  • Download the submit_filter.pl from [1] and put it in the above location.
  • Check that ENABLEMULTIREQJOBS is set to TRUE in your Maui configuration file (/var/spool/maui/maui.cfg).
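
Both settings can be verified before restarting anything. The Python sketch below is a minimal check that works on the text of the two files. The function names are hypothetical, and it assumes that each setting sits on its own line as "KEY value" and that # starts a comment.

```python
def config_value(text, key):
    """Return the value of the first non-comment line that sets key, else None."""
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split(None, 1)
        if fields and fields[0] == key:
            return fields[1].strip() if len(fields) > 1 else ""
    return None


def check_mpi_settings(torque_cfg, maui_cfg):
    """Return a list of problems found in the two configuration texts."""
    problems = []
    if config_value(torque_cfg, "SUBMITFILTER") != "/var/spool/pbs/submit_filter.pl":
        problems.append("SUBMITFILTER does not point to /var/spool/pbs/submit_filter.pl")
    if (config_value(maui_cfg, "ENABLEMULTIREQJOBS") or "").upper() != "TRUE":
        problems.append("ENABLEMULTIREQJOBS is not TRUE in maui.cfg")
    return problems


torque_cfg = "SUBMITFILTER /var/spool/pbs/submit_filter.pl \n"
maui_cfg = "ENABLEMULTIREQJOBS TRUE\n"
print(check_mpi_settings(torque_cfg, maui_cfg))
```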

WATCH OUT

After these changes, PBS needs to re-read its configuration, which means starting it with the option -t create. This overwrites the previous settings, so first make a backup of your node file /var/spool/pbs/server_priv/nodes and of the queue configuration:

# qmgr -c "print server" > queues.cfg
# cp /var/spool/pbs/server_priv/nodes .
# /etc/init.d/pbs_server stop
# pbs_server -t create
# qmgr < queues.cfg
# /etc/init.d/pbs_server stop
# cp nodes /var/spool/pbs/server_priv/nodes
# /etc/init.d/pbs_server start

Testing
