This wiki has been deprecated and will be removed soon.

The new Advanced Computing and e-Science wiki is located at http://grid.ifca.es/wiki.

Please update your bookmarks.

CPD-MANAGEMENT

From e-Ciencia

This activity refers to the management of the physical environment of the CPD (Centro de Proceso de Datos, i.e. the data centre).

The main components are:

Building Management

Covers the maintenance and upkeep of the CPD room and the IT staff offices, as well as access control.

> Cleaning Procedure

Daily: Check the entry hall; put discarded papers in the wastebasket and clean up minor dirt. Use the vacuum cleaner located below the table if needed. Check the level of the vacuum bag and that spare bags are available.

Weekly: On Tuesday at 9:30 am, ask the cleaning staff to clean the main room (temporarily suspended) and the entry hall in detail the next day.

> Waste Disposal

Daily: Check for new waste from shipping, maintenance tasks, etc. Open a ticket if a significant waste source is found.

Weekly: Check with the LAB unit that removal of the waste is planned (from the place outside the room, close to the access door). If there are no plans, open a ticket.

> Access Control

Access (see also below) is granted to:

-The IT technical management team (not applications management!) and those working with them (e.g. IBM maintenance, workshop people, etc.). The shifter must be present for the full duration of an external person's visit. If this is not possible, a ticket should be opened.

-Building maintenance and IFCA authorities, who should sign the entry logbook; the IT manager on shift should open a ticket afterwards.

Daily: Check on the tour that the room is locked. If it is not, lock it, then open a ticket with tag 5.12.c.

Visits to the room must be registered by the IT technical management team using an open ticket, including the person responsible, the number and description of visitors, and the reason for the visit. The CPD room can be declared in status "visits not allowed" by IT management.

<ticket tag Access Control>

<Who,When,Why=door found open | key lost | copy of key needed | visit request >

Equipment Hosting

Ensures that all special requirements are provided for the physical housing of equipment and the teams that support them.

BASIC RULE: any new equipment must be in principle RACK-hosted.

REQUEST Procedure (IT-management, ...):

-Receive supplier communication

-Open ticket for installation, specifying the equipment details

FULFILLMENT Procedure (IT-management):

-Locate spare place or expected place

-Check power, networking, and air-conditioning needs

-Fix a date/time for the installation

Power Management

The CPD room has two separate electrical boards.

The ALTAMIRA/RES electrical board is located to the right, just past the door. It has a three-phase connection, limited to 60A. An MGE UPS 5000 is also connected to it, with a capacity of XXX Ah

ONLY the DS300 supporting the GPFS disk system is connected to the UPS. Typical energy usage in the UPS is: x x x (per phase)

All blades are connected directly to the electrical board, through a 3-phase PDU.

Counters in the ALTAMIRA board: the electronic counter located in the board reports the following readings (the display cycles between them when the button is pressed):

-current by phase (should be balanced?)

-total energy usage in Wh

The electrical board also supports the Hiross U66 placed behind the Altamira supercomputer.

The GRID/General electrical board is located at the end of the wall to the left in the room. It has a three-phase connection, limited to 160A. An MGE UPS 3000 is also connected to it, with a capacity of XXX Ah

ONLY the DS4700 supporting the GPFS disk system and the critical servers are connected to the UPS. This equipment is marked with a RED circle. In most cases there are double power connectors (one to the UPS, one to the direct connection). Typical energy usage in the UPS is: x x x (per phase)

All other servers and blades are connected directly to the electrical board, through 3-phase PDUs.

Counters in the GRID board: the electronic counter located in the board reports the following readings (the display cycles between them when the button is pressed):

-current by phase (should be balanced?)

-total energy usage in Wh

Typical values:

The electrical board also supports the Hiross U66 placed close to the other Hiross, and the General/Fujitsu air conditioners in the bottom wall.
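The per-phase readings from the board counters could be checked with a few lines of code. This is a minimal sketch under stated assumptions: the 60 A and 160 A limits come from the text and are assumed here to apply per phase, while the 10% imbalance threshold and all function names are hypothetical.

```python
# Limits taken from the text (assumed to apply to each phase).
BOARD_LIMIT_A = {"ALTAMIRA/RES": 60.0, "GRID/General": 160.0}


def phase_imbalance(currents):
    """Largest deviation from the mean phase current, as a fraction of the mean."""
    mean = sum(currents) / len(currents)
    if mean == 0:
        return 0.0
    return max(abs(c - mean) for c in currents) / mean


def check_board(board, currents, max_imbalance=0.10):
    """Return a list of warnings for one set of per-phase readings (in A).

    max_imbalance is an illustrative threshold, not a value from this page.
    """
    warnings = []
    limit = BOARD_LIMIT_A[board]
    for phase, amps in enumerate(currents, start=1):
        if amps > limit:
            warnings.append(f"{board}: phase {phase} at {amps} A exceeds {limit} A")
    if phase_imbalance(currents) > max_imbalance:
        warnings.append(f"{board}: phases are unbalanced")
    return warnings


# Example readings are made up for illustration.
for message in check_board("ALTAMIRA/RES", [65.0, 30.0, 30.0]):
    print(message)
```

Any non-empty result would be a natural trigger for opening a ticket.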

Environmental Conditioning

Includes specification, maintenance and monitoring of systems such as smoke detection and fire suppression, water, heating and cooling systems.

The CPD air-conditioning is currently based mainly on two Hiross 66UA units. The units are maintained by IPARKLIMA (phone:). The contract includes periodic inspections.

Known problems:

the humidification filters do not work (they calcify after two to three months and have to be replaced; the cost is too high)

the total cooling power is not sufficient

Proposed solution: extracting the air from the hot area directly to the outside (in progress in January 2009)


The alarm system was installed by Fischer and is now connected to the Security Unit.

Unfortunately, the phone line is shared with an office (!)

The alarm system calls Security, J. Marco and R. Marco (as of 15 Jan 2009).

The alarm panel is located to the right at the entry of the hall.

It has three operation modes: manual, auto, maintenance. IT SHOULD BE IN AUTO MODE. A ticket should be opened in any other case.

Inspections of the alarm system are in the hands of the safety unit at the University (phone?)

Safety

Compliance with all legislation, standards and policies relating to the safety of employees.

The following procedures apply:

-All IT managers who have access to the room SHOULD have taken a basic SAFETY course like the one available at CERN (for remote training).

-A security contact has to be available at all times. Outside working hours this is handled by UC Security (phone: +34 942 20999 )

> Signs in the room:

-Mind the step at the entry

-Although there is no possibility of the room filling with dangerous gas in case of alarm, it can become dangerous if any equipment burns. NO WASTE SHOULD BE ANYWHERE IN THE ROOM. An "ALWAYS CLEAN" sign is needed.

-All electrical connections must be labeled, and electrical boards must be signposted

-No one may lift a load exceeding 10 kg. Labels on equipment boxes have to be respected.

Physical Access Control

Ensures that the facility is only accessed by authorized personnel and that any unauthorized access is detected and managed.

See previous section on tickets to be opened for access (visits, etc).


The key is NOT a master key.

The following people have keys (as of 15 Jan 2009):

-ADMIN (check where and how it is given)

-SECURITY (?)

-Jesus Marco

-Rafael Marco

...

Shipping and Receiving

From ITIL:

-Receipt of new equipment

-Unpacking, configuring and installing standard equipment

-Producing and maintaining Data Centre layout diagrams

-Managing the schedule of any maintenance activity to equipment hosted in the Data Centre

-Disposing of retired equipment.

Any equipment must normally be brought into the hall of the room, and a ticket should be opened, including the details of the reception (transport, who received the package, etc.). First of all, it must be checked that the equipment is in working order. This step must not take more than one week in any case. The ticket must be closed by entering the identification data (S/N, etc.) and the proposed location in the room; in this way the equipment is entered in the database, which sends a mail with the data to our ADMIN to obtain a proper inventory number. The physical label must be stuck to the equipment, or stored in the folder (see first drawer). A copy of the reception papers (delivery note, invoice, etc.) must be scanned and attached to the ticket.
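The one-week limit and the closing conditions of a reception ticket can be expressed as two small checks. This is a sketch only: the dictionary keys and function names are hypothetical, and the real ticket system and database may store this information differently.

```python
from datetime import date, timedelta

# The text allows at most one week for the initial working-order check.
CHECK_DEADLINE = timedelta(days=7)

# Hypothetical field names for the data required to close the ticket.
REQUIRED_TO_CLOSE = ("serial_number", "proposed_location", "scanned_papers")


def check_is_overdue(received_on: date, today: date) -> bool:
    """True if more than one week has passed since the equipment was received."""
    return today - received_on > CHECK_DEADLINE


def missing_fields(ticket: dict) -> list:
    """Names of required fields that are absent or empty in the ticket."""
    return [name for name in REQUIRED_TO_CLOSE if not ticket.get(name)]


# Illustrative ticket with made-up values.
ticket = {"serial_number": "SN-EXAMPLE", "proposed_location": ""}
print(check_is_overdue(date(2009, 1, 7), date(2009, 1, 15)))
print(missing_fields(ticket))
```

A ticket would only be closable once `missing_fields` returns an empty list.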

Contract Management

With suppliers and service providers involved in the facility

Any equipment should cost at most approximately 10% over the wholesale ("mayorista") price (check how to get this price).

Any purchase order MUST be confirmed by:

-CPD responsible (to see if it fits)

-Account owner (typically the PI of the project)

-ADMIN

Legal restrictions:

-any equipment below 18.000 euros can be purchased directly with 3 offers

-any equipment above 206.000 euros must be published to DOE-Brussels

-equipment can be bought via CSIC directly from "Catalogo Patrimonio" (see http://catalogopatrimonio.meh.es/pctw/index.aspx)
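The price rules above can be summarized in a short sketch. The two thresholds and the 10% margin come from the text; the function names are hypothetical, and the page does not say what applies between the two thresholds, so that case is returned as unspecified.

```python
DIRECT_PURCHASE_LIMIT_EUR = 18_000   # below this: direct purchase with 3 offers
PUBLICATION_LIMIT_EUR = 206_000      # above this: publication required
MAX_MARKUP = 0.10                    # approx. maximum over the wholesale price


def procurement_route(price_eur: float) -> str:
    """Map a price to the rule stated on this page."""
    if price_eur < DIRECT_PURCHASE_LIMIT_EUR:
        return "direct purchase with 3 offers"
    if price_eur > PUBLICATION_LIMIT_EUR:
        return "must be published to DOE-Brussels"
    return "not specified on this page"


def within_markup(price_eur: float, wholesale_eur: float) -> bool:
    """True if the price is at most 10% over the wholesale price."""
    return price_eur <= wholesale_eur * (1 + MAX_MARKUP)


# Example prices are made up for illustration.
print(procurement_route(12_000))
print(within_markup(1_200, 1_000))
```

Purchases through "Catalogo Patrimonio" are not modelled here, since the text gives no price condition for them.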

This is handled in an ad-hoc way

Known suppliers:

IBM: contact for University equipment:

contact for CSIC equipment:

Notice that we are no longer supported by the office in Asturias


maintenance contact:


CIC-SL: contact: Maite (phone: +34 942 269017)

Cantabria Informatica (CIT): contact: Adolfo (phone: +34 942 037502)


MEGWARE:

Maintenance

Refers to regular, scheduled upkeep, including detection and resolution of problems with the facility.

The maintenance routine is carried out weekly, in cooperation between the shifter and the person responsible. A list of minor activities to be done or fixed during the week is recorded in a specific ticket.
