The Service Availability Monitor (SAM), developed at CERN within the
EGEE and LCG projects [17], plays a special role. SAM can schedule tests
on the grid infrastructure (as grid jobs and as commands from grid user
interfaces) in order to collect operational data. Figure 17.4 shows the
status as seen by SAM for a part of the EGEE infrastructure. The status of
each computer center is indicated by a color code. One can then drill down
to the status of individual services or sites to spot operational problems
or to calculate the availability of the different computer centers. In the
case of LCG, monthly reports compare the expected and available resources
at each participating site.
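As a rough illustration of the availability calculation mentioned above, the sketch below treats availability as the fraction of passed tests per site. This is an assumption made for illustration only: the text does not describe the algorithm SAM actually uses, the function name site_availability is hypothetical, and the sample results are invented (only the site names are taken from Figure 17.4).

```python
from collections import defaultdict


def site_availability(results):
    """Return {site: fraction of passed tests}.

    `results` is an iterable of (site, passed) pairs, where `passed`
    is True when a test succeeded. Hypothetical helper, not SAM's API.
    """
    passed = defaultdict(int)
    total = defaultdict(int)
    for site, ok in results:
        total[site] += 1
        if ok:
            passed[site] += 1
    return {site: passed[site] / total[site] for site in total}


# Invented sample data, for illustration only.
sample = [
    ("PIC", True), ("PIC", True), ("PIC", False),
    ("IFCA", True), ("IFCA", True),
    ("CIEMAT", False),
]

availability = site_availability(sample)
for site, value in sorted(availability.items()):
    print(f"{site}: {value:.0%}")
```

A production system would more plausibly weight tests by criticality and integrate over time intervals rather than count individual test results; the simple ratio is used here only to keep the idea visible.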
SAM is clearly an essential tool for operating the grid. In addition, it is
important to correlate these data with the actual user activity (usage and
efficiency as seen by the different types of jobs). The correlation is not
always simple, because different jobs (and different user communities) use
the grid services offered by the computer centers in different ways.
A complementary view is needed, and the applications should also be
involved. In practice this led to a collaboration between the HEP user
communities and the operations team (the originators of SAM and other
infrastructure-oriented monitoring systems).
The combination of the experience of the monitoring system of CDF
(FNAL) and the user monitor of an early ARDA analysis prototype was
used to start the CMS Dashboard project [later renamed (Experiment)
FIGURE 17.4 SAM status for a part of the EGEE infrastructure
(http://www.egee.cesga.es/EGEE-SA1-SWE/monitoring/sam.sel.shtml?voselect=atlas).
Sites shown: IFCA, UNICAN, USC, CESGA, UB, IFAE1, IFAE2, PIC, PIC2, PIC3,
PIC4, DI-UMinho, ClusterUL, UPORTO, UPORTO2, UPORTO3, LIP-Coimbra, BIFI,
CIEMAT, CNB, UAM, ESAC, IFIC2, IFIC, UPV, LIP-Lisbon, CFP-IST.
Snapshot taken 2008/03/30 - 16:30:04 GMT.