Dashboard since the same foundation is used by all four LHC experiments
[18]]. The project thus started as a collaboration between ARDA and the
CMS experiment.
The strategy was to give all grid actors the right tools to manipulate
and display the available data. Grid operation support, for example,
could use the Dashboard to isolate site-specific troubles and use the
statistics from error messages to fix the problem. Middleware development
teams could collect large statistics of error conditions, concentrating
on the most common ones (hence the most annoying for users) and factoring
out site or application problems. Users are clearly interested in
following the execution (including error conditions) of their own jobs,
while activity managers are interested in global figures such as resource
usage.
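As a rough illustration of how error statistics can separate widespread
problems from site-specific ones, the following minimal sketch counts
error messages and records the sites on which each one occurs. The record
layout, site names, and error strings are hypothetical and do not reflect
the actual Dashboard schema.

```python
from collections import Counter, defaultdict

# Hypothetical job records; the field names are illustrative only.
JOBS = [
    {"site": "SiteA", "application": "reco", "error": "proxy expired"},
    {"site": "SiteA", "application": "reco", "error": "stage-out failed"},
    {"site": "SiteA", "application": "sim", "error": "stage-out failed"},
    {"site": "SiteB", "application": "reco", "error": "proxy expired"},
    {"site": "SiteC", "application": "sim", "error": "proxy expired"},
    {"site": "SiteC", "application": "sim", "error": None},  # successful job
]


def error_counts(jobs):
    """Count how often each error message occurs across all failed jobs."""
    return Counter(j["error"] for j in jobs if j["error"] is not None)


def sites_per_error(jobs):
    """Map each error message to the sorted list of sites where it was seen."""
    sites = defaultdict(set)
    for j in jobs:
        if j["error"] is not None:
            sites[j["error"]].add(j["site"])
    return {error: sorted(s) for error, s in sites.items()}


def site_specific_errors(jobs):
    """Return errors seen at exactly one site, mapped to that site."""
    return {e: s[0] for e, s in sites_per_error(jobs).items() if len(s) == 1}
```

In this toy data set, an error confined to a single site is a candidate
for the operations team, while one spread over several sites is more
likely to interest middleware developers.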
In the development of the project, the emphasis was on the aggregation
of existing information, and no special effort was devoted to the
development of new sensors or protocols. The main components of the
Dashboard are therefore the information collectors, the data storage (an
Oracle database), and the services responsible for data retrieval and
information presentation (command-line tools, Web pages, etc.).
The Dashboard uses multiple sources of information. In addition to
SAM, it collects information from other grid monitoring systems like
R-GMA (Relational Grid Monitoring Architecture) [19], GridIce (Monitoring
tool for Grid Systems) [20], and ICRTM (Imperial College Real Time
Monitoring of the Resource Brokers) [21].
The Dashboard also aggregates information from experiment-specific
sources. Examples are the ATLAS Data Management, central databases
(like the ATLAS Production database), job submission systems (like
Ganga), and individual jobs. Information is transported to the Dashboard
via various protocols (depending on the capability of the information
providers).
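One way to picture the aggregation of heterogeneous sources is a set of
small adapters that convert each provider's format into a common record
before storage. The sketch below is only an assumption about how such a
step could look; the source names, payload layouts, and the JobStatus
type are invented for illustration and are not taken from the Dashboard
code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobStatus:
    """Common record that every (hypothetical) adapter produces."""
    job_id: str
    site: str
    state: str
    source: str


def from_grid_monitor(row):
    # Assumed payload: a (job_id, site, state) tuple.
    job_id, site, state = row
    return JobStatus(job_id, site, state.lower(), "grid-monitor")


def from_submission_tool(msg):
    # Assumed payload: a dict with the keys "id", "ce_site", and "status".
    return JobStatus(msg["id"], msg["ce_site"], msg["status"].lower(),
                     "submission-tool")


# Registry mapping a source name to the adapter that understands its format.
ADAPTERS = {
    "grid-monitor": from_grid_monitor,
    "submission-tool": from_submission_tool,
}


def normalize(source, payload):
    """Convert a provider-specific payload into a common JobStatus record."""
    return ADAPTERS[source](payload)
```

For example, normalize("grid-monitor", ("j1", "SiteA", "RUNNING")) and
normalize("submission-tool", {"id": "j1", "ce_site": "SiteA", "status":
"Running"}) yield records that differ only in their source field.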
Collecting the input information implies regular access to the
information sources. The data are retrieved and stored in the Dashboard
database. To provide a reliable monitoring system, data collectors should
be resilient: they should run permanently and recover any missing data
in case of failures. The Dashboard framework provides all the necessary
tools to manage and monitor these agents, each focusing on a specific
subset of the required tasks.
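The recovery requirement can be sketched with a checkpoint: a collector
remembers the last timestamp it stored successfully and, after a failed
cycle, asks the source again from that point, so the gap is filled on the
next successful attempt. The Collector class, its fetch and store
callables, and the flaky source in demo() are all hypothetical; this is
not the Dashboard's agent framework.

```python
class Collector:
    """Minimal polling collector with checkpoint-based recovery (a sketch)."""

    def __init__(self, fetch, store, start=0):
        self.fetch = fetch        # callable(since) -> list of (timestamp, value)
        self.store = store        # callable(records) -> None
        self.checkpoint = start   # last timestamp known to be stored

    def run_once(self):
        """Run one collection cycle; return True on success, False on failure."""
        try:
            records = self.fetch(self.checkpoint)
            if records:
                self.store(records)
                self.checkpoint = max(ts for ts, _ in records)
            return True
        except Exception:
            # Keep the checkpoint unchanged so the next cycle re-requests
            # everything that was missed during the failure.
            return False


def demo():
    """Simulate a source that fails on its second call."""
    events = [(1, "a"), (2, "b"), (3, "c"), (4, "d")]
    calls = {"n": 0}

    def flaky_fetch(since):
        calls["n"] += 1
        if calls["n"] == 2:
            raise ConnectionError("source temporarily unavailable")
        visible = events[: 2 * calls["n"]]  # two new events per cycle
        return [(ts, v) for ts, v in visible if ts > since]

    stored = []
    collector = Collector(flaky_fetch, stored.extend)
    outcomes = [collector.run_once() for _ in range(3)]
    return outcomes, stored
```

In demo(), the second cycle fails, and the third cycle retrieves the
events that accumulated in the meantime, so nothing is lost. A real agent
would also need persistent checkpoints and scheduling, which are omitted
here.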
In Figure 17.5 we present one of the main views of the Dashboard,
namely the Job Monitor. As an example, we display the summary of CMS
production jobs (one week at the beginning of 2008). It is worth noting
that since the LHC experiments as a rule use more than one grid
infrastructure, the Dashboard has been designed to collect information
from all the resources used. The centers listed in the display belong to
EGEE, with the exception of the US sites (which belong to OSG).
In Figure 17.6 we also present an alternative view from the Job Monitor.
The dashboard database provides here the view of the analysis jobs