The CMPC CouchDB instance does not allow DELETE requests, in order to prevent
any loss of information.
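Such a restriction can be enforced, for example, through CouchDB's validate_doc_update mechanism, which vets every write before it is applied. The following sketch (with a hypothetical server URL and database name, not taken from the CMPC deployment) installs a design document that rejects any deletion attempt:

    import requests  # third-party HTTP client

    # Hypothetical CouchDB endpoint and database name; authentication omitted for brevity.
    COUCH_URL = "http://localhost:5984"
    DB = "cmpc"

    # A validate_doc_update function runs on every write; a DELETE request arrives
    # as an update with the _deleted flag set, so throwing "forbidden" rejects it.
    design_doc = {
        "_id": "_design/no_delete",
        "validate_doc_update": (
            "function(newDoc, oldDoc, userCtx) {"
            "  if (newDoc._deleted) {"
            "    throw({forbidden: 'Deletion is not allowed'});"
            "  }"
            "}"
        ),
    }

    resp = requests.put(f"{COUCH_URL}/{DB}/_design/no_delete", json=design_doc)
    resp.raise_for_status()

Because CouchDB implements a DELETE as a new document revision with the _deleted flag set, throwing forbidden from this function blocks deletions at the database level.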
The DCI used by the AEGIS CMPC SG is organized around the cmpc.aegis.rs VO.
Establishment of a separate VO has allowed lobbying at the resource providers for
additional or new resources. It also guarantees execution of jobs submitted from
the scientific gateways in reasonable time, and provides an exact mechanism for
tracking the number of users, CPU usage, and other DCI-related statistics. The
cmpc.aegis.rs VO is currently supported by three NGI_AEGIS Grid sites that are
part of the EGI infrastructure (AEGIS01-IPB-SCL, AEGIS04-KG, and
AEGIS11-MISANU), and, as of recently, by the largest HPC installation in Serbia,
the PARADOX cluster, totaling more than 2,700 CPUs and 140 TB of storage
space. The PARADOX cluster is equipped with 1,696 Sandy Bridge CPUs at a
frequency of 2.6 GHz, 106 NVIDIA Tesla M2090 GPU cards, and 100 TB of
storage space. It is interconnected via QDR InfiniBand technology, and achieves a
peak computing performance of 105 TFlops.
Management of the VO membership is centralized and provided by the NGI_AEGIS
VOMS-admin portal. The core Grid services necessary for users to access all
computing and data storage resources are provided by NGI_AEGIS as well, in
particular: the BDII information system (bdii.ipb.ac.rs), the workload management
system (wms.ipb.ac.rs, wms-aegis.ipb.ac.rs), the logging and bookkeeping service
(lb.ipb.ac.rs, lb-aegis.ipb.ac.rs), the MyProxy service (myproxy.ipb.ac.rs), and the
logical file catalogue (lfc.ipb.ac.rs). All services run the latest version of the
middleware, the EMI 3 (Monte Bianco) release.
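In day-to-day use, a gateway user first obtains a short-lived proxy credential carrying the cmpc.aegis.rs VO attributes. A minimal sketch of this step, assuming an EMI user interface with the standard command-line tools and a valid personal grid certificate (the calls simply wrap the stock CLI):

    import subprocess

    # Create a VOMS proxy with the cmpc.aegis.rs VO attributes; the command
    # prompts for the grid certificate passphrase.
    subprocess.run(["voms-proxy-init", "-voms", "cmpc.aegis.rs"], check=True)

    # Inspect the proxy's remaining lifetime and VO attributes.
    subprocess.run(["voms-proxy-info", "-all"], check=True)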
15.4 Usage of the Scientific Gateway
The AEGIS CMPC SG reached production mode in September 2013. Currently, there
are 20 registered users, and according to the EGI accounting portal, 19,000
cmpc.aegis.rs VO jobs have been submitted from the portal so far. Jobs are uniformly
distributed over the CMPC applications, while the average execution time per job is
around 24 h.
With the introduction of the science gateway, the CMPC job success rate has
dramatically increased. One indicator is the ratio between the total consumed
CPU time and the number of jobs. Currently, this ratio is approximately 23 h, which
corresponds to the average execution time per job; in the case of frequent failures,
this ratio would be smaller. Users are allowed to tune the application configuration
only, so there is not much room for changes that could lead to application crashes or
unpredictable behavior. Also, only the CPU-intensive parts of the workflow are
executed on the DCI, while the other tasks are executed locally, on the machine
running the science gateway. The success rate of the local jobs is practically 100 %,
while jobs submitted to the DCI may fail due to various infrastructure problems. For
this reason, each job submitted to the DCI is configured to allow automatic
resubmission.
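In the gLite/EMI middleware, this resubmission policy is expressed in the job description itself. A minimal sketch, assuming an EMI user interface and a valid cmpc.aegis.rs VOMS proxy; the executable and sandbox file names are hypothetical:

    import subprocess

    # Illustrative JDL for a CMPC-style job. RetryCount and ShallowRetryCount
    # enable the WMS's automatic resubmission described above.
    jdl = """\
    Executable          = "run_cmpc.sh";
    StdOutput           = "std.out";
    StdError            = "std.err";
    InputSandbox        = {"run_cmpc.sh", "input.conf"};
    OutputSandbox       = {"std.out", "std.err", "results.tar.gz"};
    VirtualOrganisation = "cmpc.aegis.rs";
    RetryCount          = 3;  // deep resubmission: the job failed after starting
    ShallowRetryCount   = 3;  // shallow resubmission: failure before the job started
    """

    with open("cmpc_job.jdl", "w") as f:
        f.write(jdl)

    # Submit through the NGI_AEGIS WMS with automatic proxy delegation (-a).
    subprocess.run(["glite-wms-job-submit", "-a", "cmpc_job.jdl"], check=True)

With such a description, the WMS transparently reschedules a failed job on another matching resource, up to the configured number of retries, without any user intervention.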