Service Documentation
Please let us know about incorrect, missing or obsolete information concerning the listed products.
ARC Compute Element (CE) is a Grid front-end for a conventional computing resource (e.g. a Linux cluster).
The Argus Authorization Service renders consistent authorization decisions for distributed services (e.g., portals, computing elements, storage elements). The service is based on the XACML standard and uses authorization policies to determine if a user is allowed or denied to perform a certain action on a particular service.
The following documentation mainly covers existing batch system middleware integration components and some general links to the most popular batch systems used within WLCG. In particular, ARC and CREAM CE integration with HTCondor, Slurm, Torque, LSF and Grid Engine flavors are well documented.
The CREAM (Computing Resource Execution And Management) Service is a simple, lightweight service for job management operation at the Computing Element (CE) level.
NOTE: CREAM support will reach its end of life in Dec 2020!
CERNVM File System (CVMFS) is a network file system based on HTTP and optimized to deliver experiment software in a fast, scalable, and reliable way. Files and file metadata are aggressively cached and downloaded on demand.
dCache is a system for storing and retrieving huge amounts of data, distributed among a large number of heterogenous server nodes, under a single virtual filesystem tree with a variety of standard access methods.
The Disk Pool Manager (DPM) is a lightweight storage solution for grid sites.
EOS is a disk-based service providing a low latency storage infrastructure for end users. EOS provides a highly-scalable hierarchical namespace implementation. Data access is provided by the XROOT and HTTPS protocols.
FTS3 is the service responsible for globally distributing the majority of the LHC data across the WLCG infrastructure. It is a low level data movement service, responsible for reliable bulk transfer of files from one site to another while allowing the network resource usage to be controlled per site.
GFAL (Grid File Access Library ) is a client C library providing an abstraction layer of the grid storage system complexity.
An OSG Compute Element (CE) is the entry point for the OSG to your local resources: a layer of software that you install on a machine that can submit jobs into your local batch system. At the heart of the CE is the job gateway software, which is responsible for handling incoming jobs, authorizing them, and delegating them to your batch system for execution. In the past, the OSG only had one option for a job gateway solution, Globus Toolkit’s GRAM-based gatekeeper, but now offers the HTCondor CE instead.
Today in OSG, most jobs that arrive at a CE (called grid jobs) are not end-user jobs, but rather pilot jobs submitted from factories. Successful pilot jobs create and make available an environment for actual end-user jobs to match and ultimately run within the pilot job container. Eventually pilot jobs remove themselves, typically after a period of inactivity.
HTCondor CE is a special configuration of the HTCondor software designed to be a job gateway solution for the OSG. It is configured to use the JobRouter daemon to delegate jobs by transforming and submitting them to the site’s batch system.
StoRM (Storage Resource Manager) is a light, scalable, flexible, high-performance, file system independent, storage manager service for generic disk based storage systems.
The User Interface (UI) is the access point to the Grid Infrastructure. This can be any machine where users have personal account and where their user certificate is installed. From the UI, the user can be authenticated and authorised to use the Grid resources and can access the functionalities offered by the Information, Workload and Data Management Systems.
The Worker Node (WN) is the computing node inside the Grid where the user's jobs are finally executed at a site. On the WN, the necessary middleware components are installed. Additional software components may be necessary according to the requirements of the site supported VOs.
Note: these days many VOs use UI and WN middleware provided through CVMFS, served either via their own repositories or via the generic grid.cern.ch repository.
The Virtual Organization Membership Service is a Grid attribute authority which serves as central repository for VO user authorization information, providing support for sorting users into group hierarchies and keeping track of their roles and other attributes. Such information is used to issue trusted attribute certificates and assertions used in the Grid environment for authorization purposes.
The grid information system provides detailed information about grid services in the interest of a multitude of grid clients and services. The grid information system has a hierarchical structure of three levels. The fundamental building block used in this hierarchy is the Berkeley Database Information Index (BDII). The resource level BDII is usually co-located with a given grid service and provides information about that service. Each grid site in EGI runs a site level BDII. This aggregates the information from all the resource level BDIIs running at that site. A top level BDII aggregates all the information from all the site level BDIIs and hence contains information about all grid services published by any site. A site may run its own top level BDII or point local clients to some other instance(s) on the grid. The information system clients query a top level BDII to find the information that they require.
XRootD software framework is a fully generic suite for fast, low latency and scalable data access, which can serve natively any kind of data, organized as a hierarchical filesystem-like namespace, based on the concept of directory.
Operations Meetings
T1 sites are invited to report about relevant matters at the weekly Operations meetings, which usually take place on Mondays at 15h00 CE(S)T. They are short meetings (max 30min) where experiments, T1s and central services report about ongoing operational matters.
Operations Coordination Meeting
T1 and T2 sites are invited to raise any issue they are concerned about at the monthly Operations Coordination meeting that usually takes place the 1st Thursday of the month from 15h30 to 17h CE(S)T. There is a section on the agenda for this. You can also write to wlcg-ops-coord-chairpeople in advance to make sure a specific slot is scheduled in the agenda. Short and medium term plans of the experiments affecting sites are reported. TFs and WGs also report about their progress and there may be specific presentations covering a particular topics that need discussion at operations level.
Middleware
WLCG Middleware Baseline
The WLCG Middleware Baseline lists the minimum recommended versions of middleware services that should be installed by WLCG sites to be part of the production infrastructure. It does not necessarily reflect the latest versions of packages available in the UMD, OSG or EPEL repositories. It contains the latest version fixing significant bugs or introducing important features. Versions newer than those indicated are assumed to be at least as good, unless otherwise indicated. In other words: if you have a version older than the baseline, you should upgrade at least to the baseline. For more details, please check the list of versions in the following link:
That page also contains a list of (major) known issues affecting WLCG middleware.