3.2. DIRAC File Catalog
The DIRAC File Catalog (DFC) is a full replica and metadata catalog integrated into DIRAC. It has a highly modular structure that allows for several backends. Interaction with the backend is handled by Managers, so that the interface exposed to users always remains the same.
There are two main sets of managers:
the historical ones, which offer the full range of functionalities and are used by most VOs
and the LHCb ones, which are optimized for scaling and consistency, but in which a subset of the functionalities related to user-defined metadata is not tested. Any VO can of course use them.
The DFC can also be used as a metadata catalog. Metadata is information describing the user data, which makes it easy to select the data sets of interest for user applications. In the DIRAC File Catalog, metadata can be associated with any directory. Subdirectories inherit the metadata of their parents, which reduces the number of stored metadata values. Some metadata variables can be declared as indexes; only indexed metadata can be used in data selections. One can also declare ancestor files for a given file, which is often needed to keep track of the provenance of derived data.
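The inheritance and indexing rules described above can be illustrated with a small, self-contained Python sketch. It is a toy in-memory model, not the DFC implementation or its client API: the `directory_metadata` table, the paths, and both function names are hypothetical, and letting a child value shadow a parent's value is an assumption of the sketch.

```python
import posixpath

# Hypothetical in-memory stand-in for directory metadata (not the DFC schema).
directory_metadata = {
    "/vo/data": {"experiment": "demo"},
    "/vo/data/2012": {"year": 2012},
    "/vo/data/2012/run1": {"run": 1},
}


def effective_metadata(directory):
    """Return the metadata of `directory`, including values inherited from parents."""
    # Collect the directory and all of its ancestors, deepest first.
    chain = []
    current = directory
    while True:
        chain.append(current)
        parent = posixpath.dirname(current)
        if parent == current:  # reached the root (or an empty path)
            break
        current = parent
    merged = {}
    # Apply the root first; in this sketch a child value shadows a parent's.
    for path in reversed(chain):
        merged.update(directory_metadata.get(path, {}))
    return merged


def find_directories(query, indexed_fields):
    """Return the directories whose effective metadata matches every query item."""
    # Only indexed metadata may be used in a selection.
    unknown = set(query) - set(indexed_fields)
    if unknown:
        raise ValueError(f"not indexed: {sorted(unknown)}")
    return sorted(
        d
        for d in directory_metadata
        if all(effective_metadata(d).get(k) == v for k, v in query.items())
    )
```

In this model `/vo/data/2012/run1` stores only `run`, yet a selection on `experiment` and `year` still finds it, because both values are inherited from its parents.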
3.2.1. Installation
The installation and configuration procedure differs slightly between the historical managers and the LHCb ones.
The list of components you need to have installed is:
FileCatalogDB: use FileCatalogDB.sql if you want the standard managers, or FileCatalogWithFkAndPsDB.sql if you want the LHCb ones
FileCatalogHandler: just the interface to the DB
3.2.2. FileCatalogDB
No special configuration is needed here.
3.2.3. FileCatalogHandler
All the configuration of the DFC takes place here. The options are:
DatasetManager: default DatasetManager. Manager for the datasets
DefaultUmask: default 0775. Umask in octal
DirectoryManager: default DirectoryLevelTree. Manager for the directories
DirectoryMetadata: default DirectoryMetadata. Manager for the directory metadata
FileManager: default FileManager. Manager for the files
FileMetadata: default FileMetadata. Manager for the file metadata
GlobalReadAccess: default True. If set to True, anyone can read anything
LFNPFNConvention: default Strong.
ResolvePFN: default True. Deprecated
SecurityManager: default NoSecurityManager. Manager for authentication
SEManager: default SEManagerDB. Manager for the storage elements
UniqueGUID: default False. If True, the GUID has to be unique across the namespace
UserGroupManager: default UserAndGroupManagerDB. Manager for users and groups
ValidFileStatus: default [AprioriGood,Trash,Removing,Probing]. Statuses that are valid for files
ValidReplicaStatus: default [AprioriGood,Trash,Removing,Probing]. Statuses that are valid for replicas
VisibleFileStatus: default [AprioriGood]. By default, only files in this status are returned
VisibleReplicaStatus: default [AprioriGood]. By default, only replicas in this status are returned
To use the LHCb managers, set:
FileManager = FileManagerPs
DirectoryManager = DirectoryClosure
UniqueGUID = True
SecurityManager = VOMSSecurityManager
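As a quick way to see what the LHCb choice changes relative to the defaults, the following Python sketch overlays the four overrides above on a subset of the default options and prints the result in `Option = Value` form. The dictionaries and `render_options` are illustrative helpers, not part of DIRAC, and the sketch does not say where in the configuration these options have to be placed.

```python
# A subset of the defaults listed in this section, as plain strings.
DEFAULTS = {
    "FileManager": "FileManager",
    "DirectoryManager": "DirectoryLevelTree",
    "UniqueGUID": "False",
    "SecurityManager": "NoSecurityManager",
    "LFNPFNConvention": "Strong",
}

# The values recommended above for the LHCb managers.
LHCB_OVERRIDES = {
    "FileManager": "FileManagerPs",
    "DirectoryManager": "DirectoryClosure",
    "UniqueGUID": "True",
    "SecurityManager": "VOMSSecurityManager",
}


def render_options(options):
    """Render options one per line as 'Name = Value' (hypothetical helper)."""
    return "\n".join(f"{name} = {value}" for name, value in options.items())


# Later entries win, so the overrides replace the matching defaults.
lhcb_options = {**DEFAULTS, **LHCB_OVERRIDES}

if __name__ == "__main__":
    print(render_options(lhcb_options))
```

Options that are not overridden, such as `LFNPFNConvention`, keep their default values.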
3.2.4. Security Manager
This manager takes care of the access permissions in the DFC. There are several of them:
NoSecurityManager: offer yourself to whatever treatment the world reserves for you
DirectorySecurityManager: only looks at directories for permissions
FullSecurityManager: close to the POSIX treatment of security permissions
DirectorySecurityManagerWithDelete: same as DirectorySecurityManager, but considers the write bit of the parent directory for removal
VOMSSecurityManager: implements a 3-level POSIX permission scheme (directory, file, replica) and groups the DIRAC groups by their VOMS roles. Basically, if the owner does not match, the groups are used. However, the group making the request and the group owning the file do not need to be the same: it is enough that they share the same VOMS role.
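The group matching described for VOMSSecurityManager can be sketched as follows. This is a simplified illustration and not the actual manager code: the group-to-role table, the group names, and both functions are hypothetical, and treating two identical groups as a match even without a known VOMS role is an assumption of the sketch.

```python
# Hypothetical mapping of DIRAC groups to VOMS roles, for illustration only.
GROUP_TO_VOMS_ROLE = {
    "vo_prod": "/vo/Role=production",
    "vo_prod_shifter": "/vo/Role=production",
    "vo_user": "/vo/Role=user",
}


def permission_class(req_user, req_group, owner, owner_group):
    """Decide which POSIX permission triplet applies to a request."""
    if req_user == owner:
        return "owner"
    # The owner does not match, so fall back to the groups.
    if req_group == owner_group:
        return "group"
    # Different groups still match if they share the same VOMS role.
    req_role = GROUP_TO_VOMS_ROLE.get(req_group)
    if req_role is not None and req_role == GROUP_TO_VOMS_ROLE.get(owner_group):
        return "group"
    return "other"


def is_allowed(action, mode, klass):
    """Check one permission bit of an octal mode such as 0o775."""
    shift = {"owner": 6, "group": 3, "other": 0}[klass]
    bit = {"read": 4, "write": 2, "execute": 1}[action]
    return bool((mode >> shift) & bit)
```

For example, a request from a member of `vo_prod_shifter` on an entry owned by group `vo_prod` falls in the group class, so with mode `0o775` it may write, whereas a `vo_user` request falls in the other class and may only read and execute.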
3.2.5. LFN PFN convention
The DFC encourages the use of a naming convention for physical file names (PFNs) in which the PFN contains the logical file name (LFN) as its trailing part. In this case there is a clear one-to-one correspondence between LFNs and PFNs, which greatly simplifies data integrity management. If the LFNPFNConvention option is set to Strong, this convention is enforced: the PFNs are not stored in the DFC but are constructed on the fly following the convention.
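A minimal sketch of the convention, assuming that a PFN is simply a storage element base URL followed by the LFN. The function names and the example URL are hypothetical, and the real construction depends on the storage element configuration.

```python
def build_pfn(se_base_url, lfn):
    """Construct a PFN whose trailing part is the LFN (illustrative only)."""
    if not lfn.startswith("/"):
        raise ValueError("an LFN is expected to be an absolute path")
    # Avoid a doubled slash where the base URL and the LFN are joined.
    return se_base_url.rstrip("/") + lfn


def lfn_from_pfn(se_base_url, pfn):
    """Recover the LFN from a PFN that follows the convention."""
    prefix = se_base_url.rstrip("/")
    if not pfn.startswith(prefix + "/"):
        raise ValueError("PFN does not belong to this storage element")
    return pfn[len(prefix):]


if __name__ == "__main__":
    base = "srm://se.example.org/dpm/example.org/home/"  # hypothetical SE base URL
    pfn = build_pfn(base, "/vo/user/a/alice/file.dat")
    print(pfn)
    print(lfn_from_pfn(base, pfn))
```

Because the mapping works in both directions, only the LFN and the storage element need to be recorded to locate a replica, which is why the PFN itself does not have to be stored.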