10.2.9. InputDataResolution: giving job access to the data
A job can access its input data in two ways: either by downloading the files to the local worker node, or by reading the data remotely, also known as streaming.
The resolution is done in the JobWrapper (see DIRAC jobs: definitions). By default, the resolution logic is implemented in InputDataResolution. It can be overridden in the job JDL (see InputDataModule in Job Description Language Reference), or by the /Operations/<>/InputDataPolicy/InputDataModule parameter.
You can look into this class for more details, but to summarize: it first checks whether the job JDL defines the InputDataPolicy option and, if so, uses that value as the module. If not, it checks whether a policy is defined for the site the job is running on (in /Operations/InputDataPolicy/<site>). If not, it runs the default policy specified in /Operations/InputDataPolicy/Default.
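The lookup order described above can be sketched in plain Python. This is only an illustration of the fallback chain, not the actual InputDataResolution code: the function name resolve_input_data_policy, the flat dictionary standing in for the Operations section, the site name LCG.Example.org and the short module names used as values are all assumptions made for the example.

```python
from typing import Mapping, Optional


def resolve_input_data_policy(
    jdl: Mapping[str, str],
    operations: Mapping[str, str],
    site: str,
) -> Optional[str]:
    """Return the module to use, following the documented fallback order.

    `jdl` and `operations` are plain dicts standing in for the job JDL and
    the /Operations configuration section (hypothetical simplification).
    """
    # 1. An explicit InputDataPolicy in the job JDL wins.
    policy = jdl.get("InputDataPolicy")
    if policy:
        return policy

    # 2. Otherwise, use the policy defined for the site the job runs on.
    policy = operations.get(f"InputDataPolicy/{site}")
    if policy:
        return policy

    # 3. Otherwise, fall back to the default policy (None if unset).
    return operations.get("InputDataPolicy/Default")


# Hypothetical configuration content, for illustration only.
operations = {
    "InputDataPolicy/Default": "DownloadInputData",
    "InputDataPolicy/LCG.Example.org": "InputDataByProtocol",
}

from_jdl = resolve_input_data_policy(
    {"InputDataPolicy": "InputDataByProtocol"}, operations, "LCG.Other.org"
)
from_site = resolve_input_data_policy({}, operations, "LCG.Example.org")
from_default = resolve_input_data_policy({}, operations, "LCG.Other.org")
```

In this sketch a job with no InputDataPolicy in its JDL, running on a site with no dedicated entry, ends up with the value stored under Default.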
The InputDataPolicy parameter can either be set directly in the JDL, in which case it should be a full module path, or it can be set using the Job class (see setInputDataPolicy()).
10.2.9.1. DownloadInputData
This module downloads the files locally to the worker node for processing. See DownloadInputData for details.
10.2.9.2. InputDataByProtocol
This module generates the URLs necessary to access the files remotely. See InputDataByProtocol for details.