HTCondorCEComputingElement

HTCondorCE Computing Element

Allows direct submission to HTCondorCE Computing Elements with a SiteDirector Agent.

Configuration Parameters

Configuration for the HTCondorCE submission can be done via the configuration system. See the page about configuring Resources / Computing for where the options can be placed.

WorkingDirectory:

Local directory in which the pilot and condor log files are stored. It must exist on the server and be both readable and writable. Temporary files, such as condor submit files, are also kept here. This option is only read from the global Resources/Computing/HTCondorCE location.
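The accessibility requirement can be checked before the agent is started. The following is a minimal sketch, not DIRAC code: the helper name check_working_directory is hypothetical, and it only verifies that the path is an existing directory that the current user can read and write.

```python
import os


def check_working_directory(path):
    """Return a list of problems found with the directory (empty if usable)."""
    if not os.path.isdir(path):
        # Nothing else can be checked if the directory is missing.
        return [f"{path} does not exist or is not a directory"]
    problems = []
    if not os.access(path, os.R_OK):
        problems.append(f"{path} is not readable")
    # Creating files in a directory needs both write and search permission.
    if not os.access(path, os.W_OK | os.X_OK):
        problems.append(f"{path} is not writable")
    return problems
```

An empty list means the directory passed these basic checks; it does not guarantee that enough disk space is available for the log files.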

DaysToKeepRemoteLogs:

Number of days to keep the log files on the remote schedd before they are removed.

DaysToKeepLogs:

Number of days to keep the log files locally before they are removed.

ExtraSubmitString:

Additional options for the condor submit file. Separate options with ‘\n’, for example:

request_cpus = 8 \n periodic_remove = ...

CERN extends the standard HTCondor implementation with additional features. One of them is an option to limit the allocation runtime (+MaxRuntime). Standard HTCondor has no explicit way to define a runtime limit, so maxCPUTime would act as the limit there. On CERN-HTCondor CEs, these CERN-specific features can be used via the ExtraSubmitString configuration parameter.
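To illustrate the separator convention, the sketch below builds an ExtraSubmitString value from individual submit-file statements and expands it back into one statement per line. It is a sketch under assumptions, not DIRAC's implementation: the helper names are hypothetical, the separator is taken to be the literal two-character sequence \n as in the example above, and the +MaxRuntime value of 3600 is purely illustrative.

```python
SEPARATOR = "\\n"  # literal backslash followed by "n", as typed in the configuration


def build_extra_submit_string(statements):
    """Join submit-file statements into a single configuration value."""
    return f" {SEPARATOR} ".join(s.strip() for s in statements)


def expand_extra_submit_string(value):
    """Split the configuration value back into one statement per line."""
    return [part.strip() for part in value.split(SEPARATOR) if part.strip()]


value = build_extra_submit_string(["request_cpus = 8", "+MaxRuntime = 3600"])
print(value)  # the single-line value to place in the configuration
for line in expand_extra_submit_string(value):
    print(line)  # one submit-file statement per line
```

Keeping the statements in a list and joining them only at the end makes it harder to forget a separator when options are added or removed.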

UseLocalSchedd:

If False, jobs are submitted directly to a remote condor schedule daemon, so no condor daemons need to run on the submit machine. If True, the condor grid middleware (condor_submit, condor_history, condor_q, condor_rm) is required.

Proxy renewal or lifetime

When not using a local condor_schedd, add delegate_job_GSI_credentials_lifetime = 0 to the ExtraSubmitString.

When using a local condor_schedd, see the HTCondor documentation for enabling the proxy refresh.
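The rule for the remote schedd case can be expressed as a small helper. This is a hypothetical sketch (the function name and its arguments are not part of DIRAC); it appends the setting named above only when no local schedd is used and the setting is not already present.

```python
PROXY_SETTING = "delegate_job_GSI_credentials_lifetime = 0"


def with_proxy_setting(extra_submit_string, use_local_schedd):
    """Return the ExtraSubmitString, extended for remote-schedd submission."""
    already_set = "delegate_job_GSI_credentials_lifetime" in extra_submit_string
    if use_local_schedd or already_set:
        return extra_submit_string
    if not extra_submit_string.strip():
        return PROXY_SETTING
    # Statements are separated by the literal two-character sequence "\n".
    return f"{extra_submit_string.strip()} \\n {PROXY_SETTING}"
```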

Code Documentation

class DIRAC.Resources.Computing.HTCondorCEComputingElement.HTCondorCEComputingElement(ceUniqueID)

Bases: DIRAC.Resources.Computing.ComputingElement.ComputingElement

HTCondorCE computing element class implementing the functions submitJob and getJobOutput.

__init__(ceUniqueID)

Standard constructor.

available(jobIDList=None)

This method returns the number of available slots in the target CE. The CE instance polls for waiting and running jobs and compares to the limits in the CE parameters.

Parameters

jobIDList (list) – list of already existing job IDs to be checked against
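The comparison against the CE limits can be pictured as follows. This is a simplified sketch rather than the actual method: the names max_total_jobs and max_waiting_jobs are assumed stand-ins for the limits in the CE parameters, and the real method may report additional information.

```python
def available_slots(running, waiting, max_total_jobs, max_waiting_jobs):
    """Number of additional jobs that fit under both limits (never negative)."""
    room_in_queue = max(0, max_waiting_jobs - waiting)
    room_in_total = max(0, max_total_jobs - running - waiting)
    return min(room_in_queue, room_in_total)


# 44 running and 2 waiting jobs against limits of 50 total and 10 waiting:
# the total limit leaves room for 4 jobs, the waiting limit for 8.
print(available_slots(running=44, waiting=2, max_total_jobs=50, max_waiting_jobs=10))  # prints 4
```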

getCEStatus()

Method to return information on running and pending jobs.

getDescription()

Get CE description as a dictionary.

This is called by the JobAgent for the case of “inner” CEs.

getJobOutput(jobID, _localDir=None)

TODO: condor can copy the output automatically back to the submission, so we just need to pick it up from the proper folder

getJobStatus(jobIDList)

Get the status information for the given list of jobs

initializeParameters()

Initialize the CE parameters after they are collected from various sources

isProxyValid(valid=1000)

Check if the stored proxy is valid

isValid()

Check the sanity of the Computing Element definition

killJob(jobIDList)

Kill the specified jobs

loadBatchSystem(batchSystemName)

Instantiate object representing the backend batch system

Parameters

batchSystemName (str) – name of the batch system

loadParallelLibrary(parallelLibraryName, workingDirectory='.')

Instantiate object representing the parallel library that will generate a script to wrap the executable

Parameters

parallelLibraryName (str) – name of the parallel library

sendOutput(stdid, line)

Callback function such that the results from the CE may be returned.

setCPUTimeLeft(cpuTimeLeft=None)

Update the CPUTime parameter of the CE classAd, necessary for running in filling mode

setParameters(ceOptions)

Add parameters from the given dictionary overriding the previous values

Parameters

ceOptions (dict) – CE parameters dictionary to update already defined ones

setProxy(proxy, valid=0)

Set proxy for this instance

shutdown()

Optional method to shut down the (Inner) Computing Element.

submitJob(executableFile, proxy, numberOfJobs=1)

Method to submit job

writeProxyToFile(proxy)

CE helper function to write a CE proxy string to a file.

DIRAC.Resources.Computing.HTCondorCEComputingElement.condorIDAndPathToResultFromJobRef(jobRef)

Extract tuple of jobURL and jobID from the jobRef string. The condorID as well as the path leading to the job results are also extracted from the jobID.

Parameters

jobRef (str) – PilotJobReference of the following form: htcondorce://<ceName>/<condorID>:::<pilotStamp>

Returns

tuple composed of the jobURL, the path to the job results and the condorID of the given jobRef
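The structure of a pilot job reference can be illustrated with a short parsing sketch. It is not the DIRAC function: split_job_ref is a hypothetical helper that returns only the components visible in the reference format above, and it does not derive the path to the job results, since that directory layout is not described here. The host name and IDs in the example are made up.

```python
from urllib.parse import urlparse


def split_job_ref(job_ref):
    """Split 'htcondorce://<ceName>/<condorID>:::<pilotStamp>' into its parts."""
    # Everything before ':::' is the job URL, everything after is the stamp.
    job_url, _, pilot_stamp = job_ref.partition(":::")
    parsed = urlparse(job_url)
    ce_name = parsed.netloc
    condor_id = parsed.path.lstrip("/")
    return job_url, ce_name, condor_id, pilot_stamp


parts = split_job_ref("htcondorce://ce.example.org/1234.0:::abcdef")
print(parts)
```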

DIRAC.Resources.Computing.HTCondorCEComputingElement.findFile(workingDir, fileName, pathToResult=None)

Find a file in a file system.

Parameters
  • workingDir (str) – the name of the directory containing the given file to search for

  • fileName (str) – the name of the file to find

  • pathToResult (str) – the path to follow from workingDir to find the file

Returns

list of paths leading to the file
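A search of this kind could be sketched as below. This is an assumption-laden illustration, not the DIRAC implementation: the function name find_file is hypothetical, it returns a plain list, and it treats pathToResult as an optional subdirectory of workingDir that narrows the search.

```python
import os


def find_file(working_dir, file_name, path_to_result=None):
    """Return all paths under the search root whose base name is file_name."""
    root = os.path.join(working_dir, path_to_result) if path_to_result else working_dir
    matches = []
    # os.walk yields nothing if the root does not exist, so the result is [].
    for directory, _subdirs, files in os.walk(root):
        if file_name in files:
            matches.append(os.path.join(directory, file_name))
    return sorted(matches)
```

Returning an empty list for a missing directory keeps the caller's handling simple, although real code may prefer to report that case as an error.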

DIRAC.Resources.Computing.HTCondorCEComputingElement.getCondorLogFile(pilotRef)

Return the location of the logFile belonging to the pilot reference.

DIRAC.Resources.Computing.HTCondorCEComputingElement.logDir(ceName, stamp)

Return path to log and output files for pilot.

Parameters
  • ceName (str) – Name of the CE

  • stamp (str) – pilot stamp from/for jobRef