SiteDirector

The Site Director is an agent performing pilot job submission to particular sites/Computing Elements.

SiteDirector options
SiteDirector
{
  # VO treated (leave empty for auto-discovery)
  VO =
  # VO treated (leave empty for auto-discovery)
  Community =
  # Group treated (leave empty for auto-discovery)
  Group =
  # Grid Environment (leave empty for auto-discovery)
  GridEnv =
  # the DN of the certificate proxy used to submit pilots. If not found here, what is in Operations/Pilot section of the CS will be used
  PilotDN =
  # the group of the certificate proxy used to submit pilots. If not found here, what is in Operations/Pilot section of the CS will be used
  PilotGroup =


  # List of sites that will be treated by this SiteDirector ("any" can refer to any Site defined in the CS)
  Site = any
  # List of CE types that will be treated by this SiteDirector ("any" can refer to any type of CE defined in the CS)
  CETypes = any
  # List of CEs that will be treated by this SiteDirector ("any" can refer to any CE defined in the CS)
  CEs = any

  # The maximum length of a queue (in seconds). Default: 3 days
  MaxQueueLength = 259200
  # Log level of the pilots
  PilotLogLevel = INFO
  # Max number of pilots to submit per cycle
  MaxPilotsToSubmit = 100
  # Whether to check for waiting pilots that were already submitted
  PilotWaitingFlag = True
  # How many cycles to skip if a queue is not working
  FailedQueueCycleFactor = 10
  # Every N cycles we update the pilots status
  PilotStatusUpdateCycleFactor = 10
  # Every N cycles we update the number of available slots in the queues
  AvailableSlotsUpdateCycleFactor = 10
  # Maximum number of times the Site Director is going to try to get a pilot output before stopping
  MaxRetryGetPilotOutput = 3
  # If True, pilots will be submitted with option --pythonVersion=3
  Python3Pilots = True
  # To submit pilots to empty sites in any case
  AddPilotsToEmptySites = False
  # Should the SiteDirector consider platforms when deciding to submit pilots?
  CheckPlatform = False
  # Attribute used to define if the status of the pilots will be updated
  UpdatePilotStatus = True
  # Boolean value that indicates whether the pilot output will be retrieved
  GetPilotOutput = False
  # Boolean value that indicates if the pilot job will send information for accounting
  SendPilotAccounting = True
  # Boolean value that indicates if the pilot submission statistics will be sent for accounting
  SendPilotSubmissionAccounting = True
}
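
The `*CycleFactor` options above are expressed in agent cycles rather than seconds, and `MaxQueueLength` is given in seconds. The following minimal Python sketch illustrates both conventions. The helper `is_due` and its modulo rule are assumptions made for illustration; they are not taken from the SiteDirector code.

```python
# Illustrative sketch only: is_due() is a hypothetical helper, not a DIRAC function.
SECONDS_PER_DAY = 24 * 60 * 60

# MaxQueueLength default: 3 days expressed in seconds
MAX_QUEUE_LENGTH = 3 * SECONDS_PER_DAY


def is_due(cycle, factor):
    """Return True when an action scheduled 'every N cycles' should run.

    Assumes cycles are counted from 0 and that the action runs whenever
    the cycle counter is a multiple of the factor.
    """
    return cycle % factor == 0


if __name__ == "__main__":
    pilot_status_update_cycle_factor = 10  # value of PilotStatusUpdateCycleFactor
    due_cycles = [c for c in range(30) if is_due(c, pilot_status_update_cycle_factor)]
    print(MAX_QUEUE_LENGTH)
    print(due_cycles)
```

Under these assumptions, a factor of 10 means the action runs on one cycle out of every ten, so the wall-clock interval depends on the agent's polling time.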
class DIRAC.WorkloadManagementSystem.Agent.SiteDirector.SiteDirector(*args, **kwargs)

Bases: DIRAC.Core.Base.AgentModule.AgentModule

SiteDirector class provides an implementation of a DIRAC agent.

Used for submitting pilots to Computing Elements.

__init__(*args, **kwargs)

Constructor

am_Enabled()
am_checkStopAgentFile()
am_createStopAgentFile()
am_disableMonitoring()
am_getBasePath()
am_getControlDirectory()
am_getCyclesDone()
am_getMaxCycles()
am_getModuleParam(optionName)
am_getOption(optionName, defaultValue=None)

Gets an option from the agent’s configuration section. The section will be a subsection of the /Systems section in the CS.
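
As a rough illustration of the `(optionName, defaultValue)` call shape, the sketch below uses a stand-in class. `FakeAgent` and its dictionary of options are hypothetical and exist only for this example; the real method reads from the Configuration Service, and the fallback to `defaultValue` for a missing option is an assumption here.

```python
# Stand-in sketch: FakeAgent is hypothetical and only mimics the
# (optionName, defaultValue) call shape of am_getOption.
class FakeAgent:
    def __init__(self, options):
        # 'options' plays the role of the agent's configuration section
        self._options = dict(options)

    def am_getOption(self, optionName, defaultValue=None):
        # Assumption: fall back to defaultValue when the option is not set
        return self._options.get(optionName, defaultValue)


agent = FakeAgent({"MaxPilotsToSubmit": 50})

# Option present in the (fake) configuration section
max_pilots = agent.am_getOption("MaxPilotsToSubmit", 100)
# Option absent: the supplied default is used
log_level = agent.am_getOption("PilotLogLevel", "INFO")
```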

am_getPollingTime()
am_getShifterProxyLocation()
am_getStopAgentFile()
am_getWatchdogTime()
am_getWorkDirectory()
am_go()
am_initialize(*initArgs)

Common initialization for all the agents.

This is executed every time an agent (re)starts. This is called by the AgentReactor, should not be overridden.

am_monitoringEnabled()
am_removeStopAgentFile()
am_secureCall(functor, args=(), name=False)
am_setModuleParam(optionName, value)
am_setOption(optionName, value)
am_stopExecution()
beginExecution()

This runs first at every cycle.

  1. Check the pilot credentials.

  2. Get some flags and options used later.

  3. Get the site description dictionary.

  4. Get what to send in the pilot wrapper.

endExecution()
execute()

Main execution method (what is called at each agent cycle).

It basically just calls the self.submitPilots() method.

finalize()
getExecutable(queue, proxy=None, jobExecDir='', envVariables=None, **kwargs)

Prepare the full executable for queue

Parameters
  • queue (str) – queue name

  • bundleProxy (bool) – flag indicating whether to bundle the proxy

  • jobExecDir (str) – pilot execution dir (normally an empty string)

Returns

a string with the options for the pilot

Return type

str

getQueueSlots(queue, manyWaitingPilotsFlag)

Get the number of available slots in the queue

initialize()

Initial settings

monitorJobsQueuesPilots(matchingTQs)

Just prints out the status of jobs, queues and pilots in the TQ

sendPilotAccounting(pilotDict)

Send pilot accounting record

sendPilotSubmissionAccounting(siteName, ceName, queueName, numTotal, numSucceeded, status)

Send pilot submission accounting record

Parameters
  • siteName (str) – Site name

  • ceName (str) – CE name

  • queueName (str) – queue Name

  • numTotal (int) – Total number of submissions

  • numSucceeded (int) – Total number of successful submissions

  • status (str) – ‘Succeeded’ or ‘Failed’

Returns

S_OK / S_ERROR
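
S_OK / S_ERROR are DIRAC's standard result dictionaries. The sketch below uses simplified stand-ins written for this example (the real functions ship with DIRAC and may carry extra fields) to show the usual pattern of checking the "OK" key before reading "Value" or "Message". The function `check_status` is hypothetical and merely validates the `status` argument described above.

```python
# Simplified stand-ins for illustration; not the DIRAC implementations.
def S_OK(value=None):
    return {"OK": True, "Value": value}


def S_ERROR(message=""):
    return {"OK": False, "Message": message}


def check_status(status):
    """Hypothetical helper: accept only the two documented status values."""
    if status not in ("Succeeded", "Failed"):
        return S_ERROR("Unknown status: %s" % status)
    return S_OK(status)


def describe(result):
    """Typical caller-side handling of a result dictionary."""
    if not result["OK"]:
        return "error: " + result["Message"]
    return "ok: " + str(result["Value"])


good = describe(check_status("Succeeded"))
bad = describe(check_status("Pending"))
```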

submitPilots()

Go through defined computing elements and submit pilots if necessary and possible

Returns

S_OK/S_ERROR

updatePilotStatus()

Update status of pilots in transient and final states