TransformationCleaningAgent

TransformationCleaningAgent cleans up finalised transformations.

TransformationCleaningAgent options
TransformationCleaningAgent
{
  # MetaData key to use to identify output data
  TransfIDMeta=TransformationID

  # Location of the OutputData. If the OutputDirectories parameter is not set for
  # transformations, only 'MetadataCatalog' has to be used
  DirectoryLocations=TransformationDB,MetadataCatalog

  # Enable or disable, default enabled
  EnableFlag=True

  # How many days to wait before archiving transformations
  ArchiveAfter=7

  # Shifter to use for removal operations. Default is empty, in which case
  # the transformation owner is used for cleanup
  shifterProxy=

  # If enabled, remove files by submitting requests to the RequestManagementSystem
  # instead of during the agent run
  CleanWithRMS=False

  # Which transformation types to clean
  # If not filled, transformation types are taken from
  #   Operations/Transformations/DataManipulation
  # and Operations/Transformations/DataProcessing
  TransformationTypes=

  # Time between cycles in seconds
  PollingTime=3600
}
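
As an illustration of the ArchiveAfter option, the sketch below computes a cutoff date and selects the transformations whose last update is older than it. The function name, the record layout and the use of a last-update timestamp are assumptions made for this example; it does not call DIRAC and is not the agent's actual code.

```python
from datetime import datetime, timedelta


def older_than_cutoff(transformations, archive_after_days, now):
    """Return the IDs of transformations last updated before the cutoff.

    `transformations` is a hypothetical list of dicts with
    'TransformationID' and 'LastUpdate' keys (assumed layout).
    """
    cutoff = now - timedelta(days=archive_after_days)
    return [t["TransformationID"] for t in transformations if t["LastUpdate"] < cutoff]


now = datetime(2024, 1, 10)
sample = [
    {"TransformationID": 1, "LastUpdate": datetime(2024, 1, 1)},  # 9 days old
    {"TransformationID": 2, "LastUpdate": datetime(2024, 1, 8)},  # 2 days old
]
# With ArchiveAfter=7 only the first transformation is past the cutoff.
print(older_than_cutoff(sample, archive_after_days=7, now=now))
```

Passing `now` explicitly keeps the example deterministic; a real agent cycle would presumably use the current time.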
class DIRAC.TransformationSystem.Agent.TransformationCleaningAgent.TransformationCleaningAgent(*args, **kwargs)

Bases: AgentModule

__init__(*args, **kwargs)

Constructor.

am_Enabled()
am_checkStopAgentFile()
am_createStopAgentFile()
am_getControlDirectory()
am_getCyclesDone()
am_getMaxCycles()
am_getModuleParam(optionName)
am_getOption(optionName, defaultValue=None)

Gets an option from the agent’s configuration section. The section will be a subsection of the /Systems section in the CS.
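
The lookup-with-default pattern behind am_getOption(optionName, defaultValue=None) can be sketched without DIRAC as follows. `OptionLookup` and `get_option` are hypothetical stand-ins written for this example, and treating an empty value as "unset" is an assumption of the sketch rather than documented behaviour.

```python
class OptionLookup:
    """Hypothetical stand-in for an agent's configuration section."""

    def __init__(self, section):
        # `section` maps option names to raw string values.
        self._section = dict(section)

    def get_option(self, option_name, default_value=None):
        """Return the configured value, or the default if unset or empty."""
        value = self._section.get(option_name, "")
        if value == "":
            # Assumption of this sketch: an empty value counts as unset.
            return default_value
        return value


options = OptionLookup({"ArchiveAfter": "7", "shifterProxy": ""})
archive_after = int(options.get_option("ArchiveAfter", 7))    # set in the section
polling_time = int(options.get_option("PollingTime", 3600))   # falls back to default
shifter = options.get_option("shifterProxy")                  # empty, so None
```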

am_getPollingTime()
am_getShifterProxyLocation()
am_getStopAgentFile()
am_getWatchdogTime()
am_getWorkDirectory()
am_go()
am_initialize(*initArgs)

Common initialization for all the agents.

This is executed every time an agent (re)starts. It is called by the AgentReactor and should not be overridden.

am_removeStopAgentFile()
am_secureCall(functor, args=(), name=False)
am_setModuleParam(optionName, value)
am_setOption(optionName, value)
am_stopExecution()
archiveTransformation(transID)

This just removes the jobs from the job DB and from the transformation DB.

Parameters:
  • self – self reference

  • transID (int) – transformation ID

beginExecution()
cleanContent(directory)

Wipe out everything from the catalog under folder :directory:

Parameters:
  • self – self reference

  • directory (str) – folder name

cleanMetadataCatalogFiles(transID)

Wipe out the files from the catalog.

cleanTransformation(transID)

This removes what was produced by the supplied transformation, leaving only some info and log in the transformation DB.

cleanTransformationLogFiles(directory)

Clean up the transformation logs from directory :directory:

Parameters:
  • self – self reference

  • directory (str) – folder name

cleanTransformationTasks(transID)

Clean the tasks from the WMS, or from the RMS if it is a DataManipulation transformation.
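
The choice between the two systems can be pictured as a simple dispatch on the transformation type. This is a minimal sketch: `task_backend` is a hypothetical helper, and the type names are example values rather than the contents of any real Operations section.

```python
def task_backend(transformation_type, data_manipulation_types):
    """Pick the system from which the tasks of a transformation would be cleaned."""
    if transformation_type in data_manipulation_types:
        return "RMS"  # data manipulation tasks are requests
    return "WMS"      # everything else is treated as jobs


# Example type names, for illustration only.
data_manipulation = {"Replication", "Removal"}

print(task_backend("Replication", data_manipulation))
print(task_backend("MCSimulation", data_manipulation))
```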

endExecution()
execute()

Execution of one agent cycle.

Parameters:
  • self – self reference

finalize()

Runs only at finalization: cleans ancient transformations (remnants).

  1. Get the transformation IDs of jobs that are older than 1 year.

  2. Find the status of those transformations. Those in status “Cleaned” or “Archived” are cleaned and archived (again).

Why do this here? Basically, it’s a race:

  1. The production manager submits a transformation.

  2. The TransformationAgent, and a bit later the WorkflowTaskAgent, put the transformation in their internal queue, so eventually during their (long-ish) cycle they’ll work on it.

  3. One minute after creating the transformation, the production manager cleans it (by hand, for whatever reason), so the status is changed to “Cleaning”.

  4. The TransformationCleaningAgent cleans what has been created (maybe nothing), then sets the transformation status to “Cleaned” or “Archived”.

  5. A bit later the TransformationAgent, and later the WorkflowTaskAgent, kick in, creating tasks and jobs for a production that’s effectively cleaned (but these 2 agents don’t know it yet).

Of course, one could add a final check in TransformationAgent or WorkflowTaskAgent, but these 2 agents already do a lot and are pretty heavy. So, we should just clean from time to time. The cleanup added here runs only when the agent finalizes, and it’s a fairly light operation anyway.
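
The two finalization steps can be sketched as follows. This is not the agent's code: the function names, the status mapping and the `clean`/`archive` callables are hypothetical and stand in for the real database queries and for cleanTransformation and archiveTransformation.

```python
def remnant_transformations(old_job_trans_ids, status_by_id):
    """Return the IDs that still have old jobs although already cleaned.

    old_job_trans_ids: transformation IDs found on jobs older than one year.
    status_by_id: hypothetical mapping of transformation ID to status.
    """
    finished = {"Cleaned", "Archived"}
    return sorted(
        trans_id
        for trans_id in set(old_job_trans_ids)
        if status_by_id.get(trans_id) in finished
    )


def clean_remnants(old_job_trans_ids, status_by_id, clean, archive):
    """Clean and archive (again) every remnant transformation."""
    remnants = remnant_transformations(old_job_trans_ids, status_by_id)
    for trans_id in remnants:
        clean(trans_id)
        archive(trans_id)
    return remnants


calls = []
statuses = {101: "Cleaned", 102: "Active", 103: "Archived"}
done = clean_remnants(
    [101, 101, 102, 103],  # several old jobs may share one transformation
    statuses,
    clean=lambda t: calls.append(("clean", t)),
    archive=lambda t: calls.append(("archive", t)),
)
```

Transformation 102 is skipped because it is still active; only jobs left behind by the race described above are touched.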

getTransformationDirectories(transID)

Get the directories for the supplied transformation from the transformation system.

These directories are used by removeTransformationOutput and cleanTransformation for removing output.

Parameters:
  • self – self reference

  • transID (int) – transformation ID
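
One way to picture how the DirectoryLocations option could drive this lookup is the sketch below, which merges the directories reported by each enabled location. `collect_directories`, the `sources` mapping and the example paths are all hypothetical; the real method may filter or order the directories differently.

```python
def collect_directories(locations, sources):
    """Merge the directories reported by each enabled location.

    locations: names taken from DirectoryLocations.
    sources: hypothetical mapping of location name to a callable
             returning a list of directory paths.
    """
    directories = []
    for location in locations:
        lookup = sources.get(location)
        if lookup is None:
            continue  # unknown location: skipped in this sketch
        for path in lookup():
            if path not in directories:
                directories.append(path)
    return sorted(directories)


# Made-up paths, for illustration only.
sources = {
    "TransformationDB": lambda: ["/vo/prod/00000042/DATA"],
    "MetadataCatalog": lambda: ["/vo/prod/00000042/DATA", "/vo/prod/00000042/LOG"],
}
found = collect_directories(["TransformationDB", "MetadataCatalog"], sources)
```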

initialize()

Agent initialisation.

Reads and sets the configuration options.

Parameters:
  • self – self reference

removeTransformationOutput(transID)

This just removes any mention of the output data from the catalog and the storage.