TransformationCleaningAgent
TransformationCleaningAgent cleans up finalised transformations.
TransformationCleaningAgent
{
# MetaData key to use to identify output data
TransfIDMeta=TransformationID
# Location of the OutputData. If the OutputDirectories parameter is not set for
# transformations, only 'MetadataCatalog' has to be used
DirectoryLocations=TransformationDB,MetadataCatalog
# Enable or disable, default enabled
EnableFlag=True
# How many days to wait before archiving transformations
ArchiveAfter=7
# Shifter to use for removal operations, default is empty and
# using the transformation owner for cleanup
shifterProxy=
# Which transformation types to clean
# If not filled, transformation types are taken from
# Operations/Transformations/DataManipulation
# and Operations/Transformations/DataProcessing
TransformationTypes=
# Time between cycles in seconds
PollingTime = 3600
}
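As a rough illustration of how two of these options might be interpreted, the sketch below computes the archiving cutoff implied by ArchiveAfter and resolves the transformation types using the fallback described in the comments above. It is plain Python rather than DIRAC code: the function names `archive_cutoff` and `resolve_transformation_types`, and the sample type lists, are assumptions made for this example only.

```python
from datetime import datetime, timedelta


def archive_cutoff(now, archive_after_days=7):
    """Return the timestamp before which a transformation is old enough to archive."""
    return now - timedelta(days=archive_after_days)


def resolve_transformation_types(configured, data_manipulation, data_processing):
    """Use the agent's TransformationTypes if set, otherwise both Operations lists."""
    if configured:
        return sorted(configured)
    return sorted(set(data_manipulation) | set(data_processing))


# Hypothetical values standing in for what the configuration would provide.
cutoff = archive_cutoff(datetime(2024, 1, 10, 12, 0, 0), archive_after_days=7)
types = resolve_transformation_types([], ["Replication", "Removal"], ["MCSimulation"])
```

With an empty TransformationTypes option, the resolved list is the union of the two Operations lists; a non-empty option takes precedence.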
- class DIRAC.TransformationSystem.Agent.TransformationCleaningAgent.TransformationCleaningAgent(*args, **kwargs)
Bases:
AgentModule
- class TransformationCleaningAgent
- Parameters:
dm (DataManager) – DataManager instance
transClient (TransformationClient) – TransformationClient instance
metadataClient (FileCatalogClient) – FileCatalogClient instance
- __init__(*args, **kwargs)
Constructor.
- am_Enabled()
- am_checkStopAgentFile()
- am_createStopAgentFile()
- am_getControlDirectory()
- am_getCyclesDone()
- am_getMaxCycles()
- am_getModuleParam(optionName)
- am_getOption(optionName, defaultValue=None)
Gets an option from the agent’s configuration section. The section will be a subsection of the /Systems section in the CS.
- am_getPollingTime()
- am_getShifterProxyLocation()
- am_getStopAgentFile()
- am_getWatchdogTime()
- am_getWorkDirectory()
- am_go()
- am_initialize(*initArgs)
Common initialization for all the agents.
This is executed every time an agent (re)starts. This is called by the AgentReactor, should not be overridden.
- am_removeStopAgentFile()
- am_secureCall(functor, args=(), name=False)
- am_setModuleParam(optionName, value)
- am_setOption(optionName, value)
- am_stopExecution()
- archiveTransformation(transID)
This just removes the jobs from the jobDB and the transformation DB.
- Parameters:
self – self reference
transID (int) – transformation ID
- beginExecution()
- cleanContent(directory, transID)
Wipe out everything from the catalog under the folder directory.
- Parameters:
self – self reference
directory (str) – folder name
- cleanMetadataCatalogFiles(transID)
wipe out files from catalog
- cleanTransformation(transID)
This removes what was produced by the supplied transformation, leaving only some info and log in the transformation DB.
- cleanTransformationLogFiles(directory)
Clean up transformation logs from the given directory.
- Parameters:
self – self reference
directory (str) – folder name
- cleanTransformationTasks(transID)
clean tasks from WMS, or from the RMS if it is a DataManipulation transformation
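The choice between the two systems can be pictured as a simple dispatch on the transformation type. This is a sketch only: `DATA_MANIPULATION_TYPES`, `clean_wms_tasks` and `clean_rms_requests` are hypothetical stand-ins, not the agent's actual attributes or methods.

```python
# Hypothetical set; in DIRAC the DataManipulation types come from the
# Operations/Transformations/DataManipulation configuration option.
DATA_MANIPULATION_TYPES = {"Replication", "Removal"}


def clean_wms_tasks(trans_id):
    """Placeholder for removing the transformation's jobs from the WMS."""
    return f"WMS tasks of transformation {trans_id} cleaned"


def clean_rms_requests(trans_id):
    """Placeholder for removing the transformation's requests from the RMS."""
    return f"RMS requests of transformation {trans_id} cleaned"


def clean_transformation_tasks(trans_id, trans_type):
    """Route the cleanup to the RMS for DataManipulation types, else to the WMS."""
    if trans_type in DATA_MANIPULATION_TYPES:
        return clean_rms_requests(trans_id)
    return clean_wms_tasks(trans_id)
```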
- endExecution()
- execute()
Execution of one agent cycle.
- Parameters:
self – self reference
- finalize()
Only at finalization: cleans ancient transformations (remnants).
It gets the transformation IDs of jobs that are older than 1 year, then finds the status of those transformations. Those that are “Cleaned” or “Archived” will be cleaned and archived (again).
Why do this here? Basically, it’s a race:
1. The production manager submits a transformation.
2. The TransformationAgent, and a bit later the WorkflowTaskAgent, put the transformation in their internal queue, so eventually, during their (long-ish) cycle, they will work on it.
3. One minute after creating the transformation, the production manager cleans it (by hand, for whatever reason), so the status is changed to “Cleaning”.
4. The TransformationCleaningAgent cleans what has been created (maybe nothing), then sets the transformation status to “Cleaned” or “Archived”.
5. A bit later the TransformationAgent, and later the WorkflowTaskAgent, kick in, creating tasks and jobs for a production that is effectively cleaned (but these two agents don’t know it yet).
Of course, one could make one final check in TransformationAgent or WorkflowTaskAgent, but these two agents already do a lot of work and are pretty heavy, so we should just clean from time to time. This cleanup is done only when the agent finalizes, and it is a fairly light operation anyway.
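The selection described above (old jobs whose transformation is already “Cleaned” or “Archived”) can be sketched in plain Python. The function `find_remnants`, its arguments and the sample data are hypothetical; the real agent queries the DIRAC databases instead of in-memory collections.

```python
from datetime import datetime, timedelta

REMNANT_STATUSES = {"Cleaned", "Archived"}


def find_remnants(job_records, transformation_status, now, max_age_days=365):
    """Return IDs of already-cleaned transformations that still own old jobs.

    job_records: iterable of (transformation ID, job submission time) pairs.
    transformation_status: mapping of transformation ID to its status.
    """
    cutoff = now - timedelta(days=max_age_days)
    # Transformation IDs that have at least one job older than the cutoff.
    old_ids = {trans_id for trans_id, submitted in job_records if submitted < cutoff}
    return sorted(t for t in old_ids if transformation_status.get(t) in REMNANT_STATUSES)


# Made-up sample data for illustration only.
now = datetime(2024, 6, 1)
jobs = [
    (101, datetime(2022, 5, 1)),
    (102, datetime(2024, 5, 1)),
    (103, datetime(2023, 1, 15)),
    (104, datetime(2022, 1, 1)),
]
statuses = {101: "Cleaned", 102: "Archived", 103: "Active", 104: "Archived"}
remnants = find_remnants(jobs, statuses, now)
```

In this sample, transformation 102 is skipped because its job is recent, and 103 is skipped because it is still active.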
- getTransformationDirectories(transID)
Get the directories for the supplied transformation from the transformation system.
These directories are used by removeTransformationOutput and cleanTransformation for removing output.
- Parameters:
self – self reference
transID (int) – transformation ID
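One way to picture how the DirectoryLocations option could drive this lookup is the sketch below, which merges the directories reported by each enabled location and drops duplicates. The function signature and the two lookup callables are assumptions for illustration; they do not reproduce the agent's actual client calls.

```python
def get_transformation_directories(trans_id, locations, from_transformation_db, from_metadata_catalog):
    """Collect output directories from each enabled location, without duplicates."""
    directories = []
    if "TransformationDB" in locations:
        directories.extend(from_transformation_db(trans_id))
    if "MetadataCatalog" in locations:
        directories.extend(from_metadata_catalog(trans_id))
    # dict.fromkeys keeps the first occurrence of each path and preserves order.
    return list(dict.fromkeys(directories))


# Hypothetical lookups returning made-up paths.
def fake_db_lookup(trans_id):
    return [f"/vo/prod/{trans_id:08d}/data"]


def fake_catalog_lookup(trans_id):
    return [f"/vo/prod/{trans_id:08d}/data", f"/vo/prod/{trans_id:08d}/log"]


dirs = get_transformation_directories(
    42, ["TransformationDB", "MetadataCatalog"], fake_db_lookup, fake_catalog_lookup
)
```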
- initialize()
Agent initialisation: reads and sets the configuration options.
- Parameters:
self – self reference
- removeTransformationOutput(transID)
This just removes any mention of the output data from the catalog and storage