10.2.5. Job Priority Handling

This page describes how DIRAC handles job priorities.

10.2.5.1. Scenario

There are two user profiles:

  1. Users that submit jobs on behalf of themselves. For instance normal analysis users.

  2. Users that submit jobs on behalf of a group. For instance production users.

In the first case, users are competing for resources, and in the second case users share them. But these two profiles also compete against each other. DIRAC has to provide a way to share the resources available. On top of that users want to specify a “UserPriority” to their jobs. They want to tell DIRAC which of their own jobs should run first and which should run last.

DIRAC implements a priority schema to decide which user gets to run in each moment so a fair share is maintained between the users.

10.2.5.2. Priority implementation

DIRAC handles jobs using TaskQueues. Each TaskQueue contains all the jobs that have the same requirements for a user/group combination. To prioritize user jobs, DIRAC only has to prioritize TaskQueues.

To handle the users competing for resources, DIRAC implements a group priority. Each DIRAC group has a priority defined. This priority can be shared or divided amongst the users in the group depending on the group properties. If the group has the JOB_SHARING property (see Properties module) the priority will be shared, and if it doesn’t the group priority will be divided amongst them. Each TaskQueue will get a priority based on the group and user it belongs to:

  • If it belongs to a JOB_SHARING group, it will get 1/N of the priority being N the number of TaskQueues that belong to the group.

  • If it does NOT, it will get 1/(N*U) being U the number of users in the group with waiting jobs and N the number of TaskQueues of that user/group combination.

Administrators can set different job shares depending on the user’s group that runs the jobs, by setting the JobShare option in the Configuration, in the groups definitions. For example:

lhcb_user  # All LHCb users
{
   Users = aaa
   ...
   Users += zzz
   Properties = NormalUser
   Properties += PrivateLimitedDelegation
   JobShare = 10000

 }
 ....
 lhcb_mc  # this is for MonteCarlo simulations
 {
   Users = ...
   Properties = NormalUser
   Properties += JobSharing
   Properties += ProductionManagement
   JobShare = 300
 }
 ....
 lhcb_data   # this is for real data productions
 {
   Users = ...
   Properties = NormalUser
   Properties += JobSharing
   Properties += ...
   JobShare = 40000
 }

On top of that production administrators can specify a different “priority” to different productions. To reflect that, DIRAC modifies the TaskQueues priorities depending on the “priority” of the jobs in each TaskQueue. Each TaskQueue priority will be P*J being P the TaskQueue priority. J is the sum of all the “UserPriorities” of the jobs inside the TaskQueue divided by the sum of sums of all the “UserPiorities” in the jobs of all the TaskQueues belonging to the group if it has JOB_SHARING or to that user/group combination.

10.2.5.2.1. Dynamic share corrections

DIRAC includes a priority correction mechanism. The idea behind it is to look at the past history and alter the priorities assigned based on it. It can have multiple plugins but currently it only has one. All correctors have a CS section to configure themselves under /Operations/<vo>/<setup>/JobScheduling/ShareCorrections. The option /Operations/<vo>/<setup>/JobScheduling/ShareCorrections/ShareCorrectorsToStart defines witch correctors will be used in each iteration.

10.2.5.2.1.1. WMSHistory corrector

This corrector looks the running jobs for each entity and corrects the priorities to try to maintain the shares defined in the CS. For instance, if an entity has been running three times more jobs than it’s current share, the priority assigned to that entity will be one third of the corresponding priority. The correction is the inverse of the proportional deviation from the expected share.

Multiple time spans can be taken into account by the corrector. Each time span is weighted in the final correction by a factor defined in the CS. A max correction can also be defined for each time span. The next example defines a valid WMSHistory corrector:

ShareCorrections
{
  ShareCorrectorsToStart = WMSHistory
  WMSHistory
  {
    GroupsInstance
    {
      MaxGlobalCorrectionFactor = 3
      WeekSlice
      {
        TimeSpan = 604800
        Weight = 80
        MaxCorrection = 2
      }
      HourSlice
      {
        TimeSpan = 3600
        Weight = 20
        MaxCorrection = 5
      }
    }
    lhcb_userInstance
    {
      Group = lhcb_user
      MaxGlobalCorrectionFactor = 3
      WeekSlice
      {
        TimeSpan = 604800
        Weight = 80
        MaxCorrection = 2
      }
      HourSlice
      {
        TimeSpan = 3600
        Weight = 20
        MaxCorrection = 5
      }
    }
  }
}

The previous example will start the WMSHistory corrector. There will be two instances of the WMSHistory corrector. The only difference between them is that the first one tries to maintain the shares between user groups and the second one tries to maintain the shares between users in the _lhcb_user_ group. It makes no sense to create a third corrector for the users in the _lhcb_prod_ group because that group has JOB_SHARING, so the priority is assigned to the whole group, not to the individuals.

Each WMSHistory corrector instance will correct at most x[ 3 - 1/3] the priorities. That’s defined by the _MaxGlobalCorrectionFactor_. Each instance has two time spans to check. The first one being the last week and the second one being the last hour. The last week time span will weight 80% of the total correction, the last hour will weight the remaining 20%. Each time span can have it’s own max correction. By doing so we can boost the first hour of any new entity but then try to maintain the share for longer periods. The final formula would be:

hourCorrection = max ( min( hourCorrection, hourMax ), 1/hourMax )
weekCorrection = max ( min( weekCorrection, weekMax ), 1/weekMax )
finalCorrection = hourCorrection * hourWeight + weekCorrection * weekWeight
finalCorrection = max ( min( finalCorrection, globalMax ), 1/globalMax )