10.2.7. Job Priority Handling
This page describes how DIRAC handles job priorities.
10.2.7.1. Scenario
There are two user profiles:
Users that submit jobs on behalf of themselves, for instance normal analysis users.
Users that submit jobs on behalf of the group, for instance production users.
In the first case users compete for resources, and in the second case they share them. These two profiles also compete against each other, so DIRAC has to provide a way to share the available resources between them. On top of that, users want to assign a “UserPriority” to their jobs: they want to tell DIRAC which of their own jobs should run first and which should run last.
DIRAC implements a priority scheme to decide which user gets to run at any given moment, so that a fair share of CPU is kept between users.
10.2.7.2. Priority implementation
DIRAC handles jobs using TaskQueues. Each TaskQueue contains all the jobs that have the same requirements for a user/group combination. To prioritize user jobs, DIRAC only has to prioritize TaskQueues.
To handle users competing for resources, DIRAC implements a group priority. Each DIRAC group has a priority defined. Depending on the group properties, this priority is either shared or divided amongst the users in the group: if the group has the JOB_SHARING property the priority is shared; otherwise it is divided amongst the users. Each TaskQueue gets a priority based on the group and user it belongs to:
If it belongs to a JOB_SHARING group, it gets 1/N of the group priority, where N is the number of TaskQueues that belong to the group.
If it does NOT, it gets 1/(N*U) of the group priority, where U is the number of users in the group with waiting jobs and N is the number of TaskQueues of that user/group combination.
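The two rules above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not DIRAC's actual code: the function name `base_priorities`, the `tq_owners` mapping and the example values are all hypothetical, and the sketch assumes every TaskQueue passed in belongs to the same group and holds waiting jobs.

```python
from collections import Counter


def base_priorities(group_priority, job_sharing, tq_owners):
    """Split a group priority across its TaskQueues.

    tq_owners maps a (hypothetical) TaskQueue id to the user owning it.
    All TaskQueues are assumed to belong to one group and to hold waiting jobs.
    """
    if job_sharing:
        # JOB_SHARING: every TaskQueue gets 1/N, N = TaskQueues in the group.
        n = len(tq_owners)
        return {tq: group_priority / n for tq in tq_owners}

    # No JOB_SHARING: 1/(N*U), N = TaskQueues of that user, U = users in the group.
    tqs_per_user = Counter(tq_owners.values())
    u = len(tqs_per_user)
    return {
        tq: group_priority / (tqs_per_user[owner] * u)
        for tq, owner in tq_owners.items()
    }


owners = {"tq1": "alice", "tq2": "alice", "tq3": "bob"}
print(base_priorities(120, True, owners))   # {'tq1': 40.0, 'tq2': 40.0, 'tq3': 40.0}
print(base_priorities(120, False, owners))  # {'tq1': 30.0, 'tq2': 30.0, 'tq3': 60.0}
```

In the second call each user ends up with the same total (60), regardless of how many TaskQueues they have, which is the sense in which the group priority is divided amongst users.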
On top of that, users can specify a “UserPriority” for their jobs. To reflect that, DIRAC modifies the TaskQueue priorities depending on the “UserPriority” of the jobs in each TaskQueue. The final priority of each TaskQueue is P*J, where P is the TaskQueue priority computed above and J is the sum of the “UserPriorities” of the jobs inside the TaskQueue divided by the sum of the “UserPriorities” of the jobs in all the TaskQueues belonging to the group (if it has JOB_SHARING) or to that user/group combination (if it does not).
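As a rough sketch of the P*J step, the following Python function scales each TaskQueue priority by its share of “UserPriority”. The function name `apply_user_priority`, its arguments and the sample numbers are hypothetical and are not taken from DIRAC. The sketch assumes all TaskQueues belong to one group and that the “UserPriority” values in each scope sum to a positive number.

```python
from collections import defaultdict


def apply_user_priority(base, user_priorities, tq_owners, job_sharing):
    """Return P*J for every TaskQueue.

    base            -- TaskQueue id -> P (priority before the UserPriority step)
    user_priorities -- TaskQueue id -> list of UserPriority values of its jobs
    tq_owners       -- TaskQueue id -> owning user (all in the same group)
    job_sharing     -- True if the group has the JOB_SHARING property
    """
    # Sum of UserPriorities inside each TaskQueue (numerator of J).
    tq_sums = {tq: sum(values) for tq, values in user_priorities.items()}

    # Denominator of J: total over the whole group (JOB_SHARING)
    # or over the TaskQueues of the same user (no JOB_SHARING).
    def scope_of(tq):
        return None if job_sharing else tq_owners[tq]

    scope_totals = defaultdict(float)
    for tq, tq_sum in tq_sums.items():
        scope_totals[scope_of(tq)] += tq_sum

    # Assumes every scope total is positive; a zero total would divide by zero.
    return {
        tq: base[tq] * tq_sum / scope_totals[scope_of(tq)]
        for tq, tq_sum in tq_sums.items()
    }


owners = {"tq1": "alice", "tq2": "alice", "tq3": "bob"}
base = {"tq1": 30.0, "tq2": 30.0, "tq3": 60.0}
jobs = {"tq1": [1, 1], "tq2": [2, 4], "tq3": [5]}
print(apply_user_priority(base, jobs, owners, job_sharing=False))
# {'tq1': 7.5, 'tq2': 22.5, 'tq3': 60.0}
```

Here alice's two TaskQueues hold “UserPriorities” summing to 2 and 6, so J is 0.25 and 0.75 respectively, while bob's single TaskQueue has J equal to 1 and keeps its priority unchanged.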