Developing Databases

This is a quick guide about developing classes interacting with MySQL databases. DIRAC supports also Elasticsearch NoSQL database, but it is not part of this document.

Before starting developing databases, you have to make sure that MySQL is installed, as well as python-mysql, as explained in Editing DIRAC code, and make sure that MySQL service is on.

Develop the database

To develop a new database structure it requires to design a database schema and develop the python database class that will interact with the database itself.

The DIRAC.Core.Base.DB module provides a base class for defining and interacting with MySQL DBs. The example that follows make use of it. There is also the possibility to define the DB using sqlalchemy python package, (and some DIRAC DBs do indeed that) but this is not covered in this document.

A simple example of the python class of a database follows:

""" A test DB in DIRAC, using MySQL as backend
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from DIRAC.Core.Base.DB import DB


class AtomDB(DB):
    def __init__(self):
        DB.__init__(self, "AtomDB", "Test/AtomDB")
        retVal = self.__initializeDB()
        if not retVal["OK"]:
            raise Exception("Can't create tables: %s" % retVal["Message"])

    def __initializeDB(self):
        """
        Create the tables
        """
        retVal = self._query("show tables")
        if not retVal["OK"]:
            return retVal

        tablesInDB = [t[0] for t in retVal["Value"]]
        tablesD = {}

        if "atom_mytable" not in tablesInDB:
            tablesD["atom_mytable"] = {
                "Fields": {"Id": "INTEGER NOT NULL AUTO_INCREMENT", "Stuff": "VARCHAR(64) NOT NULL"},
                "PrimaryKey": ["Id"],
            }

        return self._createTables(tablesD)

    def addStuff(self, something):
        return self.insertFields("atom_mytable", ["stuff"], [something])

Let’s break down the example. The first two lines are simple includes required. Then the class definition. The name of the class should be the same name as the file where it is.

So AtomDB should be in AtomDB.py. The class should inherit from the DB class. The DB class includes all the methods necessary to access, query, modify… the database.

The first line in the __init__ method should be the initialization of the parent (DB) class. That initialization requires 2 or 3 parameters:

  1. Logging name of the database. This name will be used in all the logging messages generated by this class.

  2. Full name of the database. With System/Name. So it can know where in the CS look for the initialization parameters. In this case it would be /Systems/Test/<instance name>/Databases/AtomDB.

  3. Boolean for the debug flag

After the initialization of the DB parent class we call our own __initializeDB method. This method (following __init__ in the example) first retrieves all the tables already in the database. Then for each table that has not yet been created then it creates a definition of the table and creates all the missing tables. Each table definition includes all the fields with their value type, primary keys, extra indexes… By default all tables will be created using the InnoDB engine.

The addStuff method simply inserts into the created table the argument value.

Configure the database access

The last step is to configure the database credentials for DIRAC to be able to connect. In our previous example the CS path was /Systems/Test/<instance name>/Databases/AtomDB. That section should contain:

Systems
{
  Test
  {
    <instance name>
    {
      Databases
      {
        AtomDB
        {
          Host = localhost
          User = yourusername
          Password = yourpasswd
          DBName = yourdbname
        }
     }
   }
}

In a production environment, the “Password” should be defined in a non-accessible file, while the rest of the configuration can go in the central Configuration Service.

If you encounter any problem with sockets, you should replace “localhost” (DIRAC/Systems/Test/<instance name>/AtomDB/Host) by 127.0.0.1.

Keep in mind that <instance name> is the name of the instance defined under /DIRAC/Setups/<your setup>/Test and <your setup> is defined under /DIRAC/Setup.

Once that is defined you’re ready to go.

Trying the database from the command line

You can try to access the database by doing:

from DIRAC.TestSystem.DB.AtomDB import AtomDB

try:
  atomdb = AtomDB()
except Exception:
   print "Oops. Something went wrong..."
   raise
result = atomdb.addStuff('something')
if not result['OK']:
  print "Error while inserting into db:", result['Message']  # Here, in DIRAC, you better use the gLogger
else:
  print result['Value']  # Here, in DIRAC, you better use the gLogger