Implementing a Unit of Work - Handling Domain Objects through a Transactional Model

Even in the most basic scenario you can picture, where the logic of an application’s core boils down to pulling a few records from the database, bringing some domain objects to life, and then dumping them to the screen through the API of some basic rendering mechanism, there’s always an ongoing transaction behind the scenes whose most expensive facet often gets blurred beneath the appealing outward influence of the user interface. If you think this through, you’ll notice that the crux of the transaction lies, not surprisingly, in the heap of database trips, even though they can be largely mitigated by a clever caching strategy. In relatively small applications, where just a few basic domain objects are involved in each transaction and the hike to the database is mostly for retrieving data, a simple caching system dropped into the appropriate place can certainly help get things sorted out efficiently.

Sad but true, reality is a ruthless creature, always shouting at us things radically different from the sweet ones we’d rather hear. Because of the intrinsic, unavoidable mutability of domain objects (with a few scarce exceptions, when the dependencies of domain classes are modelled around the concept of immutable Value Objects), chances are that some objects will need to be modified across multiple requests, and new ones will be put in memory in response to some user-related event. In short, even dummy CRUD applications that don’t encapsulate extensive chunks of additional business logic can quickly become bloated and generate a lot of overhead under the hood when it comes to performing multiple database writes. What if they reach a point where it’s necessary to handle a huge number of domain objects, which must be persisted and removed in sync, without compromising what we programming plebs loosely call data integrity?

Let’s be honest with ourselves (at least once). Neither the lofty data source architectural patterns that we could pick up along the way, nor that cool new approach we might have figured out overnight, can satisfactorily tackle something as predictable and mundane as writing out and removing multiple sets of data from storage. In light of this, should we just give up and call the issue a lost cause? Admittedly, the question is rhetorical. In fact, it’s feasible to wrap collections of domain objects inside a fairly flexible business transactional model and perform several database writes/deletes in one go, which avoids breaking the process down into smaller, more expensive database calls that lead straight to the session-per-operation antipattern.

Moreover, this transaction-based mechanism rests on the academic formalities of a design pattern commonly known as Unit of Work (UOW), which is widely implemented in popular enterprise-level packages such as Hibernate. On the flip side, PHP still has few UOWs running in production, except in a few well-trusted libraries like Doctrine and RedBeanPHP, which apply the pattern at different levels in order to process and coordinate operations on entities. Despite this, it’s certainly educational to take a closer look at the benefits a UOW provides, so you can see whether they meet your requirements.

Registering Domain Objects with a Unit of Work

In his book Patterns of Enterprise Application Architecture, Martin Fowler discusses two mainstream approaches that can be followed when it comes to implementing a UOW: the first makes the UOW directly responsible for registering or queuing domain objects for insertion, update, or deletion, and the second shifts this responsibility over to the domain objects themselves. In this case, since I’d like the domain model to encapsulate only my business logic and remain agnostic about any form of persistence that may exist further down in other layers, I’m going to stick to the commandments of the first option. Either way, you’re free to pick the approach you feel fits the bill best. A lightweight implementation of a UOW might look like this:

<?php
namespace Model\Repository;
use Model\EntityInterface;

interface UnitOfWorkInterface
{
    public function fetchById($id);
    public function registerNew(EntityInterface $entity);
    public function registerClean(EntityInterface $entity);
    public function registerDirty(EntityInterface $entity);
    public function registerDeleted(EntityInterface $entity);
    public function commit();
    public function rollback();
    public function clear();
}

<?php
namespace Model\Repository;
use Mapper\DataMapperInterface,
    Library\Storage\ObjectStorageInterface,
    Model\EntityInterface;

class UnitOfWork implements UnitOfWorkInterface
{
    const STATE_NEW     = "NEW";
    const STATE_CLEAN   = "CLEAN";
    const STATE_DIRTY   = "DIRTY";
    const STATE_REMOVED = "REMOVED";
    
    protected $dataMapper;
    protected $storage;

    public function __construct(DataMapperInterface $dataMapper, ObjectStorageInterface $storage) {
        $this->dataMapper = $dataMapper;
        $this->storage = $storage;
    }
    
    public function getDataMapper() {
        return $this->dataMapper;
    }
    
    public function getObjectStorage() {
        return $this->storage;
    }
    
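    public function fetchById($id) {
        $entity = $this->dataMapper->fetchById($id);
        // The mapper returns null when no row matches; only track real entities.
        if ($entity !== null) {
            $this->registerClean($entity);
        }
        return $entity;
    }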
    
    public function registerNew(EntityInterface $entity) {
        $this->registerEntity($entity, self::STATE_NEW);
        return $this;
    }
    
    public function registerClean(EntityInterface $entity) {
        $this->registerEntity($entity, self::STATE_CLEAN);
        return $this;
    }
    
    public function registerDirty(EntityInterface $entity) {
        $this->registerEntity($entity, self::STATE_DIRTY);
        return $this;
    }
    
    public function registerDeleted(EntityInterface $entity) {
        $this->registerEntity($entity, self::STATE_REMOVED);
        return $this;
    }
    
    protected function registerEntity($entity, $state = self::STATE_CLEAN) {
        $this->storage->attach($entity, $state);
    }
    
    public function commit() {
        foreach ($this->storage as $entity) {
            switch ($this->storage[$entity]) {
                case self::STATE_NEW:
                case self::STATE_DIRTY: 
                    $this->dataMapper->save($entity);
                    break;
                case self::STATE_REMOVED:
                    $this->dataMapper->delete($entity);
            }
        }
        $this->clear();
    }
    
    public function rollback() {
        // your custom rollback implementation goes here
    }
    
    public function clear() {
        $this->storage->clear();
        return $this;
    }  
}

It should be clear that a UOW is nothing but plain, in-memory object storage which keeps track of which domain objects are scheduled for insertion, update, and removal. In short, the convention boils down to something along these lines: domain objects that need to be added to the storage will be registered “NEW”; those being updated will be marked “DIRTY”; the ones flagged “REMOVED” will be… yep, dropped from the database. In addition, any object registered “CLEAN” will be kept frozen and safe in memory until the client code explicitly registers it under a different state. Of course, the method that performs these persistence-related operations in one go is commit(), which exploits the functionality of a still undefined data mapper to get access to the persistence layer. It will be even easier to understand the UOW’s inner workings if I show you the implementation of the collaborators injected in its constructor, so here are the components that compose the object storage module:

<?php
namespace Library\Storage;

interface ObjectStorageInterface extends \Countable, \Iterator, \ArrayAccess
{
    public function attach($object, $data = null);
    public function detach($object);
    public function clear();
}

<?php
namespace Library\Storage;

class ObjectStorage extends \SplObjectStorage implements ObjectStorageInterface
{
    public function clear() {
        // Remove every object by passing a copy of the storage, so the
        // collection being iterated is not the one being emptied.
        $this->removeAll(clone $this);
    } 
}
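To see why SplObjectStorage suits this job, here is a minimal sketch of the state tracking in isolation. It uses a plain stdClass object as a stand-in for an entity (an assumption made only for illustration) and the array-access syntax, which is equivalent to calling attach($object, $data):

```php
<?php
// Hypothetical stand-in for a domain entity; any object can act as the key.
$user = new stdClass();
$user->name = "John Doe";

$storage = new SplObjectStorage();

// Track the object together with its state, as registerEntity() does.
$storage[$user] = "CLEAN";

// Storing the same object again replaces its state instead of adding
// a second entry, so each entity is tracked exactly once.
$storage[$user] = "DIRTY";

$count = count($storage);   // number of tracked objects
$state = $storage[$user];   // state currently associated with $user
```

Because the object itself is the key, the UOW never needs the entity to carry an ID in order to track it, which matters for objects registered as new.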

In this particular case, I decided to use a slightly customized implementation of the SplObjectStorage class to register domain objects, along with their related states, with the UOW without much fuss, even though pretty much the same can be achieved using plain arrays. Again, it’s up to you to register the domain objects using the method that best accommodates your needs. With the custom ObjectStorage class in place, let’s take a look at the implementation of the aforementioned data mapper:

<?php
namespace Mapper;
use Model\EntityInterface;

interface DataMapperInterface
{
    public function fetchById($id);
    public function fetchAll(array $conditions = array());
    public function insert(EntityInterface $entity);
    public function update(EntityInterface $entity);
    public function save(EntityInterface $entity);
    public function delete(EntityInterface $entity);
}

<?php
namespace Mapper;
use InvalidArgumentException,
    Library\Database\DatabaseAdapterInterface,
    Model\Collection\EntityCollectionInterface,
    Model\EntityInterface;

abstract class AbstractDataMapper implements DataMapperInterface
{
    protected $adapter;
    protected $collection;
    protected $entityTable;
    
    public function __construct(DatabaseAdapterInterface $adapter, EntityCollectionInterface $collection, $entityTable = null) {
        $this->adapter = $adapter;
        $this->collection = $collection;
        if ($entityTable !== null) {
            $this->setEntityTable($entityTable);
        }
    }
        
    public function setEntityTable($entityTable) {
        if (!is_string($entityTable) || empty($entityTable)) {
            throw new InvalidArgumentException(
                "The entity table is invalid.");
        }
        $this->entityTable = $entityTable;
        return $this;
    }
    
    public function fetchById($id) {
        $this->adapter->select($this->entityTable, 
            array("id" => $id));
        if (!$row = $this->adapter->fetch()) {
            return null; 
        }
        return $this->loadEntity($row);
    }
    
    public function fetchAll(array $conditions = array()) {
        $this->adapter->select($this->entityTable, $conditions);
        $rows = $this->adapter->fetchAll();
        return $this->loadEntityCollection($rows);
    }
    
    public function insert(EntityInterface $entity) {
        return $this->adapter->insert($this->entityTable,
            $entity->toArray());
    }
    
    public function update(EntityInterface $entity) {
        return $this->adapter->update($this->entityTable,
            $entity->toArray(), "id = $entity->id");
    }
    
    public function save(EntityInterface $entity) {
        // Entities without an ID are new; everything else is updated in place.
        return !isset($entity->id) 
            ? $this->insert($entity) 
            : $this->update($entity);   
    }
    
    public function delete(EntityInterface $entity) {
        return $this->adapter->delete($this->entityTable,
            "id = $entity->id");
    }
    
    protected function loadEntityCollection(array $rows) {
        $this->collection->clear();
        foreach ($rows as $row) {
            $this->collection[] = $this->loadEntity($row);
        }
        return $this->collection;
    }
    
    abstract protected function loadEntity(array $row);
}

The AbstractDataMapper hides the bulk of the logic required for pulling domain objects in and out of the database behind a pretty standard API. To make things even easier, it’d also be nice to derive a refined implementation of it, so we can easily test the UOW with a few sample user objects. Here’s how this extra mapping subclass looks:

<?php
namespace Mapper;
use Model\User;

class UserMapper extends AbstractDataMapper
{
    protected $entityTable = "users";
    
    protected function loadEntity(array $row) {
        return new User(array(
            "id"    => $row["id"], 
            "name"  => $row["name"], 
            "email" => $row["email"],
            "role"  => $row["role"]));
    }
}

At this point we could get our hands on the UOW and see if its transactional schema delivers what it promises. But before we do, we really should drop at least a few domain objects in memory, so we can get them neatly registered with the UOW. So let’s now define a primitive Domain Model which will be charged with supplying the objects in question.

Defining a basic Domain Model

Frankly speaking, there are several ways to implement a functional Domain Model (most likely one per developer living and breathing out there). Since in this case I want the process to be both painless and short, the model I’ll be using for testing the UOW will be composed of just a prototypical entity class, along with a derivative charged with spawning basic user objects:

<?php
namespace Model;

interface EntityInterface
{
    public function setField($name, $value);
    public function getField($name);
    public function fieldExists($name);
    public function removeField($name);
    public function toArray();      
}

<?php
namespace Model;
use InvalidArgumentException;

abstract class AbstractEntity implements EntityInterface
{
    protected $fields = array(); 
    protected $allowedFields = array(); 

    public function __construct(array $fields = array()) {
        if (!empty($fields)) {
            foreach ($fields as $name => $value) {
                $this->$name = $value;
            } 
        }
    }
    
    public function setField($name, $value) {
        return $this->__set($name, $value);
    }
    
    public function getField($name) {
        return $this->__get($name);
    }
    
    public function fieldExists($name) {
        return $this->__isset($name);
    }
    
    public function removeField($name) {
        return $this->__unset($name);
    }
    
    public function toArray() {
        return $this->fields;
    }
             
    public function __set($name, $value) {
        $this->checkAllowedFields($name);
        $mutator = "set" . ucfirst(strtolower($name));
        if (method_exists($this, $mutator) && 
            is_callable(array($this, $mutator))) {
            $this->$mutator($value);
        }
        else {
            $this->fields[$name] = $value;
        }
        return $this;                 
    }
    
    public function __get($name) {
        $this->checkAllowedFields($name);
        $accessor = "get" . ucfirst($name);
        if (method_exists($this, $accessor) &&
            is_callable(array($this, $accessor))) {
            return $this->$accessor();
        }
        if (!$this->__isset($name)) {
            throw new InvalidArgumentException(
                "The field '$name' has not been set for this entity yet.");
        }
        return $this->fields[$name];
    }
    
    public function __isset($name) {
        $this->checkAllowedFields($name);
        return isset($this->fields[$name]);
    }
    
    public function __unset($name) {
        $this->checkAllowedFields($name);
        if (!$this->__isset($name)) {
            throw new InvalidArgumentException(
                "The field '$name' has not been set for this entity yet.");
        }
        unset($this->fields[$name]);
        return $this;
    }
    
    protected function checkAllowedFields($field) {
        if (!in_array($field, $this->allowedFields)) {
            throw new InvalidArgumentException(
                "The requested operation on the field '$field' is not allowed for this entity.");
        }
    }
}

<?php
namespace Model;
use BadMethodCallException,
    InvalidArgumentException;

class User extends AbstractEntity
{
    const ADMINISTRATOR_ROLE = "Administrator";
    const GUEST_ROLE         = "Guest";
    
    protected $allowedFields = array("id", "name", "email", "role");
    
    public function setId($id) {
        if (isset($this->fields["id"])) {
            throw new BadMethodCallException(
                "The ID for this user has been set already.");
        }
        if (!is_int($id) || $id < 1) {
            throw new InvalidArgumentException(
                "The user ID is invalid.");
        }
        $this->fields["id"] = $id;
        return $this;
    }
    
    public function setName($name) {
        if (strlen($name) < 2 || strlen($name) > 30) {
            throw new InvalidArgumentException(
                "The user name is invalid.");
        }
        $this->fields["name"] = htmlspecialchars(trim($name),
            ENT_QUOTES);
        return $this;
    }
    
    public function setEmail($email) {
        if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
            throw new InvalidArgumentException(
                "The user email is invalid.");
        }
        $this->fields["email"] = $email;
        return $this;
    }
    
    public function setRole($role) {
        if ($role !== self::ADMINISTRATOR_ROLE &&
            $role !== self::GUEST_ROLE) {
            throw new InvalidArgumentException(
                "The user role is invalid.");
        }
        $this->fields["role"] = $role;
        return $this;
    }
}
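The routing between the magic methods and the mutators is easier to follow in isolation. The sketch below is a stripped-down stand-in rather than the classes above: SketchUser is a hypothetical name, and it keeps only the __set() dispatch and a single validating mutator.

```php
<?php
// Compact stand-in for AbstractEntity/User, reduced to the __set() routing.
class SketchUser
{
    private $fields = array();
    private $allowedFields = array("name", "email");

    public function __set($name, $value) {
        if (!in_array($name, $this->allowedFields, true)) {
            throw new InvalidArgumentException("Field '$name' is not allowed.");
        }
        $mutator = "set" . ucfirst($name);
        if (method_exists($this, $mutator)) {
            $this->$mutator($value);          // validated path
        } else {
            $this->fields[$name] = $value;    // plain assignment
        }
    }

    public function __get($name) {
        return $this->fields[$name];
    }

    public function setEmail($email) {
        if (!filter_var($email, FILTER_VALIDATE_EMAIL)) {
            throw new InvalidArgumentException("The user email is invalid.");
        }
        $this->fields["email"] = $email;
    }
}

$user = new SketchUser();
$user->name  = "John Doe";            // no setName() here: stored directly
$user->email = "john@example.com";    // routed through setEmail()

$rejected = false;
try {
    $user->email = "not-an-email";    // the mutator refuses this value
} catch (InvalidArgumentException $e) {
    $rejected = true;
}
```

Property-style assignment therefore never bypasses validation: whenever a matching mutator exists, it always gets the final word.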

While the implementations of the AbstractEntity and User classes might look complex at first glance, I assure you this is just a fuzzy impression. In fact, the former is a skeletal wrapper for some typical PHP magic methods, while the latter encapsulates some straightforward mutators that assign validated values to the fields of generic user objects. With these domain classes already doing their business in relaxed insulation, let’s now add the last building block of the model. In reality, this is an optional class which can be skipped if the situation warrants, and its responsibility is just to wrap collections of entities. Its implementation is as follows:

<?php
namespace Model\Collection;
use Model\EntityInterface;

interface EntityCollectionInterface extends \Countable, \ArrayAccess, \IteratorAggregate 
{
    public function add(EntityInterface $entity);
    public function remove(EntityInterface $entity);
    public function get($key);
    public function exists($key);
    public function clear();
    public function toArray();
}

<?php
namespace Model\Collection;
use ArrayIterator,
    InvalidArgumentException,
    Model\EntityInterface;
    
class EntityCollection implements EntityCollectionInterface
{
    protected $entities = array();
    
    public function __construct(array $entities = array()) 
    {
        if (!empty($entities)) {
            $this->entities = $entities;
        }
    }
    
    public function add(EntityInterface $entity) {
        // A null key appends the entity to the end of the collection.
        $this->offsetSet(null, $entity);
    }
    
    public function remove(EntityInterface $entity) {
        $this->offsetUnset($entity);
    }
    
    public function get($key) {
        return $this->offsetGet($key);
    }
    
    public function exists($key) {
        return $this->offsetExists($key);
    }
    
    public function clear() {
        $this->entities = array();
    }
    
    public function toArray() {
        return $this->entities;
    }
    
    public function count() {
        return count($this->entities);
    }
    
    public function offsetSet($key, $entity)
    {
        if (!$entity instanceof EntityInterface) {
            throw new InvalidArgumentException(
                "Could not add the entity to the collection.");
        }
        if (!isset($key)) {
            $this->entities[] = $entity;
        }
        else {
            $this->entities[$key] = $entity;
        }
    }
    
    public function offsetUnset($key) {
        if ($key instanceof EntityInterface) {
            $this->entities = array_filter($this->entities, 
                function ($v) use ($key) {
                    return $v !== $key;
                });
        }
        else if (isset($this->entities[$key])) {
            unset($this->entities[$key]);
        }
    }
    
    public function offsetGet($key) {
        if (isset($this->entities[$key])) {
            return $this->entities[$key];
        }
    }
    
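    public function offsetExists($key) {
        // Strict in_array() returns a real boolean and matches by identity,
        // so an entity stored at index 0 is still reported as present.
        return $key instanceof EntityInterface 
            ? in_array($key, $this->entities, true) 
            : isset($this->entities[$key]);
    }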
    
    public function getIterator() {
        return new ArrayIterator($this->entities);
    }
}

At this point we’ve managed to create a primitive domain model, which we can certainly use for creating user objects without major hassle. In doing so, we have a real chance to see if the UOW is actually the functional component it seems to be when it comes to persisting multiple entities in the database in one go.

Putting the UOW Under Test

If you’ve reached this point of the article, you probably feel like you’re being pulled in opposite directions, wondering if all of the hard up-front work required in writing a bunch of interfaces and classes was really worth it. In fact, it was. If you’re still skeptical, make sure to check the following code snippet, which shows how to put the UOW to work in sweet synchrony with some naïve user objects:

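<?php    
use Library\Storage\ObjectStorage,
    Mapper\UserMapper,
    Model\Collection\EntityCollection,
    Model\Repository\UnitOfWork,
    Model\User;

// Autoloader and PdoAdapter are not listed in this article; import them
// from whatever namespaces your own implementations live in.
require_once __DIR__ . "/Library/Loader/Autoloader.php";
$autoloader = new Autoloader;
$autoloader->register();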

$adapter = new PdoAdapter("mysql:dbname=test", "myfancyusername",
    "myhardtoguesspassword");

$unitOfWork = new UnitOfWork(new UserMapper($adapter,
    new EntityCollection), new ObjectStorage);

$user1 = new User(array("name" => "John Doe", 
    "email" => "john@example.com"));
$unitOfWork->registerNew($user1);

$user2 = $unitOfWork->fetchById(1);
$user2->name = "Joe";
$unitOfWork->registerDirty($user2);

$user3 = $unitOfWork->fetchById(2);
$unitOfWork->registerDeleted($user3);

$user4 = $unitOfWork->fetchById(3);
$user4->name = "Julie";

$unitOfWork->commit();

Leaving aside some irrelevant details, such as assuming there’s effectively a PDO adapter living somewhere, the driving logic of the script above should be fairly easy to assimilate. Simply put, it shows how to get things rolling with the UOW, which drags in some user objects from the database and queues them for insertion, update, and deletion by using the corresponding registering methods. At the end of the process, commit() just loops internally over the registered objects and performs the proper operations all in one go. While a standard implementation of a UOW does expose the typical set of registering methods that we’d expect to see, its formal definition doesn’t provide any kind of finder. In this case, however, I intentionally decided to implement a generic one so you can see more clearly how to pull in objects from storage and in turn register them with the UOW without struggling with the oddities of a standalone, closer-to-the-domain structure, such as a Repository or even an overkill Service.
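One detail of the script is easy to miss: $user4 is fetched (and therefore registered as clean) and then modified, but it is never registered as dirty, so commit() leaves it alone. The sketch below replays the same dispatch logic without a database. RecordingMapper and commitAll() are hypothetical names introduced only for this illustration, and stdClass objects stand in for the entities.

```php
<?php
// Hypothetical mapper that records calls instead of touching a database.
class RecordingMapper
{
    public $log = array();

    public function save($entity) {
        $this->log[] = "save " . $entity->name;
    }

    public function delete($entity) {
        $this->log[] = "delete " . $entity->name;
    }
}

// Reduced version of commit(): walk the storage and dispatch by state.
function commitAll(SplObjectStorage $storage, RecordingMapper $mapper)
{
    foreach ($storage as $entity) {
        switch ($storage[$entity]) {
            case "NEW":
            case "DIRTY":
                $mapper->save($entity);
                break;
            case "REMOVED":
                $mapper->delete($entity);
                break;
            // "CLEAN" entities match no case and are skipped.
        }
    }
    // Empty the storage once every queued operation has been dispatched.
    $storage->removeAll(clone $storage);
}

$john  = (object) array("name" => "John");
$joe   = (object) array("name" => "Joe");
$jane  = (object) array("name" => "Jane");
$julie = (object) array("name" => "Julie");

$storage = new SplObjectStorage();
$storage[$john]  = "NEW";
$storage[$joe]   = "DIRTY";
$storage[$jane]  = "REMOVED";
$storage[$julie] = "CLEAN";   // modified in memory, never registered dirty

$mapper = new RecordingMapper();
commitAll($storage, $mapper);
// $mapper->log now holds: "save John", "save Joe", "delete Jane"
```

In other words, this UOW does no automatic change detection: an entity is written only if the client code registers it as new or dirty before calling commit().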

Closing Thoughts

Now that you’ve peeked behind the curtain at a UOW and learned how to implement a naïve one from scratch, let your wild side show and tweak it at your will. Keep in mind though that while there are benefits with the pattern, it’s far from being a panacea that will solve all of the issues associated with massive accesses to the persistence layer. In enterprise-level applications that must perform expensive database writes across several places, though, a UOW provides an effective, transactional-like approach that reduces the underlying overhead, hence becoming a solid, multifaceted solution when properly coupled to a caching mechanism.

Frequently Asked Questions (FAQs) about Implementing a Unit of Work

What is the main purpose of the Unit of Work pattern in software development?

The Unit of Work pattern is a software design pattern that maintains a list of objects affected by a business transaction and coordinates the writing out of changes and the resolution of concurrency problems. It’s primarily used to ensure that all changes are made within a single transaction, thus maintaining the integrity of data and ensuring consistency. This pattern is particularly useful in applications where multiple operations need to be performed as a single atomic operation.

How does the Unit of Work pattern differ from the Repository pattern?

While both the Unit of Work and Repository patterns are used to abstract and encapsulate data access, they serve different purposes. The Repository pattern is used to decouple the business logic from the data access logic, providing a simpler interface for accessing data. On the other hand, the Unit of Work pattern is used to group a set of operations into a single transaction, ensuring that all operations either succeed or fail as a whole.

Can the Unit of Work pattern be used with any type of database?

Yes, the Unit of Work pattern can be used with any type of database, whether it’s a relational database like MySQL or PostgreSQL, or a NoSQL database like MongoDB or Cassandra. The pattern is not tied to any specific type of database, but rather to the concept of a transaction, which is a common feature in most databases.

What are the benefits of using the Unit of Work pattern?

The Unit of Work pattern provides several benefits. First, it helps preserve data integrity by grouping related operations into a single transaction. Second, it simplifies error handling, as you only need to handle errors at the transaction level, rather than for each individual operation. Third, it can improve performance: because writes are deferred until commit, they can be issued together instead of being scattered across the request, and an implementation is free to batch them further.

Are there any drawbacks to using the Unit of Work pattern?

While the Unit of Work pattern provides many benefits, it’s not without its drawbacks. One potential drawback is that it can lead to increased memory usage, as all changes are held in memory until the transaction is committed. Another potential drawback is that it can make the code more complex, especially in scenarios where nested transactions are required.

How does the Unit of Work pattern handle concurrency issues?

The Unit of Work pattern handles concurrency issues by keeping track of all changes made during a transaction. If two transactions try to modify the same object at the same time, the pattern can detect this and prevent one of the transactions from proceeding, thus avoiding a potential conflict.

Can the Unit of Work pattern be used in a multi-threaded environment?

Yes, the Unit of Work pattern can be used in a multi-threaded environment. However, care must be taken to ensure that each thread has its own instance of the Unit of Work, to avoid potential conflicts and concurrency issues.

How does the Unit of Work pattern relate to the concept of persistence ignorance?

Persistence ignorance is the principle that business logic should not be concerned with data storage details. The Unit of Work pattern supports this principle by abstracting away the details of transaction management and data persistence, allowing the business logic to focus on business rules and workflows.

Is the Unit of Work pattern applicable only to Object-Relational Mapping (ORM) frameworks?

While the Unit of Work pattern is commonly used in conjunction with ORM frameworks, it’s not limited to them. The pattern can be used in any scenario where you need to group a set of operations into a single transaction, regardless of whether you’re using an ORM framework, raw SQL, or some other data access method.

How does the Unit of Work pattern handle rollbacks?

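In the Unit of Work pattern, if an error occurs during the execution of a transaction, all changes made during that transaction are rolled back, returning the system to its previous state. This ensures that the system remains consistent, even in the face of errors. Note that in the implementation shown in this article rollback() is left as an empty stub, so this behavior has to be supplied by you, for example by running commit() inside a database transaction.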
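As a rough illustration of that all-or-nothing behavior, the sketch below runs a list of queued writes and undoes all of them if any one fails. Everything in it is an assumption made for the example: InMemoryTable is a hypothetical stand-in for a transactional adapter, with method names that mirror PDO’s beginTransaction(), commit(), and rollBack(), and commitAtomically() is not part of the UOW class shown earlier.

```php
<?php
// Hypothetical in-memory stand-in for a transactional database adapter.
class InMemoryTable
{
    public $rows = array();
    private $snapshot = null;

    public function beginTransaction() {
        $this->snapshot = $this->rows;    // arrays are copied by value
    }

    public function commit() {
        $this->snapshot = null;
    }

    public function rollBack() {
        $this->rows = $this->snapshot;    // restore the pre-transaction state
        $this->snapshot = null;
    }
}

// Run all queued writes; undo every one of them if any write fails.
function commitAtomically(InMemoryTable $table, array $operations)
{
    $table->beginTransaction();
    try {
        foreach ($operations as $operation) {
            $operation($table);
        }
        $table->commit();
        return true;
    } catch (Exception $e) {
        $table->rollBack();
        return false;
    }
}

$table = new InMemoryTable();
$table->rows[1] = "Joe";

$ok = commitAtomically($table, array(
    function ($t) { $t->rows[2] = "John"; },                      // succeeds
    function ($t) { throw new RuntimeException("write failed"); } // fails
));
// The first write is undone as well, so only the original row remains.
```

With a real PDO connection the same try/catch shape applies, with the database itself taking care of undoing the partial writes.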