Roman Imankulov

Full-stack Python web developer from Porto

06 Dec 2022

Pydantic as a Backward Compatibility Layer

Data structures evolve over time. Say you have a data store holding person objects.

I assume the storage is schema-less. Maybe it’s your primary storage in MongoDB, a Redis cache, or a log record in ElasticSearch.

{
  "name": "Guido Van Rossum",
  "email": "guido@python.org"
}

At some point, you decide to store a profession alongside each person. New records can look like this.

{
  "name": "Ryan Dahl",
  "email": "ryan@nodejs.com",
  "profession": "Software developer"
}

Old objects stored in the database before the migration don’t have a “profession” attribute. If your code expects every person to have a defined profession field, it will crash the first time it encounters one of the old records.

if person["profession"] == "Software developer":
    print("Hello, world")

# Traceback (most recent call last):
# ...
# KeyError: 'profession'

Our code is not backward compatible: it breaks on input that was written by an older version of the system.

We can account for missing fields, making the code backward compatible.

if "profession" in person and person["profession"] == "Software developer":
    print("Hello, world")

# same, but shorter
if person.get("profession") == "Software developer":
    print("Hello, world")

It may feel good to come up with a one-off solution, but it can be difficult to maintain in the long run. Every piece of code that interacts with the “profession” field carries the legacy of the ancient system of people without professions.

Wouldn’t it be nice to offload the burden of maintaining backward compatibility to a separate layer of your application and let the rest of your code deal with clean, uniform, and well-defined data structures? This can make your code much easier to work with and maintain.

Pydantic to the rescue.

I don’t like dicts as data structures. Instead, I prefer to convert dictionaries to domain models as early as possible. Most of the time, I use Pydantic models. In a declarative way, I define the shape of my model, default values for missing fields, validation, and conversion rules.

The Pydantic Person model would look like this.

from pydantic import BaseModel

class Person(BaseModel):
    name: str
    email: str
    profession: str = ""

Here is how I would interact with my model.

>>> person_dict = {
...     "name": "Guido Van Rossum",
...     "email": "guido@python.org"
... }
>>> person_object = Person(**person_dict)
>>> person_object
Person(name='Guido Van Rossum', email='guido@python.org', profession='')

The model provides me with a contract that guarantees the presence of a “profession” attribute. If the corresponding field in the source dictionary is missing, the model populates the value with an empty string.
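To make the idea of a separate compatibility layer more concrete, here is a minimal sketch. The `load_person` and `is_developer` helpers are hypothetical names of mine, not part of Pydantic. The point is only that raw dictionaries are converted once, at the boundary, and the rest of the code reads the attribute without any `.get()` calls.

```python
from pydantic import BaseModel


class Person(BaseModel):
    name: str
    email: str
    profession: str = ""  # default for records stored before the migration


def load_person(raw: dict) -> Person:
    """Hypothetical boundary helper: turn a raw record into a model."""
    return Person(**raw)


def is_developer(person: Person) -> bool:
    # No .get() and no "in" check: the attribute always exists.
    return person.profession == "Software developer"


old_record = {"name": "Guido Van Rossum", "email": "guido@python.org"}
new_record = {
    "name": "Ryan Dahl",
    "email": "ryan@nodejs.com",
    "profession": "Software developer",
}

print(is_developer(load_person(old_record)))  # False
print(is_developer(load_person(new_record)))  # True
```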

Pydantic supports several kinds of backward-compatible changes to your data model. Here are just a few examples.

Add a new field. Pydantic can populate a missing field with a default value, either a constant or a value calculated dynamically by a default factory.
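As a rough illustration of both kinds of defaults, the sketch below adds two fields that are not in the original example: `tags` and `loaded_at` are made up for the demonstration. It assumes a Pydantic version that supports `Field(default_factory=...)`.

```python
from datetime import datetime, timezone
from typing import List

from pydantic import BaseModel, Field


def utc_now() -> datetime:
    return datetime.now(timezone.utc)


class Person(BaseModel):
    name: str
    email: str
    profession: str = ""  # constant default
    # Hypothetical fields, for illustration only:
    tags: List[str] = Field(default_factory=list)  # new list for each instance
    loaded_at: datetime = Field(default_factory=utc_now)  # computed on creation


old_record = {"name": "Guido Van Rossum", "email": "guido@python.org"}
person = Person(**old_record)

print(person.profession == "")  # True
print(person.tags)              # []
```

The factory is called each time a record lacks the field, so every model gets its own list rather than sharing one.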

Rename a field. Pydantic makes it possible to define field aliases. If you decide to rename a field from name to full_name, you define a full_name field and provide a name alias.

-from pydantic import BaseModel
+from pydantic import BaseModel, Field

 class Person(BaseModel):
-    name: str
+    full_name: str = Field(alias="name")
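Here is a small runnable version of the rename, with an explicit type annotation on the aliased field. The stored record still uses the old `name` key, while the code reads `full_name`.

```python
from pydantic import BaseModel, Field


class Person(BaseModel):
    # Stored records keep the old "name" key; code uses the new attribute.
    full_name: str = Field(alias="name")
    email: str


stored = {"name": "Guido Van Rossum", "email": "guido@python.org"}
person = Person(**stored)

print(person.full_name)  # Guido Van Rossum
```

By default the model is populated through the alias only. If some records are already written with the new `full_name` key, the model needs an extra configuration option to accept both; as far as I know it is called `allow_population_by_field_name` in Pydantic 1.x and `populate_by_name` in 2.x, so check the documentation for your version.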

Change the field type. Simple type conversions (like converting from a string to a number, enum, or datetime) can be performed automatically. More complicated changes can be done with validators.
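The sketch below shows both cases on a made-up record. The `age`, `joined`, and `tags` fields, the `Profession` enum, and the `before_validator` helper are all hypothetical. The validator decorator has a different name in Pydantic 1.x and 2.x, so the sketch imports whichever one is installed.

```python
from datetime import datetime
from enum import Enum
from typing import List

from pydantic import BaseModel

try:
    from pydantic import field_validator  # Pydantic 2.x

    def before_validator(field_name):
        return field_validator(field_name, mode="before")
except ImportError:
    from pydantic import validator  # Pydantic 1.x

    def before_validator(field_name):
        return validator(field_name, pre=True)


class Profession(str, Enum):
    DEVELOPER = "Software developer"
    UNKNOWN = ""


class Person(BaseModel):
    name: str
    age: int  # older records stored the number as a string
    joined: datetime  # stored as an ISO 8601 string
    profession: Profession = Profession.UNKNOWN
    tags: List[str]  # older records stored one comma-separated string

    @before_validator("tags")
    @classmethod
    def split_legacy_tags(cls, value):
        # Runs before type validation, so it can reshape the old format.
        if isinstance(value, str):
            return [tag.strip() for tag in value.split(",") if tag.strip()]
        return value


legacy = {
    "name": "Jane Doe",
    "age": "42",
    "joined": "2022-12-06T10:00:00",
    "profession": "Software developer",
    "tags": "python, backend",
}
person = Person(**legacy)

print(person.age)         # 42
print(person.profession)  # Profession.DEVELOPER
print(person.tags)        # ['python', 'backend']
```

The string-to-int, string-to-datetime, and string-to-enum conversions need no extra code. Only the change from a comma-separated string to a list needs a validator.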

I found that I can express any reasonable change in the model structure with Pydantic types, type annotations, default factories, and validators.

When I work with schema-less data, I often use Pydantic models to provide the contract for the rest of the application. It makes my code more robust and easier to maintain.
