28 Nov 2021

SQLAlchemy and Alembic notes

updated on 19 Jan 2022

table of contents

My notes on SQLAlchemy and Alembic

Quickstart snippet

A quickstart snippet for the SQLAlchemy ORM playground.

# file: playground.py
import sqlalchemy as sa
from sqlalchemy.orm import declarative_base, sessionmaker
import datetime

engine = sa.create_engine("sqlite:///:memory:", echo=True)
Session = sessionmaker(bind=engine, expire_on_commit=False)

Base = declarative_base()

class User(Base):
    __tablename__ = "users"
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String, nullable=False)

Base.metadata.create_all(engine)
session = Session()

You can run the script with IPython interactive mode.

$ ipython -i playground.py

...
In [1]: session.add(User(name="foo"))
In [2]: session.commit()

SQLAlchemy pitfalls

Columns are nullable by default

The default value of SQLAlchemy nullable is False unless it’s a primary key. A foreign key is also nullable by default. If you came from the Django ORM, where the default values for fields are null=False, blank=False, you might accidentally create table definitions with nullable fields, which bites you in the future.

This behavior matches the default SQL, though, where the column would be considerably nullable unless you explicitly set NOT NULL for it.

define a required user name with sqlalchemy

import datetime
import sqlalchemy as sa

class User(Base):
    __tablename__ = "users"
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String, nullable=False)

Column `default` vs. `server_default` and `onupdate` vs. `server_onupdate`

Column’s default attribute does not add DEFAULT value to the CREATE TABLE statement. Don’t rely on it if you plan to insert new records in the database outside of SQLAlchemy, now or in the future. The value can be a scalar or a callable (e.g., datetime.datetime.utcnow)

To delegate setting the default value to the database, use server_default instead. You can use scalars or functions. SQLAlchemy is aware of a lot of database functions and their database-specific variants. Use SQL function expressions generator, available as sqlalchemy.func. Consult SQL and Generic Functions.

The same rule applies to update vs. server_onupdate.

a model with created_at and updated_at fields populated from python

import datetime
import sqlalchemy as sa

class User(Base):
    __tablename__ = "users"
    id = sa.Column(sa.Integer, primary_key=True)
    created_at = sa.Column(
        sa.DateTime,
        nullable=False,
        default=datetime.datetime.utcnow,
    )
    updated_at = sa.Column(
        sa.DateTime,
        nullable=False,
        default=datetime.datetime.utcnow,
        onupdate=datetime.datetime.utcnow,
    )

a model with created_at and updated_at fields populated from sql

import sqlalchemy as sa

class User(Base):
    __tablename__ = "users"
    id = sa.Column(sa.Integer, primary_key=True)
    created_at = sa.Column(sa.DateTime, nullable=False, server_default=sa.func.now())
    updated_at = sa.Column(
        sa.DateTime,
        nullable=False,
        server_default=sa.func.now(),
        server_onupdate=sa.func.now(),
    )

Use `sqlalchemy[mypy]` instead of sqlalchemy-stubs from Dropbox

Several tutorials recommend using sqlalchemy-stubs from Dropbox to make SQLAlchemy play with mypy. These stubs are not compatible with SQLAlchemy 1.4 and 2.0, though. Fortunately, there is a project sqlalchemy/sqlalchemy2-stubs that comes as a dependency of sqlalchemy[mypy], supported by the developers of SQLAlchemy and does all the heavy lifting.

In my code, I replaced the dependency from sqlalchemy-stubs to sqlalchemy[mypy] and changed the plugin from sqlmypy to sqlalchemy.ext.mypy.plugin, as outlined in Mypy / Pep-484 Support for ORM Mappings.

pip install sqlalchemy[mypy]

File mypy.ini:

[mypy]
plugins = sqlalchemy.ext.mypy.plugin

SQLAlchemy tips

Document fields of your models with `doc` and `comment`

Use the doc column attribute to document the SQLAlchemy field for future self and other developers working with the code.
If you plan to work with raw SQL, consider adding the comment attribute — it will be associated with the column on the database level and will be visible even for those who don’t use the ORM.

using doc and comment for columns

import sqlalchemy as sa

class User(Base):
    id = sa.Column(sa.Integer, primary_key=True)
    name = sa.Column(sa.String, nullable=False, doc="User name", comment="User name")

Include date and time to Alembic migration files

Default migration files have the format <revision>_<slug>.py. A randomly generated revision makes it harder to find recently added migrations. The solution is to add a date and time prefix to the template generator.

Update alembic.ini to define the file template:

file_template = %%(year)d%%(month).2d%%(day).2d_%%(rev)s_%%(slug)s

The command alembic revision -m "Add new column" will generate a file like this: migrations/versions/20211128_2eb46b06af52_add_new_column.py.

Alternatively, you can generate a new revision manually with the --rev-id flag.

Source: StackOverflow

How to create a data migration

Usually, a data migration inserts new records in upgrade() and removes them in downgrade(). The recommended way is not to go too far into the ORM and model things and modify data with op.bulk_insert() or op.execute() instead.

The advantage of the simplified approach is that it still can generate the SQL version (with an --sql option) when necessary.

Note that while writing the script, you should probably repeat the table definition inside it. It’s OK only to define the fields that you need for the migration.

Alembic documentation explicitly warns you against importing a table from your existing code.

Note it’s a recommended practice to at least ensure the definition of a table is self-contained within the migration script, rather than imported from a module that may break compatibility with older migrations.

Example:

from alembic import op
import sqlalchemy as sa

...

account = sa.table("account", sa.column("name", sa.String))


def upgrade():
    op.bulk_insert(
        account,
        [
            {"name": "John Smith"},
            {"name": "Ed Williams"},
            {"name": "Wendy Jones"},
        ],
    )


def downgrade():
    op.execute(account.delete())

References:

Roman Imankulov