SQLAlchemy and Alembic notes
My notes on SQLAlchemy and Alembic
Quickstart snippet
A quickstart snippet for the SQLAlchemy ORM playground.
# file: playground.py
import sqlalchemy as sa
from sqlalchemy.orm import declarative_base, sessionmaker
import datetime
engine = sa.create_engine("sqlite:///:memory:", echo=True)
Session = sessionmaker(bind=engine, expire_on_commit=False)
Base = declarative_base()
class User(Base):
__tablename__ = "users"
id = sa.Column(sa.Integer, primary_key=True)
name = sa.Column(sa.String, nullable=False)
Base.metadata.create_all(engine)
session = Session()
You can run the script with IPython interactive mode.
$ ipython -i playground.py
...
In [1]: session.add(User(name="foo"))
In [2]: session.commit()
SQLAlchemy pitfalls
Columns are nullable by default
The default value of SQLAlchemy nullable
is False unless it’s a primary key. A foreign key is also nullable by default. If you came from the Django ORM, where the default values for fields are null=False, blank=False
, you might accidentally create table definitions with nullable fields, which bites you in the future.
This behavior matches the default SQL, though, where the column would be considerably nullable unless you explicitly set NOT NULL
for it.
import datetime
import sqlalchemy as sa
class User(Base):
__tablename__ = "users"
id = sa.Column(sa.Integer, primary_key=True)
name = sa.Column(sa.String, nullable=False)
Column default
vs. server_default
and onupdate
vs. server_onupdate
Column’s default
attribute does not add DEFAULT value
to the CREATE TABLE
statement. Don’t rely on it if you plan to insert new records in the database outside of SQLAlchemy, now or in the future. The value can be a scalar or a callable (e.g., datetime.datetime.utcnow
)
To delegate setting the default value to the database, use server_default
instead. You can use scalars or functions. SQLAlchemy is aware of a lot of database functions and their database-specific variants. Use SQL function expressions generator, available as sqlalchemy.func
. Consult SQL and Generic Functions.
The same rule applies to update
vs. server_onupdate
.
import datetime
import sqlalchemy as sa
class User(Base):
__tablename__ = "users"
id = sa.Column(sa.Integer, primary_key=True)
created_at = sa.Column(
sa.DateTime,
nullable=False,
default=datetime.datetime.utcnow,
)
updated_at = sa.Column(
sa.DateTime,
nullable=False,
default=datetime.datetime.utcnow,
onupdate=datetime.datetime.utcnow,
)
import sqlalchemy as sa
class User(Base):
__tablename__ = "users"
id = sa.Column(sa.Integer, primary_key=True)
created_at = sa.Column(sa.DateTime, nullable=False, server_default=sa.func.now())
updated_at = sa.Column(
sa.DateTime,
nullable=False,
server_default=sa.func.now(),
server_onupdate=sa.func.now(),
)
Use sqlalchemy[mypy]
instead of sqlalchemy-stubs from Dropbox
Several tutorials recommend using sqlalchemy-stubs from Dropbox to make SQLAlchemy play with mypy. These stubs are not compatible with SQLAlchemy 1.4 and 2.0, though. Fortunately, there is a project sqlalchemy/sqlalchemy2-stubs that comes as a dependency of sqlalchemy[mypy]
, supported by the developers of SQLAlchemy and does all the heavy lifting.
In my code, I replaced the dependency from sqlalchemy-stubs
to sqlalchemy[mypy]
and changed the plugin from sqlmypy
to sqlalchemy.ext.mypy.plugin
, as outlined in Mypy / Pep-484 Support for ORM Mappings.
pip install sqlalchemy[mypy]
File mypy.ini:
[mypy]
plugins = sqlalchemy.ext.mypy.plugin
SQLAlchemy tips
Document fields of your models with doc
and comment
- Use the
doc
column attribute to document the SQLAlchemy field for future self and other developers working with the code. - If you plan to work with raw SQL, consider adding the
comment
attribute — it will be associated with the column on the database level and will be visible even for those who don’t use the ORM.
import sqlalchemy as sa
class User(Base):
id = sa.Column(sa.Integer, primary_key=True)
name = sa.Column(sa.String, nullable=False, doc="User name", comment="User name")
Include date and time to Alembic migration files
Default migration files have the format <revision>_<slug>.py
. A randomly generated revision makes it harder to find recently added migrations. The solution is to add a date and time prefix to the template generator.
Update alembic.ini
to define the file template:
file_template = %%(year)d%%(month).2d%%(day).2d_%%(rev)s_%%(slug)s
The command alembic revision -m "Add new column"
will generate a file like this: migrations/versions/20211128_2eb46b06af52_add_new_column.py
.
Alternatively, you can generate a new revision manually with the --rev-id
flag.
Source: StackOverflow
How to create a data migration
Usually, a data migration inserts new records in upgrade()
and removes them in downgrade()
. The recommended way is not to go too far into the ORM and model things and modify data with op.bulk_insert()
or op.execute()
instead.
The advantage of the simplified approach is that it still can generate the SQL version (with an --sql
option) when necessary.
Note that while writing the script, you should probably repeat the table definition inside it. It’s OK only to define the fields that you need for the migration.
Alembic documentation explicitly warns you against importing a table from your existing code.
Note it’s a recommended practice to at least ensure the definition of a table is self-contained within the migration script, rather than imported from a module that may break compatibility with older migrations.
Example:
from alembic import op
import sqlalchemy as sa
...
account = sa.table("account", sa.column("name", sa.String))
def upgrade():
op.bulk_insert(
account,
[
{"name": "John Smith"},
{"name": "Ed Williams"},
{"name": "Wendy Jones"},
],
)
def downgrade():
op.execute(account.delete())
References: