SQLAlchemy and Alembic notes
My notes on SQLAlchemy and Alembic
Quickstart snippet
A quickstart snippet for the SQLAlchemy ORM playground.
# file: playground.py
import sqlalchemy as sa
from sqlalchemy.orm import declarative_base, sessionmaker
import datetime
engine = sa.create_engine("sqlite:///:memory:", echo=True)
Session = sessionmaker(bind=engine, expire_on_commit=False)
Base = declarative_base()
class User(Base):
__tablename__ = "users"
id = sa.Column(sa.Integer, primary_key=True)
name = sa.Column(sa.String, nullable=False)
Base.metadata.create_all(engine)
session = Session()
You can run the script with IPython interactive mode.
$ ipython -i playground.py
...
In [1]: session.add(User(name="foo"))
In [2]: session.commit()
SQLAlchemy pitfalls
Columns are nullable by default
The default value of SQLAlchemy nullable is False unless it’s a primary key. A foreign key is also nullable by default. If you came from the Django ORM, where the default values for fields are null=False, blank=False, you might accidentally create table definitions with nullable fields, which bites you in the future.
This behavior matches the default SQL, though, where the column would be considerably nullable unless you explicitly set NOT NULL for it.
import datetime
import sqlalchemy as sa
class User(Base):
__tablename__ = "users"
id = sa.Column(sa.Integer, primary_key=True)
name = sa.Column(sa.String, nullable=False)
Column default vs. server_default and onupdate vs. server_onupdate
Column’s default attribute does not add DEFAULT value to the CREATE TABLE statement. Don’t rely on it if you plan to insert new records in the database outside of SQLAlchemy, now or in the future. The value can be a scalar or a callable (e.g., datetime.datetime.utcnow)
To delegate setting the default value to the database, use server_default instead. You can use scalars or functions. SQLAlchemy is aware of a lot of database functions and their database-specific variants. Use SQL function expressions generator, available as sqlalchemy.func. Consult SQL and Generic Functions.
The same rule applies to update vs. server_onupdate.
import datetime
import sqlalchemy as sa
class User(Base):
__tablename__ = "users"
id = sa.Column(sa.Integer, primary_key=True)
created_at = sa.Column(
sa.DateTime,
nullable=False,
default=datetime.datetime.utcnow,
)
updated_at = sa.Column(
sa.DateTime,
nullable=False,
default=datetime.datetime.utcnow,
onupdate=datetime.datetime.utcnow,
)
import sqlalchemy as sa
class User(Base):
__tablename__ = "users"
id = sa.Column(sa.Integer, primary_key=True)
created_at = sa.Column(sa.DateTime, nullable=False, server_default=sa.func.now())
updated_at = sa.Column(
sa.DateTime,
nullable=False,
server_default=sa.func.now(),
server_onupdate=sa.func.now(),
)
Use sqlalchemy[mypy] instead of sqlalchemy-stubs from Dropbox
Several tutorials recommend using sqlalchemy-stubs from Dropbox to make SQLAlchemy play with mypy. These stubs are not compatible with SQLAlchemy 1.4 and 2.0, though. Fortunately, there is a project sqlalchemy/sqlalchemy2-stubs that comes as a dependency of sqlalchemy[mypy], supported by the developers of SQLAlchemy and does all the heavy lifting.
In my code, I replaced the dependency from sqlalchemy-stubs to sqlalchemy[mypy] and changed the plugin from sqlmypy to sqlalchemy.ext.mypy.plugin, as outlined in Mypy / Pep-484 Support for ORM Mappings.
pip install sqlalchemy[mypy]
File mypy.ini:
[mypy]
plugins = sqlalchemy.ext.mypy.plugin
SQLAlchemy tips
Document fields of your models with doc and comment
- Use the
doccolumn attribute to document the SQLAlchemy field for future self and other developers working with the code. - If you plan to work with raw SQL, consider adding the
commentattribute — it will be associated with the column on the database level and will be visible even for those who don’t use the ORM.
import sqlalchemy as sa
class User(Base):
id = sa.Column(sa.Integer, primary_key=True)
name = sa.Column(sa.String, nullable=False, doc="User name", comment="User name")
Include date and time to Alembic migration files
Default migration files have the format <revision>_<slug>.py. A randomly generated revision makes it harder to find recently added migrations. The solution is to add a date and time prefix to the template generator.
Update alembic.ini to define the file template:
file_template = %%(year)d%%(month).2d%%(day).2d_%%(rev)s_%%(slug)s
The command alembic revision -m "Add new column" will generate a file like this: migrations/versions/20211128_2eb46b06af52_add_new_column.py.
Alternatively, you can generate a new revision manually with the --rev-id flag.
Source: StackOverflow
How to create a data migration
Usually, a data migration inserts new records in upgrade() and removes them in downgrade(). The recommended way is not to go too far into the ORM and model things and modify data with op.bulk_insert() or op.execute() instead.
The advantage of the simplified approach is that it still can generate the SQL version (with an --sql option) when necessary.
Note that while writing the script, you should probably repeat the table definition inside it. It’s OK only to define the fields that you need for the migration.
Alembic documentation explicitly warns you against importing a table from your existing code.
Note it’s a recommended practice to at least ensure the definition of a table is self-contained within the migration script, rather than imported from a module that may break compatibility with older migrations.
Example:
from alembic import op
import sqlalchemy as sa
...
account = sa.table("account", sa.column("name", sa.String))
def upgrade():
op.bulk_insert(
account,
[
{"name": "John Smith"},
{"name": "Ed Williams"},
{"name": "Wendy Jones"},
],
)
def downgrade():
op.execute(account.delete())
References: