Most of our projects use databases. Using Ecto allows us to manage data migrations easily. We often do not realize that incorrectly used functionality may block our applications.
In this post, I would like to present a problem that may occur. It is important to understand the problem and the consequences better. Simple changes can save us from unpleasant consequences.
Migrations are performed one file after the other.
Ecto uses the
schema_migrations table, which stores all migrations that have already been executed.
We can configure this table’s name with the
:migration_source in Ecto configuration option1.
To ensure that the migration is performed once, Ecto will lock2 the
schema_migrations table when running migrations.
This guarantees that only one server can make changes at the same time.
The table stores
There is no storing of checksums.
This is crucial because it allows us to modify and improve already created files.
The use of the Elixir code in migration files can be divided into two categories. When we use modules such as Enum or Map, there should be no problems.
The greater risk occurs when we decide to use data structures: structs or Ecto schemas. We can make our application unable to start!
Let’s analyze a simple example. In our project, we have prepared the following migration:
1 2 3 4 5 6 7 8 9 10 11 # priv/repo/migrations/timestamp_create_logs_table.exs defmodule Project.Repo.Migrations.CreateLogsTable do use Ecto.Migration def change do create table(:logs) do add(:action, :string) add(:details, :text) end end end
In our code, we can use structure:
1 2 3 4 5 6 7 8 9 10 defmodule Project.Schema.Log do use Ecto.Schema schema "logs" do field :action, :string field :details, :string end ... end
Let’s suppose we need to change data in a database.
We want to mark the used earlier
update action as
We can prepare a simple migration:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 # priv/repo/migrations/timestamp_modify_log_action.exs defmodule Project.Repo.Migrations.CreateLogsTable do use Ecto.Migration def up do import Ecto.Query, only: [from: 2] from( q in Project.Schema.Log, where: q.action == "update" ) |> Project.Repo.update_all(set: [action: "modified"]) end def down do ... end end
The project is developing, but at some point, we recognize that our logs need to be transferred to a separate database.
To keep things in order, we delete the table
logs and structure
Project.Schema.Log after moving the data.
Anyone who wants to run our migration code from zero state will get an error now.
1 2 ** (UndefinedFunctionError) function Project.Schema.Log.__schema__/1 is undefined (module Project.Schema.Log is not available)
Migrations are blocked, CI cannot work. It may be a not very extensive example, but it conveys an idea. Using structures in migrations is dangerous.
Changing the name or deleting the fields will cause problems. Most likely just when we least expect it and have other important tasks.
How to improve invalid migration? Just use Ecto.Migration.execute/2
1 execute("UPDATE logs SET action = 'modified' WHERE action = 'update';")
Theoretically, both versions will lead to the same state. In the case of using data schemas, unfortunately, we close ourselves to changes. Removing the schematic from the project blocks migrations and the entire project because the module is unknown.
Changes in migrations are not tricky. Instead of structures, we should schemaless Ecto queries. You can still use Ecto API, but you should use it without schemas. This will eliminate problems when we want to run our application with a clean database or change fields in schemas. Simple changes today can save us from critical errors in the future.
In the previous version I used the phrase “raw SQL”. It was not very precise as the Ecto API can still be used. Thank you, Felipe Pereira Stival, for pointing out this ambiguity.