Most of our projects use databases. Using Ecto allows us to manage data migrations easily. We often do not realize that incorrectly used functionality may block our applications.
In this post, I would like to present a problem that may occur. It is important to understand the problem and the consequences better. Simple changes can save us from unpleasant consequences.
Migrations are performed one file after the other.
Ecto uses the
schema_migrations table, which stores all migrations that have already been executed.
We can configure this table’s name with the
:migration_source in Ecto configuration option1.
To ensure that the migration is performed once, Ecto will lock2 the
schema_migrations table when running migrations.
It guarantees that only one server can make changes at the same time.
The table stores
There is no storing of checksums.
It is crucial because it allows us to modify and improve already created files.
The use of the Elixir code in migration files can be divided into two categories. When we use modules such as Enum or Map, there should be no problems.
The greater risk occurs when we decide to use data structures: structs or Ecto schemas. We can make our application unable to start!
Let’s analyze a simple example. In our project, we have prepared the following migration:
1 2 3 4 5 6 7 8 9 10 11 # priv/repo/migrations/timestamp_create_logs_table.exs defmodule Project.Repo.Migrations.CreateLogsTable do use Ecto.Migration def change do create table(:logs) do add(:action, :string) add(:details, :text) end end end
In our code, we can use structure:
1 2 3 4 5 6 7 8 9 10 defmodule Project.Schema.Log do use Ecto.Schema schema "logs" do field :action, :string field :details, :string end ... end
Let’s suppose we need to change data in a database.
We want to mark the used earlier
update action as
We can prepare a simple migration:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 # priv/repo/migrations/timestamp_modify_log_action.exs defmodule Project.Repo.Migrations.CreateLogsTable do use Ecto.Migration def up do import Ecto.Query, only: [from: 2] from( q in Project.Schema.Log, where: q.action == "update" ) |> Project.Repo.update_all(set: [action: "modified"]) end def down do ... end end
The project is developing, but at some point, we recognize that our logs need to be transferred to a separate database.
We delete the table
logs and structure
Project.Schema.Log to keep things in order after moving the data.
Anyone who wants to run our migration code from zero state will get an error now.
1 2 ** (UndefinedFunctionError) function Project.Schema.Log.__schema__/1 is undefined (module Project.Schema.Log is not available)
Migrations are blocked; CI cannot work. It may be a not very extensive example, but it conveys an idea. Using structures in migrations is dangerous.
Changing the name or deleting the fields will cause problems. Most likely just when we least expect it and have other important tasks.
How to improve invalid migration?
Note: This section has been updated since the article was posted.
There are two ways to improve. The first one (originally included here) is the use of the SQL code prepared by us.
With Ecto.Migration.execute/2 you can execute the SQL command. It can be helpful if you don’t want to create temporary structures.
1 execute("UPDATE logs SET action = 'modified' WHERE action = 'update';")
The second approach is to use structures that will only be visible in the context of migration. This is some code duplication but also eliminates the problem of changing structures.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 # priv/repo/migrations/timestamp_modify_log_action.exs defmodule Project.Repo.Migrations.CreateLogsTable do use Ecto.Migration defmodule Log do use Ecto.Schema schema "logs" do field(:action, :string) field(:details, :string) end end def up do import Ecto.Query, only: [from: 2] from( q in Log, where: q.action == "update" ) |> Project.Repo.update_all(set: [action: "modified"]) end def down do ... end end
Theoretically, both versions will lead to the same state. I leave the choice to you. For me personally, the use of SQL seems like a more straightforward solution. I am updating the post to avoid inaccuracies as there is also an approach beyond what was initially proposed.
Changes in migrations are not tricky. Instead of structures, we can use schemaless Ecto queries. You can still use Ecto API, but in my opinion, you should use it without schemas. This will eliminate problems when running our application with a clean database or changing fields in schemas. Simple changes today can save us from critical errors in the future.
In the previous version, I used the phrase “raw SQL”. It was not very precise as the Ecto API can still be used. Thank you, Felipe Pereira Stival, for pointing out this ambiguity. Thank you, Josef Strzibny, for pointing out the ability to use temporary module inside migration.