Elixir's Ecto Querying DSL: Beyond the Basics

This article builds on the fundamentals of Ecto that I covered in Understanding Elixir’s Ecto Querying DSL: The Basics. I’ll now explore Ecto’s more advanced features, including query composition, joins and associations, SQL fragment injection, explicit casting, and dynamic field access.

Once again, a basic knowledge of Elixir is assumed, as well as the basics of Ecto, which I covered in An Introduction to Elixir’s Ecto Library.

Query Composition

Separate queries in Ecto can be combined together, allowing for reusable queries to be created.

For example, let’s see how we can create three separate queries and combine them together to achieve DRYer and more reusable code:

SELECT id, username FROM users;
SELECT id, username FROM users WHERE username LIKE "%tp%";
SELECT id, username FROM users WHERE username LIKE "%tp%" LIMIT 10, 0;

offset = 0
username = "%tp%"

# Keywords query syntax
get_users_overview = from u in Ectoing.User,
  select: [u.id, u.username]

search_by_username = from u in get_users_overview,
  where: like(u.username, ^username)

paginate_query = from search_by_username,
  limit: 10,
  offset: ^offset

# Macro syntax
get_users_overview = (Ectoing.User
|> select([u], [u.id, u.username]))

search_by_username = (get_users_overview
|> where([u], like(u.username, ^username)))

paginate_query = (search_by_username
|> limit(10)
|> offset(^offset))

Ectoing.Repo.all paginate_query

The SQL version is quite repetitive, but the Ecto version on the other hand is quite DRY. The first query (get_users_overview) is just a generic query to retrieve basic user information. The second query (search_by_username) builds off the first by filtering usernames according to some username we are searching for. The third query (paginate_query) builds off of the second, where it limits the results and fetches them from a particular offset (to provide the basis for pagination).

It’s not hard to imagine that all of the above three queries could be used together to provide search results for when a particular user is searched for. Each may also be used in conjunction with other queries to perform other application needs too, all without unnecessarily repeating parts of the query throughout the codebase.

Joins and Associations

Joins are pretty fundamental when querying, and yet we’re only just covering them now. The reason for this is because learning about joins in Ecto alone is not useful: we need to know about associations as well. Whilst these are not difficult to learn about, they’re not quite as trivial as the other topics covered so far.

Simply put, associations enable developers to handle table relationships (implemented as foreign keys) in the models. They are defined in the schemas for each model using the has_one/3 and has_many/3 macros (for models containing other models), and the belongs_to/3 macro (for models that are apart of other models — those that have the foreign keys).

Looking at our Ectoing application, we can see one example of an association between the Ectoing.User model and the Ectoing.Message model. The schema defined in Ectoing.User defines the following association:

has_many :messages, Ectoing.Message

We can see that one user has many messages (Ectoing.Message), and we’re calling this association :messages.

In the Ectoing.Message model, we define the following association relationship:

belongs_to :user, Ectoing.User

Here, we’re saying that the model, Ectoing.Message, belongs to the Ectoing.User model. We have also named the association as :user. By default, Ecto will append an _id onto the belongs_to association name and use that as the foreign key name (so here, it would be :user_id). This default behavior can be overridden by manually specifying the foreign key name by specifying the foreign_key option. For example:

# Ectoing.Message
belongs_to :user, Ectoing.User, foreign_key: some_other_fk_name

Let’s now take a look at a simple query that uses a join to fetch a user and their messages:

SELECT * FROM users u INNER JOIN messages m ON u.id = m.user_id WHERE u.id = 4;

# Keywords query syntax
query = from u in Ectoing.User,
  join: m in Ectoing.Message, on: u.id == m.user_id,
  where: u.id == 4

# Macro syntax
query = (Ectoing.User
|> join(:inner, [u], m in Ectoing.Message, u.id == m.user_id)
|> where([u], u.id == 4))

Ectoing.Repo.all query

Returned value:

[%Ectoing.User{__meta__: #Ecto.Schema.Metadata<:loaded>,
  firstname: "Jane",
  friends_of: #Ecto.Association.NotLoaded<association :friends_of is not loaded>,
  friends_with: #Ecto.Association.NotLoaded<association :friends_with is not loaded>,
  id: 4,
  inserted_at: #Ecto.DateTime<2016-05-15T20:23:58Z>,
  messages: #Ecto.Association.NotLoaded<association :messages is not loaded>,
  surname: "Doe",
  updated_at: #Ecto.DateTime<2016-05-15T20:23:58Z>,
  username: "jane_doe"},
 %Ectoing.User{__meta__: #Ecto.Schema.Metadata<:loaded>,
  firstname: "Jane",
  friends_of: #Ecto.Association.NotLoaded<association :friends_of is not loaded>,
  friends_with: #Ecto.Association.NotLoaded<association :friends_with is not loaded>,
  id: 4,
  inserted_at: #Ecto.DateTime<2016-05-15T20:23:58Z>,
  messages: #Ecto.Association.NotLoaded<association :messages is not loaded>,
  surname: "Doe",
  updated_at: #Ecto.DateTime<2016-05-15T20:23:58Z>,
  username: "jane_doe"}]

Noticeably, we have a number of unloaded associations, including the :messages association. Loading this association can be done in one of two ways: from the result set of a query or from within the query itself. Loading associations from a result set can be done with the Repo.preload function:

results = Ectoing.Repo.all query
Ectoing.Repo.preload results, :messages

Loading associations from within a query can be done using a combination of the assoc and preload functions:

SELECT * FROM users u INNER JOIN messages m ON u.id = m.user_id WHERE u.id = 4;

# Keywords query syntax
query = from u in Ectoing.User,
  join: m in assoc(u, :messages),
  where: u.id == 4,
  preload: [messages: m]

# Macro syntax
query = (Ectoing.User
|> join(:inner, [u], m in assoc(u, :messages))
|> where([u], u.id == 4)
|> preload([u, m], [messages: m]))

Ectoing.Repo.all query

Now, we have the messages association loaded in the result:

[%Ectoing.User{__meta__: #Ecto.Schema.Metadata<:loaded>,
  firstname: "Jane",
  friends_of: #Ecto.Association.NotLoaded<association :friends_of is not loaded>,
  friends_with: #Ecto.Association.NotLoaded<association :friends_with is not loaded>,
  id: 4,
  inserted_at: #Ecto.DateTime<2016-05-15T20:23:58Z>,
  messages: [%Ectoing.Message{__meta__: #Ecto.Schema.Metadata<:loaded>,
    id: 5,
    inserted_at: #Ecto.DateTime<2016-05-15T20:23:58Z>,
    message_body: "Message 5",
    updated_at: #Ecto.DateTime<2016-05-15T20:23:58Z>,
    user: #Ecto.Association.NotLoaded<association :user is not loaded>,
    user_id: 4},
   %Ectoing.Message{__meta__: #Ecto.Schema.Metadata<:loaded>,
    id: 6,
    inserted_at: #Ecto.DateTime<2016-05-15T20:23:58Z>,
    message_body: "Message 6",
    updated_at: #Ecto.DateTime<2016-05-15T20:23:58Z>,
    user: #Ecto.Association.NotLoaded<association :user is not loaded>,
    user_id: 4}],
  surname: "Doe",
  updated_at: #Ecto.DateTime<2016-05-15T20:23:58Z>,
  username: "jane_doe"}]

Associations implicitly join on the primary key and foreign key columns for us, and so we don’t have to specify an :on clause. From the above, we can also see that when it comes to preloading associations, they aren’t lazily loaded. Associations must be explicitly loaded if they are wanted.

Because this article specifically focuses on Ecto’s querying DSL, we won’t cover the inserting, updating, or deleting of associations here. For more information on this, check out the blog post Working with Ecto associations and embeds.

SQL Fragment Injection

Whilst Ecto provides us with a lot of functionality, it only provides functions for common operations in SQL (it doesn’t aim to emulate the whole SQL language). When we need to drop back down into raw SQL, we can use the fragment/1 function, enabling for SQL code to be directly injected into a query.

For example, let’s perform a case-sensitive search on the username field:

SELECT username FROM users WHERE username LIKE BINARY '%doe';

username = "%doe"

# Keywords query syntax
query = from u in Ectoing.User,
  select: u.username,
  where: fragment("? LIKE BINARY ?", u.username, ^username)

# Macro syntax
query = (Ectoing.User
|> select([u], u.username)
|> where([u], fragment("? LIKE BINARY ?", u.username, ^username)))

Ectoing.Repo.all query

(The above contains MySQL-specific SQL. If you’re using another database, then this will not work for you.)

The fragment/1 function takes the SQL code as a string that we’d like to inject as the first parameter. It enables for columns and values to be bound to the SQL code fragment. This is done via placeholders (as question marks) in the string, with subsequent arguments passed to fragment being bound to each placeholder respectively.

Explicit Casting

Another way Ecto uses the models’ schema definitions is by automatically casting interpolated expressions within queries to the respective field types defined in the schema. These interpolated expressions are cast to the type of the field that they are being compared to. For example, if we have a query fragment such as u.username > ^username, where u.username is defined as field :username, :string in the schema, the username variable will automatically be cast to a string by Ecto.

Sometimes, however, we don’t always want Ecto to cast interpolated expressions to the defined field types. And other times, Ecto will not be able to infer the type to cast an expression to (typically, this is when fragments of SQL code are involved). In both instances, we can use the type/2 function to specify the expression and the type it should be cast to.

Let’s take the first case of wanting to cast an expression to another type, since this is the more interesting scenario. In our Ectoing application, we have used the Ecto.Schema.timestamps macro to add two extra fields to each of our tables: updated_at and inserted_at. The macro, by default, sets type of these fields to have a type of Ecto.DateTime. Now, if we’d like to see how many users have registered in the current month, we could use a simple query like the following:

Ectoing.Repo.all from u in Ectoing.User,                   
select: count(u.id),                                       
where: u.inserted_at >= ^Ecto.Date.from_erl({2016, 05, 01})

This will, however, give us a Ecto.CastError, since an Ecto.Date struct cannot be cast to an Ecto.DateTime struct (since we’re comparing the interpolated Ecto.Date expressions to a field of type Ecto.DateTime). In this case, we could either build a Ecto.DateTime struct, or we could specify to Ecto that we’d like to cast the expression to Ecto.Date instead of Ecto.DateTime:

Ectoing.Repo.all from u in Ectoing.User,                   
select: count(u.id),                                       
where: u.inserted_at >= type(^Ecto.Date.from_erl({2016, 05, 01}), Ecto.Date)

Now, Ecto happily accepts the query. After the cast operation, it then translates the interpolated Ecto.Date expression to the underlying :date type, which then lets the underlying database (MySQL, in this case) handle the comparison between a date and a datetime.

Dynamic Field Access

Let’s go back to our example from composing queries together, where we performed a username search:

search_by_username = from u in get_users_overview,
  where: like(u.username, ^username)

Like the pagination query that came after it, we can generalize this query too, so that it can search any field from a given table. This can be done by performing a dynamic field access:

query = Ectoing.User
query_by_field = :username
username = "%tp%"

# Keywords query syntax
search_by_field = from table in query,
  where: like(field(table, ^query_by_field), ^username)

# Macro syntax
search_by_field = (query
|> where([table], like(field(table, ^query_by_field), ^username)))

Ectoing.Repo.all search_by_field

The field/2 function is used for when a field needs to be specified dynamically. Its first argument is the table of the field to be accessed, and the second argument is the field’s name itself, specified as an atom. Using a general query like the above, we can encapsulate it within a function and use parameters to search for any given field from the table specified in the given query.

Conclusion

In both this and my previous article on Ecto’s querying DSL, we’ve covered quite a lot of what it’s capable of. The features mentioned should cover the vast majority of cases encountered when using Ecto within applications. But there are still some topics that haven’t been covered (such as query prefixing). There’s also all of the new upcoming features in Ecto’s much anticipated 2.0 release, including sub queries, aggregation queries, and many-to-many associations. These, as well as other features not specific to Ecto’s querying DSL, will be covered in future articles — so stay tuned!

Frequently Asked Questions (FAQs) about Elixir’s Ecto Querying DSL

What is Elixir’s Ecto Querying DSL and why is it important?

Elixir’s Ecto Querying DSL (Domain Specific Language) is a powerful tool for interacting with databases. It provides a way to write queries in a syntax that is close to SQL, but with the added benefits of compile-time safety, better integration with Elixir code, and potential for abstraction and code reuse. It’s important because it allows developers to write complex queries in a more readable and maintainable way, reducing the likelihood of errors and making the code easier to understand and modify.

How does Ecto handle associations between tables?

Ecto provides a way to define associations between tables using the has_many, has_one, and belongs_to macros. These associations allow you to query related data in a convenient and efficient way. For example, if you have a User schema and each user has many Posts, you can retrieve all posts for a user with a simple query.

Can I use Ecto to perform complex queries involving joins, subqueries, and aggregations?

Yes, Ecto supports a wide range of query operations, including joins, subqueries, and aggregations. You can use the join keyword to join tables, the from keyword to create subqueries, and functions like sum, avg, min, and max to perform aggregations. This makes Ecto a powerful tool for querying data in complex ways.

How does Ecto handle transactions?

Ecto provides a Repo.transaction function that allows you to execute multiple operations in a single transaction. If any operation fails, all changes made within the transaction are rolled back. This ensures data consistency and integrity.

Can I use Ecto with databases other than PostgreSQL?

While Ecto was initially designed to work with PostgreSQL, it now supports other databases as well, including MySQL and SQLite. You can specify the database type when setting up your Ecto repository.

How does Ecto handle migrations?

Ecto provides a robust migration system that allows you to create, modify, and delete database tables in a controlled and reversible way. You can generate migration files using mix tasks, and then define the changes in the migration file using Ecto’s DSL.

Can I use Ecto to validate data before inserting or updating it in the database?

Yes, Ecto provides a changeset function that allows you to validate data before it’s inserted or updated in the database. You can define validation rules in your schema, and then use the changeset function to apply these rules to the data.

How does Ecto handle database connections?

Ecto uses a connection pool to manage database connections. This allows it to efficiently handle multiple concurrent queries, ensuring that your application remains responsive even under heavy load.

Can I use Ecto to perform raw SQL queries?

Yes, while Ecto’s DSL provides a high-level, abstracted way to write queries, you can also use the Ecto.Adapters.SQL.query function to execute raw SQL queries if needed.

How does Ecto handle schemaless queries?

Ecto provides a Ecto.Query.API.dynamic function that allows you to build queries dynamically, without a predefined schema. This can be useful when you need to build queries based on user input or other runtime data.