Rails 7 allows passing raw SQL as `on_duplicate` value to `#upsert_all`


We often come across cases in the Rails application where we need to bulk-insert records. Rails 6 introduced insert_all and upsert_all methods to solve bulk import issues.

upsert_all is nothing but a combination of update + insert. In case a record exists, it is updated with the new attributes, or a new record is inserted.

Before

In Rails 6, we can pass returning and unique_by options to the upsert_all method.

returning option expects an array of model attributes that should be returned for all successfully inserted/updated records.

Let’s say we have a Commodity model with three attributes id, name and price. We want to bulk import three rows of the Commodity model and pass returning option that returns name and price of the updated attributes. The code to do the above steps will be as shown below:

result = Commodity.upsert_all(
          [
            { id: 1, name: "Crude Oil", price: 51.27 },
            { id: 2, name: "Copper", price: 2.84 },
            { id: 4, name: "Gold", price: 1480.35 }
          ],
        )

# Bulk Update (2.3ms) INSERT INTO "commodities" ("id", "name","price")

# VALUES(1, "Crude Oil", 51.27...)

# RETURNING "name", "price"


puts result.inspect

#<ActiveRecord::Result:0x00007fb6612a1ad8 @columns=["id", "name", "price"], @rows=[[1, "Crude Oil", 51.27]....],

@hash_rows=nil, @column_types=...>

puts result.rows

[[1, "Crude Oil", 51.27], [2, "Copper", 2.84], [4, "Gold", 1480.35]]

By default in Rails 6, when we pass a duplicate record to upsert_all, it will update the record with new attributes. In few scenarios, we might want to raise an error when duplicate records are passed, or fire our custom SQL query.

If we try to pass on_duplicate option to upsert_all method it would raise the below error:

result = Commodity.upsert_all(
          [
            { id: 1, name: "Crude Oil", price: 51.27 },
            { id: 2, name: "Copper", price: 2.84 },
            { id: 4, name: "Gold", price: 1480.35 }
          ],
          on_duplicate: :raise
        )

ArgumentError (unknown keyword: :on_duplicate)

After

To resolve the above issue, Rails 7 added on_duplicate option to upsert_all method. The change also allows us to pass raw SQL queries to on_duplicate and returning options.

Continuing with the Commodity example, let’s say we want to set the higher price of the commodity, in case we try to upsert duplicate records. We can pass a custom SQL query to the on_duplicate option which sets the higher price of the commodity.

Commodity.upsert_all(
  [
    { id: 2, name: "Copper", price: 4.84 },
    { id: 4, name: "Gold", price: 1380.87 },
    { id: 6, name: "Aluminium", price: 0.35 }
  ],
  on_duplicate: Arel.sql("price = GREATEST(commodities.price, EXCLUDED.price)")
)

Commodity.find_by_name("Copper").price
=> 4.84

Commodity.find_by_name("Gold").price
=> 1480.35

As seen above, the Copper price is updated to 4.84 as it is higher when compared to 2.84, but the Gold price remains the same.

Similarly, we can pass a custom query to the returning option.

Commodity.upsert_all(
  [
    { id: 2, name: "Copper", price: 4.84, created_at: Time.now, updated_at: Time.now  },
    { id: 4, name: "Gold", price: 1380.87, created_at: Time.now, updated_at: Time.now  },
    { id: 8, name: "Steel", price: 1.35, created_at: Time.now, updated_at: Time.now  }
  ],
  returning: Arel.sql("id, (xmax = '0') as inserted, name as new_name")
)

=> <ActiveRecord::Result:0x00007f9a3061ee38 @columns=["id", "inserted", "new_name"], @rows=[[2, false, "Copper"], [4, false, "Gold"], [8, true, "Steel"]], @hash_rows=nil, @column_types={}>

As seen xmax = '0' is true when records are inserted and false for existing records.

Note:

The above changes will not be applicable for all types of databases. returning option is supported in PostgreSQL but not in SQLite3.