There are cases when we want to insert records in bulk. For example, we have a list of users in a CSV file and we want to import these users into our application.
Rails has methods like delete_all and update_all to delete and update records in bulk, respectively. A similar method to insert records in bulk was missing.
Rails 6 added insert_all, insert_all! and upsert_all to solve the above issue.
Before Rails 6
Before Rails 6, bulk inserts were achieved in one of the following ways -
- Using the activerecord-import gem
- Creating records one by one
- Using the SQL INSERT statement
In Rails 6
insert_all and insert_all!
Using insert_all, we can perform a bulk insert as below -
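The original code sample is not shown here, so the following is a minimal sketch assuming a User model with name and email columns and a unique index on email; the insert_all API itself is standard Rails 6.

```ruby
result = User.insert_all(
  [
    { name: "Sam", email: "sam@example.com" },
    { name: "John", email: "john@example.com" }
  ]
)
# On PostgreSQL/SQLite the generated query gets an ON CONFLICT DO NOTHING
# clause, so conflicting rows are skipped instead of raising an error.
```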
As mentioned above, the generated query contains an ON CONFLICT DO NOTHING clause. This clause is supported by the SQLite and PostgreSQL databases. If there is a conflict or a unique key violation during the bulk insert, the conflicting record is skipped and the insertion proceeds with the next record.
If we need to ensure that all rows are inserted, we can use the bang version, insert_all!, directly.
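As a sketch (again assuming a unique index on email), insert_all! does not add the ON CONFLICT DO NOTHING clause, so a duplicate raises ActiveRecord::RecordNotUnique:

```ruby
User.insert_all!(
  [
    { name: "Sam", email: "sam@example.com" },
    # Duplicate email below raises ActiveRecord::RecordNotUnique.
    { name: "Sam again", email: "sam@example.com" }
  ]
)
```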
RETURNING "id" clause in the above query return the primary key(s)
@rows=[, ]. If we want to check more attributes apart from
we can use an optional
returning option, which expects an array of
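For example, something like the following sketch (the returning option relies on the database supporting a RETURNING clause, such as PostgreSQL):

```ruby
result = User.insert_all(
  [{ name: "Mark", email: "mark@example.com" }],
  returning: %w[id name]
)
# result is an ActiveRecord::Result whose rows now contain id and name.
```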
insert_all and insert_all! either skip the duplicate records or raise an exception, respectively, if a duplicate record is encountered while bulk inserting.
What if we want to update a record if it already exists, or else create a new one? This is known as an upsert.
The upsert_all method performs bulk upserts.
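The original sample is not shown here; below is a sketch with assumed names, shaped to match the explanation that follows (ids 1, 1, 2 and 3, where users 1 and 2 already exist in the table):

```ruby
User.upsert_all(
  [
    { id: 1, name: "Sam" },        # existing user 1
    { id: 1, name: "Sam's new" },  # duplicate id 1 within the input array
    { id: 2, name: "John new" },   # existing user 2, no duplicate in the array
    { id: 3, name: "Mark" }        # new user
  ]
)
```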
The second row in the input array has a duplicate id 1 and hence the name of the user will be Sam's new instead of the name given in the first row. The third row in the input array has no duplicate and it will perform an update of the existing record. The fourth row with id 3 is a new entry and hence an insert operation will be performed here.
The upsert_all behaviour is different for different databases. The above example works for the MySQL database but will not work for SQLite and PostgreSQL databases.
Inserting or updating records sequentially has a huge performance cost.
Let’s try to insert 1000 users sequentially and benchmark it.
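The exact script isn't shown in the source; a minimal sketch using Ruby's Benchmark module might look like this:

```ruby
require "benchmark"

time = Benchmark.measure do
  1000.times do |i|
    # Each create! runs in its own transaction and issues its own INSERT.
    User.create!(name: "user-#{i}", email: "user-#{i}@example.com")
  end
end

puts time.real
```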
1000 users were created locally in 15.32 seconds, in 1000 transactions executing 1000 sequential insert queries.
Using insert_all, we can prepare the dataset of users and import them in one go.
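A corresponding sketch, with the same assumed columns as above:

```ruby
require "benchmark"

users = 1000.times.map do |i|
  # insert_all skips validations and callbacks and does not fill in
  # timestamps, so add created_at/updated_at here if the columns are NOT NULL.
  { name: "user-#{i}", email: "user-#{i}@example.com" }
end

time = Benchmark.measure do
  # One INSERT statement for all 1000 rows.
  User.insert_all(users)
end

puts time.real
```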
As seen above, the time taken to import 1000 users reduced from 15.32 seconds to just 0.40 seconds.