First of all, I’m no expert in programming. In fact, I started with Ruby on Rails about a month and a half ago, when I began my internship at Codeminer 42.
So that’s me: an intern trying to do things I’ve never done before, based on some online documentation I had just googled and on my colleagues’ efforts to teach me.
Step 1: Read the documentation
That’s right. As an intern who hadn’t got a clue about what to do, my first step was to search for documentation, and I found the official migration docs and a really great article.
The article, as it says, is a supplement to the original Paperclip documentation. Both of them shaped the way I did the migration.
Step x.5: Do what the docs tell you and watch it fail
Actually, this can happen at any of the steps, hence the x in x.5. In other words, this step cuts across all the others.
As you follow the guide and work on the changes, errors will pop up because that’s how life is. And even though you’ve read the entire docs and made the changes as needed, you will probably still get some errors. But that’s OK. Errors teach us humility and make us pay attention to the details.
If you can’t solve the error by yourself within 30 minutes, call for help. There’s no shame in it. Your teammates can help you out and open your mind to new possibilities.
Step 2: Install Active Storage and migrate the database
First, you need to install Active Storage, or you won’t be able to do anything. Assuming you are on Rails 5.2, here is the command you need to run in the terminal:
rails active_storage:install
This generates a migration for the Active Storage tables; run rails db:migrate to actually create them in your app. The tables start out empty, so you have to fill them with the data from Paperclip. For that, I used a rake task that runs the following code:
require 'net/http'
require 'digest'
class MigrateToActiveStorage
  def perform
    # LASTVAL() and the $1-style placeholders below are PostgreSQL-specific.
    get_blob_id = 'LASTVAL()'
    ActiveRecord::Base.connection.raw_connection.prepare("active_storage_blob_statement", <<-SQL)
      INSERT INTO active_storage_blobs (
        key, filename, content_type, metadata, byte_size, checksum, created_at
      ) VALUES ($1, $2, $3, '{}', $4, $5, $6)
    SQL
    ActiveRecord::Base.connection.raw_connection.prepare("active_storage_attachment_statement", <<-SQL)
      INSERT INTO active_storage_attachments (
        name, record_type, record_id, blob_id, created_at
      ) VALUES ($1, $2, $3, #{get_blob_id}, $4)
    SQL
    models = ActiveRecord::Base.descendants.reject(&:abstract_class?)
    models.each do |model|
      # Paperclip attachments are detected by their *_file_name columns.
      attachments = model.column_names.map do |c|
        if c =~ /(.+)_file_name$/
          $1
        end
      end.compact
      model.find_each do |instance|
        attachments.each do |attachment|
          # Skip records that have no file for this attachment.
          next if instance.send("#{attachment}_file_name").blank?
          make_active_storage_records(instance, attachment, model)
        end
      end
    end
  end
  private
  # Inserts one blob row and one attachment row pointing at it.
  def make_active_storage_records(instance, attachment, model)
    blob_key = key(instance, attachment)
    filename = instance.send("#{attachment}_file_name")
    content_type = instance.send("#{attachment}_content_type")
    file_size = instance.send("#{attachment}_file_size")
    file_checksum = checksum(instance.send(attachment))
    created_at = instance.updated_at.iso8601
    blob_values = [blob_key, filename, content_type, file_size, file_checksum, created_at]
    ActiveRecord::Base.connection.raw_connection.exec_prepared(
      "active_storage_blob_statement",
      blob_values
    )
    blob_name = attachment
    record_type = model.name
    record_id = instance.id
    attachment_values = [blob_name, record_type, record_id, created_at]
    ActiveRecord::Base.connection.raw_connection.exec_prepared(
      "active_storage_attachment_statement",
      attachment_values
    )
  end
  def key(instance, attachment)
    # SecureRandom.uuid
    # Alternatively:
    instance.send("#{attachment}").path
  end
  def checksum(attachment)
    # local files stored on disk:
    # url = "#{Rails.root}/public/#{attachment.path}"
    # Digest::MD5.base64digest(File.read(url))
    # remote files stored on another person's computer:
    url = attachment.url
    Digest::MD5.base64digest(Net::HTTP.get(URI(url)))
  end
end
This code fetches all records from the models with Paperclip attachments and fills the Active Storage tables with references to the Paperclip files. In my project, Paperclip is configured to use remote storage (Amazon S3). If you are using local storage, swap the commented and uncommented lines in the checksum method.
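Two pieces of that task can be tried out in plain Ruby, without a database. The sketch below is only an illustration: the column list is made up, and paperclip_attachment_names and blob_checksum are hypothetical helper names, not part of Paperclip or Active Storage. It shows how attachment names fall out of the *_file_name columns and how the Base64-encoded MD5 checksum is computed.

```ruby
require 'digest'

# Hypothetical column list for a model with two Paperclip attachments.
columns = %w[id name avatar_file_name avatar_content_type avatar_file_size
             resume_file_name resume_content_type created_at]

# Same idea as the rake task: keep the prefix of every *_file_name column.
def paperclip_attachment_names(column_names)
  column_names.map { |c| c[/\A(.+)_file_name\z/, 1] }.compact
end

# The task stores a Base64-encoded MD5 digest of the file's bytes.
def blob_checksum(bytes)
  Digest::MD5.base64digest(bytes)
end

puts paperclip_attachment_names(columns).inspect
puts blob_checksum('hello')
```

A Base64-encoded MD5 digest is always 24 characters long, which makes for a quick sanity check on the checksum column after the task has run.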
That being done, you can open the Rails console and check whether the ActiveStorage::Attachment and ActiveStorage::Blob records were created correctly. If so, you are ready to go to the next step.
Step 3: Create a separate branch
This is kind of tricky because you need 2 separate pull requests: one to fill the Active Storage tables (as shown above) and another to replace Paperclip with Active Storage throughout the codebase.
Spoiler alert: in the next step, we will change the models to use Active Storage instead of Paperclip, but if you look closely at the code for the above rake task, you will notice that it uses Paperclip methods:
instance.send("#{attachment}").path
Trying to run the rake task after the model modifications will cause an error. That’s why you need to execute the rake task through a separate PR before changing the models, otherwise things won’t work.
Step 4: Change the models and views
It’s very simple. Just do as the official docs say and you shouldn’t have any problems. Check it out there and let’s be DRY.
Note that Active Storage works by saving the original picture and resizing it on the fly, as opposed to eagerly. That’s why in the model you simply write has_one_attached and leave the crop config to the views.
Step 5: Migrate the attachments
Yeah, you’ve changed your models and your views. You filled out the Active Storage tables. Now you can rest, right? No way. If you look closely at the old attachments, they are still within the Paperclip path, so you aren’t completely Paperclip-free. You must therefore move the attachments onto the ActiveStorage path. This requires running the following code through a rake task:
require 'open-uri'
class MigrateData
  def perform
    models = ActiveRecord::Base.descendants.reject(&:abstract_class?)
    models.each do |model|
      # Paperclip attachments are detected by their *_file_name columns.
      attachments = model.column_names.map do |c|
        if c =~ /(.+)_file_name$/
          $1
        end
      end.compact
      attachments.each do |attachment|
        migrate_data(attachment, model)
      end
    end
  end
  private
  # Downloads each original Paperclip file and attaches it through Active Storage.
  def migrate_data(attachment, model)
    model.where.not("#{attachment}_file_name": nil).find_each do |instance|
      bucket = ENV['AWS_BUCKET']
      name = instance.send("#{attachment}_file_name")
      content_type = instance.send("#{attachment}_content_type")
      id = instance.id
      url = "https://s3.amazonaws.com/#{bucket}/uploads/#{attachment.pluralize}/#{id}/original/#{name}"
      instance.send(attachment.to_sym).attach(
        # URI.open needs Ruby 2.5+; on older Rubies use open(url) instead.
        io: URI.open(url),
        filename: name,
        content_type: content_type
      )
    end
  end
end
This code copies the Paperclip files to the Active Storage path. The duplication matters: because the originals stay in place, you can switch the references over without the risk of losing data. If you are using a storage service other than Amazon S3, just change the URL built in migrate_data. If the files are on your local disk, change that line to use local file paths.
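To make that swap concrete, here is a minimal sketch of the two source locations as plain Ruby helpers. Both helper names are hypothetical. The S3 layout is the one used in the task above, and the local layout is an assumption that simply mirrors it under public/; your Paperclip :path setting decides the real one.

```ruby
# Hypothetical helpers; neither is part of Paperclip or Active Storage.

# Remote source: the S3 layout used by the rake task above.
def paperclip_s3_url(bucket:, attachment_plural:, id:, name:)
  "https://s3.amazonaws.com/#{bucket}/uploads/#{attachment_plural}/#{id}/original/#{name}"
end

# Local source: ASSUMES the same layout mirrored under public/.
# Check your Paperclip :path option before relying on this.
def paperclip_local_path(root:, attachment_plural:, id:, name:)
  File.join(root, 'public', 'uploads', attachment_plural, id.to_s, 'original', name)
end

puts paperclip_s3_url(bucket: 'my-bucket', attachment_plural: 'avatars', id: 7, name: 'me.png')
puts paperclip_local_path(root: '/app', attachment_plural: 'avatars', id: 7, name: 'me.png')
```

With local files, the io: argument passed to attach would be File.open(path) rather than a download of the URL.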
After that, check if the files are hooked up to the ActiveStorage path and if the migration was executed without errors. You should then be ready for the next step.
Step 6: Remove Paperclip
If you got the previous steps right, this won’t be a problem. Just remove the Paperclip gem from your Gemfile, run bundle install, and check if everything is still working. If you have tests, don’t hesitate to run them!
Step 7: Deploy to staging!
Remember:
Merge the first branch and execute the migrations and the rake task for MigrateToActiveStorage, and
Merge the second branch and execute the rake task for MigrateData.
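The rake tasks themselves could look something like the sketch below. The task names paperclip:fill_tables and paperclip:copy_files are hypothetical, and the two classes are replaced by stand-ins so the snippet runs outside a Rails app. In the project, the tasks would call the real classes from steps 2 and 5.

```ruby
require 'rake'

# Stand-ins so the sketch runs on its own; inside the app, delete these and
# let the tasks call the real classes from the earlier steps.
class MigrateToActiveStorage
  def perform
    puts 'Filling the Active Storage tables...'
  end
end

class MigrateData
  def perform
    puts 'Copying files to the Active Storage path...'
  end
end

# Rails defines the :environment task (it boots the app). It is stubbed here
# only so the sketch runs outside a Rails project.
Rake::Task.define_task('environment') unless Rake::Task.task_defined?('environment')

# Hypothetical task names; pick whatever fits your project.
Rake::Task.define_task('paperclip:fill_tables' => 'environment') do
  MigrateToActiveStorage.new.perform
end

Rake::Task.define_task('paperclip:copy_files' => 'environment') do
  MigrateData.new.perform
end
```

In a file under lib/tasks you would normally write these with the namespace and task DSL instead, and run them in order with rails paperclip:fill_tables after the first merge and rails paperclip:copy_files after the second.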
Now delete the old S3 attachment folder and voila! You’ve successfully migrated to Active Storage.
Final considerations
After I got everything working in staging, I noticed the model validations were missing. Unfortunately, Active Storage doesn’t provide built-in validations, which calls for a workaround. But there is a problem: Active Storage saves the attachment blob before running the model validations, when it should do it in the opposite order. I found 3 solutions to this:
Implement a model callback to delete the attachment after validation fails.
Validate the attachment at the controller level. This feels out of place and is thus not ideal.
Active Storage will support model validations in Rails 6.0. Until then, stick with Paperclip and problem solved.
It may come as a surprise that I chose the third option. It was the simplest solution for my project. And even though my work was put on hold, it was a nice experience. It taught me a lot about staging, deployment, rake tasks, file storage, and so on.
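For anyone who goes with the first or second option instead, the check itself is small. Here is a minimal sketch in plain Ruby. The attachment_errors helper, the allowed content types and the 5 MB limit are all assumptions made up for the example, not anything Active Storage provides.

```ruby
# Hypothetical helper: returns a list of problems with an uploaded file.
# The default allowed types and size limit are made-up example values.
def attachment_errors(content_type:, byte_size:,
                      allowed_types: %w[image/png image/jpeg],
                      max_bytes: 5 * 1024 * 1024)
  errors = []
  errors << 'has an unsupported content type' unless allowed_types.include?(content_type)
  errors << 'is too large' if byte_size > max_bytes
  errors
end

puts attachment_errors(content_type: 'image/png', byte_size: 1024).inspect
puts attachment_errors(content_type: 'application/pdf', byte_size: 10 * 1024 * 1024).inspect
```

In a Rails 5.2 model, a custom validation method could feed this the blob's content_type and byte_size, add the messages to errors, and purge the attachment when the list is not empty. That is roughly what the first option amounts to.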
When Rails 6.0 comes out, I will be ready for that.
References
https://blog.carbonfive.com/2018/06/25/safely-migrating-from-paperclip-to-active-storage/
https://gorails.com/episodes/migrate-from-paperclip-to-rails-active-storage
https://github.com/thoughtbot/paperclip/blob/master/MIGRATING.md
https://github.com/rails/rails/commit/e8682c5bf051517b0b265e446aa1a7eccfd47bf7
Thanks to Luan Gonçalves Barbosa, Thiago Araújo Silva, Halan Pinheiro, and Maychell Oliveira.