Privacy on Rails

Pragmatically complying with data protection laws

On a calm and peaceful day, you’re working on your project, combobulating, discombobulating… then this email shows up in the contact inbox:

From: notthatdhh@example.com
To: contact@yourcompany.com
Cc: veradikt@somelawfirm.com
Subject: I’m outta here

Hello, here’s DHH (not the one you’re thinking, I’m Dustin Harper Holloway), and my lawyer Vera Dikt who’s cc’d in this email. Send me everything you have about me. Also delete my account. kthxbye

Would you know what to do? Or would you start spiraling thinking:

  • What tables store this user’s private information?
  • Can we remove them from the logs too?
  • Oh no, what about ActiveJob payloads?
  • How much time do we have to respond?!
  • Are we breaking any law? Are we going to be fined?
  • Are we about to declare bankruptcy?

⚠️ An important disclaimer ⚠️

I’m not a lawyer. This is an engineering perspective on how to pragmatically comply with data protection laws using Rails. If you need legal advice on law compliance, consult a lawyer.

What are data protection laws?

After feeling overwhelmed for a bit, you regain focus and start your research.

You find out that data protection laws exist to protect user data and privacy. They're in force in more than 140 countries, and here's the part that catches you off guard: most of them don't apply only to companies based in the country. They also apply if you have users there.

They’re about consent, data minimization, access rights, breach notification, and purpose limitation.

And "personal data" is a lot broader than you expect. It includes:

  • Basic personal identifiers: Name, email, phone, government ID, date of birth
  • Personal information: Race, religion, gender identity, political views/affiliation
  • Medical information: Prescriptions, diagnoses, lab results, disabilities, health insurance information
  • Financial and official records: Tax records, salary
  • Behavioral and inferred data: Behavior profile, AI-generated predictions
  • Digital activity and technical identifiers: IP address, browsing history, chat messages, location history

And more. If it can be used to identify a person, directly or indirectly, it’s probably covered.

Well, dear reader, if relating to this scenario made you sweat, I have great news: the solution is up ahead!

Why do companies fail to comply with privacy laws?

A few years ago, privacy was a competitive advantage; now it's the baseline for a product. And companies still fail to comply. Why?

The answer: they treat privacy as just a legal concern and put innovation above user privacy.

So let’s start with that in mind:

Don’t gamble with user data to pay for innovation

Privacy by design

When you prioritize your users' privacy and treat it as a normal constraint of your development process (just like we already do for security, performance, and mobile responsiveness), you're moving toward what we call privacy by design.

Privacy by design starts with a few principles:

  1. Data minimization. Don’t collect what you can’t protect, or what you don’t need. Less data, less to worry about.
  2. Private by default. Every data exposure needs an explicit reason. Data must be protected with appropriate technical measures.
  3. Transparency. Users should know and control what you store about them.

Remember: the data belongs to the user, you can’t use it freely just because it’s stored in your database.

Ok, let’s talk about how Rails can be used to follow these principles.

Data minimization

The idea is simple. If you don’t have the data, you don’t have to worry about it. That applies both to data entering your app and data leaving it.

Minimize data entering your app

Use strong parameters

I know, this one sounds obvious. But the point isn't just "don't mass-assign". The point is intentionality. Write an explicit allowlist and permit only what you actually need; don't permit things "just because you might need them later". You can always change the code when something else is needed.

def registration_params
  params.require(:user).permit(
    :email_address, :password, :password_confirmation,
    :first_name, :last_name, :phone
  )
end

Filtering at the entry point covers the whole application. That’s why it’s worth being careful here.

Anonymize user IPs

IP addresses are personal data under most data protection laws. If you don’t need the full IP (and most apps don’t), anonymize it at the middleware level using the ip_anonymizer gem:

config.middleware.insert_after ActionDispatch::RemoteIp, IpAnonymizer::MaskIp

Important: if geocoding matters for your app, it still works with anonymized IPs! You may lose a bit of precision, but most apps don't need that level of precision anyway.

Again, one entry point, whole app covered.
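Under the hood, masking an IP means zeroing the host bits so the address stops pointing at a single user. A minimal plain-Ruby sketch of the idea (the gem handles this for you at the middleware level; the prefix lengths here are typical choices, not the gem's exact ones):

```ruby
require "ipaddr"

# Zero the host bits so the address no longer identifies one user.
# /24 for IPv4 drops the last octet; /48 keeps only the IPv6 prefix.
def anonymize_ip(ip)
  addr = IPAddr.new(ip)
  prefix = addr.ipv4? ? 24 : 48
  addr.mask(prefix).to_s
end

anonymize_ip("203.0.113.42")         # => "203.0.113.0"
anonymize_ip("2001:db8:abcd:12::1")  # => "2001:db8:abcd::"
```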

Minimize data leaving your app

Use explicit serialization

Don’t rely on to_json or as_json defaults that dump entire records. Write a serializer that lists exactly what should be exposed:

class OrderSerializer
  def initialize(order)
    @order = order
  end

  def as_json(*)
    {
      id: @order.id,
      total_cents: @order.total_cents,
      status: @order.status,
      items: @order.order_items.map do |item|
        item.slice(:product_id, :quantity, :price_cents)
      end,
      shipping_address: {
        street: @order.address.street,
        city: @order.address.city,
        state: @order.address.state,
        zip_code: @order.address.zip_code
      }
    }
  end
end

If someone adds a social_security_number column to the orders table next week, it won’t silently leak into your API responses or third-party requests.
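In the controller, rendering then always goes through the serializer, never the raw record. A sketch, assuming a typical authenticated action (Current.user is an assumption about your auth setup):

```ruby
# The serializer, not the model, decides what leaves the app.
def show
  order = Current.user.orders.find(params[:id])
  render json: OrderSerializer.new(order)
end
```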

Minimize job payloads

Every argument you pass to a background job gets serialized into your queue backend (Redis, the database, wherever), shown in job dashboards like Sidekiq Web, and logged on enqueue and retry. Pass personal data in and you’ve just leaked it into a whole new set of places you now have to worry about.

The rule is simple: pass IDs only, never the data itself. Let the job re-fetch what it needs. As a bonus, you also avoid acting on stale data when the job runs minutes or hours after it was enqueued.

# Don't
WelcomeEmailJob.perform_later(
  user.email_address,
  user.first_name,
  user.last_name
)

# Do
WelcomeEmailJob.perform_later(user.id)

class WelcomeEmailJob < ApplicationJob
  # If the user was deleted before the job ran, there's nothing to do
  discard_on ActiveRecord::RecordNotFound

  def perform(user_id)
    user = User.find(user_id)
    UserMailer.welcome(user).deliver_now
  end
end

Safeguard your error tracking strategies

Error tracking services capture full request context: params, headers, IPs, sometimes request bodies. PII ends up in your error tracker without you noticing, and then you have no proper way to delete it upon user request.

First, make sure your filter_parameters config is thorough. A nice bonus: filter_parameters also applies to ActiveRecord's #inspect, so the listed attributes are masked not only in request logging but also whenever you log a model.

Rails.application.config.filter_parameters += [
  :passw, :email, :secret, :token, :_key,
  :crypt, :salt, :certificate, :otp, :ssn, :cvv, :cvc,
  # App-specific PII:
  :first_name, :last_name, :phone, :date_of_birth
]

Then use the logstop gem to scrub PII from error messages before they reach your tracking service:

Logstop.scrub(error.message)

Anonymize analytics data

This is where so-called "innovators" get mad, so I'll say it again: innovation should not come at the expense of your users' privacy. If your business depends on keeping or sharing users' data without their consent, maybe you should reconsider that business. Yes, I know, analytics is important, and here's the thing: you can have it without hurting your users' privacy.

Use a privacy-first analytics service, and make sure tracking scripts are consent-gated. Anonymize any personal information sent to analytics.

There’s no code example in this section, it’s a matter of principle. We’ll see consent modeling and anonymization techniques soon.

Keep personal data out of emails

Data can leak through email logs and bounced-email handling. Keep personal information in emails to the minimum and filter your email logs. Need to send an email to the user? Don't mention their name in it; require them to sign in to access that kind of information.

Also, you should be using signed URLs with auto-expiry and authenticated downloads instead of embedding user data directly. And here’s something you should know: Rails supports it natively!

# In the mailer
@download_url = user_file_download_url(
  token: Rails.application.message_verifier("some_file")
                .generate(user.id, expires_in: 30.minutes)
)
# In the controller
user_id = Rails.application.message_verifier("some_file")
               .verify(params[:token])
# #verify raises ActiveSupport::MessageVerifier::InvalidSignature
# if the token was tampered with or has expired

Data retention and TTL policies

Don’t keep data you no longer need. Set up automatic retention policies that anonymize or delete inactive users, clear expired sessions, and clean up completed job records.

A sample retention job:

class DataRetentionJob < ApplicationJob
  RETENTION_PERIOD = 3.years

  def perform
    cutoff = RETENTION_PERIOD.ago

    User.customer
      .not_anonymized
      .where(updated_at: ...cutoff)
      .find_each do |user|

      # Don't delete users that have pending matters
      next if user.orders.where(status: [:pending, :confirmed, :shipped]).exists?

      user.addresses.not_anonymized.find_each(&:anonymize!)

      user.anonymize!
    end
  end
end

Automate it, don’t do it only when a user requests it. If it depends on someone remembering to run it, it won’t happen.

Private by default

The second principle is about making data protected by default, with no extra effort from the developer. Extra effort should only be needed when you need to expose something private.

Logs

Logs are the number one silent leak of personal data. Sometimes you implement all the other strategies to keep information private and still leak it through logs.

Check everything that’s explicitly logged. And for everything else, guard your logger with the logstop gem:

Logstop.guard(Rails.logger)

This automatically filters common PII patterns (emails, IPs, credit card numbers, and so on) from your log output.

Encryption at rest

Rails natively supports three levels of data protection for model attributes. Pick based on the field’s usage pattern:

                                                               Decrypt  Search  Use
encrypts :first_name                                           Yes      No      Yes
encrypts :email_address, deterministic: true, downcase: true   Yes      Yes     Yes
has_secure_password                                            No       No      Just verify

Non-deterministic encryption is for fields you only need to display, like names or phone numbers. Deterministic encryption is for fields you need to query. You need to look up users by email, so email goes there. Hashing (has_secure_password) is for fields you only need to verify, never read back.
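In a model, the three levels from the table look like this (a sketch using the attribute names from the examples above):

```ruby
class User < ApplicationRecord
  # Non-deterministic: ciphertext differs on every write, display only
  encrypts :first_name

  # Deterministic: the same plaintext always produces the same ciphertext,
  # so User.find_by(email_address: ...) still works
  encrypts :email_address, deterministic: true, downcase: true

  # Hashed with bcrypt: can only be verified, never read back
  has_secure_password
end
```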

Force SSL

Encryption at rest is only half the story. If your app accepts a single request over plain HTTP in production, session cookies and form submissions travel in the clear, and anything on the path between the user and your server can read them. Every production Rails app should force HTTPS.

Rails makes it a one-liner:

# config/environments/production.rb
config.force_ssl = true

That flag does three things at once: redirects any HTTP request to HTTPS, sets the Strict-Transport-Security header so browsers refuse plain HTTP on future visits, and marks your cookies as secure so they never leave the browser over an unencrypted connection.

If you want to go further and submit your domain to the HSTS preload list, be intentional about it — preloading is sticky and hard to undo:

config.ssl_options = {
  hsts: {
    expires: 1.year,
    subdomains: true,
    preload: true
  }
}

Don’t turn on preload until you’re sure every subdomain can serve HTTPS. Once a browser has your domain preloaded, there’s no "just use HTTP for a minute" escape hatch.

Protect cookies properly

Be intentional about cookies. Define proper same_site policies and pick the right level of protection:

cookies.permanent.signed[:app_session_id] = {
  value: session_id,
  httponly: true,
  same_site: :strict # or :lax
}

cookies.encrypted[:contact_draft] = {
  value: {
    name: "Hugh Zerr",
    email: "hughzerr@example.com",
    message: "Hello, here's..."
  }.to_json,
  httponly: true,
  same_site: :strict, # or :lax
  expires: 30.minutes.from_now
}

Use signed for data that shouldn’t be tampered with. Use encrypted for data that shouldn’t be read by the client at all.

Don't treat the same_site policy the way people often treat CORS, trying options at random until the browser stops yelling at them. Be intentional.

Make your backups secure

Always encrypt your database backups and apply an automatic retention policy to them. There’s no point in encrypting your production database if old, unencrypted backups are sitting in a bucket somewhere.
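There's no single Rails knob for this; it depends on your infrastructure. As one hedged sketch, a Postgres dump can be encrypted before it ever touches storage (pg_dump, gzip, and openssl are assumed to be available; BACKUP_PASSPHRASE is an assumed environment variable):

```shell
pg_dump myapp_production \
  | gzip \
  | openssl enc -aes-256-cbc -pbkdf2 -pass env:BACKUP_PASSPHRASE \
  > "backup_$(date +%F).sql.gz.enc"
```

Pair this with a lifecycle rule on the bucket so old backups expire automatically.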

Protect direct console access in production

When someone opens a Rails console in production, they have direct access to all your user data. Track who accessed it, why, and what commands they ran. The console1984 and audits1984 gems do a good job at this.


When you open the console, you’re greeted with a clear message: "You have access to production data here. That’s a big deal. As part of our promise to keep customer data safe and private, we audit the commands you type here." It even asks you to explain why you’re using the console before you start.

Transparency and data rights

The third principle is about users knowing and controlling what you store about them. This is where things get more involved, and also where Rails gives you plenty to work with.

Consent modeling

A cookie banner alone doesn't count as consent. Think about everything you do with user data: order processing, marketing, analytics tracking, third-party sharing, and specific features. Each of those needs its own consent.

Every consent should be per-purpose, versioned, and stored with proof. Consents should also be explicit. No user consent, no action.

class Consent < ApplicationRecord
  PURPOSES = %w[
    order_processing
    marketing_analytics
    third_party_sharing
  ].freeze

  belongs_to :user

  encrypts :ip_address

  enum :status, { granted: 0, revoked: 1 }

  validates :purpose, presence: true, inclusion: { in: PURPOSES }
  validates :status, presence: true
end

Notice that we encrypt the IP address in the consent record itself; it serves as proof of when and where consent was given. Also notice that consents are stored as an immutable, append-only audit trail: revoking creates a new record, nothing is deleted.
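The append-only model implies small helpers on User for granting, revoking, and checking consent. A hypothetical sketch (these method names are assumptions, not part of the Consent model above):

```ruby
class User < ApplicationRecord
  has_many :consents

  # Append a new record instead of mutating the old one
  def grant_consent(purpose, ip_address: nil)
    consents.create!(purpose: purpose, status: :granted, ip_address: ip_address)
  end

  def revoke_consent(purpose)
    consents.create!(purpose: purpose, status: :revoked)
  end

  # The latest record for a purpose decides the current state
  def consented_to?(purpose)
    consents.where(purpose: purpose).order(:created_at).last&.granted? || false
  end
end
```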

Then enforce consent at the controller level:

before_action -> { require_consent!("order_processing") }, only: :create

No consent, no action.
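require_consent! isn't a Rails built-in. A minimal sketch of such a helper, assuming Current.user and a lookup against the append-only consent trail:

```ruby
# A hypothetical ApplicationController helper.
def require_consent!(purpose)
  latest = Current.user.consents
                  .where(purpose: purpose)
                  .order(:created_at)
                  .last

  head :forbidden unless latest&.granted?
end
```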

Consent dashboard

On the user-facing side, you can build a consent management page that shows each purpose, its current status, what it controls, and a full audit trail of when consents were granted or revoked.

Data subject access rights

Remember that email from Dustin at the beginning? Users have access, rectification, and erasure rights over their data. And there are deadlines: LGPD (Brazil) requires a response within 15 days; GDPR (EU) within 30 days, extendable to 90. That sounds like plenty of time, but you don't want to reconstruct everything from scratch each time a user files a request.

Model it explicitly:

class DataSubjectRequest < ApplicationRecord
  belongs_to :user

  enum :request_type, { access: 0, rectification: 1, erasure: 2 }
  enum :status, {
    pending: 0,
    approved: 1,
    processing: 2,
    completed: 3,
    rejected: 4
  }

  validates :request_type, :status, presence: true

  scope :pending_review, -> { where(status: :pending) }

  def approve!
    update!(status: :approved)
    ProcessDataSubjectRequestJob.perform_later(id)
  end

  def reject!(reason:)
    update!(status: :rejected, notes: reason)
  end
end

Make it as automatic as possible. When a request is approved, a background job kicks in and handles the heavy lifting automatically.

Implementing access request (DSAR) handling

For access requests, you need to export every piece of data you hold about a user. Start with a reusable concern that declares which fields are exportable:

module DataExportable
  extend ActiveSupport::Concern

  class_methods do
    def exportable(*fields)
      @exportable_fields = fields
    end

    def exportable_fields
      # Falls back to every column; declare fields explicitly to stay intentional
      @exportable_fields || column_names.map(&:to_sym)
    end
  end

  def export_data
    self.class.exportable_fields.each_with_object({}) do |field, hash|
      hash[field] = public_send(field)
    end
  end
end

Then declare the exportable fields on each model:

class User < ApplicationRecord
  # ...
  include DataExportable

  exportable :uuid,
             :email_address,
             :first_name,
             :last_name,
             :phone,
             :date_of_birth
end

Now you have a standardized way to get all the data that should be sent to the user should they request it.

class DataExportSerializer
  def initialize(user)
    @user = user
  end

  def as_json(*)
    {
      profile: @user.export_data,
      addresses: @user.addresses.map(&:export_data),
      orders: @user.orders.includes(:order_items).map(&:export_data),
      consents: @user.consents.map(&:export_data),
      analytics: {
        account_created_at: @user.created_at,
        total_orders: @user.orders.count,
        total_spent_cents: @user.orders.sum(:total_cents)
      }
    }
  end
end

The output looks like this:

{
  "profile": {
    "uuid": "a1b2c3d4-e5f6-...",
    "email_address": "hugh@example.com",
    "first_name": "Hugh",
    "last_name": "Zerr",
    "phone": "+5511999990000",
    "date_of_birth": "1998-05-14"
  },
  "addresses": [
    {
      "label": "Home",
      "street": "Av Brasil, 500",
      "city": "São Paulo",
      "state": "SP",
      "zip_code": "01430-000",
      "country": "BR"
    }
  ],
  "orders": [
    {
      "number": "ORD-4A8F2B",
      "status": "delivered",
      "total_cents": 15000,
      "created_at": "2025-11-20"
    }
  ]
}

Implementing erasure request handling

For erasure requests, the job anonymizes all associated records and revokes active consents:

class ProcessDataSubjectRequestJob < ApplicationJob
  # ...
  def process_erasure(request)
    user = request.user

    # Anonymize addresses
    user.addresses.not_anonymized.find_each(&:anonymize!)

    # Anonymize user profile
    user.anonymize!

    # Revoke all active consents
    user.consents.where(status: :granted).find_each do |consent|
      user.revoke_consent(consent.purpose)
    end
  end
end

Implementing anonymization

The anonymization itself is also a reusable concern:

module Anonymizable
  extend ActiveSupport::Concern

  class_methods do
    def anonymizable(*fields)
      @anonymizable_fields = fields
    end

    def anonymizable_fields
      @anonymizable_fields || []
    end
  end

  # scopes and #anonymized? omitted

  def anonymize!
    return if anonymized?

    transaction do
      self.class.anonymizable_fields.each do |field|
        public_send(:"#{field}=", "[ANONYMIZED]")
      end

      self.anonymized_at = Time.current
      save!(validate: false)
    end
  end
end

class User < ApplicationRecord
  # ...
  include Anonymizable

  anonymizable :first_name,
               :last_name,
               :phone,
               :date_of_birth
end

After anonymization, that same export looks like this:

{
  "profile": {
    "uuid": "a1b2c3d4-e5f6-...",
    "email_address": "[ANONYMIZED]",
    "first_name": "[ANONYMIZED]",
    "last_name": "[ANONYMIZED]",
    "phone": "[ANONYMIZED]",
    "date_of_birth": "[ANONYMIZED]"
  },
  "addresses": [
    {
      "label": "Home",
      "street": "[ANONYMIZED]",
      "city": "[ANONYMIZED]",
      "state": "[ANONYMIZED]",
      "zip_code": "[ANONYMIZED]",
      "country": "BR"
    }
  ],
  "orders": [
    {
      "number": "ORD-4A8F2B",
      "status": "delivered",
      "total_cents": 15000,
      "created_at": "2025-11-20"
    }
  ]
}

The personal data is gone, but the business data (order totals, statuses, dates) stays. That’s the point of anonymization over deletion: you keep your analytics and audit trail without keeping the personal data.

Ok, I recognize that’s a lot of things to do. There must be a way to make it automatic, right?

Automate it with agent skills

If you’re looking at an existing Rails app and wondering where to even start, you’re not alone. Most teams are in the same spot. And we’re in the age of AI, so let’s use it.

We built a set of open-source agent skills that bring privacy-by-design into your Rails workflow. They work in two modes:

  1. Complete codebase assessment. Scans your entire application and generates a thorough report with findings, severity ratings, and prioritized recommendations.
  2. Review your recent changes. Checks uncommitted or recently changed files for privacy rule violations before you commit.

Both modes generate a detailed report and offer fixes. The skills are fully open source and ship with Ruby scripts for easy auditing, so you can inspect exactly what they check.

Install them with:

# knowledge base skill
npx skills add codeminer42/skills --skill privacy-by-design-rails

# command skills
npx skills add codeminer42/skills --skill privacy-assessment-rails
npx skills add codeminer42/skills --skill privacy-review-rails

When you run the full assessment, you get a report that starts with an executive summary:

Executive summary

Then, findings by severity. Each one comes with its location in the code, a description of the issue, the relevant code snippet, and a recommended fix:

Findings by severity

A checklist summary with a pass/fail overview of every privacy check:

Checklist

And prioritized recommendations, organized by severity: Critical (address immediately), High (address soon), and Medium (plan and implement):

Recommendations

The skills live at github.com/codeminer42/skills.

Wrapping up

Data privacy laws don’t kill innovation. Not complying is a skill issue.

Respect your users' privacy; Rails will help you with that.

Previous work

This post is based on my talk at Tropical on Rails 2026. You can find the slides here, and I’ll update the post once the recording is available.

We want to work with you. Check out our Services page!