Fifteen Ruby takes on a stepped-worker DSL

We want a Ruby class that models a multi-step transaction script. Given an input, it runs through a series of steps in order, and each call to step! advances the cursor by one. The base class (Steppa) knows how many steps the subclass has, and each instance tracks where it is in that list.

Our domain is e-commerce order fulfillment. The four steps are validation, preparation (totals, tax, shipping), persistence, and customer notification.

We’re going to look at fifteen ways to declare those steps in a subclass. Each runs the same four steps in the same order. What changes is the Ruby idiom we use to say “here is my list of steps.”

The manifest

The declaration is a plain constant: an array of method names in the order we want to call them.

class OrderFulfillment < Steppa
  STEPS = %i[validate prepare persist notify]

  def initialize(order)
    @order = order
    super()
  end

  def validate
    raise "out of stock" unless @order.items.all?(&:in_stock?)
  end

  def prepare
    @order.total = @order.items.sum(&:price) + @order.shipping
  end

  def persist
    @order.save!
  end

  def notify
    OrderMailer.confirmation(@order).deliver_later
  end
end

Steppa indexes that constant and sends the method name at the current cursor:

class Steppa
  def initialize
    @cursor = 0
  end

  def step!
    return if done?
    send(self.class::STEPS[@cursor])
    @cursor += 1
  end

  def done? = @cursor >= self.class::STEPS.size
end

With step! and done? in place, running the whole workflow looks like this:

worker = OrderFulfillment.new(order)
worker.step! until worker.done?

The same calling convention applies to every take that follows.

The explicit DSL

The subclass calls step once per method name:

class OrderFulfillment < Steppa
  step :validate
  step :prepare
  step :persist
  step :notify

  def initialize(order)
    @order = order
    super()
  end

  def validate
    raise "out of stock" unless @order.items.all?(&:in_stock?)
  end

  # prepare, persist, notify – bodies as in "The manifest"
end

Steppa adds a class-level step method that appends each declared name to an array on the class:

class Steppa
  class << self
    def steps = @steps ||= []
    def step(name) = steps << name
  end

  def initialize
    @cursor = 0
  end

  def step!
    return if done?
    send(self.class.steps[@cursor])
    @cursor += 1
  end

  def done? = @cursor >= self.class.steps.size
end
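
One detail worth spelling out: @steps here is an instance variable on the class object itself, so each subclass lazily gets its own list. A minimal sketch, assuming a trimmed-down Steppa and two hypothetical workers (RefundFlow and ShipmentFlow are illustration-only names, not part of the running example):

```ruby
# Trimmed-down base class: only the class-level registry.
class Steppa
  class << self
    def steps = @steps ||= []
    def step(name) = steps << name
  end
end

# Hypothetical subclasses, for illustration only.
class RefundFlow < Steppa
  step :validate
  step :refund
end

class ShipmentFlow < Steppa
  step :pack
end

p RefundFlow.steps    # [:validate, :refund]
p ShipmentFlow.steps  # [:pack]
p Steppa.steps        # []
```

The flip side is that a subclass of RefundFlow would start with an empty list rather than inheriting its parent’s steps.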

The sugared DSL

The declaration can fold step into the method definition:

class OrderFulfillment < Steppa
  def initialize(order)
    @order = order
    super()
  end

  step def validate
    raise "out of stock" unless @order.items.all?(&:in_stock?)
  end

  step def prepare
    @order.total = @order.items.sum(&:price) + @order.shipping
  end

  step def persist
    @order.save!
  end

  step def notify
    OrderMailer.confirmation(@order).deliver_later
  end
end

The base class is unchanged from the previous take. Since def in Ruby returns the name of the method it just defined as a symbol, we can pass its return value directly to step. This is the same trick private def validate uses. The visibility methods – private, protected, public – all accept a method name as a symbol, so private :validate is the longhand form. Our step has that same shape, so step def validate ... end passes :validate to step.
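
A small sketch of that return value in isolation. Demo and its recording step method are hypothetical stand-ins, used only to show what each call receives:

```ruby
class Demo
  # Hypothetical stand-in for Steppa.step: it only records its argument.
  def self.step(name) = (@received ||= []) << name
  def self.received = @received

  returned = def validate; end   # def evaluates to :validate (Ruby 2.1+)
  step returned                  # same as: step :validate

  step def prepare; end          # the sugared form, in one expression

  private def helper; end        # private receives :helper the same way
end

p Demo.received                         # [:validate, :prepare]
p Demo.private_instance_methods(false)  # [:helper]
```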

The naming convention

The subclass marks steps by naming convention:

class OrderFulfillment < Steppa
  def initialize(order)
    @order = order
    super()
  end

  def validate_step
    raise "out of stock" unless @order.items.all?(&:in_stock?)
  end

  # prepare_step, persist_step, notify_step – same shape
end

Ruby calls method_added on a class every time a method is added to it. We use that hook to pick up any method whose name ends in _step and put it on the step list.

class Steppa
  class << self
    def steps = @steps ||= []

    def method_added(name)
      super
      steps << name if name.to_s.end_with?("_step")
    end
  end

  def initialize
    @cursor = 0
  end

  def step!
    return if done?
    send(self.class.steps[@cursor])
    @cursor += 1
  end

  def done? = @cursor >= self.class.steps.size
end

The block-step DSL

Here, step takes a block along with the name. The subclass declares each step like this:

class OrderFulfillment < Steppa
  def initialize(order)
    @order = order
    super()
  end

  step :validate do
    raise "out of stock" unless @order.items.all?(&:in_stock?)
  end

  # step :prepare, :persist, :notify – same shape
end

The block form stores the block as data and runs it via instance_exec:

class Steppa
  class << self
    def steps = @steps ||= []
    def step(name, &block) = steps << [name, block]
  end

  def initialize
    @cursor = 0
  end

  def step!
    return if done?
    _, block = self.class.steps[@cursor]
    instance_exec(&block)
    @cursor += 1
  end

  def done? = @cursor >= self.class.steps.size
end

The define_method form turns each block into a regular instance method on the subclass:

class Steppa
  class << self
    def steps = @steps ||= []

    def step(name, &block)
      steps << name
      define_method(name, &block)
    end
  end

  def initialize
    @cursor = 0
  end

  def step!
    return if done?
    send(self.class.steps[@cursor])
    @cursor += 1
  end

  def done? = @cursor >= self.class.steps.size
end

In the block form, each block is stored in the worker class’s step list, and the subclass has no per-step methods. In the define_method form, each block becomes a regular instance method on the subclass, callable by name in tests or debugging.
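
The difference is easy to check from the outside. In this sketch the two base classes are renamed BlockSteppa and MethodSteppa (hypothetical names) so both forms can sit side by side, and the step bodies are placeholders:

```ruby
# Block form: the block is stored as data next to its name.
class BlockSteppa
  class << self
    def steps = @steps ||= []
    def step(name, &block) = steps << [name, block]
  end
end

# define_method form: the block becomes an instance method.
class MethodSteppa
  class << self
    def steps = @steps ||= []

    def step(name, &block)
      steps << name
      define_method(name, &block)
    end
  end
end

class BlockWorker < BlockSteppa
  step(:validate) { :validated }
end

class MethodWorker < MethodSteppa
  step(:validate) { :validated }
end

p BlockWorker.new.respond_to?(:validate)   # false
p MethodWorker.new.respond_to?(:validate)  # true
p MethodWorker.new.validate                # :validated
```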

The step class

Each step is a small class with a single #call method:

ValidateOrder = Struct.new(:order) do
  def call
    raise "out of stock" unless order.items.all?(&:in_stock?)
  end
end

# PrepareOrder, PersistOrder, NotifyOrder – same shape, Struct with #call

The subclass declares its steps as a list of those classes:

class OrderFulfillment < Steppa
  step ValidateOrder
  step PrepareOrder
  step PersistOrder
  step NotifyOrder
end

On each step!, Steppa takes the class at the cursor, instantiates it with the worker’s arguments, and calls it:

class Steppa
  class << self
    def steps = @steps ||= []
    def step(klass) = steps << klass
  end

  def initialize(*args)
    @args = args
    @cursor = 0
  end

  def step!
    return if done?
    self.class.steps[@cursor].new(*@args).call
    @cursor += 1
  end

  def done? = @cursor >= self.class.steps.size
end

The mixin chain

Each step is a module with the step’s method on it. The subclass includes those modules, and the worker class builds its step list from them.

module ValidateStep
  extend Steppa::Step

  def validate
    raise "out of stock" unless @order.items.all?(&:in_stock?)
  end
end

# PrepareStep, PersistStep, NotifyStep – same shape, each extends Steppa::Step

class OrderFulfillment < Steppa
  include ValidateStep
  include PrepareStep
  include PersistStep
  include NotifyStep

  def initialize(order)
    @order = order
    super()
  end
end

Steppa can collect the list two ways. The pull version reads from the worker class’s ancestors whenever someone asks for steps. The push version has each step module append its method names to the worker class’s step list when it gets included.

Pull reads the list from ancestors on each call, with Step as an empty marker:

class Steppa
  module Step; end  # marker for step modules

  def self.steps
    ancestors
      .select { |m| m.is_a?(Step) }
      .flat_map { |m| m.instance_methods(false) }
      .reverse
  end

  def initialize
    @cursor = 0
  end

  def step!
    return if done?
    send(self.class.steps[@cursor])
    @cursor += 1
  end

  def done? = @cursor >= self.class.steps.size
end

Push has each step module register itself when included, with Step carrying the included callback:

class Steppa
  class << self
    def steps = @steps ||= []
  end

  module Step
    def included(base)
      base.steps.concat(instance_methods(false))
    end
  end

  def initialize
    @cursor = 0
  end

  def step!
    return if done?
    send(self.class.steps[@cursor])
    @cursor += 1
  end

  def done? = @cursor >= self.class.steps.size
end

Pull reads ancestors fresh on every call, so the worker class does not need a stored list. The list has to be reversed because each include inserts its module ahead of the modules included before it, so ancestors lists the most recently included module first. Push stores the list once on the worker class, in include order, but the Step marker has to carry the included callback that step modules pick up via extend.

Step order comes from the order of the include lines in the subclass, which means moving a line also moves the step. In a short class body that’s manageable, but in a longer file where the includes aren’t all visible at once, it’s an easy thing to get wrong.
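
The reversal in the pull version follows directly from how include orders the ancestor chain. A quick check, using placeholder modules First and Second that are not part of the running example:

```ruby
module First;  end
module Second; end

class Host
  include First
  include Second
end

# The module included last sits closest to the class.
p Host.ancestors.take(3)  # [Host, Second, First]
```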

The constant declaration

Each step is declared as a lambda assigned to a constant on the subclass:

class OrderFulfillment < Steppa
  def initialize(order)
    @order = order
    super()
  end

  Validate = -> { raise "out of stock" unless @order.items.all?(&:in_stock?) }
  Prepare  = -> { @order.total = @order.items.sum(&:price) + @order.shipping }
  Persist  = -> { @order.save! }
  Notify   = -> { OrderMailer.confirmation(@order).deliver_later }
end

Ruby 3.2 added a const_added hook that fires when a constant is assigned on a class, analogous to method_added for methods. Steppa uses that hook to pick up any lambda assigned as a constant on the subclass and treats it as a step:

class Steppa
  class << self
    def steps = @steps ||= []

    def const_added(name)
      super
      value = const_get(name)
      steps << value if value.is_a?(Proc) && value.lambda?
    end
  end

  def initialize
    @cursor = 0
  end

  def step!
    return if done?
    instance_exec(&self.class.steps[@cursor])
    @cursor += 1
  end

  def done? = @cursor >= self.class.steps.size
end

The bound method list

The subclass exposes steps as a list of bound method objects:

class OrderFulfillment < Steppa
  def initialize(order)
    @order = order
    super()
  end

  def steps = @steps ||= [
    method(:validate),
    method(:prepare),
    method(:persist),
    method(:notify),
  ]

  def validate
    raise "out of stock" unless @order.items.all?(&:in_stock?)
  end

  # prepare, persist, notify – bodies as in "The manifest"
end

Ruby can turn a method into an object: method(:name) returns a Method object bound to the current instance. We can store those objects in a list and call the current one:

class Steppa
  def initialize
    @cursor = 0
  end

  def step!
    return if done?
    steps[@cursor].call
    @cursor += 1
  end

  def done? = @cursor >= steps.size
end
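
To see what “bound” means here, a short sketch with a hypothetical Greeter class. The Method object remembers its receiver, so call needs nothing else:

```ruby
class Greeter
  def initialize(name)
    @name = name
  end

  def greet = "hello, #{@name}"
end

ada = Greeter.new("Ada")
bound = ada.method(:greet)

p bound.class                 # Method
p bound.receiver.equal?(ada)  # true
p bound.call                  # "hello, Ada"
```

Because each Method object is tied to one receiver, the list in this take has to live on the instance rather than on the class.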

The TracePoint

The subclass shape is the same as the naming-convention take:

class OrderFulfillment < Steppa
  def initialize(order)
    @order = order
    super()
  end

  def validate_step
    raise "out of stock" unless @order.items.all?(&:in_stock?)
  end

  # prepare_step, persist_step, notify_step – same shape
end

TracePoint lets us listen for low-level interpreter events. We hook into :end events – which fire when a class or module body closes – and scan the newly closed subclass for methods to register.

class Steppa
  class << self
    def steps = @steps ||= []

    def inherited(subclass)
      super

      # :end fires whenever any class or module body closes.
      TracePoint.new(:end) do |tp|
        # Keep only the event for the subclass currently being defined.
        next unless tp.self == subclass

        step_methods = subclass.instance_methods(false)
          .select { |m| m.to_s.end_with?("_step") }

        # instance_methods(false) finds the methods, but its order is
        # not guaranteed to match the order in the class body.
        step_methods
          .sort_by { |m| subclass.instance_method(m).source_location.last }
          .each { |m| subclass.steps << m }

        # This take only handles the first class body.
        # Reopening the class later won't add more steps.
        tp.disable
      end.enable
    end
  end

  def initialize
    @cursor = 0
  end

  def step!
    return if done?
    send(self.class.steps[@cursor])
    @cursor += 1
  end

  def done? = @cursor >= self.class.steps.size
end

TracePoint isn’t the right tool for this DSL. While it is enabled, the trace fires for every class and module body that closes anywhere in the program, so even this filtered version does more work than the problem deserves. It can also interact badly with JIT compilers, which may have to discard compiled code when a trace is enabled. Sorting by source_location and disabling the trace after the first class body make the example behave, but they are both signs that the hook is too low-level for this job.

TracePoint is still worth recognizing. It does not come up in everyday Ruby, but debuggers, coverage tools, and instrumentation libraries use it for work that ordinary method hooks cannot see.

Resuming from step 3

Suppose we need to resume a workflow that’s partway done. Steps 1 and 2 ran in a previous background job, and we’d like to instantiate the worker with its cursor at 2 so the next step! runs persistence.

In the twelve takes above (counting both forms of the block-step DSL and both forms of the mixin chain), the step list is something we can index by position. Adding a starting cursor to Steppa and forwarding it from the subclass does the rest:

class Steppa
  def initialize(starting_at: 0)
    @cursor = starting_at
  end
end

class OrderFulfillment < Steppa
  def initialize(order, **opts)
    @order = order
    super(**opts)
  end
end

OrderFulfillment.new(order, starting_at: 2).step!  # runs persistence
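
Put together with the manifest take, a self-contained version might look like the sketch below. RecordingFulfillment is a hypothetical stub whose steps only record their own names, so the effect of the starting cursor is visible without a real order:

```ruby
class Steppa
  def initialize(starting_at: 0)
    @cursor = starting_at
  end

  def step!
    return if done?
    send(self.class::STEPS[@cursor])
    @cursor += 1
  end

  def done? = @cursor >= self.class::STEPS.size
end

# Stub worker: each step appends its name to @ran instead of doing real work.
class RecordingFulfillment < Steppa
  STEPS = %i[validate prepare persist notify]
  attr_reader :ran

  def initialize(**opts)
    @ran = []
    super(**opts)
  end

  STEPS.each { |name| define_method(name) { @ran << name } }
end

worker = RecordingFulfillment.new(starting_at: 2)
worker.step! until worker.done?
p worker.ran  # [:persist, :notify]
```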

The next three takes do not have a step list. The workflow is one method, with Fiber.yield, y << val, or callcc marking the pauses. Step 3 is a place inside that method rather than an entry we can index.

That makes resumption awkward. Calling run starts at validation again. Guarding each section with resume_from turns the method into a dispatcher, and persisting a fiber or continuation means serializing execution state.
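
To make the dispatcher problem concrete, here is a sketch of what guarding each section might look like in a fiber-based workflow. Everything in it is hypothetical: GuardedWorkflow, the resume_from: keyword, and the log entries standing in for the real step bodies.

```ruby
class GuardedWorkflow
  attr_reader :log

  def initialize(resume_from: 0)
    @resume_from = resume_from
    @log = []
    @done = false
    # Mark the worker done once run falls off the end.
    @fiber = Fiber.new { run; @done = true }
  end

  # Every section needs its own guard so a resumed worker can skip it.
  def run
    if @resume_from <= 0
      @log << :validate
      Fiber.yield
    end

    if @resume_from <= 1
      @log << :prepare
      Fiber.yield
    end

    if @resume_from <= 2
      @log << :persist
      Fiber.yield
    end

    @log << :notify
  end

  def step!
    @fiber.resume unless @done
  end

  def done? = @done
end

worker = GuardedWorkflow.new(resume_from: 2)
worker.step! until worker.done?
p worker.log  # [:persist, :notify]
```

Each new step means another guard, which is bookkeeping the indexed step list handled for free.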

What we get back is a workflow that reads top-to-bottom.

The Fiber

Here the workflow is a single linear method, and each Fiber.yield marks the boundary between two steps:

class OrderFulfillment < Steppa
  def initialize(order)
    @order = order
    super()
  end

  def run
    raise "out of stock" unless @order.items.all?(&:in_stock?)
    Fiber.yield

    @order.total = @order.items.sum(&:price) + @order.shipping
    Fiber.yield

    @order.save!
    Fiber.yield

    OrderMailer.confirmation(@order).deliver_later
  end
end

The base class creates the fiber in initialize and resumes it once per step!:

class Steppa
  def initialize
    @fiber = Fiber.new { run }
  end

  def step!
    return if done?
    @fiber.resume
  end

  def done? = !@fiber.alive?
end

The Enumerator

The subclass can express the same boundaries with a run(y) method and a yielder:

class OrderFulfillment < Steppa
  def initialize(order)
    @order = order
    super()
  end

  def run(y)
    raise "out of stock" unless @order.items.all?(&:in_stock?)
    y << :validated

    @order.total = @order.items.sum(&:price) + @order.shipping
    y << :prepared

    @order.save!
    y << :persisted

    OrderMailer.confirmation(@order).deliver_later
  end
end

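Enumerator.new passes its block a yielder. When the enumerator is driven externally with next, each y << hands one value out and pauses the block until the outer code asks for the next value. Steppa wraps the workflow in an Enumerator and pulls one value per step!; the fourth call runs the notification, reaches the end of the block, and raises StopIteration, which step! rescues to mark the worker done: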

class Steppa
  def initialize
    @enumerator = Enumerator.new { |y| run(y) }
    @done = false
  end

  def step!
    return if @done
    @enumerator.next
  rescue StopIteration
    @done = true
  end

  def done? = @done
end
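
The pausing behavior is easier to trust after watching it once. A standalone sketch, with a trace array (an illustration-only name) recording how far the block has run after each next:

```ruby
trace = []

enum = Enumerator.new do |y|
  trace << :before_first
  y << :first
  trace << :before_second
  y << :second
  trace << :after_second
end

p enum.next  # :first
p trace      # [:before_first]
p enum.next  # :second
p trace      # [:before_first, :before_second]

# The third next runs the tail of the block, then raises StopIteration.
begin
  enum.next
rescue StopIteration
  trace << :stopped
end
p trace      # [:before_first, :before_second, :after_second, :stopped]
```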

The exile (callcc)

The subclass is another linear workflow, this time with yield_step between sections:

class OrderFulfillment < Steppa
  def initialize(order)
    @order = order
    super()
  end

  def run
    raise "out of stock" unless @order.items.all?(&:in_stock?)
    yield_step

    @order.total = @order.items.sum(&:price) + @order.shipping
    yield_step

    @order.save!
    yield_step

    OrderMailer.confirmation(@order).deliver_later
  end
end

Underneath, this uses Ruby continuations. They have been officially obsolete since Ruby 2.2 but still ship with the standard library, which prints a deprecation warning whenever the continuation library is required.

require "continuation"

class Steppa
  def initialize
    @done = false
    @resume_here = nil
    @return_here = nil
  end

  def step!
    return if @done
    callcc do |k|
      @return_here = k
      if @resume_here
        @resume_here.call
      else
        run
        @done = true
      end
    end
  end

  def yield_step
    callcc do |k|
      @resume_here = k
      @return_here.call
    end
  end

  # The shared calling convention (step! until done?) needs this reader.
  def done? = @done
end

A bit of Ruby archaeology on how callcc got here.

In late 2007, not long after I picked up Ruby 1.8.5, ruby-talk was already debating whether continuations would survive the jump to 1.9. The initial post claimed they were being dropped. A same-day correction pointed out that 1.9 was moving them out of core into a continuation bundled library, not removing them. The 1.9.1 NEWS matched that account when it shipped in January 2009: “Kernel#callcc and Continuation now become ‘continuation’ bundled library.”

In early 2009, Charles Nutter noted on ruby-talk that JRuby didn’t support continuations and called the whole area unlikely to behave consistently across implementations. A few months later, in another thread, he added that JRuby could implement them but would be several times slower. Matz, in the same discussion, pushed back: “It’s not for the performance, but for various problems it can cause, especially when continuation reenters into C function.”

The removal attempt became formal five years later. At the Ruby developers’ meeting of 2014-05-17, Matz agreed to remove callcc. Feature #10548 followed on 2014-11-26, and the first applied change marked callcc obsolete. Ruby 2.2 shipped that warning a month after.

That was back in 2014. More than a decade later, callcc is still part of the interpreter, and the deprecation warning from Ruby 2.2 still prints every time anything requires the continuation library.

Wrapping up

Of the fifteen, there are two versions I’d actually keep.

For application code, I’d pick the manifest:

STEPS = %i[validate prepare persist notify]

It is the normal answer. The workflow is visible before any method body, and changing the order means moving symbols around. A plain constant does enough.

For a gem, or for a shared internal abstraction used in more than one place, I’d use the explicit DSL:

class OrderFulfillment < Steppa
  step :validate
  step :prepare
  step :persist, retry: false
  step :notify, async: true
end

Once steps need their own behavior, the options belong next to the step names.

step :persist, retry: false is the line that changes the feel of it: run persist, but do not retry it.
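
A minimal sketch of how the base class might accept those options. The option names come from the example above, but what they should do is left open: this version only stores them next to the step name, and StepDef is a hypothetical helper.

```ruby
class Steppa
  # Hypothetical record pairing a step name with its options.
  StepDef = Struct.new(:name, :options)

  class << self
    def steps = @steps ||= []
    def step(name, **options) = steps << StepDef.new(name, options)
  end

  def initialize
    @cursor = 0
  end

  def step!
    return if done?
    current = self.class.steps[@cursor]
    # current.options is available here for retry or async handling.
    send(current.name)
    @cursor += 1
  end

  def done? = @cursor >= self.class.steps.size
end

# Placeholder step bodies, for illustration only.
class OrderFulfillment < Steppa
  step :validate
  step :persist, retry: false

  def validate = nil
  def persist = nil
end

p OrderFulfillment.steps.map(&:name)            # [:validate, :persist]
p OrderFulfillment.steps.last.options[:retry]   # false
```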

That is genuinely delightful Ruby. It could be a library, and I’d use it.