Many times on many projects I've changed direction. We discover a new technique, new tool, or somehow figure out a better way to do something. The next step is to start using it. But if using that new technique requires that you change a lot of code, when should you do it?

When should I fix this?

I'll tell you a decision we made on one project recently:

"Let's switch to this new tool as we make other changes. We'll eventually touch every area where the old tool is being used."

We merrily coded along expecting to improve our project over time.

But that anticipated eventuality never came. Yes, we moved some code to the new tool, but other areas remained with the old approach.

We later had to take time away from other work to bite the bullet and make all the changes we should have made. We spent more time removing the old way for the new.

So when should we fix it?

Postponing changes and improvements can add unexpected costs later.

Right NOW we understand the problem. We have time to understand it AND its solution. In the future we'll have to figure that out again. New problems, new ideas, and new priorities will call for our attention.

Worse, if the team is different in the future, we may lose knowledge of the problem and its solution.

When we only move to a better solution as we touch different areas of code, we leave in the old one, of course. This will lead future developers to wonder why we use one dependency over another.

I've played the git blame game more often than I'd like.

Why is this here and that there? When did this change happen? Who committed this and what does the message say?

Researching and removing this confusion is development time wasted.

Although git commit spelunking can reveal good information, it requires purposeful commit messages. If you run into surprises, you'll need to spend time investigating commits. Sometimes laziness wins and commit messages are terse and uninformative.

Sometimes git commits don't have the answers.

Can't we document it?

If you can take time to document why code removal should occur, your time might be better spent removing it. If we add documentation for why a change has not been made, will it take as much time as making the desired change?

Let's add a deprecation warning

If you're building a library, a piece of an app shared by many, then deprecation might be appropriate. Upgrading to a newer version of a library brings complications. So a kind warning about future changes will help others prepare their own code for the change. Deprecations are great for users of libraries.
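For a library, a deprecation warning can be as small as the sketch below. The class and method names (ReportBuilder, generate, build) are hypothetical and exist only for illustration. The old method keeps working but tells callers what to use instead.

```ruby
# Hypothetical library class: `generate` is the old API, `build` the new one.
class ReportBuilder
  def build
    "report"
  end

  # Keep the old method working, but warn callers on standard error
  # so they can prepare for its removal.
  def generate
    warn "[DEPRECATION] ReportBuilder#generate is deprecated; use #build instead."
    build
  end
end

ReportBuilder.new.generate # prints the warning, then returns the report
```

Inside our own application we can skip this step: we control every caller, so we can change them all and delete the old method.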

When we are writing our own app, we shouldn't deprecate code, we should remove it. We are the users of our own application code.

As James Shore put it on Twitter: "Do; or do not. There is no //TODO"

We don't have time. Let's skip the tests.

Sometimes adjusting our code means changing our tests. Depending on how you wrote your tests, that might be a painful process.

Most test runners provide you a way to skip or mark tests as pending. The messages you provide yourself and others here can be helpful for the future. Often that future doesn't come as quickly as we intend.

Yes, you can skip tests but that's not very different from deprecations and documentation. Your code still needs rework, and now your tests do too. Skipped tests leave noisy reminders in your test output. Inevitably you learn to ignore the noise.

How long will you ignore the pending test noise before you finally decide to address it? Skipped tests are a short-term gain with a long-term loss.

Now what?

With problems in many locations, developers will wonder: "how are we going to dig ourselves out of this?"

We want to ship code and create features. Cleaning up often feels boring. There's no excitement in rearranging or fixing up our implementation.

There's no glory in this work.

Clean up feels unproductive because we're not adding a new feature. I have spent many project cycles feeling unproductive.

But I was actually helping my whole team move forward faster.

Without it we get stuck. Spending time finally moving to that library, or finally switching to that new method gives us freedom to focus.

And it's not just freedom to focus for me. Or... for you. It's everyone on your team. Cleanup work removes distractions. Finalizing decisions about what to do with our code removes unknowns.

Without cleanup work, the team loses time because these changes need to happen. Without cleanup work, the confusion becomes solidified. Your team multiplies the loss of time and focus created by indecisive implementations.

My "unproductive" project cycles spread improvements for the whole team. There's no glory in this work, but it is a part of shipping.

My project is different...

Maybe things are different for you. It's not likely but everyone likes to think that their project is really the unique one.

If you're not going to make changes that you should, ask yourself these questions:

can this lead to the system going down?
what will developers do when they need to debug and fix it?
how will other developers know which implementation/feature/library to use going forward?
have I left informative commit messages?

Get advice from others

On two separate projects, my teams managed significant changes in two similar ways.

On one team we insisted that at least 3 developers discuss changes. That meant that if you discovered a new way to work, you got opinions and weighed the options looking for different perspectives. This helped us to spread knowledge about decision making. We were able to find alternative scenarios and ensure that developers played a part in decision making.

On another team with constant pair programming we chose fast meetings. We called a team meeting for everyone to discuss changes with significant impact. Our goal was to have the original pair determine options and explain them to the rest of the team. When discussion went beyond 5 minutes we put a stop to the meeting. We decided to reevaluate our work with new stories or deferred decisions to the original pair or team lead.

As a team we agreed to give one person or pair the authority of benevolent dictator. We would all support the decision and move forward. If we needed more discussion, we made plans for it and set the changes aside so we could finish up our work. The next step was to discuss the changes we wanted to make.

These interruptions allowed our teams to communicate about the needs of the project. But the interruptions were designed to be short.

Get advice from your code

You can look to your code and git commits to give you an idea of what areas of your code need your attention.

Use turbulence to check if you're likely to run into this code again.

If your code has a high amount of churn, you're likely to be editing it again. If it's got a high amount of complexity, you might find some bad habits in there that you could fix up.
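If you only want a rough feel for churn, you can count how often each file appears in your commit history. The sketch below is not how turbulence works internally. It assumes you feed it the text printed by `git log --name-only --pretty=format:` (one file path per line, blank lines between commits), and the method name churn_by_file is made up for this example.

```ruby
# Count how many commits touched each file, most-changed first.
# `git_log_output` is assumed to be the output of:
#   git log --name-only --pretty=format:
def churn_by_file(git_log_output)
  counts = Hash.new(0)
  git_log_output.each_line do |line|
    path = line.strip
    counts[path] += 1 unless path.empty?
  end
  counts.sort_by { |path, count| [-count, path] }
end

sample = <<~LOG
  app/models/account.rb
  app/models/game.rb

  app/models/account.rb

  lib/report.rb
  app/models/account.rb
LOG

churn_by_file(sample)
# => [["app/models/account.rb", 3], ["app/models/game.rb", 1], ["lib/report.rb", 1]]
```

Files at the top of that list are the ones you are most likely to edit again.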

Fix your code now

It's easy to find reasons to postpone making changes. It's even easier to skip it and leave a note in the code for later. In my experience, that time to read notes and make changes rarely comes.

Avoiding manual changes is often overvalued. Sometimes the best thing to do is dig in, find all the places that need updating, and make the changes.

Jim Gay

June 13, 2017

7 ways to evaluate gems, and 1 crazy idea

I remember late nights browsing support forums ready for any answer to help. Writing a post, pleading for help, hitting refresh, and waiting for someone to reply was not efficient. But when that's all you know to do, it has to work.

Sleepless nights like those are much easier to avoid if you more carefully choose dependencies.

When I've got a problem to solve, I often look around to see if someone else has already solved it. I never want to waste my time reinventing the wheel.

Ruby has a healthy community. There are so many available gems it's not surprising to find your problem already solved. Dropping a gem into your project is easy. But if you don't understand the impact, you too could be up late begging the internet for help!

Over time I developed a few steps for figuring out what to choose.

If we're going to depend on one of these gems, we'd better make a good decision. Here are some questions I ask when I find a possible solution for my projects.

1. How many stars/watchers does it have?

I've often seen this used as a major deciding factor in choosing a new dependency. A large and active community can keep a project healthy. Yes, if many people use it, that means finding help might be easier. But it's foolish to stop there. A collective mind is a great guide, but it's no replacement for taking responsibility.

2. How many active issues or pull requests are there?

Are the project's issues loaded with bug reports or confused developers? Do the maintainers categorize the issues according to how they will be addressed? For any old issues, is there active discussion?

3. When was the most recent release?

Is the project active? When was the last time the gem was released? Or is the code so simple that regular updates aren't necessary?

4. How many of its own dependencies does it have?

Are you prepared to pull all these dependencies into your project? What will happen if your project needs a newer version of one of these but this gem prevents it? Or if you need an older version but this requires something newer?

The more dependencies a gem has, the more impact these questions will have.
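You can see how such a conflict plays out with the version classes that ship with RubyGems. In this sketch the requirement strings and the list of released versions are invented for illustration, and compatible_versions is a hypothetical helper.

```ruby
require 'rubygems'

# Return the versions that satisfy every requirement at once.
def compatible_versions(versions, *requirements)
  versions.select do |version|
    requirements.all? { |requirement| requirement.satisfied_by?(version) }
  end
end

# Invented example: the gem we want pins a shared library below 2.0,
# while our application already needs 2.0 or newer.
gem_needs = Gem::Requirement.new('~> 1.4')
app_needs = Gem::Requirement.new('>= 2.0')
released  = %w[1.4.0 1.9.2 2.0.0 2.1.0].map { |v| Gem::Version.new(v) }

compatible_versions(released, gem_needs, app_needs) # => []
```

An empty result means no release can satisfy both sides, so something has to give: your requirement or the gem.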

5. Are the maintainers friendly?

Whether in a bug ticket, a forum post, or anywhere else: do the maintainers act helpful? Can you count on them to help you when you need it?

There is the corollary to that: are you helpful? When it comes time to ask for help can you provide steps to reproduce an error?

6. How many forks with un-merged changes exist?

The GitHub network graph is a great place to find out if a project has a disconnected community. Are there many forks with good updates that the main project has ignored? Will you need to gather commits from those forks to get a feature working?

7. Does it have documentation?

Is there a clear place to go that gives you example uses? Can you understand what the gem does and how it works from the provided documentation?

8. The crazy one: Can you understand the code?

When things go wrong, as they inevitably will, have you chosen something you can figure out? Or is this dependency something that will stop you in your tracks?

I don't actually go through those other steps above until I've first looked at the code.

Most often, the first thing I do is read code.

I recall a past project where my pair and I were researching a dependency. I immediately dove into the "lib" directory and he (who preferred other languages to Ruby) said:

Is this what Ruby developers do? Don't you look for documentation?

I don't know about other Ruby developers, but it's absolutely what I do first. If I think a project might be a fit for my problem, I want to see what kind of code will become a part of my application. My pair was a fantastic developer and his question was in good fun. But it made me realize that this code-first approach might be unusual.

If it works now but the code is a hot mess, what will my life be like when things don't work?

Here's my checklist in the order I do it:

Can you understand the code?
Does it have documentation?
Are the maintainers friendly?
How many of its own dependencies does it have?
When was the most recent release?
How many forks with un-merged changes exist?
How many active issues or pull requests are there?

That's it. I always start with reading the code. The order of the rest of them may change here and there but that's typical. I may have to make concessions with one because the answer to another is compelling.

You may notice that I left one out...

I don't care how many stars or watchers a project has. If I can't answer the above questions with satisfaction, watchers mean nothing. And if I am satisfied with the answers, watchers mean nothing.

When you bring in a dependency, you own it; you must be able to figure it out.

If you're an experienced developer you may think differently about answering these questions. But one reason I look to the code first is that junior developers will need to be able to do it too.

You may find that no existing tool does exactly what you want and you might then build your own. But if you cannot evaluate the code well, how would you know if you should build your own?

What do you do? Write a blog post about it and share your experience.

Get in touch with me to help you evaluate your projects.

Jim Gay

May 30, 2017

Why write code when more dependencies will do?

What do you do when you need your code to work in different environments?

New versions of Ruby or important gems are released and we begin to consider how to switch between them. If your code needs to work in old as well as current versions of Ruby, how do you approach it? Or how do you handle changes in Rails for that matter?

Surely, there's a gem out there that would solve your problem for you...

Balancing the need for new features

A few years ago I was working on a project and was looking for a way to make Casting work in versions of Ruby 1.9 and 2.0. I had to write code that would determine what could be done and ensure it would behave.

When Ruby 2.0 was released, it allowed methods defined on a Module to be bound to any object and called:

module Greeter
  def hello
    "Hello, I'm #{@name}"
  end
end

class Person
  def initialize(name)
    @name = name
  end
end

jim = Person.new('Jim')
Greeter.instance_method(:hello).bind(jim).call # => "Hello, I'm Jim"

This was an exciting new feature. I had been working through new ways to structure behavior in my applications. My book, Clean Ruby, was in development. A version of Ruby came with a new feature but I couldn't yet use it in my current application environments with Ruby 1.9. This change gave me ideas about organizing code by application feature.

The desire to use new features is strong. The problem is that upgrading your projects to new versions isn't always easy. Your infrastructure might require updates, your tests should be run and adjusted, dependencies should be checked for compatibility, and you might need to update or remove code.

Running Ruby 1.9 meant binding module methods wouldn't work. It was a challenge to get Casting to work in my current environment.

Bridging the platform gap

Giving objects new behavior often means you must include the module in the object's class. Or you may include it in the object's singleton_class using extend.

class Person
  include Greeter
end

# or extend the instance:

jim = Person.new('Jim')
jim.extend(Greeter)

jim.hello # => "Hello, I'm Jim"

There was a trick to get module method binding to work. You can clone an object, extend the clone, and grab the unbound method to bind to your original object.

object.clone.extend(Greeter).method(:hello).unbind.bind(object).call

It was definitely a hack and certainly an unintended behavior of Ruby 1.9. But I wanted to use Casting in any Ruby version so I needed to be strategic about how to handle the differences.

One way to check your Ruby version is to look at the RUBY_VERSION constant and compare numbers. I quickly ruled that out. Checking that version value in JRuby, Rubinius, or some other implementation, might not be good enough. Alternative rubies have their own quirks. Since method binding isn't intended behavior in MRI, it's likely that things wouldn't work the way the code expected. Not every Ruby implementation is the same. RUBY_VERSION wouldn't be a reliable predictor of behavior.
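For contrast, here is roughly what the version check I ruled out could look like. The method name is hypothetical. It compares versions with Gem::Version rather than as plain strings, but it still only tells you what the interpreter claims to be, not what it can do.

```ruby
require 'rubygems'

# Guess at a capability from the reported version number alone.
def version_suggests_rebinding?(version = RUBY_VERSION)
  Gem::Version.new(version) >= Gem::Version.new('2.0')
end

version_suggests_rebinding?('1.9.3') # => false
version_suggests_rebinding?('2.0.0') # => true

# RUBY_ENGINE names the implementation ("ruby", "jruby", and so on),
# which is a reminder that the same version number can come from
# very different interpreters.
RUBY_ENGINE
```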

I came across RedCard which says in its README:

RedCard provides a standard way to ensure that the running Ruby implementation matches the desired language version, implementation, and implementation version.

That seemed to be exactly what I wanted. Relying on an open source project can help you unload work required. Other users of the same project will want to fix bugs and add features. A collaborative and active community can be a great asset.

So I dove in and began placing if RedCard.check '2.0' and similar code wherever necessary.

Build around behavior, not versions

Once it was in, I still didn't feel quite right about adding this dependency.

Third-party code brings along its own dependencies. Third-party code can add rigidity to your own by reducing your ability to adjust to changes. You're not in charge of the release schedule of third-party code, unlike your own.

I soon realized that I didn't actually care about versions at all. What I cared about was behavior.

I had added a third-party dependency for a single feature. Although RedCard can do more, I didn't need any of those other features. Adding this dependency for one behavior was easy but it exposed the project to more third-party code than I wanted.

It was much easier to check for the behavior I wanted, and store the result. Here's how I tackled that in early versions of Casting (before I dropped Ruby 1.9 support).

# Some features are only available in versions of Ruby
# where this method is true
def module_method_rebinding?
  return @__module_method_rebinding__ if defined?(@__module_method_rebinding__)
  sample_method = Enumerable.instance_method(:to_a)
  @__module_method_rebinding__ = begin
    !!sample_method.bind(Object.new)
  rescue TypeError
    false
  end
end

My solution was binding a method to an arbitrary object, catching any failure, and memoizing the result. If it worked my module_method_rebinding? method would return true without running the test again. If the result was false, then this method would always return false.
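To show how a check like this can drive the rest of the code, here is a simplified sketch that picks a binding strategy based on the result. It is not Casting's actual implementation. The Rebinding module and its bound_method helper are hypothetical names, and the fallback branch is the clone-and-extend trick described earlier.

```ruby
module Greeter
  def hello
    "Hello, I'm #{@name}"
  end
end

class Person
  def initialize(name)
    @name = name
  end
end

module Rebinding
  # Memoized behavior check: can a module's method be bound to any object?
  def self.module_method_rebinding?
    return @supported if defined?(@supported)
    @supported = begin
      !!Enumerable.instance_method(:to_a).bind(Object.new)
    rescue TypeError
      false
    end
  end

  # Return a Method bound to `object`, choosing a strategy by behavior.
  def self.bound_method(object, mod, name)
    if module_method_rebinding?
      mod.instance_method(name).bind(object)
    else
      # Fallback: extend a clone and rebind its method to the original.
      object.clone.extend(mod).method(name).unbind.bind(object)
    end
  end
end

jim = Person.new('Jim')
Rebinding.bound_method(jim, Greeter, :hello).call # => "Hello, I'm Jim"
```

Callers never mention a version number. They ask for a bound method and get one by whichever route the running Ruby supports.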

The beauty of this solution was that it removed a dependency. It relied on the natural behavior of Ruby: to raise an exception.

Removing the gem makes all the problems of having third-party code go away.

Preparing for the future

Adding dependencies to your code can make you more productive. Adding dependencies can also reduce flexibility in responding to the needs of your code. A new dependency might prevent you from upgrading aspects of your system due to compatibility problems.

Polyfill, a project I recently came across, reminded me of this. The polyfill project says:

Polyfill implements newer Ruby features into older versions.

It might make sense to implement new features in your current environment rather than upgrading. Polyfill is important because it helps us avoid checking for behavior completely and instead implements it. When you're unable to upgrade your Ruby environment, you might pull in a project like polyfill.

Polyfill uses refinements so you can isolate new features without affecting other areas of your code.
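If you haven't used refinements, this sketch shows the isolation they give you. It does not use polyfill's API. The module names are hypothetical, and it backports String#delete_prefix (a method newer Rubies include) only for code that opts in with using.

```ruby
# A refinement that supplies delete_prefix to Strings, but only where
# a file or module asks for it.
module DeletePrefixBackport
  refine String do
    def delete_prefix(prefix)
      start_with?(prefix) ? self[prefix.length..-1] : dup
    end
  end
end

module Links
  # The refinement is active for the rest of this module body only.
  using DeletePrefixBackport

  def self.host_and_path(url)
    url.delete_prefix("https://")
  end
end

Links.host_and_path("https://example.com/posts") # => "example.com/posts"
```

Code outside Links is untouched, so removing the backport later means deleting one module and one using line.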

Polyfill attempts to get your environment working like a newer version of Ruby. ActiveSupport adds its own features to Ruby core classes, but polyfill adds features which exist by default. This allows you to write your code in a manner consistent with upcoming upgrades to Ruby. Writing code with new versions of Ruby in mind will prepare you to drop the polyfill dependency.

Prepare for the future by implementing your own

My current project had a need to truncate a Float. In Ruby 2.4 the floor method accepts an argument to limit the number of digits beyond the decimal. In our current environment with Ruby 2.3.x, the floor method doesn't accept any arguments.

Instead of pulling in polyfill for a new feature, our solution was to do the math required to truncate it. Although using (3.14159265359).floor(2) would be convenient, we can't yet justify a new dependency on polyfill, and we can implement this method on our own.
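The math involved is small. This is a sketch of the idea rather than the exact code from that project, and floor_to is a name chosen for this example: scale the number up, floor it, and scale it back down.

```ruby
# Floor a Float to a given number of digits after the decimal point
# without relying on Float#floor accepting an argument.
def floor_to(value, digits)
  factor = 10**digits
  (value * factor).floor / factor.to_f
end

floor_to(3.14159265359, 2)  # => 3.14
floor_to(-3.14159265359, 2) # => -3.15
```

Because floats are stored in binary, a value that can't be represented exactly may come out one step lower than you expect. If that matters for your data, BigDecimal may be a safer base for the same approach.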

Handling versions and behavior limitations takes balance. Whether you are building gems or building applications, it's important to consider the larger impact of upgrading systems or installing new dependencies.

I'll be keeping my eyes on new features in Ruby and polyfill. If I'm unable to use either immediately, I'll at least be able to steal some good ideas.

Jim Gay

May 18, 2017

What if we organized code by features?

I began asking myself this question when I was working on a large Rails project. Requirements changed fairly often and we needed to react quickly.

When trying to figure out how something could change, I had to backtrack through how it already works. This often meant tracing method calls through several different files and classes. A collection of methods and collaborators can be difficult to keep in your head.

Too many times, it was easier to search for the person on the team who implemented the original code to figure out how best to change it. Figuring out how it was all put together can be a distraction. I went from "let's make changes" to "how does this work?"

If we organized our implementation by what we look for when we change features rather than by the related class, could we eliminate distraction? What if someone wanted to make a change to a feature, and I was able to pull up the code that represented that feature? Then I'd be able to feel confident that I could look in as few places as possible to get what I need.

I began experimenting quite a lot with how Ruby could help me stay focused. One of the results is Casting. Casting is a gem that allows you to apply behavior to an initialized object and to organize behavior around your features.

With Casting I could take the implementation details of an algorithm (the work to be done) out of individual classes of collaborating objects and move it into a centralized location. Rather than having 4 or 5 classes of things each contain a different part of the puzzle, I could put the puzzle together and have it ready for future changes.

Rather than each class knowing a lot about how a feature is put together, the classes could be small and focused on representing their data. A feature could grow or shrink within the context of every object it needed.

It's a bit different from other approaches where you wrap an object in another one to provide additional behavior, however.

Here's a simple example of what this means.

Applying new behavior

Rather than putting code in multiple different places, or putting code in a class merely because that type of object needed it at some point, we could put it into a feature class.

Let's take a collection of objects and start up a game:

class Game
  def initialize(*players)
    @players = players
    # randomly select a leader
    @leader = players.sample
  end
end

This Game class expects to receive a collection of objects. One is selected as a leader, and then what?

Well, if I put together the features for what the players and leader can do, or if I want to read and understand what they can do later so that I can make changes, I'll look first to the Game itself to understand.

I can put all the behavior I need inside this class. It makes a lot of sense to me to keep it there because it will be behavior specific to this feature. The Game won't exist without the behavior and the behavior won't exist without the Game.

class Game
  def initialize(*players)
    @players = players
    # randomly select a leader
    @leader = players.sample
  end

  module Player
    def build_fortress; end
    def plant_crops; end
  end

  module Leader
    def assemble_team; end
    def negotiate_trade; end
  end
end

When the game is initialized with the players, we can make those players become what we need.

Using Casting, we can allow the objects to have access to new behaviors by including Casting::Client and telling them to look for missing methods in their collection of behaviors.

class Account
  include Casting::Client
  delegate_missing_methods
end

With that change, any Account object (or whatever class you use) will keep a collection of delegates. In other words, these objects will keep track of the roles they play in a given context and have access to the behaviors for those roles. The object will receive a message and first run through its own methods before looking at its behaviors for a matching method.
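If you're curious how that lookup might work, here is a heavily simplified sketch of the idea. It is not Casting's implementation. RolePlayer and its methods are hypothetical stand-ins, and it only shows an object checking its collection of role modules when one of its own methods is missing.

```ruby
# Hypothetical stand-in for the idea behind Casting::Client.
module RolePlayer
  # Remember the role and return the same object, not a wrapper.
  def cast_as(role)
    roles.unshift(role)
    self
  end

  def roles
    @roles ||= []
  end

  # Only reached when the object has no method of its own by this name.
  def method_missing(name, *args, &block)
    role = roles.find { |candidate| candidate.method_defined?(name) }
    return super unless role
    role.instance_method(name).bind(self).call(*args, &block)
  end

  def respond_to_missing?(name, include_private = false)
    roles.any? { |role| role.method_defined?(name) } || super
  end
end

module Leader
  def negotiate_trade
    "#{name} negotiates a trade"
  end
end

class Account
  include RolePlayer
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

account = Account.new('Jim')
leader = account.cast_as(Leader)

leader.equal?(account) # => true
leader.negotiate_trade # => "Jim negotiates a trade"
```

The role's method runs with the account as self, which is why it can call name directly.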

The next step is to assign the roles:

class Game
  def initialize(*players)
    @players = players.map{|player| player.cast_as(Player) }
    # randomly select a leader
    @leader = players.sample.cast_as(Leader)
  end
end

Now each of these objects will have access to the behavior of the assigned modules.

The @leader has both Player behavior as well as Leader.

Later, if we decide to add a Guard role to our game or some other job for a player, any player may gain that behavior at any point we determine.

Adding Casting to my projects allows me to work with objects and apply their behaviors where I plan for them to be used. Then I am able to look for my implementation where the feature is defined and I'm not distracted searching through multiple files and classes to piece together my understanding of how it all works.

Why not just use...

You might argue that you could simply create a wrapper using something like SimpleDelegator.

When using wrappers, we create new objects that maintain a reference to the original. We take those 2 objects and treat them as one.

Doing so might change our Game initializer like this:

class Game
  def initialize(*players)
    @players = players.map{|player| Player.new(player) }
    # randomly select a leader
    @leader = Leader.new(players.sample)
  end
end

One of the downsides of this is that we are working with a new set of objects. The self inside that Leader.new isn't the same object as players.sample. Any process which would need to search for an object in the collection of players might attempt to compare them for equality and get an unexpected result.

To reduce our mental stress as best we can, we want only the information which is necessary to understand our system. With an additional layer wrapping our objects we could be making it more difficult to understand.

Here's a small example of how wrapper objects like this can lie to us:

class Account; end
require 'delegate'
class Player < SimpleDelegator; end

account = Account.new
player = Player.new(account)

account == player # => false
player == account # => true

account.object_id # => 70340130700420
player.object_id  # => 70340122853880

The results of these 2 comparisons are not the same even though we would expect them to be. The player object just forwards the == message to the wrapped object, whereas the account object will do a direct comparison with the provided object in the == method.

This complicates how we may interact with the objects and we must be careful to perform things in the right order.
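The same asymmetry shows up when searching a collection, because Array#include? and Array#index ask each element whether it equals the argument. This sketch reuses the Account and Player classes from the example above.

```ruby
require 'delegate'

class Account; end
class Player < SimpleDelegator; end

account = Account.new
player = Player.new(account)
accounts = [account]

# Each element is asked `element == player`, and a plain Account
# compares by identity, so the wrapper is never found.
accounts.include?(player) # => false
accounts.index(player)    # => nil

# Searching a collection of wrappers for the raw object does work,
# because the wrapper forwards == to the object it wraps.
[player].include?(account) # => true

# Unwrapping first gives a consistent answer.
accounts.include?(player.__getobj__) # => true
```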

If the object merely gains new behavior and remains itself, the outcome will give us relief:

require 'casting'
class Account
  include Casting::Client
  delegate_missing_methods
end
module Player; end

account = Account.new
player = account.cast_as(Player)

account == player # => true
player == account # => true

account.object_id # => 70099874674300
player.object_id  # => 70099874674300

Here we can see that the account and the player are definitely the same object. No surprises.

Wrapping up my objects is easy but I've spent my fair share of time tracking down bugs from the wrappers behaving differently than I expected. Time spent tracking down bugs is a distraction from building what I need.

Wrapping my objects in additional layers can affect my program in unexpected ways, interrupting my work. Although the code with wrappers is easy to read, it subtly hides the fact that the objects I care about are buried beneath the surface of the objects I interact with. By keeping my objects what they are and applying behavior with modules, I ensure that I can stay focused on the feature.

Our code is a communication tool about our expectations of how a program should execute. The better we focus on the actual objects of concern and avoid layers, the easier it will be to avoid unintentional behavior.

This is why I'm glad to have a tool like Casting to help me build systems that limit unnecessary layers.

Creating the shortest path to understanding

When I began building and working with Casting, it allowed me to flatten the mental model I had of my programs.

It's easy for a programmer to see a wrapper style implementation or subclass and understand the consequences. Unfortunately that extra layer can and does lead to surprises that cost us time and stress.

I do still use tools like SimpleDelegator, but I often look to ways to make my programs better reflect the mental model of the end user. Sometimes SimpleDelegator-like tools work well, other times they don't.

If the ideas in my mind are closer to those in the user's mind, I'm much more likely to write a program that communicates what it is and what it does more accurately.

Developers who work together need to communicate effectively to build the right thing. Our code can help or hinder our communication. Sometimes, when we want an object to gain new behavior, we introduce tools like SimpleDelegator and in doing so, we add layers to the program and more to understand.

Casting, although it too needs to be understood, provides us the ability to add behavior to an object without additional layers which might introduce distraction.

Attempting to meet requirements and build a product well means we need to consider how our code reflects the shared understanding of what it should do.

When requirements change, and they often do, we'll look to our code to understand the features. The faster we can find and understand our features, the faster and more confidently we will be able to react to changing requirements.

When I needed to react to changing requirements and couldn't easily find all the pieces of the feature, it wasn't a confidence inspiring result for my other team members. Everyone should be able to find and understand how a feature works.

Where to look and where to understand

By placing code in every class related to a feature, I gave myself many different places to look to build up my understanding of how it worked. I treated individual data classes as the only place to look for behavior, rather than creating the world I need with a feature class.

Organizing by class vs. feature makes me think about my product differently.

When I think about features, I remain focused on the goals of the end user. Each user of the system is only using it to achieve a specific purpose. Often we can become distracted by our code and forget the goals of the end user. Building up features is a continual reminder of the reason that code needs to be written and updated.

Thinking about the end user will help us implement only what is necessary for her or him to complete their work. We may better avoid getting tripped up by technical concerns.

When we add behavior in our data classes, it often ends up including behavior from many unrelated features.

Think about what you have in your classes and what they should or could be.

class Account
  def build_fortress; end
  def plant_crops; end

  def assemble_team; end
  def negotiate_trade; end

  def renew; end
  def cancel; end
  def update_payment_info; end

  def send_friend_request; end
  def upload_photo; end

  # etcetera...
end

With the above Account class there are many behaviors for vastly different concerns. If we were to move those behaviors into an object that represented the feature where the behaviors were required the class would be freed to better describe the data it represents.

class Account
end

Defending my focus

Being able to focus on my immediate problem drives me to think about how I want to structure my code. When I write code, I know that the next person to read it and change it may not be me. Maintaining my own mental model isn't good enough when solving a problem; programmers need to create code that helps someone else pick up the same mental model.

Sometimes adding layers to your code can help separate parts that should be separate. Sometimes adding layers means introducing distraction and distraction leads to bugs and lost time.

These ideas led me to build Casting and write about object-oriented design in Clean Ruby.

Take a look at your application's features and ask yourself if you could organize differently. Can you remove distractions? Could Casting help you build cohesive features?

Jim Gay

April 11, 2017

Four tips to prepare yourself to build software

A few of my articles have caught some attention or challenged ideas and I thought you might like to read them.

Take a read through these articles and let me know what you think about object modeling and understanding and building your tools.

Ruby Forwardable Deep Dive

Developers and teams that understand their tools will be better able to choose the right ones. They'll make better decisions about when to avoid existing tools and when to build their own.

Take a deep dive into Ruby's standard library Forwardable. Use this article to get to know how it's built and how it works. Take lessons from the code and use them to decide when and where to use it or write your own. Alternatively, another good library to know is 'delegate' and I dove deep into that one too.

Enforcing Encapsulation with East-Oriented Code

Responsibilities can explode in our programs without us ever realizing exactly how it happens. Before we know it, we've got a mess of interconnected objects that know too much about each other. With an approach called East-oriented Code (coined by James Ladd), we can create objects which enforce their responsibilities and make sure that you tell, don't ask. If you're interested in seeing more about it, check out my presentation from RubyConf. And of course I wrote something for functional programming and immutable data aficionados: Commanding Objects Toward Immutability.

How I fixed my biggest mistake with implementing background jobs

Distractions are an enormous problem for every software developer. This article doesn't solve all of them but it (and the others in the series that follow) shows one way to keep me focused on the problem at hand.

Walk through building a tool to remove distractions from your code. I pull from my own projects to show how I try to make it as short a step as possible from deciding to run code in the background to actually doing it.

The Gang of Four is wrong and you don't understand delegation

When I began researching earnestly for Clean Ruby I regularly came across references to Object-oriented programming about "delegation." What I found is that we tend to use this term to mean something entirely different than what it is.

Misunderstandings lead to frustration and bugs.

To make sure I understood it correctly, I spoke with Henry Lieberman, whose work introduced the delegation concept to object-oriented programming. I followed it up with more research and contacted object modeling pioneer Lynn Andrea Stein, who wrote "Delegation is Inheritance." I then wrote Delegation is Everything and Inheritance Does Not Exist.

Jim Gay

March 21, 2017

Building tools and building teams

My series of articles has been an exploration of how we grow small solutions into larger tools. Sometimes we discover new information or rethink our approach and need to change direction.

Each article had its own problem to solve and built upon the ideas and solutions of the last.

I was talking to my friend Suzan about this series and she asked me "What is this about? What's the common thread?"

I had to think about it for a minute. This isn't really about background jobs.

It's about removing distractions.

Maintaining focus means removing distractions

When we make decisions while building software, we need to decide what matters and what doesn't.

My main goal in the code we've been writing has been to create the ability to make decisions quickly.

We turned this:

process.call

into this:

process.later(:call)

and we were able to focus on the code we care about.

What matters here is what we don't need to do.

We don't need to rethink our code structure. We don't need to move code into a new background class.

As I wrote in the first article about building my background library:

If my main purpose in the code is to initialize SomeProcess and use the call method, the only decision I need to make is to either run it immediately or run it later.

If my first instinct were to change a line of code to another type of implementation, another class for the background, another layer to fiddle with, I would pull every future reader of that code away from the purpose and toward that new thing.

When I or any future developer look at the code, I want them to read it and understand quickly. Adding in a later method is a device to let the reader know when something will occur without pulling them away to understand its implementation.

This decision to implement a later function is a compression of the idea of the background job. The words we choose should be good indicators of what to expect.
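To make that compression concrete, here is a minimal sketch of a later method backed by an in-memory queue. The InMemoryQueue class and the RunLater module are hypothetical stand-ins invented for this example; a real project would hand the work to a job library instead of an array.

```ruby
# Hypothetical stand-in for a real job queue.
class InMemoryQueue
  def self.jobs
    @jobs ||= []
  end

  def self.enqueue(klass, args, method_name)
    jobs << [klass, args, method_name]
  end

  # Run everything that was queued, in order, and return the results.
  def self.work_off
    jobs.shift(jobs.size).map do |klass, args, method_name|
      klass.new(*args).public_send(method_name)
    end
  end
end

# Hypothetical module: records what to run instead of running it.
module RunLater
  def later(method_name)
    InMemoryQueue.enqueue(self.class, initializer_arguments, method_name)
  end
end

class SomeProcess
  include RunLater
  attr_reader :initializer_arguments

  def initialize(some_id)
    @initializer_arguments = [some_id]
    @some_id = some_id
  end

  def call
    "processed #{@some_id}"
  end
end

process = SomeProcess.new(42)
process.call          # runs immediately
process.later(:call)  # only records the request
results = InMemoryQueue.work_off
results # => ["processed 42"]
```

The call site changes by one word, and the reader never has to leave SomeProcess to know when the work will happen.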

Our software can help or hinder communication.

Knowing which it does will help us build better software.

When a team member reads what we've written, will they understand the words we've chosen as a compression of our ideas into more manageable forms?

Do we express our solutions like poetry and provide greater meaning with each word? Or do we attempt to express in strict terms the absolute definition and possible interpretations?

I find that my approach changes often.

Sometimes I want things to be absolutely explicit. I want boundaries and clear expression of all requirements, leaving no uncertain terms about what needs to be done.

Other times I want a single word to be placed to give new meaning. I want others to read once and understand larger implications without great effort of digging into documentation and implementation details.

To make truly successful code, I need to share my goals and express my intent with others who will interact with it in the future.

Learn and do by focusing on what matters

When I began writing this series, I was focused on what really mattered to me: decisions and implementations getting in the way.

When I discover that some bit of code is taking too long to run, I want to get that code running well. That may mean rewriting it. Or that may mean diving into optimization techniques.

One simple technique to optimize it is to just run it in a separate process. As long as that's a valid solution, I want it to be as easy as possible. I just want to say later and be done.

Here's what I've learned:

Although I've written a library like ProcessLater three times now on several projects, I'm less interested in exactly how it was implemented previously and more interested in getting distractions out of the way.

That's its whole purpose.

As I wrote each article explaining what I would do and why, I found I'd run into scenarios where the code just didn't quite work right.

I had to fiddle to get some of my argument setup to work properly. I'd forget to add a splat (*) to my argument name and be confused and a little bit worried that my entire article about why this is a good idea was just derailed and wasn't working at all.

Developing a library isn't a straight path. I inevitably need to stop and reconsider new information. But each stopping point is an opportunity to refocus on my goals and make sure I'm not getting caught up in technical trouble.

I ask myself if I'm achieving my goal and if there's a better way. Am I able to remove distractions or am I creating more?

With each team where I've approached solving this problem, my focus has been to talk about the code and what I want to do:

"If all I need to do is run it in the background, then I just want to type this..."

By having a conversation with others, I can express my desire to make my and their lives simpler. We all had a shared understanding of putting a process into the background, and we all had a desire to remove distractions.

In the end, the conversations we have tend to be focused on:

Whether or not we've chosen the right words to express the idea
"What if it worked this way..."

These conversations always lead to either better code, or a better shared understanding. We all learn new things.

Here's what I hope you've learned...

I hope that you take lessons from what you've built and turn around to provide a better experience for the others on your team.

I hope that you've looked at what code you've written and how you've done it, and have thought about how expressive the words are. How will others understand it? How will they want to use it to do something the same or similar?

Lastly, I want to point you to this interview with Russ Olsen.

Olsen reminds us to consider our emotions and ambitions:

"Technical people want to focus on technical issues: is this a good programming language, how fast will this workload run on that platform, [etc.] Fundamentally, a lot of what stands between us and what we want to do are human problems: issues of motivation, working together, how people cooperate," Olsen argues. When it comes to working together effectively, particularly in complex endeavors, human emotions and ambitions can complicate things -- even among geeks.

Building software is complicated. Building software with a team can be even more complicated, but with good communication and forethought, we can build even better tools together than we can alone.

Stay focused on your goals and reach out for new ideas and different perspectives from your team members. Start a new conversation with your team, hit reply, or reach out about working together with me.

Jim Gay

March 8, 2017

From implicit magic to explicit code

In this series we've built a tool to help us move processing into the background. We made it flexible enough to use in more than one situation, and we made sure it was easy to use without forgetting something important.

If you haven't read through the series yet, start with "How I fixed my biggest mistake with implementing background jobs"

In the last article, we made sure that when initializing an object to use our ProcessLater module, we would store the appropriate data in initializer_arguments. This allows us to send that data to the background so it can do the job of initialization when it needs to.

Our final code included parts which provided a hijack of the new method on the class using ProcessLater:

module ProcessLater
  def self.included(klass)
    # ...omitted code...
    klass.extend(Initializer)
  end

  module Initializer
    def new(*args)
      instance = allocate
      instance.instance_variable_set(:@initializer_arguments, args)
      instance.send(:initialize, *args.flatten)
      instance
    end
  end
end

This allowed us to keep our object initialization simple:

class SomeProcess
  include ProcessLater

  def initialize(some_id)
    @some_id = some_id
  end
end

SomeProcess.new(1).later(:call)

Hijacking the new method might feel strange. It's certainly unusual and it's done implicitly. When you use this ProcessLater module, you may not be aware that this magically happens for you.

Building clear and explicit methods

We can make our code more explicit as well as provide an easy interface for other developers to use this library.

I wrote about solving problems like this and knowing how to build your own tools in the Ruby DSL Handbook.

The following is a chapter from the Ruby DSL Handbook called Creating a Custom Initializer.

This has some ideas about how we can build a method which would provide a clear and explicit initializer. Take a deep dive into solving this problem with some metaprogramming techniques. Afterward, I'll wrap it up and show how it ties into the ProcessLater module that we've been building for background jobs.

Creating a custom initializer

Common and repetitive tasks are ripe for moving into a DSL and often Ruby developers find themselves wanting to take care of initialization and setting of accessor methods.

The following example is a modified version of a custom initializer from the Surrounded project.

The goal of the custom initializer is to allow developers to simplify or shorten code like this:

class Employment
  attr_reader :employee, :boss
  private :employee, :boss
  def initialize(employee, boss)
    @employee = employee
    @boss = boss
  end
end

The above sample creates attr_reader methods for employee and boss. Then it makes those methods private, and next defines an initializer method which takes the same named arguments and assigns them to instance variables of the same name.

This code is verbose and the repetition of the same words makes it harder to understand what's going on at a glance. If we understand the idea that we want to define an initializer which also defines private accessor methods, we can boil that down to a simple DSL.

This is far easier to type and easier to remember all the required parts:

class Employment
  initialize :employee, :boss
end

There is one restriction. We can't provide arguments like a typical method. If we tried this, it would fail:

initialize employee, boss
# or
initialize(employee, boss)

Ruby will process that code and expect to find employee and boss methods which, of course, don't exist. We need to provide names for what will be used to define arguments and methods. So we need to stick with symbols or strings.
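A small sketch shows the difference. The Declares module and its declare method are hypothetical names used only for this illustration; they stand in for any class-level DSL method.

```ruby
# Hypothetical stand-in for a class-level DSL method.
module Declares
  def declare(*names)
    names
  end
end

class Demo
  extend Declares

  # Symbols are plain values, so Ruby hands them straight to the method.
  WITH_SYMBOLS = declare(:employee, :boss)

  # Bare words are looked up as local variables or methods first,
  # and that lookup fails before declare is ever called.
  WITH_BARE_WORDS = begin
    declare(employee, boss)
  rescue NameError => error
    error.class
  end
end

Demo::WITH_SYMBOLS    # => [:employee, :boss]
Demo::WITH_BARE_WORDS # => NameError
```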

Let's look at how to make that work.

Our first step is to define the class-level initialize method.

class Employment
  def self.initialize()

  end
end

Because we're creating a pattern that we can follow in multiple places, we'll want to move this to a module.

module CustomInitializer
  def initialize()
  end
end

class Employment
  extend CustomInitializer
end

Now we're set up to use the custom initializer and we can use it in multiple classes.

Because we intend to use this pattern in multiple places, we want the class-level initialize method to accept any number of arguments. To do that we can easily use the splat operator: *. Placing the splat operator at the beginning of a named parameter will treat it as handling zero or more arguments. The parameter *setup_args will allow however many arguments we provide.
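Here is a quick sketch of how the splat gathers arguments. The SplatDemo module and its collect method are made-up names for this example only.

```ruby
# Hypothetical module used only to show how a splat parameter behaves.
module SplatDemo
  # *setup_args gathers every argument into a single array.
  def collect(*setup_args)
    setup_args
  end
end

class Example
  extend SplatDemo
end

Example.collect                   # => []
Example.collect(:employee)        # => [:employee]
Example.collect(:employee, :boss) # => [:employee, :boss]

# On the calling side, the splat spreads an array back into arguments.
names = [:employee, :boss]
Example.collect(*names)           # => [:employee, :boss]
```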

The next step is to take those same arguments and set them as attr_readers and make them private.

module CustomInitializer
  def initialize(*setup_args)
    attr_reader(*setup_args)
    private(*setup_args)

  end
end

With that change, we have the minor details out of the way and can move on to the heavy lifting.

As we saw in Chapter 2: Structure With Modules, we want to define any generated methods on a module to preserve some flexibility for later alterations. We only initialize Ruby objects once; since we're defining the initialize method in a special module, it doesn't make sense for us to check to see if the module already exists. All we need to do is create it and include it:

module CustomInitializer
  def initialize(*setup_args)
    attr_reader(*setup_args)
    private(*setup_args)

    initializer_module = Module.new
    line = __LINE__; initializer_module.class_eval %{

    }, __FILE__, line
    const_set('Initializer', initializer_module)
    include initializer_module
  end
end

After we set the private attribute readers, we created a module with Module.new. We prepared the lines to evaluate the code we want to generate, and then we gave the module a name with const_set. Finally we included the module.
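The flexibility that a named, included module preserves can be seen in a small sketch. Here the module is written by hand rather than generated from a string, and the started_on attribute is an assumption added for the example.

```ruby
class Employment
  # Written by hand here; the custom initializer would generate this module.
  generated = Module.new do
    def initialize(employee, boss)
      @employee = employee
      @boss = boss
    end
  end
  const_set('Initializer', generated)
  include generated

  attr_reader :employee, :boss, :started_on

  # Because the generated method lives in a module, the class can define
  # its own initialize and still reach the generated one with super.
  def initialize(employee, boss, started_on)
    super(employee, boss)
    @started_on = started_on
  end
end

job = Employment.new("Amy", "Guy", 2017)
job.employee   # => "Amy"
job.started_on # => 2017
```

If the generated method had been defined directly on the class, the second def would have replaced it and super would have nothing to call.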

The last step is to define our initialize instance method, but this is tricky. At first glance it might seem that all we want to do is create a simple method definition in the evaluated string:

line = __LINE__; initializer_module.class_eval %{
  def initialize(*args)

  end
}, __FILE__, line

This won't work the way we want it to. Remember that we are specifying particular names to be used for the arguments to this generated method in our class-level initialize using employee and boss as provided by our *setup_args.

The change in scope for these values can get confusing. So let's step back and look at what we want to generate.

In our end result, this is what we want:

def initialize(employee, boss)
  @employee = employee
  @boss = boss
end

Our CustomInitializer is merely generating a string to be evaluated as Ruby. So we need only to look at our desired code as a generated string. With the surrounding code stripped away, here's what we can do:

%{
    def initialize(#{setup_args.join(',')})
      #{setup_args.map do |arg|
        ['@',arg,' = ',arg].join
      end.join("\n")}
    end
  }

The setup_args.join(',') will create the string "employee,boss" so the first line will appear as we expect:

def initialize(employee,boss)

Next, we use map to loop through the provided arguments and for each one we compile a string which consists of "@", the name of the argument, " = ", and the name of the argument.

So this:

['@',arg,' = ',arg].join

Becomes this:

@employee = employee

Because we are creating individual strings in our map block, we join the result with a newline character to put each one on its own line.

%{

    #{setup_args.map do |arg|

    end.join("\n")}

  }
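To see the string that these pieces produce, we can build it outside of any class and inspect it. This sketch assumes setup_args holds the two symbols from the Employment example.

```ruby
setup_args = [:employee, :boss]

# The same interpolation as above, assigned to a variable so we can look at it.
generated = %{
  def initialize(#{setup_args.join(',')})
    #{setup_args.map do |arg|
      ['@', arg, ' = ', arg].join
    end.join("\n")}
  end
}

puts generated

# Evaluating the string in a throwaway class shows that it is valid Ruby.
klass = Class.new { class_eval(generated) }
klass.new("Amy", "Guy").instance_variable_get(:@boss) # => "Guy"
```

The indentation of the printed code is uneven because only the first assignment carries the leading spaces from the template, but Ruby doesn't mind.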

Here's our final custom initializer with all the pieces assembled:

module CustomInitializer
  def initialize(*setup_args)
    attr_reader(*setup_args)
    private(*setup_args)

    mod = Module.new
    line = __LINE__; mod.class_eval %{
      def initialize(#{setup_args.join(',')})
        #{setup_args.map do |arg|
          ['@',arg,' = ',arg].join
        end.join("\n")}
      end
    }, __FILE__, line
    const_set('Initializer', mod)
    include mod
  end
end
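Here is a self-contained sketch of the assembled initializer in use. The names "Amy" and "Guy" are placeholder values, and the module is repeated so that the example runs on its own.

```ruby
module CustomInitializer
  def initialize(*setup_args)
    attr_reader(*setup_args)
    private(*setup_args)

    mod = Module.new
    line = __LINE__; mod.class_eval %{
      def initialize(#{setup_args.join(',')})
        #{setup_args.map do |arg|
          ['@', arg, ' = ', arg].join
        end.join("\n")}
      end
    }, __FILE__, line
    const_set('Initializer', mod)
    include mod
  end
end

class Employment
  extend CustomInitializer

  initialize :employee, :boss

  # The private readers are available inside the object.
  def summary
    "#{employee} reports to #{boss}"
  end
end

employment = Employment.new("Amy", "Guy")
employment.summary                # => "Amy reports to Guy"
employment.respond_to?(:employee) # => false, the reader is private
```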

Custom initializer with our custom tool

With the techniques from this Ruby DSL Handbook chapter, we can have our ProcessLater module provide an initialize method which can handle the dependencies we need for the background work, as well as be a warning sign to developers that something different is going on.

Here's an alternative to our original solution which hijacked the new method.

module ProcessLater
  # ...omitted code...

  def self.included(klass)
    # ...omitted code...

    # add the initializer
    klass.extend(CustomInitializer)
  end

  class Later < Que::Job
    # ... omitted code...
    def run(*args)
      options = args.pop

      # Arguments changed from just "args" to "*args"
      self.class_to_run.new(*args).send(options['trigger_method'])
    end
  end

  module CustomInitializer
    def initialize(*setup_args)
      attr_reader(*setup_args)

      mod = Module.new
      line = __LINE__; mod.class_eval %{
        def initialize(#{setup_args.join(',')})
          #{setup_args.map do |arg|
            ['@',arg,' = ',arg].join
          end.join("\n")}

          @initializer_arguments = [#{setup_args.join(',')}]
        end
      }, __FILE__, line
      const_set('Initializer', mod)
      include mod
    end
  end
end

These are the changes from the approach that hijacked new, and with them we'll be able to use our new initialize method:

class SomeProcess
  include ProcessLater

  initialize :some_id

  def call
    puts "#{self.class} ran #{__method__} with #{some_id}"
  end
end

Now we can see explicit use of initialize.

This gives us an alternative approach that may do a better job of communicating with other developers about what this object needs to initialize.

Building better solutions with your team in mind

Ruby gives us a lot of power to make decisions not only about how the code works, but how we want to understand it when we use it.

The code we've created supports our desire to stay focused on an individual task. We can decide to run code now or run it later without the need to build an intermediary background class in a way that keeps our code cohesive with closely related features tied together.

We've altered the code to be flexible enough to run in multiple places. Once we've made it work with one class, we were able to use it on any other class without the burden of having to rethink the solution of using a background job.

Finally, we made the code so that it would ensure that developers would not forget an important dependency. Our solution first had an interception of new with no extra work for the developer and we later balanced that decision and rethought it to provide some explicit indicators to future developers about how this code works.

As we work together to build software, we communicate in unspoken ways through the structure of our code. We can push each other to make good or bad decisions and we can even make the next developer to come along feel powerful when using our code.

This is the reason I wrote the Ruby DSL Handbook.

There are many ways to approach the challenges we face in our code. Knowing how to build and use our own tools and how to help others repeat our good work with ease can make a team work more efficiently. With Ruby, we can build a language around what we think and say and do that helps to guide our software development.

As you go forward solving problems with your team, consider the many ideas that each team member can bring to the code. When you decide to build your tool in a certain way, what does that do to the alternatives?

I'd love to know how you build your tools and what sorts of decisions you have made. Drop me a note especially if you've picked up the Ruby DSL Handbook and read through it, watched the videos, and experimented or applied any ideas or techniques.

If you haven't yet, grab a copy of the Ruby DSL Handbook and bend Ruby to your will... or maybe just build tools that make you and your team happy to have a solid understanding of your code.

Jim Gay

February 22, 2017

Building a tool that's easy for your team to use

In previous articles I shared how I moved a solution to a problem into a general tool.

Building your own tools helps you avoid solving the same problem over and over again. Not only does it give you more power over the challenges in your system, but it gives you a point of communication about how a problem is solved.

By building tools around your patterns you'll be able to assign a common language to how you understand a problem and its solution. Team members are better able to pass along understanding by using and manipulating the tools of their trade rather than reexplaining a solution and repeating the same workarounds.

We can compress our ideas and solutions into a simpler language by building up the Ruby code that supports it.

Here's the code:

module ProcessLater
  def later(which_method)
    later_class.enqueue(initializer_arguments, 'trigger_method' => which_method)
  end

  private

  def later_class
    self.class.const_get(:Later)
  end

  class Later < Que::Job
    # create the class-level accessor to get the related class
    class << self
      attr_accessor :class_to_run
    end

    # create the instance method to access it
    def class_to_run
      self.class.class_to_run
    end

    def run(*args)
      options = args.pop # get the hash passed to enqueue
      self.class_to_run.new(args).send(options['trigger_method'])
    end
  end

  def self.included(klass)
    # create the unnamed class which inherits what we need
    later_class = Class.new(::ProcessLater::Later)

    # name the class we just created
    klass.const_set(:Later, later_class)

    # assign the class_to_run variable to hold a reference
    later_class.class_to_run = klass
  end
end

I showed how I'd use this code with this sample:

class SomeProcess
  include ProcessLater

  def initialize(some_id)
    @initializer_arguments = [some_id]
    @object = User.find(some_id)
  end
  attr_reader :initializer_arguments

  def call
    # perform some long-running action
  end
end

Unfortunately EVERY class that uses ProcessLater will need to implement initializer_arguments. What will happen if you forget to implement it? Errors? Failing background jobs?

Ruby's Comparable library is an example of one that requires a method to be defined in order to be used properly, so it's not an unprecedented idea.

Dangerous combination: implicit dependencies and confusing failures

The Comparable library is a fantastic tool in Ruby's standard library. By defining one method, you gain many other useful methods for comparing and otherwise organizing your objects.

But here's an example of what happens when you don't define that required method:

# in a file called compare.rb
class CompareData
  include Comparable

  def initialize(data)
    @data = data
  end
end

first = CompareData.new('A')
second = CompareData.new('B')

first < second # => compare.rb:12:in `<': comparison of CompareData with CompareData failed (ArgumentError)

comparison of CompareData with CompareData failed (ArgumentError) isn't a helpful error message. It even tells me the problem is in the < method and it's an ArgumentError, but the problem isn't really there.

If you're new to using Comparable, this is a surprising result and the message tells you nothing about what to do to fix it.

If you know how to use Comparable, you'd immediately spot the problem in our small class: there's no <=> method (often called the "spaceship operator").

The Comparable library has an implicit dependency on <=> in classes where it is used.

We can fix our code by defining it:

# in a file called compare.rb
class CompareData
  include Comparable

  def initialize(data)
    @data = data
  end
  attr_reader :data

  def <=>(other)
    data <=> other.data
  end
end

first = CompareData.new('A')
second = CompareData.new('B')

first < second # => true

After what could have been a lot of head scratching, we've got our comparable data working. Thanks to our knowledge of that implicit dependency, we got past it quickly.
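As a short sketch of what that single method buys us, here are a few of the other methods Comparable builds on top of <=>. The class is repeated so the example runs on its own, and the third instance is added just for illustration.

```ruby
class CompareData
  include Comparable

  def initialize(data)
    @data = data
  end
  attr_reader :data

  def <=>(other)
    data <=> other.data
  end
end

first = CompareData.new('A')
second = CompareData.new('B')
third = CompareData.new('C')

first < second                   # => true
second >= first                  # => true
second.between?(first, third)    # => true
first == CompareData.new('A')    # => true, Comparable defines == using <=>

# sort relies on <=> as well
[third, first, second].sort.map(&:data) # => ["A", "B", "C"]
```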

Built-in dependency warning system

Although it's true that the documentation for Comparable says "The class must define the <=> operator," it's always nice to know that the code itself will complain in useful ways when you're using it the wrong way.

Sometimes we like to dive into working with the code to get a feel for how things work. Comparable and libraries like it that have implicit dependencies don't lend themselves to playful interaction to discover their uses.

I mentioned this implicit dependency in the previous article:

The downside with this is that we have this implicit dependency on the initializer_arguments method. There are ways around that and techniques to use to ensure we do that without failure but for the sake of this article and the goal of creating this generalized library: that'll do.

But really, that won't do. Requiring developers to implement a method to use this ProcessLater library isn't bad, but there should be a very clear error to occur if they do forget.

Documentation can be provided (and it should!) but I want the concrete feedback I get from direct interaction with it. I'd hate to have developers spend time toying with a problem only to remember hours later that they forgot the most important part.

Better yet, I'd like to provide them with a way to ensure that they don't forget.

We could check for the method we need when the module is included:

module ProcessLater
  def self.included(klass)
    unless klass.method_defined?(:initializer_arguments)
      raise "Oops! You need to define `initializer_arguments' to initialize this class in the background."
    end
  end
end

class SomeProcess
  include ProcessLater
end # => RuntimeError: Oops! You need to define `initializer_arguments' to initialize this class in the background.

That's helpful noise. And it should be easy to fix:

class SomeProcess
  include ProcessLater

  def initializer_arguments
    # ...
  end
end # => RuntimeError: Oops! You need to define `initializer_arguments' to initialize this class in the background.

Wait a minute! What happened!?

When the Ruby virtual machine processes this code, it executes from the top to the bottom.

The included hook is fired before the required method is defined.
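We can watch that ordering with a small probe. OrderProbe and the two class names are hypothetical and exist only to record what the hook can see at the moment it fires.

```ruby
# Hypothetical module that records what the included hook can see.
module OrderProbe
  SEEN = {}

  def self.included(klass)
    SEEN[klass.name] = klass.method_defined?(:initializer_arguments)
  end
end

class IncludedFirst
  include OrderProbe # the hook fires here, before the method exists

  def initializer_arguments
    []
  end
end

class DefinedFirst
  def initializer_arguments
    []
  end

  include OrderProbe # the hook fires here, after the method exists
end

OrderProbe::SEEN["IncludedFirst"] # => false
OrderProbe::SEEN["DefinedFirst"]  # => true
```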

We could include the library after the method definition:

class SomeProcess
  def initializer_arguments
    # ...
  end
  include ProcessLater
end # => SomeProcess

Although that works, other developers will find this to be a weird way of putting things together. Ruby developers tend to expect modules at the top of the source file. Although this example is small, it is, after all, just an example so we should expect that a real world file would be much larger than just these few lines. Finding dependencies included at the bottom of the file would be a surprise, or perhaps we might not find them at all when first reading.

Everything in its right place

Let's keep the included module at the top of the file to prevent confusion and make our dependencies clear.

We can automatically define the initializer_arguments method and return an empty array:

module ProcessLater
  def initializer_arguments; []; end
end

But that would do away with the helpful noise when we forget to set it.
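One option that keeps the noise, though not the one this article goes on to use, is a default method that raises with a clear message until the class provides its own. This is only a sketch; the NeedsArguments module, the two classes, and the wording of the message are assumptions for illustration.

```ruby
# Hypothetical module: the default implementation complains loudly.
module NeedsArguments
  def initializer_arguments
    raise NotImplementedError,
          "#{self.class} must define `initializer_arguments' to run in the background"
  end
end

class Forgetful
  include NeedsArguments
end

class Prepared
  include NeedsArguments

  def initialize(some_id)
    @initializer_arguments = [some_id]
  end
  # The reader defined in the class takes precedence over the module's method.
  attr_reader :initializer_arguments
end

Prepared.new(1).initializer_arguments # => [1]

message = begin
  Forgetful.new.initializer_arguments
rescue NotImplementedError => error
  error.message
end
message # => "Forgetful must define `initializer_arguments' to run in the background"
```

The complaint only arrives when the method is first called, so a forgotten definition could still slip through until a job is queued.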

One way to ensure that the values are set is to intercept the object initialization. I've written about managing the initialize method before but here's how it can be done:

module ProcessLater
  def new(*args)
    instance = allocate
    instance.instance_variable_set(:@initializer_arguments, args)
    instance.send(:initialize, *args.flatten)
    instance
  end
end

The new method on a class is a factory which allocates a space in memory for the object, runs initialize on it, then returns the instance. We can change this method to also set the @initializer_arguments variable.
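Those steps can be walked through by hand. The Point class here is a made-up example used only to show the sequence.

```ruby
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end
end

# Roughly what Point.new(1, 2) does for us:
instance = Point.allocate        # an empty object; initialize has not run
instance.x                       # => nil
instance.send(:initialize, 1, 2) # initialize is private, so we use send
instance.x                       # => 1
instance.y                       # => 2
```

Between allocate and initialize there is a gap where we can do our own setup, which is what the intercepted new method takes advantage of.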

But this also requires that we change the structure of our module.

Because we want to use a class method (new) we need to extend our class with a module instead of including it.
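The difference between the two is quick to demonstrate. Greeting, Included, and Extended are hypothetical names for this sketch.

```ruby
module Greeting
  def greet
    "hello from #{self.class}"
  end
end

class Included
  include Greeting # methods become instance methods
end

class Extended
  extend Greeting # methods become methods on the class itself
end

Included.new.respond_to?(:greet) # => true
Included.respond_to?(:greet)     # => false
Extended.respond_to?(:greet)     # => true
Extended.new.respond_to?(:greet) # => false
```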

Our ProcessLater module already makes use of the included hook, so we can do what we need there. But first, let's make a module to use under the namespace of ProcessLater.

module ProcessLater
  module Initializer
    def new(*args)
      instance = allocate
      instance.instance_variable_set(:@initializer_arguments, args)
      instance.send(:initialize, *args.flatten)
      instance
    end
  end
end

Next, we can add a line to the included hook to wire up this new feature:

module ProcessLater
  def self.included(klass)
    later_class = Class.new(::ProcessLater::Later)
    klass.const_set(:Later, later_class)

    # extend the klass with our Initializer
    klass.extend(Initializer)

    later_class.class_to_run = klass
  end
end

The final change is to make sure that all objects which implement this module have the initializer_arguments method to access the variable that our Initializer sets.

module ProcessLater
  attr_reader :initializer_arguments
end

No longer possible to forget

Our library will now intercept calls to new and store the arguments on the instance allowing them to be passed into our background job.

Developers won't find themselves in a situation where they could forget to store the arguments for the background job.

Here's what it's like to use it:

class SomeProcess
  include ProcessLater

  def initialize(some_id)
    @some_id = some_id
  end
  attr_reader :some_id

  def call
    # ...
  end
end

That's a lot simpler than adding a line in every initialize method to store an implicitly required @initializer_arguments variable.
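To try the argument capture without a job queue, here is a trimmed-down sketch. ArgumentCapture is a hypothetical stand-in that keeps only the included hook, the Initializer, and the reader; the Later job class is left out.

```ruby
# Hypothetical stand-in for the library, minus the background job class.
module ArgumentCapture
  def self.included(klass)
    klass.extend(Initializer)
  end

  attr_reader :initializer_arguments

  module Initializer
    def new(*args)
      instance = allocate
      # Store the arguments before the class's own initialize runs.
      instance.instance_variable_set(:@initializer_arguments, args)
      instance.send(:initialize, *args.flatten)
      instance
    end
  end
end

class SomeProcess
  include ArgumentCapture

  def initialize(some_id)
    @some_id = some_id
  end
  attr_reader :some_id
end

process = SomeProcess.new(1)
process.some_id               # => 1
process.initializer_arguments # => [1]
```

The class never mentions @initializer_arguments, yet the values are there when the background job needs them.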

Although developers on your team will no longer find themselves in a situation to forget something crucial, you may still not like overriding the new method like this. That might be a valid concern for your team, and I have an alternative approach to create a custom and explicit initialize method next time.

For now, however, we can see that Ruby not only gives us the power to make our code easy to run in the background, but also gives us what we need to automatically manage our dependencies.

What this means for other developers

When we write software, we are not only solving a technical or business problem, but we're introducing potential for our fellow developers to succeed or fail.

This can be an important factor in how your team communicates about your work and the code required to do it.

It may be acceptable to have a library like Comparable which implicitly requires a method to be defined. Or perhaps something like that might fall through the cracks and cause bugs too easily.

If we build tools that implicitly require things, it's useful to automatically provide them.
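Here's a small sketch of how an implicit requirement like Comparable's can fall through the cracks. The Version class is hypothetical, made up only for this illustration.

```ruby
# Hypothetical class that includes Comparable but forgets to define <=>.
class Version
  include Comparable
  attr_reader :number

  def initialize(number)
    @number = number
  end
end

begin
  Version.new(1) < Version.new(2)
  outcome = :compared
rescue ArgumentError
  # Comparable can't compare the objects without a meaningful <=>
  outcome = :failed
end

# Supplying the implicitly required method makes the comparison work.
class Version
  def <=>(other)
    number <=> other.number
  end
end

fixed = Version.new(1) < Version.new(2)
```

Nothing complains when the module is included. The failure only shows up later, when someone actually compares two objects.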

Ready to go

We finally have a tool that can be passed along to others without much fear that they'll run into surprising errors.

Our ProcessLater library is ready to include in our classes. We can take our long-running processes and isolate them in the background by including our module and using our later method on the instance:

class ComplexCalculation
  include ProcessLater

  # ...existing code for this class omitted...
end

ComplexCalculation.new(with, whatever, arguments).later(method_to_run)

This gives us a way to reevaluate the code which might be slow or otherwise time consuming and make a decision to run it later. As developers come together to discuss application performance issues, we'll have a new tool in our vocabulary of potential techniques to overcome the challenges.

Finally, here's the complete library:

module ProcessLater
  def later(which_method)
    later_class.enqueue(*initializer_arguments, 'trigger_method' => which_method)
  end

  attr_reader :initializer_arguments

  private

  def later_class
    self.class.const_get(:Later)
  end

  class Later < Que::Job
    # create the class-level accessor to get the related class
    class << self
      attr_accessor :class_to_run
    end

    # create the instance method to access it
    def class_to_run
      self.class.class_to_run
    end

    def run(*args)
      options = args.pop # get the hash passed to enqueue
      class_to_run.new(*args).send(options['trigger_method'])
    end
  end

  def self.included(klass)
    # create the unnamed class which inherits what we need
    later_class = Class.new(::ProcessLater::Later)

    # name the class we just created
    klass.const_set(:Later, later_class)

    # add the initializer
    klass.extend(Initializer)

    # assign the class_to_run variable to hold a reference
    later_class.class_to_run = klass
  end

  module Initializer
    def new(*args)
      instance = allocate
      instance.instance_variable_set(:@initializer_arguments, args)
      instance.send(:initialize, *args.flatten)
      instance
    end
  end
end

When you solve your application's challenges, how do you build new tools? In what ways are the tools you build aiding future developers in their ability to overcome challenges without confusing errors or unknown dependencies?

Jim Gay

January 25, 2017

Turning a specific solution into a general tool

In a previous article I explored how I make it easier to put work into the background. The goal is to be able to decide whether to run some procedure immediately or asynchronously via a background job. Here it is:

class SomeProcess
  class Later < Que::Job
    def run(*args)
      options = args.pop # get the hash passed to enqueue
      ::SomeProcess.new(*args).send(options['trigger_method'])
    end
  end

  def initialize(some_id)
    @some_id = some_id
    @object = User.find(some_id)
  end
  attr_reader :some_id

  def later(which_method)
    Later.enqueue(some_id, 'trigger_method' => which_method)
  end

  def call
    # perform some long-running action
  end
end

This works well for this class, but eventually we'll want to use this same idea elsewhere. You can always copy and paste, but we know that's a short-term solution.

Generalizing your solution

Here's how we can take a solution like this and turn it into a more general tool.

First, I like to come up with the code that I want to write in order to use it. Deciding what code you want to write often means deciding how explicit you want to be.

Do we want to extend or include a module? How should we specify that methods can be performed later? Do we need to provide any default values?

I often begin answering these questions for myself but end up changing my answers as I think through them or even coming up with additional questions.

Here's where I might start...

Often, I want my code to clearly opt in to using a library like the one we're building. It is possible, however, to automatically make it available.

We can monkey-patch Class for example so that all classes might have this ability. But implicitly providing features to a vast collection of types lacks the clarity that developers of the future will want to find when reading through or changing our code.

Although I want to be able to make any class have the ability to run in the background, I'll want to explicitly declare that it can do that.

class SomeProcess
  include ProcessLater
end

And here's what we would need inside that module:

module ProcessLater
  def later(which_method)
    Later.enqueue(some_id, 'trigger_method' => which_method)
  end

  class Later < Que::Job
    def run(*args)
      options = args.pop # get the hash passed to enqueue
      ::SomeProcess.new(*args).send(options['trigger_method'])
    end
  end
end

We've just moved some code around but have mostly left it the way it was before. This means we'll have a few problems.

Overcoming specific requirements in generalizations

Our ProcessLater module has a direct reference to SomeProcess so the next class where we attempt to use this module will have trouble.

We need to tell our background job what class to initialize when it's pulled from the queue.

That means our Later class needs to look something like this:

class Later < Que::Job
  def run(*args)
    options = args.pop # get the hash passed to enqueue
    class_to_run.new(*args).send(options['trigger_method'])
  end
end

Every class that uses ProcessLater would need to provide that class_to_run object. We could initialize our Later class with an argument, but often with background libraries we don't have control over the initialization. Typically, all we get is a method like run or perform which accepts our arguments.

We'll get to solving that in a minute but another problem we'll see is that every queued job would be for the ProcessLater::Later class. Even though we're creating a generalized solution, I'd rather see something more specific in my queue.

I like to keep related code as close together as is reasonably possible and that leads me to nesting my background classes within the class of concern.

Here's an example of what jobs I'd like to see in my queue: SomeProcess::Later, ComplexCalculation::Later, SolveHaltingProblem::Later.

Seeing that data stored for processing (along with any relevant arguments) would give me an idea of what work would need to be done.

Creating a custom general class

We can create those classes when we include our module.

module ProcessLater
  def later(which_method)
    Later.enqueue(some_id, 'trigger_method' => which_method)
  end

  class Later < Que::Job
    # create the class-level accessor to get the related class
    class << self
      attr_reader :class_to_run
    end

    # create the instance method to access it
    def class_to_run
      self.class.class_to_run
    end

    def run(*args)
      options = args.pop # get the hash passed to enqueue
      class_to_run.new(*args).send(options['trigger_method'])
    end
  end

  def self.included(klass)
    # create the unnamed class which inherits what we need
    later_class = Class.new(::ProcessLater::Later)

    # assign the @class_to_run variable to hold a reference to the including class
    later_class.instance_variable_set(:@class_to_run, klass)

    # name the class we just created
    klass.const_set(:Later, later_class)
  end
end

There's a lot going on there but the end result is that when you include ProcessLater you'll get a background class of WhateverYourClassIs::Later.
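The naming step is easier to see on its own. This is a minimal sketch with hypothetical names: Nameable and Base stand in for ProcessLater and its Later class, and no job library is involved.

```ruby
module Nameable
  # Hypothetical base class standing in for ProcessLater::Later
  class Base; end

  def self.included(klass)
    anonymous = Class.new(Base)        # a subclass with no name yet
    klass.const_set(:Later, anonymous) # now named after the including class
  end
end

class Report
  include Nameable
end

Report::Later.name       # "Report::Later"
Report::Later.superclass # Nameable::Base
```

A class created with Class.new takes its name from the first constant it is assigned to, which is why the queue ends up showing the including class's name.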

But there's still a problem. Our later method in the ProcessLater module enqueues the background job with Later. That constant will actually resolve to ProcessLater::Later, but we need it to be the specific class we just created.

We want the instance we create to know how to enqueue itself to the background. All we need to do is provide a method which will look for that constant.

module ProcessLater
  def later(which_method)
    later_class.enqueue(some_id, 'trigger_method' => which_method)
  end

  private

  # Find the constant in the class that includes this module
  def later_class
    self.class.const_get(:Later)
  end
end
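The difference between the two lookups can be shown with a short sketch. Lookup and Invoice are hypothetical names used only to contrast a bare constant reference with const_get on the instance's class.

```ruby
module Lookup
  class Later; end

  # A bare constant is resolved where the method was written.
  def lexical
    Later
  end

  # const_get starts the search from the class of the instance.
  def dynamic
    self.class.const_get(:Later)
  end
end

class Invoice
  include Lookup
  class Later; end
end

Invoice.new.lexical # Lookup::Later
Invoice.new.dynamic # Invoice::Later
```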

Knowing how to initialize

There's still one problem: initializing your object.

The later method knows about that some_id argument. But not all classes are the same and arguments for initialization are likely to be different.

We're going to go with a "let's just make it work" kind of solution. Since we need to know how to initialize, we can just put those arguments into an @initializer_arguments variable.

class SomeProcess
  include ProcessLater

  def initialize(some_id)
    @initializer_arguments = [some_id]
    @object = User.find(some_id)
  end
  attr_reader :initializer_arguments
end

Now, instead of keeping track of an individual value, we track an array of arguments. We can alter our enqueueing method to use that array instead:

module ProcessLater
  def later(which_method)
    later_class.enqueue(*initializer_arguments, 'trigger_method' => which_method)
  end
end

Our general solution will now properly handle specific class requirements.

The downside is that we now have an implicit dependency on the initializer_arguments method. There are ways around that, and techniques to ensure it's always provided, but for the sake of this article and the goal of creating this generalized library: that'll do.

I'll cover handling those requirements like providing initializer_arguments in the future, but for now: how would you handle this? What impact would code like this have on your team?

A thin slice between you and the background

With that change, we're enqueueing our background jobs with the right classes.

Here's the final flow:

Initialize your class: SomeProcess.new(123)
Run later(:call) on it
That enqueues the details, storing the background class as SomeProcess::Later
The job is picked up and the SomeProcess::Later class is initialized
The job object in turn initializes SomeProcess.new(123) and runs your specified method: call

That gives us a very small generalized layer for moving work into the background. What you'll see in your main class files is this:

class SomeProcess
  include ProcessLater

  def initialize(some_id)
    @initializer_arguments = [some_id]
    @object = User.find(some_id)
  end
  attr_reader :initializer_arguments

  def call
    # perform some long-running action
  end
end

And here's the final library:

module ProcessLater
  def later(which_method)
    later_class.enqueue(*initializer_arguments, 'trigger_method' => which_method)
  end

  private

  def later_class
    self.class.const_get(:Later)
  end

  class Later < Que::Job
    # create the class-level accessor to get the related class
    class << self
      attr_accessor :class_to_run
    end

    # create the instance method to access it
    def class_to_run
      self.class.class_to_run
    end

    def run(*args)
      options = args.pop # get the hash passed to enqueue
      class_to_run.new(*args).send(options['trigger_method'])
    end
  end

  def self.included(klass)
    # create the unnamed class which inherits what we need
    later_class = Class.new(::ProcessLater::Later)

    # name the class we just created
    klass.const_set(:Later, later_class)

    # assign the class_to_run variable to hold a reference
    later_class.class_to_run = klass
  end
end
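If you'd like to watch the flow run end to end without a real queue, here's a sketch that swaps the job library for an in-memory stand-in. FakeJob, its QUEUE, its work_all method, and the Shout class are all hypothetical; a real library stores jobs in a database and runs them in another process. The sketch assumes arguments are splatted on the way into the queue and again on the way into new.

```ruby
require 'json'

# Hypothetical stand-in for a job base class such as Que::Job.
class FakeJob
  QUEUE = []

  # Store the job class name and the JSON-encoded arguments.
  def self.enqueue(*args)
    QUEUE << [name, JSON.generate(args)]
  end

  # Pull each job off the queue, find its class by name, and run it.
  def self.work_all
    until QUEUE.empty?
      class_name, payload = QUEUE.shift
      Object.const_get(class_name).new.run(*JSON.parse(payload))
    end
  end
end

module ProcessLater
  def later(which_method)
    later_class.enqueue(*initializer_arguments, 'trigger_method' => which_method)
  end

  private

  def later_class
    self.class.const_get(:Later)
  end

  class Later < FakeJob
    class << self
      attr_accessor :class_to_run
    end

    def run(*args)
      options = args.pop # the hash passed to enqueue, with string keys
      self.class.class_to_run.new(*args).send(options['trigger_method'])
    end
  end

  def self.included(klass)
    later_class = Class.new(::ProcessLater::Later)
    klass.const_set(:Later, later_class)
    later_class.class_to_run = klass
  end
end

class Shout
  include ProcessLater
  RESULTS = []

  def initialize(word)
    @initializer_arguments = [word]
    @word = word
  end
  attr_reader :initializer_arguments

  def call
    RESULTS << @word.upcase
  end
end

Shout.new('hello').later(:call)
queued_as = FakeJob::QUEUE.first.first # "Shout::Later"
FakeJob.work_all
Shout::RESULTS # ["HELLO"]
```

The JSON round trip is there on purpose: it turns the :call symbol into the string "call", which is a reminder that only simple values survive the trip through a job store.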

We'll explore more about building your own tools in the future and I put a lot of effort into explaining what you can do with Ruby in the Ruby DSL Handbook, so check it out and if you have any questions (or feedback), just hit reply!

Certainly some will say "Why aren't you using ActiveJob?" or "Why aren't you using Sidekiq?" or "Why aren't you ...."

All of those questions are good ones.

The way your team works, interacts, and builds their own tools has a lot more to do with answering those questions than my reasons. Many different decisions can be made but it's important for your whole team to understand which questions are the most important to answer.

Follow up this article with the next in the series: Building a tool that's easy for your team to use

Jim Gay

January 12, 2017

How I fixed my biggest mistake with implementing background jobs

When I first began implementing background jobs I found myself moving my code into an appropriate background class whether it was in app or lib somewhere. I found it frustrating that I needed to shift code around based only on the decision to run it in the background. It seemed to be the conventional wisdom to do this, or at least that's what I thought.

It's not uncommon to find a reason to move some of your application into the background.

We build up systems that do a lot of work and at a certain point find that the work to be done takes too long. This often means that we reach for a new background job class, move our code to it, stick it in a place like app/jobs and then we're done with it.

Our code now lives in the background jobs and we move along to the next feature. So we separate app/models and app/jobs.

But this ends up feeling like I have two applications. I have my user facing application where most of my work happens and the one where things must happen in the background. But this is mostly a false dichotomy. That's not really how it works nor how any of us think about the system.

While it is important that background processing happen when necessary, I'd rather make those decisions as I determine them. I'd rather those decisions not require that I rearrange my code like that.

Make decisions about background processing without rearranging your code

Here's an example of being able to make those decisions as you determine them. Developers using Rails may be familiar with this easy to change aspect of ActionMailer:

Notifier.welcome(some_user).deliver_now
# or with a minor change to the code...
Notifier.welcome(some_user).deliver_later

This feature is built-in and ready to use. Few things are easier to change than altering now into later.

We can do the same in our code too.

To be able to handle this same approach, we'd need to create objects which can be initialized with data from your background data store.

A specialized class to write our object data to the background data store can do the job well, and it's pretty easy to do.

Here's a quick example.

Let's say you need to run some process.

class SomeProcess
  def initialize(some_id)
    @object = User.find(some_id)
  end

  def call
    # perform some long-running action (use your imagination)
  end
end
process = SomeProcess.new('ee6f1d66-b4e5-11e6-80f5-76304dec7eb7')
process.call

But you've implemented a long-running process that runs as soon as some user requests it. If you follow the example from ActionMailer, maybe you could just change process.call to process.call_later.

I want to have the ability to make decisions as easy as this when I need to run the Process code. When you keep all related code together, it’s easier to understand and make changes.

To make this work, we'll need that class to have a call_later method. We might have other methods we want to be able to run later too. Implementing method_missing can make this work...

class SomeProcess
  def method_missing(method_name, *args, &block)
    if method_name.to_s =~ /_later\z/
      # run it in the background
    else
      super # do the normal thing ruby does
    end
  end
end

The above implementation of method_missing will catch any methods ending in _later, but I'd rather not do that.

Using method_missing hides the implementation of our hook into the world of background jobs. It's hard to know that it is there, and it will probably be difficult to find later when you want to understand how it works.

Instead, I'm going to write some code so that I can run my code in the background by using later(:call) instead of call. It might not be as elegant as appending _later to a method name, but the implementation is easier to get going and will put the code in a place where you can more easily find it.

Saving for later

So here's where we'll start:

class SomeProcess
  def later(trigger_method)
    # ...now what?
  end
end

We've made a later method that will accept a single argument of some method that we want to run.

That's easy enough but now we need to actually save this to the background. Usually this involves referring to some background job class and saving it.

We need to create a class that will save our object information to the background data store, and then later initialize the object and run our trigger_method.

Writing the data to your background store will be handled by whatever library you use for managing background jobs. This example will use Que but the differences with yours won't matter much.

Our background class needs to initialize the SomeProcess object and tell it to run the trigger_method.

There's a trick to doing this. Our background class needs to know what attributes are required to initialize the object, and which one to use as the method we're calling.

First, let's make a minor change to our initializer to store the argument we're given:

class SomeProcess
  def initialize(some_id)
    @some_id = some_id # <-- keeping track of the argument
    @object = User.find(some_id)
  end
  attr_reader :some_id
end

This change allows us to reference the argument we're given so that we can use it when we enqueue our background job. Rails allows us to pass an object into an ActiveJob instance (which we're not using here) and makes its best guess about how to serialize and deserialize the data to initialize our objects. Given our simple example here, we don't really need that feature (but we could implement the same if we like).

We really only need the class loaded by the background process to be a thin mediator between the data in the job store and the class which defines our business process. So I just make the class as small and isolated as possible.

class SomeProcess
  class Later < Que::Job
    def run(*args)
      options = args.pop # get the hash passed to enqueue
      ::SomeProcess.new(*args).send(options['trigger_method'])
    end
  end

  def later(which_method)
    Later.enqueue(some_id, 'trigger_method' => which_method)
  end
end

There's no need to make a new thing inside of app/jobs (or anywhere else) since we never directly access this Later class.

My main purpose in the code is to initialize SomeProcess and use the call method. The only decision I need to make is whether to run it immediately or run it later.

Once I began organizing my code with small and focused background classes, I was able to push aside the concern of when it would run until I needed to make that decision. Reading my code left my head clearer when I kept it all together.

With this code, I can make that decision as I determine the need for it.

Here's the final code for the class:

class SomeProcess
  class Later < Que::Job
    def run(*args)
      options = args.pop # get the hash passed to enqueue
      ::SomeProcess.new(*args).send(options['trigger_method'])
    end
  end

  def initialize(some_id)
    @some_id = some_id
    @object = User.find(some_id)
  end
  attr_reader :some_id

  def later(which_method)
    Later.enqueue(some_id, 'trigger_method' => which_method)
  end

  def call
    # perform some long-running action
  end
end

There are some other features we could add to this, but this small class and later method get us where we want to be.

Now our decision to run this immediately or at a later time is as simple as changing this:

process.call

to this:

process.later(:call)

Are the objects in your system able to change like this? How do you handle decisions to move actions into the background?

Check the next article in the series: Turning a specific solution into a general tool

Jim Gay

August 19, 2015

Commanding objects toward immutability

Following the rules for East-oriented Code helps me organize behavior in my code but it can lead to other benefits as well. As a result of following the rules, I find that my code is better prepared for restrictions like those that immutable objects introduce.

I recently went looking for samples of how people are using instance_eval and instance_exec and ended up with a great example from FactoryGirl thanks to Joshua Clayton. As I was searching, I came upon some code which happened to use instance_eval. Although it was a simple use case for that method it lent itself as a much better example of commands, immutability, and East-oriented code.

Here are the details...

If we want to use nested blocks to create a tree structure, we might create some pseudo-code like this to illustrate our desired code:

Node.new('root') do
  node('branch') do
    node('leaf')
    node('leaf2')
    node('leaf3')
  end
end

The representation of this set of objects should look something like this:

"['root', ['branch', ['leaf', 'leaf2', 'leaf3']]]"

This shows that the created tree is a pair of a named node and an array of named children (who can also have children).

Imperative approach

A simple solution is to initialize a Node and, using an imperative approach, to change its state; that is to say that we alter its collection of children.

class Node
  def initialize(name, &block)
    @name = name
    @children = []

    instance_eval(&block) if block
  end

  attr_reader :children, :name

  def node(name, &block)
    children << Node.new(name, &block)
  end
end

When each node is created, its collection of children is set to an empty array. With each call to the node method, a new Node object is created and shoveled into the collection of children.

If we refactor our sample to inline the methods and show us exactly what's going on, it would look something like this:

Node.new('root') do
  self.children << Node.new('branch') do
    self.children << Node.new('leaf')
    self.children << Node.new('leaf2')
    self.children << Node.new('leaf3')
  end
end

We can more clearly see what's happening inside of the node method with this change to our code.

Eastward flow

As I worked with this problem I wondered: what would happen if I started following the 4 rules of East-oriented code?

If our node method returns self, how does that affect our code?

class Node
  # initialize omitted...

  def node(name, &block)
    children << Node.new(name, &block)
    self
  end
end

Fortunately, because our code relies on an imperative approach by changing the state of the children, the code still works.

If we want, we can shrink the space we use by chaining commands together:

t = Node.new("root") do
  node("branch") do
    node("subbranch") do
      node("leaf").node("leaf2").node("leaf3")
    end
  end
end

I think that's actually a little more difficult to read, so we can go back to the regular style:

node("leaf")
node("leaf2")
node("leaf3")

When seeing techniques like returning self to encourage an East-oriented approach, it's easy to fixate on the chaining. But it's commands that we want to introduce, not chaining. The chaining is incidental here.

If you do chain your method calls together, it at least appears more clearly that each subsequent method is operating on the return value of the last one.

If we want to be clear that we're operating on the last return value, we can maintain the readability of the multiline option by writing it like this:

node("leaf").
  node("leaf2").
  node("leaf3")

Each line chains the next by adding the dot character. We don't have a specific need to do this, but it's good to know how it works.

Not much has changed after introducing our East-oriented approach. We're still updating that collection of children.

Introducing immutability

What will we see if we introduce immutable objects to our solution?

Immutable objects might just help us make our code more predictable. An object which never changes, of course, stays the same. This allows you to better handle the behavior of the system and, without changing any objects, makes a multithreaded approach much less likely to introduce headaches.

The simplest way to add immutability is to freeze objects as they are initialized:

class Node
  def initialize(name, &block)
    @name = name.freeze
    @children = [].freeze

    instance_eval(&block) if block
  end

  attr_reader :children, :name

  def node(name, &block)
    children << Node.new(name, &block).freeze
    self
  end
end

This, of course, breaks everything. Our code relies upon the fact that the children array may be mutated. Instead of doing the mutation, we'll see this:

RuntimeError: can't modify frozen Array
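The failure is easy to reproduce outside of Node. This is a small sketch, not part of the tree code; on Ruby 2.5 and later the error raised is a FrozenError, which is a subclass of RuntimeError, so rescuing RuntimeError covers both.

```ruby
children = [].freeze

begin
  children << 'leaf' # shoveling mutates the receiver, so this raises
rescue RuntimeError => error
  # FrozenError on Ruby 2.5 and later, a subclass of RuntimeError
  failure = error.class
end

# Building a new array leaves the frozen one untouched.
next_children = children + ['leaf']

children      # []
next_children # ["leaf"]
```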

Now what?

If we can't alter the collection, we're left at creating an entirely new one.

One thing we could do is change the constructor to accept a collection of children when the Node is initialized. Instead of altering the children, we'd use a constructor like this: Node.new(name, children). Here's what that looks like:

class Node
  def initialize(name, children=[], &block)
    @name = name.freeze
    @children = children.freeze

    instance_eval(&block) if block
  end
  # ... omitted code

end

That still doesn't allow us to change anything until we also change the way our node method works (since it is responsible for handling changes to the children).

If the node method created a new Node instead of altering the children, that would get us what we want. Let's break it down.

First, when the node method is called, it needs to create the node to be added to the collection of children:

def node(name, &block)
  new_child = Node.new(name, &block)
  # ... ?
  self
end

Since we're trying to avoid mutating the state of this object, we don't want to just shove the new node into the collection of children (and we can't because we used freeze on it).

So let's create an entirely new node, with an entirely new collection of children. In order to do that, we need to ensure that for every existing child object, we create a corresponding new node.

For each command to the object with node, we'll get the representation of what the children should be. So let's build a method to do that:

def next_children
  children.map{|child| Node.new(child.name, child.next_children) }.freeze
end

When we changed our initializer, that allowed us to set the list of children. Our new next_children method relies on that feature and a recursive call to itself to build the collection of children for that new node with Node.new(child.name, child.next_children).

Looking back at our node method we'll need to break the rules of East-oriented Code. Since we have immutable objects, we'll return a new node instead of self.

def node(name, &block)
  new_child = Node.new(name, &block)
  Node.new(self.name, next_children + [new_child])
end

But there's still a problem left. Because our initialized object needs to execute a block, the constructor new might actually need to return a different object than the one originally created. The call to node inside the block changes the return value from the instance that new creates to the instance that node creates.

Controlling the constructor

To better handle our immutable objects and the return values from the methods we created, we can alter the way the new method works on our Node class.

Instead of handling a block in the initialize method, we can move it to new.

Here's the new new method:

def self.new(*args, &block)
  instance = super.freeze
  if block
    instance.instance_eval(&block)
  else
    instance
  end
end

The first step is to call super to get an instance the way Ruby normally creates them (as defined in the super class of Node). Then we freeze it.

If we haven't provided a block to the new method, we'll want to return the instance we just created. If we have provided a block, we'll need to evaluate that block in the context of the instance we just created and return its result.

This means that the block can use the node method and whatever is returned by it.

We need to alter the new method this way because we're not always just returning the instance it creates. Since our objects are frozen, we can't allow the block to alter their states.

The way new usually works is like this:

def self.new(*args, &block)
  instance = allocate
  instance.send(:initialize, *args, &block)
  return instance
end

You can see the reason that Ruby has you call new on a class but in practice you write your initialize method. This structure ensures that no matter the result of your initialize method, new will always return an instance of the class you've used.

We're bending the rules to allow us to evaluate the given block and return its result, instead of the instance typically created by new.
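To make that bend concrete, here's a minimal sketch with a hypothetical Wrapper class. It uses the same structure as our new method: without a block you get the instance, and with a block you get whatever the block returns.

```ruby
# Hypothetical class used only to show new returning a block's result.
class Wrapper
  def self.new(*args, &block)
    instance = super(*args).freeze
    block ? instance.instance_eval(&block) : instance
  end

  def initialize(value)
    @value = value
  end

  def doubled
    @value * 2
  end
end

plain = Wrapper.new(2)               # an ordinary, frozen Wrapper
result = Wrapper.new(2) { doubled }  # 4, the value of the block
```

The second call never hands back a Wrapper at all, which is exactly the behavior that lets a block of node commands decide what new returns.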

After that, we can remove the block evaluation from initialize:

def initialize(name, children=[])
  @name = name.freeze
  @children = children.freeze
end

While the method signature (the list of accepted arguments) has changed for initialize, it's still the same for new: a list of arguments and a block.

Believe it or not, there's still one more problem to solve.

Operating on values

We looked at how returning self allows you to chain your method calls. Although we've broken that rule and are instead returning a new Node object, it's important to consider that chaining.

Our initial code still doesn't work quite right and it's all because we need to think about operating on the return values of our commands and not relying on an imperative approach to building and changing objects.

First, here's what our Node class looks like:

class Node
  def self.new(*args, &block)
    instance = super.freeze
    if block
      instance.instance_eval(&block)
    else
      instance
    end
  end

  def initialize(name, children=[])
    @name = name.freeze
    @children = children.freeze
  end

  attr_reader :children, :name

  def node(name, &block)
    new_child = self.class.new(name, &block)
    self.class.new(self.name, next_children + [new_child])
  end

  def next_children
    children.map{|child| self.class.new(child.name, child.next_children) }.freeze
  end

  def inspect
    return %{"#{name}"} if children.empty?
    %{"#{name}", #{children}}
  end
end

We didn't discuss it, but there's an inspect method to return either the name of the node if it has no children, or the name and a list of children if it has some.

Here's what the code to create the tree looks like:

Node.new('root') do
  node('branch') do
    node('leaf')
    node('leaf2')
    node('leaf3')
  end
end

If we assign the result of that to a variable and inspect it we'll get a surprising result.

t = Node.new('root') do
  node('branch') do
    node('leaf')
    node('leaf2')
    node('leaf3')
  end
end
puts [t].inspect

The output will only be

["root", ["branch", ["leaf3"]]]

So what happened to the other leaf and leaf2 objects? Why aren't they there?

Remember that each node call returns a new node. With every node a new result is returned. The node('leaf') returns an object, but node('leaf2') is not a message sent to the object returned by the first. It is a message sent to the node('branch') result.

Each of those calls is returned and forgotten. Here it is annotated:

t = Node.new('root') do
          node('branch') do
            node('leaf') # returned and forgotten
            node('leaf2') # returned and forgotten
            node('leaf3') # returned and used as the final result
          end
        end
    puts [t].inspect
    #=> ["root", ["branch", ["leaf3"]]]

The answer to this problem is to command each object to do the next thing. We can achieve this by chaining the methods. The result of one method is the object which will receive the next command.

t = Node.new('root') do
          node('branch') do
            node('leaf'). # dot (.) character added to chain
            node('leaf2'). # executed on the result of the last node
            node('leaf3') # executed on the result of the last node
          end
        end
    puts [t].inspect
    #=> ["root", ["branch", ["leaf", "leaf2", "leaf3"]]]

An alternative way to look at this is to store the result of each command:

t = Node.new('root') do
          node('branch') do
            branch = node('leaf')
            next_branch = branch.node('leaf2')
            final_branch = next_branch.node('leaf3')
          end
        end
    puts [t].inspect
    #=> ["root", ["branch", ["leaf", "leaf2", "leaf3"]]]
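To check this behavior in isolation, here is a minimal, self-contained sketch. It repeats a condensed copy of the Node class from above; the variable names branch, with_leaf, and chained are only illustrative and not part of the original code.

```ruby
# Condensed copy of the Node class shown earlier, so this runs on its own.
class Node
  def self.new(*args, &block)
    instance = super.freeze
    block ? instance.instance_eval(&block) : instance
  end

  def initialize(name, children = [])
    @name = name.freeze
    @children = children.freeze
  end

  attr_reader :children, :name

  def node(name, &block)
    new_child = self.class.new(name, &block)
    self.class.new(self.name, next_children + [new_child])
  end

  def next_children
    children.map { |child| self.class.new(child.name, child.next_children) }.freeze
  end
end

branch = Node.new('branch')
with_leaf = branch.node('leaf')

branch.frozen?                  # => true
branch.children.empty?          # => true, the receiver was not changed
with_leaf.equal?(branch)        # => false, node returned a different object
with_leaf.children.map(&:name)  # => ["leaf"]

# Chaining sends each message to the result of the previous one.
chained = Node.new('branch').node('leaf').node('leaf2')
chained.children.map(&:name)    # => ["leaf", "leaf2"]
```

Because the receiver is frozen and never updated, the only way to keep a child is to hold on to the object that node returns.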

Following the rules so you know when to break them

What interested me about this was that my code was already prepared for immutable objects once I had structured it to keep operating on the same object. By structuring my code to return self and send the next message to the result of the last, I was able to change the implementation from an imperative style to a functional style.

Cohesive behaviors with data clumps

Jim Gay

April 30, 2015

A good example of how we use context and locality to understand and manage concepts in our code is using a data clump.

A data clump is a collection of two or more bits of information that are consistently used together. You’ll find that your data loses its meaning when you remove items from the clump.

Date ranges are simple examples of how a data clump puts necessary information into context.
An example of this is to find out if a question was asked between today and one month ago. If our Question class implements a query method for this:

class Question
      def asked_within?(start_date, end_date)
        (start_date..end_date).cover?(self.asked_date)
      end
    end

Then we can pass in our desired dates to get the answer:

# using ActiveSupport
    start_date = 1.month.ago
    end_date = Time.now
    question.asked_within?(start_date, end_date)

Discovering whether a question is within this time frame always requires both a start and end date. This is an indication that we can only understand the feature and indeed only implement it when we have this data clump. To better encapsulate the behavior of these values, we can create a class to manage initializing objects that represent them.

DateRange = Struct.new(:start_date, :end_date)
    last_month = DateRange.new(1.month.ago, Time.now)
    question.asked_within?(last_month)

We can then change our Question class to instead take a date range object for the asked_within? method, but the question’s responsibilities have grown a bit here. A question doesn’t have anything to do with comparing dates, so we can move the control of that information into the data clump that represents them.

DateRange = Struct.new(:start_date, :end_date) do
      def contains?(date)
        (start_date..end_date).cover?(date)
      end
    end

Now, instead of the question managing its date comparison, the date range can do the work.

last_month.contains?(question.asked_date)
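As a runnable sketch of the same idea without ActiveSupport, the following uses Date from the standard library. The Question struct here is a hypothetical stand-in with a single asked_date attribute, and the fixed dates are only there to make the result predictable.

```ruby
require 'date'

DateRange = Struct.new(:start_date, :end_date) do
  def contains?(date)
    (start_date..end_date).cover?(date)
  end
end

# Hypothetical stand-in for the Question class; only asked_date matters here.
Question = Struct.new(:asked_date)

today = Date.new(2015, 4, 30)
last_month = DateRange.new(today << 1, today) # Date#<< steps back one month

recent = Question.new(Date.new(2015, 4, 15))
old = Question.new(Date.new(2015, 1, 1))

last_month.contains?(recent.asked_date) # => true
last_month.contains?(old.asked_date)    # => false
```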

By analyzing the individual parts of this date comparison we have to juggle a bit more in our heads. Considering a range as a complete object rather than a collection of parts is simpler, and we tend not to think of every individual day within a month when doing a mental comparison. A date range is a small system of interacting parts that we better understand as a broader context.

This example shows us the value not only of separating responsibilities, but of bringing objects together. We get more value by putting details into context than we would have if they remained separate.

Things to note

Struct.new returns a class (an instance of Class). Inheriting from the result of Struct.new creates an anonymous class in the ancestors of your created class:

[DateRange, #<Class:0x...>, Struct, ...]

Instead of class DateRange < Struct.new; end use DateRange = Struct.new and avoid an anonymous class in the ancestors:

[DateRange, Struct, ...]
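The difference is easy to confirm in irb. This is a small sketch; the class names InheritedRange and AssignedRange are made up for the comparison.

```ruby
# Subclassing the result of Struct.new leaves an anonymous class in the chain.
class InheritedRange < Struct.new(:start_date, :end_date); end

# Assigning the result of Struct.new to a constant does not.
AssignedRange = Struct.new(:start_date, :end_date)

InheritedRange.ancestors[0]    # => InheritedRange
InheritedRange.superclass.name # => nil, the anonymous struct class has no name
InheritedRange.ancestors[2]    # => Struct

AssignedRange.ancestors[0]     # => AssignedRange
AssignedRange.ancestors[1]     # => Struct
```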

Additionally, be careful with large ranges. If our code used include? instead of cover?, Ruby would initialize a Time object for every time between the beginning and end. As your range grows, the memory needed to calculate the answer will grow too.

Avoid excessive memory and use cover? instead. It will check that your beginning date is less than or equal to the given date, and that the given date is less than or equal to the end date.
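To see the difference in how the two methods work, here is a sketch with a hypothetical Step class that counts how often Ruby asks it for a successor. It is not a date type; it only stands in for any value that Ruby has to iterate over with succ.

```ruby
# Hypothetical value type that records every call to succ.
class Step
  include Comparable
  attr_reader :n

  @succ_calls = 0
  class << self
    attr_accessor :succ_calls
  end

  def initialize(n)
    @n = n
  end

  def <=>(other)
    n <=> other.n
  end

  def succ
    self.class.succ_calls += 1
    Step.new(n + 1)
  end
end

range = Step.new(1)..Step.new(1_000)

range.cover?(Step.new(999))        # compares against the two endpoints only
calls_after_cover = Step.succ_calls # => 0

range.include?(Step.new(999))      # walks the range element by element
calls_after_include = Step.succ_calls # now greater than zero
```

Each call to succ builds a new object, which is the cost that cover? avoids.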

This article is an excerpt from my book Clean Ruby

Locality and Cohesion

Jim Gay

April 21, 2015

"The primary feature for easy maintenance is locality: Locality is that characteristic of source code that enables a programmer to understand that source by looking at only a small portion of it." -- Richard Gabriel

This advice is from Patterns of Software by Richard Gabriel.

Keeping cohesive parts of our system together can help us understand it. By managing locality we can keep cohesive parts together.

It’s easy to see coupling in our code. When one object can't do its job without another, we experience frustration in the face of change. We often think about dependencies in our code, but cohesion is the relatedness of the behaviors and plays an important part in how we organize the ideas to support our domain.

def process_payment(amount)
      gateway.authorize_and_charge(amount) do
        deliver_cart
      end
      logger.info "handling payment: #{amount}"
      logger.info "cart delivered: #{id}"
    end

The exact purpose of this completely-made-up code isn't that important. But we can look at parts of this procedure and extract them into a related method:

def process_payment(amount)
      gateway.authorize_and_charge(amount) do
        deliver_cart
      end
      log_purchase(amount)
    end

    def log_purchase(amount)
      logger.info "handling payment: #{amount}"
      logger.info "cart delivered: #{id}"
    end

As Gabriel points out in his book, we can compress a procedure into a simple phrase like log_purchase but this compression carries a cost. In order to understand the behavior of this log_purchase phrase, we need to understand the context around it.

Indeed, we might look at this and realize that there's a problem with the way we managed the locality of the procedure. Instead of easily understanding a single method, we might look at process_payment and realize there's a bit more to it than we first expect.

We're forced to understand log_purchase and the context which previously surrounded its procedure. A second look at this extraction might lead us to reconsider and to go back to inline the method. Let's keep this code with a tighter locality:

def process_payment(amount)
      gateway.authorize_and_charge(amount) do
        deliver_cart
      end
      logger.info "handling payment: #{amount}"
      logger.info "cart delivered: #{id}"
    end

While extracting the log_purchase method was easy, given the original code, it added a bit too much for us to understand and it doesn't feel quite right. Handling the locality of this code helps us to better understand it and to make better decisions about how to improve the main process_payment method.

Consider this: How much must you pack into your head before you can begin evaluating a part of your code?

While breaking procedures up into small methods can be a useful way to make easy to understand (and easy to test) parts, we may do so to the detriment of understanding.

This is something to consider if you are building a DSL to compress ideas in your code or if you're trying to create objects to manage your business logic. I'll be writing more about the value of controlling the locality of behavior in your system, but I'd love to hear how you manage locality. What do you do to ensure that related bits stay together?

The difference between instance_eval and instance_exec

Jim Gay

April 15, 2015

There's an important difference between instance_eval and instance_exec. And there's a great lesson about how to use them well in FactoryGirl.

But first, before you go rushing off to build your fantastic DSL, let's look at what instance_eval is and does.

The simplest of examples can be taken straight from the Ruby docs:

class KlassWithSecret
      def initialize
        @secret = 99
      end
    end
    k = KlassWithSecret.new
    k.instance_eval { @secret } #=> 99

The current value for self inside the provided block will be the object on which you call instance_eval. So in this case the k object is the current context for the block; @secret is a variable stored inside k and instance_eval opens up access to that object and all of its internal variables.

The interface that FactoryGirl provides is simple and straightforward. Here's an example from its "Getting Started" documentation:

FactoryGirl.define do
      factory :user do
        first_name "Kristoff"
        last_name  "Bjorgman"
        admin false
      end
    end

Here, FactoryGirl uses instance_eval to execute the blocks of code passed to factory.

Let's take a look at some representative code from how FactoryGirl makes this work:

def factory(name, &block)
      factory = Factory.new(name)
      factory.instance_eval(&block) if block_given?
      # ... more code
    end

That's not actually the code from FactoryGirl, but it represents roughly what happens. When the method factory is called a new Factory is created and then the block is executed in the context of that object. In other words, where you see first_name, it's as if you had written factory.first_name with that factory instance as the receiver. By using instance_eval, the users of FactoryGirl don't need to specify the factory object; it is the implicit receiver of every call in the block.
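To make that concrete, here is a minimal sketch of how such a block could be evaluated. The Factory class below is hypothetical and far simpler than FactoryGirl's; collecting attributes through method_missing is an assumption made for the illustration, not a description of the library's internals.

```ruby
# Hypothetical, stripped-down factory; not FactoryGirl's real implementation.
class Factory
  attr_reader :name, :attributes

  def initialize(name)
    @name = name
    @attributes = {}
  end

  # Treat any unknown message with exactly one argument as an attribute declaration.
  def method_missing(attribute, *args)
    return super unless args.size == 1
    @attributes[attribute] = args.first
    self
  end
end

def factory(name, &block)
  factory = Factory.new(name)
  factory.instance_eval(&block) if block_given?
  factory
end

user_factory = factory(:user) do
  first_name "Kristoff" # self is the Factory, so this is factory.first_name("Kristoff")
  admin false
end

user_factory.attributes[:first_name] # => "Kristoff"
user_factory.attributes[:admin]      # => false
```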

_Ok, that's all well and good, but what about instance_exec?_

I'm glad you asked.

The instance_eval method can only evaluate a block (or a string) but that's it. Need to pass arguments into the block? You'll be frozen in your tracks.

But instance_exec, on the other hand, will evaluate a provided block and allow you to pass arguments to it. Let's take a look...
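As a minimal illustration, this sketch reuses the KlassWithSecret class from the Ruby docs example above. The variable names sum and same are only there to hold the results.

```ruby
class KlassWithSecret
  def initialize
    @secret = 99
  end
end

k = KlassWithSecret.new

# instance_exec forwards its arguments to the block while self is k.
sum = k.instance_exec(1) { |n| @secret + n }      # => 100

# instance_eval passes only the receiver itself as the block argument.
same = k.instance_eval { |obj| obj.equal?(self) } # => true
```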

FactoryGirl allows you to handle callbacks to perform some action, for example, after the object is created.

FactoryGirl.define do
      factory :user do
        first_name "Kristoff"
        last_name "Bjorgman"
        admin false

        after(:create) do |user, evaluator|
          create_list(:post, evaluator.posts_count, user: user)
        end
      end
    end

In this sample, the after(:create) is run after the object is created, but the block accepts two arguments: user and evaluator. The user argument is the user that was created. The evaluator is an object which stores all the values created by the factory.

Let's take a look at how this is implemented:

def run(instance, evaluator)
      case block.arity
      when 1, -1 then syntax_runner.instance_exec(instance, &block)
      when 2 then syntax_runner.instance_exec(instance, evaluator, &block)
      else        syntax_runner.instance_exec(&block)
      end
    end

FactoryGirl will create a callback object named by the argument given to the after method. The callback is created with a name, :create in this case, and with a block of code.

The block that we used in our example had two arguments.

The run method decides how to execute the code from the block.

The callback object stores the provided block and Ruby allows us to check the arity of the block, or in other words, it allows us to check the number of arguments.

When looking at a case statement, it's a good idea to check the else clause first. This gives you an idea of what will happen if there's no match for whatever code exists in the when parts.

There we see syntax_runner.instance_exec(&block) and this could easily be changed to use instance_eval instead. Ruby will evaluate, or execute, the block in the context of the syntax_runner object.

If the block's arity is greater than zero, FactoryGirl needs to provide the objects to the block so that our code works the way we expect.

The second part of the case checks if the block arity is equal to 2.

when 2 then syntax_runner.instance_exec(instance, evaluator, &block)

If it is, the syntax_runner receives the instance (or in our case user) and the evaluator.

If, however, the arity is 1 or -1 then the block will only receive the instance object.
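The dispatch can be tried outside of FactoryGirl with a small sketch. The Callback class here is hypothetical: it keeps the same case statement as the run method above but takes the object to evaluate against (called runner below) as an explicit argument, which is an assumption made to keep the example self-contained.

```ruby
# Hypothetical callback holder; the dispatch mirrors the run method above.
class Callback
  def initialize(name, &block)
    @name = name
    @block = block
  end

  def run(instance, evaluator, syntax_runner)
    block = @block
    case block.arity
    when 1, -1 then syntax_runner.instance_exec(instance, &block)
    when 2     then syntax_runner.instance_exec(instance, evaluator, &block)
    else            syntax_runner.instance_exec(&block)
    end
  end
end

runner = Object.new

two_args = Callback.new(:create) { |user, evaluator| [user, evaluator] }
one_arg  = Callback.new(:create) { |user| [user] }
no_args  = Callback.new(:create) { self }

two_result = two_args.run(:user, :evaluator, runner) # => [:user, :evaluator]
one_result = one_arg.run(:user, :evaluator, runner)  # => [:user]
none_result = no_args.run(:user, :evaluator, runner) # => runner, since self is the runner
```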

So what is that -1 value? Let's look at the ways we could create a callback:

# Two arguments and arity of 2
    after(:create) do |user, evaluator|
      create_list(:post, evaluator.posts_count, user: user)
    end
    # One argument and arity of 1
    after(:create) do |user|
      create_group(:people, user: user)
    end
    # Zero arguments and arity of 0
    after(:create) do
      puts "Yay!"
    end
    # Any arguments and arity of -1
    after(:create) do |*args|
      puts "The user is #{args.first}"
    end

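Ruby doesn't know how many args you'll give it with *args, so it throws up its hands and tells you that it's some strange number: -1. (A negative arity of -n-1 means n required arguments followed by optional ones, so -1 means none are required.)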
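You can check these numbers directly by asking a proc for its arity. This is a small sketch; the variable names are illustrative, and the last case goes one step beyond the callbacks shown above to show a required argument combined with a splat.

```ruby
two_args  = proc { |user, evaluator| }
one_arg   = proc { |user| }
no_args   = proc { }
splat     = proc { |*args| }
one_splat = proc { |user, *rest| }

two_args.arity  # => 2
one_arg.arity   # => 1
no_args.arity   # => 0
splat.arity     # => -1
one_splat.arity # => -2, one required argument plus optional ones
```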

This is the power of understanding how and when to use instance_exec; users of the DSL will expect it to make sense, and it will.

But wait! There's more!

What if you want to specify the same value for multiple attributes?

FactoryGirl.define do
      factory :user do
        first_name "Kristoff"
        last_name  "Bjorgman"

        password "12345"
        password_confirmation "12345"
      end
    end

In the above example, both the password and password_confirmation are set to the same value. This could be bad. What if you change the password for one, but forget to change the other? If they are inherently tied in their implementation, then that could lead to some unexpected behavior when they are not the same.

I would, and probably you would too, prefer to tell FactoryGirl to just use the value I'd already configured.

Fortunately FactoryGirl allows us to use a great trick in Ruby using the to_proc method. Here's what it looks like in use:

FactoryGirl.define do
      factory :user do
        first_name "Kristoff"
        last_name  "Bjorgman"

        password "12345"
        password_confirmation &:password
      end
    end

The important part is the &:password value provided to password_confirmation. Ruby will see the & character and treat what follows as a block by calling to_proc on it. To implement this feature, FactoryGirl defines to_proc on attributes and uses instance_exec there to provide the symbol password to the block:

def to_proc
      block = @block

      -> {
        value = case block.arity
                when 1, -1 then instance_exec(self, &block)
                else instance_exec(&block)
                end
        raise SequenceAbuseError if FactoryGirl::Sequence === value
        value
      }
    end

What about lambdas and procs?

Some commenters on Reddit raised an important question about how these methods behave when given lambdas and procs.

If you provide a lambda which accepts no arguments as the block, instance_eval will raise an error:

object = Object.new
    argless = ->{ puts "foo" }
    object.instance_eval(&argless) #=> ArgumentError: wrong number of arguments (1 for 0)

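This error occurs because instance_eval yields the current object to the provided block as an argument, in addition to making it self, and a lambda is strict about the number of arguments it receives. So you can fix it by providing a lambda which accepts an argument: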

args = ->(obj){ puts "foo" }
    object.instance_eval(&args) #=> "foo"

This changes a bit if you use instance_exec:

object.instance_exec(&argless) #=> "foo"
    object.instance_exec(&args) #=> ArgumentError: wrong number of arguments (0 for 1)
    object.instance_exec("some argument", &args) #=> "foo"

Because a proc is less restrictive with argument requirements, it will allow either approach to work without error:

p_argless = proc{ puts "foo" }
    object.instance_eval(&p_argless) #=> "foo"

    p_args = proc{|obj| puts "foo" }
    object.instance_eval(&p_args) #=> "foo"

    object.instance_exec(&p_args) #=> "foo"
    object.instance_exec(&p_argless) #=> "foo"

Now you know, instance_exec and instance_eval are similar in the way they behave, but you'll reach for instance_exec if you need to pass variables around.

Announcing Ruby Metaprogramming Masterclass

I'm offering a new online class where I'll be teaching you how to master metaprogramming in Ruby on April 30th (the day after my birthday!)

I'm keeping the spaces limited to 25 so attendees will be able to talk and ask questions but already over a quarter of the seats are gone. So grab a seat now, before they're all gone.

The 4 Rules of East-oriented Code: Rule 4

Jim Gay

April 7, 2015

Often the rules we create are defined by their exceptions.

It is difficult to create a program which continually passes objects and never returns data. Often the first rule of "Always return self" is met with immediate rejection because it's easy to see the difficulty you'd encounter if that rule is continually followed for every object.

In my presentation for RubyConf, I showed how we break the rules to allow value objects to handle data for a template. I previously wrote about the approach I used in the presentation to push data into a value object.

class Address
      def display(template)
        if protect_privacy?
          template.display_address(private_version)
        else
          template.display_address(public_version)
        end
        self
      end
    end

In the sample above, an Address instance commands a template to display_address with different versions of data: private_version or public_version. This makes a flexible interface that allows Address to create any number of different versions if necessary. Perhaps the requirements will demand a semi_public_version in the future; our design of the template need not change.

This is a great way to break the rules. Value objects allow us to parameterize a collection of data in a single object. The alternative to this approach would be to use setter methods on the template object:

class Address
      def display(template)
        unless protect_privacy?
          template.street = street
          template.apartment = apartment
          template.postal_code = postal_code
        end
        template.city = city
        template.province = province
        template.display_address
        self
      end
    end

We can plainly see that although the code follows the rules by commanding the template object, there's also quite a lot happening in this display method on Address. If the requirements change we might feel encouraged to complicate the unless block or "refactor" it into a case statement. While that might solve our problem, the resulting code could lead to some difficult to read and understand implementation details.

By breaking the rules with a value object we can better encapsulate the ideas in a private address object or public or any other type we desire.

But we're not just breaking the rules inside the Address methods; the template breaks the rules too. Rule 2 says that objects may query themselves and subsequently means they should not query other objects. But by choosing to break the rules we make a design decision at a specific location to make things better.
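Here is one way the value-object version could be filled in so that it runs. Everything except Address#display is an assumption made for the sketch: the PublicAddress and PrivateAddress structs, the Template class with its output reader, and the keyword arguments to Address.new are all hypothetical.

```ruby
# Hypothetical value objects holding the two versions of the data.
PublicAddress  = Struct.new(:street, :city, :province)
PrivateAddress = Struct.new(:city, :province)

class Template
  attr_reader :output

  def initialize
    @output = []
  end

  # Rule 2 is bent here: the template queries the value object it was given.
  def display_address(address)
    @output << address.to_a.join(', ')
    self
  end
end

class Address
  def initialize(street:, city:, province:, protect_privacy: false)
    @street = street
    @city = city
    @province = province
    @protect_privacy = protect_privacy
  end

  def display(template)
    if protect_privacy?
      template.display_address(private_version)
    else
      template.display_address(public_version)
    end
    self
  end

  private

  def protect_privacy?
    @protect_privacy
  end

  def private_version
    PrivateAddress.new(@city, @province)
  end

  def public_version
    PublicAddress.new(@street, @city, @province)
  end
end

template = Template.new
Address.new(street: '1 Main St', city: 'Springfield', province: 'ON', protect_privacy: true).display(template)
Address.new(street: '2 Elm St', city: 'Shelbyville', province: 'ON').display(template)
template.output # => ["Springfield, ON", "2 Elm St, Shelbyville, ON"]
```

The template never learns why one address has fewer fields; it only displays whatever version it is handed.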

No matter what rules you follow, you decide not only to follow them, but decide to break them as well. To make your program easy to understand and to create reasonable expectations, you can lean on creating barriers. Preventing yourself from doing one thing frees you to do another.

Embrace constraints.

How do you add constraints to your programs? What are you better able to do by adding restrictions?

The 4 Rules of East-oriented Code: Rule 3

Jim Gay

March 17, 2015

When I set out to create my presentation for RubyConf, I wanted to provide the audience with something they could easily try. By doing that, one could walk away and put themselves in a position to think about their code differently. While James Ladd, the creator of East-oriented Code, made some basic rules, I decided to take them and frame them in the specific context of Ruby:

Always return self
Objects may query themselves
Factories are exempt
Break the rules sparingly

After writing about Rule 1 and Rule 2 I'm very eager to get to Rule 3. It's an easy way to break the intent of this style without breaking the rules.

Factories are Exempt

They must be. If you returned self from Object.new you'd just get Object back, not an instance of an object. So factories are exempt from returning self.

The best way to get around any of these rules is to just make something into a factory. But here lies the danger. It's important to first think about what these objects are doing. For what are they responsible?

We could create a class to sweep our messy code under the rug.

user = User.new
    signup = UserSignupProcess.new
    signup.create_with(user)

The code above, we could say, is East-oriented. The factories create instances, and the signup object is told to create_with and given the user object.

Beyond this (inside create_with), it could easily be an enormous mess. While we can and probably should use different programming techniques for different situations, taking a load of if statements and sweeping it into a class could still be problematic.

Now, the sample code above is completely made up to show how you can take part of your program and say "this part is East-oriented, but over here I used this other technique. I call it If-oriented."

Examining your domain and creating a program to support it requires that you carefully evaluate what objects should exist, what their responsibilities are, and what you will name them.

East-orientation is all about designating responsibilities.

This leads us to breaking the rules...

We'll be getting to that later. There are likely very good reasons to break any particular programming rule, but it probably depends on the context.

I wrote Clean Ruby and the chapter on East-oriented Code before I set up the 4 rules for my presentation, but the same lessons are there. I'll be adding more to it, particularly as discussion and ideas around DCI evolve, but I'm putting effort toward wrapping up the Ruby DSL Handbook. It will soon be complete and the $12 price will go up to $24, so pick it up now if you're interested.

Ruby DSL Handbook is about how to create a DSL without headaches from metaprogramming and I just released an update with a chapter about creating a DSL without metaprogramming at all. Much like this discussion today, it's all about managing responsibilities.

The 4 Rules of East-oriented Code: Rule 2

Jim Gay

March 10, 2015

In a previous article I wrote about the first rule of East-oriented Code.

Here again are the rules I set forth in my presentation at RubyConf:

Always return self
Objects may query themselves
Factories are exempt
Break the rules sparingly

The second rule, that "Objects may query themselves", allows the design of objects to work with their own attributes.

When we design our systems of interacting objects we can use the Tell, Don't Ask approach to limit the decisions in the code to objects which are responsible for the data used to make them.

The Tell, Don't Ask article begins by quoting Alec Sharp:

Procedural code gets information then makes decisions. Object-oriented code tells objects to do things.

In order for objects to do things, they may need to ask questions about their own data. Though the first rule of East-oriented Code says that you should return self, internal private methods don't need to follow this rule. We can and might need to create query methods to allow the object to make decisions.

It's easy to begin designing an object by specifying what its attributes are:

class Person
      attr_reader :name, :nickname, :gender
    end

When we do that, we also implicitly allow other objects to use these attributes and make decisions:

if person.nickname =~ /^DJ/
      person.setup_playlist_preferences('house')
    else
      person.setup_playlist_preferences('classical')
    end

In the sample code above we're using a method setup_playlist_preferences which accepts a single argument. The decision about what value to set is made outside of the person object. As additional options are added to the system, this if may have elsif clauses added to it or it may turn into a case statement. With public attributes, those changes can appear in multiple places in your system, which can lead to headaches when the structures of your objects change.

Alternatively, we could command the object to do what we want:

person.setup_playlist_preferences

Any decision about how to set up preferences can be made inside the setup_playlist_preferences method.

Here's a summary from the C2 wiki:

Very very short summary: It is okay to use accessors to get the state of an object, as long as you don't use the result to make decisions outside the object. Any decisions based entirely upon the state of one object should be made 'inside' the object itself.

One way to prevent decisions about an object from being made outside that object is to limit the public information:

class Person
      private
      attr_reader :name, :nickname, :gender
    end
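Putting the pieces together, a Person that keeps both its attributes and the decision to itself might look like the sketch below. The constructor, the apply_playlist_preference_to command, and the Player struct are hypothetical additions used only to make the example runnable.

```ruby
# Hypothetical collaborator that receives the result of the decision.
Player = Struct.new(:genre)

class Person
  def initialize(name, nickname)
    @name = name
    @nickname = nickname
  end

  # Command: the decision lives next to the data that drives it.
  def setup_playlist_preferences
    @playlist_preference = nickname =~ /^DJ/ ? 'house' : 'classical'
    self
  end

  # Command: push the result to a collaborator instead of exposing a getter.
  def apply_playlist_preference_to(player)
    player.genre = @playlist_preference
    self
  end

  private

  attr_reader :name, :nickname
end

player = Player.new
person = Person.new('Jim', 'DJ Jim')
person.setup_playlist_preferences.apply_playlist_preference_to(player)

player.genre                  # => "house"
person.respond_to?(:nickname) # => false, outside code cannot query it
```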

If you like this, check out the new book I'm writing: Ruby DSL Handbook which is currently half-off the final price. It's designed to be a guide to help you make decisions about how to build your own DSL and compressions of concepts.

The 4 Rules of East-oriented Code: Rule 1

Jim Gay

February 10, 2015

4 simple rules are pretty easy to remember, but a bit harder to understand and apply.

A key concept of East-oriented Code is to enforce the use of commands by returning the object receiving a message.

Here's a simple example of what that looks like:

def do_something
      # logic for doing something omitted...
      self
    end

It's incredibly simple to follow.

Here are the rules I set forth in my presentation at RubyConf:

Always return self
Objects may query themselves
Factories are exempt
Break the rules sparingly

The first three are hard rules. The fourth, obviously, is more lenient. We'll get to some guidance on breaking the rules in the future but for now let's look at applying this to your code.

Rule 1: Always return self

Although this rule is simple at first, it inevitably leads to the question of getter methods.

What if your objects had no getters? What if an object's name attribute simply was inaccessible to an external object?

You can make your data private by either marking your attr_accessors as private:

attr_accessor :name
    private :name

Or you can use the private method to mark all of the following defined methods to be private:

private
    attr_accessor :name

How you choose to do it will depend upon your code, but this would help you remove any getter methods.

Now this leaves you with a conundrum. How do you use the information?

If you have a need for that name, what can you do?

The only answer is to create a command which will apply the data to the thing you need.

def apply_name_to(form)
      form.name = name
      self
    end
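Here is a small runnable sketch of that command in context. The Form struct and the Person constructor are hypothetical; only apply_name_to and the private accessor come from the text above.

```ruby
# Hypothetical form object that receives the name.
Form = Struct.new(:name)

class Person
  def initialize(name)
    @name = name
  end

  # Command: apply the private data to the thing that needs it.
  def apply_name_to(form)
    form.name = name
    self
  end

  private

  attr_accessor :name
end

person = Person.new('Jim')
form = Form.new

person.apply_name_to(form).equal?(person) # => true, the command returns self
form.name                                 # => "Jim"
person.respond_to?(:name)                 # => false, there is no public getter
```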

The restrictions we put in our code are often self-imposed.

We can make whatever we want, so what's to stop us from putting Rails model data manipulation in its view template? Nothing concrete stops us from doing so.

The same goes for getter methods like name. If it is publicly accessible by external objects, then we can create whatever if and case statements we want. We can put logic wherever we want.

If we create our own restrictions, we can guide ourselves and other programmers to the direction we intend for our application's structure.

Creating restrictions

I've written about the Forwardable library in the past not only because of its usefulness, but because we can copy the same pattern to create our own DSL.

Forwardable provides methods which create getter methods for related objects. But what if we created our own DSL for commands to related objects? What if we could pass the messages on, but allow the related object to handle the values?

Here's what that could look like:

class Person
      command :send_email => :emailer
    end
    person = Person.find(1) # get some record
    person.emailer = Emailer.get # get some object to handle the emailing
    person.send_email

That's a lot of pseudo-code but the parts we care about are sending the command to a related object. Commands return the receiving object, queries will return a value.

Here's what that code would look like without our (yet unimplemented) command DSL.

class Person
      def send_email
        emailer.send_email
        self
      end
    end

Any code which uses a Person will have to rely on the command to do its own thing. This prevents a programmer from leaking logic out of the person.

What should happen when the email is sent? With the structure above, this code can't make decisions:

if person.send_email
      # do one thing
    else
      # this will never work now
    end

If you find that you often write code like the if statement above, you might wonder "where does that logic go now?" Now, you'll be forced to write this code:

person.send_email

And this means that your send_email now has the job of handling what to do:

class Person
      def send_email
        emailer.send_email
        # do some other things...
        self
      end
    end

That might provide you with better cohesion; the related behaviors remain together.

Getting back to that command DSL we used above...

This was the final point of my presentation at RubyConf: you can build guidelines like this for yourself.

I created a gem called direction to handle enforcing this East-oriented approach. I'll write more about that later, but it shows that I can create signals to other developers on my team. I can take a simple concept like a command and simplify my code to show other developers what's happening:

class Person
      command :send_email => :emailer
    end
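For a sense of how little is needed, here is a minimal sketch of such a command macro. It is an illustration of the idea and not the source of the direction gem, whose implementation may differ; the Commands module and the Emailer class with its sent reader are hypothetical names.

```ruby
# Hypothetical module; an illustration of the idea, not the direction gem's code.
module Commands
  # command :send_email => :emailer defines send_email, which forwards the
  # message to emailer and returns the receiver instead of the result.
  def command(mapping)
    mapping.each do |message, collaborator|
      define_method(message) do |*args, &block|
        send(collaborator).public_send(message, *args, &block)
        self
      end
    end
  end
end

# Hypothetical collaborator that records whether it was used.
class Emailer
  attr_reader :sent

  def send_email
    @sent = true
    :delivery_receipt # a value that callers of Person never see
  end
end

class Person
  extend Commands
  attr_accessor :emailer
  command :send_email => :emailer
end

person = Person.new
person.emailer = Emailer.new

person.send_email.equal?(person) # => true, the command returns the receiver
person.emailer.sent              # => true, the message was forwarded
```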

Building a DSL can aid in communication. The language and terminology we use can compress ideas into easily digestible parts.

If you like this, check out my new book: Ruby DSL Handbook designed to be a guide to help you build your own compressions of concepts.
