One of the most harmful things about people discussing the performance of web applications is the key metric that we use. Requests per second seems like the obvious metric to use, but it has several insidious characteristics that poison the discourse and lead developers down ever deeper rabbit holes chasing irrelevant gains. The metric prevents us from doing A/B comparisons, or discussing potential improvements without doing some mental arithmetic which appears beyond the capabilities of most of us.
Instead of talking about requests per second, we should always be focussed on the duration of a given request. It’s what our users notice, and it’s the only thing which gives us a notice.
I should prefix the remaining discussion here by saying that most of it does not apply to discussing performance problems at the scale of facebook, google or yahoo. The thing is, statistically speaking, none of you are building applications that will operate at that scale. Sorry if I’m the one who broke this to you, but you’re not building the next google :).
I should also state that requests per second is a really interesting metric when considering the throughput of a whole cluster. But throughput isn’t performance.
Diminshing marginal returns
The biggest problem I have with requests per second is the fact that developers seem incapable of knowing when to stop optimising their applications. As the requests per second get higher and higher, the improvements become less and less relevant. This lets us think we’ve defeated that pareto guy, while we waste ever-larger amounts of our employers’ time.
Let’s take two different performance improvements and compare them using both duration and req/s.
| A | 120 req/s | 300 req/s | 180 req/s |
| B | 3000 req/s | 4000 req/s | 1000 req/s |
As you can see, when you use req/s as your metric, change B seems like a MUCH bigger saving. It improves performance by 1000 requests a second instead of that measly 180, give that guy a raise! But let’s see what happens when we switch to using durations:
| A | 8.33 ms | 3.33 ms | 5 ms |
| B | 0.33 ms | 0.25 ms | 0.08 ms |
You see that the actual changes in duration in B is vanishingly tiny. 8% of one millisecond! Odds are that that improvement will vanish into statistical noise when compared to the latency of your network, or your user’s internet connection.
But when we use requests per second, that 1000 is so big and enticing that developers will do almost anything to get it. If they used durations as their metric, they’d probably have spent that time implementing a neat new feature, or responding to customer feedback.
Deltas become meaningless
A special case of my first complaint is that with requests per second the deltas aren’t meaningful without knowing the start and the finish points. As I showed above, a 1000 req/s change could be a tiny change, but it could also be an amazing performance coup. Take this next example:
| 1 req/s | 1001 req/s | 1000 req/s |
When expressed as durations you can see that it made a huge difference
| 1000 ms | 0.99 ms | 999.01 ms |
So 1000 requests per second could either be irrelevant, or fantastic. Durations don’t have this problem at all. 0.02ms is obviously questionable, and 999.01 ms is an obvious improvement.
This problem most commonly expresses itself when people say “that changeset took 50 requests per second off my application”. Without the before and after numbers, we can’t tell if that’s a big deal, or if the guy needs to take a deep breath and get back to work.
The numbers don’t add up
Finally, requests per second don’t lend themselves nicely to arithmetic, and make developers make silly decisions. The most common case I see this is when comparing web servers to put in front of their rails applications. The reasoning goes something like this:
Nginx does over 9000 requests per second, and apache only does 6000 requests per second!! I’d better use nginx unless I want to pay a 3000 requests per second tax.
When people do this comparison they seem to believe that by switching to nginx from apache their application will go from 100 req/s to 3100 req/s. As always, durations tell us a different story.
| 6000 req/s | 9000 req/s | 3000 req/s |
| 0.16 ms | 0.11 ms | 0.05 ms |
So we can see that odds are you’ll only gain a 5% of a millisecond’s improvement when switching. Perhaps that improvement is worthwhile for your application, but is it worth the additional complexity?
Conclusion
Durations are a much more useful, and more honest, metric when comparing performance changes in your applications. Requests per second is too wide-spread for us to stop using it entirely, but please don’t use it when talking about performance of your web applications or libraries.
After more than 15 months since the last post, and constant questions from users, I’m finally ready to bring The Rails Way back from hibernation.
The challenge I had here was the amount of time involved. Review articles are incredibly time consuming, scouring an application for code to improve can take hours, making the changes takes time, and all of that is dependent on getting that perfect submission.
So while I intend to continue to do review pieces here (keep those submissions coming in) I’m going to extend the format here to include a few different kinds of posts. I’ll be doing some focussed introductory pieces which cover the best practices for a few tricky areas that I see experienced rails programmers getting wrong. I’ll also be doing a few ‘soapboxy’ pieces where I can address misinformation about Rails and Ruby or just advocate a particular piece of technology or code that I think is really cool.
One of my primary goals with this relaunch will be to post regularly, but I’m not going to try and stick to a schedule that takes the fun out of it for me. Some weeks might see multiple posts, and others will see none at all. I’m just hoping that you guys will enjoy most of them.
This refactoring is based on a topic Marcel and I covered at RailsConf Europe.
Before
123456789101112 | classExpense< ActiveRecord::Base belongs_to :payee protected# Nice and concise, but what happens as we add more rules# and how do we write test cases for the four different possible # validation states?defvalidate errors.add("Not enough funds") if payee.balance - amount > 0 errors.add("Charge is too great") if payee.account.maximum_allowable_charge > amountendend |
After
123456789101112131415161718192021222324252627282930313233343536 | classExpense< ActiveRecord::Base belongs_to :payee# Instead of one large validation method, break each individual# rule into methods, and declare them here. validate :ensure_balance_is_sufficient_to_cover_amount validate :amount_does_not_exceed_maximum_allowable_charge protected# These validation callbacks simply add error messages if a particular # condition is met. Each of them can be tested and understood on their own# without having to understand the entire body of the validate method.defensure_balance_is_sufficient_to_cover_amount errors.add("Not enough funds") if insufficient_funds?enddefamount_does_not_exceed_maximum_allowable_charge errors.add("Charge is too great") if exceeds_maximum_allowable_charge?end# By defining separate predicate methods we can test each of them individually# and new programmers can see the intent of the code, not just the implementation.# Instead of subtracting the amount from the balance and checking if the# value is greater than 0, change the implementation to mirror the intent.# There was a bug in the before code, this is more obvious.definsufficient_funds? amount > payee.balanceenddefexceeds_maximum_allowable_charge? payee.account.maximum_allowable_charge > amountendend |
While the refactored version may have more lines of code, but don’t let that scare you. It’s far more important for code to be human readable than incredibly concise.
Today I’m reviewing Joe Van Dyk’s monkeycharger application, which is a web-service for storing and charging credit cards. I loved looking at this app, because its only interface is a RESTful web service: there is no HTML involved. (If you’ve never written an app that only exposes a web-service UI, you ought to. It’s a blast.)
In general, Joe has done a fantastic job with keeping the controllers slim and moving logic to models. The only significant gripe I had with the application is that it is not ActiveResource compatible.
For those of you that are late to the party, ActiveResource is the newest addition to the Rails family. It lets you declare and consume web-services using an ActiveRecord-like interface…BUT. It is opinionated software, just like the rest of Rails, and makes certain assumptions about the web-services being consumed.
- The service must understand Rails-style REST URLs. (e.g. “POST /credit_cards.xml” to create a credit card, etc.)
- The service must respond with a single XML-serialized object (Rails-style).
- The service must make appropriate use of HTTP status codes (404 if the requested record cannot be found, 422 if any validations fail, etc.).
It’s really not much to ask, and working with ActiveResource (or “ares” as we affectively call it) is a real joy.
However, monkeycharger tends to do things like the following:
12345678910 | classAuthorizationsController< ApplicationControllerdefcreate@credit_card = Authorizer.prepare_credit_card_for_authorization(params) transaction_id = Authorizer::authorize!(:amount => params[:amount], :credit_card => @credit_card) response.headers['X-AuthorizationSuccess'] = true render :text => transaction_idrescueAuthorizationError => e render :text => e.messageendend |
Three things: the request is not representing an “authorization” object, the response is not XML, and errors are not employing HTTP status codes to indicate failure.
Fortunately, this is all really, really easy to fix. First, you need (for this specific example) an Authorization model (to encapsulate both the the XML serialization and the actual authorization).
1234567891011121314151617181920 | classAuthorization attr_reader :attributesdefinitialize(attributes)@attributes = attributesenddefcredit_card@credit_card ||= Authorizer.prepare_credit_card_for_authorization(attributes)enddefauthorize!@transaction_id = Authorizer.authorize!(:amount => attributes[:amount],:credit_card => credit_card)enddefto_xml { :transaction_id => @transaction_id }.to_xml(:root => "authorization")endend |
Then, we rework the AuthorizationsController to use the model:
12345678 | classAuthorizationsController< ApplicationControllerdefcreate authorization = Authorization.new(params[:authorization]) authorization.authorize! render :xml => authorization.to_xml, :status => :createdrescueAuthorizationError => e render :xml => "<errors><error>#{e.message}</error></errors>", :status => :unprocessable_entityend |
(Note the use of the “created” status, which is HTTP status code 201. Other verbs just use “ok”, status code 200, to indicate success. Also, with an error, we return an “unprocessable_entity” status, which is HTTP status code 422. ActiveResource will treat that as a failed validation.)
With that change, you could now use ActiveResource to authorize a credit card transaction:
123456789101112 | classAuthorization< ActiveResource::Baseself.site = "http://my.monkeycharger.site"endauth = Authorization.new(:amount => 15, :credit_card_id => 1234,:remote_key => remote_key_for_card)if auth.save puts "success: #{auth.transaction_id}"else puts "error: #{auth.errors.full_messages.to_sentence}"end |
It should be mentioned, too, that making an app ActiveResource-compatible does nothing to harm compatibility with non-ActiveResource clients. Everything is XML, both ways, with HTTP status codes being used to report whether a request succeeded or not. Win-win!
Obviously, real, working code trumps theoretical whiteboard sketches every time, and Joe is to be congratulated on what’s done. Even though ActiveResource-compatibility can buy you a lot, you should always evaluate whether you really need it and implement accordingly.
I’m going to take a slightly different tack here, and review some of the unit tests in rails itself. They show up two common anti patterns, spurious assertions and coupling your tests to the implementation.
Perhaps the biggest benefit of a suite of unit tests is that they can provide a safety net, preventing you from accidentally adding new bugs or introducing regressions of old bugs. With a large codebase, the unit tests can also help new developers understand your intent, though they’re no substitute for comments. However if you’re not careful with what gets included in your test cases, you can end up with a liability.
Be careful what you assert
Whenever you add an assertion to your test suite you’re sending a signal to future developers that the behaviour you’re asserting is both intentional and desired. Future developers who try to refactor your code will see a failing test, and either give up, or waste time trying to figure out if the assertion is ‘real’ or whether it was merely added because that’s what the code happened to do at present.
For an example, take the test_hatbm_attribute_access_and_respond_to from associations_test.rb , especially the assertions that the project responds to access_level= and joined_on=. Because of the current implementation of respond_do?, those assertions pass. But should they?
In reality while those values will get stored in the object, they’ll never be written back to the database. This is a surprising result for some developers, and removing those accessor methods would go a long way to helping avoid some frustrating moments.
Mock and Stub with care
Mock object frameworks like flexmock and mocha make it really easy to test how your code interacts with another system or a third party library. However you should make sure that the thing that you’re mocking doesn’t merely reflect the current implementation of a method. To take a case from rails, take a look at setup_for_named_route in routing_test.rb.
It takes the seemingly sensible approach of building a stubbed-out implementation of url_for instead of trying to build a full implementation into the test cases. The stubbed version of url_for simply returns the arguments it was passed, this makes it extremely easy to work with and to test.
The problem is not with stubbing out the method, but in the way it is used in all the named route test cases. Take test_named_route_with_nested_controller.
123456 | deftest_named_route_with_nested_controller rs.add_named_route :users, 'admin/user', :controller => '/admin/user', :action => 'index' x = setup_for_named_route.new assert_equal({:controller => '/admin/user', :action => 'index', :use_route => :users, :only_path => false}, x.send(:users_url))end |
The strange hash value you see in the assertion is the result of the named route method calling url_for, and returning that. The current implementation of the named route helpers does this, but what if you wanted to implement a new version of named routes which completely avoids the costly call to url_for? Every single named route test fails, even though applications which use those methods will work fine.
In this situation you have two options, you could make your tests depend on the full implementation of url_for. This would probably slow down your test cases, and require a lot more setup code, but because the return values are correct you’re not likely to impede future refactoring.
The other option is to use different stubs for every test case. Leaving you with something like this:
1234567 | deftest_named_route_with_nested_controller rs.add_named_route :users, 'admin/user', :controller => '/admin/user', :action => 'index' generated_url = "http://test.named.routes/admin/user" x = setup_for_named_route.new x.stubs(:url_for).returns(generated_url) assert_equal(generated_url, x.send(:users_url))end |
Doing this for each and every test case is going to be quite time consuming and make your test cases extremely verbose. As with all things in software you’ll have to make a judgement call on this trade off and make a choice between coupling or verbosity.
Whatever approach you choose, just remember that misleading test ‘failures’ can slow down refactoring, and end up reducing your ability to respond to change. As satisfying as ‘100% coverage’ or 2:1 ratios may be, don’t blindly add assertions or mock objects just to satisfy a tool. Every line in your test cases should be there for a reason, and should be placed there with just as much care as you’d use for a line of application code.


