Ruby Background Tasks with Starling - Part 2
In my previous post, I briefly described how simple it is to add background processing to your RoR app with Starling and the Workling plugin. Now, let’s discuss the changes I made to the Workling plugin, and then get everything deployed and monitored.
As good as the Workling plugin is, it had one limitation that hurt me. At Inquisix, I have 5 workers to handle importing contacts, stats/logging, emails, searching, etc. Most methods are pretty quick, but a large import process could take 5-10 minutes to complete. With the original Workling, that meant all my other background tasks would wait, and I didn’t want that. The easiest solution for me was to modify Workling so each worker polled its queues in its own thread. That way, if the contact worker was busy importing 1000 contacts, my emails still got delivered in a reasonable amount of time.
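To make the idea concrete, here is a minimal sketch of one polling thread per worker. It is not the actual Workling patch: in-memory Queue objects stand in for Starling, and the names poll, run_pollers, the :stop marker, and the :cost field are all made up for illustration.

```ruby
require "thread"

# Poll one queue until a :stop marker arrives, recording finished work.
def poll(name, queue, results, sleep_time)
  loop do
    message = begin
      queue.pop(true)        # non-blocking pop, standing in for a Starling get
    rescue ThreadError
      nil                    # the queue was empty
    end

    if message.nil?
      sleep(sleep_time)      # nothing to do, wait before polling again
    elsif message == :stop
      break
    else
      sleep(message[:cost])  # simulate the time the job takes
      results << [name, message[:id]]
    end
  end
end

# Start one polling thread per worker queue and wait for all of them.
# Returns [worker_name, message_id] pairs in completion order.
def run_pollers(queues, sleep_time = 0.01)
  results = Queue.new
  threads = queues.map do |name, queue|
    Thread.new { poll(name, queue, results, sleep_time) }
  end
  threads.each(&:join)

  finished = []
  finished << results.pop until results.empty?
  finished
end

queues = { "contacts" => Queue.new, "emails" => Queue.new }
queues["contacts"] << { :id => "import-1", :cost => 0.3 }  # slow import
queues["emails"]   << { :id => "email-1",  :cost => 0.0 }  # quick email
queues.each_value { |q| q << :stop }

p run_pollers(queues)
```

Because each queue has its own thread, the quick email job finishes while the slow import is still running, instead of waiting behind it.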
Most of the changes to Workling are limited to the lib/workling/starling/poller.rb file. Here is a summary of the changes I made:
- Added threading
- Updated the way ClassAndMethodRouting builds the routing hash. I needed a routing hash per worker. Note: whatever you do, never call routing.build from inside another thread. Really strange things happen, and they were very difficult to track down. Ruby threads are nice, but I’m used to real threads.
- Added handling for MemCache exceptions. During development, I went to dinner and left the listener running without Starling. When I got back, my Mac was complaining the disk was full because my log had grown past 40 GB. Now, I explicitly catch MemCache exceptions and wait 30 seconds before trying again. The log will still grow, but not nearly as fast. Besides, I will show you how to make sure Starling stays running.
- Keep your database connection alive. I found this out after leaving for the night. I thought I had a working system when I left, but nothing worked in the morning. Basically, the database connection gets dropped if it sits idle for a while. Normally, this is not a problem in web apps because the connection is used every time you access a page, but it doesn’t work that way here. You have to check it yourself. Add this in your loop to be sure:
unless ActiveRecord::Base.connection.active?
  unless ActiveRecord::Base.connection.reconnect!
    RAILS_DEFAULT_LOGGER.fatal("FAILED - Database not available")
    break
  end
end
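The MemCache exception handling from the list above can be sketched roughly like this. This is an illustration, not the code from the patch: QueueUnavailable is a hypothetical stand-in for whatever exception your memcache client raises, fetch_with_retry is a made-up helper, and max_attempts exists only so the example terminates (a real listener would keep retrying).

```ruby
# Hypothetical stand-in for the exception a memcache client raises
# when the Starling server cannot be reached.
class QueueUnavailable < StandardError; end

# Run the block that fetches a message. If the queue server is down,
# log one line, wait retry_delay seconds, and try again. Gives up and
# returns nil after max_attempts failed attempts.
def fetch_with_retry(retry_delay, max_attempts, log = [])
  attempts = 0
  begin
    attempts += 1
    yield
  rescue QueueUnavailable => e
    log << "queue unavailable (#{e.message}), retrying in #{retry_delay}s"
    return nil if attempts >= max_attempts
    sleep(retry_delay)   # the post waits 30 seconds here
    retry
  end
end

# Usage: the fake fetch fails twice, then succeeds.
calls = 0
message = fetch_with_retry(0.01, 5) do
  calls += 1
  raise QueueUnavailable, "connection refused" if calls < 3
  "message-1"
end
p message
```

The point of the delay is that a dead server produces one log line per retry interval rather than one per loop iteration.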
Here is my workling patch. Now, I need to go back and update the tests, but the threads complicate that. It’s a learning experience…
That’s it for now. Next time, I will walk through the deployment and monitoring process as I have it.
March 30th, 2008 at 8:52 am
One other change I forgot to mention. In ClassAndMethodRouting, I changed the class/method split character from ‘:’ to ‘__’. The reason is that the MemCacheClient#stats method uses ‘:’ when it builds the stats. If ‘:’ is in the queue name, it gets confused and you get no information about your queues.
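A minimal sketch of that naming scheme, assuming a queue name is simply the worker name and the method name joined by the separator. The helpers queue_for and route are hypothetical, not the real ClassAndMethodRouting code.

```ruby
# Separator between the worker name and the method name in a queue name.
SEPARATOR = "__"

# Build the queue name for a worker/method pair.
def queue_for(worker_name, method_name)
  [worker_name, method_name].join(SEPARATOR)
end

# Recover the worker and method names from a queue name.
# Splits on the first separator only.
def route(queue_name)
  queue_name.split(SEPARATOR, 2)
end

name = queue_for("contact_workers", "import")
p name          # contains no ':' for a stats parser to trip over
p route(name)
```

The trade-off is that worker names must not contain a double underscore themselves, or the split lands in the wrong place.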
April 1st, 2008 at 9:29 am
hey dave, thanks for the great writeups! i’m glad to see workling is doing it for you :). i’ll look at getting the stuff you did into workling asap (as well as moving it to github). i’ll probably have time for this on the 08.04 - we’re working on the launch of boomloop.com so i’m flat out atm!
April 29th, 2008 at 9:38 pm
I made a couple of minor changes to my Workling patch. First, if there is an exception, I reset the memcache connection rather than creating a new Workling::Starling::Client. This fits better with how memcache clients should work. Second, I added a call to Thread.pass in Poller#dispatch! after each message is processed. This ensures that another thread gets a chance to run after every message. Probably not necessary, but it made me feel better.
On another note, my system has processed several million messages so far using this scheme without any issues. Unfortunately, I can’t give you an exact number because I booted my server for other reasons (I usually boot after system updates - old Windows habit).
June 10th, 2008 at 10:28 pm
Updated the patch again to be a ZIP instead of an svn patch. Some people were having problems with that.
June 18th, 2008 at 1:17 am
Any idea why my worker functions would be called in a loop?
I have exactly one call:
BgWorker.asynch_set_enable(:ref => 1)
in my bg_worker.rb file I have
class BgWorker <1, :uid=>”bg_workers:set_enable:61bb317b03aaf804bdbc21ce635a779e”})
called in a loop. I don’t understand why. This worked before, and now runs in a loop.
Thanks
June 18th, 2008 at 2:16 am
aaaahhhh… Starling was running on the wrong port !!!
June 18th, 2008 at 9:06 am
I’m curious how the wrong port caused a loop.
June 18th, 2008 at 11:48 am
As an FYI, the original Workling author has integrated my changes as well as several other improvements. I will be switching back to his version very soon. Because everything is on git now, it’s easier to suggest new additions so everyone can get the benefit.
The author’s site: http://playtype.net/past/2008/2/6/starling_and_asynchrous_tasks_in_ruby_on_rails/
The github for the sources: git://github.com/purzelrakete/workling.git
June 27th, 2008 at 11:38 pm
Workling seems to poll memcache for starling messages every two seconds. This means a delay before a worker starts on a job.
Is this sleep between polls necessary?
Is there anything else I can do to minimize the delay in jobs starting?
Thanks Andy
June 28th, 2008 at 12:01 am
default poll is 1 second. Pretty fast.
Adding sleep_time to config/starling.yml can make it faster.
Making it faster drives the cpu load up.
development:
  listens_on: localhost:22122
  sleep_time: 0.001
June 28th, 2008 at 7:45 am
Balancing sleep_time with CPU is pretty much your only option. Since Starling is a really simple queuing system, it only supports polling. We could probably come up with a signaling scheme to prevent the polling and make jobs start immediately, but that is not the case that starling/workling were meant to solve.
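To get a feel for the trade-off, here is a small sketch that counts how often an idle poller checks an empty queue for a few sleep_time values. It uses an in-memory Queue rather than Starling, and idle_polls is a made-up helper, so treat the numbers as relative only.

```ruby
require "thread"

# Count how many times an idle poller checks an empty queue
# during `duration` seconds, sleeping `sleep_time` between checks.
def idle_polls(sleep_time, duration)
  queue = Queue.new
  polls = 0
  deadline = Time.now + duration

  while Time.now < deadline
    polls += 1
    begin
      queue.pop(true)      # non-blocking check, raises when empty
    rescue ThreadError
      sleep(sleep_time)    # nothing there, back off before the next check
    end
  end

  polls
end

[0.1, 0.01, 0.001].each do |sleep_time|
  puts "sleep_time #{sleep_time}: #{idle_polls(sleep_time, 0.5)} polls in 0.5s"
end
```

A shorter sleep_time means a new job is noticed sooner (on average after about half the interval), but every one of those extra polls is work the CPU and the queue server do even when nothing is waiting.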
July 23rd, 2008 at 7:52 pm
I’m wondering if threading is the best direction to go for this. I’ve been looking around, and it seems like rails/activerecord threading generally meant a world of hurt (is this view outdated?).
July 23rd, 2008 at 8:35 pm
I’ve had it running for months without a single thread-related issue. The trick is to make sure you have:
ActiveRecord::Base.allow_concurrency = true
when you start up your thread. To be complete, you should also have:
ActiveRecord::Base.verify_active_connections!
at the end of your thread.
Don’t ever try to have threads in your web application, but I haven’t seen problems with daemons.