Ruby Background Tasks with Starling – Part 2
EDIT: The latest workling on github has all of the changes described here. There is no need to download my patch anymore. See github.
In my previous post, I briefly described how simple it is to add background processing to your RoR app with Starling and the Workling plugin. Now, let’s discuss the changes I made to the Workling plugin, and then get everything deployed and monitored. As good as the Workling plugin is, it had one limitation that hurt me. At Inquisix, I have 5 workers to handle importing contacts, stats/logging, emails, searching, etc. Most methods are pretty quick, but a large import process could take 5-10 minutes to complete. With the original Workling, that meant all my other background tasks would wait, and I didn’t want that. The easiest solution for me was to modify Workling so each worker polled its queues in a thread. That way, if the contact worker was busy importing 1000 contacts, my emails still got delivered in a reasonable amount of time.
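The idea of one polling thread per worker can be sketched in plain Ruby. This is only an illustration, not the code from the patch: the in-memory `Queue` objects stand in for Starling queues, the `:stop` sentinel and the `start_pollers` helper are made up for the demo, and a real poller would fetch messages through a memcache client rather than a blocking `pop`.

```ruby
require "thread"

# One polling thread per worker, so a slow job only blocks its own queue.
# `queues` maps a (hypothetical) worker name to its in-memory queue.
def start_pollers(queues, results)
  queues.map do |worker, queue|
    Thread.new do
      loop do
        job = queue.pop               # blocks until a job arrives
        break if job == :stop         # sentinel used only to end this demo
        sleep(job[:seconds])          # simulate the work
        results << "#{worker}:#{job[:name]}"
      end
    end
  end
end

queues  = { "contacts" => Queue.new, "emails" => Queue.new }
results = Queue.new                   # thread-safe record of completion order

queues["contacts"] << { name: "import",  seconds: 0.3 }  # slow job
queues["emails"]   << { name: "welcome", seconds: 0.0 }  # quick job
queues.each_value { |q| q << :stop }

start_pollers(queues, results).each(&:join)

finished = []
finished << results.pop until results.empty?
puts finished.inspect
```

In this sketch the email job completes while the import is still sleeping, which is the behavior the threading change is meant to provide.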
Most of the changes to Workling are limited to the lib/workling/starling/poller.rb file. Here is a summary of the changes I made:
- Added threading
- Updated the way ClassAndMethodRouting builds the routing hash. I needed a routing hash per worker. Note: whatever you do, don’t ever try to call routing.build from inside another thread. Really strange things happen that are very difficult to track down. Ruby threads are nice, but I’m used to real threads.
- Added handling for MemCache exceptions. During development, I went to dinner and left the listener running without Starling. When I got back, my Mac was complaining the disk was full because my log had grown to more than 40 GB. Now, I explicitly catch MemCache exceptions and wait 30 seconds before trying again. The log will still grow, but at least not as fast. Besides, I will show you how to make sure Starling stays running.
- Keep your database connection alive. I found this out after leaving for the night. I thought I had a working system when I left, but nothing worked in the morning. Basically, ActiveRecord will drop your connection if you don’t do anything for a while. Normally, this is not a problem in web apps because the connection is updated every time you access a page, but it doesn’t work that way here. You have to do it yourself. Add this in your loop to be sure:
  unless ActiveRecord::Base.connection.active?
    unless ActiveRecord::Base.connection.reconnect!
      RAILS_DEFAULT_LOGGER.fatal("FAILED - Database not available")
      break
    end
  end
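The "catch the exception and wait before retrying" change from the list above can be sketched roughly like this. Everything here is a stand-in: `QueueUnavailable` and `FlakyClient` are hypothetical names invented for the demo, not classes from Workling or the memcache client, and the delay is a fraction of a second instead of the 30 seconds mentioned above so the example finishes quickly.

```ruby
# Hypothetical stand-in for the error a memcache client raises when the
# queue server cannot be reached.
class QueueUnavailable < StandardError; end

# Hypothetical client that fails a fixed number of times before answering.
class FlakyClient
  def initialize(failures)
    @failures = failures
  end

  def get(_key)
    if @failures > 0
      @failures -= 1
      raise QueueUnavailable, "connection refused"
    end
    "message"
  end
end

# Fetch one message; on a connection error, log a single line and wait
# before retrying instead of spinning and flooding the log.
def fetch_with_backoff(client, key, retry_delay:, log:)
  begin
    client.get(key)
  rescue QueueUnavailable => e
    log << "queue unavailable (#{e.message}), retrying in #{retry_delay}s"
    sleep(retry_delay)
    retry
  end
end

log = []
message = fetch_with_backoff(FlakyClient.new(2), "contacts", retry_delay: 0.01, log: log)
puts message
puts log.length
```

With a pause between attempts, the log gains one line per retry interval rather than one line per loop iteration.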
Here is my workling patch. Now, I need to go back and update the tests, but the threads complicate that. It’s a learning experience…
That’s it for now. Next time, I will walk through the deployment and monitoring process as I have it.

30/03/2008 at 8:52 am Permalink
One other change I forgot to mention. In ClassAndMethodRouting, I changed the class/method split character from ‘:’ to ‘__’. The reason is that the MemCacheClient#stats method uses ‘:’ when it builds the stats. If ‘:’ is in the queue name, it gets confused and you get no information about your queues.
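As a rough illustration of the naming change, here is a sketch that builds and splits a queue name using ‘__’ as the separator. The helpers `queue_for` and `parse_queue` are hypothetical, and the underscore and plural handling is a simplified stand-in for what Rails does, not the plugin's actual routing code.

```ruby
# Build a queue name from a worker class name and a method name.
# The regex-based conversion is a simplified stand-in for Rails' String#underscore,
# and appending "s" is a naive stand-in for pluralization.
def queue_for(class_name, method_name, separator = "__")
  underscored = class_name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  "#{underscored}s#{separator}#{method_name}"
end

# Split a queue name back into its worker and method parts.
def parse_queue(queue_name, separator = "__")
  worker, method_name = queue_name.split(separator, 2)
  { worker: worker, method: method_name }
end

name   = queue_for("BgWorker", "set_enable")
parsed = parse_queue(name)
puts name
puts parsed.inspect
```

Because the resulting name contains no ‘:’, it cannot collide with a ‘:’ used elsewhere as a delimiter.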
01/04/2008 at 9:29 am Permalink
hey dave, thanks for the great writeups! i’m glad to see workling is doing it for you. i’ll look at getting the stuff you did into workling asap (as well as moving it to github). i’ll probably have time for this on the 08.04 – we’re working on the launch of boomloop.com so i’m flat out atm!
29/04/2008 at 9:38 pm Permalink
I made a couple of minor changes to my Workling patch. First, if there is an exception, I reset the memcache connection rather than creating a new Workling::Starling::Client. This fits better with how memcache clients should work. Second, I added a call to Thread.pass in Poller#dispatch! after each message is processed. This ensures that another thread gets a chance to run after each message. Probably not necessary, but it made me feel better.
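A minimal sketch of the Thread.pass idea, assuming in-memory queues in place of Starling. The `dispatch!` below is a standalone illustration that borrows the name only; it is not the method from the plugin.

```ruby
require "thread"

# Drain one worker's queue, yielding to the scheduler after each message.
def dispatch!(queue, handled, worker)
  until queue.empty?
    message = queue.pop(true)      # non-blocking; safe because each queue has one consumer
    handled << [worker, message]
    Thread.pass                    # hint to the scheduler that other threads may run now
  end
end

queues = { "contacts" => Queue.new, "emails" => Queue.new }
queues.each_value { |q| 3.times { |i| q << i } }

handled = Queue.new                # thread-safe record of processed messages
queues.map { |worker, q| Thread.new { dispatch!(q, handled, worker) } }.each(&:join)

processed = []
processed << handled.pop until handled.empty?
puts processed.length
```

Thread.pass is only a hint, so the interleaving between workers is not guaranteed. What the sketch does guarantee is that every message is handled and that each worker still sees its own messages in order.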
On another note, my system has processed several million messages so far using this scheme without any issues. Unfortunately, I can’t give you an exact number because I booted my server for other reasons (I usually boot after system updates – old Windows habit).
10/06/2008 at 10:28 pm Permalink
Updated the patch again to be a ZIP instead of an svn patch. Some people were having problems with that.
18/06/2008 at 1:17 am Permalink
Any idea why my worker functions would be called in a loop?
I have exactly one call:
BgWorker.asynch_set_enable(:ref => 1)
in my bg_worker.rb file I have
class BgWorker <1, :uid=>”bg_workers:set_enable:61bb317b03aaf804bdbc21ce635a779e”})
called in a loop. I don’t understand why. This worked before, and now runs in a loop.
Thanks
18/06/2008 at 2:16 am Permalink
aaaahhhh… Starling was running on the wrong port !!!
18/06/2008 at 9:06 am Permalink
I’m curious how the wrong port caused a loop.
18/06/2008 at 11:48 am Permalink
As an FYI, the original Workling author has integrated my changes as well as several other improvements. I will be switching back to his version very soon. Because everything is on git now, it’s easier to suggest new additions so everyone can get the benefit.
The author’s site: http://playtype.net/past/2008/2/6/starling_and_asynchrous_tasks_in_ruby_on_rails/
The github for the sources: git://github.com/purzelrakete/workling.git
27/06/2008 at 11:38 pm Permalink
Workling seems to poll memcache for starling messages every two seconds. This means a delay before a worker starts on a job.
Is this sleep between polls necessary?
Is there anything else I can do to minimize the delay in jobs starting?
Thanks Andy
28/06/2008 at 12:01 am Permalink
default poll is 1 second. Pretty fast.
Adding sleep_time to config/starling.yml can make it faster.
Making it faster drives the cpu load up.
development:
  listens_on: localhost:22122
  sleep_time: 0.001
28/06/2008 at 7:45 am Permalink
Balancing sleep_time with CPU is pretty much your only option. Since Starling is a really simple queuing system, it only supports polling. We could probably come up with a signaling scheme to prevent the polling and make jobs start immediately, but that is not the case that starling/workling were meant to solve.
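The trade-off can be made concrete with a small sketch that counts how many polls a loop performs in a fixed window for two different sleep times. The `count_polls` helper is hypothetical and the increment merely stands in for one request to the queue server, so the numbers only show the trend, not real CPU load.

```ruby
# Count how many polls fit into `duration` seconds when the loop sleeps
# `sleep_time` seconds between polls.
def count_polls(sleep_time, duration)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + duration
  polls = 0
  while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
    polls += 1          # stands in for one `get` against the queue server
    sleep(sleep_time)
  end
  polls
end

slow = count_polls(0.05, 0.2)    # a job waits up to roughly 50 ms before pickup
fast = count_polls(0.001, 0.2)   # a job waits up to roughly 1 ms before pickup
puts "sleep 0.05s:  #{slow} polls"
puts "sleep 0.001s: #{fast} polls"
```

A shorter sleep reduces the worst-case wait before a job starts, at the cost of many more requests that usually come back empty.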
23/07/2008 at 7:52 pm Permalink
I’m wondering if threading is the best direction to go for this. I’ve been looking around, and it seems like rails/activerecord threading generally meant a world of hurt (is this view outdated?).
23/07/2008 at 8:35 pm Permalink
I’ve had it running for months without a single thread-related issue. The trick is to make sure you have:
ActiveRecord::Base.allow_concurrency = true
when you start up your thread. To be complete, you should also have:
ActiveRecord::Base.verify_active_connections!
at the end of your thread.
Don’t ever try to have threads in your web application, but I haven’t seen problems with daemons.
10/02/2009 at 2:37 pm Permalink
I can’t find any documentation on the starling.yml file? I only have a workling.yml. What options are available? How do I start starling and reference the .yml?
Thanks!
10/02/2009 at 3:45 pm Permalink
I always use the command line rather than a YML file because I didn’t see any doc either. However, I looked through the code and it looks like the format would be:
Then, run with starling --config starling.yml
I haven’t tried this, so YMMV. Check out load_config_file and parse_options in starling-0.9.8/lib/starling/server_runner.rb. It looks like load_config_file simply maps yml names to symbols, but it maps queue_path –> path and log_file –> logger.
Let me know if this works.
11/02/2009 at 8:12 pm Permalink
Dave,
I am running into a really weird situation. Starling process is not running at all, but workling task seems to be running in endless loop..any thoughts? In workling.yml, we specified starling server as localhost.
Thanks!
11/02/2009 at 10:35 pm Permalink
Very interesting. I would have expected that without Starling running, Workling throws an error and stops. Was Starling running when you started Workling, or did Starling stop while Workling was running? In the latter case, Workling will keep trying to talk to Starling, but it will pause for a while between tries.
11/02/2009 at 10:51 pm Permalink
Ya, we never had starling running at any point. The thing is, we don’t have much control over the server and are not sure how it was set up. But I don’t think starling was ever started there; there is no folder where the task files are stored, like /var/spool/starling. Anyway, Ram had mentioned in the comments above that he had similar issues, but he had starling running on the wrong port; in our case we don’t have it running at all. I tried to reproduce it locally, and workling appropriately complains that it can’t find starling on the specified port. Anyway, we will try to debug more, thanks.
11/02/2009 at 10:54 pm Permalink
I know there is code in there to quit on startup if Starling isn’t running. I just tried it locally, and Workling quit immediately when Starling was not running. Are you running the latest version from Github?
12/02/2009 at 5:43 pm Permalink
We just had the latest github version about 2 months ago, since then haven’t updated. To me it seems like, there was no starling, no spawn, bjrunner running, so looking at the plugin code, it invoked notremoterunner section of the code .. still doesn’t explain why it went in a loop. But now we started the starling and it’s working fine.
12/02/2009 at 7:07 pm Permalink
Very interesting. The behavior I see in that case is for workling to write something in the log and exit immediately.
23/03/2009 at 8:27 am Permalink
@Dave:
If I’ve understood your posts correctly, you indicated the latest version of workling now includes a facility to create concurrent threads in the case of long running jobs. Is there something special I need to do to get that working? My workling runs fine but still only serially processes jobs from the same worker. For example, I have a report_queue_worker which routinely spins off long-running jobs. All other jobs seem to still wait until the last one finishes.
23/03/2009 at 9:27 am Permalink
Workers are still single-threaded. The concurrency level is only at the worker level. Essentially, each worker class gets its own thread.