Ruby: Rate limiting concurrent downloads

Yesterday an interesting question was posed on Stack Overflow: how do you ensure your script doesn't scrape a website or API too fast when making concurrent requests? Like so many interesting questions, this one was deemed not a real question and closed by moderators, so today I'll share my thoughts on the subject here. Let's avoid the complication of using EventMachine for this one, which, I could argue, creates as many problems as it solves.

First we're going to set up a queue and some variables. We'll use open-uri for the downloads to make it easy:
require 'open-uri'

queue = [] # URLs to download go here

num_threads = 3 # more is better, memory permitting
delay_per_request = 1 # in seconds

Next we create our threads and give them something to do. In a real script you'll need them to do something interesting, but for this purpose they will just print out the URL and the response body size:
threads = []

num_threads.times do
  threads << Thread.new do
    Thread.exit unless url = queue.pop
    puts "#{url} is #{URI.open(url).read.length} bytes long"
  end
end
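
To see what happens when there are more threads than URLs left, here is a minimal sketch of a single batch with the download swapped out for a stub. The run_batch helper and the sample strings are made up for illustration, and it assumes MRI, where popping from a plain array in several threads is safe enough for this purpose:

```ruby
# Hypothetical helper: runs one batch of worker threads against a shared array.
def run_batch(queue, num_threads)
  handled = Queue.new # thread-safe collector for whatever each worker picked up
  threads = Array.new(num_threads) do
    Thread.new do
      Thread.exit unless item = queue.pop # pop returns nil once the array is empty
      handled << item
    end
  end
  threads.each(&:join)
  Array.new(handled.size) { handled.pop }
end

queue = %w[first second]
p run_batch(queue, 3) # three workers, two items: the third worker exits straight away
p queue               # nothing is left in the queue
```

The worker that finds the queue empty simply exits, so a short final batch is not a problem.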

Now that we have our threads we want to 'join' them. We also want to time them to see how long they took:
start = Time.now
threads.each{|t| t.join}
elapsed = Time.now - start

If they finished too quickly we need to take a short nap; otherwise we're free to continue processing the queue:
time_to_sleep = num_threads * delay_per_request - elapsed
if time_to_sleep > 0
  puts "sleeping for #{time_to_sleep} seconds"
  sleep time_to_sleep
end
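
The arithmetic is easier to check in isolation. As a rough sketch, here it is pulled out into a method (the method itself is hypothetical; the argument names match the variables used in this post):

```ruby
# Hypothetical helper: how long to pause after a batch of num_threads requests.
def time_to_sleep(num_threads, delay_per_request, elapsed)
  remaining = num_threads * delay_per_request - elapsed
  remaining > 0 ? remaining : 0
end

puts time_to_sleep(3, 1, 0.5) # the batch was quick, so pause for the remainder
puts time_to_sleep(3, 1, 4.0) # the batch already took longer than 3 seconds, so no pause
```

With 3 threads and a 1 second delay, every batch is padded out to at least 3 seconds, which averages out to one request per second.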

Ok, so now it's time to put it all together and process the queue in a loop.
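require 'open-uri'

queue = [] # URLs to download go here

num_threads = 3 # more is better, memory permitting
delay_per_request = 1 # in seconds

until queue.empty?
  threads = []

  # start a batch of workers; each one takes a single URL off the queue
  num_threads.times do
    threads << Thread.new do
      Thread.exit unless url = queue.pop
      puts "#{url} is #{URI.open(url).read.length} bytes long"
    end
  end

  # wait for the whole batch and time it
  start = Time.now
  threads.each{|t| t.join}
  elapsed = Time.now - start

  # pad the batch out to num_threads * delay_per_request seconds
  time_to_sleep = num_threads * delay_per_request - elapsed
  if time_to_sleep > 0
    puts "sleeping for #{time_to_sleep} seconds"
    sleep time_to_sleep
  end
end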
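
If you want to reuse this, one possible variation is to wrap the loop in a method that takes the per-item work as a block. This is only a sketch: process_rate_limited and its keyword arguments are names I made up, and the example call measures string lengths instead of downloading anything so that it runs without a network:

```ruby
# Hypothetical helper: processes items in batches of num_threads and makes sure
# each batch takes at least num_threads * delay_per_request seconds.
def process_rate_limited(items, num_threads:, delay_per_request:)
  queue = items.dup
  results = []

  until queue.empty?
    batch = queue.pop(num_threads) # take up to num_threads items off the end
    start = Time.now
    threads = batch.map { |item| Thread.new { yield item } }
    results.concat(threads.map(&:value)) # value joins the thread and returns the block's result
    elapsed = Time.now - start

    time_to_sleep = num_threads * delay_per_request - elapsed
    sleep time_to_sleep if time_to_sleep > 0
  end

  results
end

# Stub work in place of a download: two batches of two, at least 0.1 seconds each.
lengths = process_rate_limited(%w[a bb ccc dddd], num_threads: 2, delay_per_request: 0.05) do |s|
  s.length
end
p lengths
```

For real downloads the block would be something like `{ |url| URI.open(url).read.length }`, with open-uri required as before.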

If you found this useful, let me know.

