Ruby, Cron and Instagram

Instagram is an effective visual medium to help you keep your marketing effort in high gear. But you need a way to tie it to all your other online social activities so that your followers can get a centralized view. For example, you may want to link your latest four Instagram postings to your blog by offering a picture and a link for each entry on your blog’s landing page.

This article describes how you can use Instagram’s API to capture your latest activity and periodically (e.g., every 15 minutes) update your website. A Ruby script does the hard work, and a Linux cron job runs it on a schedule.

Before you can access Instagram’s API, you will need to register your application and get two critical pieces of information – client_id and client_secret – from Instagram. This is the easy part. Just head over to the Instagram API site at http://instagram.com/developer/ and register your application.

Be sure to pay attention to the API terms of use. One of the terms says, “Do not abuse the API. Too many requests too quickly will get your access turned off.” This gave me some concern and colored how I implemented this solution. The way I read it, you don’t want to hit the Instagram API every time someone lands on your site, or you’ll run the risk of falling out of favor and losing your API access privileges. One way to conform to this rule is to simply have the website server access the API once every fifteen minutes, and then cache the results. When someone lands on your site, you simply provide the cached information. This keeps the load on Instagram to four accesses per hour, which, by my definition, is pretty reasonable.

So here’s the scheme: You have the server access the API and request the most recent four postings on the target account. In short, you’re simply asking for the URL of each JPG file and a link to its associated Instagram post. Simple, yes?

Indeed it is! But there are a few challenges. One such challenge for me was to figure out the ID of the client account. The ID is a simple 9-digit number that uniquely identifies the account to the API. My first instinct was to contact my client and ask. My second instinct was to find it myself. The latter is probably more efficient, as most users would not necessarily have that information readily available. It turns out that you only need to open the Instagram page (e.g., www.instagram.com/my_client_page) in a browser, right-click the page, and view the page source. Then you just have to look for something that looks like this:

"username":"my_client_page","profile_picture":"http:\/\/images.ak.instagram.com\/profiles\/profile_123456789_75sq_nnnnnnnnnn.jpg","id":"123456789","full_name":"My client"}

The ID number, in this case 123456789, appears as the value of the “id” key in the JSON and also as part of the profile picture file name.
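If you would rather not scan the page source by eye, a few lines of Ruby can pull the number out for you. This is only a rough sketch: extract_instagram_id is a hypothetical helper name, and it assumes the page source contains a JSON fragment shaped like the sample above, which may not hold if Instagram changes its markup.

```ruby
# Hypothetical helper: find the numeric account ID that follows a given
# username in a page's source. Assumes the JSON fragment shown above.
def extract_instagram_id(page_source, username)
  pattern = /"username":"#{Regexp.escape(username)}".*?"id":"(\d+)"/
  match = page_source.match(pattern)
  match && match[1] # nil when the fragment is not found
end

sample = '"username":"my_client_page","profile_picture":"http:\/\/images.ak.instagram.com\/profiles\/profile_123456789_75sq_nnnnnnnnnn.jpg","id":"123456789","full_name":"My client"}'
puts extract_instagram_id(sample, 'my_client_page')
```

Save the page source to a file and pass its contents as the first argument; the method returns nil when no match is found.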

Armed with this intelligence, you’re now ready to implement the Ruby program as a cron job. Below is the complete Ruby listing. Notice that you’ll need the “rest-client” and “json” gems.

#! /usr/local/rvm/rubies/ruby-2.1.1/bin/ruby
# instagram_update.rb
# This file is called by a cron job to update the instagram information within the instagram.json file

require 'rest_client'
require 'json'

class InstagramUpdate

  def initialize
    @instagram_api = 'https://api.instagram.com/v1/users/123456789/media/recent/?client_id=121d0b9f4b9049dbc4bfe65598981b8e&count=4'
    @instagram_output_file = '/home/sinatra/instagram.json'
    @instagram_info = Array.new
    # ...
  end

  def get_instagram
    response = RestClient.get @instagram_api, {:accept => :json}
    case response.code
    when 200
      info = JSON.parse(response.body, :symbolize_names => true)
      # Keep only the post link and the low-resolution image URL
      info[:data].each do |insta|
        @instagram_info.push({
          link: insta[:link],
          image: insta[:images][:low_resolution][:url]
        })
      end
      File.write(@instagram_output_file, @instagram_info.to_json)
    else
      # puts "FAILED!"
    end
  rescue StandardError
    # Do nothing, do not update file
  end

end

instagram = InstagramUpdate.new
instagram.get_instagram

The “pound-bang” (shebang) on the first line should point to the location of your Ruby interpreter.

Afterwards, we create a special class, called InstagramUpdate, with only two methods: initialize and get_instagram.

The initialize method contains the ID of the client account as part of the API URL. It also contains the special client_id value that you get when registering your application with Instagram. (Obviously, the values in this example have been changed!) Notice that the URL contains a “count” value, limiting the response to the most recent four entries. The initialize method also sets the path to the output JSON file, and it creates a new array (@instagram_info), where all the information from the Instagram API will be stored until it’s written out to the JSON file.
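If you find the hard-coded URL awkward to maintain, one option is to assemble it from its parts. The sketch below is just an illustration: recent_media_url is a hypothetical helper, and 'YOUR_CLIENT_ID' is a placeholder for the value Instagram gives you at registration.

```ruby
require 'uri'

# Hypothetical helper: build the "recent media" URL used in the listing
# from the account ID, the client_id, and the number of posts wanted.
def recent_media_url(user_id, client_id, count = 4)
  query = URI.encode_www_form({ client_id: client_id, count: count })
  "https://api.instagram.com/v1/users/#{user_id}/media/recent/?#{query}"
end

# 'YOUR_CLIENT_ID' is a placeholder, not a real credential.
puts recent_media_url('123456789', 'YOUR_CLIENT_ID')
```

The result could then be assigned to @instagram_api inside initialize, which keeps the account ID and the client_id in one obvious place.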

The get_instagram method uses the rest-client gem to perform a simple call to the API and collect the results in the “response” variable. At this point, the program has been greatly simplified for this example, but you should be able to see where it’s heading. It checks the response code, and only if it is “200 OK” will it proceed to gather the results into the @instagram_info array and write them to the JSON file.
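To see the parsing step on its own, without touching the network, here is a small sketch. The extract_posts name and the sample body are made up for illustration; the only assumption carried over from the listing is the shape of the response (a “data” array whose entries have “link” and “images.low_resolution.url”).

```ruby
require 'json'

# Hypothetical helper: reduce an API response body to the link/image
# pairs the site needs. Returns an empty array when "data" is absent.
def extract_posts(body)
  info = JSON.parse(body, symbolize_names: true)
  info.fetch(:data, []).map do |post|
    { link: post[:link], image: post[:images][:low_resolution][:url] }
  end
end

# Made-up sample body with the same shape the listing expects.
sample_body = {
  data: [
    { link: 'http://instagram.com/p/abc/',
      images: { low_resolution: { url: 'http://example.com/abc.jpg' } } }
  ]
}.to_json

puts extract_posts(sample_body).to_json
```

Keeping the parsing in a method of its own also makes it easy to try out against a saved response before you schedule the script.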

Finally, at the bottom of this file, we instantiate the class and call the get_instagram method to acquire and store the information from Instagram.

Since I’m using Sinatra as the web server environment, I need only access the stored JSON file to get the information to display on the web page. You could do the same with Ruby on Rails. The code below shows how I load the Instagram information into a Ruby array of hashes, where it can be accessed while servicing a browser request.

get '/' do
  # ...
  @instagram_info = JSON.parse(IO.read('instagram.json'), :symbolize_names => true)
  # ...
  haml :index
end

We could further optimize the code by caching the @instagram_info variable, thus saving an access to disk with each user request. (That’s something I’ll work on when further optimizing the site.)
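One possible way to do that caching is sketched below. It is not what the site runs today; JsonFileCache is a hypothetical class that re-reads the file only when its modification time changes. Note that it still checks the file’s timestamp on every call, so it saves the read and the parse rather than every trip to the file system.

```ruby
require 'json'

# Hypothetical in-memory cache: parses the JSON file once and re-reads
# it only when the file's modification time changes.
class JsonFileCache
  def initialize(path)
    @path = path
    @mtime = nil
    @data = nil
  end

  def data
    current = File.mtime(@path)
    if @mtime != current
      @data = JSON.parse(File.read(@path), symbolize_names: true)
      @mtime = current
    end
    @data
  end
end
```

In the Sinatra app you could create one instance at load time, for example INSTAGRAM_CACHE = JsonFileCache.new('instagram.json'), and then set @instagram_info = INSTAGRAM_CACHE.data inside the route.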

The index.html.haml file then uses this information as follows:

        #instagram_container
          - @instagram_info.each do |instagram|
            .instagram_element
              %a{href:"#{instagram[:link]}",'target'=>'_blank'}
                %img{src:"#{instagram[:image]}",width:'242px'}

One final note about running cron with Ruby. I ran into a snag because the execution environment for cron is not the same as my command-line and Sinatra environments. I’m using CentOS on a Rackspace virtual server, and I used the following process to set up a cron job that runs every 15 minutes.

    1. Use “crontab -e” to set up a schedule for your specific cron job. The entry I created in my job file looked like this:
      */15 *  *  *  * /home/sinatra/instagram_update.rb
    2. Run “rvm cron setup” so that RVM configures the cron environment to run your job.

The JSON file should now update every 15 minutes, on the quarter-hour.
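If you want to confirm that cron really is refreshing the file, checking its modification time is a quick test. The recently_updated? helper below is hypothetical, and the 15-minute default simply mirrors the schedule above.

```ruby
# Hypothetical check: true when the file exists and was modified within
# the last max_age_seconds (15 minutes by default).
def recently_updated?(path, max_age_seconds = 15 * 60)
  File.exist?(path) && (Time.now - File.mtime(path)) <= max_age_seconds
end

puts recently_updated?('/home/sinatra/instagram.json')
```

Keep in mind that the script leaves the file alone when the API call fails, so an old timestamp can also mean that Instagram was unreachable.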

Enjoy!

Dan