Archive for the ‘Development’ Category

A Useful UITableView Cell Creation Pattern

Saturday, December 19th, 2009

Like many iPhone apps, the app I’m currently working on uses several table views. Most of these display actual lists of data, and some are used as a convenient layout mechanism for input fields, navigation, and other UI elements (similar to iPhone preference screens).

UITableView and its associated classes like UITableViewCell, UITableViewDataSource, and UITableViewDelegate are very powerful, but they also require a fair amount of boilerplate code split across several methods. The specific matter that I am tackling in this post is the creation of cells, which happens inside the [UITableViewDataSource tableView:cellForRowAtIndexPath:] method. When dealing with only a single type of cell, it typically looks like this:

- (UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath {
    // See if there's an existing cell we can reuse
    UITableViewCell *cell = [tableView dequeueReusableCellWithIdentifier:@"Foobar"];
    if (cell == nil) {
        // No cell to reuse => create a new one
        cell = [[[UITableViewCell alloc] initWithStyle:UITableViewCellStyleDefault reuseIdentifier:@"Foobar"] autorelease];
 
        // Initialize cell
        cell.selectionStyle = UITableViewCellSelectionStyleBlue;
        cell.textLabel.textColor = [UIColor blueColor];
        // TODO: Any other initialization that applies to all cells of this type.
        //       (Possibly create and add subviews, assign tags, etc.)
    }
 
    // Customize cell
    cell.textLabel.text = [NSString stringWithFormat:@"Row: %d", indexPath.row];
    // TODO: Any other customization
    //       (Possibly look up subviews by tag and populate based on indexPath.)
 
    return cell;
}

As you can see, there’s a lot of boilerplate code here. This works well enough with one type of cell, but if you’re dealing with multiple types of cells (particularly in a grouped table view), this approach quickly gets out of hand. You end up with a long method with large, ugly switch statements. But if you look at this method closely, you’ll notice that there are only a few cell-specific areas:

  1. The cell identifier (Foobar in my example). This is used to look up and reuse existing cells (e.g. when scrolling through a large table of items) and avoids the costly creation of new cells every time. It’s a standard cell creation pattern for the iPhone and makes a lot of sense, but it also means that the cell-specific code is spread across several places.

  2. The initialization code. This is where a cell of a given type is initialized for the first time. If you can get away with the standard cell styles (which cover a few different layouts of labels and images), you usually don’t need to do much here, besides setting your colors, fonts, and perhaps selection style. Otherwise, this is where you want to create and add your subviews, and assign a tag to them so you can populate them later.

  3. The customization code. Given a cell with the correct properties and subviews (which may have been reused or created during this call), this is where you populate it with the correct data. This typically involves looking up some sort of data based on the indexPath, and setting it either on the cell itself (using the textLabel, detailTextLabel, or imageView properties) or on its subview. The latter requires looking up the subviews using the tags you’ve assigned earlier.

With this in mind, I decided to factor out all the cell specific code, resulting in a generic method in my base view that can be used to create all the cells of my app. Here’s what that method looks like:

- (UITableViewCell *)createCellForIdentifier:(NSString *)identifier
                                   tableView:(UITableView *)tableView
                                   indexPath:(NSIndexPath *)indexPath
                                       style:(UITableViewCellStyle)style
                                  selectable:(BOOL)selectable {
    UITableViewCell *cell = [tableView dequeueReusableCellWithIdentifier:identifier];
    if (cell == nil) {
        cell = [[[UITableViewCell alloc] initWithStyle:style reuseIdentifier:identifier] autorelease];
        cell.selectionStyle = selectable ? UITableViewCellSelectionStyleBlue : UITableViewCellSelectionStyleNone;
 
        SEL initCellSelector = NSSelectorFromString([NSString stringWithFormat:@"initCellFor%@:indexPath:", identifier]);
        if ([self respondsToSelector:initCellSelector]) {
            [self performSelector:initCellSelector withObject:cell withObject:indexPath];
        }
    }
 
    SEL customizeCellSelector = NSSelectorFromString([NSString stringWithFormat:@"customizeCellFor%@:indexPath:", identifier]);
    if ([self respondsToSelector:customizeCellSelector]) {
        [self performSelector:customizeCellSelector withObject:cell withObject:indexPath];
    }
    return cell;
}

Structurally, this method is very similar to the previous one. It first tries to reuse an existing cell, creating and initializing a new one if it doesn’t find one. It then customizes the cell. You’ll notice that I’m using some performSelector calls here, coupled with a naming convention. For example, given an identifier of "Foobar", I will look for initCellForFoobar:indexPath: and customizeCellForFoobar:indexPath: to initialize and customize this cell respectively. A simple example might look like this:

- (void)initCellForFoobar:(UITableViewCell *)cell indexPath:(NSIndexPath *)indexPath {
    cell.textLabel.textAlignment = UITextAlignmentCenter;
    cell.textLabel.textColor = [UIColor blueColor];
    cell.textLabel.font = [UIFont boldSystemFontOfSize:16.0];
}
 
- (void)customizeCellForFoobar:(UITableViewCell *)cell indexPath:(NSIndexPath *)indexPath {
    cell.textLabel.text = [NSString stringWithFormat:@"Row: %d", indexPath.row];
}

Note that both methods are optional. In some cases (particularly in table views that are used for preferences or similar types of UI elements), there’s only a single cell of any given type, so I perform the complete initialization in initCellForFoobar and omit the customizeCellForFoobar method. In other cases, I may not require any special initialization and only have a customizeCellForFoobar method.

Obviously, the example above is trivial, and both methods can get significantly longer when dealing with custom subviews. In that case I am using the same tag based approach I mentioned above: Assign a tag to each subview inside initCellForFoobar, then look up the subview using the tag in customizeCellForFoobar. But the important thing is that the code is well factored, and the cell specific code is not mixed with boilerplate code.
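For readers more at home in Ruby, the naming-convention dispatch described above can be sketched roughly as follows. This is only an illustration of the idea, not UIKit code: CellFactory, create_cell, and the hook method names are hypothetical, a plain Hash stands in for UITableViewCell, and the reuse cache is much cruder than a real table view's reuse queue.

```ruby
# Hypothetical Ruby analogue of the generic cell creation method.
class CellFactory
  def initialize
    @reusable = {}  # identifier => previously created "cell" (simplified reuse)
  end

  def create_cell(identifier, row)
    cell = @reusable[identifier]
    if cell.nil?
      # No cell to reuse => create one and run the optional init hook once.
      cell = { :identifier => identifier }
      @reusable[identifier] = cell
      dispatch("init_cell_for_#{identifier}", cell, row)
    end
    # The optional customize hook runs on every call.
    dispatch("customize_cell_for_#{identifier}", cell, row)
    cell
  end

  private

  # Both hooks are optional, so only call the ones that actually exist.
  def dispatch(name, cell, row)
    send(name, cell, row) if respond_to?(name, true)
  end

  def init_cell_for_foobar(cell, row)
    cell[:color] = 'blue'
  end

  def customize_cell_for_foobar(cell, row)
    cell[:text] = "Row: #{row}"
  end
end

factory = CellFactory.new
first  = factory.create_cell('foobar', 0)
second = factory.create_cell('foobar', 1)  # reuses the cell and re-customizes it
other  = factory.create_cell('other', 0)   # no hooks defined, still works
```

As in the Objective-C version, an identifier without matching hook methods simply yields an uncustomized cell.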

Last but not least, an example of the actual UITableViewDataSource method to create a new cell:

- (UITableViewCell *)tableView:(UITableView *)tableView cellForRowAtIndexPath:(NSIndexPath *)indexPath {
    NSString *identifier = nil;  // Stays nil for sections not handled below
    BOOL selectable = YES;
    UITableViewCellStyle style = UITableViewCellStyleDefault;
 
    switch (indexPath.section) {
        case 0:
            identifier = @"Foo";
            break;
        case 1:
            identifier = @"Bar";
            selectable = NO;
            break;
    }
 
    return [self createCellForIdentifier:identifier
                               tableView:tableView
                               indexPath:indexPath
                                   style:style
                              selectable:selectable];
}

The example above is what I would typically use for a grouped table view, where each section contains a specific type of cell. But obviously this approach supports any type of table view, grouped or non-grouped. You just need to plug in your own logic to determine the type of cell, and leave everything else to the createCellForIdentifier:tableView:indexPath:style:selectable: method we created earlier.

There are probably other approaches for simplifying working with table views, but this approach has worked very well for me. It really cuts down significantly on the amount of boilerplate code and allows me to focus on the actual application specific code.

Any questions, suggestions for further improving this, or perhaps alternative solutions that you’ve used? Leave a comment!

Building a Twitter Filter With Sinatra, Redis, and TweetStream

Sunday, November 8th, 2009

It’s been way too long since my last programming focused blog post, so let’s try to rectify this situation:

Background

A couple of months ago, Twitter made available their Streaming API. This provides developers with a very efficient way to tap into the public Twitter stream. All you need to do is open and maintain a single HTTP connection, passing in a few filter parameters. Twitter then keeps streaming matching tweets to you. You have the option of either sampling the entire public stream, or passing in a list of keywords and / or user ids to track. In this post I will focus on the latter, but the basic usage remains the same.

My interest was piqued when I came across the excellent TweetStream library, which makes it trivially easy to write a Ruby client application for the Twitter Streaming API. I decided to take this opportunity to play with some other technologies and write a simple web app that displays a subset of tweets, along the lines of cursebird or twistori.

The app I came up with is Twatcher, so go check it out to get an idea of what I’m talking about. It (admittedly very crudely) identifies funny tweets by looking for tweets that contain the word “lol”. It then renders matching tweets using a simple UI not unlike that of twitter.com itself, and visually highlights the word “lol” in each tweet for emphasis. Perhaps most importantly, the app uses AJAX to periodically (currently every 10 seconds) pull in new tweets.

Architecture

In the remainder of this post, I will describe the architecture of Twatcher, along with the rationale behind it. I will also share some code snippets that should allow you to follow along and build your own Twitter filter app.

Given that the connection to the Twitter Streaming API has to happen in its own application (let’s call it our filter app), outside of the actual web app, we need a way for it to pass tweets to the web app. There are a couple of options here. Obviously we could store the tweets in a database like MySQL, and have the web app read them from there. But given the small schema (only tweets; we don’t care about users or any other relational data) and the ephemeral nature of the Twatcher app (at any given time we really only care about the N most recent tweets), this seems like overkill and leads to an unnecessarily write-heavy app. Instead, one of the various in-memory key/value stores seems like a much better fit. I first thought of memcached, but while it would be entirely possible to build this type of app on top of memcached, it’s not ideal. When a new tweet comes in from the Streaming API, we need to append it to our in-memory list of tweets. Memcached is a very low-level data store and only supports string values, so we would have to implement lists by serializing them as YAML, JSON, or binary Ruby objects. Either way would mean that instead of writing just the latest tweet, we would always have to re-write the entire list of tweets. Similarly, on the web app side, we would always have to read the entire list, even though we may only be interested in the 5 most recent tweets (say in our AJAX action). Combined, this would lead to a fair amount of overhead on both the networking as well as Ruby processing side.

Luckily, there’s another data store that is perfect for this type of app: Redis. On the most basic level it can be thought of as a key/value store like memcached, so it can act pretty much like a drop-in replacement for it. But it also has first class support for basic data structures, such as sets and lists. This means that instead of reading and writing entire lists of tweets, we can append a single tweet at a time, and we can efficiently retrieve the exact number of tweets that we need on the web app side (i.e. 20 for a full page view, and 5 for an AJAX request). Redis is stable, highly performant, and has a solid, extremely easy to use Ruby library. It also supports basic persistence, although we won’t need this for our app.
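To make the difference concrete, here is a rough sketch that uses plain Ruby hashes as stand-ins for the two kinds of stores. No memcached or Redis server is involved, and the helper names (push_via_string, push_via_list, latest_via_list) are made up for this illustration.

```ruby
require 'json'

# Plain Ruby stand-ins; these are not real memcached or Redis clients.
string_store = { 'tweets' => [].to_json }  # string-only values
list_store   = { 'tweets' => [] }          # native list value

# With string-only values, adding one tweet rewrites the entire list.
def push_via_string(store, tweet)
  list = JSON.parse(store['tweets'])  # read and parse everything
  list.unshift(tweet)
  store['tweets'] = list.to_json      # serialize and write everything
end

# With a native list, only the new tweet is serialized and written.
def push_via_list(store, tweet)
  store['tweets'].unshift(tweet.to_json)
end

# Reading the N newest entries only touches those N entries.
def latest_via_list(store, count)
  store['tweets'].first(count).map { |t| JSON.parse(t) }
end

3.times do |i|
  tweet = { 'id' => i, 'text' => "lol #{i}" }
  push_via_string(string_store, tweet)
  push_via_list(list_store, tweet)
end

newest = latest_via_list(list_store, 2)
```

The string-based version does work proportional to the length of the whole list on every push, which is the overhead described above.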

With the filter app and data store out of the way, that leaves the actual web app. Our requirements are very humble: We will only have two actions (one for the full page of tweets and one for AJAX updates), and perhaps a few more trivial actions in the future for things like help pages. Since we’re not using a relational database, we don’t need any sort of ORM layer. While we could use Ruby on Rails, this would mean shooting sparrows with cannons. For our purposes the Sinatra micro-framework seems like a much better fit.

I’m a big fan of HAML, so we’ll use this for our views. Of course there’s nothing HAML specific about our app, so you’re welcome to use ERB or your template language of choice instead.

We’ll use jQuery as our Javascript library, mainly for AJAX requests and basic visual effects (so we can smoothly slide new tweets into the existing page). Once more, our needs are simple, and I’m sure any of the popular Javascript frameworks would be more than up to the challenge. But my personal preference is jQuery.

I won’t go much into the deployment side of things, but twatcher.com relies on the usual suspects: Nginx (Apache would work fine as well), Passenger (you’re welcome to use Mongrel, Thin, etc.), Capistrano, and God (to start and monitor Redis and our filter app, though I may end up giving Bluepill a try). All of this runs very smoothly on a 256MB VPS slice on Webbynode (and I’m sure just as well on Slicehost or Linode). If necessary, we could easily scale up this app by bringing up additional Sinatra slices and adding HAProxy to the mix (or perhaps even just relying on DNS round robin).

Code

Now that the architectural overview is done, let’s take a look at some of the code. This isn’t the complete code base that I’m using on the site, but it’s a fully functional subset and hopefully enough to demonstrate the overall approach and get you started. Alternatively, you can grab the code from the twatcher-lite Github repository. I will eventually make the complete project (which includes configuration options, RSpec specs, etc.) available on Github as well.

But first a couple of prerequisites: We need to install a bunch of gems. For a production app, I would typically unpack these into the vendor directory, but for now let’s just install them system-wide:

sudo gem install tweetstream yajl-ruby ezmobius-redis json haml rack sinatra shotgun

You also need to install and start Redis. It’s easy enough, but beyond the scope of this blog post. Simply follow these instructions (I highly recommend the entire Redis article series by the way), but make sure to use the latest Redis release from the official website (currently 1.02) rather than the 1.0 version mentioned in the article.

Filter App

twitter_filter.rb:

This is the standalone filter app that mainly relies on the TweetStream library to retrieve tweets and then pushes them to Redis. In our final app we would want to use the Daemons library to run this app as a proper daemon, but for now you should be able to simply run it directly from the command line. Note that it relies on two additional files below. Simply place all of these into the same folder.

Make sure to set USERNAME and PASSWORD to your actual Twitter credentials. A word of caution: Apparently Twitter only allows a single Streaming API connection for standard accounts, and they will disconnect or blacklist you if you attempt to start multiple connections. I’m using a dedicated Twitter account for production, and my regular Twitter account during development. The actual version of this file that I’m using reads the (environment specific) credentials from a YAML file, but I didn’t want to distract from the core functionality for the purpose of this tutorial.

require 'tweetstream'
require File.join(File.dirname(__FILE__), 'tweet_store')
 
USERNAME = "my_username"  # Replace with your Twitter user
PASSWORD = "my_password"  # and your Twitter password
STORE = TweetStore.new
 
TweetStream::Client.new(USERNAME, PASSWORD).track('lol') do |status|
  # Ignore replies. Probably not relevant in your own filter app, but we want
  # to filter out funny tweets that stand on their own, not responses.
  if status.text !~ /^@\w+/
    # Yes, we could just store the Status object as-is, since it's actually just a
    # subclass of Hash. But Twitter results include lots of fields that we don't
    # care about, so let's keep it simple and efficient for the web app.
    STORE.push(
      'id' => status[:id],
      'text' => status.text,
      'username' => status.user.screen_name,
      'userid' => status.user[:id],
      'name' => status.user.name,
      'profile_image_url' => status.user.profile_image_url,
      'received_at' => Time.new.to_i
    )
  end
end

tweet_store.rb:

This is a thin abstraction layer on top of Redis that encapsulates both pushing and retrieving tweets. This allows us to keep Redis specific persistence code out of the filter and web apps and also comes in handy for testing (which I’m not getting into in this post), as we can easily swap it out for a mock implementation.

Note how we’re using the push_head operation to push a single tweet to Redis, and list_range to retrieve the N most recent tweets.

require 'json'
require 'redis'
require File.join(File.dirname(__FILE__), 'tweet')
 
class TweetStore
 
  REDIS_KEY = 'tweets'
  NUM_TWEETS = 20
  TRIM_THRESHOLD = 100
 
  def initialize
    @db = Redis.new
    @trim_count = 0
  end
 
  # Retrieves the specified number of tweets, but only if they are more recent
  # than the specified timestamp. The list is stored newest-first, so on
  # 1.8.7+ take_while {|t| t.received_at > since} would work as well.
  def tweets(limit=NUM_TWEETS, since=0)
    @db.list_range(REDIS_KEY, 0, limit - 1).collect {|t|
      Tweet.new(JSON.parse(t))
    }.reject {|t| t.received_at <= since}
  end
 
  def push(data)
    @db.push_head(REDIS_KEY, data.to_json)

    @trim_count += 1
    if @trim_count > TRIM_THRESHOLD
      # Periodically trim the list so it doesn't grow too large. The end index
      # is inclusive, so NUM_TWEETS - 1 keeps exactly NUM_TWEETS entries.
      @db.list_trim(REDIS_KEY, 0, NUM_TWEETS - 1)
      @trim_count = 0
    end
  end
 
end
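As for swapping the store out for a mock implementation in tests, a minimal sketch might look like the following. InMemoryTweetStore is a hypothetical name, and to stay self-contained it returns plain hashes rather than Tweet objects, so treat it as a starting point rather than a drop-in replacement.

```ruby
require 'json'

# Hypothetical test double mirroring TweetStore's push/tweets interface
# without talking to Redis.
class InMemoryTweetStore
  NUM_TWEETS = 20

  def initialize
    @list = []
  end

  # Newest tweet goes to the head of the list, as with a head push in Redis.
  def push(data)
    @list.unshift(data.to_json)
    @list = @list.first(NUM_TWEETS)  # trim eagerly instead of periodically
  end

  # Returns up to `limit` of the newest tweets received after `since`.
  def tweets(limit = NUM_TWEETS, since = 0)
    @list.first(limit).
      map { |t| JSON.parse(t) }.
      reject { |t| t['received_at'] <= since }
  end
end

store = InMemoryTweetStore.new
(1..25).each { |i| store.push('id' => i, 'received_at' => i) }

full_page = store.tweets           # 20 newest tweets
ajax      = store.tweets(5, 22)    # of the 5 newest, only those newer than 22
```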

tweet.rb:

The Tweet class wraps an individual tweet’s data hash and allows us to access the data using method call syntax (tweet.username) rather than hash element references (tweet['username']). It also contains some tweet related functionality, such as generating Twitter user links, highlighting the word “lol”, and making URLs clickable.

class Tweet
 
  def initialize(data)
    @data = data
  end
 
  def user_link
    "http://twitter.com/#{username}"
  end
 
  # Makes links clickable, highlights LOL, etc.
  def filtered_text
    filter_lol(filter_urls(text))
  end
 
  private
 
  # So we can call tweet.text instead of tweet['text']
  def method_missing(name)
    @data[name.to_s]
  end
 
  def filter_lol(text)
    # Note that we're using a list of characters rather than just \b to avoid
    # replacing LOL inside a URL.
    text.gsub(/^(.*[\s\.\,\;])?(lol)(\b)/i, '\1<span class="lol">\2</span>\3')
  end
 
  def filter_urls(text)
    # The regex could probably still be improved, but this seems to do the
    # trick for most cases.
    text.gsub(/(https?:\/\/\w+(\.\w+)+(\/[\w\+\-\,\%]+)*(\?[\w\[\]]+(=\w*)?(&\w+(=\w*)?)*)?(#\w+)?)/i, '<a href="\1">\1</a>')
  end
 
end
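One side effect of the catch-all method_missing above is that a misspelled attribute silently returns nil. If that ever becomes a problem, a stricter variant could look roughly like this. StrictTweet is a hypothetical name, and the sketch assumes Ruby 1.9.2 or later for respond_to_missing?.

```ruby
# Hypothetical stricter wrapper: only answers for keys present in the hash.
class StrictTweet
  def initialize(data)
    @data = data
  end

  # Known keys are returned; anything else falls through to NoMethodError.
  def method_missing(name, *args)
    key = name.to_s
    return @data[key] if args.empty? && @data.key?(key)
    super
  end

  # Keeps respond_to? consistent with method_missing.
  def respond_to_missing?(name, include_private = false)
    @data.key?(name.to_s) || super
  end
end

tweet = StrictTweet.new('username' => 'alice', 'text' => 'lol')
name  = tweet.username
```

The trade-off is a little more code in exchange for typos failing loudly instead of rendering empty fields.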

Web App

twatcher.rb:

This is the actual Sinatra web app. This is the entire app (minus the views), so perhaps now you can see why we’re using Sinatra instead of a full-blown Rails app. The views follow below.

Note that our two actions both return tweets. The main difference is that the /latest action (which is used by AJAX requests) only returns up to 5 tweets, and only if they’re newer than the specified date. It also omits the layout and specifies a special CSS class named latest. This allows us to initially hide the new tweets and then make them visible using a nice Javascript slide effect.

require 'sinatra'
require 'haml'
require File.join(File.dirname(__FILE__), 'tweet_store')
 
STORE = TweetStore.new
 
get '/' do
  @tweets = STORE.tweets
  haml :index
end
 
get '/latest' do
  # We're using a Javascript variable to keep track of the time the latest
  # tweet was received, so we can request only newer tweets here. Might want
  # to consider using Last-Modified HTTP header as a slightly cleaner
  # solution (but requires more jQuery code).
  @tweets = STORE.tweets(5, (params[:since] || 0).to_i)
  @tweet_class = 'latest'  # So we can hide and animate
  haml :tweets, :layout => false
end

views/layout.haml:

A pretty simple layout.

!!! Strict
%html{:xmlns=> "http://www.w3.org/1999/xhtml", 'xml:lang' => "en", :lang => "en"}
  %head
    %meta{'http-equiv' => "Content-Type", 'content' => "text/html; charset=utf-8"}
    %title twatcher
    %link{:rel => 'stylesheet', :href => '/stylesheets/style.css', :type => 'text/css'}
    %script{:type => 'text/javascript', :src => 'http://ajax.googleapis.com/ajax/libs/jquery/1.3.2/jquery.min.js'}
  %body
    #container
      #content
        = yield

views/index.haml:

The actual HTML content is pretty minimal: A heading and a list of tweets, which is included from a separate file (below) so we can reuse it for the AJAX action. We’re also inlining some jQuery code to refresh the tweets every 10 seconds. We insert the new tweets at the beginning, but remember that we’re using a CSS class to initially hide them. We then call slideDown to make them visible using a nice slide effect. We also trim the list of tweets at 50 to prevent the page from getting too long.

:javascript
  function refreshTweets() {
    $.get('/latest', {since: window.latestTweet}, function(data) {
      $('.tweets').prepend(data);
      $('.latest').slideDown('slow');
      $('.tweets li:gt(49)').remove();  // :gt is zero-based, so this keeps 50 items
 
      setTimeout(refreshTweets, 10000);
    });
  }
  $(function() {
    setTimeout(refreshTweets, 10000);
  });
 
%h1 Recent LOL Tweets
%ul.tweets
  = haml :tweets, :layout => false

views/tweets.haml:

Simply renders a list item for each tweet, with some basic CSS for styling purposes. I wouldn’t normally hardcode height and width for an img tag (and instead let CSS handle this), but for the purpose of this tutorial I wanted the page to render decently without a style sheet, and the Twitter profile pictures can be pretty large, making it look weird.

We also emit some simple Javascript to record the timestamp of the most recent tweet. We pass this into our AJAX request in index.haml above. An alternative solution (perhaps slightly cleaner from an HTTP perspective) would be to use the Last-Modified HTTP header instead of a Javascript variable, but this would mean messing with date parsing (never fun…) and also result in slightly more complex jQuery code, so I opted for the simpler solution.

- @tweets.each do |tweet|
  %li.tweet{:class => @tweet_class}
    %span.avatar
      %a{:href => tweet.user_link}
        %img{:src => tweet.profile_image_url, :alt => tweet.username, :height => 48, :width => 48}
    %span.main
      %span.text= tweet.filtered_text
      %span.meta== &#8212; #{tweet.name} (<a href="#{tweet.user_link}">@#{tweet.username}</a>)
 
- if !@tweets.empty?
  :javascript
    window.latestTweet = #{@tweets[0].received_at};
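For the curious, the date handling that an HTTP header based approach would involve on the Ruby side is fairly small. The sketch below only shows the conversion between a received_at style Unix timestamp and an HTTP date string using the standard time library. The timestamp is an arbitrary example value, and wiring this into Sinatra and jQuery is left out.

```ruby
require 'time'  # adds Time#httpdate and Time.httpdate

received_at = 1257724800  # arbitrary example Unix timestamp

# Format the timestamp as an HTTP date, as used in headers like Last-Modified.
header_value = Time.at(received_at).utc.httpdate

# Parse such a header value back into a Unix timestamp.
parsed = Time.httpdate(header_value).to_i
```

HTTP dates have one-second resolution, which matches the integer timestamps stored by the filter app.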

public/stylesheets/style.css:

The stylesheet is pretty basic, but since this blog post is already way too long, I’m not going to reproduce it here. Simply grab the live one instead.

Putting it all together

You should now have a bunch of Ruby files in the same folder, and three HAML files in a views subdirectory. Make sure you have started Redis according to the instructions above. Then open two shells:

In your first shell, start the filter app:

ruby twitter_filter.rb

The app should start and continue running until you hit CTRL+C.

In your second shell, start the web app. You could simply start it using:

ruby twatcher.rb

However, assuming you’ve installed the Shotgun gem according to the instructions above, I recommend using the following command instead:

shotgun twatcher.rb

This will cause Sinatra to automatically reload modified files during development, similar to the default behavior in Rails.

If everything started successfully, you should now be able to bring up the site in your browser. If you’ve started the web app by itself, hit port 4567. With Shotgun, hit port 9393 instead.

Conclusion

I hope you can appreciate how little code it took us to implement a complete Twitter filter web application, complete with AJAX updates. I count around 150 lines of code, and this includes plenty of comments and whitespace (granted, including the stylesheet it would be closer to 300 lines).

I also hope I’ve managed to pique your curiosity about Redis, Sinatra, and the TweetStream library. Many of us (myself included) tend to stick with the tools we’re familiar with, such as Rails and MySQL. But often, surprisingly elegant solutions emerge when using better-suited (and often simpler) tools.

Personally, I am excited about adding Redis and Sinatra to my standard toolset. I am also curious about what other types of applications might be able to get away with simple, ephemeral solutions like this. Definitely something worth exploring…

Upgrading an older MacBook Pro to 6GB of RAM

Saturday, September 5th, 2009

If you own a MacBook Pro and would like to upgrade to more than 4GB of RAM but think that your model does not support this, you may want to read the rest of this article.

I bought a MacBook Pro 17″ in April 2008, as my primary development machine. I knew that the standard config with 2GB of RAM wouldn’t be enough for my purposes, but I also wasn’t about to spend a ridiculous amount of money on an official memory upgrade from Apple, so I picked up two cheap G.SKILL 2GB DIMMs from Newegg.

This had worked great for me so far, but even with 4GB of RAM, I occasionally ran into memory limits. For example I sometimes work on iPhone and complementary Rails apps at the same time, and having both Xcode (plus Interface Builder and the iPhone Simulator) and a Rails app, IDE, etc. running at the same time definitely uses a fair amount of memory, especially if I use RubyMine (which is pretty nice, by the way, but a major memory hog). That’s one of the reasons why I often still work with a regular text editor such as TextMate. The situation gets even worse when I need to run a virtual machine, such as for IE browser testing. And of course there are all the other memory hungry apps that tend to be running all the time (Firefox and / or Safari, iTunes, etc.).

The last time I researched potential memory upgrades, I quickly discovered that my model (apparently) only supports a maximum of 4GB, so I gave up.

But this time I complained on Twitter, and a reply prompted me to research this issue more closely. Well, it turns out that many MacBook Pro models do indeed unofficially support 6GB of RAM, in the form of 4GB + 2GB DIMMs. This MacRumors Guide has all the info you need. My model appears to be the Rev. E (as identified by the date of purchase, as well as the CPU frequency, video card, and video memory). And sure enough, the Rev. E and F models can handle up to 6GB of RAM.

Since I was already using G.SKILL memory, I opted for a 4GB G.SKILL DIMM (currently $129 at NewEgg). I would not recommend mixing DIMMs from different manufacturers, and in fact I have read some reports of people having trouble getting these configs to work.

The actual memory upgrade process is quick and easy (at least on the pre-unibody models), and Apple provides a convenient guide.

I should point out that due to the mixed (4GB + 2GB) memory configuration, you lose the Dual Channel capability. But based on what I read, this only affects certain types of apps and makes little difference in practice. I definitely didn’t notice any lower performance after the upgrade.

The increased memory means that my system rarely (if ever) has to swap. Now I can run my whole development stack as well as two virtual machines (Windows and Linux) and the machine is still very responsive.

Now I just have to find some new memory intensive applications to bring my system down to its knees… ;)

Update: This upgrade was the single biggest bang for the buck and has made a tremendous difference on my system. Having 6GB instead of 4GB was exactly the additional RAM I needed to be able to run all my various development tools at the same time. I am now running several Rails apps and have one RubyMine as well as two Xcode projects open without any issues, along with the usual productivity software, iTunes, etc. Definitely highly recommended!

iPhone Development is Fun!

Sunday, May 10th, 2009

It’s been a while since I completed my first iPhone development project, and I figured I’d finally write up my initial experience with this platform. A bit late, but better than never…

To put this into perspective, here’s a brief summary of my previous professional programming background:

I worked with C++ back in University but quickly adopted Java in 1998 and never looked back (good riddance to pointers and manual memory management!). Java served me well for the better part of the past 10 years, but I’ve increasingly become a fan of dynamic languages, and Ruby in particular (although I’ve dabbled a bit with Python as well). These days I mainly work with Ruby (mostly Ruby on Rails, but also standalone apps such as daemons, command line apps, etc.). Among many other reasons, I just love its expressiveness over Java’s verbosity.

With this in mind, I was admittedly a bit hesitant at first about iPhone development using Objective-C, even though I was definitely curious about the CocoaTouch platform (and Cocoa in general).

Objective-C

At first glance, Objective-C syntax looks quite odd. Instead of the dot-notation for method calls that Java and Ruby use (e.g. "foobar".length), Objective-C uses a square bracket based message passing syntax (e.g. [@"foobar" length]). This is harmless enough for simple method calls, but can become confusing when using nested calls (e.g. [[[MyClass alloc] init] autorelease]). Apparently, Apple decided this was the case as well, so when they introduced their simplified support for properties in Objective-C 2.0, they chose to go with dot-notation. This definitely cuts down on some of the clutter (especially when setting properties), but it also leads to a slightly awkward, inhomogeneous mix of notations in the code, as dot-notation cannot be used for regular method calls (and in fact some long-time Objective-C developers are boycotting the new properties syntax for that very reason). But in the end, this is a relatively minor stylistic difference, and I eventually got used to it.

The other Objective-C oddity is the Smalltalk-inspired way that parameters are passed to methods. Most current programming languages pass parameters as comma-separated lists. Objective-C essentially breaks method names up into multiple segments, each of which has its own parameter. This means that you end up with method signatures like this:

- (void)insertSubview:(UIView *)view atIndex:(NSInteger)index;

Invoking this method would look like this:

[myView insertSubview:mySubview atIndex:3];

It felt weird at first, but once I got my head wrapped around this syntax, I actually kind of started to like it, as it makes method calls read like regular English sentences. I have however found that it can be difficult to come up with natural sounding method and parameter names for my own classes, so my preference would be a C-based method call syntax with named params like Python has. But in the end, I got used to this syntax fairly quickly.

Probably the thing that bothers me most about Objective-C / Cocoa is the overall verbosity, especially compared to a very succinct language like Ruby. This is partially because variables need to be declared, partially because Cocoa’s method names tend to be long, and of course because as a C-derived language, Objective-C simply isn’t as expressive as a language like Ruby, with its blocks, enumerations, literal syntax for creating arrays and hashes, etc.

Then there’s the usual duplication also found in C/C++ with its header and implementation files. For example, in Ruby I can define a property simply by using attr_accessor :foo. In Objective-C, I need to explicitly declare the actual member variable as well as the property in the header file, and add a synthesize statement to the implementation file. And this is already the shorthand syntax for properties…
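To make the Ruby side of that comparison concrete, here is a rough sketch of what attr_accessor :foo saves you from writing by hand. The class names Shorthand and Longhand are made up for illustration.

```ruby
# Shorthand: one line declares both the reader and the writer.
class Shorthand
  attr_accessor :foo
end

# Longhand: roughly the two methods that attr_accessor defines for you.
class Longhand
  def foo
    @foo
  end

  def foo=(value)
    @foo = value
  end
end

a = Shorthand.new
a.foo = 42
b = Longhand.new
b.foo = 42
puts a.foo == b.foo
```

Both classes behave the same from the caller's point of view, which is the whole point of the shorthand.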

Perhaps the thing I was most worried about: Memory Management. Objective-C 2.0 actually introduced garbage collection, but unfortunately this isn’t supported on the iPhone. Luckily, as it turns out, Objective-C’s retain count based memory management combined with autorelease pools is actually quite elegant and a major improvement over C/C++. I won’t go into details here (there are many resources on this topic), but the bottom line is that as long as you adhere to the accepted conventions (particularly regarding which methods return autoreleased vs. non-autoreleased objects), you should be fine. With Instruments, Apple also provides a powerful tool to help track down actual memory leaks. Would I prefer garbage collection? Absolutely! But the situation isn’t as bad as I had feared.

Objective-C also provides many nice features, such as Protocols and Categories. Protocols are very similar to Java Interfaces and are heavily used for delegation (see below). Categories allow functionality to be added to existing classes. For example, you could use this mechanism to allow NSDictionary, NSArray, or other NSObject subclasses to generate a JSON representation. They remind me of a less powerful (but still convenient) version of Ruby Mixins (less powerful because Mixins allow modules to be mixed into arbitrary classes, whereas Categories are implemented for a specific class).
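For readers who haven't used Ruby Mixins, here is a minimal sketch of the "arbitrary classes" part. The module and class names (Describable, Song, Server) are hypothetical; the only assumption the module makes is that the including class responds to name.

```ruby
# A hypothetical module; any class that responds to #name can mix it in.
module Describable
  def describe
    "name=#{name}"
  end
end

# Two unrelated classes pick up the same behavior.
class Song
  include Describable
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

class Server
  include Describable
  attr_reader :name

  def initialize(name)
    @name = name
  end
end

puts Song.new("Blue in Green").describe
puts Server.new("app01").describe
```

A Category, by contrast, is written against one specific class, so the equivalent of Describable would have to be repeated for each class it extends.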

CocoaTouch

This is really the best part of iPhone development. Having mostly worked on server side code, I’ve never been much of a GUI developer. But with a few positive exceptions (such as Qt), most of the GUI frameworks I have messed with were quite painful (MFC anyone?), with horrendous amounts of unmaintainable, auto-generated code.

CocoaTouch (like its big brother Cocoa on the Mac) is very well thought out and makes good use of design patterns. In particular, it heavily relies on delegation rather than inheritance, which leads to significantly cleaner and more modular code. For example, instead of subclassing a UI control in order to add the desired custom behavior, you typically implement the behavior in your controller class and specify this as the delegate for the UI control.
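The delegation idea is not specific to Objective-C. Below is a small sketch of it in Ruby, assuming a made-up TextField control and SignupController; none of these names correspond to real CocoaTouch classes or delegate methods.

```ruby
# Hypothetical UI control: it knows nothing about the app,
# it only asks its delegate (if any) before accepting a change.
class TextField
  attr_accessor :delegate, :text

  def initialize
    @text = ""
  end

  # Simulates the user typing; the delegate may veto the change.
  def type(new_text)
    if delegate.respond_to?(:text_field_should_change?) &&
       !delegate.text_field_should_change?(self, new_text)
      return false
    end
    @text = new_text
    true
  end
end

# The controller supplies the custom behavior without subclassing TextField.
class SignupController
  def text_field_should_change?(_field, new_text)
    new_text.length <= 10
  end
end

field = TextField.new
field.delegate = SignupController.new
puts field.type("short")
puts field.type("much too long for this field")
puts field.text
```

The control stays generic and reusable, while the length rule lives in the controller where the rest of the screen's logic already is.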

Aside from a few lines of boilerplate code that are part of the project templates, there is absolutely no auto-generated code. This is mostly possible because Interface Builder stores actual, serialized object instances inside xib files and all custom behavior is injected using the delegate mechanism.

CocoaTouch also provides very simple, high-level APIs for many powerful features, such as the iPhone’s camera / image picker.

The only thing that bugs me a lot is the disconnect between the Objective-C based high-level Cocoa API and the C-based lower-level APIs. Oftentimes, the high-level APIs are limited to a few common cases, but when you need to stray from these, you need to talk straight to the underlying low-level APIs. This leads to an awkward mix of nice, object-oriented Objective-C code and ugly, procedural C code.

Development Tools

The various development tools are an integral part of the iPhone SDK. In particular, Xcode is a very decent IDE. It may not be the best IDE I’ve ever used, but it gets the job done and even sports some modern IDE features (such as refactoring). I wish it had Git support, though.

Interface Builder is an extremely convenient tool, so I try to use it whenever it makes sense. However, I do have a few complaints: Many properties are not exposed in IB, so they have to be set manually in the code. More importantly, IB is not extensible at all. For example, I would like to be able to implement my own subclass of UIView and have its relevant properties show up in IB. If my UIView subclass defines a text property (perhaps with some sort of annotation that specifies additional, Interface Builder specific metadata), I would like to see a text field in IB. If I define a color property, I would like to see a color picker, etc. Hopefully this will be possible in a future version.

Instruments is a powerful tool for finding memory leaks and performance bottlenecks. I have only used this a little bit, but it’s good to know that it’s there when I need it.

My major complaint in this area is about the horribly complicated and error prone provisioning process. But I’ll save this rant for another blog post.

Conclusion

Overall, iPhone development is a lot of fun, and a very refreshing change from web development. The iPhone is an amazingly powerful device, and after the initial learning curve, it is surprisingly easy to leverage many of its unique abilities, such as the accelerometer, multi-touch, Internet connection, and more.

I would love to be able to use CocoaTouch in conjunction with Ruby (such as RubyCocoa or MacRuby). But given that this is not an option on the iPhone, Objective-C is a decent alternative.

Now that I got this post out of the way, I plan on posting more about specific iPhone development topics. Stay tuned!

Workling and Amazon SQS

Saturday, April 4th, 2009

If you need to perform any time consuming work in your Rails actions, you’ll probably want to offload this into a background job. There are many different frameworks to help with this, and the one we use is Workling. The nice thing about Workling is that it provides an abstraction layer that allows you to decouple your actual background job implementation from the background execution strategy. For example, in our development environment we are using the Spawn runner (which simply forks the Rails process for each background job), but we need a proper, queue based runner in production. Up until recently we were using the Starling runner, which worked pretty well for a small set of machines.

However, after migrating our infrastructure to Amazon EC2 and rapidly scaling up the number of app servers, we figured it would be great to take advantage of Amazon SQS (Simple Queue Service), rather than maintaining our own queue servers. Fortunately, Workling’s plugin architecture makes it very easy to implement your own clients, so writing an SQS Workling client turned out to be fairly straightforward.

If you are interested in using this in your own project, simply use my Workling fork on Github. I haven’t decided yet whether to extract this into a separate plugin that you could install alongside Workling, so let me know if you have a strong preference. I’ll also get in touch with the Workling developers to see if they might be interested in pulling this feature into the main code base. But for now, you can simply install it by following the regular Workling plugin installation instructions, except using my Workling fork:

script/plugin install git://github.com/digitalhobbit/workling.git

The README includes detailed instructions on configuring the client, but it’s actually very easy:

Install the RightAws gem:

sudo gem install right_aws

Configure Workling to use the SqsClient. Add this to your environment:

Workling::Remote.dispatcher = Workling::Remote::Runners::ClientRunner.new
Workling::Remote.dispatcher.client = Workling::Clients::SqsClient.new

Add your AWS key id and secret key to workling.yml:

production:
  sqs_options:
    aws_access_key_id: <your AWS access key id>
    aws_secret_access_key: <your AWS secret access key>

You can optionally override the following settings, although the defaults will likely be sufficient:

    # Queue names consist of an optional prefix, followed by the environment
    # and the name of the key.
    prefix: foo_
 
    # The number of SQS messages to retrieve at once. The maximum and default
    # value is 10.
    messages_per_req: 10
 
    # The SQS visibility timeout for retrieved messages. Defaults to 30 seconds.
    visibility_timeout: 30
 
    # The number of seconds to reserve for deleting a message from the queue.
    # If buffered messages are getting too close to the visibility timeout,
    # we drop them so they will get picked up the next time a worker retrieves
    # messages, in order to avoid duplicate processing.
    visibility_reserve: 10
 
    # Below are various retry and timeout settings for the underlying right_aws
    # and right_http_connection libraries. You may want to tweak these based on
    # your workling usage. I recommend fairly low values, as large values can
    # cause your Rails actions to hang in case of SQS issues.
 
    # Maximum number of seconds to retry high level SQS errors. right_aws
    # automatically retries using exponential back-off.
    aws_reiteration_time: 2
 
    # Low-level HTTP retry / timeout settings.
    http_retry_count: 2
    http_retry_delay: 1
    http_open_timeout: 2
    http_read_timeout: 10

Now start the Workling Client:

script/workling_client start

That’s it!
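The visibility_reserve setting described above boils down to a simple time check. Here is a minimal sketch of that idea, not the actual client code: BufferedMessage and usable_messages are names I made up for illustration, and the defaults mirror the 30 second timeout and 10 second reserve from the config.

```ruby
# Hypothetical buffered message: payload plus the time it was fetched from SQS.
BufferedMessage = Struct.new(:body, :received_at)

# Keep only messages that still have more than `reserve` seconds left
# before their visibility timeout expires; the rest are dropped so that
# a later fetch picks them up again.
def usable_messages(buffer, now, timeout: 30, reserve: 10)
  deadline = timeout - reserve
  buffer.select { |msg| now - msg.received_at < deadline }
end

now = Time.at(1_000)
buffer = [
  BufferedMessage.new("fresh", now - 5),   # fetched 5 seconds ago
  BufferedMessage.new("stale", now - 25)   # fetched 25 seconds ago
]
puts usable_messages(buffer, now).map(&:body).inspect
```

With the default values, a message that has been sitting in the buffer for 20 seconds or more is considered too close to the timeout and is left for the next retrieval.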

There are still some caveats, such as the fact that messages are currently deleted from the queue at the beginning of processing rather than at the end (unfortunately Workling currently doesn’t provide the necessary hooks). This is good enough for us (we’re not relying on our background jobs for anything highly critical), but you probably don’t want to build your financial transaction processing on top of this… If there’s demand for this, I may try to extend Workling at some point to fix this issue.
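To illustrate why the point of deletion matters, here is a toy sketch that contrasts the two orderings. ToyQueue is an in-memory stand-in that I made up for this example; it is not the SQS or Workling API.

```ruby
# Toy in-memory stand-in for a queue; not the SQS or Workling API.
class ToyQueue
  attr_reader :messages

  def initialize(messages)
    @messages = messages.dup
  end

  # Delete first, then process: a failing job loses the message.
  def process_delete_first
    msg = @messages.shift
    yield msg
  rescue StandardError
    nil
  end

  # Process first, delete only on success: a failing job leaves the message queued.
  def process_delete_last
    msg = @messages.first
    yield msg
    @messages.shift
  rescue StandardError
    nil
  end
end

lossy = ToyQueue.new(["job"])
lossy.process_delete_first { |_msg| raise "worker crashed" }
puts lossy.messages.inspect   # the failed job is no longer queued

safe = ToyQueue.new(["job"])
safe.process_delete_last { |_msg| raise "worker crashed" }
puts safe.messages.inspect    # the failed job is still queued for a retry
```

The first variant is what the client currently does. The second is the safer behavior, at the cost of possibly processing a message more than once.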

Please leave a comment if you find this useful or have any other feedback. Also let me know if you encounter any bugs, or better yet, update my test case to reproduce the issue or send me a Github pull request with your fix. :)

Rails, Facebooker, and memcached Session Store

Saturday, February 28th, 2009

We’re using Facebooker for our Rails based Facebook apps. However, we ran into a problem after migrating our session store to the MemCacheStore. Every request was producing the following stacktrace:

/!\ FAILSAFE /!\  Sat Feb 28 10:24:09 -0800 2009
  Status: 500 Internal Server Error
  session_id '2.wYavYw2U9jBTnFOcX9rjMw__.86400.1235934000-1023424742' is invalid
    /Library/Ruby/Gems/1.8/gems/actionpack-2.2.2/lib/action_controller/session/mem_cache_store.rb:54:in `initialize'
    /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/cgi/session.rb:273:in `new'
    /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/ruby/1.8/cgi/session.rb:273:in `initialize_without_cgi_reader'
    /Library/Ruby/Gems/1.8/gems/actionpack-2.2.2/lib/action_controller/cgi_ext/session.rb:19:in `initialize_aliased_by_facebooker'
    /Users/mirko/Work/questionx/vendor/plugins/facebooker/lib/facebooker/rails/facebook_session_handling.rb:35:in `initialize'
    /Library/Ruby/Gems/1.8/gems/actionpack-2.2.2/lib/action_controller/cgi_process.rb:94:in `new'

It turns out that Facebook session ids include dots and underscores, which the MemCacheStore chokes on. Luckily I came across this forum post. The solution below is based on the approach outlined in the post, with a few modifications to more cleanly hook the patch into the method chain rather than replacing the original functionality completely. Simply drop the code below into an initializer (e.g. config/initializers/facebooker_memcache_session_patch.rb):

class CGI
  class Session
     class MemCacheStore
       def check_id_with_strip_fb_chars(id)
         check_id_without_strip_fb_chars(id.gsub(/[-\._]/, ''))
       end
       alias_method_chain :check_id, :strip_fb_chars
     end
  end
end
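If you haven't seen alias_method_chain before, it is roughly shorthand for two alias_method calls. The sketch below shows the same chaining in plain Ruby without Rails. ToyStore is a stand-in I made up, and its check_id is only an assumed approximation of the real validation, not a copy of the MemCacheStore code.

```ruby
# Stand-in for the session store; the real MemCacheStore is not loaded here.
class ToyStore
  # Assumed validation: only letters and digits are acceptable.
  def check_id(id)
    /\A[0-9a-zA-Z]+\z/.match?(id)
  end

  def check_id_with_strip_fb_chars(id)
    check_id_without_strip_fb_chars(id.gsub(/[-._]/, ''))
  end

  # These two lines are roughly what
  # alias_method_chain :check_id, :strip_fb_chars does.
  alias_method :check_id_without_strip_fb_chars, :check_id
  alias_method :check_id, :check_id_with_strip_fb_chars
end

store = ToyStore.new
fb_id = "2.wYavYw2U9jBTnFOcX9rjMw__.86400.1235934000-1023424742"
puts store.check_id_without_strip_fb_chars(fb_id)  # original check
puts store.check_id(fb_id)                         # patched check
```

The original method stays reachable under its _without_ name, which is why this approach hooks into the method chain instead of replacing the original functionality.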

Merb and Rails are merging!

Tuesday, December 23rd, 2008

No, it’s not April 1st, and as far as I know, hell hasn’t frozen over either. The Merb and Rails teams have in fact announced that they will be joining forces. The end result will eventually be released as Rails 3.

Rather than repeating all the details here, below are links to the original announcements by various team members:

It is unclear how much of the actual Merb code will make it into Rails 3, but the important thing is that Rails will embrace many of Merb’s core principles, such as a lightweight core, performance, modularity (i.e. you’ll be able to easily swap ActiveRecord out for DataMapper or some other ORM framework), and a well-defined and stable public API that plugins can be based on.

I’ve been a big fan of Rails for many years, and it is certainly relatively mature and has a large developer community. At the same time, I’ve been drawn to Merb and related technologies (such as DataMapper) lately, and I strongly agree with their core principles.

Integrating these two frameworks will be no small feat, so the new combined team definitely has their work cut out for them. But as different as the two frameworks are, they also have a lot in common. By agreeing on this new direction, the team will be able to focus on the important tasks without having to deal with redundant functionality. I am looking forward to the new direction and am excited about trying out the first Rails 3 alpha whenever it is released.

Over the past few months, there’s been a fair amount of bickering between the Rails and Merb teams, and I’m all the more impressed with both teams for reaching this decision and deciding to work together. I am convinced that the Ruby Community will be a lot better for it.

Rails in the Cloud: AWS, Heroku, and Morph

Thursday, November 13th, 2008

Amazon Web Services

Over the course of the past 6 months or so, I’ve had the opportunity to explore various cloud hosting services, starting with Amazon Web Services. I don’t want to go into detail about AWS here, but suffice it to say that I like this suite of services a lot, and it was a great fit for an SMS app / messaging server I deployed on this infrastructure. The application consisted of a set of loosely coupled components running on EC2 (some daemon apps and some Merb based web frontends), communicating via the Simple Queue Service (SQS). I leveraged S3 for deployments and backups.

AWS is great, and the ability to bring up new EC2 instances any time is very powerful. Need to test something out real quick? Simply launch a fresh instance, then kill it when you no longer need it. One of your app servers running at capacity? Simply launch another one, or even automate this by monitoring your load. Need to bring up 1000 instances for an hour in order to load test your web app, then shut them right down again? No problem! No need to ever call up an IT person at your local hosting company and get a new server provisioned, or even buy your own server and drive to a colocation facility yourself to set it up.

But the services that AWS provides are still very low level. This can be a good thing because it gives you complete control and flexibility over how to architect your application, but it also means that you’ll spend a good amount of time configuring Linux images and writing deployment, monitoring, and backup scripts. For many complex applications, this makes perfect sense, and I would pretty much always pick AWS over VPS or dedicated hosting services these days. But for a large class of straightforward web apps, it would be great if I didn’t have to worry about these details (especially since I’m admittedly not that much of an IT guy). Ideally I would simply push my code out into the cloud and delegate all deployment concerns to the hosting service. When taken to the extreme, I would not even want to have to know any details about the deployment infrastructure. All I would care about is that the app runs reliably and scales up as needed (and affordably).

Google App Engine

Google App Engine has a similar philosophy, and if I was a Python fan, this would be a very interesting option. As soon as Google starts supporting Ruby, I will definitely check this out. My guess is that Rails might be difficult to support, because it is very tightly coupled with relational databases, but other Ruby based web frameworks (in particular Merb, perhaps even with a special DataMapper adapter) seem like they could be a great fit for App Engine. DataMapper already has adapters for other non-relational data stores (such as CouchDB), so it seems like it should be fairly straightforward to build an adapter for the Google App Engine Datastore.

But I’m getting off-track from what I was really planning to write about:

Lately, several Rails based services have emerged that take a big step in this direction. I’ve had the chance to work with Heroku and Morph AppSpace, so I figured I’d share my early impressions.

Heroku

Still in private beta, Heroku is a Rails specific hosting service that prides itself on its ease of use and elastic scalability. You create a new Rails app using their web interface, and then either edit it straight in the browser via a nifty browser based IDE, or check it out via Git and edit it locally. Changes made via the browser take effect immediately, changes made via Git are automatically deployed upon pushing to the remote repository, including applying migration scripts. It can’t get much more streamlined than that!

Heroku sits on top of EC2, but this fact is completely hidden from the developer, who never interfaces directly with the underlying EC2 instance. This means that, among other limitations, you don’t have shell or FTP access to the server. You do however have access to the Rails console via Heroku’s web interface, allowing you to work with your model objects and take care of the odd ad-hoc task. You also have access to your Rake tasks and the Rails code generators. Overall, I didn’t find this limitation all that bad.

Heroku uses PostgreSQL (we would have preferred MySQL, but that is unfortunately not an option at this point), but you don’t have direct access to the database. The web interface has some rudimentary functionality for viewing your database schema and data, but there is no way to run SQL queries. I have found this a much bigger limitation than the lack of shell access, as it prevents a lot of ad-hoc analytics or quick and dirty data changes that occasionally come in handy (although I suppose it is easy enough to whip up a quick migration for the latter). There is a way to download a yaml based database dump, which you can then import into your local database (regardless of whether this runs PostgreSQL, MySQL, or Sqlite3). It’s a bit cumbersome, but at least the option is there, so your data is never held hostage.

Another major limitation is that Heroku does not support background tasks of any sort. This includes both cron jobs as well as tasks offloaded by the Rails app onto a job server such as BackgrounDRb or Workling. This may not be critical for all applications, but it is becoming increasingly common for Rails apps to offload a lot of the processing into asynchronous background tasks, allowing for a more responsive user experience as well as better scalability.

Heroku is free for now, but as far as I am aware they are aiming for a utility based pricing model, where you only pay for the actual bandwidth and CPU utilization you have consumed. If done right, this should be a great model that allows developers to launch their application without committing to any large upfront costs, and scale up the cost linearly with utilization.

Overall Heroku is very impressive, and if you’re starting out with a brand-new Rails app, it’s well worth considering, at least in the early phases. It does come with some significant limitations compared to traditional hosting options, which may or may not be a big deal for you. Of course these are offset with the elimination of IT related tasks that are no longer necessary in this environment.

But as much as I like Heroku’s premise, we were underwhelmed with the performance of our Rails app on Heroku, which was our main reason for exploring alternatives such as Morph. Of course, since Heroku is still in its early stages, I’m sure they will be able to improve this over time.

Morph AppSpace

Morph is similar in principle to Heroku, but in my experience it is generally a bit more flexible.

One of the main differences is that scaling isn’t quite as transparent as in Heroku. Morph applications run on one or more “cubes”, which they describe as a “virtualized application compute environment”. A single cube is free and may very well be sufficient during development, so as with Heroku there is no up-front cost (although free plans don’t support custom domains). Additional cubes cost $31 per month (the price goes down after 4 cubes) and also come with increased bandwidth and storage limits. But unlike typical VPS accounts, cubes are charged on a daily basis, so you can ramp up (or down) your cubes any time as needed to adjust to your application’s utilization. This does mean that you need to manually allow your app to scale by adding additional cubes, but this is easily accomplished in the Morph Control Panel. Apparently automatic scaling based on application load is planned for a future release.

The next difference is in the range of applications that Morph supports: Rails, Java (including Grails), and PHP (experimental). I have only used Morph for Rails applications and can’t speak to any of the other options.

Morph supports both PostgreSQL and MySQL (although the latter costs an extra $0.33 per day, apparently due to licensing issues). One very useful feature is that Morph gives you direct access to the database via a web based admin tool (phpMyAdmin in case of MySQL).

Unlike Heroku, Morph does not offer a web based IDE. It also doesn’t create a blank Rails app for you, nor does it offer source code hosting. Instead, it integrates with your existing source control system via a customized Capistrano deployment script. You simply specify the type (supported are Git, Subversion, Mercurial, Bazaar, and Local Directory) and URL of your SCM (GitHub in our case) and their wizard spits out a customized Capistrano script. After that, deployments are a breeze, not unlike Heroku. One difference is that Morph’s deployment process involves multiple stages, the first of which consists of uploading your code to S3. As a result, you don’t actually see detailed deployment status in the console and need to refer to the deployment logs in the Control Panel instead.

Another critical feature for us is that Morph has at least rudimentary support for cron jobs. The Control Panel allows you to run Rake tasks directly (just like Heroku), but it also allows Rake tasks to be scheduled at arbitrary intervals (the minimum interval is 1 minute). This still doesn’t allow us to offload arbitrary processing jobs to a job server, but it does at least enable jobs to be queued up in the database and processed asynchronously by a Rake task. We are using this mechanism to offload email processing to ar_mailer, which works great and leads to a significantly improved user experience.
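The pattern here is simple enough to sketch in a few lines. This is only an illustration of the idea: Outbox is a made-up in-memory stand-in for the database table, and drain plays the role of the scheduled Rake task. It is not how ar_mailer is actually implemented.

```ruby
# Toy stand-in for a database table of pending emails.
class Outbox
  def initialize
    @pending = []
  end

  # Called from the web request: cheap, just records the work to do.
  def enqueue(to, subject)
    @pending << { to: to, subject: subject }
  end

  # Called by the scheduled task: hands every queued email to the block
  # and returns how many were processed.
  def drain
    sent = 0
    until @pending.empty?
      email = @pending.shift
      yield email
      sent += 1
    end
    sent
  end
end

outbox = Outbox.new
outbox.enqueue("a@example.com", "Welcome")
outbox.enqueue("b@example.com", "Your receipt")

delivered = []
count = outbox.drain { |email| delivered << email[:to] }
puts "sent #{count}"
```

The request only pays for the enqueue call, and the slow delivery happens whenever the scheduler next runs the task.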

On the downside, Morph does not provide access to the Rails console. For me this is largely offset by the ability to access the database, but console access would have been nice.

Last but not least, for our application Morph resulted in a significant speed boost (several times faster, close to an order of magnitude).

Verdict

So which service (if any) do I recommend? I wholeheartedly recommend both services. Which one is a better fit for you will likely depend on the specific features (and limitations) that matter to you. Personally, I’m mostly enamored with Morph at this point, the improved performance and ability to run background jobs being the biggest differentiators for me. It remains to be seen whether either Heroku or Morph remains a good option for us as our application grows (the fact that neither supports true background tasks or Memcached servers might become a limiting factor at some point), but if nothing else they’re an ideal way to get off the ground.

If you’re starting out with a new Rails application, you may want to simply try out both services (it really doesn’t take that long) to see which one you like better. The easiest way to do this is to start with Heroku, as this creates a new Rails app for you and provides Git hosting. You should be able to then use Morph’s Capistrano wizard to generate a custom Capistrano script for you and check this into your Git repository, at which point you’ll be able to deploy the same Rails app to both Heroku and Morph. Of course, if you later decide to stick with Morph, you should find a different source code host (I heartily recommend GitHub).

The bottom line is that this new breed of hosting services is extremely compelling. Sure, both Heroku and Morph still have some rough edges and won’t be a good match for all applications at this point, but the direction is very promising and I am excited to find out what the future holds.


Amazon EC2 Out Of Beta

Thursday, October 23rd, 2008

Today, Amazon removed the beta label from their EC2 service, along with a bunch of related announcements. This is great news!

Over the past half year, I have become an enthusiastic user of the various AWS services, including EC2, which has been very stable for me thus far. But now Amazon is formalizing this by offering a 99.95% availability guarantee as part of the new EC2 SLA.

I don’t personally care about the new Windows support, but I suppose this might make some Microsoft aficionados happy…

Amazon also announced some exciting future features:

Management Console – The management console will simplify the process of configuring and operating your applications in the AWS cloud. You’ll be able to get a global picture of your cloud computing environment using a point-and-click web interface.

I’ve been using Ylastic as a Management Console, and can highly recommend this service. It allows me to monitor and manage our various AWS services from any machine, without having to install any apps locally. But having a Management Console built right into AWS would be neat (assuming it is as solid as Ylastic).

Load Balancing – The load balancing service will allow you to balance incoming requests and traffic across multiple EC2 instances.

This is great! Currently, AWS users have to roll their own load balancing implementation or rely on a more limited, DNS based solution. Most of my EC2 deployments don’t involve public websites, so I have not had to tackle this issue. But having a solid load balancing solution built right into AWS will be tremendously useful for me in the future.

Automatic Scaling – The auto-scaling service will allow you to grow and shrink your usage of EC2 capacity on demand based on application requirements.

Another feature that’s been on our roadmap, and I’m excited to hear there’s going to be an officially supported solution. Hopefully Amazon’s implementation will be flexible enough to allow different criteria to determine when to start and stop instances, such as CPU usage or SQS queue status.

Cloud Monitoring – The cloud monitoring service will provide real time, multi-dimensional monitoring of host resources across any number of EC2 instances, with the ability to aggregate operational metrics across instances, Availability Zones, and time slots.

Another awesome feature! So far I’ve shied away from setting up a tool like Nagios for in-depth monitoring of our EC2 instances. It sounds like Amazon’s built-in monitoring solution will meet this need.

Overall I’m very excited about the pace at which AWS has been improving over the past half year or so. Availability Zones and particularly Elastic IPs made a major difference, followed closely by EBS.

Here’s a small wishlist of features I’d like to see in EC2:

  1. Instance Aliases: When running more than a handful of instances, each one for a specific service, it becomes very difficult to keep track of which instance ID maps to which service. Ideally instances would have a user-defined alias, but unfortunately the EC2 API does not offer this functionality. Luckily I am currently managing all instances via Ylastic, which supplements this functionality, but it means that I would be lost if I had to manage my existing instances using a different tool. This really should be implemented on the EC2 API level.

  2. Querying User Data: Along the same lines, EC2 instances do support arbitrary user data (which I use to specify the role of the instance upon startup), but unfortunately this can only be queried from within the instance, not externally. Again, Ylastic solves this issue by keeping track of the user data itself, but this should be supported by the EC2 API.

This list used to be a lot longer, but the recent release of EBS, coupled with the announcements above, took care of much of it. Nice work, Amazon!

Rails And JSON Containing Unicode Characters

Wednesday, August 27th, 2008

As I mentioned in a previous blog post, Rails 2.1 natively supports incoming JSON requests. Unfortunately, it still struggles with JSON data containing non-ASCII characters.

According to the JSON spec, JSON fully supports UTF-8 encoded text, so with a few exceptions it generally should not be necessary to escape non-ASCII characters with \u Unicode escape sequences. However, many JSON libraries appear to escape all non-ASCII text in this fashion. This in itself should not be a problem, but ActiveSupport::JSON currently does not properly parse JSON containing \u escapes, resulting in strings with literal \u escape sequences rather than the desired UTF-8 encoded characters. This is especially confusing since ActiveSupport::JSON itself encodes all non-ASCII characters as \u escapes, so one might think that the reverse transformation yields the original data. But this behavior is likely explained by an odd implementation choice for its decoder: Rather than using the json (or json-pure) library, it converts the JSON data to YAML and then uses the YAML library to decode the data into Ruby objects.

Monkey-patching to the rescue! I decided to replace ActiveSupport::JSON::decode with an implementation that uses the json library. The easiest way is to stick the following code into a file named something like activesupport_json_unicode_patch.rb inside the config/initializers/ directory, where Rails will automatically pick it up.

require 'json'
 
module ActiveSupport
  module JSON
    def self.decode(json)
      ::JSON.parse(json)
    end
  end
end

You can verify the fix by adding a test case (I added a file named activesupport_json_test.rb to the test/unit/ directory):

require File.dirname(__FILE__) + '/../test_helper'
 
class ActiveSupportJsonTest < Test::Unit::TestCase
 
  def test_json_encoding
    unicode_escaped_json = '{"foo":"G\u00fcnter","bar":"El\u00e8ne"}'
    hash = ActiveSupport::JSON.decode(unicode_escaped_json)
    assert_equal({'foo' => 'Günter', 'bar' => 'Elène'}, hash)
  end
 
end

This test should fail without the patch and pass after adding it.

In addition to fixing the JSON / Unicode problem, this patch should also provide a nice speed boost, as we’re replacing the somewhat roundabout YAML based JSON decode method with a native one (particularly if you’re using the native json implementation rather than json-pure).
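If you want to check the json library's behavior outside of Rails, a quick standalone script along these lines should do. It assumes a reasonably current json library; the ascii_only option is used here only to force \u escapes on output for the round trip.

```ruby
require 'json'

# Single quotes keep the \u sequences literal, just as they arrive over the wire.
escaped = '{"foo":"G\u00fcnter","bar":"El\u00e8ne"}'
hash = JSON.parse(escaped)
puts hash["foo"]

# Round trip: ascii_only forces \u escapes on output,
# and parsing restores the original characters.
ascii = JSON.generate(hash, ascii_only: true)
puts ascii.ascii_only?
puts JSON.parse(ascii) == hash
```

The parsed values are real UTF-8 strings rather than strings containing literal backslash sequences, which is the behavior the patch relies on.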

It’s been a while…

Wednesday, August 27th, 2008

I realized that it’s been quite a while since my last update. Unfortunately it seems like the amount of interesting stuff I have to write about is inversely proportional to the spare time I have available for writing… ;)

Anyway, I figure I’ll try to get back into the habit of publishing smaller posts, but hopefully more regularly. Let’s see how it goes…

Ever since I went back into startup life four months ago, I’ve had the chance to play with a lot of exciting technologies, so there’s plenty of stuff to write about. Below are just some quick notes and mini-reviews, but I’ll be writing more about the various topics in the future.

  • Merb
    • This is quickly becoming my web framework of choice. It is well-thought-out, highly modular, and ideally suited for both small webservices as well as large scale websites.
    • The default application layout is very similar to Rails, making it easy for Rails developers to get up to speed on Merb. However, it also supports highly compact app layouts that are ideal for smaller webservices.
    • I’m sure I’ll write more about Merb later.
  • DataMapper
    • An awesome ORM framework, and one of several supported by Merb. Like Merb it is still pre-1.0 and therefore still evolving a lot, but it is already quite solid and very powerful.
    • More on this later…
  • RSpec
    • For my new projects, I have been using RSpec exclusively, and after a small learning curve it has really grown on me. Granted, I’m probably not taking advantage of all its features (for example I haven’t actually written my own matchers yet), but with a bit of discipline (and a heavy dose of mocking and stubbing), specs written in RSpec are probably an order of magnitude clearer and more readable than with Test::Unit.
    • I took the opportunity to get rid of test fixtures as well (and good riddance!)
  • Ruby in general + Rails
    • Right now I am working with a mix of Merb projects, Rails projects, and standalone Ruby apps.
    • I’ve had the opportunity to build a Ruby based framework for SMS based mobile apps, consisting of a processing pipeline with several independent services that communicate via asynchronous message queues, with some DSLs and meta-programming thrown in for good measure. This has allowed me to become much more familiar with Ruby than I was back when I just played with some Rails apps.
    • There’s definitely a right language for every task, but I have to admit that I find it harder and harder to imagine ever going back to Java…
  • Amazon Web Services (EC2, SQS, and S3)
    • Using Amazon AWS as a deployment platform is a whole new experience for me, and I am very impressed with the various services. They were already solid when I started working with them, but in the past few months Amazon managed to launch several major features (such as static IP addresses and persistent storage) that significantly lower the barrier of entry.
    • EC2 is particularly well-suited for the type of loosely coupled architecture we are developing, and I envision eventually being able to dynamically start and stop instances to adjust both to general growth over time, as well as fluctuations throughout the day (as mobile apps tend to exhibit certain usage patterns.)
    • SQS is a convenient way to tie the different components together (albeit with some limitations, such as an 8KB maximum message size), and of course S3 is available for any storage needs.
    • I will definitely blog in more detail about various AWS strategies and recipes.
  • Git
    • We’ve been using Git (and GitHub) for all new applications. We’re definitely not exploiting it to its full potential so far (given that we’re only a few developers that mostly work on different products, we haven’t had much of a need to leverage Git’s distributed nature), but I am definitely growing quite fond of it.
    • It is extremely fast, I love being able to commit code and browse the revision history even without a network connection, topic branches are convenient and powerful, and more.
    • Tool support is definitely still a weak point, but for the most part I’ve been happy using Git on the command line.
  • Lots of sysadmin / deployment stuff
    • System administration is definitely not my forte, but I get by (at least on Linux; Joyent’s Solaris servers still manage to throw me off occasionally…).
    • I have had a chance to get more exposure to deploying Rails apps (including Mongrel, Monit, Passenger, etc.), as well as building a deployment framework for AWS from scratch.
    • Deploying applications to such a cloud based architecture is quite different from a typical Rails or LAMP stack, but I’m quite happy with the initial version of the deployment framework I’ve built. I’m essentially using S3 to store versioned app packages (which correspond directly to a Git tag) as well as third party gems, and a configuration file (also in S3) for each environment (such as production or staging) defines which version should be deployed on it. Each service regularly polls S3 for configuration changes and updates itself if appropriate. We use a single machine image and configure each instance via user data upon startup, which allows the instance to pull down the appropriate files from S3 and start running the service it is intended for. A small set of Rake tasks manage deploying releases and promoting them from one environment to another one.
    • There are still some challenges ahead, though, such as a proper logging system (I’m planning to migrate our apps to log to a central syslog server.)
  • Various web based services, including Google Apps, GitHub, Lighthouse, Scout, and Ylastic
    • All of these are great utilities.
    • Google Apps is a must-have; any startup should use it at least for email and calendaring, as well as collaborating on documents or spreadsheets. We also use Google Sites as our Intranet / Wiki.
    • I have briefly blogged about Scout before and will likely blog about various Scout plugins in the future.
    • GitHub is an amazing application, not only for hosting our own source code, but also for following the various open source projects we depend on (such as Rails, Merb, and DataMapper.)
    • We haven’t had a chance to really dive into Lighthouse yet, so the jury is still out on it. I do believe that its refreshingly simple, tag based approach should work pretty well, though.
    • Last but not least, Ylastic has been invaluable and a great way to manage our EC2 instances and images, debug SQS based apps, browse S3, and more. I’ll probably write up a more thorough review soon.
  • Honorary mention: Pandora and Airfoil
    • For keeping us supplied with music and allowing us to stream it to our Airport Express. :)
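The polling-based deployment scheme mentioned in the sysadmin notes above could be sketched roughly as follows. This is not the actual framework: the class name, the callables, and the shape of the configuration hash are all assumptions made for illustration, with a plain lambda standing in for reading the per-environment configuration file from S3.

```ruby
# Hypothetical sketch; none of these names come from the real framework.
class DeployPoller
  attr_reader :current_version

  # fetch_config: callable returning the environment's config as a Hash,
  #               e.g. { 'version' => 'v1.4.0' } (stands in for reading S3).
  # install:      callable that installs and starts the given version.
  def initialize(fetch_config:, install:, current_version: nil)
    @fetch_config = fetch_config
    @install = install
    @current_version = current_version
  end

  # Performs one polling cycle; returns true if a new version was installed.
  def poll_once
    wanted = @fetch_config.call['version']
    return false if wanted.nil? || wanted == @current_version

    @install.call(wanted)
    @current_version = wanted
    true
  end
end
```

A service would call poll_once on a timer. Promoting a release to another environment then only requires changing the version recorded in that environment's configuration.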

That’s it for now. Hopefully it won’t be two months before my next post…

Network Monitoring With Scout

Tuesday, June 24th, 2008

I’ve been meaning to set up a network monitoring tool at work for a while. We have a couple of different applications using various technologies (currently mainly Ruby on Rails and PHP), running on various VPS servers. While we are using Monit to keep an eye on our Rails apps and restart them if necessary, as well as a couple of custom webpages to track vital and growth stats of our apps, we currently don’t use any monitoring or (perhaps more importantly) alerting tools beyond that. After one of our PHP / MySQL apps stopped responding (due to the fact that we ran out of disk space, as we later discovered), I figured it was about time to put some more sophisticated network monitoring in place.

The de-facto standard application seems to be Nagios, which is quite powerful and configurable, but has an extremely steep learning curve. It also does not offer a friendly UI for configuring services and relies on static configuration files instead. There is also a newer crop of network monitoring apps that I was hoping might be a bit less daunting to get up to speed on, such as Zenoss (here is a brief overview of open source network monitoring apps). I downloaded a few of these, but ultimately realized that I would not be able to get my head wrapped around any of these apps purely using intuition, and that I would actually have to invest a fair amount of time to master at least the basics. I’m simply not enough of an operations expert… It is clear that all of these apps are extremely powerful, and probably great for larger deployments, but I really just needed a simple tool to check some operating system level vital stats or ping some URLs, for a handful of machines.

That’s when I remembered Scout, a hosted network monitoring service that launched fairly recently and that sounded very interesting when I first came across it. Their subscription plans are pretty reasonable (the $29/month plan for 4 servers should suffice for us at this stage), particularly given that we would have had to pay for an additional VPS slice or EC2 instance to host Nagios or some other deployed solution anyway. Best of all, Scout offers a free plan, and even though this only supports a single server, this is a good way to evaluate how well it works for our purposes.

Scout uses an interesting approach to monitoring servers. Rather than using SNMP or an agent that is continuously running on each server, Scout uses a lightweight client (installed via a Ruby gem) that needs to be run periodically (10 minutes being the minimal reporting interval), generally via a cron job. Once the client app is installed on each server to be monitored, the servers don’t need to be touched for future configuration changes. Instead, everything is configured on the Scout website, and pulled down by the client the next time it checks in. The entire configuration consists of a number of plugins that can be installed for each client. Out of the box, Scout supports around 20 plugins that range from basic monitoring tasks for server load or disk space to more specific plugins for Ruby on Rails, Mongrel, or MySQL.

Even better, Scout offers a very simple Plugin API for integrating your own plugins. Plugins are written in Ruby and mainly consist of a single method that either returns a bunch of stats as a hash, which is exposed by Scout both in tabular report and graph form, or triggers an alert in case of a problem. Since plugins have the full Ruby stack at their disposal, it is easy to write a plugin that shells out to a Unix command, performs an HTTP request, hits a database, or anything else you can think of.
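To give a feel for what such a plugin boils down to, here is a rough sketch of a disk space check. It deliberately does not use Scout's real base class or method names (those are not covered here); DiskSpaceCheck, run, and the :report / :alert keys are made up for illustration, and the df output can be injected so the logic can be tried without shelling out.

```ruby
# Hypothetical plugin shape; this is NOT Scout's actual Plugin API.
class DiskSpaceCheck
  # threshold_percent: usage level at which an alert is raised.
  # df_output:         optional canned `df -P` output (useful for testing).
  def initialize(threshold_percent: 90, df_output: nil)
    @threshold = threshold_percent
    @df_output = df_output
  end

  # Returns either { report: {...} } with stats or { alert: "..." }.
  def run
    output = @df_output || `df -P /`
    # The last line of `df -P` looks like: /dev/sda1 1000 870 130 87% /
    fields = output.lines.last.split
    used = fields[4].to_i # "87%".to_i => 87

    if used >= @threshold
      { alert: "Disk usage at #{used}%" }
    else
      { report: { used_percent: used } }
    end
  end
end
```

The single run method mirrors the idea described above: gather some numbers, then either hand back a hash of stats or signal a problem.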

One minor downside is that (as far as I can tell), there is no way to simply upload a plugin. Instead, Scout relies on a pull mechanism, which means that we would need to expose any proprietary plugins via a publicly accessible URL. This might be an issue if the plugin itself contains sensitive information, although settings (such as passwords or paths) can be decoupled from the code and configured via the web interface. While not ideal, putting the plugin code in a publicly accessible but not automatically discoverable location and only making it available for the duration of the initial download or future updates should minimize this risk and turn it into a minor inconvenience.

Based on my initial impression, Scout looks very promising. The reporting functionality is fairly basic, and particularly the graphs could perhaps use a bit more polish, but everything is very easy to use. Scout is clearly geared towards developers rather than sysadmins, so perhaps that is why it appeals to me. If your monitoring needs are relatively straightforward and you don’t need all the functionality that a deployed solution like Nagios offers, Scout is definitely worth a look, at least for relatively small deployments. I am not sure how well it scales beyond 16 servers (both in terms of administration and pricing), so it is possible that a deployed application might make more sense at that point.

Phusion Passenger Now Rack Compatible?

Sunday, June 1st, 2008

According to this blog post and several mentions on Twitter, Phusion announced today’s release of Passenger (aka mod_rails) 2.0 at RailsConf. Apparently, Passenger 2.0 will be Rack compliant and thus support not only Rails, but any Rack compatible web framework, including Merb and Sinatra. Interestingly, Passenger will no longer even be limited to Ruby and will extend its support to WSGI, the Python web adapter framework that inspired Rack in the first place. For example, this will allow Passenger 2.0 to run the popular Django web framework. In light of these changes, Passenger will drop the name mod_rails.

I think this is fantastic news! As I mentioned in my previous post on Phusion Passenger, it makes deploying Rails apps trivially easy, and I am planning to use it as the default deployment platform for my Rails apps. I have also been flirting with Merb lately, and knowing that I am going to be able to deploy it just as easily as Rails makes a big difference to me.

Support for Rack is the logical next step for Passenger, so I am not all that surprised about the direction they are going. I am however a bit surprised about the timing. After all, version 1.0 was only released fairly recently and prominently branded as mod_rails. The online documentation (http://www.modrails.com/documentation.html) even states (although I assume that this will be updated within the next few days):

Does it support other Ruby frameworks (Merb, Camping, etc.)?

No.

What?! Why??

Because this is an evil plot created by evil overlords, with the goal of world domination destroying all other Ruby frameworks. …

Actually…

There is the following saying: Jack of all trades, master of none. Our intention is to be masters, not Jacks. The primary goal of version 1.0 was to create an easy-to-use, low-maintenance, stable and fast Ruby on Rails deployment system for Apache. And we’ve put a lot of effort into reaching that goal. Implementing support for other Ruby frameworks would have deviated us from that goal and would have increased development time significantly.

That said, nothing prevents future versions from supporting other Ruby frameworks, or from becoming a generic Ruby web application deployment platform. Please discuss it with us if you’re interested in steering development towards that direction.

My guess is that there was a lot of demand for Rack support from users (even Rails now supports Rack), and after looking into it, they realized it was easier to integrate this than they initially expected. Either way, this is great news, and I look forward to trying out Passenger 2.0 with Merb and DataMapper, or even Sinatra for smaller apps.

Update: 2.0 RC1 has been released, and you can find more details in the release announcement on the Phusion blog. In addition to Rack and WSGI support, 2.0 sounds like a more solid and stable release overall, with a significantly smaller memory footprint, faster startup time, fair load balancing, upload buffering, and some convenient analysis tools. There’s also a native Ubuntu package now, in case you want to avoid compiling from source.

Rails 2.1 and Incoming JSON Requests

Sunday, May 25th, 2008

Earlier this week, we tried to figure out the cleanest and easiest way to get our Rails app to accept incoming JSON requests. Up until recently, developers were able to use various Rails plugins for this purpose, such as the json_request plugin.

Luckily, it turns out that full support for JSON was added to Rails in April, making it a first class citizen along with XML and regular URL-encoded form fields. This functionality will be officially released in Rails 2.1, but in addition to Edge Rails, it is already included in Rails 2.0.991, which is available from the Ruby on Rails Gem Repository. You can install this pre-release via:

sudo gem update rails --source http://gems.rubyonrails.org

Using this functionality is really simple. Let’s say we have created the following scaffolded Rails app with a Book resource, perhaps to manage your library:

rails library
cd library
script/generate scaffold book title:string author:string isbn:string price:decimal
rake db:migrate

As you know, you can now access the books controller in the browser via http://localhost:3000/books, and use the “New Book” link to create a new book via the scaffolded form that Rails provides. But you can also create books via JSON (or XML, for that matter). In fact, we will try XML first, which has been natively supported in Rails for a while:

curl -H "Content-Type:text/xml" -H "Accept:text/xml" \
  -d "<book><title>Posted via XML</title><author>Ex Emel</author><isbn>1234567890</isbn><price>34.99</price></book>" \
  http://localhost:3000/books

Note that I am setting both the Content-Type and Accept header to “text/xml”, indicating that the incoming request consists of XML and that we would like to receive an XML-formatted response as well. The response looks like this:

<?xml version="1.0" encoding="UTF-8"?>
<book>
  <author>Ex Emel</author>
  <created-at type="datetime">2008-05-26T05:58:38Z</created-at>
  <id type="integer">2</id>
  <isbn>1234567890</isbn>
  <price type="decimal">34.99</price>
  <title>Posted via XML</title>
  <updated-at type="datetime">2008-05-26T05:58:38Z</updated-at>
</book>

Now let’s try the same thing in JSON:

curl -H "Content-Type:application/json" -H "Accept:application/json" \
  -d "{\"book\":{\"title\":\"Posted via JSON\", \"author\":\"Jason Bourne\", \"isbn\":1234567890, \"price\":49.95}}" \
  http://localhost:3000/books

As you can verify in the browser, this request was successful and had the desired effect. However, unlike the XML case, there was no response this time. This is because our controller does not know how to render JSON results yet. Looking at the create method in the BooksController, we notice that the respond_to block contains entries for HTML and XML, but not for JSON. Simply copy the XML lines and replace all occurrences of xml with json. The updated method should look like this:

def create
  @book = Book.new(params[:book])
 
  respond_to do |format|
    if @book.save
      flash[:notice] = 'Book was successfully created.'
      format.html { redirect_to(@book) }
      format.xml  { render :xml => @book, :status => :created, :location => @book }
      format.json { render :json => @book, :status => :created, :location => @book }
    else
      format.html { render :action => "new" }
      format.xml  { render :xml => @book.errors, :status => :unprocessable_entity }
      format.json { render :json => @book.errors, :status => :unprocessable_entity }
    end
  end
end

If you run the same curl command again, you should now get the following response:

{"book": {"isbn": 1234567890, "updated_at": "2008-05-26T06:11:33Z",
"title": "Posted via JSON", "price": 49.95, "author": "Jason Bourne",
"id": 4, "created_at": "2008-05-26T06:11:33Z"}}

One important thing to note is that both the incoming and outgoing JSON contain an outermost element called book. This is in fact required for resource based JSON requests to work. The same is true for XML, but since XML requires an enclosing element (unlike JSON, which can be “naked”), it is perhaps less obvious in this case. The outermost JSON element should always have the same name as the resource it corresponds to.
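To see why the wrapper matters, it helps to look at what the parsed request body looks like as a hash. The following is a simplified, Rails-free sketch using the json library directly; it only approximates how the parsed body ends up in params, and the variable names are made up for illustration.

```ruby
require 'json'

wrapped = '{"book":{"title":"Posted via JSON","author":"Jason Bourne"}}'
naked   = '{"title":"Posted via JSON","author":"Jason Bourne"}'

# Roughly what the controller would see as params for each request body.
wrapped_params = JSON.parse(wrapped)
naked_params   = JSON.parse(naked)

# With the wrapper, the attributes sit under the "book" key,
# which is what Book.new(params[:book]) expects.
p wrapped_params['book']

# Without it, there is no "book" key, so the controller would
# effectively call Book.new(nil) and create an empty record.
p naked_params['book']
```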

A few notes on testing JSON requests:

We initially banged our heads against the wall trying to figure out how to convince our functional test to pass JSON-formatted parameters in the post method, including various hacks to override different settings on the @request object. Admittedly it had been a while since I had last used Rails seriously, but it later dawned on me that we were going about this the wrong way. Functional tests (in Rails, anyways… let’s not talk about its confusing and non-standard test terminology) bypass most of the actual HTTP request handling and are not meant to test this aspect of an application. They essentially pick up at the point where the controller has received its (already parsed) parameters in the params hash, regardless of whether these originated from an XML, JSON, or URL-encoded form request.

Since JSON support is implemented by Rails (and thus covered by its own unit tests), it probably does not make sense to focus too much on testing this general functionality in the individual application. But if you do want to test JSON requests, you can use integration tests for this purpose. The post method in integration tests is more low level and simulates the actual HTTP request, along with the parameter parsing.

So in our library example, we might use an integration test case such as the one below to specifically test creating a book via JSON:

def test_create_via_json
  assert_difference('Book.count') do
    post '/books/create',
      '{"book":{"title":"Posted via JSON", "author":"Jason Bourne", "isbn":1234567890, "price":49.95}}',
      {'Content-Type' => 'application/json', 'Accept' => 'application/json'}
  end
end

Native JSON support in Rails is definitely a useful feature. In fact, I was fairly surprised that this wasn’t already implemented until recently. But now that it’s here, it should come in very handy.

Phusion Passenger (aka mod_rails) on DreamHost

Wednesday, May 14th, 2008

A couple weeks ago, Phusion Passenger (aka mod_rails) was released. I recently tested this at work, on an EC2 instance, and my initial experience was so smooth that I am already planning to use it to deploy our various Rails applications. The benchmarks I’ve seen put its performance on approximately the same level as a Mongrel cluster, but its ease of use is an order of magnitude better. All you need to do is install an Apache mod and set up a virtual host config that points to your Rails app’s public directory. You don’t even need to tell it that the directory you’re pointing to represents a Rails app — mod_rails is smart enough to figure this out by itself (although there are a few Rails specific options you can use to control the base URI or the Rails environment). No more juggling Mongrel PIDs, complicated proxy configs, or anything of that sort. Simply create a tmp/restart.txt file to have Apache reload the Rails app after you deploy a new version. Tom Copeland posted a very simple Capistrano script for mod_rails, which essentially does just that and stubs out the usual Rails Capistrano tasks that are no longer necessary in this setup.
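The restart step is simple enough to script yourself. Below is a minimal sketch in plain Ruby; the helper name restart_passenger_app is made up, and it assumes only the tmp/restart.txt convention described above.

```ruby
require 'fileutils'

# Touches tmp/restart.txt under the given Rails root so that Passenger
# reloads the app on the next request. Returns the path of the file.
def restart_passenger_app(app_root)
  tmp_dir = File.join(app_root, 'tmp')
  FileUtils.mkdir_p(tmp_dir) # make sure tmp/ exists
  restart_file = File.join(tmp_dir, 'restart.txt')
  FileUtils.touch(restart_file) # creates the file or updates its mtime
  restart_file
end
```

Calling restart_passenger_app('/path/to/app') at the end of a deployment would be enough in this setup; the path is of course just a placeholder.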

Yesterday, DreamHost announced their support for mod_rails. I had played with Rails on DreamHost several years ago (back when FCGI was still the generally accepted way to run Rails apps), but ultimately gave up on this because of the frustrating experience (performance, stability, and ease of deployment wise). Since then, VPS hosting services (such as SliceHost) have become the prevalent solution for hosting Rails apps. But with DreamHost officially supporting mod_rails, I figured I’d give this a spin to see how well it works in practice.

I am happy to say, it seems to work just as advertised! In order to test Rails on DreamHost, I downloaded the popular Mephisto Blog Application, unzipped it into a directory on my DreamHost account, and configured the database settings (I didn’t even bother with MySQL and opted for Sqlite3 for the purpose of this test). I then went into my domain’s settings on the DreamHost Web Panel, checked the “Ruby on Rails Passenger (mod_rails)” checkbox, and pointed to my Mephisto directory’s public subdirectory as the web directory for my domain (this is important, as the web directory defaults to yourdomain.com, without the /public that mod_rails expects).

A minute or two later, my changes had been applied and I was greeted by the Mephisto blog when I hit my domain in the browser. I configured my blog’s settings and entered some dummy articles, and found the performance to be very snappy — no different from PHP apps that I am hosting at DreamHost (such as this WordPress blog).

I think this is pretty exciting. Sure, there are many other cost-effective options to deploy Rails apps these days (such as the unique and highly promising Heroku or a cheap $20 VPS slice on SliceHost), but for a personal blog or another small, reasonably low-traffic website (such as the 12 or so random Rails apps all of us are concurrently working on and too cheap to spring for VPS hosting, since most of them will never go anywhere), having the option to easily deploy these on a shared hosting account is great.

Now I am hoping that mod_rails will be extended beyond just Rails to support any Rack compliant Ruby web framework, such as Merb or Sinatra.

Twitter / Ruby on Rails FUD

Thursday, May 1st, 2008

Earlier today, TechCrunch’s poorly researched claim that Twitter is abandoning Ruby on Rails in favor of PHP or Java generated a lot of buzz in the Twitter and Ruby communities (the claim was later refuted by Twitter developer Evan Williams).

Of course, the article’s comments attracted the usual, ignorant TechCrunch trolls. Most took the opportunity to pitch their framework of choice (such as PHP, Java, .NET, or Django), which they claimed would of course magically solve all of Twitter’s scalability issues.

I have to say I am appalled at this level of ignorance. People just don’t seem to realize that Twitter is a complex messaging application and that the front-end is only a relatively small aspect of it. Even if one particular front-end technology happens to be faster than another one (and admittedly Rails, as much as I like it, is not the fastest technology out there), this fact is bound to be negligible compared to the real challenges in scaling the back-end, starting with the database (trust me, like many developers I’ve learned this the hard way ;) ). Even for a typical web application (which Twitter is not), there are many performance improvements that can be implemented at that level (such as leveraging database replicas to separate writes from reads, or utilizing Memcached to cache queries and other data), all of which can be applied equally well to any front-end framework.
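As a small illustration of the caching point, here is a read-through cache sketched in plain Ruby. It is only a toy: a Hash stands in for Memcached, the class name is made up, and there is no expiry or invalidation, which is where the real difficulty lies.

```ruby
# Toy read-through cache; a Hash stands in for a Memcached client.
class ReadThroughCache
  def initialize(store = {})
    @store = store
  end

  # Returns the cached value for key, or runs the block (for example an
  # expensive database query), stores its result, and returns it.
  def fetch(key)
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end
end
```

Nothing in this pattern depends on the front-end framework. The same idea applies whether the caller is a Rails controller, a PHP script, or a Java servlet.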

I’m not saying that it does not make sense to consider other technologies (there might very well be a breaking point at which it makes sense to evaluate Java or even rewriting parts of the system in C/C++), but in my opinion this should be considered a cost-savings measure when the application reaches a scale at which the cost of hardware far outweighs any savings due to increased developer productivity (think Google), and not a magic bullet for solving fundamental scalability issues (performance != scalability!)

One of the real difficulties in scaling Twitter lies in the fact that all Twitter hits are completely personalized and need to return fresh data, making it difficult to fully leverage caching. Also, since Twitter is a social application and the returned data is generated by each user’s social graph, there is no straightforward way to shard the database by user, as one might be able to do in a typical e-commerce or enterprise application (or pretty much any non-social app…). Without knowing more about Twitter’s internal architecture and their actual profiling results, it would be foolish of me to make any concrete recommendations — particularly silly ones like “Use technology XYZ, it will magically solve all your problems!” Too bad many of the developers out there don’t seem to realize this…

Gelato with DHH

Sunday, April 20th, 2008

I just got back from an ice cream social with Ruby on Rails creator David Heinemeier Hansson (aka DHH). I had come across his blog post about this casual event last week and figured I’d make my way over there, particularly since the location (Michael’s Gelato & Cafe) in downtown Palo Alto is only 5 minutes from my house.

There were probably about 20 other people (mainly Rails developers) at the event, and since the space was pretty small and we didn’t have a dedicated room, it was initially difficult to actually speak with David. But towards the end of the evening the group started getting smaller, which made it easier to participate in the conversation.

In terms of what David had to say, I did not catch too many noteworthy items that are not already known in Ruby on Rails circles. But a few things seem to be key to understanding both the history and the future of Rails:

  1. David created Rails to solve his specific problems. It wasn’t meant to directly solve every possible problem out there, and if there are useful features missing from it that simply means that he (or presumably the rest of the core team, now that it has become a larger community project) has not needed that particular functionality, not that it would not be useful.

  2. Rails is (as we all know) a very opinionated framework. It makes a lot of reasonable assumptions about various things (such as the names of “id” columns), which enables developers to accomplish a lot with very little code. However, all of these opinions are purely about the internals of the system and therefore relevant for developers, but never affect the end user in any way. David is strongly against including any features in Rails that would result in any default application flows or otherwise affect the actual behavior of the application. That’s why Rails still does not (and probably will never) include a built-in authentication system.

Otherwise, he seems pretty indifferent about a lot of the details about how people are working with Rails. For example, Ruby and Ruby on Rails’ performance was good enough for him several years ago, so this area has never been too much of a concern of his. He also doesn’t feel strongly about using Test::Unit vs. RSpec. In his opinion, Test::Unit is good enough, and actually easier to understand for someone who does not have much experience with testing, whereas RSpec probably has more of a learning curve (and he did not seem too excited about the particular syntax). He felt that the most important thing that BDD and RSpec brought to the table is the “should” keyword, and that changing Rails to allow this in test method names made a big enough difference in readability and helped enforce the convention of using a single assertion for each test case. (I actually was not aware that you can use “should” in Rails test names; I’ll have to look into this.)

I probably missed a bunch of other nuggets of wisdom because the room was a bit noisy and crowded, but I’m sure that all of these have been posted on various blogs and mentioned in various keynote speeches before.

Anyway, it was nice to meet David in person. He seems like a great guy, and I definitely appreciate that he took the time to meet with a small fraction of his user base tonight. It’s refreshing to work with technologies that are owned and driven by real-world people like this, rather than large corporations that have lost touch with their user base.

github – Git Repository Hosting

Tuesday, March 11th, 2008

The distributed Git version control system has been gaining a lot of traction lately. In contrast to traditional, centralized version control systems like CVS or Subversion, Git enables developers to easily fork each other’s repositories, pull patches from each other, etc. Even though I have not had a chance to use Git beyond pulling down some open source projects (such as Merb), it is clear that this enables much richer collaboration and development dynamics.

In particular, github seems to have quickly become a favorite service for hosting Ruby projects. github is still in invite-only beta, but it already looks very impressive. For example, forking an existing repository that is hosted on github only takes a single mouse click, and sending a pull request to the master repository or other forks is just as easy. Each repository has an RSS feed and a wiki, which turns github into a full project hosting service as opposed to just a version control service. In fact, github has been described as a social network for hackers. In addition, github offers convenient features such as tarball downloads.

The developers had announced that public open source projects would always remain free. Today, github unveiled the detailed pricing plans. Overall, the plans seem very reasonable. Each plan includes unlimited public repositories and public collaborators, but limited disk space and a limited number of private repositories and private collaborators (depending on the plan). Personally, I would have hoped for the smallest commercial plan (Micro, which includes 5 private repositories and a single private collaborator) to cost less than $7, perhaps $3 – $5. Many developers that work on non-open-source projects in their spare time would presumably be interested in such a plan, but $7 seems slightly steep, given that you can get a DreamHost hosting plan for $10 per month, which supports Subversion, website hosting, shell access, and, with some effort, even Git hosting. Obviously the $7 github plan offers a significantly better user experience, but I am not sure if I could justify the expense as a small developer working on non (or not yet, anyways) profitable projects in their spare time.

But ultimately github’s main focus is clearly on open source projects, and it sure is great to have a modern alternative to Subversion-based project repositories like SourceForge or RubyForge. In fact, I went ahead and created my own forks of the various Merb sub-projects, in case I want to play with some Merb changes or submit additional patches. I also have some ideas (or even unfinished code) for other projects that I might end up moving to github as open source projects.

I believe that one of the smaller commercial plans would also be an excellent alternative for a small startup that wants to avoid the hardware and IT headaches of dealing with its own setup in the initial phases. In fact, by combining github with a hosted issue tracking service like Lighthouse, a VPS hosting service like SliceHost, and Google Apps for email, calendaring, document collaboration, and now also Google Sites for a wiki or intranet, it should be much easier to bootstrap a new company without a massive IT investment.

I have a few github invites, so leave a comment if you’d like one.

iPhone SDK

Friday, March 7th, 2008

This morning, Apple finally announced details about the long-awaited iPhone SDK. The SDK is available for download starting today, and it sounds pretty intriguing!

It comes with a full application stack that includes a custom version of Cocoa (Cocoa Touch), and APIs to access all the iPhone-specific features, including hardware-accelerated 3D graphics, the accelerometer, camera, contacts, as well as location-awareness. This sounds a lot more complete than I expected.

The tools side is also covered very well: the SDK includes a custom version of the Xcode IDE, Interface Builder, Instruments (a profiler), and an iPhone Simulator, so you should be able to develop iPhone apps using essentially the same toolset as for regular OS X applications.

As expected, Apple will control the distribution for all iPhone applications through their App Store. Applications can be purchased and downloaded right from the phone, or via iTunes. Developers get complete control over the pricing, and Apple offers a 70/30 revenue split (70% for the developer), which seems pretty reasonable. Developers will also be able to make their applications available for free, in which case Apple takes no cut. It sounds like Apple will generally allow any application to be published, but there are some (mostly obvious) limitations (such as pornography, VOIP over cellular connections, or anything illegal).

The SDK itself is available for free on Apple’s developer website, although you have to register as an iPhone developer. However, in order to be able to submit applications to the App Store, developers need to join the $99 iPhone Developer Program. Apparently the developer program will initially be available on a limited basis, but I assume that most developers will be accepted into the program later. Since the App Store is not scheduled to launch until June, this does not seem like a major issue, and in the meantime the iPhone Simulator that is included in the SDK looks to be a pretty exact copy of the actual iPhone, so this will have to do.

Users will need a firmware upgrade in order to run custom applications. I assume that this will be released around the same time as the App Store (although hopefully a bit sooner, for testing purposes). iPhone users will be able to upgrade for free, but iPod Touch users (like myself) will be charged a nominal fee (the exact amount has not been announced), supposedly due to differences in the way Apple accounts for the iPod Touch, since it does not come with a subscription plan.

The SDK download proved to be a major pain, since Apple’s website was hopelessly overloaded and it was impossible to even get the download to begin. I eventually got lucky and found the SDK on BitTorrent. Apple really should have made the 2GB download available on BitTorrent in the first place…

Anyway, I’m planning to play with the SDK a bit over the next few weeks or so. One thing I was hoping for is that the SDK might include support for RubyCocoa, since I’m not all that thrilled about the prospect of using Objective-C (or any C-derived language for that matter…), but based on my initial research things aren’t looking promising. Ruby is supported on jailbroken iPhones, so it seems like it should be feasible to support RubyCocoa as well. This will be the first thing

Oh, there were a few cool game demos as well, including an iPhone version of Spore! I am most interested in building my own iPhone apps, but I am also excited to see what kind of applications and games will be released by third parties. I am currently using several useful applications on my jailbroken iPod Touch (such as an ebook reader and a unit converter), and it seems like it should be reasonably straightforward to port these to the official SDK, so I bet that most of these applications will be available soon, either for free or for a moderate price.

For more details, check out the following links:

Ola Bini on JRuby

Thursday, February 28th, 2008

Today, I attended a tech talk by ThoughtWorks’ Ola Bini on JRuby (see the JRuby Wikipedia entry for background).

I keep hearing great things about JRuby (even Matz has good things to say about it), and it’s nice to hear this from the horse’s mouth as well (Ola is one of the JRuby core developers). Apparently the latest version is highly compatible with Ruby 1.8.6, although applications that use native C extensions are not supported. It’s also impressive how far JRuby has come in terms of performance. Until about a year ago, it was still significantly slower than the standard Ruby 1.8 interpreter. Not only have they caught up since then, but the latest JRuby version running on the latest JVM (in server mode) is actually several times faster than the Ruby 1.9 interpreter (aka YARV), which in turn is several times faster than Ruby 1.8 (apparently up to 50x in benchmarks, but more like 1.5x on average). That’s quite an achievement!

The tight integration between JRuby and Java is very impressive. You can explore this by using jirb, JRuby’s version of irb. In this console, you can use all the standard Ruby classes and constructs, as well as Java classes and libraries (Ola demoed this by creating a Swing window and a button, with a listener implemented in Ruby). It really seems quite seamless and powerful.
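To give a rough idea of what such a session looks like, here is a minimal sketch along the lines of the demo (not Ola's actual code). It assumes it is run under JRuby with a display available; the `describe_click` helper and the window and button labels are made up for illustration. Under a regular Ruby interpreter, or on a headless machine, it just prints a note and skips the Swing part.

```ruby
# Plain Ruby helper (hypothetical name) that produces the button label.
def describe_click(count)
  "Button clicked #{count} time#{count == 1 ? '' : 's'}"
end

if RUBY_PLATFORM == "java"
  require "java"

  if java.awt.GraphicsEnvironment.isHeadless
    puts "No display available, skipping the Swing demo."
  else
    java_import javax.swing.JFrame
    java_import javax.swing.JButton

    frame  = JFrame.new("Hello from JRuby")
    button = JButton.new("Click me")

    # The Ruby block is converted into a java.awt.event.ActionListener.
    clicks = 0
    button.add_action_listener do |_event|
      clicks += 1
      button.text = describe_click(clicks)
    end

    frame.default_close_operation = JFrame::DISPOSE_ON_CLOSE
    frame.content_pane.add(button)
    frame.pack
    frame.visible = true
  end
else
  puts "This sketch needs JRuby; skipping the Swing demo."
end
```

The same lines can be typed one by one into jirb. Note how the Java setters show up as Ruby-style attribute assignments (`frame.visible = true`) and how the listener is just a block.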

Due to time constraints, Ola did not talk much about JRuby and Rails (I suppose his JRuby on Rails book would describe this in sufficient detail), but it sounds like it is pretty well supported by now. In fact, ThoughtWorks is shipping Mingle, a shrink-wrap agile project collaboration tool built in Rails and deployed on JRuby. Ola also cited several other large-scale JRuby deployments at Oracle and Sun. In many cases, using JRuby proved to be a way in the door for development teams that wanted to use Ruby but whose IT organizations had standardized on Java and were not prepared to support another runtime.

Ola mentioned the GlassFish gem as a convenient way to deploy Rails applications using JRuby, although he recommended using it mainly for development and testing rather than for production. There are other tools that create a WAR file for a Rails application, which can then be deployed on any standard Java application server. I did not catch the names of the tools that Ola mentioned, but Goldspike appears to be one of them.

About 2 1/2 years ago, I prototyped a scripting engine that ran within the Java-based backend for our multiplayer mobile game. At the time, I dismissed JRuby in favor of Groovy and BeanShell (we tried both to see which one we liked better), because JRuby was by far the slowest scripting language to run in Java. If I were making that choice today, there’s no doubt I would strongly consider using JRuby instead.

It’s weird… somehow Java still feels like a heavyweight system to me, with its compiler, JVM, app servers, etc. But in reality, JRuby is now not only faster than the native Ruby interpreter, but also has a smaller memory footprint. And because it supports proper threads (native Java threads), you should actually have even better opportunities for optimizing the deployment of your Ruby web applications (rather than having to juggle a bunch of Mongrels…)
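As a small illustration of the threading point, here is a sketch of handling several requests concurrently inside one process instead of spreading them across several server processes. `handle_request` and `serve_concurrently` are hypothetical names, and the `sleep` merely stands in for real work. The code runs on any Ruby; the difference is that on JRuby each `Thread` is backed by a native Java thread.

```ruby
# Hypothetical request handler; the sleep simulates waiting on a database.
def handle_request(id)
  sleep(0.05)
  "response #{id}"
end

# Start one thread per request, then collect the results in request order.
def serve_concurrently(request_ids)
  threads = request_ids.map do |id|
    Thread.new(id) { |request_id| handle_request(request_id) }
  end
  threads.map(&:value) # Thread#value joins the thread and returns its result
end

puts serve_concurrently([1, 2, 3, 4]).inspect
```

A real deployment would use a bounded thread pool rather than one thread per request, but the basic idea is the same: one process, many requests in flight.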

I will definitely have to take a closer look at JRuby and perhaps try it out with Merb. According to Ola, Merb is supported even though it does use some native C extensions. Sweet!

Update: Ola’s JRuby tech talk is on YouTube now.