I’m not by any stretch of the imagination an early adopter of technology. Nor am I (mostly) an old stick-in-the-mud who thinks that all programming should be done in COBOL. I love new technology but prefer other people to work out the bugs and niggles first, so that I can rely on it as a tool.

So it is with some pleasure that I’ve decided to take the plunge and begin studying the Ruby programming language. Ruby has been around for 15 years now so the foundations are solid. Many rave about its simplicity, utility and natural-looking syntax.

My goal is to learn the language by porting an existing application (the food journal written in PHP that I recently used to play with Zend Framework) first to the Ruby language, then to the web application framework Rails.

As usual, I embark on this project by reading a manual!

Update 12-Jul-2009: Browse the code on Github

Reading the Ruby “Pickaxe” Book

Ruby has several extant versions. The book I’m reading relates to v1.6 of the language whereas the current stable release is v1.8.x. There’s also a bleeding-edge v1.9.x. I probably should read the v1.8 book but it isn’t obviously available for free. Going on the principle that I can learn the basics then adopt the changes, I stick with the older, free manual.

Update: thanks to Lucy I was able to borrow a paper copy of the Ruby 1.8 book from the university library.

Some Observations

Clean, Terse and Expressive

It’s a very clean-looking yet expressive language. I can immediately see that my preference for terseness is going to be satisfied.

“Ruby has a simple definition of truth”

Zero is not false?! Making it true. “Simple definition of truth” indeed. I gotta try this! If that’s the case, it’s a major mind-bender when switching between languages.

# "simple truth"? I don't think so :)
[nil, false, 0, 1, ''].each { |x| puts "#{x} is true" if x }
>ruby simple-truth.rb
0 is true
1 is true
 is true

OUCH!! Definitely something to be aware of going forward. Committed to Git as my first ever Ruby program.
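To spell the rule out: only nil and false count as false in a condition, and everything else (including 0, the empty string and an empty array) counts as true. Here is a minimal sketch of that rule. The truthy? helper and the sample values are made up for illustration; they are not part of the food journal code.

```ruby
# Hypothetical helper for illustration: reports how Ruby treats a value
# when it is used as a condition.
def truthy?(value)
  value ? true : false
end

[nil, false, 0, 1, '', []].each do |x|
  puts "#{x.inspect} => #{truthy?(x)}"
end

# When a PHP-style "is it zero or empty?" test is what you mean, ask explicitly.
count = 0
name = ''
puts 'count is zero' if count.zero?
puts 'name is empty' if name.empty?
```

Coming from PHP, the explicit zero? and empty? calls are probably the safest habit to adopt when porting conditions.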

Ultimate Power?

Ruby appears to support meta-programming at every level. Methods can re-define themselves. Objects can be assigned classes on-the-fly. With great power comes great responsibility, and some debugging nightmares, methinks.
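A small sketch of what that looks like in practice, assuming nothing beyond core Ruby. The Greeter class and Shouting module are invented for illustration. The first half shows a method replacing itself after its first call. The second half shows the closest thing I know of to giving an object a new class on the fly: extending a single object with a module.

```ruby
class Greeter
  # The first call does the "expensive" work, then replaces this method
  # with a version that simply returns the remembered result.
  def greeting
    text = 'hello'.upcase # stands in for expensive work
    self.class.send(:define_method, :greeting) { text }
    text
  end
end

# A module mixed into one object only, changing its behaviour at run time.
module Shouting
  def greeting
    super + '!'
  end
end

g = Greeter.new
puts g.greeting      # original method runs and redefines itself
puts g.greeting      # redefined method runs

loud = Greeter.new
loud.extend(Shouting) # only this object gains the Shouting behaviour
puts loud.greeting
puts Greeter.new.greeting
```

Other Greeter objects are unaffected by the extend call, which is exactly the sort of thing that could make debugging interesting.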

Tainting and Safety

External data is considered “tainted” by default and the interpreter’s aggression level can be configured on-the-fly. It is possible to create highly restrictive “sandboxes” in which code can only interact with a set of pre-existing objects.

Installing Ruby 1.8.6 on Windows

I downloaded and installed the Ruby 1.8.6 one-click install package. I had already installed the Debian packages on Linux, but as I’ve been having trouble getting Linux working on my laptop, I’ve decided to stay with Windows for the time being.

Hey, it even comes with an editor! That’s handy. I usually use Crimson Editor on Windows, but as it’s long been unmaintained I’m happy to put some effort into learning a new one.

The installation was painless and I was up-and-running with the editor and a first Ruby program within 10 minutes. The editor is a bit clunky but operable. I’ll stick with it for now.

Planning for the Food Journal Port

Like last time, I intend to port the logic of the application first, then the presentation. To that end I’ll need a library to access Flickr and Picasa.

Accessing Flickr

Flickraw satisfies this requirement, and looks comprehensive to boot, unlike the disappointing Zend Framework component.

Accessing Picasa

This is more complex. Ruby-Picasa promises to be super-simple but has extra library requirements that I don’t quite understand, whereas picasaweb hasn’t been maintained since 2007. I think for simplicity, I’ll just use REXML to access Picasa by hand for the time being.

Generating Output

The food journal uses the Smarty templating engine for generating dynamic output. I’m sure there are many available templating systems for Ruby but I’ll look later, when I’m ready to generate some.

Oh, wait, look! There’s Canny – a Smarty-like templating system for Ruby. That’ll do.

Getting Started

As I’m using Windows, I need to make Ruby available as a CGI handler to XAMPP. I found these instructions, followed them and had the test page up and working in no time.

Next, I clone the food journal Git repo into the htdocs area of my XAMPP installation. A couple of minor tweaks later and the application is working – not a surprise as it’s still PHP :)

I intend to work from the inside out i.e. port the internal classes and logic first, then connect up the photo provider adapters and finally the user interface. As this is my first port to Ruby, I intend to do it entirely the wrong way without caring too much. Each file I port will be renamed from .php to .rb, the PHP code commented out and the equivalent Ruby code added under the comments. I can refactor it to be more Ruby-ish later.

Porting the User Class

Ooooh, that was relatively painless. The similarity between PHP and Ruby is such that this basic class ported almost line-for-line, the Ruby code being more compact.

Porting Some Other Classes

21 June 2009 @ 14:15 Sorry for the lack of updates. I have been gradually porting other food journal classes to Ruby and I’ve reached a pretty good point. I got enough of a feel for the language that I started doing things more Ruby-ish than I had originally intended. It’s just so tempting. Writing one “line” of Ruby code to replace several actual lines of PHP code appeals to my sense of minimalism and aesthetics.

Here’s an example, taken from the Flickr class, which is responsible for downloading user and photo data from the popular photo sharing website. The getPhotosetID method finds the ID of the user’s food journal photoset, which must be done by scanning through all the photosets and selecting the first one with the title “Food Journal”.

First the PHP code. Admittedly, this could be made much more compact, but it is what I had to start with.

private function getPhotosetID($userid) {
	$params = array("user_id"=>$userid);
	$results = $this->callFlickrAPI("flickr.photosets.getList",$params);
	$photosets = $results['photosets']['photoset'];
	$count = count($photosets);
	for ( $i = 0; $i < $count ; $i++ ) {
		if( strcasecmp($photosets[$i]['title']['_content'],'food journal') == 0 ) {
			return $photosets[$i]['id'];
		}
	}
}

And now the Ruby version I wrote.

def getPhotosetID( userid )
  notfound = lambda { raise 'No food journal photoset' }
  callFlickrAPI(
    "flickr.photosets.getList", { 'user_id' => userid }
  )['photosets']['photoset'].detect( notfound ) {|photoset|
    0 == photoset['title']['_content'].casecmp('food journal')
  }['id']
end

Unlike the PHP interpreter, the Ruby interpreter is capable of applying a hash key lookup to the result of a function call. The detect method returns the first element for which the predicate block returns true and calls the supplied function (notfound here) if no elements satisfy the predicate.
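To see detect and its fallback in isolation, here is a self-contained sketch that needs no Flickr connection. The photosets array is made-up sample data shaped like the API result used above, and photoset_id is a hypothetical stand-in for getPhotosetID.

```ruby
# Made-up sample data in the same shape as the Flickr result above.
photosets = [
  { 'id' => '101', 'title' => { '_content' => 'Holiday' } },
  { 'id' => '102', 'title' => { '_content' => 'Food Journal' } }
]

# Called by detect only when no element satisfies the block.
not_found = lambda { raise 'No food journal photoset' }

# Hypothetical stand-in for getPhotosetID, minus the API call.
def photoset_id(photosets, title, ifnone)
  photosets.detect(ifnone) { |set|
    set['title']['_content'].casecmp(title) == 0
  }['id']
end

puts photoset_id(photosets, 'food journal', not_found) # prints 102

begin
  photoset_id(photosets, 'missing', not_found)
rescue RuntimeError => e
  puts "error: #{e.message}"
end
```

Without the fallback, detect would return nil for a missing photoset and the trailing ['id'] lookup would fail with a much less helpful NoMethodError.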

I’m liking this language, even more so because I know that what I’ve done so far can be improved!

Time For Some Output

I now have sufficient pieces in place to think about producing some web output. After investigation, I’ve decided not to use the Smarty-like library I previously found as it has not been maintained since 2004. Instead I’ll be using Amrita2 – a pure XML-based templating system that is available as a gem.

Working with Amrita2

Oooow… my head hurts! All the documentation I can find is in the form of test cases because the project’s homepage is down. There’s no direct equivalent to Smarty’s include feature (although macros should work). However, after a bit of messing about I manage to output the food journal homepage, including the dynamic list of links to users’ food journals.

Ditching Amrita2 In Favour of Tenjin

22-Jun-2009 @ 19:30 Amrita2 is too much like hard work and doesn’t obviously support the features I need. I Googled for ruby template engine and found Tenjin which looks to support everything I want and more. Specifically, it supports file inclusion and layouts. Let’s see…

20:20 Tenjin is delightful! A couple of teething troubles passing parameters to included templates but I get the output I want in next-to-no-time. Fast and functional. Wow!

I notice that Tenjin is leaving cache files in the templates directory. That’s fine for now but they’ll have to be stored elsewhere if I move this code into production.

Porting the Photostream Functions

28-Jun-2009 @ 10:40 Once I had Tenjin going it was a snap to port the simple templates, leaving only the difficult bits (as usual). Yesterday I started porting the photostream functions and hit a snag. There’s a neat PHP feature that Ruby doesn’t directly implement – automatic multi-dimensional array construction.

$navigation[$year][$weekno]['count']++;

Ruby can directly represent this data structure but doesn’t have an equivalent syntax. PHP automatically creates the array keys if they don’t exist, and the “++” at the end does “the right thing” such that if the “count” key doesn’t exist its value is considered to be zero then incremented, otherwise it is incremented in-place. Trying this in Ruby with an empty hash results in an undefined method call exception.

navigation = {}
navigation[2009][20]['count'] += 1
#> NoMethodError: undefined method `[]' for nil:NilClass

This makes sense. The “navigation” hash does not have a key “2009” and so accessing it returns nil. The attempt to access key (or array element) “20” is a call to the “[]” method which doesn’t exist on NilClass.

I figured out a workaround. First I Googled for php-like array syntax in ruby which was discouraging but yielded this mailing list post and the following code.

navigation = {}
navigation[year] = {} if navigation[year].nil?
navigation[year][weekno] = {} if navigation[year][weekno].nil?
navigation[year][weekno]['count'] = 0 if navigation[year][weekno]['count'].nil?
navigation[year][weekno]['count'] += 1

This is essentially what PHP is doing under the covers, but 4 lines of code for a simple multi-dimensional array assignment offends my sense of aesthetics, specifically DRY. So I read over the book and discovered an interesting Ruby-ism – the ||= operator. This allows me to perform the entire assignment on one line, albeit a fairly ugly line.

(((navigation[year] ||= {})[weekno] ||= {})['count'] ||= 0) += 1

Ah, not quite! The interpreter can’t handle the increment at the end, leaving me with a compromise…

((navigation[year] ||= {})[weekno] ||= {})['count'] ||= 0
navigation[year][weekno]['count'] += 1
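Another route, which is not what the port uses, is to give the hash a default block so that missing levels are created on first access. This is a sketch under that assumption, with the year and week numbers as sample values.

```ruby
# Two levels are created on demand; the innermost hash treats
# missing keys as 0 so that += works on first use.
navigation = Hash.new do |years, year|
  years[year] = Hash.new do |weeks, weekno|
    weeks[weekno] = Hash.new(0)
  end
end

navigation[2009][20]['count'] += 1
navigation[2009][20]['count'] += 1
navigation[2009][21]['count'] += 1

puts navigation[2009][20]['count'] # prints 2
puts navigation[2009][21]['count'] # prints 1
```

The catch is that merely reading a missing year or week creates an empty entry for it, so existence checks should use key? rather than indexing.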

Deeper Appreciation of PHP Arrays

As I get deeper into the porting effort, I’m realising that PHP arrays are the Swiss Army knife of data structure modelling tools, offering great power, flexibility and ease of use albeit at a high run-time cost. PHP arrays are ordered maps: hybrids of numerically-indexed arrays and hashes that preserve insertion order for integer and string keys alike. Ruby 1.8 hashes are unordered, making them less suitable for the task at hand.

Not to worry though. For now, I can solve the hash ordering issue at presentation time then refactor to use a combination of the rbtree gem and arrays.
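Sorting at presentation time can be as simple as the sketch below. The counts hash is made-up sample data (week number to photo count) and in_key_order is a hypothetical helper, not something from the food journal code.

```ruby
# Made-up sample data: week number => photo count, inserted out of order.
counts = { 27 => 3, 25 => 1, 26 => 7 }

# Hash#sort returns an array of [key, value] pairs ordered by key,
# so the hash itself can stay unordered until it is rendered.
def in_key_order(hash)
  hash.sort
end

in_key_order(counts).each do |weekno, count|
  puts "week #{weekno}: #{count} photos"
end
```

This only helps when key order is the order wanted on the page; anything that must start somewhere other than the smallest key needs a different structure.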

Continuing The Photostream Porting

29-Jun-2009 @ 18:50 I’m determined to get to the point where I can run the entire application in a web browser. I’m down to porting the most complex of the templates and it’s heavy going.

20:30 Heavy lifting done. The day and week journals produce output, albeit a bit oddly due to the unordered hash issue. I’ll address that another day though.

Web Output

2-Jul-2009 @ 18:20 Now that the heavy lifting is done I’m going to switch from command-line testing to web-based. Rather than use a global variable to hold the CGI object, I’m going to play with Ruby’s mix-in feature.

19:10 I couldn’t mix-in the CGI class so I used inheritance. Then it was a hassle figuring out why the script couldn’t load gems when executed in the server environment. Figured that out and got some output.
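The reason, as far as I can tell, is that include accepts only modules, and CGI is a class. The sketch below shows the difference using stand-ins: Greeting, Page, Base and Journal are invented names, and String stands in for CGI so that the example has no library dependency.

```ruby
# A module can be mixed in...
module Greeting
  def greet
    "hello from #{self.class}"
  end
end

class Page
  include Greeting
end

puts Page.new.greet

# ...but a class cannot. String stands in for any class, such as CGI.
begin
  Class.new { include String }
rescue TypeError => e
  puts "cannot mix in a class: #{e.message}"
end

# Inheritance is the route that works for a class.
class Base
  def params
    { 'user' => 'alice' } # made-up request data
  end
end

class Journal < Base
  def user
    params['user']
  end
end

puts Journal.new.user
```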

Picasa Adapter Port

4-Jul-2009 @ 12:00 Now on to something more meaty! Ruby provides as part of the standard library REXML – a pure Ruby implementation of a non-validating XML 1.0 processor and the XPath 1.0 query language. The food journal Picasa adapter uses quite complex XML and XPath processing to extract photo information from Atom feeds. The PHP XML implementation is based on libxml – a C library – so I’m not expecting much in the way of performance from the Ruby one but I’m going to do some profiling once I’ve finished and verified the port.

13:00 Hmm… I always wondered why the feeds coming from Picasa didn’t contain a lot of information about the photos, specifically the timestamps weren’t coming back as data. Turns out I was using the wrong projection: base instead of api. I didn’t read the Picasa API manual properly. Now I can do more and retrieve less data!

14:00 Picasa adapter done. It’s bloody slow but I was expecting that. Anyway, the food journal port is almost functionally complete. Just some tidying and bug fixing remains.

Tidying and Bug Fixing

First and foremost I must fix the chaotic order of the hourly display. It is supposed to have daybreak at the top and progressively later times down the page. Right now the hours are all over the place. I can’t quick-fix by simply iterating over the sorted hash keys as that would always have midnight at the top. So, instead, I have to refactor how the data is produced – going back to a simple array starting at daybreak and making the rendering of the real hour a concern of the presentation.
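The arithmetic behind an array that starts at daybreak is just modulo 24. This is a sketch of the idea rather than the code in the port; hours_from and slot_for are hypothetical names, and a daybreak of 6 is an example value.

```ruby
# Real hours of the day in display order, starting at the daybreak hour.
def hours_from(daybreak)
  (0...24).map { |offset| (daybreak + offset) % 24 }
end

# Index (0..23) in the display array for a real hour of the day.
def slot_for(hour, daybreak)
  (hour - daybreak) % 24
end

p hours_from(6).first(3) # prints [6, 7, 8]
p hours_from(6).last(2)  # prints [4, 5]
p slot_for(2, 6)         # prints 20
```

Ruby’s % always returns a non-negative result for a positive divisor, so hours before daybreak wrap round to the end of the array without any special casing.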

5-Jul-2009 @ 11:15 I completed the refactoring of the photostream storage to arrays this morning and it appears to work well, with the added bonus of being far less code. I then added back the daybreak navigation drop-down, noting along the way that the Ruby CGI library disappointingly doesn’t support generating XHTML elements, despite a patch to do so having been submitted in January 2006.

The next tidying effort is to eliminate the use of Unix timestamps within all the date and time handling code. Ruby has full-fledged support for dates and times without having to mess about with “the count of seconds since epoch”.

The Problem With Time

… is that it moves forward but from different starting points depending on where and when you are in the world. Time zones are a programmer’s worst nightmare and aren’t made any easier to deal with using Ruby’s built-in libraries. For example, Google returns the time a photo was taken as the count of milliseconds since 1st January 1970, but gives no indication as to whether that is adjusted for a time zone. I can empirically determine that it isn’t adjusted, but now I need to tell the Ruby libraries about this fact. I can create an unadjusted time using Time.at(ts).gmtime but now I have a problem – I don’t want a Time object, I want a DateTime object. There doesn’t appear to be any efficient built-in means to coerce between the two types. I Googled a solution though.
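For comparison, here is a sketch of the same conversion on a newer Ruby than the 1.8.6 used here, where (as far as I know) requiring the date library adds Time#to_datetime. It is not the solution I found, and the timestamp is a made-up sample value.

```ruby
require 'date'

# Made-up sample value: milliseconds since 1 January 1970, unadjusted (UTC).
timestamp_ms = 1_246_752_000_000

# Integer division drops the milliseconds; utc marks the Time as unadjusted.
taken_at = Time.at(timestamp_ms / 1000).utc

# Time#to_datetime is provided by the date library on newer Rubies.
as_datetime = taken_at.to_datetime

puts taken_at.strftime('%Y-%m-%d %H:%M:%S')
puts as_datetime.strftime('%Y-%m-%d %H:%M:%S')
```

Both lines print the same moment, midnight UTC on 5 July 2009 for this sample value.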

This made me think… why do I want to use a DateTime and not a Time? Well, it’s to do with the formatting function strftime, which is more functional with DateTime than with Time. To wit…

DateTime.now.strftime('%A, %e %b %Y')
#=> "Sunday,  5 Jul 2009"
Time.now.strftime('%A, %e %b %Y')
#=> ""

12:30 Thinking further about this, I decide to pass for now. There are other more functional improvements I can make, such as error handling!

Error Handling

This is fun – feed the application garbage and see how it responds. I know for sure that certain types of error are not currently handled, such as non-existent users and rubbish data.

14:50 Got basic error handling working, but I’m getting bored now.

Final Tidying

12-Jul-2009 @ 13:15 I spent this morning working with the food journal code on Linux, getting it to run under Lighttpd. I then cleaned up the code, mostly removing the embedded but commented-out PHP. That done, I saw that it was mostly good, and uploaded the code to Github.

Final Thoughts

I mostly like Ruby, at least enough to take a step further and port a different application – one with a very rich user interface and a database. Oh yes, it’s time for me to get on the Rails!
