The Zend Framework is a long-established and well-respected set of libraries for building PHP web applications. I’ve not used it before and I’ve recently become interested in learning more about it, so I’ve decided to have a play. Rather than start entirely from scratch, I’m going to re-use the database schema of a famous dive-logging web application and see how much effort is involved in re-implementing the basics: logging in and displaying a page of dives.

More importantly, how much fun is it to work with?

Update 12-Jul-2009: Browse the code on Github

Step 0 – Local Preparation

There’s no point in declaring that I’m playing in public if the result of my playtime is private, so I’ll publish the site next to the blog at http://libertini.net/libertus/zf/. Also, even though I’m playing, I don’t want to risk losing any effort so I’ll work locally in a Git repository and use its push function to send any code I write to the public site.

But first, I need to set up my workstation for web application development. I need PHP (of course), a database server and a web server configured for personal home pages. MySQL and Apache are a bit heavyweight for my needs, so I’ll install Lighttpd (aka “lighty”) and SQLite. Before I do that, it’s due diligence to Google the combination to make sure it’ll serve. No total disasters in the results… so press on.

~> sudo apt-get install lighttpd php5-cgi php5-sqlite

Next comes configuration. Ubuntu’s packages pre-configure Lighty to work with PHP and FastCGI, so all I need to do is enable the appropriate modules and set a test script.

~> sudo lighty-enable-mod fastcgi userdir
~> sudo /etc/init.d/lighttpd force-reload
~> mkdir public_html
~> echo "<?php phpinfo();" > public_html/index.php

Browsing to localhost/~paul shows the expected PHP information page. Too easy. Sometimes I pity those who choose to use Windows. :)

Step 1 – Getting Started With Zend Framework

As usual, I start by reading the manual (and downloading the reference guide). However, the greedy bastards at Zend require me to register before I can download anything. That’s not the spirit, guys! Out of principle, I use GuerillaMail and great profanity to screw with their marketing numbers.

1a Reading The Manual

12-Mar-2009 @ 20:20 I’ve been reading the reference guide (PDF, English, 1.6) for most of the evening. It is certainly comprehensive but inconsistent, poorly edited (missing words, for instance) and employs poor English frequently enough to be more annoying than amusing. It also commits the mortal sin of claiming to have an index without providing one. I’ve got to page 187 of 1132. /me soldiers on…

14-Mar-2009 @ 13:30 Still reading. I noticed that Ubuntu provides a package for ZF but it is a bit old at version 1.5.3 whereas the manual I’m reading is for version 1.6 and the latest downloadable ZF is version 1.7.6. The Jaunty package is newer but I doubt I can use it.

14-Mar-2009 @ 21:30 Finished reading for the day at Chapter 33. The DB component is disappointingly simple. Components such as Form are disturbing. It is crystal clear to me now that the primary purpose of ZF is to lock developers into PHP and discourage them from seeking the correct solution to certain problems – an old Microsoft trick.

15-Mar-2009 @ 08:15 Continuing reading. I’m now minded to use ZF on a different application than I first thought – the food journal. It’ll be interesting to see if I can improve that previously hand-written application. I reckon the following components are pertinent: Model/View/Controller, Paginator, GData_Picasa, Service_Flickr, Cache, Config, Date, Test.

15-Mar-2009 @ 10:00 Nearly done reading. I’m at page 999. Thank goodness for coffee and cigarettes! I’ll soon be able to start the fun part. Amusingly, the chapter on the Translate component has some of the worst English translations in the entire manual, e.g. “The default delimiter for CSV string is the ‘;’ sign. But it has not to be that sign.”

15-Mar-2009 @ 11:15 Yay! I reached the end of the manual. Now to continue getting started by developing a couple of failing environmental tests (assert ZF available and of compatible version), then install and configure ZF such that the tests pass without modification (very important), then get on with the fun stuff. After I play a couple of rounds on Rock Band, that is. :)

1b Preparing The Environment

Assuming that I’m starting from scratch, I need somewhere to build the code for the application. I neither need nor want to care where ZF is installed, only that it is available and of a compatible version. I think that’s best achieved using PHP’s include_path directive which I can set globally and, I hope, specifically for my user environment within Lighty. I’m not concerned with security when working locally so I’ll build the app in my public_html directory for ease of access.

To build and run tests, I need the PHPUnit testing framework package. I’ll also be using that to run the tests supplied with ZF prior to linking it up with my application.

~> sudo apt-get install phpunit

Next I need to build the first test within the application directory structure.

~/public_html> mkdir -p zf-app/tests
~/public_html> cd zf-app/tests
~/public_html> touch BasicEnvironment.php

I have a totally empty test at the moment. How does PHPUnit deal with that?

~/p/z/tests> phpunit .
PHPUnit 3.2.16 by Sebastian Bergmann.
File "..php" could not be found or is not readable.

Eh? Oh… look at the PHPUnit version. The “run tests in directory” facility was added in 3.3. Perhaps I’ve installed the wrong package? Nope… that’s the version supplied with Intrepid and also in Jaunty. I’m getting pretty sick of Ubuntu. I install from PEAR instead.

After building the two basic tests I run them directly. As expected, they fail because class Zend_Version isn’t available. That’s cool. Time to commit.

~/p/zf-app> git init
Initialized empty Git repository in /home/paul/public_html/zf-app/.git/
~/p/zf-app> git add tests/*.php
~/p/zf-app> git commit
Created initial commit abea47a: Basic environment tests
1 files changed, 24 insertions(+), 0 deletions(-)
create mode 100644 tests/BasicEnvironment.php

1c Installing Zend Framework

I’ve got the tarball of ZF 1.7.6 from the website. I want to install it locally, run its tests, setup my local environment to point to it then run my tests, which will pass when I’ve got things right.

~> mkdir libs
~> cd libs
~/libs> tar zxf ~/Desktop/ZendFramework-1.7.6.tar.gz
~/libs> cd ZendFramework-1.7.6/tests
~/l/Z/tests> phpunit AllTests
Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 30720 bytes)
in /home/paul/libs/ZendFramework-1.7.6/library/Zend/Controller/Dispatcher/Standard.php on line 262

BOOM! Now that I didn’t expect. Seems the test suite requires a higher memory limit than the default. Ah, there’s a TestConfiguration.php.dist file that needs to be adapted before I can begin. I’ll relax the memory limit and enable tests for SQLite.

~/l/Z/tests> time phpunit AllTests
PHPUnit 3.3.15 by Sebastian Bergmann.
 
............................................................   60 / 7087
............................................................  120 / 7087
............................................................  180 / 7087
............................................................  240 / 7087
.S..........................................................  300 / 7087
............................................................  360 / 7087
...........................SSSSSS.......................SSSS  420 / 7087
SSSSSSSSSSSSSSSSSS..........................................  480 / 7087
............................................................  540 / 7087
............................................................  600 / 7087
............................................................  660 / 7087
....................................SSS.....................  720 / 7087
.....S.S.SS.................................................  780 / 7087
..............................................S.............  840 / 7087
.....................S......................................  900 / 7087
............................................................  960 / 7087
............................................................ 1020 / 7087
............S............................................... 1080 / 7087
..............................S............................. 1140 / 7087
............................................................ 1200 / 7087
.............SS....S.S...............................S.....S 1260 / 7087
............................................................ 1320 / 7087
............................................................ 1380 / 7087
............................................................ 1440 / 7087
............................................................ 1500 / 7087
..I......................................................... 1560 / 7087
......I......SSSSSSSSSSS.........S.......................... 1620 / 7087
.......................................I.................... 1680 / 7087
............................................................ 1740 / 7087
............................................................ 1800 / 7087
..............................S............................. 1860 / 7087
............................................................ 1920 / 7087
............................................................ 1980 / 7087
............................................................ 2040 / 7087
............................................................ 2100 / 7087
............................................................ 2160 / 7087
............................................................ 2220 / 7087
............................................................ 2280 / 7087
.......I.................................................... 2340 / 7087
............................................................ 2400 / 7087
............................................................ 2460 / 7087
............................................................ 2520 / 7087
..........................I................................. 2580 / 7087
...............I............................................ 2640 / 7087
............................................................ 2700 / 7087
............................................................ 2760 / 7087
............................................................ 2820 / 7087
............................................................ 2880 / 7087
............................................................ 2940 / 7087
............................................................ 3000 / 7087
............................................................ 3060 / 7087
............................................................ 3120 / 7087
............................................................ 3180 / 7087
............................................................ 3240 / 7087
............................................................ 3300 / 7087
............................................................ 3360 / 7087
............................................................ 3420 / 7087
............................................................ 3480 / 7087
............................................................ 3540 / 7087
............................................................ 3600 / 7087
............................................................ 3660 / 7087
....................................................S....... 3720 / 7087
...I....................................................I... 3780 / 7087
...........S..........S..SSSS...............S............... 3840 / 7087
............................................................ 3900 / 7087
............................................................ 3960 / 7087
............................................................ 4020 / 7087
..................................S................S........ 4080 / 7087
............................................................ 4140 / 7087
............................................................ 4200 / 7087
............................................................ 4260 / 7087
...............I............................................ 4320 / 7087
............................................................ 4380 / 7087
............................................................ 4440 / 7087
............................................................ 4500 / 7087
............................................................ 4560 / 7087
............................................................ 4620 / 7087
............................................................ 4680 / 7087
............................................................ 4740 / 7087
............................................................ 4800 / 7087
............................................................ 4860 / 7087
............................................................ 4920 / 7087
............................................................ 4980 / 7087
............................................................ 5040 / 7087
............................................................ 5100 / 7087
............................................................ 5160 / 7087
....................E....................................... 5220 / 7087
............................................................ 5280 / 7087
............................................................ 5340 / 7087
............................................................ 5400 / 7087
............................................................ 5460 / 7087
............................................................ 5520 / 7087
............................................................ 5580 / 7087
............................................................ 5640 / 7087
......................................E.EEEEEEEEEEEEEEEEE... 5700 / 7087
............................................................ 5760 / 7087
................................................S........... 5820 / 7087
..............................S............................. 5880 / 7087
.........S................SSSSSSSS.......................... 5940 / 7087
............................................................ 6000 / 7087
............................................................ 6060 / 7087
............................................................ 6120 / 7087
............................................................ 6180 / 7087
S........................F...I....F......................... 6240 / 7087
...............I............................................ 6300 / 7087
....I....................................................... 6360 / 7087
.....................................S...................... 6420 / 7087
............................................................ 6480 / 7087
............................................................ 6540 / 7087
............................................................ 6600 / 7087
...............I............................................ 6660 / 7087
............................................................ 6720 / 7087
......................................................F..... 6780 / 7087
............................................................ 6840 / 7087
............................................................ 6900 / 7087
............................................................ 6960 / 7087
............................................................ 7020 / 7087
.........I.................................................. 7080 / 7087
.......
 
Time: 04:50
 
There were 19 errors:
 
1) testCreate(Zend_Memory_MemoryManagerTest)
Zend_Memory_Exception: Memory manager can't get enough space.
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Memory/Manager.php:408
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Memory/Manager.php:381
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Memory/Manager.php:287
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Memory/Manager.php:254
 
2) testQueryParser(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
3) testEmptyQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
4) testTermQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
5) testMultiTermQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
6) testPraseQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
7) testBooleanQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
8) testBooleanQueryWithPhraseSubquery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
9) testBooleanQueryWithNonExistingPhraseSubquery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
10) testFilteredTokensQueryParserProcessing(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
11) testWildcardQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
12) testFuzzyQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
13) testInclusiveRangeQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
14) testNonInclusiveRangeQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
15) testDefaultSearchField(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
16) testQueryHit(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
17) testDelayedResourceCleanUp(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
18) testSortingResult(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
19) testLimitingResult(Zend_Search_Lucene_Search23Test)
Uninitialized string offset:  7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
 
--
 
There were 3 failures:
 
1) testSetExpirationSeconds(Zend_SessionTest)
iteration over default Zend_Session namespace failed; expecting result === ';a === apple;o === orange;p === pear', but got '  thrown in /home/paul/libs/ZendFramework-1.7.6/library/Zend/Session.php on line 444'
Failed asserting that <boolean:false> is true.
 
2) testTableNameSchema(Zend_Session_SaveHandler_DbTableTest)
Expected exception Zend_Db_Statement_Exception
 
3) testBasic(Zend_Validate_File_MimeTypeTest)
Test expected true with 'image/gif'
Messages: array (
  'fileMimeTypeFalse' => 'The file 'testsize.mo' has a false mimetype of 'text/plain'',
)
Failed asserting that <boolean:false> matches expected value <boolean:true>.
 
FAILURES!
Tests: 7087, Assertions: 31113, Failures: 3, Errors: 19, Incomplete: 14, Skipped: 82.
Command exited with non-zero status 2
124.96user 7.65system 4:57.53elapsed 44%CPU (0avgtext+0avgdata 0maxresident)k
312inputs+28616outputs (2major+335358minor)pagefaults 0swaps

Give me one good reason to not be pissed off by that. To me, the point of a test suite is verification of reliability. I now have irrefutable evidence from its own test suite that Zend Framework 1.7.6 cannot be relied upon. Why should I risk using it?

Still, it could indicate incompatibility with my environment rather than a deliberate releasing of the software with failing tests. The memory manager failure is particularly disturbing but may have something to do with my setting an unlimited memory limit. I don’t care about the Lucene search so I can ignore those. The failures seem minor enough. I’ll carry on, but my confidence has been shaken.

1d Making Zend Framework Available To My Application

According to the manual, I can use the PHPRC environment variable to specify a user-local configuration file. I need to ensure ZF is in the include path and, preferably, configure the ZF auto class loader. I’ll do that using the auto_prepend_file directive. Unfortunately, I had to hard-code the entire include path as I couldn’t get the reference syntax to work.

Here’s the configuration I’ve used. I’ve not run the tests yet to prove this is correct.

~/libs/php.ini.paul

[php]
include_path = ".:/usr/share/php:/usr/share/pear:/home/paul/libs/ZendFramework-1.7.6/library"
auto_prepend_file = /home/paul/libs/zf-bootstrap.php

~/libs/zf-bootstrap.php

<?php
require_once 'Zend/Loader.php';
Zend_Loader::registerAutoload();

~> set -x PHPRC ~/libs/php.ini.paul
~> php -i | grep include_path
include_path => .:/usr/share/php:/usr/share/pear:/home/paul/libs/ZendFramework-1.7.6/library
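
One way around hard-coding the whole path in the ini file might be to extend the include path at runtime from the prepended bootstrap instead. This is only a sketch under that assumption: the helper name addToIncludePath is made up for illustration and is not part of ZF or PHP, though the functions it calls (get_include_path, set_include_path) and the PATH_SEPARATOR constant are standard PHP.

```php
<?php
// Hypothetical helper (not part of ZF): append a directory to the current
// include_path unless it is already listed, and return the resulting path.
function addToIncludePath($dir)
{
    $parts = explode(PATH_SEPARATOR, get_include_path());
    if (!in_array($dir, $parts, true)) {
        $parts[] = $dir;
        set_include_path(implode(PATH_SEPARATOR, $parts));
    }
    return get_include_path();
}

// The ZF library location used in this article.
addToIncludePath('/home/paul/libs/ZendFramework-1.7.6/library');
```

With something like this in the bootstrap, the ini file would only need the auto_prepend_file line, and the system default include path would be left alone.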

Now back to my application tests…

~> cd public_html/zf-app/tests/
~/p/z/tests> phpunit BasicEnvironment
PHPUnit 3.3.15 by Sebastian Bergmann.
 
.F
 
Time: 0 seconds
 
There was 1 failure:
 
1) testZendFrameworkVersionIsCompatible(BasicEnvironment)
Need at least Zend Framework 1.6.0, have 1.7.6
Failed asserting that <boolean:false> is true.
/home/paul/public_html/zf-app/tests/BasicEnvironment.php:22
 
FAILURES!
Tests: 2, Assertions: 2, Failures: 1.

Oops! That’s a logic inversion bug in my test! Once fixed, both tests passed. Committed the updated test. I can do some real fun stuff now.
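
For the record, a check like this boils down to a single version comparison, and it is easy to get the direction backwards. Here is a minimal sketch of the corrected logic, assuming a hypothetical helper named isCompatibleVersion; the real test asks Zend_Version for the installed version and isn’t reproduced here.

```php
<?php
// Hypothetical helper: true when the installed version is at least the
// required one. version_compare() is built into PHP.
function isCompatibleVersion($installed, $required)
{
    return version_compare($installed, $required, '>=');
}

var_dump(isCompatibleVersion('1.7.6', '1.6.0')); // bool(true)
var_dump(isCompatibleVersion('1.5.3', '1.6.0')); // bool(false)
```

Swapping the two arguments, or using '<=' by mistake, produces exactly the kind of “need 1.6.0, have 1.7.6” failure shown above.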

Step 2 – Porting an Application to Zend Framework

2a Installing the Food Journal Application

First I need the code for the food journal application, which I export from the Subversion repository. That done, I move the directories around a bit to match ZF conventions. The app is already split into webroot and offroot which map to “public” and “application” respectively. Did that and committed.

2b Making the Application Work Again

Next I’ll make the application work again using the new directory structure. I don’t want to leap into porting bits to ZF until I’m satisfied the basics are operating correctly. The app uses a path mapping file, so it’s a simple matter of creating it and setting the offroot and webroot paths. That almost worked – the app uses the Smarty templating system which needs write access to its compiled template directory. That done, the app begins to work.

Gotta love code written by professionals, eh? That was almost too easy!

Anyway, on to the individual porting efforts. I’ll create a separate topic branch for each so that I can cherry-pick the changes I want during integration.

2c Porting the Picasa Adapter

The application was designed to work with multiple back-end photo storage systems. As I happened to write the Picasa adapter, it seems the ideal place to start. The core of the class is the following method:

public function getPhotostream( $user )
{
    return
        $this->organisePhotostream(
            $this->extractPhotosFromAlbums(
                $this->loadAlbumsFromUserFeed(
                    $this->loadUserFeed(
                        $user
                    )
                )
            )
        )
    ;
}

Pretty, eh? I’ve no desire, nor should I need, to change any of that because it’s perfect. Only the implementation will change, which is good because it’s an ugly mess of DOM and XPath queries dealing with Google’s Atom feed format.

Here’s where it gets difficult. Getting a user feed using Zend_Gdata_Photos is easy enough, but what then? The ref guide implies I can just iterate over the entries in the feed. Time for some debug printouts, methinks. Having already installed FireBug and FirePHP, I’d like to play with the Zend_Log facility.

I really don’t like the Gphoto API, or maybe the documentation is confusing. Either way, this port is proving more frustrating than I thought it would be. Enough for today.

21-Mar-2009 @ 08:45 Carrying on this morning. I’ve decided to code naked, which isn’t some special technique – I’m in bed! I left off with a gripe about the Zend_Gphoto API and some nearly-working code. My main trouble is figuring out which layer of inheritance exposes the method that returns the data I want in the form I want it. Scanning a single Zend_Gdata_Photos_PhotoEntry object with FirePHP it looks like I want the media group which exposes descriptive data and thumbnails, but the methods of the class all return strings and I definitely want an object. One level of inheritance back I find what I’m looking for – Zend_Gdata_Media_Entry::getMediaGroup() which returns an object. NO IT DOESN’T! The method I want, which returns the data I can see is in the object, has also been overridden to return a string, rendering the API virtually useless. What a crock of shit! I’m wasting my time trying to work like this.

21-Mar-2009 @ 17:40 Now I’m really confused. The API docs say that PhotoEntry::getMediaGroup() returns a string but it in fact returns an object. Sigh.

Double-sigh. Whilst getting frustrated about the lack of FirePHP output, I speculatively decided to send the content of the exception I’m throwing to flush the debug channel. Zend_Gdata_App_HttpException: Invalid chunk size followed by what is clearly a fragment of the feed. Is this yet another HTTP client implementation that can’t handle chunked transfer encoding correctly? Is Google’s server misbehaving? How am I supposed to know? As if to add insult to injury, the script is now refusing to return and is not obeying the time limit (my fault, see below). This isn’t fun so I’m going to do something hopefully more exciting – try OpenSolaris.
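
For context, chunked transfer encoding frames the response body as a series of chunks: a hexadecimal size line, the data, then a CRLF, ending with a zero-size chunk. An “invalid chunk size” complaint means the client found something other than a hex number where it expected a size line. The decoder below is an illustrative sketch of that framing only; it is not Zend’s implementation, the function name decodeChunkedBody is made up, and it ignores trailers.

```php
<?php
// Illustrative decoder for an HTTP/1.1 chunked body; not Zend's code.
// Each chunk is "<hex size>[;extensions]\r\n<data>\r\n"; a zero-size chunk ends the body.
function decodeChunkedBody($raw)
{
    $body = '';
    $pos = 0;
    $length = strlen($raw);
    while (true) {
        if ($pos >= $length) {
            throw new RuntimeException('Missing terminating zero-size chunk');
        }
        $eol = strpos($raw, "\r\n", $pos);
        if ($eol === false) {
            throw new RuntimeException('Missing chunk size line');
        }
        // Drop any chunk extensions after ';' before parsing the size.
        $parts = explode(';', substr($raw, $pos, $eol - $pos), 2);
        $sizeField = trim($parts[0]);
        if (!ctype_xdigit($sizeField)) {
            throw new RuntimeException('Invalid chunk size');
        }
        $size = hexdec($sizeField);
        $pos = $eol + 2; // move past the size line
        if ($size == 0) {
            break; // last chunk reached
        }
        if ($pos + $size > $length) {
            throw new RuntimeException('Truncated chunk data');
        }
        $body .= substr($raw, $pos, $size);
        $pos += $size + 2; // skip the data and its trailing CRLF
    }
    return $body;
}

echo decodeChunkedBody("4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"), "\n"; // prints "Wikipedia"
```

If a client loses track of where one chunk ends, the next “size line” it reads is really a fragment of the payload, which would be consistent with seeing a piece of the feed in the exception message.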

The Zend_Gdata_Photos API is useless. The Zend HTTP client may be broken.

22-Mar-2009 @ 08:00 Such failures might deter a lesser individual but not me. I’m going to work around the maybe-broken HTTP client by using the Zend_Cache. I’m not giving up on the Photos API because I haven’t looked at the underlying code yet so the problems could still be in my head or the documentation.

Caching first. As my Picasa adapter makes several large HTTP requests in quick succession, I’m thinking the failures may be related to pipelining, so keeping the results locally will at least allow me to focus on the data manipulation.

08:40 After some wrangling with the cache module I’m back to where I started – the script never returns. I suspect it is related to my attempt to retrieve the thumbnails and send the structure over FirePHP so I remove that and try again. I also notice that I’ve neglected to save anything to the cache. I add that before trying again. Silly!
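
The pattern in question is the usual one: look in the cache, and only on a miss do the expensive fetch and then save the result. The sketch below shows that shape with a plain file-based stand-in; the real code uses Zend_Cache rather than this helper, and the name loadOrFetch, the key and the sample data are all invented for illustration.

```php
<?php
// Stand-in for the pattern only; not Zend_Cache.
// Returns the cached value for $key, or calls $loader, stores its result and returns it.
function loadOrFetch($key, $loader, $cacheDir)
{
    $file = $cacheDir . '/' . md5($key) . '.cache';
    if (is_file($file)) {
        return unserialize(file_get_contents($file));
    }
    $value = call_user_func($loader);
    file_put_contents($file, serialize($value)); // the step that is easy to forget
    return $value;
}

// Usage: the closure stands in for an expensive HTTP request.
$calls = 0;
$fetch = function () use (&$calls) {
    $calls++;
    return array('album' => 'food', 'photos' => 3);
};
$dir = sys_get_temp_dir();
$key = 'picasa-sketch-' . uniqid();
$first = loadOrFetch($key, $fetch, $dir);
$second = loadOrFetch($key, $fetch, $dir);
var_dump($calls); // int(1): the second call was served from the cache
```

Leave out the file_put_contents() line and every call becomes a miss, which is the mistake described above.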

Hmmm… the cache files have been written to /tmp as expected but the script is still hanging. Odd. The only thing I can think of that could cause this is the addition of dumping the exception content. Remove that also. Try again. Immediate response. Very weird. No matter… I can continue.

09:20 Success at last! My simple testing indicates that the array of photos is being constructed correctly using the Zend_Gdata_Photos API. Time for a complete run. It looks good, so now to clean up the code, commit and publish.

While cleaning up I noticed that I had relaxed the memory and time limits on the script. Oops. I took out the relaxation code and immediately the script failed by running out of memory. The PHP default of 16MB is not enough, especially on a 64-bit machine. Raising the limit to 128MB also caused failure. Doubling to 256MB still failed. The script was successful with a 512MB memory limit. That’s outrageous and something I will have to investigate.
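
A cheap first step for that investigation would be to measure the peak rather than guess at limits. The sketch below assumes nothing about the adapter itself: peakMemoryAfter is a hypothetical helper and the workload is a stand-in, but memory_get_peak_usage() and ini_get() are standard PHP.

```php
<?php
// Hypothetical helper: run a callable, then report the peak memory PHP has
// allocated so far, in megabytes.
function peakMemoryAfter($work)
{
    call_user_func($work);
    return round(memory_get_peak_usage(true) / 1048576, 1);
}

$peakMb = peakMemoryAfter(function () {
    // Stand-in workload: build a biggish array, roughly like holding a parsed feed.
    $rows = array();
    for ($i = 0; $i < 100000; $i++) {
        $rows[] = str_repeat('x', 32);
    }
    return count($rows);
});
printf("limit=%s peak=%.1fMB\n", ini_get('memory_limit'), $peakMb);
```

Wrapping each stage of the adapter in a call like this should show which step is responsible for the blow-up.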

10:20 Step 2c complete. The Picasa adapter is sufficiently ported to Zend Framework that I can consider what to do next.

Step 2d Implementing MVC

This is where I expect ZF to shine but it is also the most invasive change to the application. The food journal is simple so doesn’t use the front controller pattern. Instead it is separated into a front page and an application script which accepts parameters. To successfully port this to the ZF MVC I’ll need to carefully plan the changes and re-design the application’s operational map to suit. I’ll also need to re-read the ZF MVC documentation.

However, the hacker in me says JFDI! Resist or succumb?

Ah, what the heck? Let’s compromise! There’s no harm in writing down how the application currently hangs together.

The Front Page

Well that was immediately valuable! I can see straight away that the front page of the application does indeed employ the front controller pattern by accepting a page parameter.

Page name | Purpose
none | Display the front page with news, links to the “try it out” pages (named flickr and picasa), a list of preset food journals (linking to foodjournal.php) and a list of links to informative pages (named about and news).
about | Helpful information about the application and the navigation (preset food journals and informative links)
news | Latest news and scientific research, with navigation
flickr | Information about how to set up a food journal on Flickr, a form to type in a Flickr screen name to display the food journal and navigation
picasa | Information about how to set up a food journal on Picasa, a form to type in a Picasa screen name to display the food journal and navigation

Clean and consistent. The navigation footer is repeated on all the pages. This should be a piece of cake to re-implement using ZF MVC. The templating engine is currently Smarty but I’ll switch over to the ZF style of simple PHP includes. Anyway, time to work through the ZF Quickstart which focuses on MVC.

Set Up The Project Structure

The quickstart says I should create directories for controllers, views and scripts. I’ll do this on a git topic branch to keep it separate from the Picasa work I did earlier – they aren’t related after all. Git makes it easy to merge multiple branches and I’ll be using that later.

~/p/zf-app> git branch
  master
* zend-picasa-port
  zend-port
~/p/zf-app> git checkout -b zend-mvc-front-page zend-port
Switched to a new branch "zend-mvc-front-page"
~/p/zf-app> mkdir -p application/controllers application/views/scripts

Create a Rewrite Rule

As I’m using Lighttpd and not Apache, the quickstart tells me what I need to do but not how to do it. I’m not sure if Lighty handles per-directory configuration files so I’d better read the manual. It doesn’t so I’ll have to set the rewrite rule in the server configuration.

Some Googling and head-scratching later, I settled on the following rule, which is broken but works well enough to let me carry on.

url.rewrite-once = (
  ".*.(js|gif|jpg|png|css)$" => "$0",
  "~paul/zf-app/public(.*)" => "~paul/zf-app/public/index.php$1"
)

Create a Bootstrap File

As the application already has a bootstrap-like file, I won’t just copy the quickstart version blindly. I’ll need to adapt the current file to match what ZF needs, which is pretty much just setting up the controller and dispatching the request. Having done that, I commit the changes in a known-broken state. I like to see the evolution of code.

Create an Action Controller & View

Creating the index controller class is easy – it’s got no code! I moved the Smarty template for the index into the views/scripts/index directory, renamed it to index.phtml, commented out the header and footer inclusions and then ran the script. Up pops a page! Too easy.

Create an Error Controller & View

Cough! I’ll come back to this later.

Create a Layout

Ah, here’s the way to set the headers and footers. Looks pretty straightforward too. The application currently includes a header and footer template on each page template which is a bit repetitive (although there are good reasons for it). Using ZF I can do both in a single file and include the content. On the other hand, both the header and footer have dynamic elements that I’ll have to comment out for now. I suspect I’ll be able to implement them later using view helpers.

This also raises a question about Git. It can track files being moved around, sometimes automatically by detecting substantially similar content, but can it tell if two files are merged into one? I doubt it and I’m not aware of any VCS that can.

Hmm… another issue. I normally refuse to enable PHP’s short_open_tag setting but the quickstart makes extensive use of short tags. Should I make an exception in this case? No, because they hide the fact that I’m echoing strings of text synthesised using PHP, whereas normal open tags make it perfectly clear.

<?= $this->layout()->content ?>

versus

<?php echo $this->layout()->content; ?>

I suspect I’ll be writing my own XSLT-based views before long. I abhor XML synthesised from text. It’s so primitive and prone to validity errors. XML documents are data structures.

Hmm… the navigation footer needs to link to several other pages and the food journal itself but the quickstart is, as yet, silent on how to do so. Rather than copy the URL generation code that I don’t understand, I’ll omit the navigation links for now.

HA! Having made the layout changes, as soon as I run the application I get a syntax error message complaining about an unexpected T_STRING. It turns out that PHP is currently configured with short_open_tag enabled and it’s complaining about the <?xml ... ?> preamble. Funny how that’s missing from the quickstart. Anyway, I alter the system-level PHP configuration to disable short_open_tag, run the application and it works – the index page has a basic header and footer.
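If changing the system-level configuration isn't an option, another way around the clash (not the one I took) is to have PHP emit the XML declaration itself, so the template never contains a literal short-tag look-alike. A minimal sketch; xml_declaration() is a made-up helper name.

```php
<?php
// Hypothetical helper: build the XML declaration as an ordinary PHP string,
// so the parser never meets it as template text when short tags are enabled.
function xml_declaration($encoding = 'UTF-8')
{
    return '<?xml version="1.0" encoding="' . $encoding . '"?>' . "\n";
}

echo xml_declaration();
```

Inside a quoted string the sequence is just text, so it is safe whichever way the short tag setting is configured.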

28-Mar-2009 @ 10:30 A couple of evenings ago I took the sledgehammer approach and completed the app’s front page in MVC form. I used the url view-helper to create the navigation links and damn they’re ugly! The links in the template went from

<a href="?page=picasa">Using Picasa</a>

to

<a href="<?php echo $this->url(array('controller'=>'index','action'=>'picasa'),'default',true) ?>">Using Picasa</a>

Hopefully that is my inexperience with ZF showing through and there exists a more elegant and comprehensible way to draw links. Anyway, I’m not going to obsess about it as today I want to push towards completing my play by implementing the food journal functionality.

Step 2e Implementing the Food Journal

I prepare for this by drawing the links from the front page to the Flickr and Picasa pages using the same ugly url helpers from the layout navigation links.

Next, I need to implement the forms on the pages which supply the food journal with the necessary details to retrieve someone’s photos and display them. The food journal will be a separate controller. Passing the details to the controller seems like an ideal time to use parameterised routes. The food journal takes several parameters;

provider
Code for the back-end photo storage provider. Currently supports flickr and picasa. No default so must be specified.
username
Identifier or screen name of the person whose food journal is to be displayed. Depends on the selected provider. No default so must be specified.
start
Start date from which to show the food journal. Defaults to “last Monday”.
range
Controls the type of food journal display. Supports index, day and week (the default).
daybreak
Tweaks the time food journal display begins each day. Defaults to “07:00”.

It seems clear that the non-default parameters must form the route and that the others continue to be provided in the query string, yielding URLs like http://hostname/app-base/foodjournal/picasa/Libertus96?start=2008-08-08&range=day. However, this introduces a problem – how am I to direct a form submission to a varying URL? That is not possible with HTML as the form element’s action attribute is fixed. Perhaps MVC routing is inappropriate here. I could solve this by issuing a redirect but I don’t like to introduce such an overhead unnecessarily.

Tidying Up a Little

Before I embark on the major rewrite, I’m going to tidy up the code a little by removing the parts that are now implemented using ZF MVC. The application still carries on after ZF has done its bit and spits out Smarty error messages. The control logic from the previous front controller implementation is still in place. All that has to go.

Implementing the Food Journal Controller

28-Mar-2009 @ 13:45 After several social distractions, I’m ready to implement a parameterised controller. Thinking about the logic, I realise that range is really the action because it determines which view is to be displayed. I can use URLs in the form http://hostname/app-base/foodjournal/picasa/Libertus96/week/2008-08-08?daybreak=07:00, following the principle that all mandatory elements must precede any optional element.

I re-read the manual section on MVC routing.

28-Mar-2009 @ 14:30 Ahhh… now some things start to make more sense, including the URL view-helper I previously declared ugly. Here’s the route I decided upon;

Zend_Controller_Front::getInstance()
                     ->getRouter()->addRoute(
                        'foodjournal'
                      , new Zend_Controller_Router_Route(
                            'foodjournal/:provider/:username/:action/:start'
                          , array(
                                'controller' => 'foodjournal'
                              , 'action' => 'week'
                              , 'start' => date( 'Y-m-d', strtotime('last Monday') )
                            )
                          , array(
                                'action' => 'index|day|week'
                              , 'provider' => 'picasa|flickr'
                              , 'start' => '\d{4}-\d{2}-\d{2}' /// FIXME match date
                            )
                        )
                     );
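To convince myself I understand what that route expresses, here is the same idea in plain PHP: named segments, defaults for the optional ones and a pattern each value must match. It's a rough sketch only; match_foodjournal_route() is a hypothetical helper, not part of Zend Framework, and the real router may treat edge cases differently.

```php
<?php
// Plain-PHP sketch of the route's behaviour. match_foodjournal_route() is a
// hypothetical helper, not a Zend Framework function.
function match_foodjournal_route($path, $today = null)
{
    $today = ($today === null) ? time() : $today;
    $names = array('provider', 'username', 'action', 'start');
    $defaults = array(
        'controller' => 'foodjournal',
        'action'     => 'week',
        'start'      => date('Y-m-d', strtotime('last Monday', $today)),
    );
    $requirements = array(
        'action'   => '/^(index|day|week)$/',
        'provider' => '/^(picasa|flickr)$/',
        'start'    => '/^\d{4}-\d{2}-\d{2}$/', // shape only, like the FIXME above
    );

    $parts = explode('/', trim($path, '/'));
    if (array_shift($parts) !== 'foodjournal' || count($parts) > count($names)) {
        return false;
    }

    $params = $defaults;
    foreach ($names as $i => $name) {
        if (isset($parts[$i]) && $parts[$i] !== '') {
            $params[$name] = $parts[$i];
        }
        if (!isset($params[$name])) {
            return false; // mandatory segment (provider or username) missing
        }
        if (isset($requirements[$name]) && !preg_match($requirements[$name], $params[$name])) {
            return false; // value present but not acceptable
        }
    }
    return $params;
}

print_r(match_foodjournal_route('foodjournal/picasa/Libertus96/day/2008-08-08'));
```

Leaving off the trailing segments, as in foodjournal/picasa/Libertus96, falls back to the week view starting last Monday, which is exactly why the mandatory elements have to come first.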

I created a new FoodjournalController class with three actions, namely indexAction(), weekAction() and dayAction(), moved the existing Smarty templates into the views/scripts/foodjournal directory naming them index, week and day respectively. I then added a navigation link to my personal food journal on Picasa like so;

$this->url(array('provider'=>'picasa', 'username'=>'libertus96'), 'foodjournal', true)

That’s not quite so ugly now that it makes sense! :) The basic URL routing also works fine in that I can access all three views by changing the URL. I had to comment out vast parts of the Smarty templates that dealt with dynamic data. Next step is to acquire and make use of the parameters to get some of the dynamism back.

28-Mar-2009 @ 17:15 Success! I focused on the default weekly food journal view and reimplemented some basic dynamic data, including the week start date and the user’s display name. Still lots more to do but I have the model for it now.

Removing “FIXME: Built With Zend Framework”

25-Mar-2009 @ 10:00 It’s a lovely sunny morning so I’m braving the cold and putting on some shorts. Yesterday saw some decent progress in my understanding of the ZF MVC implementation. I could almost say I’m starting to have fun! Early on, during the initial work on layouts I had to hard-code the site title because I didn’t know how to make it dynamic. That was originally done in Smarty by passing a title prefix to the header include. I’m now aware that ZF offers a title placeholder so I’m going to use that and remove the (rather unfair) “FIXME”.

25-Mar-2009 @ 10:30 That didn’t take too long. Turns out the title helper is pretty flexible and can do exactly what I needed it to. The “FIXME” is gone and so have the commented-out bits that implemented the same functionality from the original templates. Much cleaner now. I think it’s time to approach loading the photos and navigation.

Less Broken – Acquiring The User’s Photostream

There are already functions available to load photos and calculate navigation which I can use. They’re not designed to be compatible with ZF but I can’t think of any reason why they wouldn’t work. And they do! It’s that joy again – working with software designed and written by seasoned professionals. Clear separation of concerns and some thought given to future maintainers.

25-Mar-2009 @ 13:45 I’ve re-used the existing photostream acquisition functions and implemented the day header links on the week view. The trouble now is that the branch I’m working on doesn’t contain the refactored Picasa adapter so there’s no caching and it takes ages to load my photo feed. Often so long that it misbehaves and quits half-way through. I’m going to create a new merged branch to continue development. Later on I’ll rebuild the development history by cherry-picking.

~/p/zf-app> git branch
  master
* zend-mvc-front-page
  zend-picasa-port
  zend-port
~/p/zf-app> git checkout -b zend-mvc-picasa-merged
Switched to a new branch "zend-mvc-picasa-merged"
~/p/zf-app> git merge zend-picasa-port
Auto-merged public/paths.php
Merge made by recursive.
 application/class.picasa.php |  154 ++++++++++++++++++++----------------------
 public/paths.php             |    2 +
 2 files changed, 76 insertions(+), 80 deletions(-)

That’s better. After an initial pause for the feed to be parsed and stored in the cache, clicking on the daily links returns fairly quickly. The cost is a massive increase in memory use but I’ve already noted that as a subject of investigation. For now I don’t care.

Filling In The Blanks

Next I’ll re-implement the “no photos” navigation template which provides handy links to the nearest day for which a user has photos if the current view doesn’t show any. It’s not a difficult template to port but I am starting to get a bit annoyed copying the url generation code. It’s OK for now but I’ll have to re-implement it as a view helper at some point.

29-Mar-2009 @ 15:30 Photos! The food journal is almost back to where I started with it. One minor anomaly is that the times on the photos appear to be an hour forward. Anything to do with today’s switch to British Summer Time, perhaps? Hmmm…

With photos appearing on the week and day views, all that remains is the index view, which is straightforward, and the navigation area which is a little more demanding.

29-Mar-2009 @ 17:15 Almost done! I’ve reimplemented the index view, navigation area and the list of pre-selected users. All that remains functionally is to implement the ad hoc food journal forms on the Flickr and Picasa pages. I’ll have to figure out why the times are an hour ahead. I’ll need to implement error handling. Then clean up.

But I’ve done enough for today. My bum is sore and I really fancy a beer in what remains of the sunshine. :)

Basic Error Handling

29-Mar-2009 @ 20:00 Ah, what the heck? Beer in hand, I implemented very basic error handling by copying the code out of the ZF Quickstart. This will allow me to remove a lot of now obsolete code.

I removed the obsolete code (lots of it) and, at the same time, fixed the hour difference on the photo times by setting the default timezone to UTC. Just the ad hoc journal display to implement now.
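The hour's difference is easy to reproduce. This sketch assumes a photo timestamp of noon UTC on the day the clocks went forward and formats it under two default timezones; Europe/London is my assumption about what the system default would have been.

```php
<?php
// A photo timestamp taken at noon UTC on 29-Mar-2009, the day BST began.
$taken = gmmktime(12, 0, 0, 3, 29, 2009);

date_default_timezone_set('Europe/London'); // assumed system default
echo date('H:i', $taken), "\n";             // formatted in local (BST) time

date_default_timezone_set('UTC');
echo date('H:i', $taken), "\n";             // formatted in UTC
```

The first line comes out an hour ahead of the second, which matches what I was seeing on the photos.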

Ad Hoc Journal Display

30-Mar-2009 @ 10:00 Damn British Summer Time! I feel like half the morning is gone already and I wanted to spend all of it finishing off the development. Oh well, just the most important feature left to reimplement – allowing people to type in an arbitrary Flickr or Picasa screen name and have the food journal displayed.

As I mentioned before, this is a little challenging because of the way in which URLs to the food journal are now drawn. I suppose I should read the manual to see if I can do this without redirection, but if I don’t issue a redirect the site’s URLs will become internally inconsistent. So I suppose what I’ll do is make each form submit back to its own controller, have the action validate so that errors can be displayed in-place, or redirect to the food journal controller if the input is OK. I’m aware that controllers can also forward actions between themselves so I could also use that strategy. First, though, I’ll check to see how broken things are as they stand.

Hmm… simply not working. However, I find in the manual the Redirector action helper which looks to do exactly what I need, specifically redirecting to a controller route. First I try forwarding the request to the food journal controller by overriding the index controller’s preDispatch method. Screw the URL structure!

HA! Screwing with the URL structure is a disaster! Although the Flickr ad hoc form can now display a food journal, the navigation area of the journal doesn’t work correctly. I can fix it though.

Having fixed the daybreak form action (using the url view-helper) I survey the damage in Git and find that I’ve made two separate logical changes to the source at once. With other version control systems this could be a pain to untangle, but not with Git! I use the (almost) magical

git add --patch

to stage the separate changes separately and

git diff --cached

to verify what I’m about to commit is just the one change.

30-Mar-2009 @ 11:25 The port of the food journal application to Zend Framework is now functionally complete! w00t!

Bug Hunt!

Error handling is primitive, unhelpful and reveals sensitive site-related information.
Loading the food journal for Jeffry.Hayes on Picasa results in a cache exception.
Loading the food journal for hellodaly on Picasa results in several script timeouts then a memory exhaustion error.
Food journals don’t show user’s display name in page title.

Step 3 – Finishing

30-Mar-2009 @ 13:30 I’d like to be in a position to publish today so some basic finishing is in order. There are bugs I have to fix – one of them severe. I’d also like to profile the application to determine how best to configure the public server’s resources. Fortunately the severe bug relates to pathological resource utilisation so I’ll need the profiling data before I can adequately fix it.

Less Primitive Error Handling

First, fix the silly caching exception and, by relation, the primitive error handling. The ZF quickstart example shows how to modify the behaviour of the error display based on the environment in which the application is running, specifically development or production. I’ll use that to address the “error handler reveals sensitive data” bug as well as to explore how I can handle exceptions from the photo storage providers in a friendlier manner. Previously, the application dealt with “user not found” or “photos not found” exceptions by redirecting to the appropriate ad hoc display page to display the error message. I’ll reimplement this behaviour.

The fix for the caching bug was easy but I need the previous behavior to experiment with capturing and logging truly unexpected exceptions if the site is running in production mode. I’ve been using FirePHP for logging during development and I’m happy with that because it’s helpfully revealing.

30-Mar-2009 @ 15:15 That was fun! The application now automatically detects if it is running in a development or production environment. In production, for security, technical details of fatal errors are written to the webserver’s error log and the user is politely requested to help us fix problems by emailing the details. In development mode, error details are sent to FirePHP and displayed on-screen.
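Stripped of the framework, the split between the two environments boils down to something like this sketch. describe_error() and the APPLICATION_ENV environment variable are illustrative assumptions, not how my application (or the quickstart) necessarily does the detection, and the FirePHP part is left out.

```php
<?php
// Hypothetical helper: decide what the visitor sees for an unexpected exception.
function describe_error(Exception $e, $environment)
{
    if ($environment === 'development') {
        // Developers get the full story on-screen.
        return get_class($e) . ': ' . $e->getMessage() . "\n" . $e->getTraceAsString();
    }

    // In production the technical details go to the error log only...
    error_log(get_class($e) . ': ' . $e->getMessage());

    // ...and the visitor gets a polite message with nothing sensitive in it.
    return 'Sorry, something went wrong. Please email us what you were doing so we can fix it.';
}

// Assumed detection mechanism: an environment variable, defaulting to the safe choice.
$environment = getenv('APPLICATION_ENV') ? getenv('APPLICATION_ENV') : 'production';
echo describe_error(new RuntimeException('cache directory not writable'), $environment), "\n";
```

Defaulting to production means a misconfigured server errs on the side of revealing nothing.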

All that was for unexpected errors. Now to deal with the expected ones, such as the user typing in a Flickr or Picasa screen name that doesn’t exist, by returning to the appropriate page and showing a helpful message.

30-Mar-2009 @ 17:00 That was less fun. I’ve implemented friendlier error messages for expected errors with both Flickr and Picasa, but I’m not happy with the way I’ve done it. It feels hacky. However, it does seem to work so I’ll stick with it for now.

Application Profiling

All the known bugs are fixed except for the resource exhaustion issue. I doubt I can fix this using intuition alone, so I need to gather more detailed evidence using a technique called profiling.

2-Apr-2009 @ 19:00 My preferred tools (in fact, probably the only tools) for profiling PHP applications are Xdebug and KCacheGrind. Xdebug is a PHP extension that (among many other useful features) gathers low-level data about the execution of PHP code which KCacheGrind can read and visualise. Both are available as Ubuntu packages, so…

sudo apt-get install php5-xdebug kcachegrind

Xdebug needs a bit of configuration – I want to control where profiling data files are written and ensure that profiling only occurs when I ask for it by appending ?XDEBUG_PROFILE to the URL for the application. Profiling data files are large and profiling slows down the application so I don’t want it constantly enabled.
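For reference, the kind of php.ini settings I mean look something like this. The setting names are from Xdebug 2.x; the output directory is just an example path, not the one I used.

```ini
; Xdebug 2.x profiler settings (the path is illustrative)
xdebug.profiler_enable = 0
xdebug.profiler_enable_trigger = 1
xdebug.profiler_output_dir = /tmp/xdebug-profiles
```

With the trigger enabled and the profiler itself switched off, a profile is only written for requests that carry the XDEBUG_PROFILE parameter.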

19:45 Heh, with profiling enabled the application can’t load and process my food journal within the 30-second time limit. I’ll have to relax that if I want a complete profiling run. I expect the first run to take a long time because the cache is empty and the data has to be pulled from Picasa over the internet. The second run is a lot faster due to caching but seems to consume a lot of memory to achieve the speed boost.

Hmmm… initial profiling is fairly revealing. Of the entire runtime of the program, 37% is spent making 105,000 calls to Zend_Gdata_App_FeedEntryParent->lookupNamespace and 25% is spent making 121,000 calls to Zend_Gdata_App_Base->lookupNamespace.

4-Apr-2009 @ 11:45 I created two profiles this morning; the first with no cached Picasa data and the second with data coming from the cache. I implemented caching at the object level, that is, the Zend_Gdata_Photos_* objects are serialised into the cache after the feed data has been loaded and processed. The application runtimes are an order of magnitude apart and the profiles reveal why. The uncached profile shows the same pattern as the one captured a couple of days ago – the majority of the runtime is spent looking up namespaces. The second profile is more representative of the performance of the application itself and not the Zend_Gdata classes.

Hmm… at least some attempt has been made to improve the namespace lookup performance according to this change description but I have to question the methodology used as my profiling evidence shows that it isn’t the lookup performance per se that causes the problem – it is the number of times namespace lookup is performed. For instance, see this crazy implementation of Zend_Gdata_App_FeedEntryParent::takeChildFromDOM. Epic fail throughout the Zend_Gdata code! I think I’ll have to contribute a patch.

What “Open Source” Means

If it is broken and I have the skill and inclination, I can repair it. If I am then so inclined, I can share the improvement with everyone else. Well I have the skill and I need the repair. First, I have to build a test case to prove that a) the problem is what I think it is, and b) my fix is effective.

The Zend_Gdata API documentation implies that I can construct a feed object from a DOM that I have already loaded. I’ll use local copies of my Picasa feeds for the test so that environment is as controlled as possible.

5-Apr-2009 @ 10:00 After some wireless network hassles this morning I’m back working on a performance-improving patch to Zend Framework. My hacking yesterday showed that the code can be rewritten to work twice as fast.

~/l/Z/library> git branch
  lookupnamespace-fix
* master
~/p/z/tests> time php Zend_Gdata_Photos_Performance.php
MD5 checksum = 4d796492005eddc1f0957b51873082fa
6.51user 0.16system 0:06.70elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
~/l/Z/library> git checkout lookupnamespace-fix
Switched to branch "lookupnamespace-fix"
~/p/z/tests> time php Zend_Gdata_Photos_Performance.php
MD5 checksum = 4d796492005eddc1f0957b51873082fa
3.23user 0.18system 0:03.43elapsed

The test program constructs, from local files, one Zend_Gdata_Photos_UserFeed object and 4 Zend_Gdata_Photos_AlbumFeed objects, aggregates them into a single stdClass object then takes the MD5 checksum of serializing the aggregate. This helps assure me that my alterations have not affected the data in any way. It is no substitute for a complete unit test suite but is good enough for my current purposes.
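The checksum trick itself is tiny. In this sketch the aggregate holds made-up stand-in data rather than real Zend_Gdata feed objects, and checksum_of() is a name I've invented for illustration.

```php
<?php
// Stand-in data: in the real test these were feed objects built from local files.
$aggregate = new stdClass();
$aggregate->user   = array('name' => 'example');
$aggregate->albums = array(array('id' => 1), array('id' => 2));

// Hypothetical helper: fingerprint any value via its serialized form.
function checksum_of($value)
{
    return md5(serialize($value));
}

echo 'MD5 checksum = ' . checksum_of($aggregate) . "\n";
```

If a code change alters anything in the resulting object graph the checksum changes too, so an identical checksum before and after a patch is decent evidence that only the speed has changed.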

I’ll show results from profiling later. I still have one more function to rewrite for final performance improvement figures. There are many functions I could rewrite but I’m focusing on proving how much more performance could be squeezed out so I’m only changing the most-called functions.

11:25 That done, I notice that Zend Framework 1.7.8 has been released. I’m working with 1.7.6 so it behooves me to download the new version and see if the performance has improved. If not, I’ll need to apply my patch against the new version and see what happens.

11:40 All but one of the changes in the patch apply cleanly on ZF 1.7.8. After manual re-application I re-ran my test. Here are the final results with profiling data.

Performance of Zend_Gdata API relating to quantity of lookupNamespace() calls
Version | Runtime (seconds) | Checksum
1.7.8 base | 6.5 | d49558f6839c8731d72f4b66e919dbdd
1.7.8 patched | 2.57 | d49558f6839c8731d72f4b66e919dbdd

That’s a 2.5x speed increase. Pity that to achieve it I had to make the code less clean, but that’s not really my problem. Time to figure out how to push this discovery to the upstream maintainers. Found Trevor Johns, sent patch + test + data over to him and added my findings as a sub-task to the already open issue relating to Zend_Gdata latency.

The 90/10 Rule

11-Apr-2009 @ 09:00 The week has been interesting albeit with little progress on the application. I have applied to become a contributor to Zend Framework so that the necessary parties can review, adapt, adopt or improve my Zend_Gdata performance enhancement work. It appears that the 90/10 rule is kicking in, essentially that the last 10% of the work on my application is going to consume 90% of the overall time I will have spent. You may recall from before that I noticed the application consumed a lot of memory while processing Picasa feeds. Others have also noticed. Due to this behaviour I cannot release the application so I must craft a fix. To do so I need more evidence of where, how and why the application is consuming memory. Another job for the wonderful Xdebug.

Memory Use Profiling

First I shall quantify what I mean by “excessive memory consumption”. I shall use the same test program and data I designed to analyse the lookupNamespace performance issue and lightly adapt it to report the overall memory use. PHP has a built-in function for doing that at a gross level – memory_get_peak_usage(). I can compare what is reported to the size of the data files to calculate a simple expansion factor – how many times more RAM is required to process a file than the size of the file. With that I can make a judgment as to whether there will be capacity problems in the expected environment in which the application will operate. I can also judge the cost of using Zend Framework against the benefits.

09:45 The test data I’m using are a Picasa user feed (32k) and 4 Picasa album feeds (153k, 154k, 142k and 471k). For simplicity, I’ll call that 1 megabyte. The test program constructs a Zend_Gdata_Photos_UserFeed object from the user feed and Zend_Gdata_Photos_AlbumFeed objects from each album feed. The 5 objects are aggregated into a single object, serialized and the MD5 hash calculated. How much memory is required to do that on my x64 Ubuntu 8.10 system with Zend Framework 1.7.8 and PHP 5.2.6?

I added the following code to the end of the program;

echo 'Peak memory  = '
    .number_format(memory_get_peak_usage()/(1024*1024),1).'MB (internal) '
    .number_format(memory_get_peak_usage(true)/(1024*1024),1).'MB (external)'."\n";

The results are astonishing – enough to make me question my methodology!

~/p/z/tests> php Zend_Gdata_Photos_Performance.php
MD5 checksum = d49558f6839c8731d72f4b66e919dbdd
Peak memory  = 76.4MB (internal) 77.0MB (external)

Wow! Now, that is the memory required by the entire script not just the part of the framework in which I’m interested. I’ll have to eliminate factors external to the Zend_Gdata library otherwise I’d be tempted to throw it away right now! So, I’ll take the memory usage just before the user feed is loaded and just after the album feeds are loaded and report the difference. For comparison I shall also report the size of the aggregated object after serialization.

~/p/z/tests> php Zend_Gdata_Photos_Performance.php
Memory used  = 45.4MB
Serialized   = 5.7MB
MD5 checksum = d49558f6839c8731d72f4b66e919dbdd
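The measurement that produced those figures follows this shape. It's a sketch only: build_objects() is a hypothetical stand-in that makes a pile of small objects, where the real test program built the feed objects from local files.

```php
<?php
// Hypothetical stand-in for constructing the user and album feed objects.
function build_objects()
{
    $items = array();
    for ($i = 0; $i < 20000; $i++) {
        $item = new stdClass();
        $item->id = $i;
        $item->title = 'photo ' . $i;
        $items[] = $item;
    }
    return $items;
}

// Sample memory just before and just after the work I care about.
$before  = memory_get_usage();
$objects = build_objects();
$after   = memory_get_usage();

// The serialized size gives a second, allocator-independent measure.
$serialized = serialize($objects);

printf("Memory used  = %.1fMB\n", ($after - $before) / (1024 * 1024));
printf("Serialized   = %.1fMB\n", strlen($serialized) / (1024 * 1024));
```

Taking the difference rather than the peak keeps the autoloader and the rest of the script out of the picture.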

Still astonishingly bad! Picasa feeds totaling under 1MB in size are 6 times larger when represented as PHP objects and consume around 8 times that amount of RAM to create. Indeed, the Zend_Gdata library appears to be consuming memory like it’s going out of style.
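For anyone wanting to check the expansion factors, the arithmetic uses only the file sizes and measurements already quoted; the variable names are mine.

```php
<?php
// Feed sizes in KB: one user feed and four album feeds.
$feedsKb      = array(32, 153, 154, 142, 471);
$totalMb      = array_sum($feedsKb) / 1024; // 952KB in total
$serializedMb = 5.7;                        // size after serialization
$usedMb       = 45.4;                       // memory used during construction

printf("feeds               = %.2fMB\n", $totalMb);
printf("serialized / feeds  = %.1f\n", $serializedMb / $totalMb);
printf("memory / serialized = %.1f\n", $usedMb / $serializedMb);
printf("memory / feeds      = %.1f\n", $usedMb / $totalMb);
```

The last ratio is the one that matters for capacity planning: it is the amount of RAM needed per megabyte of feed data.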

10:15 I need more evidence. I know that some expansion of the original files is necessary – the Zend_Gdata library requires the XML files to be loaded into DOM objects, so next I’ll eliminate that from the memory use calculations. It would be unfair and unproductive to poke the finger of blame at the wrong area.

I refactor the test program to first load all the test files into DOM objects, report the memory used for that, then process each DOM object into the respective Zend_Gdata object. I deliberately overwrite the DOM object references with the Zend_Gdata object references as a hint to the garbage collector that the DOM objects are no longer needed after processing.

Hmmm… my first run shows no memory used by loading the XML files into DOM objects. I think that’s because I’m asking for memory usage based on internal allocations whereas the memory used by DOM is externally allocated by libxml.

Hmmm again… even after requesting externally allocated memory use the DOM loading appears to consume nothing. I’ve reached the limits of the simple profiling tools. Now I shall switch on Xdebug tracing. I add the following lines to the top of the test program;

ini_set('xdebug.show_mem_delta',1);
xdebug_start_trace('trace');

Running the test program takes a lot longer but results in a 255MB trace file being created for my analytical delight. Here’s a couple of snippets:

~/p/z/tests> head trace.xt
TRACE START [2009-04-11 10:19:58]
    0.0017     199688       +0     -> memory_get_usage() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:13
    0.0018     199768      +80     -> DOMDocument::load() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:15
    0.0033     200304     +536     -> DOMDocument::load() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:18
    0.0091     200952     +648     -> DOMDocument::load() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:18
    0.0149     201280     +328     -> DOMDocument::load() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:18
    0.0204     201712     +432     -> DOMDocument::load() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:18
    0.0379     202072     +360     -> memory_get_usage() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:21
    0.0380     202144      +72     -> number_format() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:23
    0.0382     202784     +640     -> Zend_Loader::autoload() /home/paul/libs/ZendFramework-1.7.8/library/Zend/Loader.php:0
~/p/z/tests> tail trace.xt
   31.7695   47833768       +0                   -> Zend_Gdata_App::setStaticHttpClient() /home/paul/libs/ZendFramework-1.7.8/library/Zend/Gdata/App.php:258
   31.7720   47833064     -704     -> memory_get_usage() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:40
   31.7727   47833064       +0     -> number_format() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:42
   31.7728   47833064       +0     -> serialize() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:44
   32.0284   53852504 +6019440     -> strlen() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:46
   32.0284   53852504       +0     -> number_format() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:46
   32.0285   53852504       +0     -> md5() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:48
   32.1402     139400
TRACE END   [2009-04-11 10:20:30]

Just from the snippets some interesting evidence emerges, corroborating my previous analysis. The sequence of DOM loads consumes very little memory, the serialization of the final aggregated object consumes around 6MB and the Zend_Gdata object construction consumes around 48MB.

Trace Analysis With MySQL

So now I know in more detail what I already figured out. My next step is to browse through the trace file looking for patterns of memory consumption. That’s not fun – it’s hard work. Which makes me think I should automate it. Perhaps I can load the trace file into a database and analyse it with SQL. Yeah, that’s what I’m going to do.

sudo apt-get install mysql-server php5-mysql

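To make life easier for myself and MySQL, I changed the trace file output format to machine-readable (Xdebug’s xdebug.trace_format setting, value 1), which writes one tab-separated record per function entry and exit. I then created the following table and loaded the trace data into it.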

CREATE TABLE trace (
    level TINYINT(2) UNSIGNED NOT NULL
  , funcnum INTEGER UNSIGNED NOT NULL
  , is_exit TINYINT(1) UNSIGNED NOT NULL
  , timeindex DECIMAL(10,6) UNSIGNED NOT NULL
  , memusage INTEGER UNSIGNED NOT NULL
  , func VARCHAR(128)
  , is_user TINYINT(1) UNSIGNED
  , include_file VARCHAR(255)
  , file VARCHAR(255)
  , line_in_file INTEGER UNSIGNED
  , PRIMARY KEY (funcnum, is_exit)
  , KEY (timeindex)
  , KEY (func)
) ENGINE=MyISAM;
 
LOAD DATA LOCAL INFILE 'trace.xt'
INTO TABLE trace
FIELDS TERMINATED BY '\t'
IGNORE 2 LINES;
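
For readers without a MySQL server to hand, the same entry/exit pairing can be sketched in plain PHP. This is only an illustration: summarise_trace() is a hypothetical helper, the field order is assumed to match the CREATE TABLE above, and exit records are assumed to carry only the first five fields. Unlike the SQL that follows, it groups by function name alone rather than by name and level.

```php
<?php
// Sketch only: summarise_trace() is a hypothetical helper, not part of Xdebug.
// Assumed record layout (tab-separated), mirroring the table definition:
//   level, funcnum, is_exit, timeindex, memusage, func, ...

/**
 * Sum the memory delta (usage at exit minus usage at entry) per function name.
 *
 * @param array|Traversable $lines Raw lines of a machine-readable trace
 * @return array func => total bytes, largest first
 */
function summarise_trace($lines)
{
    $entries = array();  // funcnum => array(func, memusage at entry)
    $totals  = array();  // func => summed memory delta

    foreach ($lines as $line) {
        $f = explode("\t", rtrim($line, "\r\n"));
        if (count($f) < 5) {
            continue;  // header, footer or blank line
        }
        $funcnum = $f[1];
        $isExit  = $f[2];
        $mem     = (int) $f[4];

        if ($isExit === '0' && isset($f[5])) {
            // Entry record: remember the name and the memory in use on the way in
            $entries[$funcnum] = array($f[5], $mem);
        } elseif ($isExit === '1' && isset($entries[$funcnum])) {
            // Exit record: pair it with its entry and accumulate the difference
            list($func, $memAtEntry) = $entries[$funcnum];
            unset($entries[$funcnum]);
            if (!isset($totals[$func])) {
                $totals[$func] = 0;
            }
            $totals[$func] += $mem - $memAtEntry;
        }
    }

    arsort($totals);  // biggest consumers first
    return $totals;
}
```

Something like print_r(array_slice(summarise_trace(new SplFileObject('trace.xt')), 0, 20, true)) would list the top twenty. Iterating the file object rather than calling file() avoids pulling a very large trace into memory in one go.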

Now I can ask some interesting questions. First, which functions consume the most memory overall?

SELECT e.func, e.level, SUM(x.memusage-e.memusage) AS mem
FROM trace e JOIN trace x USING (funcnum)
WHERE e.is_exit = 0 AND x.is_exit = 1
GROUP BY e.func, e.level
ORDER BY 3 DESC
LIMIT 20;

func                                                     level       mem
Zend_Gdata_Feed->__construct                                 3  39443088
Zend_Gdata_App_FeedEntryParent->__construct                  4  39438688
Zend_Gdata_App_Base->transferFromDOM                         5  39438688
Zend_Gdata_Photos_AlbumFeed->takeChildFromDOM                6  38164848
Zend_Gdata_Photos_AlbumFeed->__construct                     2  37947048
Zend_Gdata_Photos_PhotoEntry->__construct                    7  35870304
Zend_Gdata_Media_Entry->__construct                          8  34202848
Zend_Gdata_Entry->__construct                                9  34201696
Zend_Gdata_App_FeedEntryParent->__construct                 11  33289488
Zend_Gdata_App_MediaEntry->__construct                      10  33289488
Zend_Gdata_App_Base->transferFromDOM                        12  33249552
Zend_Gdata_Photos_PhotoEntry->takeChildFromDOM              13  33044768
Zend_Gdata_Media_Entry->takeChildFromDOM                    14  31465184
Zend_Gdata_App_Base->transferFromDOM                        15  22077528
Zend_Gdata_Media_Extension_MediaGroup->takeChildFromDOM     16  21739344
Zend_Gdata_App_Base->registerNamespace                      19  12328752
Zend_Gdata_Extension->__construct                           18   9144352
Zend_Loader::loadClass                                       3   8165176
Zend_Loader::autoload                                        2   8165176
include_once                                                 4   8095968

There’s nothing at all surprising about these results until row 16. Before that point, the memory consumption is dominated by the Zend_Gdata object constructors and DOM conversion methods. Why, though, is registerNamespace() consuming 12MB of RAM so deep into the stack? The XML namespaces in feeds recognised by the Zend_Gdata library are essentially constant so they should only need to be registered once for each different class. How many times is that function called?

SELECT COUNT(*) FROM trace WHERE func = 'Zend_Gdata_App_Base->registerNamespace' AND is_exit = 0;
18268

Ouch! Now, I know that the methods which call registerNamespace are called registerAllNamespaces. Which classes make the call, how many times and how much memory is consumed by each?

SELECT caller.func, COUNT(*) AS calls, SUM(callee_exit.memusage-callee.memusage) AS mem
FROM trace callee JOIN trace caller ON caller.funcnum = callee.funcnum - 1
JOIN trace callee_exit ON callee_exit.funcnum = callee.funcnum
WHERE callee.func = 'Zend_Gdata_App_Base->registerAllNamespaces'
AND caller.is_exit = 0
AND callee.is_exit = 0
AND callee_exit.is_exit = 1
GROUP BY caller.func
ORDER BY 2 DESC;

func                                                      calls      mem
Zend_Gdata_Media_Extension_MediaThumbnail->__construct      949  1780096
Zend_Gdata_Media_Extension_MediaTitle->__construct          323   478800
Zend_Gdata_Media_Extension_MediaDescription->__construct    323   651008
Zend_Gdata_Media_Extension_MediaGroup->__construct          323   488752
Zend_Gdata_Media_Extension_MediaCredit->__construct         323   437032
Zend_Gdata_Entry->__construct                               323   938424
Zend_Gdata_Media_Extension_MediaKeywords->__construct       323   349928
Zend_Gdata_Media_Extension_MediaContent->__construct        323   881128
Zend_Gdata_Photos_PhotoEntry->__construct                   313  1432048
Zend_Gdata_Media_Entry->__construct                         313     1152
Zend_Gdata_Photos_AlbumEntry->__construct                    10    42440
Zend_Gdata_Geo_Extension_GmlPoint->__construct               10    30496
Zend_Gdata_Geo_Extension_GeoRssWhere->__construct            10    13312
Zend_Gdata_Geo_Extension_GmlPos->__construct                 10    30616
Zend_Gdata_Feed->__construct                                  5     4400
Zend_Gdata_Photos_AlbumFeed->__construct                      4        0
Zend_Gdata_Photos_UserFeed->__construct                       1     8552
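
One caveat about the query: joining on funcnum - 1 finds the caller only when the callee is the very first thing the caller invokes, which appears to be the case for these constructors. A more general way to attribute calls, sketched below under the same assumed record layout as the trace table, is to remember the most recent function entered at each stack level and look one level up. count_callers() is a hypothetical helper, not part of Xdebug or Zend Framework.

```php
<?php
// Sketch only: count_callers() is a hypothetical helper.
// Assumed entry record layout (tab-separated):
//   level, funcnum, is_exit (0 = entry), timeindex, memusage, func, ...

/**
 * Count how often each function directly calls $target.
 *
 * @param array|Traversable $lines  Raw lines of a machine-readable trace
 * @param string            $target Function name exactly as written in the trace
 * @return array caller => number of calls, most frequent first
 */
function count_callers($lines, $target)
{
    $stack   = array();  // level => name of the last function entered at that level
    $callers = array();  // caller name => calls made to $target

    foreach ($lines as $line) {
        $f = explode("\t", rtrim($line, "\r\n"));
        if (count($f) < 6 || $f[2] !== '0') {
            continue;  // only entry records carry a function name
        }
        $level = (int) $f[0];
        $func  = $f[5];
        $stack[$level] = $func;

        // The function still running one level up is the direct caller
        if ($func === $target && isset($stack[$level - 1])) {
            $caller = $stack[$level - 1];
            $callers[$caller] = isset($callers[$caller]) ? $callers[$caller] + 1 : 1;
        }
    }

    arsort($callers);
    return $callers;
}
```

Called as count_callers(new SplFileObject('trace.xt'), 'Zend_Gdata_App_Base->registerAllNamespaces'), it should produce the calls column above without relying on function numbers being adjacent.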

I can’t say for sure where I’m going with this analysis but my instinct tells me there’s something awry in the Zend_Gdata library’s treatment of XML namespaces. I’m getting a faint whiff of one of the worst possible code smells – a design flaw.

Investigating A Possible Design Flaw in Zend_Gdata

15:00 What is the difference between a design flaw and a bug? Generally, a bug is a coding error that causes software to behave incorrectly. A design flaw is an error of thought or judgment that causes software to be built in such a way that it behaves correctly but improperly. Bugs can usually be seen in the code; design flaws generally cannot. In this case, the logic of the software is correct but that correctness has been achieved at intolerable cost.

I’ve been working with XML for a long time now so I’m fully aware of the painful nature of dealing with namespaces – they’re hard to get right. I managed my pain by choosing tools specifically designed to work with XML, such as XSLT and XPath. My original implementation of the food journal’s Picasa adapter used XPath queries to select the nodes it needed from the feed. The designers of Zend_Gdata have eschewed these tools for reasons I don’t know, but I’d speculate that the nature of the API they were intending to create (an all-encompassing object hierarchy representing Google data feeds) did not lead them to think about how they could leverage the existing tools, so they built their own.

Before I leap into what is likely to become a monster piece of work, I need to quantify to myself the costs and benefits. I started this whole journey as a means to play with Zend Framework, which I have done to a degree, but I seem to be giving a lot of attention to one small part of it. That is costing me experience of the other, more interesting parts. The benefits of focusing on Zend_Gdata are: a) I can immediately make a contribution, b) I need the component to perform sanely in order to release my application and c) people from Google may be watching! What right-minded software geek doesn’t want to be noticed by Google? :)

There is also the pure and simple enjoyment of exercising my skills at improving code. Anyone crazy enough or bored enough to have read through this article will have realised that I really get a buzz out of doing this kind of work. It is hopefully clear that I’m very good at it. I use a combination of intuitive and evidence-based techniques to figure out why things are not as they should be then meticulous measurement to prove that any change I make achieves my intended effect.

So what it comes down to is this: if I can correct the namespace handling design flaw in Zend_Gdata, I may reduce the memory footprint by a quarter or more across all use-cases for the library. That is a clear benefit. Who else is going to do it if I don’t? What the hell? It’s worth a look at least.

Why Does Zend_Gdata Consume So Much Memory?

This is a code review. It’s not going to be pretty and I’m definitely not going to hold back. I’m looking for ways to improve the code which means I’m going to find all the bad stuff.

All filenames are relative to library/Zend/Gdata/ and I’m reviewing version 1.7.8. Each time I spot a potential waste of memory, I’ll try to fix it and re-run my tests to measure the effect. I’ll also provide a link to the actual source code in the repository so readers without the framework installed can follow me.

App.php, method importString
This method uses DOMDocument::loadXML to check that the passed string can be parsed, then passes the string through to the feed object’s transferFromXML method. Without doubt, the next operation will be DOMDocument::loadXML. Not only is this duplicating effort, an unnecessary DOM document is held in memory for the duration.
Removed
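
The shape of the fix, parsing once and handing the resulting DOM onwards, can be sketched as follows. These functions are hypothetical stand-ins written for illustration and are not the Zend_Gdata code; the only assumption about the library is the one described above, that the string was being parsed twice.

```php
<?php
// Sketch only: parse_feed_xml(), feed_from_dom() and import_feed() are
// hypothetical stand-ins, not methods of Zend_Gdata.

/** Parse the string once; throw if it is not well-formed XML. */
function parse_feed_xml($xml)
{
    $previous = libxml_use_internal_errors(true);  // collect errors instead of emitting warnings
    $doc = new DOMDocument();
    $ok  = ($xml !== '') && $doc->loadXML($xml);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$ok) {
        throw new InvalidArgumentException('Feed is not well-formed XML');
    }
    return $doc;
}

/** Build a simple representation from an already-parsed element. */
function feed_from_dom(DOMElement $root)
{
    $titles = array();
    foreach ($root->getElementsByTagName('title') as $node) {
        $titles[] = $node->textContent;
    }
    return array('root' => $root->localName, 'titles' => $titles);
}

/** Validate and import in one pass: the document is parsed exactly once. */
function import_feed($xml)
{
    $doc = parse_feed_xml($xml);                  // the only parse
    return feed_from_dom($doc->documentElement);  // reuse the DOM, not the string
}
```

The validation still happens, because a malformed string makes parse_feed_xml() throw, but there is never a second DOMDocument alive alongside the first.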

17:00 I’ve lost the will to live. So many classes…

The Design Flaw In The Zend_Gdata Library

12-Apr-2009 @ 12:30 After sleep and a quick review of the code, I figured out the design flaw.

The classes in the Zend_Gdata library co-mingle the representation of things with the transport mechanism used to manipulate the things. For instance, a Zend_Gdata_Photos_PhotoEntry object represents a photograph of which a property is size, yet by inheriting from Zend_Gdata_App_Base also carries a set of XML namespaces relevant only to the means of transport through which the value of the property was acquired, not the property itself. Each object that derives from Zend_Gdata_App_Base carries with it, for its entire lifetime, some state from the context in which it was or will be populated, before and beyond the lifetime of the context.

More simply put, no size of any photograph has an XML namespace, yet the Zend_Gdata library endows all sizes of all photos with several namespaces! No wonder the library consumes memory like it’s going out of style!

I described it to a layman like so: imagine that every currency note one carried also had the equivalent value in gold attached. Wallets and purses would become far heavier for no additional benefit.
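
The difference can be sketched with a pair of toy designs. All the class names below are hypothetical and nothing here is taken from Zend_Gdata; the sketch simply counts namespace registrations, which is a deterministic proxy for the wasted memory. In the first design every object registers and carries its own namespace map. In the second the map belongs to the parser and the entries hold only their data.

```php
<?php
// Sketch only: hypothetical classes for illustration, not Zend_Gdata code.

/** Design 1: each entry registers and keeps its own copy of the namespace map. */
class PerInstanceEntry
{
    public static $registrations = 0;
    protected $namespaces = array();
    public $size;

    public function __construct($size)
    {
        $this->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
        $this->registerNamespace('media', 'http://search.yahoo.com/mrss/');
        $this->size = $size;
    }

    protected function registerNamespace($prefix, $uri)
    {
        self::$registrations++;
        $this->namespaces[$prefix] = $uri;
    }
}

/** Design 2: the entry is plain data and knows nothing about XML. */
class PlainEntry
{
    public $size;

    public function __construct($size)
    {
        $this->size = $size;
    }
}

/** Design 2: the namespace map lives once, with the thing that reads the XML. */
class FeedParser
{
    public static $registrations = 0;
    protected $namespaces = array();

    public function __construct()
    {
        $this->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
        $this->registerNamespace('media', 'http://search.yahoo.com/mrss/');
    }

    protected function registerNamespace($prefix, $uri)
    {
        self::$registrations++;
        $this->namespaces[$prefix] = $uri;
    }

    public function makeEntry($size)
    {
        return new PlainEntry($size);
    }
}

// Build 1000 entries each way and compare the registration counts
$heavy  = array();
$light  = array();
$parser = new FeedParser();
for ($i = 0; $i < 1000; $i++) {
    $heavy[] = new PerInstanceEntry($i);
    $light[] = $parser->makeEntry($i);
}
printf("per-instance: %d registrations, shared: %d\n",
    PerInstanceEntry::$registrations, FeedParser::$registrations);
```

With two namespaces and a thousand entries, the first design performs two registrations per object while the second performs two in total, however many entries are created.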

Fortunately, the library designers are at least partly aware of the flaw, as evidenced by Issue ZF-3467 “Add new classes for most of our APIs that abstract the XML even further and facilitate 90% of the use-cases per API conveniently”.

The question becomes, what am I going to do about it? Hmmm…

15:00 Nothing. That’s what I’m going to do. Sweet F.A. Not my code, not my problem. More importantly, not fun to get involved with. I may change my mind if asked (or paid) but I have better things to do with my own time than clean up other people’s messes.

I’ll do what I can with the food journal application to overcome the brokenness of Zend_Gdata. I need to move on.

Refactoring the Food Journal Picasa Adapter

One thing I can do to work around the memory use of Zend_Gdata is refactor the Picasa adapter to process one album at a time. My current implementation, though pretty, is flawed in that it requires all the user’s albums to be loaded into memory at once. I’ll also apply caching at the photostream level rather than at the album feed level, which allows me to discard most of the unused data in the feeds.

16:15 The refactoring didn’t take long and seems to have been successful. The peak memory usage to load my food journal is 75MB, down from 286MB. It’s still a lot, but good enough.
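
The idea behind the refactoring can be sketched like this. The function, the loader callback and the field names (title, url) are hypothetical and invented for illustration; the real adapter works with Zend_Gdata feed objects rather than arrays. The point is only that one heavy album is alive at a time, and that the compact photostream is what gets kept (and would be cached).

```php
<?php
// Sketch only: build_photostream() and the 'title'/'url' fields are
// hypothetical; $loadAlbum stands in for the expensive feed fetch.

/**
 * Build a compact photostream by loading and discarding one album at a time.
 *
 * @param array    $albumIds  Identifiers of the albums to process
 * @param callable $loadAlbum Callback returning array('photos' => array(...))
 * @return array Compact list of photos, suitable for caching
 */
function build_photostream(array $albumIds, $loadAlbum)
{
    $stream = array();

    foreach ($albumIds as $id) {
        $album = call_user_func($loadAlbum, $id);  // one heavy structure in memory

        foreach ($album['photos'] as $photo) {
            // Keep only what the journal displays; drop the rest of the feed data
            $stream[] = array(
                'title' => $photo['title'],
                'url'   => $photo['url'],
            );
        }

        unset($album);  // release the heavy data before loading the next album
    }

    return $stream;
}
```

Peak memory is then bounded by the largest single album plus the compact stream, rather than by the sum of all the albums.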

Porting the Food Journal Flickr Adapter

19-Apr-2009 @ 11:45 After reading over the documentation for the Zend Framework Flickr service component, it seemed worth having a go at porting the food journal’s Flickr adapter over. The current code is relatively simple. The core is a function callFlickrAPI( $api, array $params ) which looks like it will map pretty directly to methods on the Zend_Service_Flickr class.

Despite its simplicity, there is one aspect of this code that concerns me with regard to refactoring it – I didn’t write it. That means I would benefit from the existence of a set of tests, the data for which will have to be derived from the current code. At least, the list of returned photos for a known user should be identical before and after refactoring. As the data is deterministic, I’ll begin with my favoured simple technique of comparing the MD5 hash of the serialization of the output, switching to more detailed testing only if necessary.

Preparing the Repository

As I’m happy enough with the porting efforts so far (the Picasa adapter and MVC), I merge the tip of that branch into the original branch called “zend-port” and start a new branch called “flickr-adapter-port”.

Basic Unit Tests

I’ll admit I’m not a member of the modern unit testing movement. My experiences discussing the goals of the movement with those who claim to be members leave me with nagging doubts about the evidential basis. That the quality of code with unit tests is better than that of code without seems to be an article of faith rather than a conclusion drawn from empirical data. I’m not surprised. Automated unit tests are a software tool and geeks get highly political about tools.

No matter. For my present purposes, a basic set of tests helps me to refactor the code I’m working with. I have no reason to test the inner workings of the Flickr adapter class, only that it produces the same outputs for known inputs. I decided to check, for a known test user, that the adapter returns the correct number of photos and that the photostream, when serialized, resolves to the same MD5 hash.
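
The check itself is tiny. A minimal sketch, assuming the photostream is a plain array and using a hypothetical helper name:

```php
<?php
// Sketch only: photostream_fingerprint() is a hypothetical helper.

/**
 * Reduce a photostream to two values that are cheap to compare
 * before and after refactoring.
 */
function photostream_fingerprint(array $photos)
{
    return array(
        'count' => count($photos),
        'md5'   => md5(serialize($photos)),
    );
}
```

Record the fingerprint from the old code once, then assert that the refactored adapter yields an identical array. Any difference in content, order or types changes the serialization and therefore the hash. This only works while the underlying data stays deterministic, which is why a known test user is needed.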

Hmm… looking over the code I can see that there are two distinct operations that I’m going to refactor: the acquisition of photos and the acquisition of user data. I’ll need to add a few more tests to cover the latter as that’s the easiest point to begin refactoring.

15:00 I’ve had to create a test suite in order to make the test data retrieved from Flickr a shared fixture, otherwise each test makes a round-trip to the server. Now my tests cover the basic user details and the photos. I’m ready to get into the code now.
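
The shared-fixture idea, stripped of any test framework, amounts to fetching once and handing the same data to every test. The class below is a hypothetical plain-PHP sketch of that idea, not the test suite described above and not a PHPUnit API; the fetch callback stands in for the round-trip to Flickr.

```php
<?php
// Sketch only: FlickrFixture is a hypothetical class, not part of PHPUnit
// or Zend Framework. $fetch stands in for the real call to the server.

class FlickrFixture
{
    public static $fetches = 0;   // how many round-trips were actually made
    private static $data = null;  // the cached fixture, shared by all tests

    /** Return the fixture, fetching it only on first use. */
    public static function get($fetch)
    {
        if (self::$data === null) {
            self::$fetches++;
            self::$data = call_user_func($fetch);
        }
        return self::$data;
    }
}
```

Each test then calls FlickrFixture::get() with the same callback and only the first call pays for the network.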

Functional Deprivation Causes Abandonment

Ah well, that didn’t take long and I didn’t waste too much of my time. Basically, the Zend_Service_Flickr component doesn’t provide all the features necessary for the food journal. Specifically, it does not expose the “people.getInfo” API which the adapter uses to acquire the user’s display name, nor does it expose the “photosets.getList” API that the adapter uses to determine which photoset is a food journal, nor does it expose the “photosets.getPhotos” API that the adapter uses to retrieve only the food journal photos.

There’s no point in continuing, so I delete the Git branch I was working on.
