The Zend Framework is a long-established and well-respected set of libraries for building PHP web applications. I’ve not used it before, but I’ve recently become interested in learning more about it, so I’ve decided to have a play. Rather than start entirely from scratch, I’m going to re-use the database schema of a famous dive-logging web application and see how much effort is involved in re-implementing the basics: logging in and displaying a page of dives.
More importantly, how much fun is it to work with?
Update 12-Jul-2009: Browse the code on GitHub
Step 0 – Local Preparation
There’s no point in declaring that I’m playing in public if the result of my playtime is private, so I’ll publish the site next to the blog at http://libertini.net/libertus/zf/. Also, even though I’m playing, I don’t want to risk losing any effort so I’ll work locally in a Git repository and use its push function to send any code I write to the public site.
But first, I need to set up my workstation for web application development. I need PHP (of course), a database server and a web server configured for personal home pages. MySQL and Apache are a bit heavyweight for my needs, so I’ll install Lighttpd (aka “lighty”) and SQLite. Before I do that, it’s due diligence to Google the combination to make sure it’ll serve. No total disasters in the results… so press on.
~> sudo apt-get install lighttpd php5-cgi php5-sqlite
Next comes configuration. Ubuntu’s packages pre-configure Lighty to work with PHP and FastCGI, so all I need to do is enable the appropriate modules and set a test script.
~> sudo lighty-enable-mod fastcgi userdir
~> sudo /etc/init.d/lighttpd force-reload
~> mkdir public_html
~> echo "<?php phpinfo();" > public_html/index.php
Browsing to localhost/~paul shows the expected PHP information page. Too easy. Sometimes I pity those who choose to use Windows.
Step 1 – Getting Started With Zend Framework
As usual, I start by reading the manual (and downloading the reference guide). However, the greedy bastards at Zend require me to register before I can download anything. That’s not the spirit, guys! Out of principle, I use GuerillaMail and great profanity to screw with their marketing numbers.
1a Reading The Manual
12-Mar-2009 @ 20:20 I’ve been reading the reference guide (PDF, English, 1.6) for most of the evening. It is certainly comprehensive but inconsistent, poorly edited (missing words, for instance) and employs poor English frequently enough to be more annoying than amusing. It also commits the mortal sin of claiming to have an index without providing one. I’ve got to page 187 of 1132. /me soldiers on…
14-Mar-2009 @ 13:30 Still reading. I noticed that Ubuntu provides a package for ZF but it is a bit old at version 1.5.3 whereas the manual I’m reading is for version 1.6 and the latest downloadable ZF is version 1.7.6. The Jaunty package is newer but I doubt I can use it.
14-Mar-2009 @ 21:30 Finished reading for the day at Chapter 33. The DB component is disappointingly simple. Components such as Form are disturbing. It is crystal clear to me now that the primary purpose of ZF is to lock developers into PHP and discourage them from seeking the correct solution to certain problems – an old Microsoft trick.
15-Mar-2009 @ 08:15 Continuing reading. I’m now minded to use ZF on a different application than I first thought – the food journal. It’ll be interesting to see if I can improve that previously hand-written application. I reckon the following components are pertinent: Model/View/Controller, Paginator, GData_Picasa, Service_Flickr, Cache, Config, Date, Test.
15-Mar-2009 @ 10:00 Nearly done reading. I’m at page 999. Thank goodness for coffee and cigarettes! I’ll soon be able to start the fun part. Amusingly, the chapter on the Translate component has some of the worst English translations in the entire manual, e.g. “The default delimiter for CSV string is the ‘;’ sign. But it has not to be that sign.”
15-Mar-2009 @ 11:15 Yay! I reached the end of the manual. Now to continue getting started by developing a couple of failing environmental tests (assert ZF available and of compatible version), then install and configure ZF such that the tests pass without modification (very important), then get on with the fun stuff. After I play a couple of rounds of Rock Band, that is.
1b Preparing The Environment
Assuming that I’m starting from scratch, I need somewhere to build the code for the application. I neither need nor want to care where ZF is installed, only that it is available and of a compatible version. I think that’s best achieved using PHP’s include_path directive, which I can set globally and, I hope, specifically for my user environment within Lighty. I’m not concerned with security when working locally so I’ll build the app in my public_html directory for ease of access.
To build and run tests, I need the PHPUnit testing framework package. I’ll also be using that to run the tests supplied with ZF prior to linking it up with my application.
sudo apt-get install phpunit
Next I need to build the first test within the application directory structure.
~/public_html> mkdir -p zf-app/tests
~/public_html> cd zf-app/tests
~/p/z/tests> touch BasicEnvironment.php
I have a totally empty test at the moment. How does PHPUnit deal with that?
~/p/z/tests> phpunit .
PHPUnit 3.2.16 by Sebastian Bergmann.
File "..php" could not be found or is not readable.
Eh? Oh… look at the PHPUnit version. The “run tests in directory” facility was added in 3.3. Perhaps I’ve installed the wrong package? Nope… that’s the version supplied with Intrepid and also in Jaunty. I’m getting pretty sick of Ubuntu. I install from PEAR instead.
After building the two basic tests I run them directly. As expected, they fail because class Zend_Version isn’t available. That’s cool. Time to commit.
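The first of those checks boils down to asking whether the class can be loaded at all. Here is a rough stand-in written without PHPUnit so that it runs anywhere; it is a sketch, not the test from the repository, and the function name detectZendFrameworkVersion is made up for illustration. It relies only on Zend_Version::VERSION, the version constant that ZF 1.x ships.

```php
<?php
// Hypothetical helper, not the real BasicEnvironment test.
// Returns the framework version string, or null when ZF cannot be loaded.
function detectZendFrameworkVersion()
{
    // class_exists() gives any registered autoloader a chance to load
    // the class before answering.
    if (!class_exists('Zend_Version')) {
        return null;
    }
    return Zend_Version::VERSION;
}

$version = detectZendFrameworkVersion();
echo $version === null
    ? "Zend Framework is not available\n"
    : "Zend Framework $version is available\n";
```

Until the framework is on the include path, this reports that it is not available, which is the failure the tests are meant to start with.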
~/p/zf-app> git init
Initialized empty Git repository in /home/paul/public_html/zf-app/.git/
~/p/zf-app> git add tests/*.php
~/p/zf-app> git commit
Created initial commit abea47a: Basic environment tests
 1 files changed, 24 insertions(+), 0 deletions(-)
 create mode 100644 tests/BasicEnvironment.php
1c Installing Zend Framework
I’ve got the tarball of ZF 1.7.6 from the website. I want to install it locally, run its tests, set up my local environment to point to it, then run my tests, which will pass when I’ve got things right.
~> mkdir libs
~> cd libs
~/libs> tar zxf ~/Desktop/ZendFramework-1.7.6.tar.gz
~/libs> cd ZendFramework-1.7.6/tests
~/l/Z/tests> phpunit AllTests
Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 30720 bytes) in /home/paul/libs/ZendFramework-1.7.6/library/Zend/Controller/Dispatcher/Standard.php on line 262
BOOM! Now that I didn’t expect. Seems the test suite requires a higher memory limit than the default. Ah, there’s a TestConfiguration.php.dist file that needs to be adapted before I can begin. I’ll relax the memory limit and enable tests for SQLite.
~/l/Z/tests> time phpunit AllTests PHPUnit 3.3.15 by Sebastian Bergmann. ............................................................ 60 / 7087 ............................................................ 120 / 7087 ............................................................ 180 / 7087 ............................................................ 240 / 7087 .S.......................................................... 300 / 7087 ............................................................ 360 / 7087 ...........................SSSSSS.......................SSSS 420 / 7087 SSSSSSSSSSSSSSSSSS.......................................... 480 / 7087 ............................................................ 540 / 7087 ............................................................ 600 / 7087 ............................................................ 660 / 7087 ....................................SSS..................... 720 / 7087 .....S.S.SS................................................. 780 / 7087 ..............................................S............. 840 / 7087 .....................S...................................... 900 / 7087 ............................................................ 960 / 7087 ............................................................ 1020 / 7087 ............S............................................... 1080 / 7087 ..............................S............................. 1140 / 7087 ............................................................ 1200 / 7087 .............SS....S.S...............................S.....S 1260 / 7087 ............................................................ 1320 / 7087 ............................................................ 1380 / 7087 ............................................................ 1440 / 7087 ............................................................ 1500 / 7087 ..I......................................................... 
1560 / 7087 ......I......SSSSSSSSSSS.........S.......................... 1620 / 7087 .......................................I.................... 1680 / 7087 ............................................................ 1740 / 7087 ............................................................ 1800 / 7087 ..............................S............................. 1860 / 7087 ............................................................ 1920 / 7087 ............................................................ 1980 / 7087 ............................................................ 2040 / 7087 ............................................................ 2100 / 7087 ............................................................ 2160 / 7087 ............................................................ 2220 / 7087 ............................................................ 2280 / 7087 .......I.................................................... 2340 / 7087 ............................................................ 2400 / 7087 ............................................................ 2460 / 7087 ............................................................ 2520 / 7087 ..........................I................................. 2580 / 7087 ...............I............................................ 2640 / 7087 ............................................................ 2700 / 7087 ............................................................ 2760 / 7087 ............................................................ 2820 / 7087 ............................................................ 2880 / 7087 ............................................................ 2940 / 7087 ............................................................ 3000 / 7087 ............................................................ 3060 / 7087 ............................................................ 3120 / 7087 ............................................................ 
3180 / 7087 ............................................................ 3240 / 7087 ............................................................ 3300 / 7087 ............................................................ 3360 / 7087 ............................................................ 3420 / 7087 ............................................................ 3480 / 7087 ............................................................ 3540 / 7087 ............................................................ 3600 / 7087 ............................................................ 3660 / 7087 ....................................................S....... 3720 / 7087 ...I....................................................I... 3780 / 7087 ...........S..........S..SSSS...............S............... 3840 / 7087 ............................................................ 3900 / 7087 ............................................................ 3960 / 7087 ............................................................ 4020 / 7087 ..................................S................S........ 4080 / 7087 ............................................................ 4140 / 7087 ............................................................ 4200 / 7087 ............................................................ 4260 / 7087 ...............I............................................ 4320 / 7087 ............................................................ 4380 / 7087 ............................................................ 4440 / 7087 ............................................................ 4500 / 7087 ............................................................ 4560 / 7087 ............................................................ 4620 / 7087 ............................................................ 4680 / 7087 ............................................................ 4740 / 7087 ............................................................ 
4800 / 7087 ............................................................ 4860 / 7087 ............................................................ 4920 / 7087 ............................................................ 4980 / 7087 ............................................................ 5040 / 7087 ............................................................ 5100 / 7087 ............................................................ 5160 / 7087 ....................E....................................... 5220 / 7087 ............................................................ 5280 / 7087 ............................................................ 5340 / 7087 ............................................................ 5400 / 7087 ............................................................ 5460 / 7087 ............................................................ 5520 / 7087 ............................................................ 5580 / 7087 ............................................................ 5640 / 7087 ......................................E.EEEEEEEEEEEEEEEEE... 5700 / 7087 ............................................................ 5760 / 7087 ................................................S........... 5820 / 7087 ..............................S............................. 5880 / 7087 .........S................SSSSSSSS.......................... 5940 / 7087 ............................................................ 6000 / 7087 ............................................................ 6060 / 7087 ............................................................ 6120 / 7087 ............................................................ 6180 / 7087 S........................F...I....F......................... 6240 / 7087 ...............I............................................ 6300 / 7087 ....I....................................................... 6360 / 7087 .....................................S...................... 
6420 / 7087 ............................................................ 6480 / 7087 ............................................................ 6540 / 7087 ............................................................ 6600 / 7087 ...............I............................................ 6660 / 7087 ............................................................ 6720 / 7087 ......................................................F..... 6780 / 7087 ............................................................ 6840 / 7087 ............................................................ 6900 / 7087 ............................................................ 6960 / 7087 ............................................................ 7020 / 7087 .........I.................................................. 7080 / 7087 .......
Time: 04:50
There were 19 errors:
1) testCreate(Zend_Memory_MemoryManagerTest)
Zend_Memory_Exception: Memory manager can't get enough space.
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Memory/Manager.php:408
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Memory/Manager.php:381
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Memory/Manager.php:287
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Memory/Manager.php:254
2) testQueryParser(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
3) testEmptyQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
4) testTermQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
5) testMultiTermQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
6) testPraseQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
7) testBooleanQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
8) testBooleanQueryWithPhraseSubquery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
9) testBooleanQueryWithNonExistingPhraseSubquery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
10) testFilteredTokensQueryParserProcessing(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
11) testWildcardQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
12) testFuzzyQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
13) testInclusiveRangeQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
14) testNonInclusiveRangeQuery(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
15) testDefaultSearchField(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
16) testQueryHit(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
17) testDelayedResourceCleanUp(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
18) testSortingResult(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
19) testLimitingResult(Zend_Search_Lucene_Search23Test)
Uninitialized string offset: 7
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene/Storage/File.php:200
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:436
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:538
/home/paul/libs/ZendFramework-1.7.6/library/Zend/Search/Lucene.php:201
--
There were 3 failures:
1) testSetExpirationSeconds(Zend_SessionTest)
iteration over default Zend_Session namespace failed; expecting result === ';a === apple;o === orange;p === pear', but got ' thrown in /home/paul/libs/ZendFramework-1.7.6/library/Zend/Session.php on line 444'
Failed asserting that <boolean:false> is true.
2) testTableNameSchema(Zend_Session_SaveHandler_DbTableTest)
Expected exception Zend_Db_Statement_Exception
3) testBasic(Zend_Validate_File_MimeTypeTest)
Test expected true with 'image/gif'
Messages: array (
  'fileMimeTypeFalse' => 'The file 'testsize.mo' has a false mimetype of 'text/plain'',
)
Failed asserting that <boolean:false> matches expected value <boolean:true>.
FAILURES!
Tests: 7087, Assertions: 31113, Failures: 3, Errors: 19, Incomplete: 14, Skipped: 82.
Command exited with non-zero status 2
124.96user 7.65system 4:57.53elapsed 44%CPU (0avgtext+0avgdata 0maxresident)k
312inputs+28616outputs (2major+335358minor)pagefaults 0swaps
Give me one good reason to not be pissed off by that. To me, the point of a test suite is verification of reliability. I now have irrefutable evidence from its own test suite that Zend Framework 1.7.6 cannot be relied upon. Why should I risk using it?
Still, it could indicate incompatibility with my environment rather than a deliberate releasing of the software with failing tests. The memory manager failure is particularly disturbing but may have something to do with my setting an unlimited memory limit. I don’t care about the Lucene search so I can ignore those. The failures seem minor enough. I’ll carry on, but my confidence has been shaken.
1d Making Zend Framework Available To My Application
According to the manual, I can use the PHPRC environment variable to specify a user-local configuration file. I need to ensure ZF is in the include path and, preferably, configure the ZF auto class loader. I’ll do that using the auto_prepend_file directive. Unfortunately, I had to hard-code the entire include path as I couldn’t get the reference syntax to work.
Here’s the configuration I’ve used. I’ve not run the tests yet to prove this is correct.
~/libs/php.ini.paul
[php]
include_path = ".:/usr/share/php:/usr/share/pear:/home/paul/libs/ZendFramework-1.7.6/library"
auto_prepend_file = /home/paul/libs/zf-bootstrap.php
~/libs/zf-bootstrap.php
<?php
require_once 'Zend/Loader.php';
Zend_Loader::registerAutoload();
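For anyone wondering what an auto class loader buys you, the convention it leans on is that underscores in a class name map to directories on the include path. The sketch below illustrates that convention with plain spl_autoload_register(); it is not Zend_Loader's actual code, and classNameToFile is a name invented for the example.

```php
<?php
// Sketch of the naming convention only, not Zend_Loader's implementation:
// underscores in the class name become directory separators.
function classNameToFile($class)
{
    return str_replace('_', DIRECTORY_SEPARATOR, $class) . '.php';
}

spl_autoload_register(function ($class) {
    $file = classNameToFile($class);
    // Only include files that can actually be found on the include path,
    // so unknown classes fail quietly instead of raising a fatal error here.
    if (stream_resolve_include_path($file) !== false) {
        require_once $file;
    }
});

// Prints Zend/Gdata/Photos.php on Unix-like systems.
echo classNameToFile('Zend_Gdata_Photos'), PHP_EOL;
```

With something like this registered, simply mentioning a class is enough to load its file, provided the library directory is on the include path.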
~> set -x PHPRC ~/libs/php.ini.paul
~> php -i | grep include_path
include_path => .:/usr/share/php:/usr/share/pear:/home/paul/libs/ZendFramework-1.7.6/library
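Hard-coding the whole include path is not the only option. A possible alternative, which is an assumption on my part and not what the configuration above does, is to append the library directory at runtime from the prepended script. The helper name appendIncludePath is hypothetical; the directory is the one used in this post.

```php
<?php
// Sketch only: append a directory to whatever include path is already
// configured, instead of restating the full path in php.ini.
function appendIncludePath($directory)
{
    $paths = explode(PATH_SEPARATOR, get_include_path());

    // Avoid adding the same directory twice if the script runs again.
    if (!in_array($directory, $paths, true)) {
        $paths[] = $directory;
        set_include_path(implode(PATH_SEPARATOR, $paths));
    }
    return get_include_path();
}

echo appendIncludePath('/home/paul/libs/ZendFramework-1.7.6/library'), PHP_EOL;
```

The trade-off is that the path then lives in code, not configuration, which may or may not suit a shared machine.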
Now back to my application tests…
~> cd public_html/zf-app/tests/
~/p/z/tests> phpunit BasicEnvironment
PHPUnit 3.3.15 by Sebastian Bergmann.
.F
Time: 0 seconds
There was 1 failure:
1) testZendFrameworkVersionIsCompatible(BasicEnvironment)
Need at least Zend Framework 1.6.0, have 1.7.6
Failed asserting that <boolean:false> is true.
/home/paul/public_html/zf-app/tests/BasicEnvironment.php:22
FAILURES!
Tests: 2, Assertions: 2, Failures: 1.
Oops! That’s a logic inversion bug in my test! Once fixed, both tests passed. Committed the updated test. I can do some real fun stuff now.
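A logic inversion of this kind is easy to reproduce with PHP's built-in version_compare(). The sketch below is not the code from my test; both function names are invented, and the version strings are the ones from the failure message.

```php
<?php
// Hypothetical helpers for illustration, not the real test code.
// Returns true when $installed is at least $required.
function isCompatibleVersion($installed, $required)
{
    // version_compare() returns -1, 0 or 1 when the first argument is
    // lower than, equal to or higher than the second.
    return version_compare($installed, $required) >= 0;
}

// The inverted form: with the arguments swapped, a newer installation
// looks too old.
function isCompatibleVersionInverted($installed, $required)
{
    return version_compare($required, $installed) >= 0;
}

var_dump(isCompatibleVersion('1.7.6', '1.6.0'));         // bool(true)
var_dump(isCompatibleVersionInverted('1.7.6', '1.6.0')); // bool(false)
```

The argument order is the whole bug, which is why it pays to see a test fail for the right reason before trusting it.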
Step 2 – Porting an Application to Zend Framework
2a Installing the Food Journal Application
First I need the code for the food journal application, which I export from the Subversion repository. That done, I move the directories around a bit to match ZF conventions. The app is already split into webroot and offroot which map to “public” and “application” respectively. Did that and committed.
2b Making the Application Work Again
Next I’ll make the application work again using the new directory structure. I don’t want to leap into porting bits to ZF until I’m satisfied the basics are operating correctly. The app uses a path mapping file, so it’s a simple matter of creating it and setting the offroot and webroot paths. That almost worked – the app uses the Smarty templating system which needs write access to its compiled template directory. That done, the app begins to work.
Gotta love code written by professionals, eh? That was almost too easy!
Anyway, on to the individual porting efforts. I’ll create a separate topic branch for each so that I can cherry-pick the changes I want during integration.
2c Porting the Picasa Adapter
The application was designed to work with multiple back-end photo storage systems. As I happened to write the Picasa adapter, it seems the ideal place to start. The core of the class is the following method:
public function getPhotostream( $user )
{
    return $this->organisePhotostream(
        $this->extractPhotosFromAlbums(
            $this->loadAlbumsFromUserFeed(
                $this->loadUserFeed( $user )
            )
        )
    ) ;
}
Pretty, eh? I’ve no desire, nor should I need, to change any of that because it’s perfect. Only the implementation will change, which is good because it’s an ugly mess of DOM and XPath queries dealing with Google’s Atom feed format.
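To show why the outer method can stay put while the innards change, here is a toy version of the same pipeline. Only the four method names come from the adapter; the class name PhotostreamSketch, the array shapes, the timestamps and the choice to sort oldest-first are all invented stand-ins, not the real DOM and XPath code.

```php
<?php
// Illustrative stand-in for the adapter. The helper bodies are made-up
// stubs working on plain arrays, not the real feed handling.
class PhotostreamSketch
{
    public function getPhotostream($user)
    {
        return $this->organisePhotostream(
            $this->extractPhotosFromAlbums(
                $this->loadAlbumsFromUserFeed(
                    $this->loadUserFeed($user)
                )
            )
        );
    }

    // Pretend feed: a list of album names for the user.
    protected function loadUserFeed($user)
    {
        return array('user' => $user, 'albums' => array('breakfast', 'dinner'));
    }

    // Pretend album loader: two photos per album with made-up timestamps.
    protected function loadAlbumsFromUserFeed(array $feed)
    {
        $albums = array();
        foreach ($feed['albums'] as $index => $name) {
            $albums[$name] = array(
                array('title' => "$name-1", 'taken' => 200 + $index),
                array('title' => "$name-2", 'taken' => 100 + $index),
            );
        }
        return $albums;
    }

    // Flatten the per-album lists into one list of photos.
    protected function extractPhotosFromAlbums(array $albums)
    {
        $photos = array();
        foreach ($albums as $albumPhotos) {
            foreach ($albumPhotos as $photo) {
                $photos[] = $photo;
            }
        }
        return $photos;
    }

    // Order the photos by the time they were taken, oldest first.
    protected function organisePhotostream(array $photos)
    {
        usort($photos, function ($a, $b) {
            return $a['taken'] - $b['taken'];
        });
        return $photos;
    }
}

$sketch = new PhotostreamSketch();
foreach ($sketch->getPhotostream('paul') as $photo) {
    echo $photo['title'], PHP_EOL;
}
```

Each stage takes the previous stage's output, so any one of them can be rewritten against a different library without touching getPhotostream().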
Here’s where it gets difficult. Getting a user feed using Zend_Gdata_Photos is easy enough, but what then? The ref guide implies I can just iterate over the entries in the feed. Time for some debug printouts, methinks. Having already installed FireBug and FirePHP, I’d like to play with the Zend_Log facility.
I really don’t like the Gphoto API, or maybe the documentation is confusing. Either way, this port is proving more frustrating than I thought it would be. Enough for today.
21-Mar-2009 @ 08:45 Carrying on this morning. I’ve decided to code naked, which isn’t some special technique – I’m in bed! I left off with a gripe about the Zend_Gdata_Photos API and some nearly-working code. My main trouble is figuring out which layer of inheritance exposes the method that returns the data I want in the form I want it. Scanning a single Zend_Gdata_Photos_PhotoEntry object with FirePHP, it looks like I want the media group, which exposes descriptive data and thumbnails, but the methods of the class all return strings and I definitely want an object. One level of inheritance back I find what I’m looking for – Zend_Gdata_Media_Entry::getMediaGroup() which returns an object. NO IT DOESN’T! The method I want, which returns the data I can see is in the object, has also been overridden to return a string, rendering the API virtually useless. What a crock of shit! I’m wasting my time trying to work like this.
21-Mar-2009 @ 17:40 Now I’m really confused. The API docs say that PhotoEntry::getMediaGroup() returns a string but it in fact returns an object. Sigh.
Double-sigh. Whilst getting frustrated about the lack of FirePHP output, I speculatively decided to send the content of the exception I’m throwing to flush the debug channel. Zend_Gdata_App_HttpException: Invalid chunk size followed by what is clearly a fragment of the feed. Is this yet another HTTP client implementation that can’t handle chunked transfer encoding correctly? Is Google’s server misbehaving? How am I supposed to know? As if to add insult to injury, the script is now refusing to return and is not obeying the time limit (my fault, see below). This isn’t fun so I’m going to do something hopefully more exciting – try OpenSolaris.
The Zend_Gdata_Photos API is useless. The Zend HTTP client may be broken.
22-Mar-2009 @ 08:00 Such failures might deter a lesser individual but not me. I’m going to work around the maybe-broken HTTP client by using the Zend_Cache. I’m not giving up on the Photos API because I haven’t looked at the underlying code yet so the problems could still be in my head or the documentation.
Caching first. As my Picasa adapter makes several large HTTP requests in quick succession, I’m thinking the failures may be related to pipelining, so keeping the results locally will at least allow me to focus on the data manipulation.
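The idea of keeping results locally can be sketched in a few lines of plain PHP. To be clear, this is a stand-in for the concept and NOT the Zend_Cache API; the function name fetchWithCache, the file naming scheme and the one-hour lifetime are all assumptions made for the example.

```php
<?php
// Plain-PHP illustration of a read-through cache, not Zend_Cache.
// $loader is any callable that performs the expensive request.
function fetchWithCache($key, $loader, $cacheDir, $lifetime = 3600)
{
    $file = $cacheDir . DIRECTORY_SEPARATOR . 'feed-' . md5($key) . '.cache';

    // Serve the stored copy while it is still fresh.
    if (is_file($file) && (time() - filemtime($file)) < $lifetime) {
        return unserialize(file_get_contents($file));
    }

    // Otherwise do the real work and remember to save the result.
    $data = call_user_func($loader, $key);
    file_put_contents($file, serialize($data), LOCK_EX);
    return $data;
}

// Usage with a fake loader that counts how often it is really called.
$calls = 0;
$loader = function ($key) use (&$calls) {
    $calls++;
    return array('feed' => $key, 'entries' => array('photo-1', 'photo-2'));
};

$key = 'hypothetical-user-feed-' . uniqid();
fetchWithCache($key, $loader, sys_get_temp_dir());
fetchWithCache($key, $loader, sys_get_temp_dir());
echo $calls, PHP_EOL; // the loader ran once; the second call hit the cache
```

The step that is easiest to forget is the write after a miss; without it every request goes back to the network.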
08:40 After some wrangling with the cache module I’m back to where I started – the script never returns. I suspect it is related to my attempt to retrieve the thumbnails and send the structure over FirePHP so I remove that and try again. I also notice that I’ve neglected to save anything to the cache. I add that before trying again. Silly!
Hmmm… the cache files have been written to /tmp as expected but the script is still hanging. Odd. The only thing I can think of that could cause this is the addition of dumping the exception content. Remove that also. Try again. Immediate response. Very weird. No matter… I can continue.
09:20 Success at last! My simple testing indicates that the array of photos is being constructed correctly using the Zend_Gdata_Photos API. Time for a complete run. It looks good, so now to clean up the code, commit and publish.
While cleaning up I noticed that I had relaxed the memory and time limits on the script. Oops. I took out the relaxation code and immediately the script failed by running out of memory. The PHP default of 16MB is not enough, especially on a 64-bit machine. Raising the limit to 128MB also caused failure. Doubling to 256MB still failed. The script was successful with a 512MB memory limit. That’s outrageous and something I will have to investigate.
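Rather than doubling the limit until the script survives, it may be quicker to measure what a step actually needs. A minimal sketch using PHP's own memory_get_peak_usage(); buildLargeStructure is a made-up stand-in for parsing a big feed, and the entry count is arbitrary.

```php
<?php
// Made-up workload standing in for building a large array of photos.
function buildLargeStructure($entries)
{
    $photos = array();
    for ($i = 0; $i < $entries; $i++) {
        $photos[] = array('id' => $i, 'title' => str_repeat('x', 100));
    }
    return $photos;
}

// Report how far the peak memory usage rose while $callback ran.
function measurePeakMemory($callback)
{
    $before = memory_get_peak_usage();
    call_user_func($callback);
    return memory_get_peak_usage() - $before;
}

$bytes = measurePeakMemory(function () {
    buildLargeStructure(10000);
});

printf(
    "memory_limit: %s, extra peak: %.1f MB\n",
    ini_get('memory_limit'),
    $bytes / 1048576
);
```

Wrapping each stage of the adapter in a measurement like this would show which one is responsible for the appetite.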
10:20 Step 2c complete. The Picasa adapter is sufficiently ported to Zend Framework that I can consider what to do next.
Step 2d Implementing MVC
This is where I expect ZF to shine but is also the most invasive change to the application. The food journal is simple so doesn’t use the front controller pattern. Instead it is separated into a front page and an application script which accepts parameters. To successfully port this to the ZF MVC I’ll need to carefully plan the changes and re-design the application’s operational map to suit. I’ll also need to re-read the ZF MVC documentation.
However, the hacker in me says JFDI! Resist or succumb?
Ah, what the heck? Let’s compromise! There’s no harm in writing down how the application currently hangs together.
The Front Page
Well that was immediately valuable! I can see straight away that the front page of the application does indeed employ the front controller pattern by accepting a page parameter.
| Page name | Purpose |
|---|---|
| none | Display the front page with news, links to the “try it out” pages (named flickr and picasa), a list of preset food journals (linking to foodjournal.php) and a list of links to informative pages (named about and news). |
| about | Helpful information about the application and the navigation (preset food journals and informative links) |
| news | Latest news and scientific research, with navigation |
| flickr | Information about how to set up a food journal on Flickr, a form to type in a Flickr screen name to display the food journal and navigation |
| picasa | Information about how to set up a food journal on Picasa, a form to type in a Picasa screen name to display the food journal and navigation |
Clean and consistent. The navigation footer is repeated on all the pages. This should be a piece of cake to re-implement using ZF MVC. The templating engine is currently Smarty but I’ll switch over to the ZF style of simple PHP includes. Anyway, time to work through the ZF Quickstart which focuses on MVC.
Set Up The Project Structure
The quickstart says I should create directories for controllers, views and scripts. I’ll do this on a git topic branch to keep it separate from the Picasa work I did earlier – they aren’t related after all. Git makes it easy to merge multiple branches and I’ll be using that later.
~/p/zf-app> git branch
  master
* zend-picasa-port
  zend-port
~/p/zf-app> git checkout -b zend-mvc-front-page zend-port
Switched to a new branch "zend-mvc-front-page"
~/p/zf-app> mkdir -p application/controllers application/views/scripts
Create a Rewrite Rule
As I’m using Lighttpd and not Apache, the quickstart tells me what I need to do but not how to do it. I’m not sure if Lighty handles per-directory configuration files so I’d better read the manual.
Some Googling and head-scratching later, I settled on the following rule, which is broken but works well enough to let me carry on.
url.rewrite-once = (
    ".*.(js|gif|jpg|png|css)$" => "$0",
    "~paul/zf-app/public(.*)" => "~paul/zf-app/public/index.php$1"
)
Create a Bootstrap File
As the application already has a bootstrap-like file, I won’t just copy the quickstart version blindly. I’ll need to adapt the current file to match what ZF needs, which is pretty much just setting up the controller and dispatching the request. Having done that, I commit the changes in a known-broken state. I like to see the evolution of code.
Create an Action Controller & View
Creating the index controller class is easy – it’s got no code! I moved the Smarty template for the index into the views/scripts/index directory, renamed it to index.phtml, commented out the header and footer inclusions and then ran the script. Up pops a page! Too easy.
Create an Error Controller & View
Cough! I’ll come back to this later.
Create a Layout
Ah, here’s the way to set the headers and footers. Looks pretty straightforward too. The application currently includes a header and footer template on each page template which is a bit repetitive (although there are good reasons for it). Using ZF I can do both in a single file and include the content. On the other hand, both the header and footer have dynamic elements that I’ll have to comment out for now. I suspect I’ll be able to implement them later using view helpers.
This also raises a question about Git. It can track files being moved around, sometimes automatically by detecting substantially similar content, but can it tell if two files are merged into one? I doubt it and I’m not aware of any VCS that can.
Hmm… another issue. I normally refuse to enable PHP’s short_open_tag setting but the quickstart makes extensive use of short tags. Should I make an exception in this case? No, because they hide the fact that I’m echoing strings of text synthesised using PHP, whereas normal open tags make it perfectly clear.
<?= $this->layout()->content ?>
versus
<?php echo $this->layout()->content; ?>
I suspect I’ll be writing my own XSLT-based views before long. I abhor XML synthesised from text. It’s so primitive and prone to validity errors. XML documents are data structures.
Hmm… the navigation footer needs to link to several other pages and the food journal itself but the quickstart is, as yet, silent on how to do so. Rather than copy the URL generation code that I don’t understand, I’ll omit the navigation links for now.
HA! Having made the layout changes, as soon as I run the application I get a syntax error message complaining about an unexpected T_STRING. It turns out that PHP is currently configured with short_open_tag enabled and it’s complaining about the <?xml ... ?> preamble. Funny how that’s missing from the quickstart. Anyway, I alter the system-level PHP configuration to disable short_open_tag, run the application and it works – the index page has a basic header and footer.
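For anyone who can't change the system-level configuration, a common alternative is to emit the XML preamble from inside PHP so the template never contains a literal short-tag opener. A minimal sketch, assuming a UTF-8 document (the encoding is my assumption, not something stated above):

```php
<?php
// Build the XML preamble as a PHP string so the template never contains a
// literal "<?xml" for the short-tag parser to trip over.
$preamble = '<?xml version="1.0" encoding="UTF-8"?>' . "\n";
echo $preamble;
```

This works whether short tags are enabled or not, at the cost of a slightly odd-looking first line in the layout.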
28-Mar-2009 @ 10:30 A couple of evenings ago I took the sledgehammer approach and completed the app’s front page in MVC form. I used the url view-helper to create the navigation links and damn they’re ugly! The links in the template went from
<a href="?page=picasa">Using Picasa</a>
to
<a href="<?php echo $this->url(array('controller'=>'index','action'=>'picasa'),'default',true) ?>">Using Picasa</a>
Hopefully that is my inexperience with ZF showing through and there exists a more elegant and comprehensible way to draw links. Anyway, I’m not going to obsess about it as today I want to push towards completing my play by implementing the food journal functionality.
Step 2e Implementing the Food Journal
I prepare for this by drawing the links from the front page to the Flickr and Picasa pages using the same ugly url helpers from the layout navigation links.
Next, I need to implement the forms on the pages which supply the food journal with the necessary details to retrieve someone’s photos and display them. The food journal will be a separate controller. Passing the details to the controller seems like an ideal time to use parameterised routes. The food journal takes several parameters;
- provider: Code for the back-end photo storage provider. Currently supports flickr and picasa. No default so must be specified.
- username: Identifier or screen name of the person whose food journal is to be displayed. Depends on the selected provider. No default so must be specified.
- start: Start date from which to show the food journal. Defaults to “last Monday”.
- range: Controls the type of food journal display. Supports index, day and week (the default).
- daybreak
It seems clear that the non-default parameters must form the route and that the others continue to be provided in the query string, yielding URLs like http://hostname/app-base/foodjournal/picasa/Libertus96?start=2008-08-08&range=day. However, this introduces a problem – how am I to direct a form submission to a varying URL? That is not possible with HTML as the form element’s action attribute is fixed. Perhaps MVC routing is inappropriate here. I could solve this by issuing a redirect but I don’t like to introduce such an overhead unnecessarily.
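The redirect option can be sketched as a small helper that turns the posted form fields into the route-style path. The function name `journal_redirect_path` is hypothetical, the path omits the application base, and the validation rules are only what the parameter list above implies:

```php
<?php
// Hypothetical helper: validate the posted screen name and build the
// food journal path to redirect to. Returns null on invalid input so the
// caller can redisplay the form with an error message instead.
function journal_redirect_path($provider, $username)
{
    $providers = array('flickr', 'picasa');
    $username  = trim($username);

    if (!in_array($provider, $providers, true) || $username === '') {
        return null;
    }
    // rawurlencode keeps spaces and slashes in screen names from breaking the route.
    return '/foodjournal/' . $provider . '/' . rawurlencode($username);
}

$path = journal_redirect_path('picasa', 'Libertus96');
if ($path !== null) {
    // In a real request this would be sent as a Location header.
    echo "redirect to $path\n";
}
```

The cost is one extra round trip per form submission, which is the overhead being weighed here.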
Tidying Up a Little
Before I embark on the major rewrite, I’m going to tidy up the code a little by removing the parts that are now implemented using ZF MVC. The application still carries on after ZF has done its bit and spits out Smarty error messages. The control logic from the previous front controller implementation is still in place. All that has to go.
Implementing the Food Journal Controller
28-Mar-2009 @ 13:45 After several social distractions, I’m ready to implement a parameterised controller. Thinking about the logic, I realise that range is really the action because it determines which view is to be displayed. I can use URLs in the form http://hostname/app-base/foodjournal/picasa/Libertus96/week/2008-08-08?daybreak=07:00, following the principle that all mandatory elements must precede any optional element.
I re-read the manual section on MVC routing.
28-Mar-2009 @ 14:30 Ahhh… now some things start to make more sense, including the URL view-helper I previously declared ugly. Here’s the route I decided upon;
Zend_Controller_Front::getInstance()
    ->getRouter()->addRoute(
        'foodjournal'
        , new Zend_Controller_Router_Route(
            'foodjournal/:provider/:username/:action/:start'
            , array(
                'controller' => 'foodjournal'
                , 'action' => 'week'
                , 'start' => date( 'Y-m-d', strtotime('last Monday') )
            )
            , array(
                'action' => 'index|day|week'
                , 'provider' => 'picasa|flickr'
                , 'start' => '\d{4}-\d{2}-\d{2}' /// FIXME match date
            )
        )
    );
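The FIXME is there because a pattern of digits and hyphens only checks the shape of the start parameter, not whether it is a real date. One way to close that gap inside the action is sketched below; the function name `is_valid_start_date` is hypothetical and this is not part of the route definition itself.

```php
<?php
// Hypothetical validator for the :start route parameter.
function is_valid_start_date($value)
{
    // Shape check first: four digits, two digits, two digits.
    if (!preg_match('/^(\d{4})-(\d{2})-(\d{2})$/', $value, $m)) {
        return false;
    }
    // checkdate() takes month, day, year and rejects dates like 2009-02-30.
    return checkdate((int) $m[2], (int) $m[3], (int) $m[1]);
}

var_dump(is_valid_start_date('2008-08-08')); // bool(true)
var_dump(is_valid_start_date('2009-02-30')); // bool(false)
var_dump(is_valid_start_date('dddd-dd-dd')); // bool(false)
```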
I created a new FoodjournalController class with three actions, namely indexAction(), weekAction() and dayAction(), moved the existing Smarty templates into the views/scripts/foodjournal directory naming them index, week and day respectively. I then added a navigation link to my personal food journal on Picasa like so;
$this->url(array('provider'=>'picasa', 'username'=>'libertus96'), 'foodjournal', true)
That’s not quite so ugly now that it makes sense!
The basic URL routing also works fine in that I can access all three views by changing the URL. I had to comment out vast parts of the Smarty templates that dealt with dynamic data. Next step is to acquire and make use of the parameters to get some of the dynamism back.
28-Mar-2009 @ 17:15 Success! I focused on the default weekly food journal view and reimplemented some basic dynamic data, including the week start date and the user’s display name. Still lots more to do but I have the model for it now.
Removing “FIXME: Built With Zend Framework”
29-Mar-2009 @ 10:00 It’s a lovely sunny morning so I’m braving the cold and putting on some shorts. Yesterday saw some decent progress in my understanding of the ZF MVC implementation. I could almost say I’m starting to have fun! Early on, during the initial work on layouts I had to hard-code the site title because I didn’t know how to make it dynamic. That was originally done in Smarty by passing a title prefix to the header include. I’m now aware that ZF offers a title placeholder so I’m going to use that and remove the (rather unfair) “FIXME”.
29-Mar-2009 @ 10:30 That didn’t take too long. Turns out the title helper is pretty flexible and can do exactly what I needed it to. The “FIXME” is gone and so have the commented-out bits that implemented the same functionality from the original templates. Much cleaner now. I think it’s time to approach loading the photos and navigation.
Less Broken – Acquiring The User’s Photostream
There are already functions available to load photos and calculate navigation which I can use. They’re not designed to be compatible with ZF but I can’t think of any reason why they wouldn’t work. And they do! It’s that joy again – working with software designed and written by seasoned professionals. Clear separation of concerns and some thought given to future maintainers.
29-Mar-2009 @ 13:45 I’ve re-used the existing photostream acquisition functions and implemented the day header links on the week view. The trouble now is that the branch I’m working on doesn’t contain the refactored Picasa adapter so there’s no caching and it takes ages to load my photo feed. Often so long that it misbehaves and quits half-way through. I’m going to create a new merged branch to continue development. Later on I’ll rebuild the development history by cherry-picking.
~/p/zf-app> git branch
  master
* zend-mvc-front-page
  zend-picasa-port
  zend-port
~/p/zf-app> git checkout -b zend-mvc-picasa-merged
Switched to a new branch "zend-mvc-picasa-merged"
~/p/zf-app> git merge zend-picasa-port
Auto-merged public/paths.php
Merge made by recursive.
 application/class.picasa.php | 154 ++++++++++++++++++++----------------------
 public/paths.php | 2 +
 2 files changed, 76 insertions(+), 80 deletions(-)
That’s better. After an initial pause for the feed to be parsed and stored in the cache, clicking on the daily links returns fairly quickly. The cost is a massive increase in memory use but I’ve already noted that as a subject of investigation. For now I don’t care.
Filling In The Blanks
Next I’ll re-implement the “no photos” navigation template which provides handy links to the nearest day for which a user has photos if the current view doesn’t show any. It’s not a difficult template to port but I am starting to get a bit annoyed copying the url generation code. It’s OK for now but I’ll have to re-implement it as a view helper at some point.
29-Mar-2009 @ 15:30 Photos! The food journal is almost back to where I started with it. One minor anomaly is that the times on the photos appear to be an hour forward. Anything to do with today’s switch to British Summer Time, perhaps? Hmmm…
With photos appearing on the week and day views, all that remains is the index view, which is straightforward, and the navigation area which is a little more demanding.
29-Mar-2009 @ 17:15 Almost done! I’ve reimplemented the index view, navigation area and the list of pre-selected users. All that remains functionally is to implement the ad hoc food journal forms on the Flickr and Picasa pages. I’ll have to figure out why the times are an hour ahead. I’ll need to implement error handling. Then clean up.
But I’ve done enough for today. My bum is sore and I really fancy a beer in what remains of the sunshine.
Basic Error Handling
29-Mar-2009 @ 20:00 Ah, what the heck? Beer in hand, I implemented very basic error handling by copying the code out of the ZF Quickstart. This will allow me to remove a lot of now obsolete code.
I removed the obsolete code (lots of it) and, at the same time, fixed the hour difference on the photo times by setting the default timezone to UTC. Just the ad hoc journal display to implement now.
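The one-hour offset is what you'd expect if timestamps stored in UTC were being formatted in a local zone that had just moved to summer time. A small sketch of the effect, assuming the server's zone was Europe/London (my assumption; the zone isn't stated above) and using a made-up photo time:

```php
<?php
// A made-up photo timestamp: 15:30 UTC on 29-Mar-2009, after the BST switch.
$taken = gmmktime(15, 30, 0, 3, 29, 2009);

// Formatted in the assumed local zone, the time appears an hour ahead.
date_default_timezone_set('Europe/London');
$local = date('H:i', $taken);

// Formatted with the default zone set to UTC, it matches the stored value.
date_default_timezone_set('UTC');
$utc = date('H:i', $taken);

echo "Europe/London: $local, UTC: $utc\n";
```

Setting the default zone to UTC hides the offset rather than converting to the photographer's local time, which may or may not be what the journal wants in the long run.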
Ad Hoc Journal Display
30-Mar-2009 @ 10:00 Damn British Summer Time! I feel like half the morning is gone already and I wanted to spend all of it finishing off the development. Oh well, just the most important feature left to reimplement – allowing people to type in an arbitrary Flickr or Picasa screen name and have the food journal displayed.
As I mentioned before, this is a little challenging because of the way in which URLs to the food journal are now drawn. I suppose I should read the manual to see if I can do this without redirection, but if I don’t issue a redirect the site’s URLs will become internally inconsistent. So I suppose what I’ll do is make each form submit back to its own controller, have the action validate so that errors can be displayed in-place, or redirect to the food journal controller if the input is OK. I’m aware that controllers can also forward actions between themselves so I could also use that strategy. First, though, I’ll check to see how broken things are as they stand.
Hmm… simply not working. However, I find in the manual the preDispatch method. Screw the URL structure!
HA! Screwing with the URL structure is a disaster! Although the Flickr ad hoc form can now display a food journal, the navigation area of the journal doesn’t work correctly. I can fix it though.
Having fixed the daybreak form action (using the url view-helper) I survey the damage in Git and find that I’ve made two separate logical changes to the source at once. With other version control systems this could be a pain to untangle, but not with Git! I use the (almost) magical
git add --patch
to stage the separate changes separately and
git diff --cached
to verify what I’m about to commit is just the one change.
30-Mar-2009 @ 11:25 The port of the food journal application to Zend Framework is now functionally complete! w00t!
Bug Hunt!
Error handling is primitive, unhelpful and reveals sensitive site-related information.
Loading the food journal for
Loading the food journal for
Food journals don’t show user’s display name in page title.
Step 3 – Finishing
30-Mar-2009 @ 13:30 I’d like to be in a position to publish today so some basic finishing is in order. There are bugs I have to fix – one of them severe. I’d also like to profile the application to determine how best to configure the public server’s resources. Fortunately the severe bug relates to pathological resource utilisation so I’ll need the profiling data before I can adequately fix it.
Less Primitive Error Handling
First, fix the silly caching exception and, by relation, the primitive error handling. The ZF quickstart example shows how to modify the behaviour of the error display based on the environment in which the application is running, specifically development or production. I’ll use that to address the “error handler reveals sensitive data” bug as well as to explore how I can handle exceptions from the photo storage providers in a friendlier manner. Previously, the application dealt with “user not found” or “photos not found” exceptions by redirecting to the appropriate ad hoc display page to display the error message. I’ll reimplement this behaviour.
The fix for the caching bug was easy but I need the previous behavior to experiment with capturing and logging truly unexpected exceptions if the site is running in production mode. I’ve been using FirePHP for logging during development and I’m happy with that because it’s helpfully revealing.
30-Mar-2009 @ 15:15 That was fun! The application now automatically detects if it is running in a development or production environment. In production, for security, technical details of fatal errors are written to the webserver’s error log and the user is politely requested to help us fix problems by emailing the details. In development mode, error details are sent to FirePHP and displayed on-screen.
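The split between the two modes can be sketched roughly as follows. This is not the application's actual code: the environment variable name `APPLICATION_ENV`, both function names and the wording of the user message are assumptions for illustration, and FirePHP is left out to keep the sketch self-contained.

```php
<?php
// Hypothetical sketch: choose the environment from an APPLICATION_ENV
// variable (an assumed name), defaulting to the safer production mode.
function detect_environment()
{
    $env = getenv('APPLICATION_ENV');
    return $env === 'development' ? 'development' : 'production';
}

// Returns the text to show on screen for an unexpected exception.
function report_exception(Exception $e, $environment)
{
    if ($environment === 'development') {
        // Developers get the full technical details.
        return get_class($e) . ': ' . $e->getMessage() . "\n" . $e->getTraceAsString();
    }
    // Production: details go to the error log, the user sees a polite request.
    error_log(get_class($e) . ': ' . $e->getMessage());
    return 'Sorry, something went wrong. Please email us the details of what you were doing.';
}

echo report_exception(
    new RuntimeException('cache directory not writable'),
    detect_environment()
), "\n";
```

Defaulting to production means a misconfigured server fails closed, hiding details rather than leaking them.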
All that was for unexpected errors. Now to deal with the expected ones, such as the user typing in a Flickr or Picasa screen name that doesn’t exist, by returning to the appropriate page and showing a helpful message.
30-Mar-2009 @ 17:00 That was less fun. I’ve implemented friendlier error messages for expected errors with both Flickr and Picasa, but I’m not happy with the way I’ve done it. It feels hacky. However, it does seem to work so I’ll stick with it for now.
Application Profiling
All the known bugs are fixed except for the resource exhaustion issue. I doubt I can fix this using intuition alone, so I need to gather more detailed evidence using a technique called
2-Apr-2009 @ 19:00 My preferred tools (in fact, probably the only tools) for profiling PHP applications are
sudo apt-get install php5-xdebug kcachegrind
Xdebug needs a bit of configuration – I want to control where profiling data files are written and ensure that profiling only occurs when I ask for it by appending ?XDEBUG_PROFILE to the URL for the application. Profiling data files are large and profiling slows down the application so I don’t want it constantly enabled.
19:45 Heh, with profiling enabled the application can’t load and process my food journal within the 30-second time limit. I’ll have to relax that if I want a complete profiling run. I expect the first run to take a long time because the cache is empty and the data has to be pulled from Picasa over the internet. The second run is a lot faster due to caching but seems to consume a lot of memory to achieve the speed boost.
Hmmm… initial profiling is fairly revealing. Of the entire runtime of the program, 37% is spent making 105,000 calls to Zend_Gdata_App_FeedEntryParent->lookupNamespace and 25% is spent making 121,000 calls to Zend_Gdata_App_Base->lookupNamespace.
4-Apr-2009 @ 11:45 I created two profiles this morning; the first with no cached Picasa data and the second with data coming from the cache. I implemented caching at the object level, that is, the Zend_Gdata_Photos_* objects are serialised into the cache after the feed data has been loaded and processed. The application runtimes are an order of magnitude apart and the profiles reveal why. The uncached profile shows the same pattern as the one captured a couple of days ago – the majority of the runtime is spent looking up namespaces. The second profile is more representative of the performance of the application itself and not the Zend_Gdata classes.
Hmm… at least some attempt has been made to improve the namespace lookup performance according to
What “Open Source” Means
If it is broken and I have the skill and inclination, I can repair it. If I am then so inclined, I can share the improvement with everyone else. Well I have the skill and I need the repair. First, I have to build a test case to prove that a) the problem is what I think it is, and b) my fix is effective.
The Zend_Gdata API documentation implies that I can construct a feed object from a DOM that I have already loaded. I’ll use local copies of my Picasa feeds for the test so that environment is as controlled as possible.
5-Apr-2009 @ 10:00 After some wireless network hassles this morning I’m back working on a performance-improving patch to Zend Framework. My hacking yesterday showed that the code can be rewritten to work twice as fast.
~/l/Z/library> git branch
  lookupnamespace-fix
* master
~/p/z/tests> time php Zend_Gdata_Photos_Performance.php
MD5 checksum = 4d796492005eddc1f0957b51873082fa
6.51user 0.16system 0:06.70elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
~/l/Z/library> git checkout lookupnamespace-fix
Switched to branch "lookupnamespace-fix"
~/p/z/tests> time php Zend_Gdata_Photos_Performance.php
MD5 checksum = 4d796492005eddc1f0957b51873082fa
3.23user 0.18system 0:03.43elapsed
The test program constructs, from local files, one Zend_Gdata_Photos_UserFeed object and 4 Zend_Gdata_Photos_AlbumFeed objects, aggregates them into a single stdClass object then takes the MD5 checksum of serializing the aggregate. This helps assure me that my alterations have not affected the data in any way. It is no substitute for a complete unit test suite but is good enough for my current purposes.
I’ll show results from profiling later. I still have one more function to rewrite for final performance improvement figures. There are many functions I could rewrite but I’m focusing on proving how much more performance could be squeezed out so I’m only changing the most-called functions.
11:25 That done, I notice that
11:40 All but one of the changes in the patch apply cleanly on ZF 1.7.8. After manual re-application I re-ran my test. Here are the final results with profiling data.
That’s a 2.5x speed increase. Pity that to achieve it I had to make the code less clean, but that’s not really my problem. Time to figure out how to push this discovery to the upstream maintainers. Found sent patch + test + data over to him added my findings as a sub-task to the
The 90/10 Rule
11-Apr-2009 @ 09:00 The week has been interesting albeit with little progress on the application. I have applied to
Memory Use Profiling
First I shall quantify what I mean by “excessive memory consumption”. I shall use the same
09:45 The test data I’m using are a Picasa user feed (32k) and 4 Picasa album feeds (153k, 154k, 142k and 471k). For simplicity, I’ll call that 1 megabyte. The test program constructs a Zend_Gdata_Photos_UserFeed object from the user feed and Zend_Gdata_Photos_AlbumFeed objects from each album feed. The 5 objects are aggregated into a single object, serialized and the MD5 hash calculated. How much memory is required to do that on my x64 Ubuntu 8.10 system with Zend Framework 1.7.8 and PHP 5.2.6?
I added the following code to the end of the program;
echo 'Peak memory = '
    .number_format(memory_get_peak_usage()/(1024*1024),1).'MB (internal) '
    .number_format(memory_get_peak_usage(true)/(1024*1024),1).'MB (external)'."\n";
The results are astonishing – enough to make me question my methodology!
~/p/z/tests> php Zend_Gdata_Photos_Performance.php
MD5 checksum = d49558f6839c8731d72f4b66e919dbdd
Peak memory = 76.4MB (internal) 77.0MB (external)
Wow! Now, that is the memory required by the entire script not just the part of the framework in which I’m interested. I’ll have to eliminate factors external to the Zend_Gdata library otherwise I’d be tempted to throw it away right now! So, I’ll take the memory usage just before the user feed is loaded and just after the album feeds are loaded and report the difference. For comparison I shall also report the size of the aggregated object after serialization.
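The before-and-after measurement can be sketched like this. The workload is a stand-in array of fake entries rather than real Zend_Gdata feed objects, so the numbers it prints say nothing about the library; only the measuring technique carries over.

```php
<?php
// Take a baseline just before the work of interest starts.
$before = memory_get_usage();

// Stand-in workload: the real test builds Zend_Gdata feed objects here.
$aggregate = new stdClass();
$aggregate->entries = array();
for ($i = 0; $i < 10000; $i++) {
    $aggregate->entries[] = array('id' => $i, 'title' => 'photo ' . $i);
}

// Measure again straight after, before serialization adds its own cost.
$after      = memory_get_usage();
$serialized = serialize($aggregate);

echo 'Memory used = ' . number_format(($after - $before) / (1024 * 1024), 1) . "MB\n";
echo 'Serialized  = ' . number_format(strlen($serialized) / (1024 * 1024), 1) . "MB\n";
echo 'MD5 checksum = ' . md5($serialized) . "\n";
```

Reporting a difference rather than a peak excludes whatever the script had already loaded, such as the framework's own class files.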
~/p/z/tests> php Zend_Gdata_Photos_Performance.php
Memory used = 45.4MB
Serialized = 5.7MB
MD5 checksum = d49558f6839c8731d72f4b66e919dbdd
Still astonishingly bad! Picasa feeds totaling under 1MB in size are 6 times larger when represented as PHP objects and consume 7 times that amount of RAM to create. Indeed, the Zend_Gdata library appears to be consuming memory like it’s going out of style.
10:15 I need more evidence. I know that some expansion of the original files is necessary – the Zend_Gdata library requires the XML files to be loaded into DOM objects, so next I’ll eliminate that from the memory use calculations. It would be unfair and unproductive to poke the finger of blame at the wrong area.
I refactor the test program to first load all the test files into DOM objects, report the memory used for that, then process each DOM object into the respective Zend_Gdata object. I deliberately overwrite the DOM object references with the Zend_Gdata object references as a hint to the garbage collector that the DOM objects are no longer needed after processing.
Hmmm… my first run shows no memory used by loading the XML files into DOM objects. I think that’s because I’m asking for memory usage based on internal allocations whereas the memory used by DOM is externally allocated by libxml.
Hmmm again… even after requesting externally allocated memory use the DOM loading appears to consume nothing. I’ve reached the limits of the simple profiling tools. Now I shall switch on Xdebug tracing. I add the following lines to the top of the test program;
ini_set('xdebug.show_mem_delta',1);
xdebug_start_trace('trace');
Running the test program takes a lot longer but results in a 255MB trace file being created for my analytical delight. Here’s a couple of snippets:
~/p/z/tests> head trace.xt
TRACE START [2009-04-11 10:19:58]
0.0017 199688 +0 -> memory_get_usage() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:13
0.0018 199768 +80 -> DOMDocument::load() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:15
0.0033 200304 +536 -> DOMDocument::load() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:18
0.0091 200952 +648 -> DOMDocument::load() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:18
0.0149 201280 +328 -> DOMDocument::load() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:18
0.0204 201712 +432 -> DOMDocument::load() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:18
0.0379 202072 +360 -> memory_get_usage() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:21
0.0380 202144 +72 -> number_format() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:23
0.0382 202784 +640 -> Zend_Loader::autoload() /home/paul/libs/ZendFramework-1.7.8/library/Zend/Loader.php:0
~/p/z/tests> tail trace.xt
31.7695 47833768 +0 -> Zend_Gdata_App::setStaticHttpClient() /home/paul/libs/ZendFramework-1.7.8/library/Zend/Gdata/App.php:258
31.7720 47833064 -704 -> memory_get_usage() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:40
31.7727 47833064 +0 -> number_format() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:42
31.7728 47833064 +0 -> serialize() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:44
32.0284 53852504 +6019440 -> strlen() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:46
32.0284 53852504 +0 -> number_format() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:46
32.0285 53852504 +0 -> md5() /home/paul/public_html/zf-app/tests/Zend_Gdata_Photos_Performance.php:48
32.1402 139400
TRACE END [2009-04-11 10:20:30]
Just from the snippets some interesting evidence emerges, corroborating my previous analysis. The sequence of DOM loads consumes very little memory, the serialization of the final aggregated object consumes around 6MB and the Zend_Gdata object construction consumes around 48MB.
Trace Analysis With MySQL
So now I know with more detail what I already figured out. My next step is to browse through the trace file looking for patterns of memory consumption. That’s not fun – it’s hard work. Which makes me think I should automate it. Perhaps I can load the trace file into a database and analyse it with SQL. Yeah, that’s what I’m going to do.
sudo apt-get install mysql-server php5-mysql
To make life easier for myself and MySQL, I changed the
CREATE TABLE trace (
    level TINYINT(2) UNSIGNED NOT NULL
  , funcnum INTEGER UNSIGNED NOT NULL
  , is_exit TINYINT(1) UNSIGNED NOT NULL
  , timeindex DECIMAL(10,6) UNSIGNED NOT NULL
  , memusage INTEGER UNSIGNED NOT NULL
  , func VARCHAR(128)
  , is_user TINYINT(1) UNSIGNED
  , include_file VARCHAR(255)
  , file VARCHAR(255)
  , line_in_file INTEGER UNSIGNED
  , PRIMARY KEY (funcnum, is_exit)
  , KEY (timeindex)
  , KEY (func)
) ENGINE=MyISAM;

LOAD DATA LOCAL INFILE 'trace.xt' INTO TABLE trace
FIELDS TERMINATED BY '\t'
IGNORE 2 LINES;
Now I can ask some interesting questions. First, which functions consume the most memory overall?
SELECT e.func, e.level, SUM(x.memusage-e.memusage) AS mem
FROM trace e
JOIN trace x USING (funcnum)
WHERE e.is_exit = 0 AND x.is_exit = 1
GROUP BY e.func, e.level
ORDER BY 3 DESC
LIMIT 20;
| func | level | mem |
|---|---|---|
| Zend_Gdata_Feed->__construct | 3 | 39443088 |
| Zend_Gdata_App_FeedEntryParent->__construct | 4 | 39438688 |
| Zend_Gdata_App_Base->transferFromDOM | 5 | 39438688 |
| Zend_Gdata_Photos_AlbumFeed->takeChildFromDOM | 6 | 38164848 |
| Zend_Gdata_Photos_AlbumFeed->__construct | 2 | 37947048 |
| Zend_Gdata_Photos_PhotoEntry->__construct | 7 | 35870304 |
| Zend_Gdata_Media_Entry->__construct | 8 | 34202848 |
| Zend_Gdata_Entry->__construct | 9 | 34201696 |
| Zend_Gdata_App_FeedEntryParent->__construct | 11 | 33289488 |
| Zend_Gdata_App_MediaEntry->__construct | 10 | 33289488 |
| Zend_Gdata_App_Base->transferFromDOM | 12 | 33249552 |
| Zend_Gdata_Photos_PhotoEntry->takeChildFromDOM | 13 | 33044768 |
| Zend_Gdata_Media_Entry->takeChildFromDOM | 14 | 31465184 |
| Zend_Gdata_App_Base->transferFromDOM | 15 | 22077528 |
| Zend_Gdata_Media_Extension_MediaGroup->takeChildFromDOM | 16 | 21739344 |
| Zend_Gdata_App_Base->registerNamespace | 19 | 12328752 |
| Zend_Gdata_Extension->__construct | 18 | 9144352 |
| Zend_Loader::loadClass | 3 | 8165176 |
| Zend_Loader::autoload | 2 | 8165176 |
| include_once | 4 | 8095968 |
There’s nothing at all surprising about these results until row 16 of the table. Before that point, the memory consumption is dominated by the Zend_Gdata object constructors and DOM conversion methods. Why, though, is registerNamespace() consuming 12MB of RAM so deep into the stack? The XML namespaces in feeds recognised by the Zend_Gdata library are essentially constant so they should only need to be registered once for each different class. How many times is that function called?
SELECT COUNT(*) FROM trace WHERE func = 'Zend_Gdata_App_Base->registerNamespace' AND is_exit = 0;
18268
Ouch! Now, I know that the methods which call registerNamespace are called registerAllNamespaces. Which classes make the call, how many times and how much memory is consumed by each?
SELECT caller.func, COUNT(*) AS calls, SUM(callee_exit.memusage-callee.memusage) AS mem
FROM trace callee
JOIN trace caller ON caller.funcnum = callee.funcnum - 1
JOIN trace callee_exit ON callee_exit.funcnum = callee.funcnum
WHERE callee.func = 'Zend_Gdata_App_Base->registerAllNamespaces'
  AND caller.is_exit = 0 AND callee.is_exit = 0 AND callee_exit.is_exit = 1
GROUP BY caller.func
ORDER BY 2 DESC;
| func | calls | mem |
|---|---|---|
| Zend_Gdata_Media_Extension_MediaThumbnail->__construct | 949 | 1780096 |
| Zend_Gdata_Media_Extension_MediaTitle->__construct | 323 | 478800 |
| Zend_Gdata_Media_Extension_MediaDescription->__construct | 323 | 651008 |
| Zend_Gdata_Media_Extension_MediaGroup->__construct | 323 | 488752 |
| Zend_Gdata_Media_Extension_MediaCredit->__construct | 323 | 437032 |
| Zend_Gdata_Entry->__construct | 323 | 938424 |
| Zend_Gdata_Media_Extension_MediaKeywords->__construct | 323 | 349928 |
| Zend_Gdata_Media_Extension_MediaContent->__construct | 323 | 881128 |
| Zend_Gdata_Photos_PhotoEntry->__construct | 313 | 1432048 |
| Zend_Gdata_Media_Entry->__construct | 313 | 1152 |
| Zend_Gdata_Photos_AlbumEntry->__construct | 10 | 42440 |
| Zend_Gdata_Geo_Extension_GmlPoint->__construct | 10 | 30496 |
| Zend_Gdata_Geo_Extension_GeoRssWhere->__construct | 10 | 13312 |
| Zend_Gdata_Geo_Extension_GmlPos->__construct | 10 | 30616 |
| Zend_Gdata_Feed->__construct | 5 | 4400 |
| Zend_Gdata_Photos_AlbumFeed->__construct | 4 | 0 |
| Zend_Gdata_Photos_UserFeed->__construct | 1 | 8552 |
I can’t say for sure where I’m going with this analysis but my instinct tells me there’s something awry in the Zend_Gdata library’s treatment of XML namespaces. I’m getting a faint whiff of one of the worst possible
Investigating A Possible Design Flaw in Zend_Gdata
15:00 What is the difference between a design flaw and a bug? Generally, a bug is a coding error that causes software to behave incorrectly. A design flaw is an error of thought or judgment that causes software to be built in such a way that it behaves correctly but improperly. Bugs can usually be seen in the code; design flaws generally cannot. In this case, the logic of the software is correct but that correctness has been achieved at intolerable cost.
I’ve been working with XML for a long time now so I’m fully aware of the painful nature of dealing with namespaces – they’re hard to get right. I managed my pain by choosing tools specifically designed to work with XML, such as XSLT and XPath. My original implementation of the food journal’s Picasa adapter used XPath queries to select the nodes it needed from the feed. The designers of Zend_Gdata have eschewed these tools for reasons I don’t know, but I’d speculate that the nature of the API they were intending to create (an all-encompassing object hierarchy representing Google data feeds) did not lead them to think about how they could leverage the existing tools, so they built their own.
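As a rough illustration of the XPath approach (not the original adapter code, which isn't shown here), the sketch below pulls a title and thumbnail URL out of each entry in a tiny feed. The sample feed, the function name `extractThumbnails` and the `atom` prefix are all assumptions made for the example; only the Atom and Media RSS namespace URIs are the real ones.

```php
<?php
// Hypothetical, heavily trimmed feed; a real Picasa album feed has many more elements.
$sampleFeed = <<<'XML'
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:media="http://search.yahoo.com/mrss/">
  <entry>
    <title>Breakfast</title>
    <media:group><media:thumbnail url="http://example.com/breakfast.jpg"/></media:group>
  </entry>
  <entry>
    <title>Lunch</title>
    <media:group><media:thumbnail url="http://example.com/lunch.jpg"/></media:group>
  </entry>
</feed>
XML;

// Returns an array of title => thumbnail URL, one element per feed entry.
function extractThumbnails($xml)
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    // Each namespace is registered once, on the query object only.
    $xpath = new DOMXPath($doc);
    $xpath->registerNamespace('atom', 'http://www.w3.org/2005/Atom');
    $xpath->registerNamespace('media', 'http://search.yahoo.com/mrss/');

    $thumbnails = array();
    foreach ($xpath->query('/atom:feed/atom:entry') as $entry) {
        // Expressions are evaluated relative to the current entry node.
        $title = $xpath->evaluate('string(atom:title)', $entry);
        $url   = $xpath->evaluate('string(media:group/media:thumbnail/@url)', $entry);
        $thumbnails[$title] = $url;
    }
    return $thumbnails;
}

print_r(extractThumbnails($sampleFeed));
```

The point of the sketch is that the namespaces live on a single DOMXPath object for the duration of the query, rather than on every object created from the feed.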
Before I leap into what is likely to become a monster piece of work, I need to quantify to myself the costs and benefits. I started this whole journey as a means to play with Zend Framework, which I have done to a degree but I seem to be giving a lot of attention to one small part of it. That is costing me experience of the other more interesting parts. The benefits of focusing on Zend_Gdata are: a) I can immediately make a contribution, b) I need the component to perform sanely in order to release my application and c) people from Google may be watching! What right-minded software geek doesn’t want to be noticed by Google?
There is also the pure and simple enjoyment of exercising my skills at improving code. Anyone crazy enough or bored enough to have read through this article will have realised that I really get a buzz out of doing this kind of work. It is hopefully clear that I’m very good at it. I use a combination of intuitive and evidence-based techniques to figure out why things are not as they should be then meticulous measurement to prove that any change I make achieves my intended effect.
So what it comes down to is this: if I can correct the namespace handling design flaw in Zend_Gdata, I may reduce the memory footprint by a quarter or more across all use-cases for the library. That is a clear benefit. Who else is going to do it if I don’t? What the hell? It’s worth a look at least.
Why Does Zend_Gdata Consume So Much Memory?
This is a code review. It’s not going to be pretty and I’m definitely not going to hold back. I’m looking for ways to improve the code which means I’m going to find all the bad stuff.
All filenames are relative to library/Zend/Gdata/ and I’m reviewing version 1.7.8. Each time I spot a potential waste of memory, I’ll try to fix it and re-run my tests to measure the effect. I’ll also provide a link to the actual source code in the repository so readers without the framework installed can follow me.
- App.php, method importString
- This method uses DOMDocument::loadXML to check that the passed string can be parsed, then passes the string through to the feed object’s transferFromXML method. Without doubt, the next operation will be DOMDocument::loadXML. Not only does this duplicate the parsing effort, but an unnecessary DOM document is also held in memory for the duration.
- Removed
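To make the double-parse point concrete, here is a minimal sketch of the alternative: parse the string once and hand the resulting DOM straight to the feed object. `SketchFeed` and `importStringOnce` are hypothetical stand-ins written for this example, not the actual Zend_Gdata classes or the change made to App.php.

```php
<?php
// Hypothetical stand-in for a feed class; it only collects <title> values.
class SketchFeed
{
    public $titles = array();

    public function transferFromDOM(DOMElement $root)
    {
        foreach ($root->getElementsByTagName('title') as $title) {
            $this->titles[] = $title->textContent;
        }
    }
}

// Parse once: the same DOM that validates the string also populates the feed.
function importStringOnce($string, SketchFeed $feed)
{
    $doc = new DOMDocument();
    // loadXML returns false on malformed input; @ hides the parser warning.
    if (!@$doc->loadXML($string)) {
        throw new InvalidArgumentException('String is not well-formed XML');
    }
    $feed->transferFromDOM($doc->documentElement);
    return $feed;
}

$feed = importStringOnce('<feed><title>Dinner</title></feed>', new SketchFeed());
print_r($feed->titles);
```

With this shape there is only ever one DOM document alive per import, and it can be released as soon as the feed object has been populated.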
17:00 I’ve lost the will to live. So many classes…
The Design Flaw In The Zend_Gdata Library
12-Apr-2009 @ 12:30 After sleep and a quick review of the code, I figured out the design flaw.
The classes in the Zend_Gdata library co-mingle the representation of things with the transport mechanism used to manipulate the things. For instance, a Zend_Gdata_Photos_PhotoEntry object represents a photograph of which a property is size, yet by inheriting from Zend_Gdata_App_Base also carries a set of XML namespaces relevant only to the means of transport through which the value of the property was acquired, not the property itself. Each object that derives from Zend_Gdata_App_Base carries with it, for its entire lifetime, some state from the context in which it was or will be populated, before and beyond the lifetime of the context.
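One way to picture the separation being argued for is the sketch below: a plain value object that knows nothing about XML, and a reader that keeps the namespace map once at class level. Both class names are invented for this example, and the `gphoto` namespace URI and element names are assumed to match the Picasa feeds; none of this is how Zend_Gdata is actually structured.

```php
<?php
// Plain value object: a photo size has no XML namespace attached to it.
class PhotoSize
{
    public $width;
    public $height;

    public function __construct($width, $height)
    {
        $this->width  = $width;
        $this->height = $height;
    }
}

// Hypothetical reader: transport details live here, shared by every call.
class PhotoSizeReader
{
    // Assumed namespace for Picasa photo metadata; held once per class.
    private static $namespaces = array(
        'gphoto' => 'http://schemas.google.com/photos/2007',
    );

    public static function fromEntry(DOMElement $entry)
    {
        $xpath = new DOMXPath($entry->ownerDocument);
        foreach (self::$namespaces as $prefix => $uri) {
            $xpath->registerNamespace($prefix, $uri);
        }
        return new PhotoSize(
            (int) $xpath->evaluate('number(gphoto:width)', $entry),
            (int) $xpath->evaluate('number(gphoto:height)', $entry)
        );
    }
}

$doc = new DOMDocument();
$doc->loadXML(
    '<entry xmlns:gphoto="http://schemas.google.com/photos/2007">'
    . '<gphoto:width>1600</gphoto:width><gphoto:height>1200</gphoto:height>'
    . '</entry>'
);
$size = PhotoSizeReader::fromEntry($doc->documentElement);
echo $size->width . 'x' . $size->height . "\n";
```

Once the reader has finished, the `PhotoSize` object carries only its two numbers; nothing about the feed it came from survives with it.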
More simply put, no size of any photograph has an XML namespace, yet
I described it to a layman like so: imagine that every currency note one carried also had the
Fortunately, the library designers are at least partly aware of the flaw, as evidenced by
The question becomes, what am I going to do about it? Hmmm…
15:00 Nothing. That’s what I’m going to do. Sweet F.A. Not my code, not my problem. More importantly, not fun to get involved with. I may change my mind if asked (or paid) but I have better things to do with my own time than clean up other people’s messes.
I’ll do what I can with the food journal application to overcome the brokenness of Zend_Gdata. I need to move on.
Refactoring the Food Journal Picasa Adapter
One thing I can do to work around the memory use of Zend_Gdata is refactor the Picasa adapter to process one album at a time. My current implementation, though pretty, is flawed in that it requires all the user’s albums to be loaded into memory at once. I’ll also apply caching at the photostream level rather than at the album feed level, which allows me to discard most of the unused data in the feeds.
16:15 The refactoring didn’t take long and seems to have been successful. The peak memory usage to load my food journal is 75MB, down from 286MB. It’s still a lot, but good enough.
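The one-album-at-a-time idea can be sketched roughly as follows. This is not the food journal's adapter: `collectPhotostream`, the loader callback and the array layout of an album are all assumptions made for illustration.

```php
<?php
// Builds a slim photostream while holding at most one full album in memory.
// $loadAlbum is a hypothetical callback that fetches and parses one album.
function collectPhotostream(array $albumIds, $loadAlbum)
{
    $stream = array();
    foreach ($albumIds as $albumId) {
        $album = call_user_func($loadAlbum, $albumId);
        foreach ($album['photos'] as $photo) {
            // Keep only the fields the journal needs; drop the rest of the feed data.
            $stream[] = array(
                'title' => $photo['title'],
                'url'   => $photo['url'],
            );
        }
        // Release the heavy album structure before loading the next one.
        unset($album);
    }
    return $stream;
}

// Stub loader standing in for the real feed request.
$stubLoader = function ($albumId) {
    return array(
        'title'  => "Album $albumId",
        'photos' => array(
            array('title' => "Photo $albumId", 'url' => "http://example.com/$albumId.jpg", 'raw' => 'unused feed data'),
        ),
    );
};

print_r(collectPhotostream(array(1, 2), $stubLoader));
```

Caching the slim `$stream` array, rather than each album feed, is what allows the unused feed data to be discarded early.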
Porting the Food Journal Flickr Adapter
19-Apr-2009 @ 11:45 After reading over the documentation for the Zend Framework Flickr service component, it seemed worth having a go at porting the food journal’s Flickr adapter over. The current code is relatively simple. The core is a function callFlickrAPI( $api, array $params ) which looks like it will map pretty directly to methods on the Zend_Service_Flickr class.
Despite its simplicity, there is one aspect of this code that concerns me with regard to refactoring it – I didn’t write it. That means I would benefit from the existence of a set of tests, the data for which will have to be derived from the current code. At least, the list of returned photos for a known user should be identical before and after refactoring. As the data is deterministic, I’ll begin with my favoured simple technique of comparing the MD5 hash of the serialization of the output, switching to more detailed testing only if necessary.
Preparing the Repository
As I’m happy enough with the porting efforts so far (the Picasa adapter and MVC), I merge the tip of that branch into the original branch called “zend-port” and start a new branch called “flickr-adapter-port”.
Basic Unit Tests
I’ll admit I’m not a member of the modern unit testing movement. My experiences discussing the goals of the movement with those who claim to be members leave me with nagging doubts about the evidential basis. That the quality of code with unit tests is better than that of code without seems to be an article of faith rather than a conclusion drawn from empirical data. I’m not surprised. Automated unit tests are a software tool and geeks get highly political about tools.
No matter. For my present purposes, a basic set of tests helps me to refactor the code I’m working with. I have no reason to test the inner workings of the Flickr adapter class, only that it produces the same outputs for known inputs. I decided to check, for a known test user, that the adapter returns the correct number of photos and that the photostream, when serialized, resolves to the same MD5 hash.
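A minimal sketch of that check, assuming the photostream is an array of plain arrays; the function names and sample data are made up for the example. The baseline count and hash would be recorded from the existing code before any refactoring begins.

```php
<?php
// Fingerprint of a photostream: MD5 of its serialized form.
function photostreamFingerprint(array $photos)
{
    return md5(serialize($photos));
}

// True when the stream matches the recorded baseline count and hash.
function matchesBaseline(array $photos, $expectedCount, $expectedHash)
{
    return count($photos) === $expectedCount
        && photostreamFingerprint($photos) === $expectedHash;
}

// Hypothetical output of the adapter before refactoring.
$before = array(
    array('id' => '1', 'title' => 'Breakfast'),
    array('id' => '2', 'title' => 'Lunch'),
);
$baselineCount = count($before);
$baselineHash  = photostreamFingerprint($before);

// After refactoring, the same user should produce an identical stream.
$after = $before;
var_dump(matchesBaseline($after, $baselineCount, $baselineHash));
```

Because `serialize` is sensitive to element order and value types, the comparison only holds while the data really is deterministic, which is why more detailed tests remain the fallback.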
Hmm… looking over the code I can see that there are two distinct operations that I’m going to refactor: the acquisition of photos and the acquisition of user data. I’ll need to add a few more tests to cover the latter as that’s the easiest point to begin refactoring.
15:00 I’ve had to create a test suite in order to make the test data retrieved from Flickr a shared fixture, otherwise each test makes a round-trip to the server. Now my tests cover the basic user details and the photos. I’m ready to get into the code now.
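The shared-fixture idea, stripped of any test framework, amounts to fetching the data once and handing the same copy to every test. The sketch below is an assumption-laden illustration: `SharedFixture` and the stub fetch callback are invented here and are not the test suite's actual mechanism.

```php
<?php
// Hypothetical holder: the expensive fetch runs at most once per process.
class SharedFixture
{
    private static $data = null;
    public static $fetches = 0;

    // $fetch is any callable that performs the real round-trip.
    public static function get($fetch)
    {
        if (self::$data === null) {
            self::$fetches++;
            self::$data = call_user_func($fetch);
        }
        return self::$data;
    }
}

// Stub standing in for the round-trip to Flickr.
$fetchFromServer = function () {
    return array('user' => 'testuser', 'photos' => array('a', 'b', 'c'));
};

// Two "tests" read the fixture; only the first triggers a fetch.
$forUserTest  = SharedFixture::get($fetchFromServer);
$forPhotoTest = SharedFixture::get($fetchFromServer);
echo SharedFixture::$fetches . "\n";
```

The trade-off is that tests sharing a fixture must treat it as read-only, otherwise one test can change what the next one sees.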
Functional Deprivation Causes Abandonment
Ah well, that didn’t take long and I didn’t waste too much of my time. Basically, the Zend_Service_Flickr component doesn’t provide all the features necessary for the food journal. Specifically, it does not expose the “people.getInfo” API which the adapter uses to acquire the user’s display name, nor does it expose the “photosets.getList” API that the adapter uses to determine which photoset is a food journal, nor does it expose the “photosets.getPhotos” API that the adapter uses to retrieve only the food journal photos.
There’s no point in continuing, so I delete the Git branch I was working on.

