ZF Oauth Provider

Zend Framework has a pretty good OAuth consumer implementation. However, it has no support for implementing an OAuth provider, and it turns out that there aren’t many other libraries for it. Most examples out there are based on the PECL oauth extension, which works just fine, with one caveat – you have to have this PECL extension installed, while the ZF implementation does not require that.

So I went ahead and wrote some code that lets you easily add an OAuth provider to your ZF-based or ZF-using application.

Note that the code does not implement the whole server – just the OAuth protocol wrapper. You’d still have to do all the work of managing tokens/keys/nonces yourself. See the example server in the repository and the wiki on GitHub for more details on how to do it. The protocol follows what PECL oauth does pretty closely, so many tutorials for it are mostly applicable to this one too.
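To give an idea of what “managing nonces yourself” involves, here is a minimal sketch of a replay check. It is not part of Zend_Oauth_Provider: the NonceStore class, its accept() method and the 300-second window are all made up for illustration, and a real server would keep the seen nonces in a database or memcached rather than in memory.

```php
<?php
// Hypothetical in-memory nonce store (illustration only).
class NonceStore
{
    private $seen = array();
    private $maxAge;

    public function __construct($maxAgeSeconds = 300)
    {
        $this->maxAge = $maxAgeSeconds;
    }

    // Returns true if the request may proceed, false if it is stale or replayed.
    public function accept($consumerKey, $nonce, $timestamp, $now = null)
    {
        $now = ($now === null) ? time() : $now;

        // Reject timestamps too far from the server clock.
        if (abs($now - (int) $timestamp) > $this->maxAge) {
            return false;
        }
        // Reject a nonce this consumer has already used.
        if (isset($this->seen[$consumerKey][$nonce])) {
            return false;
        }
        $this->seen[$consumerKey][$nonce] = (int) $timestamp;
        return true;
    }
}

$store = new NonceStore();
var_dump($store->accept('my-consumer', 'abc123', time())); // first use
var_dump($store->accept('my-consumer', 'abc123', time())); // replay
```

The first call is accepted and the second is rejected as a replay. Old entries are never purged here, which a real store would need to do.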

Check out Zend_Oauth_Provider on GitHub. If you want to improve it, please fork and submit pull requests.


ZF book review

As I mentioned before, I got the book Zend Framework 1.8 Web Application Development for review. It took me a bit more time than I thought to do this (one of the reasons will become clear soon) but here it finally comes.

I think it is a great book for somebody who is somewhat acquainted with Zend Framework and wants to get really good at working with it. It covers a lot of ground, so if you have never worked with ZF before you might be a bit overwhelmed by the amount of stuff going on (then again, maybe I am underestimating you 🙂 ). You may want either to skip some details and return to them later when the need arises, or to run through a very basic tutorial first to get to know how the framework works. The book itself has a basic startup section, but I feel that for a complete newbie it still might be a bit tough to keep up with the amount of material in the book. That material will of course come in handy later, once you have the basics figured out.

I personally am a big fan of learning by example and I think one line of code is often worth a hundred words, so I was really pleased that this book is based on building a complete application and comes with full application code to accompany it. The application is a storefront with an admin interface, so it covers most of the common tasks in a typical PHP application.
While this means that certain decisions had to be made along the way, and certain ways of doing things were chosen over others, the author clearly identifies the decision points and explains the reasons – e.g. why certain things go into a model and not a controller, why this extension point and not that one is used, etc. ZF has a very rich set of features, so there’s no single “right way” to do all things – but the book certainly shows you one of the ways to reach a complete and nicely structured application.

I actually ran a bit of an experiment – as I was building a ZF-based application at the time, I decided to use the book as the first reference for any question that came up with the app. I am pleased to report that the book indeed proved very helpful, and I was able to find most of the advanced topics – like the use of ACLs, interactions between forms, views and decorators, modifying the behavior of the standard ZF classes, etc. – answered in the book and demonstrated in the code. The author successfully avoided the temptation to quote the manual extensively and instead picks up where the manual leaves off – i.e. how one uses the stuff that the manual describes in practice.

The book also covers – albeit somewhat lightly – a topic that is neglected by so many other ZF books, namely testing. It shows how to set up the test environment and how to execute some basic application tests. Ideally, I would like the topic of tests to be much more prominent and featured as something parallel to development and not something you do after (though I know that’s how it really happens many times 😉 ) – but I know there’s only so much you can put into one book 🙂

In summary, I think it is a good book to have if you do, or are about to do, serious ZF development.

More on PHP performance

After writing the post criticizing Google’s “performance advice” for PHP beginners, I started thinking – OK, I don’t like Google’s advice, what would I propose instead?

So here are my thoughts about what would be good for a beginner to consider when starting with PHP performance optimizations. Note that I do not say these are the only things you should do – there are a bunch of articles, talks, blogs, etc. about PHP performance, and many of them contain very good advice and go into much more detail than I intend to. But I think the items below are ones that you should ensure you are doing to the full extent before you go looking around for performance tricks.

Also, from the start I want to say that I work for Zend Technologies and I participated in development of many Zend solutions, both free and commercial. I am going to mention both kinds in this article, where relevant. I am aware that there are alternative solutions, but I will mention the ones I know the best. So please do not take this as commercial advertisement or any claim on relative merits of other solutions – it is not the intent. The intent is to give general direction and some examples, if somebody prefers other solutions in the same direction – that’s fine.

Bytecode cache
If you care about performance and don’t use a bytecode cache then you don’t really care about performance. Please get one and start using it. If you want a ready-made, commercially supported solution with a nice GUI, etc., look at Zend Server; if you’re more into compile-it-yourself command-line tools, you may want to look at APC or other alternatives.

Profile your code before you start optimizing it! Otherwise it would be like travelling around a foreign city with signs written in an unreadable language, without any map or GPS. You’ll probably get somewhere, but you won’t have any idea where you are, where you should go and how far you are from the place you need to be. Profiling lets you know which parts of the code are worth investing in and which aren’t. You can use Zend Studio/Debugger or Xdebug for that.
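A real profiler gives you the full per-function picture, but for a rough first look plain timing can already tell you something. The sketch below is only that: time_call() is a hypothetical helper, not part of any profiler, and the wall-clock time of a single call is much cruder than a real profile.

```php
<?php
// Hypothetical helper: run a callable and return array(result, seconds).
function time_call(callable $fn, array $args = array())
{
    $start = microtime(true);
    $result = call_user_func_array($fn, $args);
    $elapsed = microtime(true) - $start;

    return array($result, $elapsed);
}

// Usage: array_sum() stands in for whatever code you suspect is slow.
list($sum, $seconds) = time_call('array_sum', array(range(1, 1000)));
printf("array_sum took %.6f seconds\n", $seconds);
```

Timing like this only answers “how long did this one call take”. To find out which calls matter in the first place, you still want the profiler.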

Most PHP installations run in “shared nothing” mode where as soon as the request processing ends, all the data associated with the request is gone. It has some advantages, but also one big disadvantage – you can not preserve results of repeated operations. That is, unless you use caching.
You should look into caching all operations which take considerable time and return the same result for the same input over a prolonged period of time. That may include configurations, database queries, service requests, complex calculations, full pages or page fragments, etc. Caching expensive operations is one of the most powerful performance improvements you can make.
There are numerous low-level caching solutions – memcached, APC, Zend Server (you can find a good guide to it on DevZone) and others. On top of that, you may look into Zend Framework’s caching infrastructure – which supports the backends described above and more, and makes caching much easier.
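The basic pattern is the same whatever backend you pick: look the result up by key, and only do the expensive work if it is missing or too old. Here is a minimal sketch using plain files in the temp directory, so it runs anywhere. The cached() helper and the demo_cache_ file prefix are made up for this example; in practice you would use one of the backends mentioned above.

```php
<?php
// Hypothetical helper: return the cached value for $key if it is younger
// than $ttl seconds, otherwise call $compute and store its result.
function cached($key, $ttl, callable $compute)
{
    $file = sys_get_temp_dir() . '/demo_cache_' . md5($key);

    if (is_file($file) && (time() - filemtime($file)) < $ttl) {
        $entry = unserialize(file_get_contents($file));
        // The value is wrapped in an array so that a cached "false"
        // is not mistaken for a failed unserialize().
        if (is_array($entry) && array_key_exists('value', $entry)) {
            return $entry['value'];
        }
    }

    $value = $compute();
    file_put_contents($file, serialize(array('value' => $value)), LOCK_EX);

    return $value;
}

// Usage: the closure stands in for an expensive query or service call.
$report = cached('daily-report', 300, function () {
    return array('generated' => time(), 'rows' => 42);
});
```

Within the five-minute window, every request after the first gets the stored array and the closure is not called at all.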

Optimize your data
Usually the most expensive places in a PHP application are where it accesses external data – namely, the database, the filesystem or the network. Look hard into optimizing that – reduce the number of queries, improve the database structure, reduce filesystem accesses, try to bundle data to make one service call instead of several, etc. For a more advanced, in-depth look, use tools like strace (Unix) and Process Explorer (Windows) to look into the system calls your script produces and think about ways to eliminate some of them. You will not be able to eliminate all of them, but each of them is a worthy target.

Don’t try to outsmart the engine
There are a lot of “tips” floating around about which constructs in PHP are faster or slower than others. I think you can safely ignore all of these tips, especially if you’re a beginner. Odds are that in 9 cases out of 10 they won’t give you any improvement at all, and in the remaining case they will be either not applicable to your code or not worth the time spent on them. Yes, there are ways to save a couple of opcodes and remove a couple of lookups here and there – but unless you are already done with all of the previous steps, it is not worth it. And some of the advice out there will actually make your code slower, less robust and less secure without you even noticing. So I think it is better for beginners to stay away from trying to outsmart the engine altogether.

Benchmark in real life

Many of the tips I mentioned above come with benchmarks as proof. The problem is that these benchmarks always test only a short piece of code. However, you will not be running that one-liner – you will be running the whole big application. This reminds me of a joke about a physicist who developed the model of a spherical horse in a vacuum in order to use it to win bets on horse racing. If you want better chances of winning than that physicist, test in a real environment, not in a vacuum. If you have an idea for some improvement, verify that this improvement actually improves your application, not just an artificial benchmark. If this is impossible, use profile results to estimate the potential benefit – if you find a way to optimize a function that accounts in total for 0.1% of overall execution time, you probably won’t do any good to the application as a whole.
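That last estimate is simple arithmetic (it is usually called Amdahl’s law), and it is worth doing before you spend time on an optimization. A small sketch, where overall_speedup() is just a name chosen for this example:

```php
<?php
// Overall speedup when a part that takes $fraction of the total run time
// (between 0 and 1) is made $localSpeedup times faster.
function overall_speedup($fraction, $localSpeedup)
{
    return 1.0 / ((1.0 - $fraction) + $fraction / $localSpeedup);
}

// A function that accounts for 0.1% of execution time, made twice as fast:
printf("%.4f\n", overall_speedup(0.001, 2.0));

// The same function made a thousand times faster:
printf("%.4f\n", overall_speedup(0.001, 1000.0));

// Compare with a database call that takes half of the request, made twice as fast:
printf("%.4f\n", overall_speedup(0.5, 2.0));
```

The first case gives about 0.05% overall, and even a thousandfold improvement of that function yields only about 0.1%. The database call, on the other hand, makes the whole request about a third faster.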

Leverage the extensions
That seems too obvious, but I have seen a lot of code that duplicates functions available in some PHP extension. There are a lot of functions in PHP, and if you are doing something that others may have done before, check the manual. You have the DOM/SimpleXML extensions for XML, the JSON extension for JSON, the SOAP extension for doing SOAP, etc. Do not create custom serialization/deserialization if serialize()/unserialize() would work for you.
If you have some very performance-sensitive bit of script and you can do C programming (beginner in PHP doesn’t mean beginner in everything :), consider even making your own extension; it’s not that hard.
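For example, storing a structure and getting it back takes one call each way with the built-in functions. The $order array below is just sample data:

```php
<?php
$order = array('id' => 17, 'items' => array('book', 'pen'), 'paid' => true);

// PHP's native format: keeps PHP types, meant for PHP-to-PHP use.
$stored   = serialize($order);
$restored = unserialize($stored);

// JSON extension: for exchanging data with other languages and services.
$json    = json_encode($order);
$decoded = json_decode($json, true); // true => return associative arrays

var_dump($restored === $order, $decoded === $order);
```

Both round trips give back an identical array. One word of caution: only unserialize() data that your own application wrote; for anything coming from outside, JSON is the safer choice.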

Avoid extra notices/errors/etc.
Even suppressed errors have a cost in PHP, so try to write your code so that it does not produce notices, strict notices, warnings, etc. You may want to enable logging of all errors to check for that. Never enable displaying errors in production though – it will only lead to a major public embarrassment.
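In code, that advice might look like the sketch below: report everything, log it instead of displaying it, and check for a missing value explicitly instead of silencing the error with @. The get_option() helper is a made-up example, and normally you would put these settings in php.ini rather than in a script.

```php
<?php
// Report everything, log it, never show it to visitors.
error_reporting(E_ALL);
ini_set('display_errors', '0');
ini_set('log_errors', '1');

// Hypothetical helper: read an optional key without triggering a notice.
function get_option(array $options, $name, $default = null)
{
    return array_key_exists($name, $options) ? $options[$name] : $default;
}

$options = array('page' => 3);

// Instead of: $limit = @$options['limit'];
$limit = get_option($options, 'limit', 20);
$page  = get_option($options, 'page', 1);
```

The @ version still makes the engine raise the error and then throw it away, while the explicit check never raises it at all.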

Use php.ini-production as a start
If you need a set of php.ini settings which would not hurt your performance and not break anything, look into php.ini-production in PHP source. You may need to change a couple of details (e.g. include path) but it’s a good starting point.

Use big realpath cache
The realpath cache is very useful for the engine when it tries to find the unique full name of a file from just a filename or relative path. By default, it’s 16K, but if you have a lot of files with long paths, it’s better to increase the size – it will save expensive disk accesses.
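To see whether the cache is too small for your application, you can compare its configured size with what is actually in use. The sketch below assumes PHP 5.3.2 or later, where realpath_cache_size() is available; ini_size_to_bytes() is a helper written for this example. Note that the realpath_cache_size setting is system-level, so it has to be changed in php.ini and not with ini_set().

```php
<?php
// Hypothetical helper: convert php.ini shorthand such as "16K" or "4M" to bytes.
function ini_size_to_bytes($value)
{
    $value  = trim((string) $value);
    $number = (int) $value;
    $unit   = strtoupper(substr($value, -1));

    switch ($unit) {
        case 'G':
            return $number * 1024 * 1024 * 1024;
        case 'M':
            return $number * 1024 * 1024;
        case 'K':
            return $number * 1024;
        default:
            return $number;
    }
}

$limit = ini_size_to_bytes(ini_get('realpath_cache_size'));
$used  = realpath_cache_size();

printf("realpath cache: %d of %d bytes used\n", $used, $limit);
```

Run this at the end of a typical request, not from the command line, since the cache belongs to the process that served the request. If the used figure sits close to the limit, that is a hint to raise realpath_cache_size.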

There are probably more things that could be said, but this post is pretty long already, so I will end it here and you are welcome to add your opinion in comments.

Benchmarking Zend Framework loader

One of the things I do in the course of my work is performance benchmarks for various stuff – PHP, Zend products, applications, etc. Performance in the PHP space is currently like alchemy – there are a lot of rumors floating around about various properties of various stuff, but much less reliable data that can be verified and used. PHP has the standard bench.php script, but it covers only a small part of what real-life applications do. I wish there were more established tests and methods for benchmarking the PHP engine and applications. But benchmarking is a complicated subject, and the variety of PHP platforms and applications makes it harder to create useful general-purpose benchmarks.

But more to the point. On the Zend Framework lists a topic was raised about the performance impact of the Zend_Loader component, which is used for – no surprise here! – loading classes, including autoloading, etc. Some folks thought that since Zend_Loader executes some code before actually loading the required file, it must cost something. And it makes sense. However, how much does it cost?
Well, the best way to know the price of something is to ask – and in this case, to run the test. So that’s what I did – I made a list of 725 Framework classes (ZF now has more than 1000, but I composed the list some time ago and also had to drop some to avoid some tricky dependencies). And I wrote two scripts – one that loads these classes with require_once and one that loads them using Zend_Loader::loadClass. Both the data file and the scripts are available for download for those who would like to play with them. I tested them with and without Zend’s bytecode cache, to see how much one can save using bytecode caching technology.
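To show what is being compared, here is a simplified stand-in for the two ways of loading a class. This is not the code of the benchmark scripts and not the actual Zend_Loader source: class_to_path() and load_class() are made-up names, and the sketch only assumes the usual ZF naming convention, where underscores in the class name map to directories on the include path.

```php
<?php
// Map a class name to a relative file name, e.g.
// "Zend_Db_Table" => "Zend/Db/Table.php".
function class_to_path($class)
{
    return str_replace('_', '/', $class) . '.php';
}

// Loader-style: a few function calls happen before the file is included.
function load_class($class)
{
    if (class_exists($class, false) || interface_exists($class, false)) {
        return; // already loaded, nothing to do
    }
    require_once class_to_path($class);
}

// Direct style, for comparison (needs ZF on the include path):
//     require_once 'Zend/Db/Table.php';
// Loader style:
//     load_class('Zend_Db_Table');

echo class_to_path('Zend_Db_Table'), "\n";
```

The string replacement and the existence checks are the kind of extra work the loader variant does before it ever touches the file.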

So, the results were as follows:

Without bytecode cache:

          require_once   Zend_Loader
php5.2        4.42           4.42
php5.3        4.96           4.97

With bytecode cache:

          require_once   Zend_Loader
php5.2       63.04          56.62
php5.3       61.28          55.52

The numbers are requests per second, so more is better. The tests were run on a dual 2GHz AMD machine running Linux.

What can we conclude from this?

  1. It is very important to understand that it is a narrow-point benchmark that tests only one function in one specific way. Please do not draw conclusions on behavior of whole applications based only on this benchmark.
  2. You do want to use bytecode caching. You won’t get 15x performance on any real application, but it does speed up loading very significantly.
  3. Without bytecode caching, it doesn’t matter if you use require_once or Loader – both are equally slow 🙂
  4. With bytecode caching, Loader has some overhead – the explanation for this is that with file accesses eliminated, require_once of course has little left to do, while Loader still does a couple of function calls. But in real-life apps it’d probably be very small, given that it’s about 10% even on a loading-only huge-class-list benchmark, and your application probably does something useful instead of loading 700+ framework classes :)) Meaning, fears of using the class loader vs. doing require_once are seriously overstated.
  5. 5.3 is still a moving target, so don’t put too much stake in current benchmark results for 5.3, they will probably be different by the time 5.3 is in release cycle (hopefully, better :))
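The “about 10%” in point 4 and the speedup in point 2 come straight from the result tables. If you rerun the benchmark on your own machine, the same arithmetic can be done like this (the array layout and the function name are just for this example):

```php
<?php
// Requests per second, copied from the result tables.
$withoutCache = array(
    'php5.2' => array('require_once' => 4.42, 'Zend_Loader' => 4.42),
    'php5.3' => array('require_once' => 4.96, 'Zend_Loader' => 4.97),
);
$withCache = array(
    'php5.2' => array('require_once' => 63.04, 'Zend_Loader' => 56.62),
    'php5.3' => array('require_once' => 61.28, 'Zend_Loader' => 55.52),
);

// Loader overhead as a percentage of require_once throughput.
function loader_overhead_percent(array $row)
{
    return 100.0 * ($row['require_once'] - $row['Zend_Loader']) / $row['require_once'];
}

foreach ($withCache as $version => $row) {
    // How many times faster require_once gets with the bytecode cache.
    $speedup = $row['require_once'] / $withoutCache[$version]['require_once'];
    printf(
        "%s: loader overhead %.1f%%, bytecode cache speedup %.1fx\n",
        $version,
        loader_overhead_percent($row),
        $speedup
    );
}
```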

P.S. This post does not talk about other things like “what if I stuff all classes I use into single file”, etc. Maybe next time.