More on PHP performance

After writing the post criticizing Google’s “performance advice” for PHP beginners, I started thinking – OK, I don’t like Google’s advice, what would I propose instead?

So here are my thoughts about what would be good for a beginner to consider when starting with PHP performance optimization. Note that I do not say these are the only things you should do – there are plenty of articles, talks, blogs, etc. about PHP performance, and many of them contain very good advice and go into much more detail than I intend to. But I think the items below are ones you should make sure you are doing to the full extent before you go looking for performance tricks.

Also, from the start I want to say that I work for Zend Technologies and have participated in the development of many Zend solutions, both free and commercial. I am going to mention both kinds in this article, where relevant. I am aware that there are alternative solutions, but I will mention the ones I know best. So please do not take this as a commercial advertisement or as any claim about the relative merits of other solutions – that is not the intent. The intent is to give a general direction and some examples; if somebody prefers other solutions in the same direction, that’s fine.

Bytecode cache
If you care about performance and don’t use a bytecode cache, then you don’t really care about performance. Please get one and start using it. If you want a ready-made, commercially supported solution with a nice GUI, etc., look at Zend Server; if you’re more into compile-it-yourself command-line tools, you may want to look at APC or other alternatives.

Profiling
Profile your code before you start optimizing it! Otherwise it would be like travelling around a foreign city with signs written in an unreadable language, without any map or GPS. You’ll probably get somewhere, but you won’t have any idea where you are, where you should go, or how far you are from the place you need to be. Profiling lets you know which parts of the code are worth investing in and which aren’t. You can use Zend Studio/Debugger or Xdebug for that.
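Before you have a real profiler set up, even a crude stopwatch can show where the time goes. The sketch below is only that, a stopwatch, and not a replacement for Xdebug or Zend Debugger; timed() and the section labels are names I made up for illustration.

```php
<?php
// Crude manual timing. timed() is an illustrative helper, not a library function.

function timed(array &$timings, string $label, callable $work)
{
    $start = microtime(true);
    $result = $work();
    // Accumulate wall-clock seconds per label across repeated calls.
    $timings[$label] = ($timings[$label] ?? 0.0) + (microtime(true) - $start);
    return $result;
}

$timings = [];

$sum = timed($timings, 'sum', function () {
    return array_sum(range(1, 100000));
});

$sorted = timed($timings, 'sort', function () {
    $data = range(1000, 1);   // 1000 down to 1
    sort($data);
    return $data;
});

arsort($timings);             // slowest section first
foreach ($timings as $label => $seconds) {
    printf("%-6s %.6f s\n", $label, $seconds);
}
```

A real profiler gives you the same kind of ranking for every function in the request without touching the code, which is why it is the better tool once it is available.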

Caching
Most PHP installations run in “shared nothing” mode, where as soon as request processing ends, all the data associated with the request is gone. It has some advantages, but also one big disadvantage – you cannot preserve the results of repeated operations. That is, unless you use caching.
You should look into caching all operations that take considerable time and return the same result for a prolonged period of time or for the same data set. That may include configurations, database queries, service requests, complex calculations, full pages or page fragments, etc. Caching expensive operations is one of the most powerful performance improvements you can make.
There are numerous low-level caching solutions – memcached, APC, Zend Server (you can find a good guide to it on DevZone) and others. On top of these, you may look into Zend Framework’s caching infrastructure, which supports the backends described above and more, and makes caching much easier.
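To show the idea without depending on any of the products above, here is a minimal file-based sketch. The cached() helper, the key and the time-to-live are all hypothetical; in a real application you would more likely use one of the backends just mentioned.

```php
<?php
// Minimal file-based cache sketch. cached() is a hypothetical helper;
// memcached, APC or a framework cache would normally take its place.

function cached(string $key, int $ttl, callable $compute)
{
    $file = sys_get_temp_dir() . '/cache_' . md5($key);

    // Serve the stored copy while it is younger than $ttl seconds.
    if (is_file($file) && time() - filemtime($file) < $ttl) {
        return unserialize(file_get_contents($file));
    }

    $value = $compute();                                    // the expensive part
    file_put_contents($file, serialize($value), LOCK_EX);   // keep it for later requests
    return $value;
}

$calls = 0;
$expensive = function () use (&$calls) {
    $calls++;                      // count how often the real work runs
    return ['answer' => 42];
};

$key = 'demo_' . uniqid('', true); // fresh key so the demo always starts cold
$first  = cached($key, 60, $expensive);   // miss: runs the callback
$second = cached($key, 60, $expensive);   // hit: read back from the file
echo "callback ran $calls time(s)\n";
```

The point is the shape: check for a fresh stored result, otherwise compute and store. The storage behind it can be swapped freely.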

Optimize your data
Usually the most expensive parts of a PHP application are where it accesses external data – namely the database, the filesystem or the network. Look hard into optimizing that: reduce the number of queries, improve the database structure, reduce filesystem accesses, try to bundle data to make one service call instead of several, etc. For a more advanced, in-depth look, use tools like strace (Unix) and Process Explorer (Windows) to examine the system calls your script produces and think about ways to eliminate some of them. You will not be able to eliminate all of them, but each of them is a worthy target.
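As one illustration of reducing the number of queries, the sketch below fetches several rows with a single query instead of one query per row. It assumes the pdo_sqlite extension is available; the products table and its contents are made up for the example.

```php
<?php
// Sketch: one bundled query instead of one query per id.
// Assumes pdo_sqlite; the table and data are invented for illustration.

$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT)');
$db->exec("INSERT INTO products (id, title) VALUES (1, 'apple'), (2, 'pear'), (3, 'plum')");

$ids = [1, 3];

// Slow pattern: a round trip to the database for every id.
$stmt = $db->prepare('SELECT title FROM products WHERE id = ?');
$oneByOne = [];
foreach ($ids as $id) {
    $stmt->execute([$id]);
    $oneByOne[$id] = $stmt->fetchColumn();
}

// Bundled pattern: a single query with one placeholder per id.
$placeholders = implode(',', array_fill(0, count($ids), '?'));
$stmt = $db->prepare("SELECT id, title FROM products WHERE id IN ($placeholders)");
$stmt->execute($ids);
$bundled = $stmt->fetchAll(PDO::FETCH_KEY_PAIR);   // id => title

var_dump($oneByOne == $bundled);
```

With an in-memory database the difference is invisible, but against a database on another host every saved round trip counts.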

Don’t try to outsmart the engine
There are a lot of “tips” floating around about which constructs in PHP are faster or slower than others. I think you can safely ignore all of these tips, especially if you’re a beginner. Odds are that in 9 cases out of 10 they won’t give you any improvement at all, and in the remaining case the tip will be either not applicable to your code or not worth the time spent on it. Yes, there are ways to save a couple of opcodes and remove a couple of lookups here and there – but unless you’re already done with all of the previous steps, it is not worth it. And some of the advice out there will actually make your code slower, less robust and less secure without you even noticing. So I think it is better for beginners to stay away from trying to outsmart the engine altogether.

Benchmark in real life

Many of the tips I mentioned above come with benchmarks as proof. The problem is that these benchmarks always test only a short piece of code. However, you will not be running that one-liner – you will be running the whole big application. This reminds me of a joke about a physicist who developed a model of a spherical horse in a vacuum in order to win bets on horse racing. If you want better chances of winning than that physicist, test in a real environment, not in a vacuum. If you have an idea for some improvement, verify that it actually improves your application, not just an artificial benchmark. If this is impossible, use profiling results to estimate the potential benefit – if you find a way to optimize a function that in total accounts for 0.1% of overall execution time, you probably won’t do any good to the application as a whole.
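The estimate from profile results is simple arithmetic (it is usually called Amdahl’s law). Here is a small sketch; overallSpeedup() is just an illustrative name, and the percentages are example inputs rather than measurements.

```php
<?php
// Upper bound on whole-application gain from optimizing one function.
// overallSpeedup() is an illustrative helper implementing Amdahl's law.

function overallSpeedup(float $fraction, float $localSpeedup): float
{
    // $fraction:     share of total run time spent in the function (0..1)
    // $localSpeedup: how many times faster the function itself becomes
    return 1.0 / ((1.0 - $fraction) + $fraction / $localSpeedup);
}

// A function taking 0.1% of the run time, made 10x faster:
printf("%.4f\n", overallSpeedup(0.001, 10.0));   // prints 1.0009

// A function taking 40% of the run time, made only 2x faster:
printf("%.4f\n", overallSpeedup(0.40, 2.0));     // prints 1.2500
```

In other words, a heroic tenfold improvement in the 0.1% function buys less than a tenth of a percent overall, while a modest improvement in a hot spot is clearly visible.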

Leverage the extensions
That seems too obvious, but I have seen a lot of code that duplicates functions available in some PHP extension. There are a lot of functions in PHP, and if you are doing something that others may have done before, check the manual. You have the DOM/SimpleXML extensions for XML, the JSON extension for JSON, the SOAP extension for SOAP, etc. Do not create custom serialization/deserialization if serialize()/unserialize() would work for you.
If you have some very performance-sensitive bit of script and you can program in C (being a beginner in PHP doesn’t mean being a beginner in everything :), consider even making your own extension; it’s not that hard.
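For the built-in extensions, a few lines are usually enough. The sketch below uses the JSON, serialization and SimpleXML functions that ship with PHP; the $order data is invented for the example.

```php
<?php
// Built-in functions instead of hand-written parsing and serialization.
// The sample data is made up for illustration.

$order = ['id' => 7, 'items' => ['apple', 'pear'], 'paid' => true];

// JSON extension: encode and decode.
$json = json_encode($order);
$fromJson = json_decode($json, true);    // true => associative arrays

// serialize()/unserialize(): PHP's native format, keeps PHP types.
// Only unserialize data you trust.
$blob = serialize($order);
$fromBlob = unserialize($blob);

// SimpleXML: read a value without writing a parser.
$xml = simplexml_load_string('<order id="7"><item>apple</item><item>pear</item></order>');
$firstItem = (string) $xml->item[0];

echo $json, "\n", $firstItem, "\n";
```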

Avoid extra notices/errors/etc.
Even suppressed errors have a cost in PHP, so try to write your code so that it does not produce notices, strict notices, warnings, etc. You may want to enable logging of all errors to check for that. Never enable displaying errors in production though – it will only lead to a major public embarrassment.
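A sketch of what that setup might look like when done at runtime (the same values can go into php.ini instead). The log file location is a placeholder, and $input merely stands in for something like $_POST.

```php
<?php
// Report everything, log it, and never show it to visitors.
// The log path is a placeholder; pick one that suits your server.

error_reporting(E_ALL);
ini_set('display_errors', '0');   // never display errors in production
ini_set('log_errors', '1');       // write them to the log instead
ini_set('error_log', sys_get_temp_dir() . '/php_errors.log');

// Writing code that does not raise a notice in the first place:
$input = [];                                          // stands in for $_POST
$id = isset($input['id']) ? (int) $input['id'] : 0;   // no notice for the missing key
echo $id, "\n";
```

Once the log stays empty under normal traffic, you know the code is not paying for errors nobody sees.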

Use php.ini-production as a start
If you need a set of php.ini settings that will not hurt your performance and will not break anything, look at php.ini-production in the PHP source. You may need to change a couple of details (e.g. the include path), but it’s a good starting point.

Use big realpath cache
The realpath cache is very useful for the engine when it tries to find the unique full name of a file from just a filename or relative path. By default it’s 16K, but if you have a lot of files with long paths, it’s better to increase the size – it will save expensive disk accesses.
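To judge whether the cache is big enough, you can compare the configured limit with what is actually in use. A small sketch follows; the 256K in the comment is an arbitrary example value, not a recommendation.

```php
<?php
// Inspect the realpath cache to judge whether it is large enough.
// realpath_cache_size can only be changed in php.ini, for example:
//   realpath_cache_size = 256K   (an arbitrary example value)

$limit   = ini_get('realpath_cache_size');   // configured limit as written in php.ini
$used    = realpath_cache_size();            // bytes currently in use
$entries = count(realpath_cache_get());      // number of cached paths

printf("limit: %s, used: %d bytes, entries: %d\n", $limit, $used, $entries);
```

If the used size sits close to the limit on a busy server, that is a hint the cache is full and raising the limit is worth trying.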

There are probably more things that could be said, but this post is pretty long already, so I will end it here and you are welcome to add your opinion in comments.

63 thoughts on “More on PHP performance”

  1. Pingback: PHP Reference Links | kabayview.com

  2. Pingback: 一些关于 PHP notice / warning 与性能相关的资料 ‹ 龙猫の笔记

  3. Pingback: learn

  4. The bottlenecks are the Networks and not the CPU. If you monitor your CPU load, sending plain HTML pages over a network will not tax your CPU at all, the bottleneck will be the network.

  5. It was interesting to read about the “smart tips” (I think you can safely ignore all of these tips).
    On the other hand – in these comments I found a better way to optimize the for loops.
    I have always done this the simple (and wrong / slow) way.
    Therefore looking for hints and tips is not so bad. But in fact it’s right – most of the hints might not improve the performance.

  6. Pingback: PHP Blogger: Linktipps: Performance und Produktivität - Ein PHP Blog auf deutsch

  7. Thanks for this post, really nice tips! I had the same experience when avoiding extra notices and errors. Performance was always much better than before.

  8. I am using Zend Framework. When I browse that site, sometimes the page goes blank, but in the view source I get all the code.

    Can anyone reply?

  9. Pingback: PHPのパフォーマンス向上のためのリンク集 | WEB RHODIA

  10. The biggest problem with improving performance is finding the spot where most of the time is spent.

    Is there any software that can be installed on a production host and later tell
    which functions took the most time?

    The problem is that if you take only one request on a dev server, it will produce completely different results compared to what you get from 10 000+ requests on different modules of a server.

    Also, it might be interesting to have the ability to “replay” server load.
    For example: you record all user activity (on production) for an hour, then you use this data on a development server to replay it after making major performance changes.

    • Interestingly enough, I’m working on a tool just like that to be a part of the next Zend Server version. Should be out in the coming months :)

      • Then there are even more reasons to start testing Zend Server, at least in a development environment :)

  11. Pingback: Google PHP performance tips | Jonas Lejon

  12. Pingback: Fordnox » Blog Archive » PHP optimization advice from Google

  13. Pingback: Stationsbloggen » Arkivet » Google kan inte PHP

  14. Pingback: Echte PHP Performance Tipps | CWD - Customized Web Development

  15. –try and write your code so it would not produce notices–

    Does it mean that I should check if every $_POST or $_SERVER isset()? It creates a notice if it is not checked.

    It is not clear why I should do it. It seems to me it takes time to check whether it is set or not, while I am pretty sure that it is set even without this check.

    And I would mention one more item. While sanitizing an integer in PHP 4 it is enough to use casting $i=(int)$i;

    In PHP5 there are special sanitizing filters already, but on my production site it is still PHP4. So I use casting. I read somewhere that casting for integers is very fast and cheap on resources.

    • I don’t know how you can be sure something in $_POST is set – you don’t control it, the client does. But if you _know_ it’s set (e.g., you checked it before) – then, of course, no point in checking twice :)
      Now, you can take all kinds of shortcuts, including not checking certain things and even risking producing some notices in some rare cases, and that may be ok. That’s your decision as an app developer. What I wanted to achieve here is for you to be aware that there’s a cost in it. If you’re OK with the cost and other consequences of not checking something – it’s fine.

      • Stas, thank you.

        For example, I want to make sure that a page is not shown via SSL (https://). I use this:

        if ($_SERVER['HTTPS'] == 'on') {
            header("Location: http://{$_SERVER['HTTP_HOST']}{$s_root}");
        }

        It creates a notice. But this does not create a notice:

        if (isset($_SERVER['HTTPS'])) {
            if ($_SERVER['HTTPS'] == 'on') {
                header("Location: http://{$_SERVER['HTTP_HOST']}{$s_root}");
            }
        }

        What is faster, to check if $_SERVER['HTTPS'] is set, or ignore this suppressed notice? I work on one server, and $_SERVER['HTTPS'] is sort of always set.

        Another situation: in the shopping cart are incoming variables $_POST['id'], $_POST['title'], $_POST['price'], $_POST['quantity'], $_POST['block_price'], $_POST['block_quantity'], $_POST['ifpackage'].

        Right now I only check whether one of them, $_POST['id'], is set. If it is set, I assume that the others are set too. Then I sanitize them and use them.

        This creates suppressed notices. But if I check that all 7 $_POST incoming variables are set, then there is no notice. What is your opinion? Checking if 7 variables are set is cheaper than ignoring this notice? Or not.

        • You can use a small function for fetching data from $_POST.
          Something like:

          function post($field,$default=null){
          return isset($_POST[$field]) ? $_POST[$field] : $default;
          }

          1. it won’t generate any notices
          2. easier to write: just post('id')
          3. easy to set a default value.

      • Stas, I switched on reporting of notices in php.ini and rewrote my application so that notices are not generated.

        I removed several stupidities in my code at the same time, like: $link=mysql_connect(localhost,$username,$password)

        PHP thought that localhost was a constant, an empty one, and actually substituted it with the string ‘localhost’.

        It was not that difficult to remake the code so that it does not produce any notice. Besides, I read somewhere yesterday that long code does not mean slow code, as it is compiled before execution. And the point is to write correct code.

        Thank you for pointing me in this direction.

        • A few years ago I worked on a large PHP-based CMS. After we spent a day fixing all the error notices, we noticed a small but real improvement in performance.

          Will

  16. Some run servers in cgi mode with suphp so that scripts run in the name of the user. In that case which bytecode cache is the best to be used in a multi user environment?

    • I think cgi/suphp is currently not compatible with bytecode caches. It can be made to work, but AFAIK right now it’s not supported by any of the major ones.

  17. Pingback: [PHP] パフォーマンス向上の心得 | Screw-Axis

  18. Pingback: More on PHP performance « PHP 10.0 Blog

  19. Caching is a must in any application. I currently use PEAR Cache Lite on my projects and it’s very fast and easy to integrate !
    Great post.

  20. Pingback: Interesting reads (PHP & more) and tools (Twitter & more) « Jon-G blogs for Net-Entwicklung.de

  21. Excellent post, thank you.

    We have dramatically increased our overall site performance by caching just a couple of expensive queries :-)

  22. Pingback: Web Development

  23. Pingback: Web Development

  24. Pingback: Top Posts « WordPress.com

  25. Pingback: Marcyes / More on PHP performance - PHP 10.0 Blog

  26. I agree wholeheartedly on the whole benchmarking and profiling. I just wrote up an article last week or so on a few benchmarking tools we use and how we use them. It might help a few people take that first step into trying it out. Just don’t spend all your time benchmarking and not doing any coding :)

    http://blueprint.intereactive.net/benchmarking-our-php/

    Also really helpful was your take on leveraging extensions. I realize that I don’t know or use half the extensions out there that could help ease my coding.

    Great article!!!

  27. Pingback: PHP Performance « 3wstudio

  28. Pingback: PHP 10.0 Blog: More on PHP performance | Webs Developer

  29. One simple optimization that most people miss is not using count($var) as a condition of for loops.


    // Wrong
    for ($i = 0; $i < count($Results); $i++)
    {
    // Do Something
    }

    // Correct
    $ResultCount = count($Results);
    for ($i = 0; $i < $ResultCount; $i++)
    {
    // Do Something
    }

    It’s very simple, and now you’re not counting the size of the array on every pass through the loop.

    • T. Crider, this is one of the tips mentioned in the “don’t try to outsmart the engine” paragraph.

      And instead of adding a line you can set the ResultCount variable in the first segment of the for loop which makes the context of the variable clearer.

      for($i = 0, $j = count($Results) ; $i < $j ; $i++)
      {
      // do something
      }

      • @ Shane:
        When using post-increment, the value of the variable is stored in a temporary location. With pre-increment the temporary variable isn’t needed.
        It’s not much, but with a couple of 100,000 cycles you see a little difference. ;)

        • I also have an example where improvements like using if($var===null) instead of if(is_null($var)) significantly improved performance (by 1-2 seconds). But it was an algorithm searching for one solution among 300! combinations.

  30. Pingback: PHP 10.0 Blog: More on PHP performance | DreamNest - Technology | Web | Net

  31. Pingback: Focus On | News | Server-Side Magazine

  32. It all makes sense to me, nice list.

    Only once you get below 50-100 ms of rendering time should you consider micro-optimizations and other voodoo. I guess the problem is that people read presentations saying: ‘we reduced time by 50% changing from include_once to include …’. Although it might make a difference to someone like Facebook/Flickr or another giant, you won’t feel it unless your render time is so small that there is nothing else left to improve.

    Nice article, bring us some more :- ))

    Art

  33. Nice article. In my opinion, if your application uses a database (MySQL, PostgreSQL, whatever), you should start there first. I have seen applications where the database queries just killed them :-) because someone didn’t know how to use an ORM, and so on.
