PHP performance tips from Google
Posted by Stas on June 26, 2009
I saw a link on Twitter to PHP optimization advice from Google. There is a fair amount of advice there, and some of it is quite sound, if not new: use the latest versions if possible, profile your code, cache whatever can be cached, and so on. Some is of doubtful value, like the output buffering tip, which can be useful in some situations but do nothing or even hurt in others. If you’re a beginner, it’s generally better to leave it alone until you’ve solved the real performance problems.
However, some of the advice makes no sense at best and is potentially harmful at worst. Let’s get to it:
First one: Don’t copy variables for no reason. I don’t know what the author intended to describe there, but the PHP engine uses reference counting with copy-on-write, so absolutely no copying happens when variables are assigned the way they describe:
$description = strip_tags($_POST['description']);
echo $description;
I don’t know where it comes from but it’s just not so, unless maybe in some prehistoric version of PHP. Which means unless you’re going back to 1997 in a time machine this advice is no good for you.
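To see copy-on-write for yourself, here is a minimal sketch. The helper name measureCopyOnWrite is made up for illustration, and the exact byte counts depend on your PHP version and memory manager. It measures memory use after assigning a large string to a second variable, and again after that second variable is modified:

```php
<?php
// Sketch: measure memory around an assignment and around the first write.
// measureCopyOnWrite() is a hypothetical helper, not part of PHP.
function measureCopyOnWrite($size)
{
    $original = str_repeat('x', $size);   // one large string value
    $before = memory_get_usage();

    $copy = $original;                    // assignment: both names share one value
    $afterAssign = memory_get_usage();

    $copy .= 'y';                         // first write: only now is the data duplicated
    $afterWrite = memory_get_usage();

    return array(
        'assign' => $afterAssign - $before,
        'write'  => $afterWrite - $afterAssign,
    );
}

$deltas = measureCopyOnWrite(1024 * 1024);
printf(
    "assignment: %d bytes, first write: %d bytes\n",
    $deltas['assign'],
    $deltas['write']
);
```

With a refcounting engine the assignment delta should be close to zero, while the first write should cost roughly the size of the string.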
Next one: Avoid doing SQL queries within a loop. This might actually make sense in some situations; however, the code example they give is missing one important detail, which makes it potentially harmful for beginners (see if you can spot it):
$userData = [];
foreach ($userList as $user) {
$userData[] = '("' . $user['first_name'] . '", "' . $user['last_name'] . '")';
}
$query = 'INSERT INTO users (first_name,last_name) VALUES' . implode(',', $userData);
mysql_query($query);
Please repeat after me – DO NOT INSERT USER DATA INTO SQL WITHOUT SANITIZING IT!
Of course, I cannot know that $user was not sanitized. Maybe the intent was that it was. But if you give such an example and target beginners, you should say so explicitly, every time! People tend to copy and paste examples, and then you get SQL injection on a government site.
Another thing: most real-life PHP applications do not insert data in bulk, except in some very special scenarios (bulk data imports, etc.), so in most cases one would be better off using PDO and prepared statements, or a higher-level framework that will do it for you. But if you roll your own SQL, sanitize the data! This is much more important than any performance tricks.
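For illustration, here is roughly what the insert could look like with PDO and a prepared statement. This is a sketch under assumptions, not a drop-in replacement: it uses an in-memory SQLite database (so it needs the pdo_sqlite extension) instead of MySQL, and the table layout and sample data are invented.

```php
<?php
// Sketch only: in-memory SQLite stands in for a real database.
// Table layout and sample users are made up for the example.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE users (first_name TEXT, last_name TEXT)');

$userList = array(
    array('first_name' => 'Ada', 'last_name' => 'Lovelace'),
    array('first_name' => 'Robert', 'last_name' => '"); DROP TABLE users; --'),
);

// Prepare once, execute many times. The values are sent as parameters
// and never become part of the SQL text.
$stmt = $pdo->prepare(
    'INSERT INTO users (first_name, last_name) VALUES (:first, :last)'
);

$pdo->beginTransaction();
foreach ($userList as $user) {
    $stmt->execute(array(
        ':first' => $user['first_name'],
        ':last'  => $user['last_name'],
    ));
}
$pdo->commit();

$count = (int) $pdo->query('SELECT COUNT(*) FROM users')->fetchColumn();
echo "inserted $count rows\n";
```

The loop still runs one statement per row, but the hostile last name is stored as plain data, and wrapping the loop in a transaction keeps the per-row overhead down.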
Next one: Use single-quotes for long strings. PHP code is parsed and compiled, and any possible difference in speed between parsing "" and '' is really negligible unless you operate with hundreds of megabyte-size strings embedded in your code. If you do, your quotes probably aren’t where you should start optimizing. And of course, using a bytecode cache (see below) eliminates this difference altogether.
Next one: Use switch/case instead of if/else. This makes no sense, since switch does essentially the same thing as a chain of ifs. See for yourself; here is the “if” code:
0 2 A(0) = FETCH_R(C("_POST")) [global]
1 2 A(1) = FETCH_DIM_R(A(0), C("action")) [Standard]
2 2 T(2) = IS_EQUAL(A(1), C("add"))
3 2 JMPZ(T(2), 7)
4 3 INIT_FCALL_BY_NAME(function_table, C("addUser"))
5 3 Au(3) = DO_FCALL_BY_NAME() [0 arguments]
6 4 JMP(16)
7 4 A(4) = FETCH_R(C("_POST")) [global]
8 4 A(5) = FETCH_DIM_R(A(4), C("action")) [Standard]
9 4 T(6) = IS_EQUAL(A(5), C("delete"))
10 4 JMPZ(T(6), 14)
11 5 INIT_FCALL_BY_NAME(function_table, C("deleteUser"))
12 5 Au(7) = DO_FCALL_BY_NAME() [0 arguments]
13 6 JMP(16)
14 7 INIT_FCALL_BY_NAME(function_table, C("defaultAction"))
15 7 Au(8) = DO_FCALL_BY_NAME() [0 arguments]
16 9 RETURN(C(1))
17 9 HANDLE_EXCEPTION()
Here is the “switch” code:
0 2 A(0) = FETCH_R(C("_POST")) [global]
1 2 A(1) = FETCH_DIM_R(A(0), C("action")) [Standard]
2 3 T(2) = CASE(A(1), C("add"))
3 3 JMPZ(T(2), 8)
4 4 INIT_FCALL_BY_NAME(function_table, C("addUser"))
5 4 Au(3) = DO_FCALL_BY_NAME() [0 arguments]
6 5 BRK(0, C(1))
7 6 JMP(10)
8 6 T(2) = CASE(A(1), C("delete"))
9 6 JMPZ(T(2), 14)
10 7 INIT_FCALL_BY_NAME(function_table, C("deleteUser"))
11 7 Au(4) = DO_FCALL_BY_NAME() [0 arguments]
12 8 BRK(0, C(1))
13 9 JMP(15)
14 9 JMP(19)
15 10 INIT_FCALL_BY_NAME(function_table, C("defaultAction"))
16 10 Au(5) = DO_FCALL_BY_NAME() [0 arguments]
17 11 BRK(0, C(1))
18 12 JMP(20)
19 12 JMP(15)
20 12 SWITCH_FREE(A(1))
21 13 RETURN(C(1))
22 13 HANDLE_EXCEPTION()
No. CONT BRK Parent
0 20 20 -1
You can see there’s a little difference: the latter has CASE/BRK opcodes, which act more or less like IS_EQUAL and JMP with slightly different plumbing, but in general the code is the same. You could even argue the “switch” code is a bit less optimal, but that is really an area you shouldn’t be concerned with before you can read and understand the code in zend_vm_def.h, which is not exactly beginner stuff.
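The PHP source behind the two dumps is not shown here. A plausible reconstruction, assuming handlers named as in the opcodes (their bodies below are invented, and the dispatch wrappers are hypothetical), would be something like this:

```php
<?php
// Hypothetical handler bodies; only the names appear in the opcode dumps.
function addUser()       { return 'add'; }
function deleteUser()    { return 'delete'; }
function defaultAction() { return 'default'; }

// The "if" variant: each condition reads $post['action'] again.
function dispatchWithIf(array $post)
{
    if ($post['action'] == 'add') {
        return addUser();
    } elseif ($post['action'] == 'delete') {
        return deleteUser();
    } else {
        return defaultAction();
    }
}

// The "switch" variant: the subject is read once, then compared per case.
function dispatchWithSwitch(array $post)
{
    switch ($post['action']) {
        case 'add':
            return addUser();
        case 'delete':
            return deleteUser();
        default:
            return defaultAction();
    }
}

foreach (array('add', 'delete', 'other') as $action) {
    $post = array('action' => $action);
    printf(
        "%s: if=%s switch=%s\n",
        $action,
        dispatchWithIf($post),
        dispatchWithSwitch($post)
    );
}
```

Both variants use loose comparison and pick the same branch for every input, so the choice between them is about readability rather than speed.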
Another thing that the author absolutely failed to mention, and which should be one of the very first things anybody who cares about performance does, is to use a bytecode cache. There are plenty of free ones (shameless plug: Zend Server CE includes one of them), which give you all the performance improvements for $0, and you don’t have to change a bit of code to run them.
Now, I understand Google is not a PHP shop like Yahoo or Facebook or many others. But this article is signed “Eric Higgins, Google Webmaster”, and one would expect something much more sound from such a source. In fact, there are a lot of blogs and conference talks on the topic, and lots of community folks who I am sure would be ready to help with such an article. I wonder why that wasn’t done. Why is the best advice we can apparently find from Google either trivial, useless, or wrong?
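If you are not sure whether your installation already runs a bytecode cache, a quick check along these lines can tell you. This is only a sketch: the helper name is made up, and the list of extension names is an assumption covering a few well-known caches (APC, XCache, eAccelerator, and the OPcache extension bundled with later PHP versions), not an exhaustive one.

```php
<?php
// Sketch: report which known bytecode cache extensions are loaded.
// loadedOpcodeCaches() is a hypothetical helper; the candidate list
// is an assumption and may not match your setup.
function loadedOpcodeCaches()
{
    $candidates = array('Zend OPcache', 'apc', 'xcache', 'eaccelerator');
    $loaded = array();
    foreach ($candidates as $name) {
        if (extension_loaded($name)) {
            $loaded[] = $name;
        }
    }
    return $loaded;
}

$caches = loadedOpcodeCaches();
echo $caches
    ? 'bytecode cache loaded: ' . implode(', ', $caches) . "\n"
    : "no known bytecode cache loaded\n";
```

Note that an extension being loaded does not necessarily mean the cache is enabled for the current SAPI; check its configuration as well.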
I think they can do much better, and they should if they take “making the web faster” seriously.
P.S. After writing all this, I also found a comment from Gwynne Raskind, which I advise you to read too.
danielj said
erm. Just quoting is not enough and will not prevent SQL injection.
You should properly cast and escape. Cast numerics and use mysql_real_escape_string() (or an appropriate function for your DB) on all literals. I prefer sprintf() for that:
$sql = sprintf( '
SELECT …
FROM table
WHERE numeric_column = %d
AND string_column = "%s"
AND float_column = %f '
, $numeric
, mysql_real_escape_string( $string, $connection)
, $float
);
Stas said
You are right, quoting is not a good description, as it might be misunderstood to mean that just sticking quotes around the variable is enough. I’ve corrected the wording.
Julien said
$sql = "SELECT `…` FROM `table` "
. "WHERE `numeric_column` = '" . (int)$numeric . "' "
. "AND `string_column` = '" . mysql_real_escape_string( $string, $connection) . "' "
. "AND `float_column` = '" . (float)$float . "'";
This is how I’d do it
Phil said
That’s all well and good until someone who doesn’t know what they are doing comes along and “maintains” your code, or uses the same style and forgets a cast.
Best bet is to move over to PDO and use prepared statements with type hinting.
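As a rough illustration of that suggestion, here is a sketch that binds values with explicit PDO parameter types. It assumes an in-memory SQLite database (pdo_sqlite extension); the table name items and the sample row are invented, and only the column names follow the queries above.

```php
<?php
// Sketch only: in-memory SQLite; table name and sample data are made up.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE items (numeric_column INTEGER, string_column TEXT)');
$pdo->exec("INSERT INTO items VALUES (42, 'O''Reilly')");

$numeric = '42';         // e.g. raw request input
$string  = "O'Reilly";   // the quote would break a naively built query

$stmt = $pdo->prepare(
    'SELECT COUNT(*) FROM items
     WHERE numeric_column = :num AND string_column = :str'
);
// Explicit types: the driver handles quoting, nobody can forget a cast.
$stmt->bindValue(':num', (int) $numeric, PDO::PARAM_INT);
$stmt->bindValue(':str', $string, PDO::PARAM_STR);
$stmt->execute();

$matches = (int) $stmt->fetchColumn();
echo "matching rows: $matches\n";
```

The query text stays constant no matter what the input contains, which is what makes this style harder to break during later maintenance.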
Tom Wardrop said
I don’t think there’s anything misleading or inappropriate about said article. I think all the tips mentioned would be quite helpful to beginners. Sure, you may be saving microseconds here and there, but considering it doesn’t take any extra effort to implement these tips (besides just knowing and getting used to them), it’s definitely worth it.
Now, let’s talk about the first example, “Don’t copy variables for no reason.” This is a perfectly valid tip. The result of the example code given is two completely separate strings, almost identical (hence, double the memory is taken up).
As for the second example, this is also fine. So as not to confuse the beginner, he’s focused only on the tip at hand, without bringing up SQL injection, which hasn’t got anything to do with performance at all. If he were to mention such a thing, he’d almost have to write a completely separate article.
Now finally, the last tip you criticise is also a valid tip. I guess his only mistake was not mentioning under what circumstances the switch statement would be quicker. If you have an if statement which contains some form of processing (besides the actual boolean evaluation), such as a function call, then the switch statement will be quicker, as it will only have to run that function (as an example) once, whereas an else/if statement would need to re-run the function for every ‘else if’.
Stas said
I think there’s a much more important point, and my fault is that I failed to convey it beyond all the technical details. The point is that even if those tricks were valid (which I still maintain most of them aren’t, at least in the form they were presented), if you are a beginner starting to optimize your application, a random collection of engine tricks that might save 0.1% of execution time in random places is absolutely the worst advice you could be given. Optimization should not start with engine tricks. Actually, when you optimize properly you will probably need no such tricks anyway, as I can hardly imagine any application whose performance depends on whether you use single or double quotes.
Giving a beginner these random tricks of doubtful use, and mentioning the really important things only in passing, gives readers the impression that performance optimization is, at least when it comes to PHP, a collection of weird tricks, and that this is the way to make your applications perform. Nothing could be farther from the truth.
I think I need another post on the matter…
Piccolo Principe said
The topic of single quotes is often dismissed, but the Zend Framework coding standards, for example, require single quotes. I think it’s a KISS approach: don’t use double quotes if you don’t do variable substitution, because people will think that there are variables in the string, just as you use private methods instead of making everything public, because that way no one will assume the method is called from an external class (and thus cannot be refactored easily). And you also avoid having ‘\n’ parsed.
Herman Radtke said
The variables tip is misleading. Obviously variable assignment in any language costs something, even in C. The author is focusing on PHP here and implying that the cost is high enough to consciously avoid variable assignment. A beginner will take this “tip” and try to use as few variables as possible to make the code “fast”. As Stas points out, this will have almost zero effect on the speed of the code. It will, however, have disastrous consequences for the readability and maintainability of the code.
Richard Lynch said
$foo = $bar;
This does *NOT* take up twice as much memory!
It takes an extra 8 bytes or so, until you *change* $foo or $bar, at which point copy-on-write kicks in.
Jamie said
Case/switch is actually a little less optimal. I’ve seen benchmarks done using if blocks and switch blocks. The if blocks performed a bit faster. My only guess is that Google is going for readability of the code.
PHP 10.0 Blog: PHP performance tips from Google | Webs Developer said
[...] this new post to the PHP 10.0 blog Stas has some responses to the recent suggestions from Google as to how to [...]
Adam said
Great article. Way to stick it to Google.
The only thing I might say in response is that they may have a point with if/else vs. switch: if you look at the opcodes, you see that the if/else version accesses the key in the hash table for each if/elseif, whereas switch only does it once. I don’t know how much of a difference this makes, but I’ve always tried to avoid hash table reads in favor of reads from a regular variable.
Just my two cents.
Stas said
It might make some sense if they had a function and noted that it takes significant time to run, but saving a couple of hash lookups isn’t really a thing you should be worried about. Even then, the difference is not so much between if() and switch() per se as between doing the same thing once and doing it multiple times.
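A small sketch of that last point, with a made-up stand-in for the expensive function (the CallCounter class and currentAction() are hypothetical and exist only to count invocations): an if/elseif chain evaluates its subject once too, as long as you store the result first.

```php
<?php
// Hypothetical helper that records how often the "expensive" call runs.
final class CallCounter
{
    public static $calls = 0;
}

// Stand-in for a slow function; here it only counts and returns a value.
function currentAction()
{
    CallCounter::$calls++;
    return 'other';
}

// Variant 1: the call is repeated in every branch condition.
CallCounter::$calls = 0;
if (currentAction() == 'add') {
    $result = 'add';
} elseif (currentAction() == 'delete') {
    $result = 'delete';
} else {
    $result = 'default';
}
$callsRepeated = CallCounter::$calls;

// Variant 2: call once, keep the result, and still use if/elseif.
CallCounter::$calls = 0;
$action = currentAction();
if ($action == 'add') {
    $result = 'add';
} elseif ($action == 'delete') {
    $result = 'delete';
} else {
    $result = 'default';
}
$callsCached = CallCounter::$calls;

echo "repeated: $callsRepeated call(s), cached: $callsCached call(s)\n";
```

The saving comes from the local variable, not from the choice of control structure.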
Andrei said
Google’s list of PHP optimization advice is not even funny …
“Optimization is hard! Let’s go shopping”
Visko said
The article is a joke. They are taking the piss, don’t you get it?
Mike said
Good article, but I disagree with your claim that “any possible difference in speed between parsing "" and '' is really negligible”. Even though it is a small amount of extra processing time, it can slowly add up, and if you have a high-traffic website, a 0.1% increase in efficiency can lower server costs considerably.
Mark said
Hey,
Nice article you have there.
Now I’m wondering… how did you produce those if and switch code blocks? Is that assembly code or C code? Could you explain how I can get output like that?
Thanx,
Mark.
Stas said
This output is the engine opcodes (see the sources in http://cvs.php.net/ZendEngine2/zend_vm_def.h if you are interested in how they work), the intermediate code that PHP source gets compiled into. Unfortunately, the tool that I used to display them isn’t public as of now, but there are other tools that can do the same.
Tim said
Note that in the “Avoid unnecessary copies” bit you seem to have missed the call to strip_tags(). After the assignment, $description != $_POST['description'].
Stas said
I know. However, whatever strip_tags produces, it’s there. It doesn’t matter if you assign it to one variable, to 10 variables, or to no variables. All the assignment does is create one more entry in the hashtable and change the refcount.
Sebastian said
Thanks, Stas, for pointing out the article’s errors and shortcomings. Google should really be doing much better than that pap piece on PHP performance. To write about performance and not include any metrics is shameful. As is not following one’s own first piece of advice: “Profile your code to pinpoint bottlenecks”.
There are many pages on the web giving similar performance tips for PHP and all they really seem to do is give PHP a bad name. Clearly, quite some effort went into making the Google article (the video is slick). It would have been much better if Google put the effort into making a more in-depth piece on profiling or caching. I know I’ve found information on APC, for instance, a bit lacking.
PHP 10.0 Blog: More on PHP performance | DreamNest - Technology | Web | Net said
[...] post on the PHP 10.0 blog, Stas looks at performance in PHP applications as his own response to the Google suggestions they recently released. So here are my thoughts about what would be good for the beginner to [...]
Echte PHP Performance Tipps | CWD - Customized Web Development said
[...] PHP performance tips from google by Stas Malyshev [...]
Steve-o said
Switch statements vs if/then/else statement performance is different from language to language. The preference for switch statements is primarily for “read-ability.” It’s actually good that the performance of switch and if/then/else statements in PHP is on the same level. There should now be no excuse to NOT use switch statements when applicable, for the sake of yourself and all other developers. Avoiding complicated logic chains should always be the goal.
But I agree that most of the code examples by Google were horrendous.