Static typing

There is some renewed discussion about introducing static typing in PHP. I just read a very interesting post, The Safyness of Static Typing, which I suggest everybody interested in this topic should read (along with the links there). You may agree or disagree, but it is worth reading, and even if you disagree it is worth making sure you know the answers to the questions raised there; otherwise your disagreement lacks substance. I must admit I liked that post because it agreed with my feeling (not substantiated until then by any experimental data beyond the general experience I’ve acquired in the field) that type safety is not as close to a silver bullet as some make it out to be.

Within the context of PHP, I’m not sure stricter typing would be beneficial (coercive typing is something in between and would require somewhat different treatment). I can see where it could be useful: for building a JIT, for example, it would probably be very nice. On the other hand, JavaScript has excellent JIT engines, as I have heard, without any addition of strict typing, so it’s not absolutely necessary. With PHP code living at runtime and static analysis tools not being a routine part of mainstream development, at least as far as I have seen, I’m not sure the addition of strict typing would help in any substantial way. The Facebook guys obviously disagree. I wonder if they have some data to back it up: how it worked in practice, and especially how the “hybrid” model, in which typed and untyped code coexist (that is what is happening as I understand it; maybe I am wrong here), works out, and whether it indeed provides better safety and reduced development time.

P.S. Oh, and if you want a surefire way to annoy me, please call strict typing “type hinting”. I’m sure there were examples of worse terminology in the history of PHP (“safe mode” comes to mind as one), but that does not excuse the most unfortunate decision to name strictly typed arguments “hinting”.


6 thoughts on “Static typing”

  1. Pingback: In the News: 2014-09-09 | Klaus' Korner

  2. I think in PHP there are different aspects to strict typing / type hinting, each providing a different benefit: (1) documentation, (2) type assertions at call time, (3) type assertions at design time, and (4) IDE support (IntelliSense/refactoring). For me the first and last are the most easily dismissed when you have docblock comments. There is limited need to hint types explicitly in code when they can simply be documented in a comment and the IDE will pick that up just fine. You could argue that comments grow stale, but type hints can grow stale as well, since they are not checked at design time anyway. So what remains is the ability to make type assertions.

    PHP’s type hinting is a way of doing call-time type assertions with less typing. If you wanted the same effect for primitive types, you would start your function with an explicit type check. It does basically the same thing. When people say strict typing, however, what they usually mean is design-time type assertions: your compiler tells you that you cannot do an operation on a type before the code is ever run. This lets you catch this class of errors before shipping. So, for me, type hinting is not the same thing as strict typing as most people understand it, unless you combine it with a static type checker that can validate that the code in its entirety respects the type hint (which means all the code that calls it must be equally type-hinted, something not possible in today’s PHP). That in turn would mean you lose the flexibility to do duck typing. I think I largely agree with what is in the video: strict typing replaces some of the checks that are in unit tests, so if you are going to have comprehensive unit tests you derive very little value from strict typing, and strict typing adds a cost in the form of a loss of flexibility.
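
    A minimal sketch of that comparison, assuming PHP 7.0 or later for the scalar type declarations. The function names are hypothetical and only illustrate the idea: a hand-written check and a declared type both reject a bad argument, and both do so only when the call is executed.

```php
<?php
// Hypothetical example: two ways to reject a non-integer argument at call time.

// Hand-written check: the assertion is ordinary code at the top of the function.
function repeatChecked($text, $times)
{
    if (!is_int($times)) {
        throw new InvalidArgumentException('$times must be an int');
    }
    return str_repeat($text, $times);
}

// Declared types (assumes PHP 7.0+): the engine performs the check on each call.
function repeatDeclared(string $text, int $times): string
{
    return str_repeat($text, $times);
}
```

    In PHP’s default coercive mode the declared version would still accept a numeric string such as '3', so the two checks are similar rather than identical. Either way, nothing is reported until the offending call actually runs.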

    I see two cases where you do want strict typing (design-time type assertions), and they are somewhat related. When you call an API of a module built by somebody outside of your team, you would like it to adhere to a contract: I accept these types, I produce these types. Strict typing lets you make that guarantee at design time. You can rely on that API respecting the contract and not sometimes producing or accepting a completely weird and undocumented type. So the two cases are: (1) large-scale software development, and (2) third-party code use. In both cases you are often dealing with APIs that you need to treat like black boxes. Then strict typing becomes really useful, since it mandates that all teams and all APIs follow a minimum set of contractual requirements. I think this is why Facebook is enamored with strict typing. At their scale, the argument that “you catch it with unit tests” falls flat on its face, because on such a large codebase you will have developers who do not write proper type checks in their unit tests (yet still have 100% line coverage), and then the design-time type guarantee is suddenly very useful.

    Personally, I fall into the camp that on large projects you should try to do as much checking (including type checking) at design time as you can, using static validation tools, and strict typing is one instrument in that toolbox. When you are dealing with multi-million line codebases the ability to make guarantees about all code through static validation logic is a clear quality benefit. Unit tests that pass tell you only one thing: the unit tests pass. They make no guarantees to you about what the underlying code does unless you read every single unit test and compare it to what the code is supposed to be doing. If you are responsible for maintaining code quality on a large codebase written by others, too large to do code review of all code, then you will want that static type checking in addition to those unit tests.

    • The problem is that the errors a type assertion will catch are extremely rare and usually immediately obvious. How often does a function try to return DateTime but by mistake return SplFileInfo instead? How often would you fail to catch such a thing with the first and simplest test? I’m not sure Facebook ever measured how much strict typing helped them, if at all, to catch bugs that otherwise wouldn’t have been caught. I suspect the number is much lower than one would imagine.

      The issue of interfacing with third-party libraries is solved by documentation, and without docs a type is not very useful. OK, now you know that the function returns DateTime, but what kind of DateTime? Where are the values coming from? What is the timezone? And so on: you still need the docs, because a bare type doesn’t give you much information. In fact, in many strictly typed languages the movement is in the reverse direction: to avoid spelling out the types and let the language derive them. And usually the type alone is not enough anyway. Say you know the function returns a string. Is it HTML? SQL? XML? Was it sanitized? Can you display it in a JavaScript context without causing XSS problems? If I run this SQL, would my database return the correct set of records? Typing almost never answers these questions (unless you get into languages with a much more powerful type system than PHP’s). These are real bugs that take people a lot of time to catch, and for these strict typing is no help.

      Strict typing helps a lot when you deal with C-like languages, where trying to use a string as an int can lead to an epic disaster. But in a language like PHP this disaster would not happen. Of course, you may still have some invariants broken when you have “foo” instead of a number and it gets converted to 0, but your code should know how to deal with 0 anyway, since 0 is a valid number, so type checking wouldn’t help you here.
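
      A small sketch of that conversion, using a hypothetical describeStock() function and assuming PHP 7.0 or later for the type declarations. The explicit cast turns a non-numeric string into 0 without any error, and the code downstream cannot tell that 0 apart from a legitimate one.

```php
<?php
// Casting a non-numeric string to int does not fail; it quietly yields 0.
$quantity = (int) 'foo';
var_dump($quantity);                  // int(0)

// Hypothetical downstream code: it has to handle 0 anyway, because 0 is valid.
function describeStock(int $quantity): string
{
    return $quantity === 0 ? 'out of stock' : "$quantity in stock";
}

echo describeStock($quantity), "\n";  // out of stock
```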

      • You’re right that string vs int is not a very useful distinction in PHP. You want to make a more specific subtype declaration: an int between 1 and 10, a string with a maximum length of 50 containing HTML, etc. However, I disagree that strict typing doesn’t benefit you in PHP at all. I’ve moved a web services layer from loose typing to strict typing (parsing phpdoc comments to extract type info and validating input parameters and return values based on that), and a surprising number of bugs were shaken out as a consequence.

        Examples of the sorts of bugs strict typing helps you catch that I observed in the wild when we introduced the strict typing system:
        – Off by one parameters (one parameter missing causing the rest to move over one spot)
        – Bad parameter ordering
        – Wrong parameter types (expecting int, getting bool)
        – Unpredictable return value types (it is declared to return type Foo but returns type Bar every once in a blue moon)
        – Underspecified response values (returning associative arrays instead of defined types)

        Strict typing is valuable at API boundaries. I find it more trouble than it’s worth in the body of functions, but the parameters and response types of a public method should be strictly typed in my opinion. You save time overall by having fewer bugs.
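
        As a rough illustration of typing at an API boundary, here is a sketch assuming PHP 7.0 or later with strict_types enabled. The Invoice class and createInvoice() function are hypothetical names, not part of the web services layer described above. A call with swapped arguments is rejected as soon as it crosses the boundary, and the return value is a defined type rather than an associative array.

```php
<?php
declare(strict_types=1);   // assumes PHP 7.0+; must be the first statement in the file

// Hypothetical defined response type, used instead of an associative array.
final class Invoice
{
    public $customerId;
    public $currency;

    public function __construct(int $customerId, string $currency)
    {
        $this->customerId = $customerId;
        $this->currency = $currency;
    }
}

// Hypothetical public API method: parameters and return value are typed.
function createInvoice(int $customerId, string $currency): Invoice
{
    return new Invoice($customerId, $currency);
}

try {
    // Bad parameter ordering: the two arguments are swapped.
    createInvoice('EUR', 42);
} catch (TypeError $e) {
    echo 'Rejected at the boundary: ', get_class($e), "\n";
}
```

        This is still a call-time check, so it fires only when the bad call is executed; finding it before the code runs would take a separate static analysis tool.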
