snapsvg

2012-07-19

Today's PHP-induced bug: arbitrary object properties

Today's stupid bug that shouldn't happen is brought to you by PHP's retarded object system. The various systems it appears to be bastardised from are incompatible but it is mostly similar to Java's: single inheritance, interfaces and access protection.

Access protection is where today's bug comes from. As you had better already know, PHP's classes can be declared with properties:

class Foo {
    public $bar;
    protected $_foo;
    private $_baz;
}

This declares that, given an object $obj of type Foo, anyone can access $obj->bar, but $obj->_baz is reachable only from code lexically within Foo itself, and $obj->_foo only from within Foo or a subclass of Foo.
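
As a quick check of those visibility rules, here is a minimal sketch. The subclass FooChild, the accessor methods and the default values are made up for illustration, and the catchable Error assumes PHP 7 or later (older versions stop with a fatal error instead).

```php
<?php
class Foo {
    public $bar = 'bar';
    protected $_foo = 'foo';
    private $_baz = 'baz';

    // Private members are reachable only from code written inside Foo itself.
    public function baz() {
        return $this->_baz;
    }
}

// Hypothetical subclass: it can see the protected $_foo, but not the private $_baz.
class FooChild extends Foo {
    public function foo() {
        return $this->_foo;
    }
}

$obj = new FooChild();
echo $obj->bar, "\n";   // public: fine from anywhere
echo $obj->foo(), "\n"; // protected: reached through the subclass
echo $obj->baz(), "\n"; // private: reached through Foo's own method

try {
    echo $obj->_foo;    // protected, read from outside the class hierarchy
} catch (Error $e) {
    echo get_class($e), ': ', $e->getMessage(), "\n";
}
```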

But unlike Java's slightly more robust system, PHP allows you to arbitrarily create properties on objects:

$obj->bazinga = 1;
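
To see what that assignment actually does, here is a small sketch using a cut-down Foo. As far as I know, PHP 8.2 and later print a deprecation notice for the assignment, but the property is still created.

```php
<?php
class Foo {
    public $bar;
}

$obj = new Foo();
$obj->bazinga = 1; // not declared anywhere on Foo

var_dump(property_exists('Foo', 'bazinga')); // asks about the class, which declares no such property
var_dump(property_exists($obj, 'bazinga'));  // asks about this particular object, which now has one
var_dump(get_object_vars($obj));             // lists the object's accessible properties, declared or not
```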

Since bazinga is not even declared on class Foo, the correct response for any self-respecting object system would be to tell you to sod off and stop playing the fool.

Alternatively, it would not provide a way for you to declare properties on the class in the first place, because doing so is wholly inappropriate when any property not declared is implicitly available, and public.

This sort of logic is akin to the logic of the courts that ruled that ISPs must block The Pirate Bay, but not block any other website that shares torrents, nor any website that allows you to access The Pirate Bay via proxy. That is to say, it's never going to work.

The problem is of course that the point of declaring protected and private member variables on classes is twofold, like most things:

  1. The structure of the data represented by objects of this class is a defined contract. Attempting to access non-existent properties will error because the user is trying to do something that cannot be done by the contractual interface, meaning that the user is most likely mistaken as to the identity of the variable they are dealing with.
  2. The data structure of the object is usually not for public consumption and defines a state in which the object may be. The object exposes an interface by which the data values can be manipulated but the user has no control over how this manipulation takes place internally.

If you want arbitrary strings to be available as public properties on a data structure, what you want is a map. This is implemented as an associative array in PHP, a hash in Perl and Ruby, and a dictionary in Python.
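
For comparison, a minimal sketch of the map approach in PHP; the variable name and keys are hypothetical.

```php
<?php
// Arbitrary string keys belong in a map; in PHP that is an associative array.
$settings = array();
$settings['bazinga'] = 1;
$settings['colour'] = 'red'; // hypothetical keys, purely for illustration

// Lookups are explicit: you ask whether the key is there before using it.
if (array_key_exists('bazinga', $settings)) {
    echo $settings['bazinga'], "\n";
}
var_dump(array_key_exists('missing', $settings));
```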

Point 1 of the two reasons why you don't do what PHP doesn't understand why you don't do is the one that bit me. I had defined an interface on my class promising that any value passed to a particular method would later be returned by the same method when it is called with no argument (or with null):

public function value($val = null) {
    if ($val !== null) {
        $this->_value = $val;
    }

    return $this->_value;
}

A very common interface. The fact that it is implemented by storing the value in the protected property $_value is no business of the user's. The class defined no public property '$value' but, when I did this:

$obj->value = 'foo';

No complaints were raised. And yet for all such objects, nullness persisted throughout my data.
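
A self-contained sketch of the mistake, assuming a stand-in class named Thing (my real class had rather more in it) and hypothetical variable names for the results:

```php
<?php
class Thing { // hypothetical class name
    protected $_value;

    public function value($val = null) {
        if ($val !== null) {
            $this->_value = $val;
        }

        return $this->_value;
    }
}

$obj = new Thing();
$obj->value = 'foo';            // the mistake: assigns a brand-new public property named "value"

$viaMethod = $obj->value();     // the accessor reads $_value, which was never touched
$viaProperty = $obj->value;     // the stray property is where the string went
var_dump($viaMethod, $viaProperty);

$obj->value('foo');             // what was actually meant
var_dump($obj->value());
```

Methods and properties live in separate namespaces in PHP, so a method called value and a property called value can sit side by side without either one complaining.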

It is not simply that my class, not having declared the property 'value', had no obligations regarding such a property: the interface defined on my class had the very specific obligation to deny the property 'value' from being accessed in any way whatsoever, because that property is not a part of the data structure represented in my class.

PHP allows you to access any identifier on an object as a property, and only the subset that you explicitly declare protected or private is actually protected. Access to anything else is first handed to a special method, __get for reads or __set for writes, if the class defines one; if it doesn't, an assignment just goes ahead and creates a public property for you.

I have no inherent problem with these magic get/set methods per se; the problem arises when the default, if a class doesn't define them, is to allow the thing through! If the default reaction were to tell you to take a hike, this whole thing would be fine. Give a class the opportunity to deal with arbitrary properties, sure; that then becomes part of the contractual interface of the object. But no contract should be so liberal as to let people crap all over it unchecked.
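
For what it's worth, the default I'm asking for can be approximated by hand. This is only a sketch: the trait name StrictProperties, the class Thing and the choice of LogicException are my own inventions here, and traits need PHP 5.4 or later.

```php
<?php
// Hypothetical trait: makes "take a hike" the default for undeclared properties.
trait StrictProperties {
    public function __get($name) {
        throw new LogicException('No readable property ' . get_class($this) . '::$' . $name);
    }

    public function __set($name, $value) {
        throw new LogicException('No writable property ' . get_class($this) . '::$' . $name);
    }
}

class Thing { // hypothetical class name
    use StrictProperties;

    protected $_value;

    public function value($val = null) {
        if ($val !== null) {
            $this->_value = $val; // declared and accessible here, so __set is not involved
        }

        return $this->_value;
    }
}

$obj = new Thing();
$rejected = false;
try {
    $obj->value = 'foo'; // now rejected instead of silently creating a property
} catch (LogicException $e) {
    $rejected = true;
    echo $e->getMessage(), "\n";
}
```

The magic methods only fire for properties that are undeclared or inaccessible from the calling scope, so the class's own use of $_value carries on as before.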

PHP: you are doing classes wrong.


5 comments:

  1. It sounds like a lot of design mistakes in PHP, in that it's an attempt to combine ideas from two different places, with the result that it seems to maintain all the worst things of both ideas with none of the advantages of either.

  2. Are you certain about the behavior you observed? To wit, the __get and __set magic methods must be explicitly defined in your class in order to be used by the PHP preprocessor. What's more, $obj->method is the same as calling $obj->method(), although you shouldn't be able to put a method call on the LHS of an assignment. Finally, PHP offers pretty flexible configuration - what were your error reporting settings when you observed these behaviors?

    1. Are you thinking of Perl? $obj->method refers to the property $method on the object. $obj->method() runs the method 'method'. The parens are required on functions in PHP, but the two are indeed equivalent in Perl.

      Here's a demonstration of the behaviour. Feel free to play around with it.

      http://sandbox.onlinephpfunctions.com/code/61dc9401034337f73ceb44ddddbff6cef7dcf652

  3. I think the problem is more with your expectation than with the language. You said "the interface defined on my class had the very specific obligation to deny the property 'value' from being accessed in any way whatsoever, because that property is not a part of the data structure represented in my class" - if that was what you wanted, you should have implemented it:

    public function __set($prop, $val) {
        if (! property_exists($this, $prop)) {
            throw new UnexpectedValueException("No such property {$prop}");
        }
    }

    1. Yes, I know I could implement it, but I still think the defaults are backwards. Classes define interfaces, and the expectation is that anything deviating from the interface is denied access by default. (The mechanisms by which you would override the default of "no, get lost" would be more complex, but PHP provides all the tools necessary to do it; I won't get into that here.)
