snapsvg

2014-02-27

Code review time!

Look! A horrible piece of code in a horrible language in a horrible frame for a sickeningly twee ceremony that should have been made obsolete along with the Inquisition!

Let's review it.

Here's the code, with line numbers.

01    <?
02      function do_wed() {
03        if ($objections != true) {
04          function do_vow() {
05            $vow = 1;
06            do {
07              if ($richer === 1
08                  && $poorer === 1
09                  && $sickness === 1
10                  && $health === 1) {
11                function have_hold($a,$b) {
12                  ini_set('session.gc_maxlifetime','forever');
13              }
14              have_hold('husband','wife');
15              define('friend', true);
16              define('partner', true);
17              define('faithful', true);
18              if ($i = 'do') {
19                   $f = 'finger';
20                   $r = 'ring;
21                   $f = $f + $r;
22                   }
23               }
24               $vow = $vow + 1;
25              } while ($vow != 2);
26            }
27            do_vow();
28            $register = array_fill($details);
29            print_r($register)
30            return $kiss;
31            }
32          }
33        do_wed();
34    ?>

Let's go!

line 1

We use long tags here. <?php

line 3

Undefined variable $objections.

$objections != true better written !$objections. But this is not what you meant; you meant count($objections) == 0, since it will be an array of them

line 4

Don't define functions inside other functions.

lines 6, 25

You know how many vows you want. Use a for loop. Better, use an array of vows and populate it with two Vow objects, which represent the conditions each person agrees to. This means you can marry more than 2 people. The do_wed() function should take the people to wed as arguments. Use func_get_args() to loop over all of them, or (...$parties) in the next version of PHP.

Useless loop anyway. do_vow() should be called twice with the person currently vowing.

"Twice" is a western concept. This code is not internationalised.

lines 7-10

Undefined variables. None of these equals 1. It is unlikely that all four of these things would equal 1 at the same time. You want to test the party's agreement to these concepts, not the value of these variables. You need Person objects.

line 11

A function in a function in a function? This function takes two parameters and uses neither. Get rid of them.

line 12

This ini parameter takes an integer. 'forever' is not an integer.

line 13

This closing brace does not line up with the function definition on line 13. It does line up with the if on line 7, which implies you've forgotten to close the function, but scrutiny shows that you've misaligned the brace.

line 14

have_hold does not take any parameters any more.

This is exclusivist. Not all marriages are between a husband and a wife. These should be parameters to do_wed().

This function is run twice, both times with the same parameters. It should swap over for the second iteration.

line 16

'partner' is presumably the person we are not currently dealing with.

line 17

'faithful' is not a boolean value and should be configured per app. It needs to be a data structure containing parameters of faithfulness, i.e. boundaries.

line 18

This is always true. Remove this condition. $i is never used, so remove the assignment too.

lines 19, 20

Useless variables. Either accept them as parameters or use the literal strings directly.

line 21

If you'd not used these useless variables you'd realise you're trying to numerically add strings. . is the concatenation operator. What is a 'fingerring'?

$f is discarded. Just omit this entire block.

line 22

What is this supposed to line up with?

line 23

This closes the if that looks like it is closed on line 13. But it does not line up with it.

line 24

Better written $vow++, but we've replaced this with an array of Vow objects containing agreement parameters, so don't do this any more.

line 25

The only reason this would be a while loop is if you're just going to keep asking until both (all) parties agree. This is not how one should enter into a marriage.

line 26

This closes do_vow() but does not line up with it.

line 27

This is what should be run n times, once per party in the agreement.

line 28

array_fill takes three parameters. Register should be an object.

line 29

Syntax error - missing semicolon.

print_r is not the best thing to use here. Serialise this properly, perhaps with JSON so it can be consumed by an API or HTML so it can be styled and displayed properly.

line 30

Undefined variable $kiss. Kiss is a verb and should be a function.

lines 31, 32

These braces should line up with what they close.

line 33

Don't run a function when it is defined - that's not how you create a library.

This function could at least be parameterised with the names of the people being married. Isn't Etsy about crafts and hence personalisation?

2014-02-06

Model student

Models! Model trains, model students, model aeroplanes, model citizens. Fashion model, data model, business model. Ford Model T. Model number.

All these different uses of the word model have a commonality, the understanding of which is important to the understanding of what it is we mean when we talk about models in computing. This commonality may be considered the abstract meaning of "model": the meaning that exists behind all the real-world uses of it.

This concept is that of representation. Physical models are scaled-down representations of the things they model. A fashion model is really the representation of real people who would wear clothes (showing quite how divorced from reality fashion really is). A business model is a wordy representation of how the business will operate. Even the term "Ford Model T" is actually referring to the blueprint of all cars of that type: "Model" is referring to the type, not the car itself.

In computing, then, a model is a representation, a blueprint, a prototype that encapsulates the important details about the thing it is modelling. A good model will be a minimal but sufficient representation of the system it is modelling.

An easy example is the rolling of dice.

1d6

Dice are a familiar system to everyone, I hope. They neatly encapsulate our idea of randomness, at least that one we're taught in primary school, whereby the outcome of the system is not predictable from the input.

When we roll a d6 we expect to see one of its six faces pointing upwards but we don't know which one until it does so. Indeed on most dice we see the number represented as a pattern of dots; the number of dots being the number it shows.

This, if you're not used to thinking in these terms, is very specific. There are many extra features of a d6 that have nothing to do with the randomness of the d6. Every feature of the die except its shape (and mass distribution) can be altered and it would still exhibit the same properties of randomness.

Modelling systems, therefore, requires a keen eye about what are the underlying mechanics that allow the system to work, and what are the superficial parts of it that happen to be the case in this particular instance.

At its barest, a d6 is a system that, when run, produces a random integer from 1 to 6. The random distribution is even across all numbers: which is to say, the more times it is rolled, the more we expect to see the counts for each result become equal.

To model a d6, therefore, we simply need a system that can produce the same result.

Math.ceil(Math.random() * 6)

This piece of Javascript models a 6-sided die. Run it in your browser's console if you don't believe me. Run it lots. Here's what happened when I ran it 50 times1:

[2, 2, 6, 3, 5, 4, 3, 3, 2, 4, 
 1, 5, 3, 4, 6, 1, 6, 6, 4, 5,
 3, 1, 6, 5, 2, 4, 6, 6, 6, 5,
 3, 6, 1, 2, 3, 2, 3, 3, 1, 5,
 2, 5, 3, 2, 4, 3, 5, 6, 6, 5]

And sorted:

[1, 1, 1, 1, 1, 2, 2, 2, 2, 2,
 2, 2, 2, 3, 3, 3, 3, 3, 3, 3,
 3, 3, 3, 3, 4, 4, 4, 4, 4, 4,
 5, 5, 5, 5, 5, 5, 5, 5, 5, 6,
 6, 6, 6, 6, 6, 6, 6, 6, 6, 6]

At this level, Javascript's RNG2 should be roughly uniform in distribution, and with true randomness we should not expect uniform results at such small quantities. This distribution certainly seems random and within parameters for uniform distribution, so we've simplified the concept of a d6 into a minimal and sufficient algorithm.

dn

Not all modelling is about functionality. Much of data modelling is about just that: data!

A model like a d6 is fundamentally fairly useless. Indeed the idea of a d6 is just a very tight constraint on a very useful concept - randomness. It serves little purpose to model a d6 specifically, because the number of uses for a d6 is, in the grand scheme of things, small.

In the real world, we use models in computing for two basic purposes: retrieval and prediction. The first one is used to store representations of things that exist, such as people or products. Those are data models. We store these data models to let people log into a system, or to display a list of the products to customers. The second is used to try to work out what would happen in certain situations, based on the understanding that we have about the system in the first place - such as weather. These are functional models, of which the d6 above is one example.

In both situations the model is useless without the things being modelled having data. Properties of the objects store information about the objects and supply parameters to the algorithms we've devised.

We have hit upon the idea of parameterising algorithms. As noted, the d6 algorithm is somewhat useless because all it does is model a d6, which is of limited utility.

We can increase the utility by modelling the algorithm of any die. This is the second thing to be aware of when learning to abstract away the fundamentals from the real-world example. Earlier, we learned that we can turn a gazillion atoms' worth of die into a few electrons' worth of RNG by simply taking a number between 1 and 6 - this is the fundamental behaviour of a d6.

Now, we can look at other real-world dice and see how their behaviour relates to the d6:

  • A d4 picks a number between 1 and 4
  • A d6 picks a number between 1 and 6
  • A d12 picks a number between 1 and 12
  • A d20 picks a number between 1 and 20
  • A d100 picks a number between 1 and 100

It doesn't take a complex neural network to see the pattern here. A dn picks a random number between 1 and n.

If we wanted to model a d4 we could amend our d6 model:

Math.ceil(Math.random() * 4)

And we're done. Well done! You've invented job security. Now we've got two models for two different scenarios, and we know how to repeat the process for any die we like.

You should at least by now have the feeling I'm leading you to a point; and if you haven't guessed it yet I'll make the point.

We haven't modelled the pattern.

You can model dice until you're blue in the face but a good model captures the fundamental principles. The d6 model captured the fundamental principles of a d6, but we want a model that captures the fundamental principles of all dice. We need to model the abstract; the pattern that we spotted when we listed our dice.

Abstraction

"Abstract" is another one of those words that no one understands until they're faced with it, and then it confuses them until they understand it, and then they realise why it's been used all along. Most people know abstract as a form of art, and therefore associate it with meaningless shapes and random colours or something.

The abstract of something is those features about the thing that remain behind when you take the actual thing away. The abstracts are those conceptual things that mean you can describe it without actually having one; but which, if you had never seen one, would mean you may recreate a different thing.

This is what we did with the d6. We took the abstract concept of a d6, which is to randomly generate a number between 1 and 6, and then we recreated it in an algorithm that looks nothing like a die. It's a string of characters on a screen, now. It doesn't even roll. Or bounce.

Abstracting across many things is an art form in itself. For a start, the things have to be related, or else there's no real abstraction to make. Secondly, the degree to which things are actually related to one another can vary wildly, so knowing what level of abstraction to make is also a challenge. Thirdly, abstractions themselves may be similar; in which case you can start relating things that look the same in the abstract but are entirely unrelated in real life.

Now that I've thoroughly lost you, let me bring you back to earth. When we laid out all the dice we know and examined how they work we saw a pattern, which is that a die with n sides is an RNG between 1 and n. A pattern is something we can model; we model it with parameterisation.

Parameterisation is when you take a series of concrete examples and you remove one of the things from it and replace it with a variable; in this case, we replaced all the numbers with n3. The multiple types of die have been reduced to a single type, whose number of faces is now variable.

The number of faces the die has is now a property of the die. We have a model with data!

How do we represent it? Well in Javascript terms, parameters are given to functions, and objects have properties. We can divide the model into the two parts, functionality and data, by using a function to represent rolling a die and an object to represent an actual die.

function rollDie(die) {
    return Math.ceil(Math.random() * die.sides);
}

var d6 = { sides: 6 };
var d12 = { sides: 12 };

Here we have one function that will roll a die and return the result. Then we have two dice, each of which is a simple object with the property sides. Inside the rollDie function we use the sides property of something called die, which we can see is mentioned in the parentheses in the function definition. This together means that whatever is given to rollDie is assumed to be a model of a die, and to have a property sides that represents the number of sides it has.

rollDie(d6);
rollDie(d12);

If we provide a die model as a parameter to the rolling function, the rolling function can inspect the property of the model, extract the data, and use the data in the original algorithm. The algorithm has not, fundamentally, changed. It is simply the case that now it is parameterised; which is to say that instead of duplicating the function for every possible invocation, we can create data models that represent the thing we are dealing with, and provide the data to the function. We have abstracted the pattern (1dn returns a number between 1 and n) by making the variable, n, well—variable!

Verbs and nouns

The world is made of verbs and nouns. Systems verb nouns. People roll dice. People buy products. Computers authenticate passwords. Ecommerce systems suggest related products. Search engines search documents. URLs refer to resources.

Our data models therefore comprise verbs and nouns. Our d6 model was a verb4, but the noun was hard-coded. Hard-coding is the failure to parameterise. Instead of accepting a parameter, the noun - d6 - was assumed by the verb, because the verb was the whole of "roll a d6".

Our later model had a verb, rollDie, which could roll any noun that looked like a die. It had two dice, d6 and d12, which represented 6- and 12-sided dice, respectively. But the rollDie verb did not rely on those dice. The verb was abstracted from the nouns because with the new verb, anyone can create a die of any size and roll it:

var d27 = { sides: 27 };
rollDie(d27);

... so long as they have access to the verb part - the functionality - of our model.

By parameterisation we can turn a verb into a verb and a noun - "roll a d6" turns into "roll" and "a d6". By doing the opposite, we can turn a separate verb and noun into a single verb. Good modelling comes from learning when it is right to include the noun in the verb, and when the noun is a parameter. In some cases, the noun is fetched from somewhere else - a different verb (to fetch) and a different part of the model, with its own nouns.

In the real world, computer modelling is much more involved than this. Data are often linked to other data, such that if one changes another must reflect it. A shopping basket, for example: if you add an item to the basket, the total must increase. If you change the quantity of an item, the subtotal for that item must increase, and so must the basket total.

In that example, we already introduced nouns and verbs that we can model. Basket; item; total; subtotal; quantity. Some of these are things, and some of them are properties. Some are both! Items are real things, but the list of items is a property of the basket. The total is a property of the basket, and the subtotal is a property of the item when in context of a basket and having a quantity!

Sometimes we replace nouns with verbs: instead of storing the total, we may choose to calculate the total on demand based on the items.

Sometimes we replace verbs with nouns: when you roll a die, its value remains the same until you roll it again, but you should be able to ask it what value it shows. Our model could not do this. Alas! Our simple and sufficient model is no longer sufficient.

Sometimes we separate a verb into a verb and a noun: we turn rolling a d6 into rolling, and create a d6 to roll. This allows us to either roll a different die, or do something different to the die.

Sometimes we combine a verb and noun into a single verb: when we get the total of a basket, we don't separate it into "get" and "total"; if you change the noun here, the verb makes no sense!

Even a simple example like a die can escalate, and it is easy to get overwhelmed by the interactions—imagine the complexity of a "simple but sufficient" model of an entire shop!—but ultimately we are modelling nouns and verbs; all we have to do is parameterise correctly and find the correct abstractions.

Modelling systems

Hopefully you will have, by means of a concrete example and a lot of nebulous ideas, some concept of what it is to model things in computer systems. Ultimately, you will need some way of defining functions - a programming language - and some way of storing data - maybe a database.

Modelling a system therefore involves a good eye for what is a verb and what is a noun. That is to say, if you want to "roll a d6", does this suffice as a verb? Or is "d6" a noun? What if you want to "calculate the total"?

There is no cheat sheet here. Experience is your best recourse. But perhaps we can jot down some things to consider when modelling a system.

  • How big is the system? The d6 system was small, but the shop system was large. Can it be smaller systems?
  • How big are the nouns? A d6 has 6 faces, but the number 6 is enough to model that. Meanwhile, a basket has many items, but more information is needed; items are separate things, but faces are not.
  • Can you de-noun your verb? Does the verb make sense on other things? Does it actually? You can roll anything with sides; but can you get something other than a total from a basket? Can you get a total from something other than a basket?
  • Can you combine a verb and noun? Have you gone too far parameterising? If your shop has only one basket, the basket is not a parameter: the verbs can assume it.
  • Can your verb fetch a parameter, instead of accepting or assuming it? When you roll a die, perhaps you can establish elsewhere which die you are rolling. Perhaps the items on a basket know they are items; and there is only one basket, so you can get the items when you need them.

That's all for now on models. In future posts we will take a look at how data get around inside these systems, how we store them, and the transient nature of data while the system is actually running.

1 var a = [], i = 0; for (i = 0; i < 50; i++) { a.push(Math.ceil(Math.random() * 6)); } a;

2 Random number generator

3 Replacing all the ds with m may be a tempting thing to do here, but we shouldn't. That's because d has been constant across all of our examples; it simply serves to refer to the thing we are modelling in the first place. n is the new variable, because the thing it has replaced varies. d, being constant, is the thing our model is taking away entirely! It serves no purpose to know that we are rolling dice, any more; the d is therefore simply our reminder about what we are aiming for.

4 Commonly one would not copy-paste an algorithm into a console and run it. Instead, the algorithm would be packaged in a function and the user would be told to run the function. We did this later, when we parameterised, but to simplify and save on explanations, we avoided using a function in the first examples.