Thursday, 17 April 2008

By reference vs. By value in php

Variables and references

In a recent post to the PHP Object Generator mailing list, John asked a question about the significance of the & in front of the $variable name when passing objects into routines in POG.

Since this has come up in a few fora that I keep tabs on, as well as in my own coding, I thought that I'd have a go at explaining it.

As with all my thoughts and replies, I reserve the right to be utterly wrong, so caveat lector.

When assigning variables from one place to another, there are two possible meanings behind the assignment. The first, which seems to be the obvious interpretation to the majority of people, is that assigning variable A to variable B implicitly means 'take the value stored in variable A and put it into variable B'.

Thus, in php, if we define $a as containing 'abc', then assign $a to $b, we expect that both $a and $b will contain 'abc'. Indeed, with the following php script, that is what we will get.
<?php
$a = "abc";
$b = $a;
printf("a is '%s', b is '%s'\n", $a, $b);
?>
produces:
a is 'abc', b is 'abc'

An alternative interpretation of the assignment statement, which to many people appears bizarre but which is useful in certain circumstances, is that by assigning $a to $b we might actually be saying 'make variable B refer to the same thing that variable A refers to'. Under this assumption, if we change the value in variable A (or B) after the assignment $b = $a, then the value that the other variable holds will also change (since they refer to the same thing). Let's try that and see which of our interpretations is correct.
<?php
$a = "abc";
$b = $a;
printf("a is '%s', b is '%s'\n", $a, $b);

$a = "xyz";
printf("a is now '%s', b is now '%s'\n", $a, $b);
?>
produces:
a is 'abc', b is 'abc'
a is now 'xyz', b is now 'abc'

So, we can see from this that the first interpretation is correct. Variables $a and $b are separate things - once the value has been assigned from one to the other, they exist as independent entities. Now let's look at the & operator in this context by using it to assign to a third variable, $c.

Taking our second script and extending it a bit, we now have the following:
<?php
$a = "abc";
$b = $a;
$c = &$a; // assign a reference to $a into $c
printf("a is '%s', b is '%s', c is '%s'\n", $a, $b, $c);

$a = "xyz";
printf("a is now '%s', b is now '%s', c is now '%s'\n", $a, $b, $c);
?>
When we run this, it produces:
a is 'abc', b is 'abc', c is 'abc'
a is now 'xyz', b is now 'abc', c is now 'xyz'


Hmmm. We assigned $a to $c before we changed the value of $a. Why does $c apparently contain the value that we assigned to $a after the first print statement? Well, the answer is 'Because $c is a reference to $a'. It points towards $a and says 'Yeah, what he said!'. Effectively, $a and $c are two names for one and the same thing.
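
The same distinction applies when a variable is handed to a function, which is where the original mailing-list question came from. Here is a minimal sketch using two hypothetical functions (appendByValue and appendByReference are my own names for illustration, not part of POG):

```php
<?php
// Hypothetical helpers for illustration only; they are not part of POG.

// Receives a copy of the caller's value, so the caller never sees the change.
function appendByValue($text) {
    $text .= " (changed)";
}

// The & makes $text another name for the caller's variable.
function appendByReference(&$text) {
    $text .= " (changed)";
}

$a = "abc";

appendByValue($a);
printf("after appendByValue: '%s'\n", $a);

appendByReference($a);
printf("after appendByReference: '%s'\n", $a);
```

Run as written, this should produce:
after appendByValue: 'abc'
after appendByReference: 'abc (changed)'
Only the function declared with & in front of its parameter is able to change the caller's variable.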

So, that's the score with 'ordinary' variables and it holds true for both php4 and php5. Let's see now what happens with objects.

Objects and references

In this next section, I'm going to be assuming the existence of the following php class. It has a public name attribute and - for convenience - it 'knows' how to convert itself into a string.
<?php

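class testClass {
    var $name; // 'var' declares a public attribute

    // php4-style constructor: a method with the same name as the class
    function testClass($name) {
        $this->name = $name;
    }

    // returns the name when the object is converted to a string
    function __toString() {
        return $this->name;
    }
}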

?>

Let's try our last script but using this object instead of simple strings.

<?php
$a = new testClass("My object name");
$b = $a;
$c = &$a; // assign a reference to $a into $c
printf("a is '%s', b is '%s', c is '%s'\n", $a, $b, $c);

$a->name = "New Name";
printf("a is now '%s', b is now '%s', c is now '%s'\n", $a, $b, $c);
?>

Now if we run this, we get one of two possible results. Exactly which one you get depends on the version of php that you run the code through.

PHP 4

Running the script under php4, we get the following:
a is 'My object name', b is 'My object name', c is 'My object name'
a is now 'New Name', b is now 'My object name', c is now 'New Name'

PHP 5

However, under php5 the result is:
a is 'My object name', b is 'My object name', c is 'My object name'
a is now 'New Name', b is now 'New Name', c is now 'New Name'

So, what's the reason for the difference? What's going on?


Comparing the two versions, we see that under php4 assigning an object variable to a different variable results in two separate objects. Each can be modified independently, which is why in the second print statement $a gives a different answer from $b. Assigning using an explicit reference ($c = &$a) makes $c a reference to $a, so $a and $c can be considered to be the same. Thus simple variables like strings and complex objects behave essentially the same as each other under php4.

Under php5, however, we see different behaviour. Here, even though we assign from $a to $b using a simple assignment statement ($b = $a), both variables end up leading to the same object. Strictly speaking, what php5 copies is a handle (an identifier) for the object rather than the object itself, so as far as changes to the object's attributes are concerned the effect is much the same as a reference assignment. This is (apparently) to bring php into line with languages like Java. Under php5, the & reference assignment continues to behave as in php4. Thus, a plain assignment no longer gives us php4-like copies as far as objects are concerned.
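
One detail that the output above does not show: under php5, a plain assignment of an object is still not quite the same thing as a & reference. As far as I understand it, $b = $a copies a handle to the object, whereas $c = &$a makes $c another name for the variable $a itself. Here is a minimal sketch of the difference, assuming PHP 5 or later and a hypothetical NamedThing class standing in for testClass:

```php
<?php
// Assumes PHP 5 or later. NamedThing is a hypothetical stand-in for the
// testClass used in this post, written with __construct.
class NamedThing {
    public $name;

    function __construct($name) {
        $this->name = $name;
    }
}

$a = new NamedThing("first");
$b = $a;   // copies the handle: $b leads to the same object as $a
$c = &$a;  // reference: $c is another name for the variable $a

// Point $a at a brand new object instead of modifying the existing one.
$a = new NamedThing("second");

printf("a is '%s', b is '%s', c is '%s'\n", $a->name, $b->name, $c->name);
```

This should produce:
a is 'second', b is 'first', c is 'second'
$c follows $a to the new object because the two are the same variable, while $b carries on holding the original object.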

In fact, if we want to take a copy of an object into a different variable such that each is a separate entity under php5, we have to use the clone keyword, i.e. $b = clone $a. (clone is a language construct rather than a function, although the form clone($a) is also accepted.)
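
To see cloning next to a plain assignment, here is a minimal sketch, assuming PHP 5 or later. NamedThing is a hypothetical stand-in for testClass, and $d is simply an extra variable that I have introduced to hold the copy:

```php
<?php
// Assumes PHP 5 or later. NamedThing is a hypothetical stand-in for the
// testClass used in this post, written with __construct.
class NamedThing {
    public $name;

    function __construct($name) {
        $this->name = $name;
    }
}

$a = new NamedThing("My object name");
$b = $a;        // plain assignment: $a and $b lead to the same object
$d = clone $a;  // clone: $d is a separate object with the same attribute values

$a->name = "New Name";
printf("a is now '%s', b is now '%s', d is now '%s'\n", $a->name, $b->name, $d->name);
```

This should produce:
a is now 'New Name', b is now 'New Name', d is now 'My object name'
Note that clone makes a shallow copy: any attribute that itself holds an object will still lead to the same object in both copies, unless the class defines a __clone() method to deal with it.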

There now, that hardly hurt at all, did it?

Tuesday, 1 April 2008

Getting SSHKeychain to work on Leopard (OS X 10.5)

For several years I've used the excellent utility SSHKeychain to manage my interactions with external servers. I spend a lot of time running multiple shells and moving files from one system to another, and the efficiency that a good key agent brings is something that I've come to rely on.

However, recently I switched to a new machine which came with 10.5 installed and I discovered that Apple have elected to build their own ssh-agent interface into the system.

But it's rubbish.

It doesn't remember the keys other than by adding them to the keychain (which I refuse to do) and it can't be configured to drop them on system events like the screen saver kicking in. And if I install SSHKeychain then the system just ignores it. Damn, that's annoying.

So I googled about a bit and found a couple of posts that talked about the issue, one of which said that SSHKeychain was dead and another that came up with a partial solution that involved extra keychains and an application hack to remove keys from the agent when certain things happen. But nowhere - not even on the SSHKeychain site - did I find a recipe to restore SSHKeychain functionality properly.

So, knowing a bit about UNIX, the way that Apple has implemented its on-demand services daemon (launchd) and the way that ssh-agents work in general, I've come up with a way to get back control over how my system operates.

It turns out to be remarkably simple.

SSH agents work by setting themselves up and listening on a local socket. They advertise this fact by setting an environment variable called SSH_AUTH_SOCK. Programs that need access to the agent look for this environment variable and, if it is set, connect to the socket that it names in order to get the necessary authentication.

SSHKeychain does just this, setting up a socket for each user of the machine in a location named after that user's numeric user id (which is unique on a given machine).

Unfortunately, Apple uses the same mechanism to intercept agent requests and overrides the SSHKeychain setting, pointing SSH_AUTH_SOCK at its own local socket. Requests to the Apple-supplied socket launch the OS built-in agent instead.

However, there is a way that we can get control of the SSH_AUTH_SOCK variable. We can put an entry in our .profile script (or the [t]csh equivalent .cshrc) to look for the SSHKeychain socket and point at it if it exists.

So, if you love SSHKeychain as much as I do then run, don't walk, to your favourite text editor and edit (or create) a file called .profile in your login directory to contain the following code:


if [ -e "/tmp/${UID}/SSHKeychain.socket" ]; then
    export SSH_AUTH_SOCK="/tmp/${UID}/SSHKeychain.socket"
fi


Now, whenever you fire up a terminal, if you have SSHKeychain running then it will be invoked to handle your key requests. The half-a**ed Apple attempt at a key agent GUI will only be used if you've forgotten to run SSHKeychain.

I hope that helps folks, and I look forward to any comments and/or improvements.