Javascript Madness: Treacherous Type Conversions

Jan Wolter
Feb 12, 2009

One of the most frequent sources of Javascript bugs for me is the way type conversions between strings and numbers work. This page exists to allow me to vent my annoyance on the subject, and perhaps to help put other programmers on guard.

In Javascript, as in most any other programming language, 1776 and '1776' are two very different creatures. The first is a number, the second is a character string.

In some languages you can only convert from one to the other by an explicit conversion. For example, in C, you convert a string to a number with atoi("1776") or a number to a string with sprintf(buf,"%d",1776). Furthermore, all variables are typed. You declare whether they will hold strings or numbers at the time they are created.

Other languages, like Perl, are much more liberal. The same variable can hold either a string or a number, and it hardly ever matters which one it does contain. If an operation calls for a number and one of its arguments happens to be the string '1776', then it quietly converts it to the number 1776. Thus you hardly ever have to worry about which of the two representations of a number is actually stored in a variable. (Though you do have to deal with the fact that you have two different comparison operators, == and eq, one that converts its arguments to numbers before comparing, and one that converts them to strings before comparing.)

Javascript, like Perl, allows variables to store any type of data, and does a lot of automatic type conversions between strings and numbers, but the automatic type conversions don't give you the total coverage that they do in Perl. If your variable contains '1776' instead of 1776, most operations will treat it just like a number, but a few won't. Thus, instead of causing immediate problems as it would in C, or running along just fine, as it would in Perl, it goes along for a while and then starts doing weird and unexpected things.

These kinds of bugs are hard to find, and they happen a lot, because pretty much any way you read a number into a program gives you the string representation of the number. The best practice is to always immediately convert all numeric data into actual numeric variables as soon as you read them in, but it is very easy to miss this and have string representations of numbers slip into your program, where they may or may not eventually wreak havoc.

Additive Overload

Most arithmetic operators in Javascript convert their arguments to numbers. Multiplication, subtraction, and division all do this. The exception is the plus operator. Instead of converting its arguments, its arguments convert it. If either of the arguments is a string, it becomes a concatenation operator instead of an addition operator.
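To make the difference concrete, here is a minimal sketch. The variable names are hypothetical; year stands in for any value that arrived as a string, say from a form field.

```javascript
// Hypothetical value, as it might arrive from a form field or query string.
var year = '1776';

var product    = year * 2;   // 3552: the string is converted to a number
var difference = year - 1;   // 1775: the string is converted to a number
var quotient   = year / 2;   // 888: the string is converted to a number
var sum        = year + 1;   // '17761': the number is converted to a string and joined
```

Three of the four operators quietly do what you meant. Only the plus sign changes its meaning.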

This business of having the same operator do different things depending on the types of the operands is called "overloading". It was a moderately good idea in languages like C++ with strongly typed variables. When you typed a '+' sign in C++ you pretty much always knew what it was going to be doing, because you knew the types of the variables that were being added.

But in Javascript, the variable types are unknown until run time. So, though you may know what you want your plus operators to be doing when you type them, what they actually do is up in the air until run time. Pass it the wrong type, and you are in for a surprise.

Here we have a little function that adds up n elements of the a array, starting with the element first:

    function sumSome(a, first, n)
    {
        var sum = 0;
        for (var i = first; i < first+n; i++)
            sum += a[i];
        return sum;
    }
If you call sumSome(a,2,3) this returns a[2]+a[3]+a[4]. But if you read the value n in from some place, and forgot to convert it from a string to a number, so that you ended up calling sumSome(a,2,'3'), then instead you get the sum of a[2] through a[22], which comes as a bit of a surprise.

The problem is that the first+n addition in the loop condition becomes a concatenation instead of an addition if either first or n is a string, so the upper bound of the loop becomes '23' instead of 5. But the < operator doesn't care. It happily converts the string '23' to a number 23 and does its comparison.

Note that here the < operator, which helpfully does type conversions, actually compounds the problem. The plus operator gave us a string instead of a number, but the < operator, instead of choking on it, merrily turns it back into a number and runs with it, so instead of a nice error message, we get wacky behavior which is much harder to debug.
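The runaway loop is easy to reproduce. The sketch below uses a hypothetical array of thirty ones, so that each sum equals the number of elements actually added. The sumSomeSafe variant is my own assumption about one way to guard the function, not the only way: it subtracts zero from both numeric arguments on entry.

```javascript
function sumSome(a, first, n) {
    var sum = 0;
    for (var i = first; i < first + n; i++)
        sum += a[i];
    return sum;
}

// Hypothetical test data: 30 ones, so the sum counts the elements added.
var ones = [];
for (var k = 0; k < 30; k++)
    ones.push(1);

var expected = sumSome(ones, 2, 3);    // 3: adds a[2], a[3], a[4]
var surprise = sumSome(ones, 2, '3');  // 21: the bound is '23', so adds a[2] through a[22]

// Defensive variant (an assumption, not part of the original function):
// convert both numeric arguments first so that + really is addition.
function sumSomeSafe(a, first, n) {
    first = first - 0;
    n = n - 0;
    var sum = 0;
    for (var i = first; i < first + n; i++)
        sum += a[i];
    return sum;
}

var fixed = sumSomeSafe(ones, 2, '3'); // 3
```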

This overloading of the plus operator is not entirely without theoretical advantages. For example, I could do this:

    string= sumSome(['a', 'b', 'c', 'd', 'e'], 2, 3);
This would get me the concatenation of the three strings in the array starting at index two, that is 'cde'. Nifty, one function can sum or join arrays. Nifty, but pretty danged useless. I've never, ever had a need for this in any program. When I type the plus sign, I ALWAYS know what I want it to do. In most languages I could be confident that that was what was actually going to happen, either because I was typing different operators for addition and concatenation (as in Perl), or because I knew the types of the arguments (as in C++). Not in Javascript.

Here's another oddity. Suppose you have this line of code:

   alert( nTopShelf + nBottomShelf + ' bottles of beer on the wall' );
but your boss wants more formal language, so you change it to:
   alert('Bottles of Beer on Wall: ' + nTopShelf + nBottomShelf );
Oops. In the first case nTopShelf + nBottomShelf was an addition (assuming both of those are numbers), but in the second case it is a concatenation. So addition is no longer associative or commutative. All those precious math facts that your math teachers beat into your head, blown to smithereens!
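The reason is that a chain of plus signs is evaluated left to right, so everything after the first string operand gets concatenated. Here is a small sketch with hypothetical shelf counts, building the message strings without calling alert so the results can be inspected. Parentheses force the addition to happen first.

```javascript
var nTopShelf = 3, nBottomShelf = 4;   // hypothetical counts

// (3 + 4) happens first, then the string is appended.
var casual = nTopShelf + nBottomShelf + ' bottles of beer on the wall';
// '7 bottles of beer on the wall'

// ('...: ' + 3) happens first, giving a string, then 4 is appended to it.
var formal = 'Bottles of Beer on Wall: ' + nTopShelf + nBottomShelf;
// 'Bottles of Beer on Wall: 34'

// Parentheses restore the numeric addition.
var formalFixed = 'Bottles of Beer on Wall: ' + (nTopShelf + nBottomShelf);
// 'Bottles of Beer on Wall: 7'
```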

Subtraction doesn't suffer from this problem. In fact, subtraction is the standard way of converting strings to numbers. So the following isn't the no-op it appears to be:

    size = size - 0;
This converts size to a number if it wasn't already a number. It's an explicit type conversion masquerading inexplicitly as arithmetic. But the following statement, though it looks similar, is very different:
    size = size + 0;
If size is a string this doesn't convert it to a number. Instead it appends a '0' to the string, which for an integer multiplies it by ten ('3.14' would keep the same value, and '6.022e23' would have its exponent multiplied by ten). Yuck, yuck, yuckity, yuck.
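A quick sketch of both statements side by side, using a hypothetical string input:

```javascript
var size = '12';                    // hypothetical string input

var minusZero = size - 0;           // 12, and typeof minusZero is 'number'
var plusZero  = size + 0;           // '120', and typeof plusZero is 'string'

// The non-integer cases: a '0' is simply appended to the text.
var piPlusZero  = '3.14' + 0;       // '3.140'
var avogPlusZero = '6.022e23' + 0;  // '6.022e230'
```

Note that the result of size + 0 is still a string, so the problem just gets passed along to whatever code touches the value next.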

The switch Switcheroo

Suppose you have this code:
    if (hokeypokey == 0)
        putYourLeftFootIn();
    else if (hokeypokey == 1)
        putYourLeftFootOut();
    else
        shakeItAllAbout();
And you decide to change it to this:
    switch (hokeypokey)
    {
    case 0:
        putYourLeftFootIn();
        break;
    case 1:
        putYourLeftFootOut();
        break;
    default:
        shakeItAllAbout();
        break;
    }
Does the same thing, right? Wrong. If you start with hokeypokey set to the string '0' instead of the number 0, then the if-statement calls putYourLeftFootIn(), while the switch calls shakeItAllAbout(). When I first encountered this, my first thought was that it was a bug in the Javascript interpreter, but it's in the standard. When a switch statement compares a value to a case, it does an === identity comparison, not an == equality comparison, so it does no type conversion. So since string zero doesn't equal number zero, we fall into the default case.
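Here is a sketch that makes the two versions comparable. The function names are hypothetical, and each branch returns a label instead of calling the foot functions. The third version is one possible workaround, under the assumption that the value is meant to be numeric: subtract zero in the switch expression so the === comparison sees a number.

```javascript
function viaIf(hokeypokey) {
    if (hokeypokey == 0)
        return 'in';
    else if (hokeypokey == 1)
        return 'out';
    else
        return 'shake';
}

function viaSwitch(hokeypokey) {
    switch (hokeypokey) {           // compares with ===, no type conversion
    case 0:  return 'in';
    case 1:  return 'out';
    default: return 'shake';
    }
}

function viaSwitchConverted(hokeypokey) {
    switch (hokeypokey - 0) {       // convert to a number before comparing
    case 0:  return 'in';
    case 1:  return 'out';
    default: return 'shake';
    }
}

var a = viaIf('0');                 // 'in'
var b = viaSwitch('0');             // 'shake'
var c = viaSwitchConverted('0');    // 'in'
```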

Here again, the type of a variable, which often doesn't matter because of the automatic type conversions, suddenly does matter. And once again, it's really questionable whether this behavior is ever really useful. Can you imagine ever writing a switch statement where you had different cases that did different things for the string '0' and the number 0? If you ran across a program that did that, wouldn't you want to strangle the author?

Conclusions

These issues go deep into the underlying design philosophy of Javascript. I doubt if they are going to change any time soon, though it's not impossible. Back in 1999, Waldemar Horwat proposed adding optional type declarations to Javascript. Those changes would help with many (but not all) of the problems described here. These days the same fellow is on the Javascript standards committee, but I've seen no signs of any similar plan being revived. Too bad.

So if you are programming in Javascript you are just stuck with these issues. The only thing you can do is program defensively. Try to remember to subtract zero from any numbers you read in. Probably library functions intended for broad use should subtract zero from any arguments that are expected to be numbers (and maybe throw an exception if the result is NaN).
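A minimal sketch of what such a defensive conversion might look like. The helper name toNumber and the error message are my own inventions for illustration, not part of any library.

```javascript
// Hypothetical helper: subtract zero to force a numeric conversion,
// and throw if the result is NaN.
function toNumber(value) {
    var n = value - 0;
    if (isNaN(n))
        throw new Error('Expected a number, got: ' + value);
    return n;
}

var year = toNumber('1776');        // 1776, a real number
var same = toNumber(1776);          // 1776, numbers pass through unchanged

var caught = false;
try {
    toNumber('seventeen');          // 'seventeen' - 0 is NaN, so this throws
} catch (e) {
    caught = true;
}
```

One caveat: the empty string converts to 0 rather than NaN, so a check like this will not catch a blank input field. If that matters, it needs a separate test.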