Friday, August 24, 2012

Javascript: parseInt() quirky or misused?

Making sure your variables have the right type before using the + operator is important in Javascript. If you meant to add them numerically and they are strings, you'll end up with a concatenated string. Few things are more frustrating than having c = a + b in your code and spending an hour debugging, only to find that a and b were concatenated as strings instead of summed. In a loosely typed language, this is one place where a separate, explicit operator for string concatenation is needed. Using + for both numeric addition and string concatenation causes nothing but headaches. It's one feature that I like about PHP: . is the string concatenation operator and + is for summing numbers. Types are converted as needed to enable either operation.
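To make the pitfall concrete, here is a minimal sketch. The variable names are made up for illustration, and Number() is just one of several ways to convert:

```javascript
// Hypothetical values, e.g. read from form fields, so they arrive as strings.
var a = '1';
var b = '2';

var concatenated = a + b;           // '12' (string concatenation)
var summed = Number(a) + Number(b); // 3 (numeric addition)

console.log(typeof concatenated, concatenated);
console.log(typeof summed, summed);
```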

However, in Javascript, we must ensure a variable is the correct type before trying to sum two values. A common approach is to use the parseInt() function to ensure you're working with numbers instead of strings. However, it seems that many (and at one point myself included) are unaware that the function takes a second argument, the radix. Omit it and you may get unexpected results in some situations:



// This is OK
parseInt('1');
parseInt('8');
parseInt('9');
parseInt('10');
parseInt('11');

// These are fine except '08' and '09' in engines that treat a leading zero as octal
parseInt('01');
parseInt('08'); // This will be zero
parseInt('09'); // This will be zero
parseInt('10');
parseInt('11');

// This is OK
parseInt('01', 10);
parseInt('08', 10);
parseInt('09', 10);
parseInt('10', 10);
parseInt('11', 10);

// These are NaN
parseInt('ABC');
parseInt('ABC1');

// This is 1
parseInt('1ABC');



You can tinker with this code on JSFiddle if you'd like to see it in action.

So why the odd behavior for '08' and '09'? The description of parseInt() on w3schools spells out what the function does when the second parameter is omitted:
If the radix parameter is omitted, JavaScript assumes the following:

  • If the string begins with "0x", the radix is 16 (hexadecimal)

  • If the string begins with "0", the radix is 8 (octal). This feature is deprecated

  • If the string begins with any other value, the radix is 10 (decimal)



So everything is fine until your strings have leading zeros. With a leading zero the string is read as octal, and since 8 and 9 are not valid octal digits, parsing stops right after the 0 and the result is zero.
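The rules above are easiest to see side by side. This is a small sketch with illustrative variable names; it only uses cases whose results do not depend on the legacy octal behavior:

```javascript
// Prefix detection only happens when no radix is passed.
var hexAuto = parseInt('0x1F');           // 31: the "0x" prefix selects radix 16
var hexExplicit = parseInt('1F', 16);     // 31: same value, radix stated outright
var decimalForced = parseInt('0x1F', 10); // 0: in base 10, parsing stops at the 'x'
var leadingZero = parseInt('08', 10);     // 8: explicit radix, no octal guessing

console.log(hexAuto, hexExplicit, decimalForced, leadingZero);
```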

Additionally, if the string cannot be parsed to a number (no leading digits in the string), NaN is returned, so you can't blindly sum the result without checking for that case.
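One way to guard against that, sketched here with a hypothetical helper named sumOrNull (returning null is just one possible convention for signalling bad input):

```javascript
// Hypothetical helper: sums two string inputs, or returns null
// if either one cannot be parsed as a base-10 integer.
function sumOrNull(x, y) {
    var a = parseInt(x, 10);
    var b = parseInt(y, 10);
    if (isNaN(a) || isNaN(b)) {
        return null;
    }
    return a + b;
}

// Without the check, NaN silently poisons the whole sum.
var unchecked = parseInt('2', 10) + parseInt('ABC', 10); // NaN

console.log(sumOrNull('2', '3'));   // 5
console.log(sumOrNull('2', 'ABC')); // null
```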

As an alternative to parseInt(), type coercion can be used to force a variable to a numerically typed value:


(+ ''); // equals 0
(+ '1'); // equals 1
(+ '8'); // equals 8
(+ '08'); // equals 8
(+ 'ABC'); // equals NaN
(+ 'ABC1'); // equals NaN
(+ '1ABC'); // equals NaN


Here, everything works as long as the string is a complete number in string form. If any part of it is non-numeric, the result is NaN. The one exception is the empty string, which evaluates to zero, whereas parseInt() returns NaN for it.
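The two approaches also differ on input that is numeric but not an integer. The comparison below is a small sketch with illustrative variable names:

```javascript
// Empty string: coercion gives 0, parseInt() gives NaN.
var emptyCoerced = +'';
var emptyParsed = parseInt('', 10);

// Decimal string: coercion keeps the fraction, parseInt() stops at the '.'.
var decimalCoerced = +'1.5';
var decimalParsed = parseInt('1.5', 10);

console.log(emptyCoerced, emptyParsed);     // 0 NaN
console.log(decimalCoerced, decimalParsed); // 1.5 1
```

So coercion is not a drop-in replacement if the rest of the code expects an integer.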

So how do you ensure you always get a number no matter what value is in the variable? parseInt() is easy to fix by adding 10 as the second parameter. However, I'm lazy and tend to forget it, and it doesn't solve the problem of getting NaN in certain situations. Coercion avoids the function call entirely, but returns NaN for strings with stray non-numeric characters such as '1ABC'.

One solution is just to create a wrapper function:



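function myParseInt(s)
{
    // Always parse as base 10 so leading zeros are harmless
    var t = parseInt(s, 10);
    // Map unparseable input to zero instead of NaN
    return isNaN(t) ? 0 : t;
}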


This solution assumes you always want to work with base 10 values and the cases that result in NaN will be mapped to zero. If you want to parse hexadecimal strings, you can still use the built-in parseInt() as-is.
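A few calls show the effect. The wrapper is restated here only so the snippet runs on its own, and the inputs are arbitrary examples:

```javascript
function myParseInt(s) {
    var t = parseInt(s, 10);
    return isNaN(t) ? 0 : t;
}

// '08' parses as 8, 'ABC' falls back to 0, '1ABC' keeps its leading 1.
var total = myParseInt('08') + myParseInt('ABC') + myParseInt('1ABC');

console.log(total); // 9
```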

A slightly more aggressive approach is to proxy parseInt() and modify its behavior. Here is a blended approach that uses parseInt() if a radix is provided and otherwise falls back to type coercion. A final check ensures a numeric value is always returned (paste this into the JSFiddle provided earlier to see the change):



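(function ()
{
    // Keep a reference to the built-in before replacing it
    var _parseInt = parseInt;

    parseInt = function(s, r)
    {
        // With a radix, defer to the built-in; without one, coerce
        var t = (r ? _parseInt(s, r) : (+ s));
        // Never let NaN escape
        return (isNaN(t) ? 0 : t);
    };

})();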



This should only be used when you know it won't cause issues with other code that might be checking for NaN and branching differently than on a zero value.
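To illustrate that risk without touching the global, here is a sketch using a hypothetical standalone function toNumber with the same blended logic, plus a hypothetical caller that branches on NaN:

```javascript
// Hypothetical non-global variant of the blended approach.
function toNumber(s, r) {
    var t = r ? parseInt(s, r) : +s;
    return isNaN(t) ? 0 : t;
}

// Code like this relies on NaN to detect bad input.
function describe(input, parse) {
    var n = parse(input, 10);
    return isNaN(n) ? 'invalid' : 'valid';
}

var withBuiltin = describe('ABC', parseInt); // 'invalid'
var withWrapper = describe('ABC', toNumber); // 'valid', because NaN became 0

console.log(withBuiltin, withWrapper);
```

Keeping the wrapper under its own name leaves callers like describe() free to choose which behavior they want.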

Regardless of how you choose to handle the process of converting strings to numbers, knowing how each approach behaves will enable you to correctly handle each case and apply the correct logic to obtain the desired result.