This is an introduction to JavaScript. There are plenty of these on the web, but most of them appear to be for non-programmers, teaching the building blocks of programming as JavaScript uses them. And that’s fine, because many people who have never programmed in their lives find themselves needing a tiny bit of script for their web page, and need to start somewhere.
This article is designed to give much more in-depth information about JavaScript. It assumes a complete absence of knowledge about it, but it also assumes that you are already a competent programmer, preferably in a “C-like” language, and a smattering of Python will help too.
I’m going to cover everything I can think of short of the object model.
First of all, a common misconception to clear up: JavaScript is nothing to do with Java whatsoever. Nothing. Nada. Zilch. JS wasn’t even originally called that (for that matter, nor was Java, but that’s another story). The Java- prefix is entirely there for marketing reasons that presumably made sense to some flake-nosed dot-com executive back in the 90s, squeezing their thighs together as they reviewed their stock options.
Warning: This article contains whining on the subject of JS’ “features”. If you’d discovered these the hard way, you’d want to vent too…
JS is a dynamic language, with C-ish syntax for its basic flow control structures. Some of the more advanced datatypes look very Pythonesque, and indeed, so is its behaviour in quite a few respects.
In fact, it’s a surprisingly powerful, expressive language, and could’ve been a very elegant bit of work.
However, it is not.
It is a dark festival of pain. Gotchas lurk in the darkness, biding their time. Brooding. Like bears: They Will Eat You.
JS is a thoroughly bastardised language, there’s no getting away from that, and that goes double for the browser APIs.
In general, JS seems to follow the Principle of Most Surprise.
JS can be run stand-alone (eg Rhino), or inside of special hosts (eg Mac OS X Widgets), but most commonly it is embedded into web pages, by doing this:
<script language='javascript'>
// source code goes here
</script>
Or:
<script language='javascript' src='/path/to/source.js'></script>
Within the scope of the <script> tags, normal HTML rules do not apply – < is <, not the beginning of a tag, and & is &, not the lead-in to a character entity. (That is all of the HTML required for this article.)
Most of the basic syntax is straight out of C. All of the following behave as in C/C++/etc:
comments (both /* C */ style and // C++ style)
if / else
switch / case / break / default
while / do / break / continue
for / break / continue
Functions look similar to C, but have some differences. We’ll come to those soon. For-loops have an alternate form we’ll deal with later. Classes are completely different from most other languages.
Semi-colons at the ends of statements: You can use them, but they’re generally optional. JS “auto-inserts” semi-colons, according to arcane rules in the ECMAScript (the “standardised” – hee hee – version of JS) specification. In practice this means that, as long as the intent is fairly unambiguous, you don’t need them.
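One well-known place where the auto-insertion rules bite is a return followed by a line break. The sketch below is illustrative only (both function names are made up for this example): the first function returns undefined, because a semi-colon is inserted straight after the return keyword.

```javascript
// Hypothetical names, for illustration only.
function brokenAnswer()
{
    return        // a semi-colon is auto-inserted here...
        42        // ...so this line is never reached
}

function workingAnswer()
{
    return 42     // value on the same line as return: no surprise
}

console.log(brokenAnswer())    // undefined
console.log(workingAnswer())   // 42
```

So "optional" is true most of the time, but keeping the returned expression on the same line as return is the safer habit.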
In JS, you never declare the type of a variable. Variables themselves, in fact, have no type – a variable is not a named memory location as in C, but a named reference to an object, as in Python. Just as in Python, the referred-to object has a type, which affects what you can do with or to it.
Unlike Python, however, you do need to declare that local variables exist, just not what type they are. This is not immediately apparent, because if you don’t declare a variable, JS will assume a declaration for it.
A global declaration.
Yes, JS took the opposite decision to Python; instead of assuming variables are local, and having to explicitly declare global variables, JS assumes all variables are global unless you explicitly declare them local.
This has some very surprising and annoying effects. In particular, if you forget to declare a variable, your code may well work perfectly, until it completely screws over some other piece of code using the global of the same name.
Or you write some recursive code, and die horribly in an unexpected infinite loop – which may well lock up, or even crash, your browser, since not all of them are smart enough to fail out of infinite recursion. Whee!
Yes, JS is designed to have something required 95% (but not all) of the time, that when you forget it, fails silently, and may crash or lock up the execution environment with no means to debug it other than trial and error! Thank you, JS language designers! Thank you, browser implementers!
Anyway, you declare variables with the var keyword, eg var msg = "hello" – if you don’t assign a value, variables default to undefined. I recommend always, always declaring your vars regardless of whether you need to or not. It’s just safer. If anyone knows of a “lint” for JS, that warns you about un-vard variables, let me know and I’ll add a link at this point…
Update: Martin Clausen points out JSLint. You need to pick your options carefully, but on the other hand, it has a great warning message on the front page…
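To make the danger concrete, here is a minimal sketch of two loops trampling each other through an undeclared counter. The function names are hypothetical, and the behaviour described assumes a plain, non-strict script of the kind this article is about; strict mode and ES modules (neither covered here) reject the undeclared assignment with a ReferenceError instead.

```javascript
// Hypothetical names. Assumes a plain, non-strict script.
function innerLoop()
{
    for (i = 0; i < 3; i++) { }      // no var: this is the GLOBAL i
}

function outerLoop()
{
    var calls = 0
    for (i = 0; i < 3; i++)          // no var: the same global i again
    {
        innerLoop()                  // leaves the shared i at 3
        calls++
    }
    return calls                     // 1 rather than 3 in a non-strict script
}

// The same pair with the counters declared.
function innerLoopFixed()
{
    var i
    for (i = 0; i < 3; i++) { }
}

function outerLoopFixed()
{
    var i, calls = 0
    for (i = 0; i < 3; i++)
    {
        innerLoopFixed()
        calls++
    }
    return calls                     // 3, as intended
}

console.log(outerLoopFixed())        // 3
```

Neither broken function looks wrong on its own, which is exactly why this one is so hard to spot.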
Functions are slightly different from C. There are two formats, named and anonymous. Named functions are similar to C but there are, of course, no types declared for the parameters or return value. Where you would put the return type in C, you place the function keyword:
function treeWalk(branch, visitor)
{
    visitor(branch)
    var i // not going to let you forget! you'll thank me later!
    for (i in branch.children)
    {
        treeWalk(branch.children[i], visitor)
    }
}
Here, visitor is a function being passed in; we’ll show how later. branch is some user-defined object we’re assuming has a children array.
Also, I’m sure you’ve noticed the unusual for construction. This is the “iterator” version:
for (<variable> in <variable of container type>) <statements>
This iterates over each item in the container type. However, in another classic show of JS irritance, it returns only the index of each member. If the container is an array of 10 items, it will (probably) return the indices 0 through 9, as strings rather than numbers (see the section on arrays below). If the container is a hash-map, it will return each key. This necessitates an extra, superfluous hash lookup as there seems to be no equivalent (that I’ve found so far) to Python’s for key, value in container.items(): statements loop.
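A small sketch of the iterator form over both container types. The variable names and data are made up for the example; the thing to notice is that the loop variable is a string key in both cases, and the value has to be looked up separately.

```javascript
var letters = ["a", "b", "c"]
var ages = {"fred": 42, "wilma": 39}
var lines = []
var k   // declared, as always

for (k in letters)
    lines.push(typeof k + " " + k + " -> " + letters[k])

for (k in ages)
    lines.push(typeof k + " " + k + " -> " + ages[k])

console.log(lines.join("\n"))
// string 0 -> a
// string 1 -> b
// string 2 -> c
// string fred -> 42
// string wilma -> 39
```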
Anonymous functions are pretty obvious, you just omit the name:
function(param1, param2 ...)
{
    statements
}
On its own, this is worthless; however, the above is syntactically an expression. You can place the entire definition pretty much anywhere, including variable assignments, so this next example is identical to defining a function named mul:
var mul = function(a,b) { return a*b }
A different example, using the treeWalk function from earlier:
treeWalk(rootNode, function(item) { logDebug(item) } )
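For something you can actually run, here is a self-contained sketch of the same call. treeWalk is repeated from above; the tree itself and its name field are invented for the example, and the visitor collects names into an array rather than calling logDebug, which this article never defines.

```javascript
// treeWalk as defined earlier in the article.
function treeWalk(branch, visitor)
{
    visitor(branch)
    var i
    for (i in branch.children)
    {
        treeWalk(branch.children[i], visitor)
    }
}

// Hypothetical tree: the 'name' field is made up for this example.
var rootNode = {
    name: "root",
    children: [
        { name: "left",  children: [ { name: "leaf", children: [] } ] },
        { name: "right", children: [] }
    ]
}

var visited = []
treeWalk(rootNode, function(item) { visited.push(item.name) })
console.log(visited.join(","))   // root,left,leaf,right
```

Note that the anonymous function can see and modify visited, a variable from the scope it was written in.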
Obviously, you have numbers (double-precision floats – JS doesn’t actually do integers) and strings, and they work pretty much as you’d expect. However, JS has a tendency to occasionally (and surprisingly) coerce types amongst each other in various ways. For instance (using the Rhino command-line JS interpreter for demonstration purposes):
js> 2 + 2
4
js> "2" + 2
22
js> "2" * 2
4
“+” is used for both mathematical addition and string concatenation. When used with at least one string parameter, it coerces other types to strings as well, and concatenates; but all other math operations coerce strings into numbers instead. Whee! Combine with operator precedence and left-to-right evaluation for bonus fun and confusion:
js> "2" + 2 + 2
222
js> 2 + "2" + 2
222
js> 2 + 2 + "2"
42
js> "2" + 2 * 2
24
js> "2" * 2 + 2
6
js> ("2" + 2) * 2
44
Deep, slow breaths… it’ll pass…
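One way to sidestep the guessing is to convert explicitly before doing arithmetic. This is only a sketch of the idea, using the built-in Number, String and parseInt functions; the variable names and the "2px" input are made up for the example.

```javascript
var input = "2"   // e.g. a value read from a form field

var asNumber = Number(input) + 2         // numeric addition
var asString = String(2) + input         // string concatenation
var parsed   = parseInt("2px", 10) + 2   // parseInt stops at the first non-digit
var strict   = Number("2px")             // Number refuses the whole string

console.log(asNumber)   // 4
console.log(asString)   // 22
console.log(parsed)     // 4
console.log(strict)     // NaN
```

Always passing the radix (the 10) to parseInt avoids a further surprise in older engines, where a leading zero could be read as octal.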
Oh yeah, and JS comparison operations are awkward. == coerces types, === doesn’t:
js> 2==2
true
js> 2=="2"
true
js> 2===2
true
js> 2==="2"
false
And watch out for accidental string comparisons; however, this isn’t usually too bad as, if either operand is a number, the coercion will be in the numeric direction:
js> 4 > 2
true
js> 4 > 22
false
js> 4 > "22"
false
js> "4" > "22"
true
Oh! And JS will cheerfully turn invalid numbers into NaNs without complaint, which then propagate tentacularly throughout the code!
js> "2" * 2
4
js> "two" * 2
NaN
js> x = "two" * 2
js> 4 * x
NaN
Remember, deep, slow breaths…
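If you need to catch a NaN before it spreads, the built-in isNaN function is the usual tool, since NaN does not even compare equal to itself. The toNumber helper below is a hypothetical name for this sketch, not a standard function.

```javascript
var x = "two" * 2

console.log(x == x)         // false: NaN is not equal to anything, even itself
console.log(isNaN(x))       // true
console.log(typeof x)       // number (yes, really)

// Hypothetical helper: convert, falling back to a default on failure.
function toNumber(text, fallback)
{
    var n = Number(text)
    return isNaN(n) ? fallback : n
}

console.log(toNumber("2", 0) * 2)     // 4
console.log(toNumber("two", 0) * 2)   // 0
```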
Arrays and hashmaps are built in to JS. If you just need to declare one with some data pre-initialised, there’s syntax for it directly ripped from Python – square brackets for lists, braces for hashmaps:
js> myArray = [1, 2, 3, 17, 23, 42, 69]
js> myHash = {"key": "value", "key2": "value2"}
Both are indexed with square brackets:
js> myArray[3]
17
js> myHash["key2"]
value2
You can mix and match any combo of data types as the values of any array or hashmap. For the keys, it seems to coerce all datatypes to string form and use that – so the JS statement myHash[fred] = value appears to be like the Python: myDict[str(fred)] = value
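A short sketch of what that key coercion means in practice. The names are made up for the example; note in particular that two different objects used as keys land on the same string.

```javascript
var lookup = {}
lookup[1] = "number one"
lookup["1"] = "string one"            // same key as above: overwrites it

var fred = {name: "Fred"}
var barney = {name: "Barney"}
lookup[fred] = "fred's value"
lookup[barney] = "barney's value"     // both objects become "[object Object]"

var keys = []
var k
for (k in lookup) keys.push(k)

console.log(keys.join(" | "))         // 1 | [object Object]
console.log(lookup[1])                // string one
console.log(lookup[fred])             // barney's value
```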
You can remove items from both kinds of storage with the same syntax, the delete statement:
js> delete myArray[3]
true
js> delete myHash["key2"]
true
(delete seems to return true always. Yes, even if your index is out of range. Yes, even if there is no such key. No, I don’t know why.)
However, you should be aware that if you remove an item from an indexed array, it leaves an empty space behind (undefined), and all other items still have the same indices:
js> myArray
1,2,3,,23,42,69
js> print(myArray[3])
undefined
js> print(myArray[4])
23
(That double-comma is not a typo. undefined values have no output when converted to strings, but there’s still a ‘slot’ reserved for it in the array.)
Now, when you iterate over an array, it does skip over such empty slots:
js> for (i in myArray) print(i, ":", myArray[i])
0 : 1
1 : 2
2 : 3
4 : 23
5 : 42
6 : 69
Nonetheless, for many tasks, having undefined values floating around can be enough of a pain in the ass that you’ll find yourself writing functions to either “compress” an array, or to return a copy where the specified index is skipped, giving you delete-and-compress in one function (at the expense of an unnecessary copy operation):
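function deleteArrayItem(source, index)
{
    var result = new Array()
    var i // declared, so the loop counter doesn't leak into a global
    for (i in source)
        if (i != index) // i is a string key; != coerces before comparing
            result.push(source[i])
    return result
}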
Update: Reddit user davidsickmiller mentions an array method I’d missed back when I wrote this in 2006: Array::splice will remove a range from an array, returning the removed elements, and packing the original array:
js> myArray = [1, 2, 3, 17, 23, 42, 69]
js> myArray.splice(3,1)
17
js> myArray
1,2,3,23,42,69
Arrays automatically extend themselves when you insert a value at an out-of-range index, by the way. And if the range isn’t contiguous, all the slots in-between are created undefined.
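A quick sketch of that auto-extension, with made-up names. Reading one of the in-between slots gives undefined, although the in operator reports that nothing was ever stored there, which matches the way the iterator skips such slots.

```javascript
var sparse = [1, 2]
sparse[5] = 6              // index 5 is out of range: the array grows to fit

console.log(sparse.length)   // 6
console.log(sparse[3])       // undefined
console.log(3 in sparse)     // false: the slot was never actually assigned
console.log(5 in sparse)     // true
```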
Arrays have a length member, eg myArray.length, but it doesn’t tell you how many items are present; it tells you how many slots there are. In other words, it always returns (highest-index-you’ve-inserted-items-at)+1. If you want to know the actual count, well, you’ll just have to iterate through and count them.
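Counting by iteration might look something like this. countItems is a hypothetical name for the sketch, and it relies on the iterator skipping empty slots as shown earlier.

```javascript
// Hypothetical helper: counts the slots that actually hold something.
function countItems(source)
{
    var count = 0
    var i
    for (i in source)
        count++
    return count
}

var holey = [1, 2, 3, 17, 23, 42, 69]
delete holey[3]

console.log(holey.length)        // 7
console.log(countItems(holey))   // 6
```

Bear in mind that the iterator also visits any other enumerable properties on the array, including ones some library may have added to Array.prototype, so treat the count as a sketch rather than a guarantee.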
And on that note, I’m going to leave you to fend for yourself for a while.