Javascript:: core principles

NOTE: This article is aimed at people with some experience programming in at least one other language. You may benefit from it if you're a total newb, but you aren't my intended audience.

As of the time this article is being written, if you google "the world's most misunderstood programming language", you'll get a bunch of hits about Javascript.

I personally find it a joy to work with, but I hear a lot of complaints about it. In any case, it seems like Javascript is here to stay.

It's been called the assembly language of the web because of the large and ever-increasing number of languages which compile to javascript, even though js is already a very high level language. Despite its humble beginnings, javascript was effectively the only option when it came to clientside scripting throughout the nineties, and that position resulted in a wonderful arms race which got us as far as we are now.

I don't expect that I will be able to explain how so many misconceptions arose around Javascript, but I can try to tackle what I think are the most egregious errors. I'll do my best not to get derailed in a bunch of theory; I just want to help people wrap their minds around how it behaves.

Variable assignment

When you want to store a value, you create a variable with the keyword var. This word is reserved, so trying to name a variable 'var' will cause a syntax error. That should be fairly easy to work around, since Javascript is incredibly permissive about what characters and words you can use. Variables can be created without an initial value.

var a,b,c; // create these variables at the current scope
// if referenced, these variables will hold the value undefined
// you probably don't want to do this, but there are times when it is useful

Variables can store anything at all. If it's a valid piece of data, a variable name can reference it. The type of data referenced can be changed at any time, so one minute the variable a can be a number, and the next, it can be a string, or an object, or an array, or a function. So how do those datatypes behave?
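To make that concrete, here's a minimal sketch of one variable being rebound to several kinds of data. The names a and seen are just placeholders for illustration, and typeof is the built-in operator that reports what kind of value a name currently refers to.

```javascript
// One variable, rebound to values of several different types.
var a;                        // declared with no initial value
var seen = [typeof a];        // "undefined"

a = 42;
seen.push(typeof a);          // "number"

a = "forty-two";
seen.push(typeof a);          // "string"

a = [4, 2];
seen.push(Array.isArray(a));  // true (typeof reports "object" for arrays)

a = function () { return 42; };
seen.push(typeof a);          // "function"

console.log(seen);
```

Nothing complains at any point; the name simply refers to whatever you assigned last.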

Numbers

Many languages have separate datatypes for integers (whole numbers) and floats (numbers with decimal points). Other languages have signed (numbers which include information about whether they are positive or negative) and unsigned (numbers which use the space of the sign bit to store a little bit more numerical information instead of sign information). Javascript was designed to be transmitted across networks every time a page downloaded, and so, rather than optimize for performance based on how numbers were stored, bandwidth was optimized by requiring fewer keywords, and making source code shorter overall.

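As such, numbers are just numbers. Arithmetic is mostly straightforward: (3/2)*(1/1.5) is equal to 1. Under the hood every number is a double-precision float, so the one surprise you are likely to hit is that some decimal fractions are not stored exactly: 0.1 + 0.2 is not exactly equal to 0.3. Spend some time playing around with it, and you probably won't find much else that is surprising here.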
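If you'd like something to paste into a console, here's a small sketch. It leans on one fact not spelled out in the paragraphs above, namely that javascript numbers are double-precision floating point values, so treat the closeTo helper as my own illustration rather than anything built in.

```javascript
// Whole numbers and fractions share a single type.
var whole = 3;
var fractional = 1.5;
console.log(typeof whole, typeof fractional); // number number

// Division does not truncate just because both operands look like integers.
var half = 7 / 2;
console.log(half); // 3.5

// Floating point storage means some decimal fractions are only approximate.
var sum = 0.1 + 0.2;
console.log(sum === 0.3); // false

// Hypothetical helper: compare with a small tolerance instead of ===.
function closeTo(x, y) {
  return Math.abs(x - y) < 1e-9;
}
console.log(closeTo(sum, 0.3)); // true
```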

Arrays

In other languages, these are called lists, or vectors. Unlike many other languages, however, Javascript does not require that your array has a type. It doesn't need to be exclusively composed of numbers, or some other data type; it just stores arbitrary information at index points. You can mix and match however you like. Arrays simply bundle up a number of datapoints into a single piece of data which can be referenced by a variable name, or used as a literal piece of data.

Finally, arrays can nest in other arrays. Perhaps there's some depth at which they cannot nest any further, but if there is, I've never hit that limit.
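Here's a quick sketch of both points: one array holding several kinds of data, and a second array nested a thousand levels deep. The names mixed, nested, cursor and depth are made up for the example, and a thousand is an arbitrary number, not a limit.

```javascript
// One array, five different kinds of data.
var mixed = [1, "two", [3, 4], { five: 5 }, function () { return 6; }];

console.log(mixed.length);   // 5
console.log(mixed[1]);       // two
console.log(mixed[2][1]);    // 4 (index into the inner array)
console.log(mixed[3].five);  // 5
console.log(mixed[4]());     // 6 (call the function stored at index 4)

// Wrap an empty array inside a new array, 1000 times over.
var nested = [];
for (var i = 0; i < 1000; i++) {
  nested = [nested];
}

// Walk back down to the innermost (empty) array, counting the levels.
var depth = 0;
var cursor = nested;
while (cursor.length === 1) {
  cursor = cursor[0];
  depth++;
}
console.log(depth); // 1000
```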

Strings

Unlike most other languages, Javascript has no 'character' datatype. Instead, single characters are treated as strings of length 1. Strings themselves are immutable, but you can build new ones of any length by concatenating or slicing, and they can contain unicode characters like λ.
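A short sketch of what that looks like in practice. The variable names are placeholders of mine, and the string is arbitrary.

```javascript
var word = "λ-calculus";

// Indexing into a string gives you another string, not a 'char'.
var first = word[0];
console.log(typeof first);   // string
console.log(first.length);   // 1
console.log(first === "λ");  // true

// Concatenation produces a longer string.
var longer = word + "!";
console.log(longer.length);  // 11
console.log(word.length);    // 10 (word itself still holds the original text)
```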

Objects

Objects are the equivalent of a hash-map, or dictionary, in other languages. They store values accessible by their key, rather than by their index, as with an array. There are two syntaxes you can use to access a particular key:

  1. varname.keyname which is fairly short and straightforward.
  2. varname["keyname"] which is slightly more verbose, but is more generally applicable.

Why is the second one more general?

Suppose you have a string "keytoaccess", referred to by the variable name pewpew. If you want to use the value of that variable "keytoaccess" as the key to an object, you will run into trouble with the first syntax.

var pewpew = "keytoaccess";
var varname = {keytoaccess:"it's alive!"};

varname.pewpew // this will try to access a key `pewpew` in the object `varname`
// this key does not exist, so you get undefined instead of your value
// welcome to bug city, population: you

With the second syntax, you can pass pewpew as the key. The engine will first look up the value of the variable pewpew, namely "keytoaccess", then pass it as a key to the object varname.

varname[pewpew] // this will do what you initially expected
// assuming you expected it to return the string "it's alive!"

Remember this lesson well, as it's something lots of people mess up.
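Two more situations where only the bracket syntax will do, sketched under my own assumptions: the scores object and its keys are invented for the example, and Object.keys is the standard built-in that returns an object's keys as an array of strings.

```javascript
var scores = { alice: 3, "bob smith": 5 };

// A key containing a space cannot be written after a dot.
console.log(scores["bob smith"]); // 5

// Keys that are only known at runtime: loop over them and look each one up.
var keys = Object.keys(scores);
var total = 0;
for (var i = 0; i < keys.length; i++) {
  total += scores[keys[i]];
}
console.log(total); // 8

// Asking for a key that is not there does not throw, it gives you undefined.
var missing = scores.carol;
console.log(missing); // undefined
```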

The mystery of Javascript Functions

Functions basically come in two core flavours in javascript:

  1. The anonymous function, or lambda (formally, a function expression)
  2. The keyword function (formally, a function declaration)

Many people think that they are the same, but THIS IS NOT THE CASE! If you confuse the two, you run the risk of encountering nasty bugs that will make you pull your hair out. Heed these words!

The first thing your javascript engine will do is split your code up into tokens, then it will parse out what they mean, then it will run it. If you declare a function using the keyword syntax, that function will be instantiated before anything else in its scope runs (this is usually called hoisting). This means that you can have a function appear at the bottom of your source code, and call it from the top. Your JS engine will know that the earlier invocation refers to the function which is defined later.

doSomething();

function doSomething(){
  console.log("I'm a function, and I do something");
};

If you run this code, you'll see that the function does indeed execute properly.

In contrast, a lambda, or anonymous function behaves exactly like any other variable. That is, you can use it explicitly as a constant using the following syntax:

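(function(){
  console.log("This is an anonymous function, or lambda. Note that it has no name.");
});
// the wrapping parentheses are needed when a lambda stands alone as a statement,
// otherwise the parser expects a named function declaration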

This kind of function syntax is evaluated in the sequence in which it is encountered. If it is never bound to a variable name, it can only be used immediately. Wrap it in parentheses so the parser reads it as a value rather than a declaration, like so:

(function(){
  console.log("I'm just demonstrating immediate function invocation in javascript");
})();

All you need to do is pass arguments (or empty parentheses, if it's a nullary function), and it will be called immediately (just like a regular function). Note that because of the way that the keyword function declaration behaves at parse time, you will not be able to invoke it in the same fashion.

For example:

function iHaveABadFeelingAboutThis(){
  console.log("I smell an error.");
}();

A wild bug appears! (Specifically, a SyntaxError.)

No matter, though. You wouldn't have gained anything by invoking it this way, since it's guaranteed to stay in memory by virtue of being a named function.

Just invoke it as you normally would.
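To put the two flavours side by side, here's a minimal sketch. The names declared, assigned and results are mine, and the try/catch is only there so the script keeps running after the failed call.

```javascript
var results = [];

// The keyword function is hoisted, so calling it before its definition works.
results.push(declared());

// The lambda is bound by an ordinary assignment further down.
// At this point the name exists but still holds undefined.
try {
  assigned();
} catch (e) {
  results.push(e instanceof TypeError); // calling undefined is a TypeError
}

function declared() {
  return "declared ran";
}

var assigned = function () {
  return "assigned ran";
};

// After the assignment has executed, the lambda is callable like anything else.
results.push(assigned());

console.log(results); // [ 'declared ran', true, 'assigned ran' ]
```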

Functions as first class citizens

I heard that term thrown around a lot when I was new to js, but I never really heard it explained very well. What it means is that you can use a function the same way you would use anything else.
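As a rough sketch of "the same way you would use anything else": a function can be bound to a second name, stored in an object or an array, and returned from another function. Every name here (shout, toolbox, makeRepeater and so on) is invented for the illustration.

```javascript
function shout(s) {
  return s.toUpperCase() + "!";
}

var alias = shout;               // bound to another name
var toolbox = { loud: shout };   // stored as a value in an object
var list = [shout];              // stored in an array

// A function can also be the return value of another function.
function makeRepeater(n) {
  return function (s) {
    var out = "";
    for (var i = 0; i < n; i++) {
      out += s;
    }
    return out;
  };
}
var twice = makeRepeater(2);

console.log(alias("pew"));        // PEW!
console.log(toolbox.loud("pew")); // PEW!
console.log(list[0]("pew"));      // PEW!
console.log(twice("pew"));        // pewpew
```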

For example, you could define a function that prints its argument three times:

function printThrice(x){
  console.log(x);
  console.log(x);
  console.log(x);
};
printThrice("pew");

You can similarly define a function which will accept a function as an argument, and call it three times:

function callThrice(f){
  // f does its own logging, so just call it
  f();
  f();
  f();
};
callThrice(function(){
  console.log("pew");
});

This is the point where people generally start to complain about javascript becoming unreadable. Just remember how immediate invocation looked and behaved, and keep in mind that the argument you are passing is behaving exactly the same way. It's just an anonymous function.

For the earlier example (printThrice), you could avoid passing the string literal by binding it to a name before passing it:

var s = "pew";
printThrice(s);

You can do exactly the same thing with your function call:

var F = function(){
  console.log("pew");
};

callThrice(F);

You may find that easier to read, but now that function is going to stay in memory for as long as the name F is reachable (until you overwrite it, its scope goes away, or the program completes). When you see people passing functions as arguments to be called within other functions (in most situations we refer to these as 'callbacks'), it's simply because they are only going to use that function once, and there's no need to bind it to a name. Now that you understand it, get over it!
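You'll meet the same pattern in the language's own built-ins. As a small sketch, Array.prototype.map and Array.prototype.filter both take a callback; the numbers array and the isOdd function are just examples of mine.

```javascript
var numbers = [1, 2, 3];

// One-off callback: an anonymous function passed inline.
var doubled = numbers.map(function (n) {
  return n * 2;
});

// Reusable callback: bound to a name first, then passed by that name.
function isOdd(n) {
  return n % 2 === 1;
}
var odds = numbers.filter(isOdd);

console.log(doubled); // [ 2, 4, 6 ]
console.log(odds);    // [ 1, 3 ]
```

Note that isOdd is passed without parentheses. Writing isOdd() would call it on the spot and pass its result instead of the function itself.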

That said, I've seen many cases where a callback was written inline even though binding it to a name would have resulted in cleaner code. That's just poorly written js. It isn't the fault of the language, but of programmers who don't understand when they can afford to abstract things away.

Study it! Come to terms with it! Use it! The truth will set you free! Knowing is half the battle! ETCETERA!

--ansuz

2014/10/07