Wednesday, February 11, 2009

Javascript namespacing and packaging - Draft 1

Packaging and namespacing standard for javascript projects - Draft 1

Date: Tue Feb 10 19:03:38 EST 2009
Subject: Packaging and namespacing standard for javascript projects - Draft 1
From: Daniel Bush - dlb.id.au AT gmail.com

BACKGROUND

The following questions motivate this README file.
  • What is the best way to structure a javascript project?
  • How do you package it and handle versioning?
  • How do you prevent clashes in the "global namespace"?
  • In short: how do you create an eco-system of javascript libraries created by multiple authors/vendors with multiple versions which other authors can easily use and handle in a reliable, controlled way?
Motivations for this document:
  • http://yuiblog.com/blog/2006/06/01/global-domination/
  • http://yuiblog.com/blog/2007/06/12/module-pattern/
  • http://blog.web17.com.au/2009/02/namespaces-in-javascript-1.html
This draft is a first attempt; expect unresolved issues and possibly major revisions.

TERMINOLOGY

One way to understand the problems covered in this document is to clearly identify the constituent parts that play a role. These are:
Publisher
A person or organisation who creates javascript code, either as an application or as a library to be used by other javascript applications, whether those come from other publishers or from the same publisher.
Global Namespace (GN)
A global variable that publishers use to contain all projects they produce. A GN is a javascript object:
var $GN$ = {};
(The '$$' format is discussed further down this document.)
Foreign Global Namespace (FGN)
An FGN is a GN used by some other publisher, which the current publisher is referencing in one of their projects. In the following, we'll often just refer to "GN" for a project regardless of whether it is using another project or being used by another project.
Project
Has the following features:
  • exists as a set of files and subdirectories under a root directory
  • unminified/unserialised; project may be spread across several files and include comments and spacing for human readability as well as additional files that aren't executed.
  • is version controlled using a version control system (VCS)
  • defines/creates one place in the GN of the publisher eg
     $GN$.project1
     $GN$.area1.project1
     etc    
  • may have dependencies on other projects in the same GN or in FGN's.
  • may be used as a dependency (library) or standalone
  • is intended to be packaged - see 'Package' below
A project can be built out of other projects.
Dependency
A project which another project relies on for some of its functionality. Dependencies are usually packaged and as a result, represent a specific version or snapshot of the project. Dependencies may use the same GN or an FGN.
Dependency Chain
A tree representing the dependencies of a project. Here is an example:
    B1
   /
A--
   \
    C1--B2
In this diagram, A, B and C are projects. B1, C1 and B2 are specific versions of the projects they represent. Note: Although B1 and B2 are potentially different versions of the same project, they both occupy the same location in the GN for that project. The last version to be loaded will therefore occupy this position in the GN once the dependency chain has been fully loaded into the system.
Level 1 of the tree represents A's dependencies (B1 and C1). The dependencies of A's dependencies are called 2nd order dependencies of A and occupy level 2 of the tree (B2).
Note: 'A' may be a tagged version in a VCS or an in-between development version. Either way, any given snapshot of A relies on specific versions of its dependencies.
Load Order
The order in which a project and its dependencies (and their dependencies etc) are loaded. Load order can be defined recursively: a project is loaded by first loading its dependencies in a specified order (see packaging) and then loading the project itself. Loading involves defining the GN if it does not exist already, and then building the relevant part of the GN that is defined by the project or dependency. Using the dependency chain example above, the appropriate load orders are:
B1 B2 C1 A
Or:
B2 C1 B1 A
Load order is affected by the order in which the packages are referenced using script tags in an html file.
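The recursive definition above amounts to a depth-first walk of the dependency chain in which each project is emitted after its dependencies. The following is only a minimal sketch, not part of any packaging tool; the 'chain' object and the 'computeLoadOrder' function are hypothetical names introduced for illustration.

```javascript
// Hypothetical description of the dependency chain from the diagram:
// A depends on B1 and C1; C1 depends on B2.
var chain = {
  name: 'A',
  deps: [
    { name: 'B1', deps: [] },
    { name: 'C1', deps: [ { name: 'B2', deps: [] } ] }
  ]
};

// A project loads its dependencies first (recursively), then itself.
function computeLoadOrder(node, order) {
  order = order || [];
  for (var i = 0; i < node.deps.length; i++) {
    computeLoadOrder(node.deps[i], order);
  }
  order.push(node.name);
  return order;
}

var order = computeLoadOrder(chain);   // B1, B2, C1, A
```

Listing C1 before B1 in 'deps' would give the second order shown above: B2 C1 B1 A.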
Load Time
(Sometimes referred to as 'load order time') The time during which the project and its dependencies are loaded into the interpreter. It can be thought of as the initialisation phase of the project. This is emphasised because it is during this phase that clashes with project global variables may occur.
Package / Packaging
When a project is "packaged" it becomes a package. Packages have the following features:
  • usually minified; comments and spacing of the original project code are reduced or removed
  • usually a single file (serialised)
  • usually represent a particular tagged version of the project (usually using a tag in the VCS).
  • may include dependencies
  • javascript files of the project have to be serialised in load order when creating the single package file
  • The version number is explicitly added to the end of the file name. Eg project1-0.2.js would be version 0.2 of project1.
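As an illustration of the file naming convention in the last point, a package file name can be split back into project name and version. This is only a sketch; 'parsePackageName' is a hypothetical helper, and it assumes the version consists of dot-separated numbers.

```javascript
// Hypothetical helper: split "project1-0.2.js" into name and version.
// Assumes the version is made of dot-separated numbers, eg 0.2 or 1.1.2.
function parsePackageName(fileName) {
  var match = /^(.+)-(\d+(?:\.\d+)*)\.js$/.exec(fileName);
  if (!match) {
    return null;  // Not a versioned package file name.
  }
  return { project: match[1], version: match[2] };
}

var pkg = parsePackageName('project1-0.2.js');
// pkg.project is 'project1', pkg.version is '0.2'
```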
Module
A module (also referred to as the "module pattern") is a pattern identified by Yahoo!. See: http://yuiblog.com/blog/2007/06/12/module-pattern/ . Modules can be used to construct a part of a GN or project at load time and encourage the building of projects outside of the global namespace.
Frameworks
Frameworks are large, overarching sets of projects or libraries. Examples are jQuery and Prototype. Frameworks tend to use their own special GN's, eg jQuery uses 'jQuery'; or occupy really common GN's, eg Prototype uses 'Event', 'Ajax' etc. Frameworks are mostly exempt from this document.
If you are using one, it will probably be included at the top of the load order. Projects and dependencies in the rest of the load order can then use the framework. It is up to the publisher to make sure they use an appropriate version of the framework to fit the requirements of the project and its dependencies.

RECOMMENDATIONS

1) Referencing global variables

Do NOT reference global variables in yet-to-be-executed functions, eg an asynchronous event handler. This includes referencing the GN or part of the GN or any other global variable that is a shortcut to the GN or part of the GN.
Global variables should be accessed at load time and preferably with a local variable within the scope of a module.
Rationale:
You cannot guarantee that the same version of a project is accessible through the GN after load time, because another project/dependency may have overwritten it through its own dependencies. For shortcuts, the situation can be exacerbated; several dependencies (projects) may use the same shortcut to reference completely different things. If these shortcuts are global, they will overwrite one another.
Example:
  var project1 = $GN$.area1.project1;
  project1.Object1 = function(){...};
  var p = new project1.Object1();
In this example, 'project1' is a global variable shortcut to a GN which is set up and then used straight away. Note: the above example should not be used for projects which are likely to be used as dependencies themselves as it is polluting the global namespace. See next point.
Example - bad:
  var project1 = $GN$.area1.project1;
  function doSomethingLater() {
    ...
    var obj = new project1.Object1();
    ...
  }  
In this example, the 'doSomethingLater' function is not executed at load time but references the global shortcut 'project1'. In an environment that has multiple dependencies from multiple authors, 'project1' could be overwritten by another dependency at load time and is not guaranteed to be correct when the function finally runs. Even if we don't use a shortcut, it is possible that $GN$.area1.project1 has been overwritten with a different version.
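The difference between load-time and deferred access can be simulated in a single script. This is a sketch under stated assumptions: the 'version' field and the function names are made up for illustration, and "loading" a package is imitated by plain assignment.

```javascript
var $GN$ = $GN$ || {};
$GN$.area1 = {};

// Imitate loading version 1 of project1 (hypothetical content).
$GN$.area1.project1 = { version: '1' };

// Bad: the GN is looked up only when the function eventually runs.
function versionSeenLater() {
  return $GN$.area1.project1.version;
}

// Better: capture the reference at load time in a local variable.
var versionCaptured = function() {
  var project1 = $GN$.area1.project1;   // Resolved now, at load time.
  return function() {
    return project1.version;
  };
}();

// Imitate another dependency loading version 2 over the same location.
$GN$.area1.project1 = { version: '2' };

var late = versionSeenLater();     // '2' - not the version we loaded against
var captured = versionCaptured();  // '1'
```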

2) Using modules to build the GN and to reference other GN's

If your project is intended or likely to be used as a dependency (a library to some other project), then i) it should be constructed using a module (at load time) and ii) the GN's of its dependencies should be referenced inside this module. Shortcuts to such GN's should be created within the scope of the module at or near the top.
Rationale:
A dependency which has dependencies of its own will reference them inside the module that builds it at load time, thus ensuring that it keeps a reference to the appropriate version of each.
Example:
$GN1$.project1 = function() {
  ...
  var project2 = $GN2$.area1.project2;
  ...
  ...
  ...
}();
In this example, project2 is a dependency of project1 and is referenced within the executed scope of the anonymous function. Assuming load order and packaging have been set up properly, this should ensure that project1 is using the appropriate version of project2.
Note: Use of this technique within the context of the load order may mitigate any clashes in the GN; but it still makes sense to keep GN's unique to clearly distinguish publishers.

3) GN naming scheme

Use a pair of dollar signs, one at each end of the name, to denote a GN. The string between the dollar signs should represent a domain name of the publisher with the dots replaced by underscores. Eg $web17_com_au$ would represent the GN for the publisher who owns the web17.com.au domain name.
Rationale:
If there is a proliferation of publishers creating javascript projects and libraries that can be used by other publishers, the likelihood of the same GN being used by different publishers increases; a modified domain name will reduce this risk.
The dollar signs signify that the variable is a GN and is part of the packaging system rather than just another variable in a program.
It also encourages the use of shortcuts. The GN should not be interspersed throughout the code in the project. It should be referenced by a shortcut at the top of a module or project.
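The naming rule can be expressed as a one-line transformation. The following is a sketch only; 'gnName' is a hypothetical helper and is not something the draft defines.

```javascript
// Hypothetical helper: derive a GN name from a publisher's domain name
// by replacing dots with underscores and wrapping the result in '$'.
function gnName(domain) {
  return '$' + domain.replace(/\./g, '_') + '$';
}

var gn = gnName('web17.com.au');   // '$web17_com_au$'
```

Domain names containing hyphens would need a further rule, since a hyphen is not valid in a javascript identifier; this draft does not specify one.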

OTHER SUGGESTIONS

1) Starting a GN

To start a GN, do:
var $GN$ = $GN$ || {};
This statement should be outside of any function. You do not create FGN's. By definition, they have been created for you by the dependency.
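The same "use it if it exists, otherwise create it" idiom can be applied to nested parts of the GN such as $GN$.area1.project1. A minimal sketch, assuming a hypothetical helper called 'ensurePath' that is not part of this draft:

```javascript
var $GN$ = $GN$ || {};

// Hypothetical helper: make sure a dotted path exists under a root object,
// reusing any objects already there instead of overwriting them.
function ensurePath(root, path) {
  var parts = path.split('.');
  var current = root;
  for (var i = 0; i < parts.length; i++) {
    current[parts[i]] = current[parts[i]] || {};
    current = current[parts[i]];
  }
  return current;
}

var area1 = ensurePath($GN$, 'area1');
var project1 = ensurePath($GN$, 'area1.project1');
// The second call reuses the existing area1 object rather than replacing it.
```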

2) Modules

A suggested module pattern:
$GN$.project1 = function() {
 var module={};
 var shortcut1 = $FGN1$.path.to.project.Object1;
 var shortcut2 = $FGN2$.path.to.project.Object2;
 ...

 // Private module members.
 var private_var1 = ...;

  ...

 // Public module members.

 module.Object1 = function(...) {...}
 module.function1 = function(...) {...}
 ...

 // Return 'module' to build $GN$.project1 namespace.
 return module;
}();
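Here is one way the skeleton above might be filled in. It is only an illustration: 'Counter', 'increment' and 'instanceCount' are hypothetical names, and the FGN shortcuts are left out so that the sketch runs on its own.

```javascript
var $GN$ = $GN$ || {};

$GN$.project1 = function() {
  var module = {};

  // Private module member: not reachable from outside the function.
  var instances = 0;

  // Public module members.
  module.Counter = function(start) {
    instances++;
    this.value = start || 0;
  };
  module.Counter.prototype.increment = function() {
    this.value++;
    return this.value;
  };
  module.instanceCount = function() {
    return instances;
  };

  // Return 'module' to build the $GN$.project1 namespace.
  return module;
}();

var c = new $GN$.project1.Counter(5);
c.increment();   // c.value is now 6
```

Only the members attached to 'module' are visible through $GN$.project1; 'instances' can be read only through the public function that closes over it.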

3) Extending modules - submodules

Suppose you have built the module in point 2). You can extend the module (maybe in a separate file) like so:
$GN$.project1.area1 = function() {
 ...
}();
Notes:
  • this file would need to come after the first file defining $GN$.project1 at load time
  • A new module is being created at project1.area1 - ie a submodule; this module clearly cannot see any private data in the parent module.
  • care needs to be taken that area1 doesn't overwrite a namesake defined in the project1 module.
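A module can also be augmented directly with a new object rather than a submodule. Example - not so good:
$GN$.project1 = function() {
  ...
}();
$GN$.project1.Object2 = function() {
  var shortcut1 = $FGN1$.project2;  // Not so good.
  ...
};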
In this example, the project1 module is augmented with a new object, not a submodule. An attempt is made to create a shortcut within Object2. Unfortunately, this object can be instantiated at a later time when $FGN1$.project2 might have been overwritten with a different version. Granted, this is probably not that likely.
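One possible way around this, in keeping with recommendation 2), is to build the constructor inside an anonymous function so that the shortcut is resolved at load time. This is a sketch, not a prescribed pattern; the stand-in objects, the 'version' field and 'dependencyVersion' are hypothetical.

```javascript
// Stand-ins so the sketch runs on its own (hypothetical content).
var $FGN1$ = { project2: { version: '1' } };
var $GN$ = { project1: {} };

// Augment project1 with Object2, capturing project2 at load time.
$GN$.project1.Object2 = function() {
  var project2 = $FGN1$.project2;   // Resolved once, at load time.
  return function() {
    this.dependencyVersion = project2.version;
  };
}();

// A different version is loaded over the same FGN location later on.
$FGN1$.project2 = { version: '2' };

var o = new $GN$.project1.Object2();
// o.dependencyVersion is '1', the version project1 was loaded against.
```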

4) Layout of project

The following might represent the file structure for a project:
/
/ext
/ext/projectA-0.2.js
/ext/projectB-1.1.2.js
/globals.js
...
/module1
/module1/module1.js
/module1/submodule1.js
/module1/submodule2.js
...
/module2/module2.js
...
  • The above layout uses a directory called 'ext' to store dependencies - ie other projects that are used by this project.
  • The dependencies are packaged with their version numbers displayed so it is obvious to the maintainer which versions are in use.
  • The globals.js file sets up the $GN$ for the project including the relevant part of the $GN$ which this project will handle. Eg:
    var $GN$ = $GN$ || {};
    $GN$.module1={};  // Initialize the module for this project.    
  • A large project might create several modules and submodules; it may make sense to provide a subdirectory for each module and its submodules.
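Given a layout like this, packaging comes down to serialising the files in load order. The sketch below imitates that step in memory so that it can run anywhere; the file contents, the 'fileOrder' list and the 'buildPackage' function are all assumptions made for illustration, and a real build would read the files from disk and usually minify the result.

```javascript
// Hypothetical in-memory stand-in for some of the project's files.
var files = {
  'ext/projectA-0.2.js':   '/* projectA 0.2 */',
  'ext/projectB-1.1.2.js': '/* projectB 1.1.2 */',
  'globals.js':            'var $GN$ = $GN$ || {};',
  'module1/module1.js':    '$GN$.module1 = function() { return {}; }();',
  'module1/submodule1.js': '$GN$.module1.submodule1 = function() { return {}; }();'
};

// Assumed load order: dependencies, then globals,
// then each module before its submodules.
var fileOrder = [
  'ext/projectA-0.2.js',
  'ext/projectB-1.1.2.js',
  'globals.js',
  'module1/module1.js',
  'module1/submodule1.js'
];

// Serialise the files into a single package body, in load order.
function buildPackage(files, order) {
  var parts = [];
  for (var i = 0; i < order.length; i++) {
    if (!files.hasOwnProperty(order[i])) {
      throw new Error('Missing file: ' + order[i]);
    }
    parts.push(files[order[i]]);
  }
  return parts.join('\n');
}

var packageBody = buildPackage(files, fileOrder);
```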

Sunday, February 8, 2009

Namespaces in javascript 1

Here are some of the issues I want to cover - not just in this post but maybe in several including this one:
  • What is the best way to structure a javascript project?
  • How do you package it and handle versioning?
  • How do you prevent clashes in the "global namespace"?
  • In short: how do you create an eco-system of javascript libraries created by multiple authors/vendors with multiple versions which other authors can easily use and handle in a reliable, controlled way?

The Namespace / Global Problem

It's easy to create small useful components or helper libraries in javascript which you could publish, say to github.com, and have other people use. But, if enough of this happens and you don't pay attention to how the global namespace[1] is used then you will eventually get some namespace clashes. For my part, I'm interested in how to safely namespace javascript components written by me or other people.

Creating a simple namespace

First off, the easiest way to create a global namespace (ie a global variable that acts as a namespace) is to do something like:
var NS1={};
between the script tags on your web page or in a .js file referenced by your web page. Then, you can proceed to use NS1 to namespace any other object constructors, functions or variables you care to use.
NS1.Object1 = function() {...}
var o = new NS1.Object1();
A popular javascript library like Prototype already stakes its claim to globalisation by taking some of the more obvious global variable names, eg 'Ajax', 'Event' etc. jQuery is more cautious and puts everything in a 'jQuery' namespace. Both libraries use '$' as a convenience global variable. In:
  • http://yuiblog.com/blog/2006/06/01/global-domination/
  • http://yuiblog.com/blog/2007/06/12/module-pattern/
Douglas Crockford and Yahoo! discuss suggested practices. If you're a company called Yahoo, then you use a global variable YAHOO (which I'll refer to as a global namespace[1]). Douglas suggests using all-caps to make it more obvious that this global variable has a special purpose - to be a global namespace.
The module pattern in the second link takes it one step further and suggests a way to build up a group of both public and private variables and functions within the scope of an anonymous function. The object returned by this anonymous function is called a "module" and is referenced by a name within the global namespace - in this case "module1":
YAHOO.module1 = function() {
  var private_var1 = .... ;
  ...
  return {
    Object1: function() { ... }
  };
  ...
}();
The returned object or module will have public variables and functions - in the example above, Object1 is a public constructor function of the returned anonymous object. This extends the YAHOO.module1 namespace to YAHOO.module1.Object1, which we can use like this:
var o = new YAHOO.module1.Object1();
It is just as easy to have private functions, objects and variables simply by using the 'var' statement within the anonymous function.

Shortcuts for namespaces

It can get tiresome having to reference an object or field by its fully qualified namespaced name. We can create a shortcut like this:
var O = YAHOO.module1.Object1;
and then use it:
var o = new O();
This is great except where "O" happens to be global. You might be tempted to set O globally, then just casually reference it inside your application:
var O = YAHOO.module1.Object1;
NS1.SomeObject2 = function() {
  this.doSomething = function() {
    ...
    var o = new O();
    ...
  };
};
In the above, NS1 is a global namespace for some publisher of javascript widgets. The problem is that NS1 has used "O" as a global variable shortcut for a library that it is using from the YAHOO global namespace (it could just as easily be some other object in the same namespace). What happens if NS2 comes along and builds a javascript widget that relies on NS1's "SomeObject2"?
Nothing much, unless NS2 in turn decides to use "O" as its global shortcut for some other object that it is using. Bear in mind that the application NS2 is building loads on the page by first loading NS1, which in turn first loads the YAHOO module1 and its object. So the load order is YAHOO, NS1 and NS2. The global "O" will get set first by NS1 and then overwritten by NS2.
Then, when NS2 runs and instantiates the object in NS1, "O" no longer means what NS1 intended. This of course illustrates the problem with using global variables in general. The moral of the story is to set your shortcuts within the scope of your object.
More precisely, it's ok to use a shortcut for your namespace if it gets used then and there; if it gets referenced in a yet-to-be-executed scope (eg some function that gets called later) then we open ourselves up to potential, nasty, hard-to-debug problems, especially where code from multiple vendors/authors is being used.
The module pattern above encourages non-global shortcuts. As with everything else in our module, the anonymous function houses both the stuff of the module and the shortcuts it uses.
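To make the NS1/NS2 scenario concrete, here is a sketch in which NS1 keeps its shortcut inside an anonymous function, so a global "O" set later by someone else has no effect on it. The stand-in libraries ('OTHER', 'Thing') and the 'kind' field are hypothetical and exist only so the sketch can run on its own.

```javascript
// Stand-ins for two unrelated libraries (hypothetical content).
var YAHOO = { module1: { Object1: function() { this.kind = 'yahoo'; } } };
var OTHER = { Thing: function() { this.kind = 'other'; } };

// Publisher NS1 keeps its shortcut inside the module's scope.
var NS1 = {};
NS1.SomeObject2 = function() {
  var O = YAHOO.module1.Object1;   // Local shortcut, invisible to others.
  return function() {
    this.doSomething = function() {
      return new O().kind;
    };
  };
}();

// Publisher NS2 loads later and uses a *global* "O" for something else.
var O = OTHER.Thing;

var result = new NS1.SomeObject2().doSomething();   // 'yahoo'
```

Had NS1 declared "O" globally instead, the last line would have produced an object from NS2's library.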

[1] - the term "global namespace" can get used with 2 different meanings in this document; (1) the actual global "space" in which global variable names exist; and (2) a given global variable acting as a namespace for given author/vendor.