Wednesday, February 11, 2009

Javascript namespacing and packaging - Draft 1

Packaging and namespacing standard for javascript projects - Draft 1

Date: Tue Feb 10 19:03:38 EST 2009
Subject: Packaging and namespacing standard for javascript projects - Draft 1
From: Daniel Bush - dlb.id.au AT gmail.com

BACKGROUND

The following questions motivate this README file.
  • What is the best way to structure a javascript project?
  • How do you package it and handle versioning?
  • How do you prevent clashes in the "global namespace"?
  • In short: how do you create an eco-system of javascript libraries created by multiple authors/vendors with multiple versions which other authors can easily use and handle in a reliable, controlled way?
Motivations for this document:
  • http://yuiblog.com/blog/2006/06/01/global-domination/
  • http://yuiblog.com/blog/2007/06/12/module-pattern/
  • http://blog.web17.com.au/2009/02/namespaces-in-javascript-1.html
This draft is a first attempt. Major revisions are likely and some issues remain unresolved.

TERMINOLOGY

One way to understand the problems covered in this document is to clearly identify the constituent parts that play a role. These are:
Publisher
A person or organisation who creates javascript code, either as an application or as a library to be used by other javascript applications. Those applications may belong to other publishers or to the same publisher.
Global Namespace (GN)
A global variable that publishers use to contain all projects they produce. A GN is a javascript object:
var $GN$ = {};
(The '$$' format is discussed further down this document.)
Foreign Global Namespace (FGN)
An FGN is a GN used by some other publisher, which the current publisher is referencing in one of their projects. In the following, we'll often just refer to "GN" for a project regardless of whether it is using another project or being used by another project.
Project
Has the following features:
  • exists as a set of files and subdirectories under a root directory
  • unminified/unserialised; project may be spread across several files and include comments and spacing for human readability as well as additional files that aren't executed.
  • is version controlled using a version control system (VCS)
  • defines/creates one place in the GN of the publisher eg
     $GN$.project1
     $GN$.area1.project1
     etc    
  • may have dependencies on other projects in the same GN or in FGN's.
  • may be used as a dependency (library) or standalone
  • is intended to be packaged - see 'Package' below
A project can be built out of other projects.
Dependency
A project which another project relies on for some of its functionality. Dependencies are usually packaged and as a result, represent a specific version or snapshot of the project. Dependencies may use the same GN or an FGN.
Dependency Chain
A tree representing the dependencies of a project. Here is an example:
    B1
   /
A--
   \
    C1--B2
In this diagram, A, B and C are projects. B1, C1 and B2 are specific versions of the projects they represent. Note: Although B1 and B2 are potentially different versions of the same project, they both occupy the same location in the GN for that project. The last version to be loaded will therefore occupy this position in the GN once the dependency chain has been fully loaded into the system.
Level 1 of the tree represents A's dependencies (B1 and C1). The dependencies of A's dependencies are called 2nd order dependencies of A and occupy level 2 of the tree (B2).
Note: 'A' may be a tagged version in a VCS or an in-between development version. Either way, any given snapshot of A relies on specific versions of its dependencies.
Load Order
The order in which a project and its dependencies (and their dependencies etc) are loaded. Load order can be defined recursively: a project is loaded by first loading its dependencies in a specified order (see packaging) and then loading the project itself. Loading involves defining the GN if it does not already exist, and then building the part of the GN that is defined by the project or dependency. Using the dependency chain example above, the appropriate load orders are:
B1 B2 C1 A
Or:
B2 C1 B1 A
Load order is affected by the order in which the packages are referenced using script tags in an html file.
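The recursive definition above can be sketched as a depth-first walk of the dependency chain. This is only an illustration: the 'chain' object and the 'loadOrder' function are hypothetical names, and the data is the A/B1/C1/B2 example from this document. The sketch makes no attempt to handle a dependency that appears more than once in the tree.

```javascript
// Hypothetical description of the dependency chain in the diagram.
// Each key is a project version; each value lists its direct dependencies.
var chain = {
  A:  ["B1", "C1"],
  B1: [],
  C1: ["B2"],
  B2: []
};

// Load order, defined recursively: load each dependency (and its own
// dependencies) first, then the project itself.
function loadOrder(name, deps) {
  var order = [];
  var children = deps[name] || [];
  for (var i = 0; i < children.length; i++) {
    order = order.concat(loadOrder(children[i], deps));
  }
  order.push(name);
  return order;
}

console.log(loadOrder("A", chain).join(" "));  // B1 B2 C1 A

// Listing A's dependencies the other way round gives the second order.
var chain2 = { A: ["C1", "B1"], B1: [], C1: ["B2"], B2: [] };
console.log(loadOrder("A", chain2).join(" ")); // B2 C1 B1 A
```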
Load Time
(Sometimes referred to as 'load order time') The time during which the project and its dependencies are loaded into the interpreter. It can be thought of as the initialisation phase of the project. This is emphasised because it is during this phase that clashes with project global variables may occur.
Package / Packaging
When a project is "packaged" it becomes a package. Packages have the following features:
  • usually minified; comments and spacing of the original project code are reduced or removed
  • usually a single file (serialised)
  • usually represent a particular tagged version of the project (usually using a tag in the VCS).
  • may include dependencies
  • javascript files of the project have to be serialised in load order when creating the single package file
  • The version number is explicitly added to the end of the file name. Eg project1-0.2.js would be version 0.2 of project1.
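As a small illustration of the file naming convention in the last point, the sketch below builds and parses a package file name. The function names 'packageFileName' and 'parsePackageFileName' are hypothetical, and treating the version as a dot-separated run of digits is an assumption of this sketch rather than something this draft specifies.

```javascript
// Build a package file name of the form <project>-<version>.js.
function packageFileName(project, version) {
  return project + "-" + version + ".js";
}

// Split a package file name back into project and version.
// Assumes the version is digits separated by dots, eg 0.2 or 1.1.2.
// Returns null when the name does not follow the convention.
function parsePackageFileName(fileName) {
  var match = /^(.+)-(\d+(?:\.\d+)*)\.js$/.exec(fileName);
  if (!match) {
    return null;
  }
  return { project: match[1], version: match[2] };
}

console.log(packageFileName("project1", "0.2"));  // project1-0.2.js
console.log(parsePackageFileName("projectB-1.1.2.js").version);  // 1.1.2
```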
Module
A module (also referred to as the "module pattern") is a pattern identified by Yahoo!. See: http://yuiblog.com/blog/2007/06/12/module-pattern/ . Modules can be used to construct a part of a GN or project at load time and encourage the building of projects outside of the global namespace.
Frameworks
Frameworks are large, overarching sets of projects or libraries. Examples are jQuery and Prototype. Frameworks tend to use their own special GN's, eg jQuery uses 'jQuery', or occupy really common GN's, eg Prototype uses 'Event', 'Ajax' etc. Frameworks are mostly exempt from this document.
If you are using one, it will probably be included at the top of the load order. Projects and dependencies in the rest of the load order can then use the framework. It is up to the publisher to make sure they use an appropriate version of the framework to fit the requirements of the project and its dependencies.

RECOMMENDATIONS

1) Referencing global variables

Do NOT reference global variables in a yet-to-be-executed function, eg an asynchronous event handler. This includes referencing the GN, part of the GN, or any other global variable that is a shortcut to the GN or part of the GN.
Global variables should be accessed at load time and preferably with a local variable within the scope of a module.
Rationale:
You cannot guarantee that the same version of a project is accessible through the GN after load time, because another project/dependency may have overwritten it through its own dependencies. For shortcuts, the situation can be worse: several dependencies (projects) may use the same shortcut name to reference completely different things. If these shortcuts are global, they will overwrite one another.
Example:
  var project1 = $GN$.area1.project1;
  project1.Object1 = function(){...}
  var p = new project1.Object1();
In this example, 'project1' is a global variable shortcut to part of the GN, which is set up and then used straight away. Note: the above example should not be used for projects which are likely to be used as dependencies themselves, as it pollutes the global namespace. See next point.
Example - bad:
  var project1 = $GN$.area1.project1;
  function doSomethingLater() {
    ...
    var obj = new project1.Object1();
    ...
  }  
In this example, the 'doSomethingLater' function is defined but not executed at load time, and it references a global shortcut 'project1'. In an environment that has multiple dependencies from multiple authors, 'project1' could be overwritten by another dependency at load time and is not guaranteed to be correct when the function finally runs. Even if we don't use a shortcut, it is possible that $GN$.area1.project1 has been overwritten with a different version.
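The problem can be made concrete with a small runnable sketch. Everything in it is hypothetical: $GN$ stands in for a publisher's GN, the 'version' fields are invented so the two copies can be told apart, and the second assignment simulates another dependency loading a different version of project1 later in the load order.

```javascript
var $GN$ = $GN$ || {};
$GN$.area1 = {};

// Version 1 of project1 is loaded.
$GN$.area1.project1 = { version: "1" };

// Bad: the GN is only looked up when the function eventually runs.
function lateLookup() {
  return $GN$.area1.project1.version;
}

// Better: look the GN up at load time and keep the result in a local
// variable that the later-executed function closes over.
var capturedLookup = function() {
  var project1 = $GN$.area1.project1;  // Resolved now, at load time.
  return function() {
    return project1.version;
  };
}();

// Later in the load order, another dependency loads version 2.
$GN$.area1.project1 = { version: "2" };

console.log(lateLookup());      // 2
console.log(capturedLookup());  // 1
```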

2) Using modules to build the GN and to reference other GN's

If your project is intended or likely to be used as a dependency (a library to some other project), then i) it should be constructed using a module (at load time) and ii) the GN's of its dependencies should be referenced inside this module. Shortcuts to such GN's should be created within the scope of the module at or near the top.
Rationale:
A dependency which has dependencies of its own references them inside the module that builds it at load time, thus ensuring that it keeps a reference to the appropriate version of each.
Example:
$GN1$.project1 = function() {
  ...
  var project2 = $GN2$.area1.project2;
  ...
  ...
  ...
}();
In this example, project2 is a dependency of project1 and is referenced within the executed scope of the anonymous function. Assuming load order and packaging have been set up properly, this should ensure that project1 is using the appropriate version of project2.
Note: Use of this technique within the context of the load order may mitigate any clashes in the GN; but it still makes sense to keep GN's unique to clearly distinguish publishers.
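To show how this plays out across a dependency chain, here is a sketch of the A/B/C example from the terminology section, loaded in the order B2 C1 B1 A. The GN names $GN1$ and $GN2$, the 'version' property and the 'bVersion' and 'cUsesB' functions are all invented for this illustration.

```javascript
var $GN1$ = $GN1$ || {};  // Hypothetical GN of the publisher of B.
var $GN2$ = $GN2$ || {};  // Hypothetical GN of the publisher of A and C.

// B2 loads first.
$GN1$.B = function() {
  var module = {};
  module.version = "2";
  return module;
}();

// C1 loads next and keeps a reference to B2 inside its module.
$GN2$.C = function() {
  var module = {};
  var B = $GN1$.B;
  module.bVersion = function() { return B.version; };
  return module;
}();

// B1 loads and overwrites B2 at the same location in the GN.
$GN1$.B = function() {
  var module = {};
  module.version = "1";
  return module;
}();

// A loads last and keeps references to B1 and C1.
$GN2$.A = function() {
  var module = {};
  var B = $GN1$.B;
  var C = $GN2$.C;
  module.bVersion = function() { return B.version; };
  module.cUsesB = function() { return C.bVersion(); };
  return module;
}();

console.log($GN2$.A.bVersion());  // 1
console.log($GN2$.A.cUsesB());    // 2
```

Although only one version of B is visible in the GN once loading has finished, A and C1 each still hold the version they referenced at load time.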

3) GN naming scheme

Use a pair of dollar signs to denote a GN, one at the start of the name and one at the end. The string between the dollar signs should represent a domain name of the publisher with the dots replaced by underscores, eg $web17_com_au$ would represent the GN for the publisher who owns the web17.com.au domain name.
Rationale:
If there is a proliferation of publishers creating javascript projects and libraries that can be used by other publishers, the likelihood of the same GN being used by different publishers increases; a modified domain name will reduce this risk.
The dollar signs signify that the variable is a GN and is part of the packaging system rather than just another variable in a program.
It also encourages the use of shortcuts. The GN should not be interspersed throughout the code in the project. It should be referenced by a shortcut at the top of a module or project.
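The naming rule can be written down as a one-line transformation. The function name 'gnName' is hypothetical. Note that the sketch only replaces dots; a domain name containing a hyphen would not yield a valid javascript identifier, a case this draft does not address.

```javascript
// Derive a GN name from a publisher's domain name:
// dots become underscores and the result is wrapped in dollar signs.
function gnName(domain) {
  return "$" + domain.replace(/\./g, "_") + "$";
}

console.log(gnName("web17.com.au"));  // $web17_com_au$
```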

OTHER SUGGESTIONS

1) Starting a GN

To start a GN, do:
var $GN$ = $GN$ || {};
This statement should be outside of any function. You do not create FGN's. By definition, they have been created for you by the dependency.
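The point of the '|| {}' form is that the statement can safely appear in more than one file: whichever file loads first creates the GN, and later files reuse it. A minimal sketch, in which the two "files" are simulated one after the other and the project names and 'name' properties are made up:

```javascript
// --- first file to load ---
var $GN$ = $GN$ || {};            // $GN$ does not exist yet, so {} is used.
$GN$.project1 = { name: "project1" };

// --- second file to load ---
var $GN$ = $GN$ || {};            // $GN$ exists, so it is kept as it is.
$GN$.project2 = { name: "project2" };

console.log($GN$.project1.name);  // project1
console.log($GN$.project2.name);  // project2
```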

2) Modules

A suggested module pattern:
$GN$.project1 = function() {
 var module={};
 var shortcut1 = $FGN1$.path.to.project.Object1;
 var shortcut2 = $FGN2$.path.to.project.Object2;
 ...

 // Private module members.
 var private_var1 = ...;

  ...

 // Public module members.

 module.Object1 = function(...) {...}
 module.function1 = function(...) {...}
 ...

 // Return 'module' to build $GN$.project1 namespace.
 return module;
}();
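Here is a filled-in version of the same pattern that can actually be run. It is only a sketch: $FGN1$ is stubbed out so the example is self-contained, and 'twice', 'count', 'increment' and 'doubledCount' are invented names.

```javascript
// Stand-in for a dependency's GN; normally the dependency creates this.
var $FGN1$ = { path: { to: { project: {
  twice: function(n) { return n * 2; }
} } } };

var $GN$ = $GN$ || {};

$GN$.project1 = function() {
  var module = {};
  var twice = $FGN1$.path.to.project.twice;  // Shortcut, set at load time.

  // Private module members.
  var count = 0;

  // Public module members.
  module.increment = function() {
    count += 1;
    return count;
  };
  module.doubledCount = function() {
    return twice(count);
  };

  // Return 'module' to build $GN$.project1 namespace.
  return module;
}();

$GN$.project1.increment();
$GN$.project1.increment();
console.log($GN$.project1.doubledCount());  // 4
console.log(typeof $GN$.project1.count);    // undefined
```

The private variable 'count' can only be reached through the public functions; it is not a property of $GN$.project1.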

3) Extending modules - submodules

Suppose you have built the module in point 2). You can extend the module (maybe in a separate file) like so:
$GN$.project1.area1 = function() {
 ...
}();
Notes:
  • this file would need to come after the first file defining $GN$.project1 at load time
  • A new module is being created at project1.area1 - ie a submodule; this module clearly cannot see any private data in the parent module.
  • care needs to be taken that area1 doesn't overwrite a namesake defined in the project1 module.
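One possible way to take that care is to check for a namesake before attaching the submodule. This guard is a suggestion of this sketch, not a requirement of the draft, and the 'function1' members are placeholders.

```javascript
var $GN$ = $GN$ || {};

// First file: the parent module.
$GN$.project1 = function() {
  var module = {};
  module.function1 = function() { return "parent"; };
  return module;
}();

// Second file, loaded afterwards: the submodule.
// Refuse to continue if the parent module already defines 'area1'.
if ($GN$.project1.area1 !== undefined) {
  throw new Error("$GN$.project1.area1 is already defined");
}
$GN$.project1.area1 = function() {
  var module = {};
  module.function1 = function() { return "submodule"; };
  return module;
}();

console.log($GN$.project1.function1());        // parent
console.log($GN$.project1.area1.function1());  // submodule
```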
Example - not so good:
$GN$.project1 = function() {
  ...
}();
$GN$.project1.Object2 = function() {
  var shortcut1 = $FGN1$.project2;  // Not so good.
  ...
}
In this example, the project1 module is augmented with a new object, not a submodule. An attempt is made to create a shortcut within Object2. Unfortunately, this object can be instantiated at a later time, when $FGN1$.project2 might have been overwritten with a different version. Granted, this is probably not that likely.
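If this matters for your project, one possible alternative is to wrap the constructor in an immediately executed function, so that the shortcut is created at load time rather than at instantiation time. The sketch below is an assumption about how that could look; $FGN1$ is stubbed, and the 'version' and 'dependencyVersion' properties are invented.

```javascript
// Stand-ins so the sketch is self-contained.
var $FGN1$ = { project2: { version: "1" } };
var $GN$ = $GN$ || {};
$GN$.project1 = {};

$GN$.project1.Object2 = function() {
  var project2 = $FGN1$.project2;  // Shortcut created once, at load time.
  // The returned function is the actual constructor.
  return function() {
    this.dependencyVersion = project2.version;
  };
}();

// Later, $FGN1$.project2 is overwritten with a different version.
$FGN1$.project2 = { version: "2" };

var obj = new $GN$.project1.Object2();
console.log(obj.dependencyVersion);  // 1
```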

4) Layout of project

The following might represent the file structure for a project:
/
/ext
/ext/projectA-0.2.js
/ext/projectB-1.1.2.js
/globals.js
...
/module1
/module1/module1.js
/module1/submodule1.js
/module1/submodule2.js
...
/module2/module2.js
...
  • The above layout uses a directory called 'ext' to store dependencies - ie other projects that are used by this project.
  • The dependencies are packaged with their version numbers displayed so they are obvious to the maintainer.
  • The globals.js file sets up the $GN$ for the project including the relevant part of the $GN$ which this project will handle. Eg
    var $GN$ = $GN$ || {};
    $GN$.module1={};  // Initialize the module for this project.    
  • A large project might create several modules and submodules; it may make sense to provide a subdirectory for each module and its submodules.
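When a project laid out like this is packaged, its files have to be serialised in load order. The sketch below shows one plausible ordering for the layout above: dependencies first, then globals.js, then each module followed by its submodules. The relative order of projectA and projectB is a guess, and 'serialise' and 'fakeRead' are hypothetical helpers; a real build script would read the files from disk and probably minify the result.

```javascript
// One plausible load order for the layout above (an assumption).
var files = [
  "/ext/projectA-0.2.js",
  "/ext/projectB-1.1.2.js",
  "/globals.js",
  "/module1/module1.js",
  "/module1/submodule1.js",
  "/module1/submodule2.js",
  "/module2/module2.js"
];

// Concatenate the contents of the files in the given order.
// 'readFile' is passed in so the sketch does not depend on a file system.
function serialise(paths, readFile) {
  var parts = [];
  for (var i = 0; i < paths.length; i++) {
    parts.push(readFile(paths[i]));
  }
  return parts.join("\n");
}

// Stand-in reader that returns a one-line placeholder for each file.
function fakeRead(path) {
  return "// contents of " + path;
}

var packageSource = serialise(files, fakeRead);
console.log(packageSource.split("\n").length);  // 7
```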