Wednesday, August 25, 2010

Javascript prototypes, properties and lexical closures

Javascript and Sharing lexical state

Intro

In this article I want to look at ways to share lexically scoped variables (state) with public prototype functions and properties (as defined via defineProperty) and compare these to the better-known pattern of using privileged functions.

Encapsulation

Encapsulation gives you at least 2 things:

  • by minimizing what you expose to the outside world, you make it easier to change the implementation of an object (or closure) without unexpected side-effects and ramifications
  • by hiding the state and internals of an object from the outside, you are encouraged to define a clear interface, which makes it easier to reason about how your object interacts with others and, by extension, about the overall project that uses it.

You get encapsulation via "objects"...

Closures and objects

In the joy of javascript article I showed that you don't need the new keyword in javascript to do encapsulation. You can achieve it simply enough using closures. This is a technique that has probably been around since lexical scoping was introduced into languages like scheme and dialects of lisp (including the precursors to Common Lisp).

Closures can be thought of as objects:

  function makeFoo() {
    var me = {};
    var secret = 'a';   // private: visible only inside makeFoo
    me.method1 = function() {
      // ... do something with 'secret' ...
    };
    return me;
  }
  var m = makeFoo();
  m.method1();
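
To make the placeholder concrete, here is a minimal sketch of the same shape in which the hidden state is a counter. The names makeCounter, increment and current are hypothetical and chosen only for illustration:

```javascript
// Hypothetical example: 'count' lives only in the closure created by makeCounter.
function makeCounter() {
  var me = {};
  var count = 0;                 // private: never attached to 'me'
  me.increment = function() {    // privileged: closes over 'count'
    count += 1;
    return count;
  };
  me.current = function() {
    return count;
  };
  return me;
}

var c1 = makeCounter();
var c2 = makeCounter();
c1.increment();
c1.increment();
c2.increment();
// Each call to makeCounter creates a fresh 'count', so c1 and c2 do not
// interfere; c1.count is undefined because 'count' was never exposed.
```

The only way to reach count from outside is through the two functions that makeCounter chose to hand out.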

This can be re-arranged to use javascript's new keyword to instantiate objects:

  function Foo() {
    var me = this;
    var secret = 'a';   // private: visible only inside Foo
    me.method1 = function() {
      // ... do something with 'secret' ...
    };
  }
  var m = new Foo();
  m.method1();

  • Foo behaves with lexical scope just like makeFoo, but by using the new keyword, we get an object that has access to javascript's prototype system.
  • To make use of the prototype system we need to set the prototype property of the Foo constructor.
  • Prototype function definitions use the this keyword to refer to the instantiated object that they are being called on
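
As a rough sketch of the last two points (the name and describe members are hypothetical, added only for illustration), a prototype function is attached to Foo.prototype and reaches the instance through this:

```javascript
// Hypothetical example: 'name' and 'describe' are illustrative names.
function Foo(name) {
  this.name = name;              // public state, reachable through 'this'
}

// Defined once on the prototype rather than inside the constructor.
Foo.prototype.describe = function() {
  return 'I am ' + this.name;    // 'this' is the instance the method is called on
};

var a = new Foo('a');
var b = new Foo('b');

// The method is found via the prototype chain, not on the instance itself.
var ownsDescribe = a.hasOwnProperty('describe');
```

Here ownsDescribe is false: both a and b find describe by following their link to Foo.prototype.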

Panning prototypes

In an earlier article I panned the javascript prototype system because it encouraged exposing private state to elements outside of the object.

Some points to note from this:

  • prototype methods cannot access the lexical scope of an object
    • using the above example, Foo.prototype.* cannot access the local variables defined within the braces that define Foo itself (ie function Foo() {...})
  • this means that if you want to use prototype methods and manipulate lexical state you need to expose it via the this keyword
  • some people may be happy with this and simply expose private state where needed by adding it as a property to this eg this.privateVar
  • others might adopt a convention where private variables might be prefixed with '_'; so this._privateVar
  • personally, I think it is still a little too easy to use such state and I prefer the ability to encapsulate it inside a lexical closure
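
The points above can be sketched as follows. This is only an illustration; readSecret, readExposed and _exposed are hypothetical names:

```javascript
function Foo() {
  var secret = 'a';              // lexical: visible only inside this function body
  this._exposed = 'b';           // public, despite the '_' naming convention
}

Foo.prototype.readSecret = function() {
  // 'secret' is not in scope here. Using typeof avoids the ReferenceError
  // that a bare reference to 'secret' would throw.
  return typeof secret;
};

Foo.prototype.readExposed = function() {
  return this._exposed;          // works, but only because the state is public
};

var foo = new Foo();
foo._exposed = 'tampered';       // nothing stops outside code from doing this
```

The prototype function sees no variable called secret at all, while the state it can see is equally visible (and writable) to everyone else.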

Is it possible to share lexical state of an object with a prototype?

Sharing lexical state with prototype functions and properties

In the example below, we share lexical state information (information that is encapsulated within a lexical closure) with an object's prototype or properties.

  • The cost of doing this however is that each prototype function or property is defined on a per-instance basis
  • I personally don't have trouble with this; but one reason people use prototypes is that they can define prototype functions once and share across all instances
    • as mentioned, the cost you pay for doing this is that in order for such functions to work with the state of the object, it has to be exposed via the this keyword eg this.privateVar

          var F,f,fa,makeF;

          F = function() {
              this.pubval = 'mess with me!';  // public value
              
              var secret1 = 'secret-1';  // privileged function based
              var secret3 = 'secret-3';  // defineProperty based
              this.secret1 = function() { return secret1; }
              
              Object.defineProperty(this,"secret3", {
                  get: function() { return secret3; },
              });
          }

          makeF = function(secrets) {
              F.prototype = {
                  secret2: function() { return secrets.secret2; },
              };
              Object.defineProperty(F.prototype,"secret4", {
                  get: function() { return secrets.secret4; },
              });
              return new F();
          }

          f = makeF({secret2:'secret-2',secret4:'secret-4',});
          fa = makeF({secret2:'secret-2a',secret4:'secret-4a',});
          f.secret1();   // 'secret-1'
          f.secret2();   // 'secret-2'
          fa.secret2();  // 'secret-2a'
          f.secret4;     // 'secret-4'
          fa.secret4;    // 'secret-4a'
          f.secret3;     // 'secret-3'

Breaking this down:

  • F is a constructor which we'll instantiate (new F())
  • makeF provides a closure into which we can pass secret state information; makeF is used to instantiate instances of F; we don't call new F() directly
  • secret1 is accessed using the standard privileged function approach; this.secret1 is a per-instance function
  • secret2 is accessed via a prototype function; note however that the prototype for any instance of F will be different and will have access to a different set of secrets (passed in to makeF); it is per-instance.
  • secret3 is similar to secret1 but uses a relatively new construct, accessor properties defined via Object.defineProperty; this allows us to access secret3 without having to explicitly call a function
    • also note that since we only define get for this property it is read-only; an attempt to write to it is silently ignored in non-strict code and throws a TypeError in strict mode
  • secret4 is a property of F.prototype
  • also note:
    • that we have to define our prototype for F within the lexical scope of makeF. We can't define it elsewhere and instantiate it within makeF. This could be a limiting factor for people who like to compose prototypes.
    • that we can readily define privileged property secret3 inside F's lexical enclosure; we can't do this with F.prototype.
      • update [27-Aug-2010]
        • Actually we can in some instances. Some javascript implementations support the __proto__ property; this property is applied to instances of a constructor and points to its prototype.
            var F = function() {
               var secret5 = 'secret-5';
               // ...
               // Set this instance's prototype within the lexical enclosure of F:
               this.__proto__ =  {
                   secret5: function() { return secret5; },
               };
               // ...
            };
            var f = new F();
            f.secret5();
          
        • What's a little unusual in the above is that we are replacing the prototype of one specific instance of F via __proto__, without touching F.prototype itself; (the resulting prototype is still obviously on a per-instance basis as previously noted).
        • __proto__ is effectively giving us privileged prototypes
        • The above approach actually worked when I tested it on a recent version of Google Chrome and Firefox 3.6.
    • defineProperty is not supported in Firefox 3.x; it may be available in firefox 4.
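
How a write to a getter-only property fails depends on the mode of the code doing the writing. A minimal sketch, reusing the secret3 property from the example; strictWrite is a hypothetical helper that exists only to perform the assignment in strict mode:

```javascript
function F() {
  var secret3 = 'secret-3';
  Object.defineProperty(this, 'secret3', {
    get: function() { return secret3; }   // no 'set', so the property is read-only
  });
}

// Hypothetical helper: attempts the write in strict mode and reports the result.
function strictWrite(obj) {
  'use strict';
  try {
    obj.secret3 = 'changed';
    return 'no error';
  } catch (e) {
    return e instanceof TypeError ? 'TypeError' : 'other error';
  }
}

var f = new F();
var outcome = strictWrite(f);   // strict mode: the assignment throws a TypeError
// In non-strict code the same assignment is silently ignored.
// Either way, the value seen through the getter is unchanged.
```

After the attempted write, f.secret3 still reads 'secret-3'.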

Shared prototype functions and properties that are also privileged

Is it possible to have prototype functions or properties that have privileged access to lexical state information but which are shared across all instances of F?

I have not found a satisfactory work-around.

Part of the problem is that the very nature of lexical scope itself forces any privileged functions to be defined and created along with the state for a given instance of an object. If they are defined outside of the object's lexical state, they will not have access to it and so are not privileged.
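
The trade-off can be seen by comparing function identity across instances. A minimal sketch; Privileged and Shared are hypothetical constructors used only for illustration:

```javascript
function Privileged() {
  var secret = 'hidden';
  // A new function object is created on every call to the constructor,
  // because each one must close over that call's own 'secret'.
  this.get = function() { return secret; };
}

function Shared() {
  this.value = 'exposed';        // state has to be public for the prototype to see it
}
// Created once, outside any instance's lexical scope.
Shared.prototype.get = function() { return this.value; };

var p1 = new Privileged(), p2 = new Privileged();
var s1 = new Shared(), s2 = new Shared();

var privilegedIsShared = (p1.get === p2.get);  // false: one function per instance
var prototypeIsShared = (s1.get === s2.get);   // true: one function for all instances
```

You can have a function that is shared or a function that is privileged, but the same function object cannot close over several instances' private variables at once.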

Flawed example

In this example

  • we sequence each instance of F with a unique id
  • we store all private state for all instances of F in an associative array secrets using the id as key
  • One obvious flaw in this approach is that the secrets object will simply keep growing. If an instance of F is garbage collected, its state will remain in secrets.
  • We employ an anonymous closure [(function(){...}())] over secrets, F and F.prototype. We repeat this pattern as needed wherever we want to share secrets between these 3 elements.
  • I do not recommend the use of secrets here for storing state; I am just looking for work-arounds

          var F,f;
          
          (function(){
              var id=0;
              
              // This object will just keep growing as we
              // create more instances of F(!!)
              var secrets={};
              
              var seqf = function(){return ++id;}
              
              F = function() {
                  var id = seqf();
                  secrets[id] = {
                      secret1:'secret-1',
                      secret2:'secret-2',};
                  Object.defineProperty(this,'id',{
                      get: function() {return id;},
                  });
              }
              
              F.prototype = {
                  secret1: function() {
                      return secrets[this.id].secret1; },
              };
              
              Object.defineProperty(F.prototype,'secret2',{
                  get: function() {return secrets[this.id].secret2},
              });
              
          }());
          
          f = new F();
          f.secret1();
          f.secret2;
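
A possible way around the growth problem, assuming a javascript engine that provides WeakMap (standardized in ES2015, so not available in the browsers discussed in this article), is to key the private state on the instance itself. This is a swapped-in technique rather than a fix to the code above, and G is a hypothetical constructor name:

```javascript
var G = (function() {
  // Keys are held weakly: once an instance is unreachable, its entry
  // becomes eligible for garbage collection along with it.
  var secrets = new WeakMap();

  function G() {
    secrets.set(this, { secret1: 'secret-1', secret2: 'secret-2' });
  }

  // Shared across all instances, yet able to reach per-instance state.
  G.prototype.secret1 = function() {
    return secrets.get(this).secret1;
  };

  Object.defineProperty(G.prototype, 'secret2', {
    get: function() { return secrets.get(this).secret2; }
  });

  return G;
}());

var g = new G();
var g2 = new G();
```

Unlike the id-based version, no public id property is needed, since the instance itself is the lookup key.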

Non-lexical (non-privileged), "dynamic" sharing of secret information

One way to share secret information between an Object and a non-privileged function is to get the object to call the function and pass it some private state. An explicit parameter would need to be passed.

      var m = function(o /*, ... */) { /* ... do something with 'o' ... */ };
      var F = function() {
        // ...
        var secrets = { /* ... */ };
        this.f = function() {
          // ...
          m(secrets);
          // ...
        };
      };
      var f = new F();

In the above o is some sort of object that we pass to m for m to do its work. F defines secrets which is used by the privileged function when invoking m.

This actually defers the problem. We still need a privileged function this.f in order to have access to secrets which we then pass to m.

Javascript allows us to do this without having to specify o directly. We omit o:

      var m = function(/* ... */) { /* ... do something with 'this' ... */ };

Then we call m using javascript's call:

      m.call(secrets /*, ... */);

where secrets is some lexically bound object that we pass to m. The trouble with this is that nothing in m's definition indicates that it must be invoked via call. If we invoke it directly, this is bound according to how m was called: to the object it was called on as a method, or to the global object (undefined in strict mode) for a plain function call. This may be acceptable, but remember that the thing I'm trying to investigate here is sharing privileged or hidden state defined within the lexical scope of an object, not the object's public or privileged methods and fields.
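
A minimal sketch of the call-based approach follows. The items field and the extra parameter are hypothetical, added only so that m has something to compute:

```javascript
// m is defined outside F and knows nothing about F's lexical scope.
var m = function(extra) {
  // 'this' is whatever object the caller supplies via call()
  return this.items.length + extra;
};

var F = function() {
  var secrets = { items: ['x', 'y'] };   // lexically private to this instance
  this.f = function() {                  // still a privileged function
    return m.call(secrets, 1);           // bind 'this' to secrets for this call only
  };
};

var f = new F();
var viaCall = f.f();                     // 2 items + 1

// Invoking m directly leaves 'this' bound to something without 'items',
// so reading this.items.length throws.
var plainCallFails;
try {
  m(1);
  plainCallFails = false;
} catch (e) {
  plainCallFails = true;
}
```

Note that this.f is still needed to hand secrets over, so the privileged function has not gone away; m has merely been moved out of the closure.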

Other languages:

  • ruby provides ways to define and rebind a method so that it has access to instance variables within an existing object instance; however ruby makes a precise distinction between instance variables and local variables which isn't so much the case with javascript
  • common lisp provides
    • dynamic scope (usually referred to now as 'special scope') which might be one way to get around the strictures of lexical scope and allow access to variables bound dynamically within a function but which may lead to subtle bugs and interactions - this is one of the reasons lexical scope is now the standard scoping mechanism
    • free variables; a technique I'm not familiar with but which (via lisp's macro system) might also be used to capture information

Wednesday, July 21, 2010

Using Rhino Javascript with Java

Javascript, Rhino and Java

This article looks at

  • installing rhino
  • using rhino to extend interfaces and abstract classes
  • part of the motivation for this article is that I want to show in future articles a way to build a repl on top of rhino's shell and use it to explore java technologies such as neo4j in an interactive and exploratory way
  • this article is just covering the basics

I recently got into rhino and I've found it to be not only a lot of fun but also a useful way to explore the java universe. This is a universe that is populated by such things as neo4j, apache software such as apache derby, lucene, and of course xml technologies such as existdb discussed recently here and webservices and semantic web technologies such as openrdf and so on.

There are many ways to explore this universe without having to write java and besides using rhino. So is there a reason to use rhino over jruby, or clojure or scala etc? I can't give an objective answer, but some points:

  • I'm attracted to javascript because it is a very simple language; a bit like a stripped down version of scheme without the fun stuff but with a far more familiar C-like syntax and, of course, closures.
  • rhino and some other javascript variants also have an interesting extension that makes xml a native datatype; this is not something I'll explore here but makes it an interesting fit for xml-centric java technologies; rhino has e4x built in and ready to go
  • javascript is quite possibly the most well-known language on the planet thanks to its close ties with the web and html
  • javascript can provide a scripting glue for coordinating java code where the latter performs the heavy lifting; this is a familiar paradigm that goes at least as far back as Unix itself with its C libraries and shell interpreters. Hence we can do things like run tests or engage in exploratory programming using rhino overlaid on a java application
    • of course, this last point applies to any language that runs on the jvm such as jruby etc
  • I am not sure if all other languages that run on the jvm can extend an abstract class; this is definitely something that rhino can do

Installing Rhino

  • I recommend installing rhino from the people who made it rather than the version that comes with more recent versions of Sun's (now Oracle's) java

    • one key difference is that the mozilla version of rhino allows you to extend abstract classes and implement multiple interfaces; the current version from sun allows you to implement only one interface
      • to quote this article:

        "Rhino's JavaAdapter has been removed. JavaAdapter is the feature by which a Java class can be extended by JavaScript and Java interfaces may be implemented by JavaScript. This feature also requires a class generation library. We have replaced Rhino's JavaAdapter with Sun's implementation of the JavaAdapter. In Sun implementation, only a single Java interface may be implemented by a JavaScript object."

  • Of course wanting to do more than implement a java interface is probably rare - to be honest I don't really know. But just for kicks I'm going to show an example of extending an abstract class in this article.

  • I've been using rhino1_7R2 which is the latest version at the time of writing
    • the download page is here
    • you just have to download the zip file and unpack it somewhere
    • once you've unpacked it, you can go in and
      • look at the javadocs
      • look at the source
    • but all you really need is one jar file, js.jar which should be in the root directory; that's all you need to get rhino going (besides a working implementation of java 6)
      • ignore the js-14.jar ; it lacks some features of js.jar (I can't remember what they are)

Setting up in bash (on linux)

  • Here are some bash shell script variables and functions for running rhino
    • JAVAOPTS=-Djava.net.preferIPv4Stack=true
      • this is to force java to ipv4; for some reason on my system, it defaults to ipv6 regardless of settings I have made in /etc/java-6-sun; this has nothing to do with rhino and you can safely exclude it if you don't suffer this problem; or use it to pass other options
    • RHINO_CLASSPATH=/home/danb/javascript/rhino/rhino1_7R2
      • this is my path to where I unpacked the rhino zip file which you will need to change to your path for your setup
  • This bash function sets up a basic rhino shell
    js() {
         rlwrap java $JAVAOPTS -cp "$RHINO_CLASSPATH/js.jar" \
                  org.mozilla.javascript.tools.shell.Main "$@"
    }
    
  • I usually have the variables and functions like the above in a file that I source into my interactive shell:
    bash> . rhino-utils.sh
    
  • Java recommends you pass your classpaths via the -cp switch to java (rather than using environment CLASSPATH variable)
    • one gotcha is that when including jar files you must specify the name of the jar file, or include a wildcard asterisk; simply saying $RHINO_CLASSPATH by itself is not enough to pick up jar files (only .class files)
  • rlwrap is an optional extra I throw in to give the rhino shell readline capabilities similar to bash.
    • apparently rhino can be run with jline - but this is something I haven't tried yet

Test it

  • Create a file: test.js
    • in it type:
      print('hello world!');
      
  • Run it
    bash> js test.js
    hello world!
    
    • it may take a while for the script to run as java starts up; but you should see "hello world!"
  • You may be wondering where 'print' comes from; it is part of a bunch of rhino utilities that are available to you when running the rhino interpreter

Error messages

  • When you get errors running the rhino shell, you may find the error messages unhelpful and distinctly lacking with regard to line numbers
  • This can be remedied by adding a "-debug" switch to your invocation

Compiling

  • Rhino can compile javascript into java bytecode
  • This will not yield the same type of performance as compiled java however so don't get too excited.
  • Add the following functions to your bash file and re-source it into your shell:
    compile() {
      java $JAVAOPTS -cp "$RHINO_CLASSPATH/js.jar" \
        org.mozilla.javascript.tools.jsc.Main "$@"
    }
    run() {
      java $JAVAOPTS -cp ".:$RHINO_CLASSPATH/js.jar" "$1"
    }
    
  • Now compile the previous test.js file
    bash> compile test.js
    
    • this should produce a file called test.class
  • Now run it with
    bash> run test
    hello world!
    

Implementing an interface

  • Java interfaces are one of the most powerful features of the java language; most good java apis make specific use of them, and rhino makes it extremely easy to implement them.
  • The classic example given is to implement a Runnable interface for a java thread
  • Rhino makes this as easy as instantiating the interface like an object and passing a javascript object literal with the required function definitions as members:
        var Runnable = java.lang.Runnable;
        var Thread = java.lang.Thread;
        var System = java.lang.System;
        var coffee = {run: function() {
                             Thread.sleep(2);
                             for(var i=0;i<10;i++) System.out.println("I like coffee!");} };
        var tea    = {run: function() {
                             for(var i=0;i<10;i++) System.out.println("I like tea!");} };
        var r1 = new Runnable(coffee);
        var r2 = new Runnable(tea);
        var t1 = new Thread(r1);
        var t2 = new Thread(r2);
        t1.start();
        t2.start();
    
    • In the above example we create 2 threads and start them (using the rhino shell)
    • Notes
      • I've delayed the first thread a little
      • we pass coffee and tea (javascript object literals) to Runnable, a java interface
      • we defined the java classes we want to use upfront; in this case: Runnable, Thread and System
        • to be more precise we could have said:
            var Runnable = Packages.java.lang.Runnable;
          
          • "Packages" is a global provided by rhino to access java packages
            • java.* packages (and possibly some others) are made available without having to use "Packages"
          • Be aware that there are other ways to import packages and specific classes into rhino's global namespace such as importPackage function and JavaImporter constructor
  • When we instantiate Runnable (in javascript) what's really happening is this:
     var r1 = new JavaAdapter(Runnable,coffee);
    
    • rhino is just giving us a nice shorthand for making use of its JavaAdapter
  • It's worth taking 5 minutes to read this article as it sets out a lot of the basics about how rhino javascript interoperates with its java host

Jetty and java servlets: Extending an abstract class

We are going to create a HelloWorld servlet in rhino.

I am not suggesting rhino should be used to create servlets. This is merely an example I worked on whilst playing with jetty; and information on extending abstract classes (a necessity for servlets, for instance) is rather sparse.

  • jetty is an interesting platform; some of my motivations for looking at it are:
    • support for continuations
    • ability to run multiple containers under one jvm
      • this isn't a special feature; however it is not trivial, for instance, to run several small independent sites on one rails app (a ruby process)
    • hot deployment class loading
    • an apparently good api for asynchronous web service calls
  • documentation for these things is not exactly lavish; these were just points of interest that I noticed and I may even be incorrect on their status

  • I am going to use the latest jetty, version 7

    • it is part of the eclipse stable
    • the version I ended up downloading was jetty-hightide-7.1.4.v20100610.tar.gz
    • all the jar files you need are in the lib directory
  • First we build the servlet; create a HelloServlet.js file with one method:

        // HelloServlet.js file
    
      function doGet(httpServletRequest, httpServletResponse) {
          httpServletResponse.setContentType("text/plain");
          var out = httpServletResponse.getWriter();
          out.println("Hello World - we're written in javascript!");
          out.close();
      }
    
  • this method will override the doGet method provided by the abstract http servlet class.
  • Now create a second file, launch.js
        // launch.js file
    
      var Server = org.eclipse.jetty.server.Server;
      var SocketConnector = org.eclipse.jetty.server.bio.SocketConnector;
      var ServletHandler = org.eclipse.jetty.servlet.ServletHandler;
      var server,connector,handler;
    
      server = new Server();
      //server.addListener(":8070");  // Not in Jetty 7
    
      connector = new SocketConnector();
      connector.setPort(8080);
    
      //server.setConnectors(new Connector[] { connector });  // java version
      server.setConnectors([ connector ]);  // js version
    
      handler = new ServletHandler();
      server.setHandler(handler);
    
      handler.addServletWithMapping("HelloServlet","/");
    
      server.start();
      server.join();
    
  • notes

    • we're embedding jetty into our rhino app
    • there appear to be some changes between version 6 and 7 of jetty; this code is for version 7
    • I source the java objects I need from jetty into my rhino context at the beginning; these are Server, SocketConnector, ServletHandler.
    • note how the 'setConnectors' call uses a js array rather than a java array
    • this example is based on a minimal embedded example provided in jetty7's source;
      • you may need to download the source separate to the above tarball;
      • once unpacked the example is at:
        • jetty/example-jetty-embedded/src/main/java/org/eclipse/jetty/embedded
    • addServletWithMapping expects the name of a compiled class; so we will compile our HelloServlet.js servlet above
  • the following is the bash script you could use to first compile and then run jetty using the 2 files above; you will need to modify JAVAOPTS, JETTY_CLASSPATH and RHINO_CLASSPATH:

      JAVAOPTS=-Djava.net.preferIPv4Stack=true
      JETTY_CLASSPATH=/home/danb/javascript/rhino/jetty/jetty/jetty-hightide-7.1.4.v20100610/lib/
      RHINO_CLASSPATH=/home/danb/javascript/rhino/rhino1_7R2
    
    jetty.compile() {
      java $JAVAOPTS -cp .:$JETTY_CLASSPATH/\*:$RHINO_CLASSPATH/js.jar \
        org.mozilla.javascript.tools.jsc.Main \
        -extends javax.servlet.http.HttpServlet \
        HelloServlet.js
    }
    
    jetty.launch() {
      java $JAVAOPTS -cp .:$JETTY_CLASSPATH/\*:$RHINO_CLASSPATH/js.jar \
        org.mozilla.javascript.tools.shell.Main -debug launch.js
    }   
    
  • notes

    • you may need to escape asterisks in your classpaths to prevent the shell from expanding them
    • your shell might not support dots in shell function names, so replace as required
  • compile the servlet

    bash> jetty.compile
    
    • this should produce 2 class files
  • then launch it
    bash> jetty.launch
    
  • head over to http://localhost:8080 and you should see
    Hello World - we're written in javascript!
    

Concluding Remarks

Rhino is a lot of fun. If you want to play with some java tech but don't want the verbosity of java, rhino is a very flexible means to do this. Rhino imports the scheme-like simplicity of javascript straight into java along with a shell to help you explore it interactively. I hope to explore the shell and repl features for interacting with java in a later post.