kopia lustrzana https://github.com/c9/core
166 wiersze
6.8 KiB
Markdown
166 wiersze
6.8 KiB
Markdown
|
treehugger.js
|
||
|
=============
|
||
|
|
||
|
`treehugger.js` is a Javascript library for program processing. It
|
||
|
has generic means to represent and manipulate (analyze, transform)
|
||
|
[abstract syntax trees (ASTs)](http://en.wikipedia.org/wiki/Abstract_syntax_tree).
|
||
|
It consists of three parts:
|
||
|
|
||
|
* A generic ASTs representation format, inspired by [ATerms](http://www.meta-environment.org/Meta-Environment/ATerms))
|
||
|
that can be used to represent programs written in any language (Java, Ruby, Javascript)
|
||
|
`lib/treehugger/tree.js`
|
||
|
* A set of generic traversals to query, manipulate and annotate these ASTs,
|
||
|
inspired by [Stratego/XT](http://strategoxt.org)
|
||
|
`lib/treehugger/traverse.js`
|
||
|
* A set of analyses for specific languages. Currently:
|
||
|
- Javascript: a ([UglifyJS](https://github.com/mishoo/UglifyJS)-based) parser
|
||
|
and analyses reconstructing the type structure and first attempts at type-inference.
|
||
|
`lib/treehugger/js/*.js`
|
||
|
|
||
|
The project relies on [require.js](http://requirejs.org) for library loading.
|
||
|
|
||
|
AST Representation
|
||
|
------------------
|
||
|
|
||
|
ast.js uses a few simple data structures to represent ASTs and a textual representation
|
||
|
that makes it easy to debug and write these ASTs. The textual representation
|
||
|
is best introduced by example.
|
||
|
|
||
|
Consider a simple expresion language with expression as follows:
|
||
|
|
||
|
2 + 3 * 1
|
||
|
|
||
|
A parser could turn this into the following AST:
|
||
|
|
||
|
Add
|
||
|
/ \
|
||
|
Num Mul
|
||
|
| / \
|
||
|
2 Num Num
|
||
|
| |
|
||
|
3 1
|
||
|
|
||
|
Which is expressed using treehugger.js's textual representation as follows:
|
||
|
|
||
|
Add(Num("2"), Mul(Num("3"), Num("1")))
|
||
|
|
||
|
Using the `treehugger.js` API this AST can be contructed as follows:
|
||
|
|
||
|
var tree = require('treehugger/tree');
|
||
|
var node = tree.cons("Add", [tree.cons("Num", [tree.string("2")]),
|
||
|
tree.cons("Mul", [tree.cons("Num", [tree.string("3")]),
|
||
|
tree.cons("Num", [tree.string("1")])])]);
|
||
|
|
||
|
Or, more simply:
|
||
|
|
||
|
var node = tree.parse('Add(Num("2"), Mul(Num("3"), Num("1")))');
|
||
|
|
||
|
treehugger.js has three kinds of AST node types:
|
||
|
|
||
|
* Strings (e.g. `"2"`, `"myVariable"`), usually representing identifiers or other
|
||
|
textual values.
|
||
|
* Lists (e.g. `["a", "b", None()]`)
|
||
|
* Constructors (or simply: cons nodes) (e.g. `None()` or `Num("2")`), used to represent
|
||
|
language constructs in the language, such as operators, loop constructs, etc.
|
||
|
|
||
|
|
||
|
AST Analysis
|
||
|
------------
|
||
|
|
||
|
Treehugger.js is based on the concept of _generic traversals_. A generic traversal
|
||
|
traverses the tree in a particular order, for instance from top to bottom or from
|
||
|
the bottom up. At every node the traversal comes accross you can apply one or transformations.
|
||
|
Tranformations can either be AST _patterns_, _transformation functions_ or a combination of both.
|
||
|
|
||
|
The `treehugger/traverse.js` library adds a number of methods to AST nodes that make traversals simpler:
|
||
|
|
||
|
* `collectTopDown` (traverses the tree top to bottom until finding a match, collecting all matches and returning them as a list)
|
||
|
* `traverseTopDown` (traverses the tree top to bottom until finding a match)
|
||
|
* (more are coming)
|
||
|
|
||
|
A _transformation_ can be either:
|
||
|
|
||
|
* A textual AST pattern
|
||
|
* A textual AST pattern followed by a transformation function that is passed a binding object
|
||
|
* A transformation function
|
||
|
|
||
|
Each transform either _matches_ or _doesn't_ match. If a transformation function matches,
|
||
|
it returns a non-falsy value (usually a new AST node), if it doesn't, it returns `false`.
|
||
|
A simple example:
|
||
|
|
||
|
var node = tree.parse('Add(Num("2"), Mul(Num("3"), Num("1")))');
|
||
|
node.collectTopDown("Num(_)").debug();
|
||
|
|
||
|
This will traverse the AST and look for nodes that match the Num(_) _pattern_,
|
||
|
where `_` (a wildcard) can be anything. The `collectTopDown` traversal traverses
|
||
|
the AST from top to bottom, and on the way collects all matches and returns them
|
||
|
as an `ListNode`. The result of the `collectTopDown` call in this case will be:
|
||
|
|
||
|
[Num("2"), Num("3"), Num("1")]
|
||
|
|
||
|
The `.debug()` call prints the result to the Javascript console (`console.log`).
|
||
|
|
||
|
So, what if we want to only return the numbers, not the `Num(...)` constructors?
|
||
|
If we follow a textual pattern by a function, we can transform the result:
|
||
|
|
||
|
var node = tree.parse('Add(Num("2"), Mul(Num("3"), Num("1")))');
|
||
|
node.collectTopDown("Num(n)", function(b) { return b.n; }).debug();
|
||
|
|
||
|
Instead of using the placeholder `_`, we now used `n`. The function is passed a
|
||
|
_bindings_ object whose `n` property will contain the value of the placeholder.
|
||
|
So, the following will be printed to the Javascript console:
|
||
|
|
||
|
["2", "3", "1"]
|
||
|
|
||
|
If we want to match _either_ `Num` or `Mul` nodes we can add a pattern for that
|
||
|
to the `collectTopDown` call:
|
||
|
|
||
|
var node = tree.parse('Add(Num("2"), Mul(Num("3"), Num("1")))');
|
||
|
node.collectTopDown(
|
||
|
"Num(n)", function(b) {
|
||
|
return b.n;
|
||
|
},
|
||
|
"Mul(op1, op2)", function(b) {
|
||
|
return b.op1;
|
||
|
}
|
||
|
).debug();
|
||
|
|
||
|
This will print:
|
||
|
|
||
|
["2", Num("3")]
|
||
|
|
||
|
Why is that? The AST is traversed top to bottom by `collectTopDown`. On its way, it will
|
||
|
try to match every node first against the `Num(n)` pattern. If that succeeds,
|
||
|
the function after it in the argument list is executed, if the function returns
|
||
|
a value other than `false`, the traversal stops at that branch and the result
|
||
|
is added to the result list. If the `"Num(n)"` pattern did _not_ match, it is
|
||
|
matched against `"Mul(op1, op2)"`. Again, if this pattern matches, the function
|
||
|
is executed etc. etc. The `collectTopDown` traversal will not traverse down to the
|
||
|
`Num("3")` and `Num("1")` nodes, because the traversal stopped when matching the
|
||
|
`Mul(..., ...)` node -- if, instead the tranformation function would have returned
|
||
|
`false, the traversal would have proceeded down those nodes as well.
|
||
|
|
||
|
License (MIT)
|
||
|
-------------
|
||
|
|
||
|
Permission is hereby granted, free of charge, to any person obtaining a copy of this
|
||
|
software and associated documentation files (the "Software"), to deal in the Software
|
||
|
without restriction, including without limitation the rights to use, copy, modify,
|
||
|
merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
|
||
|
permit persons to whom the Software is furnished to do so, subject to the following
|
||
|
conditions:
|
||
|
|
||
|
The above copyright notice and this permission notice shall be included in all copies
|
||
|
or substantial portions of the Software.
|
||
|
|
||
|
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
|
||
|
INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
|
||
|
PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
|
||
|
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
|
||
|
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
|
||
|
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
|
||
|
|
||
|
Demo
|
||
|
----
|
||
|
|
||
|
You can see [treehugger.js in action in this simple demo](http://ajaxorg.github.com/treehugger/test.html).
|