This document outlines modifications to how the compiler processes CommonJS require calls.
These changes are desirable in order to allow Node.js modules to be either included or excluded from the compiler.
Currently, if a project uses a native node module like fs the source code must omit the require statement (relying on node --externs) in order for the code to compile. This is undesirable because it breaks the code for the compiler. It should be possible to keep the require statement and have the compiler ignore it. This can be achieved currently by aliasing the require function (e.g var externalRequire = require) and then using externalRequire for modules that shouldn't be compiled. This is a useful but hacky solution to the problem.
In addition, it is currently impossible for a node_module which supports Closure compiler to be built at the same time as the main code as the compiler cannot find the source files from the require statement alone (e.g require("q"))
Currently, the compiler processes CommonJS require(<modname>) statements by expanding it like a macro into:
goog.require(module$<modname>)
goog.provide(module$<modname>)
This works well when <modname> is a file path (like ../foo/bar) which is also provided to the compiler at compile time.
The problem with this approach is that it assumes that the required module code will be supplied to the compiler (e.g via --js). However, the module may not always be included if:
- it is a native module where providing the source is not possible or practical like
fsorcrypto. - it is a
node_modulewhich is not compatible with Closure compiler or should be excluded for some other reason.
In these cases, the desired behaviour is that the compiler ignores these require statements and does not expand them.
The proposed modifications would:
- Introduce a new annotation
@externalRequirewhich would make the compiler ignore (not expand) the annotatedrequirestatement. - Apply a lookup algorithm specific to Node.js to resolve
node_modulesfile paths.
The lookup algorithm would be applied only if the require statement cannot be resolved (e.g because it isn't a file path). A full description of the algorithm can be found at https://nodejs.org/api/modules.html#modules_loading_from_node_modules_folders . The algorithm would:
- check if it is a native node module like
fs. If it is, return that (this would be the extern file). Else, continue. - check the parent directory of the current script for a
node_modulesdirectory. - If found, it would look for a directory within it which matches the name of the module.
- If not found, it would go up a directory and look again for
node_modules. This process would be repeated until the root of the file system is reached, where it will terminate.
If a matching directory is found, it would:
- Look for a
package.jsonfile. - If found, it would look for the
mainkey and resolve the js file referenced. This would be the file used by Closure compiler for therequirestatement. - If not found (or
package.jsondoes not have amainkey), it would attempt to resolve the filesindex.jsandindex.nodein that directory.
This change was originally desired in order to compile Node.js code without modifications to the actual source code. Currently, node projects which want to be built using closure compiler have to be pre-processed by resolving require dependencies and concatenating the output to the js files. This is suboptimal as it makes compiler errors and warnings harder to identify and provides no way to exclude certain requires. Given there is already some CommonJS support in the compiler, it makes sense to fully support this use case.
For excluding modules, the decision to introduce a new annotation was made in order to clarify the intent of the code. The alias technique described earlier works but makes it hard to understand why this is being done without additional comments. The alias technique also unnecessarily contorts the code by introducing another variable simply for the compiler.
// closure compiler needs this in order to compile
var externalRequire = require;
var crypto = externalRequire("crypto");vs
/** @externalRequire */
var crypto = require("crypto");The annotation makes it clear that it is for an external tool rather than for the code being written.
For including modules, the algorithm employed is the same as the Node.js implementation. By using the same algorithm, it makes Closure compiler much more compatible with existing node code without modifications. For example, without this algorithm, developers would need to modify their code from require("q") to directly reference the source file require("../node_modules/q/index.js").
If the policy for including node modules is introduced, the current extern behaviour for node modules could be improved. Currently, the compiler incorrectly assumes global variables for node modules matches the name of the module. For example, you can call methods on an undeclared crypto variable that will resolve to the crypto node extern file if you add --extern crypto.js. However, it is entirely possible that the developer is already using the crypto variable for something else and is using the crypto node module under the variable nodeCrypto. The compiler doesn't handle this use case well currently. The proposed policy could fix this by correctly resolving var nodeCrypto = require("crypto"); to the module.exports of the extern file crypto.js, rather than trying to piggy back off the global variable name.
Our biweekly meeting was cancelled this week, so it will be a couple of weeks before we can discuss this more.