const config = require('/path/to/file')
The main object exported by the require module is a function. When Node invokes that require() function with a local file path as the function’s only argument, Node goes through the following sequence of steps:
Resolving: To find the absolute path of the file.
Loading: To determine the type of the file content.
Wrapping: To give the file its private scope. This is what makes both the require and module objects local to every file we require.
Evaluating: This is what the VM eventually does with the loaded code.
Caching: So that when we require this file again, we don’t go over all the steps another time.
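The five steps above can be sketched as a toy loader. This is only an illustration under simplifying assumptions, not Node’s actual implementation: miniRequire and miniCache are hypothetical names, only plain .js files given by path are handled, and nested requires are resolved against the current working directory.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const vm = require('vm');

const miniCache = {}; // hypothetical cache, keyed by absolute path

function miniRequire(request) {
  // Resolving: turn the request into an absolute path.
  const filename = path.resolve(request);

  // Caching: reuse the module object of a file we have already seen.
  if (miniCache[filename]) return miniCache[filename].exports;

  // Loading: read the file content (this sketch only understands .js).
  const source = fs.readFileSync(filename, 'utf8');

  // Wrapping: put the code inside a function so it gets a private scope.
  const wrapped =
    '(function (exports, require, module, __filename, __dirname) {\n' +
    source +
    '\n})';

  // Evaluating: compile the wrapper and call it with a fresh module object.
  const module = { id: filename, exports: {}, loaded: false };
  miniCache[filename] = module;
  const fn = vm.runInThisContext(wrapped, { filename });
  fn.call(module.exports, module.exports, miniRequire, module, filename, path.dirname(filename));
  module.loaded = true;

  return module.exports;
}

// Usage: write a throwaway module to a temp directory and load it twice.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'mini-require-'));
const file = path.join(dir, 'greeter.js');
fs.writeFileSync(file, 'module.exports = { greet: (name) => "hello " + name };');

const first = miniRequire(file);
const second = miniRequire(file);
console.log(first.greet('node')); // hello node
console.log(first === second);    // true, the second call came from the cache
```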
Resolving a local path
1 | // in NODE REPL |
Every module object gets an id property to identify it. This id is usually the full path to the file, but in a REPL session it’s simply <repl>.
Node modules have a one-to-one relation with files on the file system. We require a module by loading the content of a file into memory.
However, since Node allows many ways to require a file (for example, with a relative path or a pre-configured path), we need to find the absolute location of that file before we can load its content into memory.
When we require a find-me module, without specifying a path:
require('find-me')
Node will look for find-me.js in all the paths specified by module.paths, in order.
1 | // in NODE REPL |
The paths list is basically a list of node_modules directories under every directory from the current directory to the root directory. It also includes a few legacy directories whose use is not recommended.
If Node can’t find find-me.js in any of these paths, it will throw a “Cannot find module” error.
Requiring a folder
Modules don’t have to be files. We can also create a find-me folder under node_modules and place an index.js file in there. The same require('find-me') line will use that folder’s index.js file.
An index.js file will be used by default when we require a folder, but we can control which file to start with under the folder using the main property in package.json.
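Here is a minimal sketch of both cases, built in a temporary directory so it can be run anywhere. The find-me-too folder, its start.js file and the empty entry.js file are hypothetical names used only for illustration; createRequire (from Node’s module module) gives us a require that resolves as if it were called from entry.js.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const { createRequire } = require('module');

const root = fs.mkdtempSync(path.join(os.tmpdir(), 'require-folder-'));

// node_modules/find-me/index.js is picked up by default.
const findMe = path.join(root, 'node_modules', 'find-me');
fs.mkdirSync(findMe, { recursive: true });
fs.writeFileSync(path.join(findMe, 'index.js'), "module.exports = 'find-me/index.js';");

// node_modules/find-me-too uses the main property to start at start.js instead.
const findMeToo = path.join(root, 'node_modules', 'find-me-too');
fs.mkdirSync(findMeToo, { recursive: true });
fs.writeFileSync(
  path.join(findMeToo, 'package.json'),
  JSON.stringify({ name: 'find-me-too', main: 'start.js' })
);
fs.writeFileSync(path.join(findMeToo, 'start.js'), "module.exports = 'find-me-too/start.js';");

// A require function that resolves as if it were called from a file in root.
const entry = path.join(root, 'entry.js');
fs.writeFileSync(entry, '');
const localRequire = createRequire(entry);

const viaIndex = localRequire('find-me');
const viaMain = localRequire('find-me-too');
console.log(viaIndex); // find-me/index.js
console.log(viaMain);  // find-me-too/start.js
```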
require.resolve
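If you want to only resolve the module and not execute it, you can use the require.resolve function. It resolves the path exactly the way the main require function does, but does not load the file. It will still throw an error if the file does not exist, and it will return the full path to the file when it is found.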
This can be used, for example, to check whether an optional package is installed or not and only use it when it’s available.
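A minimal sketch of such a check. The isInstalled helper and the package name some-hypothetical-optional-package are made up for illustration, and the package is assumed not to be installed:

```javascript
// Returns true when the name can be resolved, without loading the module.
function isInstalled(name) {
  try {
    require.resolve(name);
    return true;
  } catch (err) {
    if (err.code === 'MODULE_NOT_FOUND') return false;
    throw err; // anything else is a real problem, so let it surface
  }
}

console.log(isInstalled('fs')); // true, core modules always resolve
console.log(isInstalled('some-hypothetical-optional-package')); // false, assuming it is not installed

// Only use the optional package when it is available.
const optional = isInstalled('some-hypothetical-optional-package')
  ? require('some-hypothetical-optional-package')
  : null;
```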
exports, module.exports, and asynchronous loading of modules
In any module, exports is a special object. The exports variable inside each module is just a reference to module.exports, which manages the exported properties.
When we reassign the exports variable, that reference is lost and we would be introducing a new variable instead of changing the module.exports object.
The module.exports object in every module is what the require function returns when we require that module.
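The difference is easy to see with three tiny modules. This sketch writes them to a temporary directory; the file names and the writeModule helper are hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'exports-demo-'));

// Helper (illustrative only): write a module file and return its path.
const writeModule = (name, source) => {
  const file = path.join(dir, name);
  fs.writeFileSync(file, source);
  return file;
};

// Adding a property through the exports reference changes module.exports.
const added = require(writeModule('added.js', 'exports.id = 52;'));

// Reassigning exports only rebinds the local variable; module.exports stays {}.
const rebound = require(writeModule('rebound.js', 'exports = { id: 52 };'));

// Reassigning module.exports replaces what require returns.
const replaced = require(writeModule('replaced.js', 'module.exports = { id: 52 };'));

console.log(added.id);    // 52
console.log(rebound.id);  // undefined
console.log(replaced.id); // 52
```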
Let’s talk about the loaded attribute on every module. The module module uses the loaded attribute to track which modules have been loaded (true value) and which modules are still being loaded (false value).
The exports object becomes complete when Node finishes loading the module (and labels it so). The whole process of requiring/loading a module is synchronous. That’s why we were able to see the modules fully loaded after one cycle of the event loop.
This also means that we cannot change the exports object asynchronously. Whatever is assigned in a later callback is not available at the time require returns.
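A short sketch of both points, using a throwaway module in a temporary directory. The file name tracked.js and its property names are made up for this example.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'loaded-demo-'));
const file = path.join(dir, 'tracked.js');

fs.writeFileSync(file, [
  // Record the flag while the module is still being evaluated.
  'exports.loadedDuringEval = module.loaded;',
  // Expose the module object so the flag can be inspected afterwards.
  'exports.self = module;',
  // Too late: require has already returned the original exports object.
  'setImmediate(() => { module.exports = { late: true }; });',
].join('\n'));

const tracked = require(file);

console.log(tracked.loadedDuringEval); // false, the module was still loading
console.log(tracked.self.loaded);      // true, require returned after loading finished
console.log(tracked.late);             // undefined, the asynchronous change is not visible here
```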
JSON and C/C++ addons
We can natively require JSON files and C++ addon files with the require function. You don’t need to specify the extensions.
If a file extension is not specified, the first thing Node will try to resolve is a .js file. If it can’t find a .js file, it will try a .json file, and it will parse the .json file, if found, as a JSON text file. After that, it will try to find a binary .node file.
Requiring JSON files is useful if, for example, everything you need to manage in that file is some static configuration values, or some values that you periodically read from an external source. For example, if we had the following config.json file:
1 | { |
We can require it directly like this:
const { host, port } = require('./config')
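A self-contained version of this idea might look as follows. The host and port values are placeholders chosen for this sketch, and the file is written to a temporary directory first so the example runs anywhere.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'json-demo-'));

// Placeholder configuration values, assumed for illustration.
fs.writeFileSync(
  path.join(dir, 'config.json'),
  JSON.stringify({ host: 'localhost', port: 8080 })
);

// No extension given: there is no config.js here, so Node finds and parses config.json.
const { host, port } = require(path.join(dir, 'config'));

console.log(`Server will run at http://${host}:${port}`);
```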
All code you write in Node will be wrapped in functions
Node’s wrapping of modules is often misunderstood. To understand it, let me remind you about the exports/module.exports relation.
We can use the exports object to export properties, but we cannot replace the exports object directly because it’s just a reference to module.exports.
1 | exports.id = 52; // This is ok |
How exactly does this exports object, which appears to be global for every module, get defined as a reference on the module object?
In a browser, when we declare a variable in a script like this:
var answer = 12;
That answer variable will be globally available in all scripts after the script that defines it.
This is not the case in Node. When we define a variable in one module, the other modules in the program will not have access to that variable. So how come variables in Node are magically scoped?
The answer is simple. Before compiling a module, Node wraps the module code in a function, which we can inspect using the wrapper property of the module module.
1 | ~ $ node |
Node does not execute any code you write in a file directly. It executes this wrapper function which will have your code in its body.
This is what keeps the top-level variables that are defined in any module scoped to that module.
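The wrapper can also be printed from a script rather than the REPL. A minimal sketch; the exact strings may vary slightly between Node versions:

```javascript
// wrapper is an array holding the text placed before and after the module code.
const { wrapper } = require('module');

console.log(wrapper[0]); // the opening part, which declares the five parameters
console.log(wrapper[1]); // the closing part

// The parameter list is what makes exports, require, module,
// __filename and __dirname available inside every module.
const declaresModuleArgs = wrapper[0].includes(
  'exports, require, module, __filename, __dirname'
);
console.log(declaresModuleArgs);
```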
This wrapper function has 5 arguments: exports, require, module, __filename, and __dirname. This is what makes them appear global when in fact they are specific to each module.
All of these arguments get their value when Node executes the wrapper function. exports is defined as a reference to module.exports prior to that. require and module are both specific to the function to be executed, and the __dirname/__filename variables will contain the wrapped module’s directory path and absolute filename.
Since every module gets wrapped in a function, we can actually access that function’s arguments with the arguments keyword:
1 | ~/learn-node $ echo "console.log(arguments)" > index.js |
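The same check can be scripted. This sketch writes a throwaway index.js to a temporary directory and has it report what it sees in arguments; the property names on the exported object are made up for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'arguments-demo-'));
const file = path.join(dir, 'index.js');

// The module inspects the arguments of the function Node wrapped it in.
fs.writeFileSync(file, [
  'module.exports = {',
  '  count: arguments.length,',
  '  exportsIsFirst: arguments[0] === exports,',
  '  moduleIsThird: arguments[2] === module,',
  '  filename: arguments[3],',
  '  dirname: arguments[4],',
  '};',
].join('\n'));

const info = require(file);

console.log(info.count);          // 5
console.log(info.exportsIsFirst); // true
console.log(info.moduleIsThird);  // true
console.log(path.basename(info.filename)); // index.js
```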
What happens is roughly equivalent to:
1 | function (require, module, __filename, __dirname) { |
If we change the whole exports object, it would no longer be a reference to module.exports.
The require object
There is nothing special about require. It’s an object that acts mainly as a function that takes a module name or path and returns the module.exports object. We can simply override the require object with our own logic if we want to.
For example, maybe for testing purposes, we want every require call to be mocked by default and just return a fake object instead of the required module’s exports object.
1 | require = function () { |
After doing the above reassignment of require, every require('something') call in the script will just return the mocked object.
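To try this without breaking require in the script you are running, the override can be confined to a throwaway module. A sketch, with a made-up file name and a made-up shape for the fake object:

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'mock-require-'));
const file = path.join(dir, 'mocked.js');

fs.writeFileSync(file, [
  // Inside this module only, require is replaced by a stub.
  'require = function () { return { mocked: true }; };',
  // This call now hits the stub, not the real fs module.
  "module.exports = require('fs');",
].join('\n'));

const result = require(file);

console.log(result.mocked);              // true
console.log(typeof result.readFileSync); // undefined, the real fs was never loaded

// The override was local to mocked.js: require here is still the real one.
console.log(typeof require('fs').readFileSync); // function
```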
The require object also has properties of its own. We’ve seen the resolve property, which is a function that performs only the resolving step of the require process. We’ve also seen require.extensions above.
There is also require.main, which can be helpful to determine if the script is being required or run directly.
Say, for example, that we have this simple printFrame function in print-in-frame.js:
1 | // In print-in-frame.js |
We want to use this file in two ways:
- From the command line directly like this:
node print-in-frame 8 hello
- With require:
const print = require('./print-in-frame.js')
Those are two different usages. We need a way to determine if the file is being run as a stand-alone script or if it is being required by other scripts.
1 | const printFrame = (size, header) => { |
All modules will be cached
Caching is important to understand.
Say that we have the following ascii-art.js file that prints a cool looking header:
1 | ##### |
We want to display this header every time we require the file, so when we require the file twice, we want the header to show twice:
require('./ascii-art'); // it will show the header
The second require will not show the header because of modules’ caching. Node caches the first call and does not load the file on the second call.
We can see this cache by printing require.cache after the first require. The cache registry is simply an object that has a property for every required module. Those properties’ values are the module objects used for each module. We can simply delete a property from that require.cache object to invalidate that cache. If we do that, Node will reload the module and re-cache it.
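A runnable sketch of the whole sequence, using a stand-in ascii-art.js written to a temporary directory. The header text and the headerRuns counter are made up for this example; the counter is only there to show how many times the file was evaluated.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
const file = path.join(dir, 'ascii-art.js');

fs.writeFileSync(file, [
  "console.log('##### header #####');",
  // Count evaluations on the global object so the caller can inspect it.
  'globalThis.headerRuns = (globalThis.headerRuns || 0) + 1;',
].join('\n'));

require(file); // shows the header
require(file); // served from the cache, nothing is printed
const runsAfterTwoRequires = globalThis.headerRuns;

// The cache is keyed by the resolved filename; deleting the entry invalidates it.
const key = require.resolve(file);
const wasCached = key in require.cache;
delete require.cache[key];

require(file); // evaluated again, so the header shows again
const runsAfterReload = globalThis.headerRuns;

console.log(runsAfterTwoRequires); // 1
console.log(runsAfterReload);      // 2
```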