const config = require('/path/to/file')

The main object exported by the require module is a function. When Node invokes that require() function with a local file path as the function’s only argument, Node goes through the following sequence of steps:

  • Resolving: To find the absolute path of the file.

  • Loading: To determine the type of the file content.

  • Wrapping: To give the file its private scope. This is what makes both the require and module objects local to every file we require.

  • Evaluating: This is what the VM eventually does with the loaded code.

  • Caching: So that when we require this file again, we don’t go over all the steps another time.
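The five steps above can be sketched as a toy loader. This is a deliberately simplified illustration under stated assumptions, not Node's actual implementation: miniRequire, its cache object, and the greeting.js file are hypothetical names, and the sketch only handles JavaScript files given by path.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const vm = require('vm');

// Hypothetical cache for this sketch, keyed by absolute file path.
const cache = {};

function miniRequire(request) {
  // Resolving: turn the request into an absolute path.
  const filename = path.resolve(request);

  // Caching: a module that was already loaded is returned as-is.
  if (cache[filename]) return cache[filename].exports;

  // Loading: read the file content (this sketch assumes JavaScript source).
  const source = fs.readFileSync(filename, 'utf8');
  const module = { id: filename, exports: {}, loaded: false };
  cache[filename] = module;

  // Wrapping: put the source inside a function to give it a private scope.
  const wrapped =
    '(function (exports, require, module, __filename, __dirname) {' +
    source +
    '\n})';
  const fn = vm.runInThisContext(wrapped, { filename });

  // Evaluating: call the wrapper with the module-specific arguments.
  fn(module.exports, miniRequire, module, filename, path.dirname(filename));
  module.loaded = true;
  return module.exports;
}

// Demo with a throwaway file in a temporary directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'mini-require-'));
const demoFile = path.join(demoDir, 'greeting.js');
fs.writeFileSync(demoFile, "module.exports = { hello: 'world' };");

const first = miniRequire(demoFile);
const second = miniRequire(demoFile);
console.log(first.hello);
console.log(first === second); // the second call is served from the cache
```

Note that the cache check comes first in the code even though caching is listed last: the entry is only useful on the second and later calls.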

Resolving a local path

// in NODE REPL
> module
Module: {
id: '<repl>',
exports: {},
parent: undefined,
filename: null,
loaded: false,
children: [],
paths: [...]
}

Every module object gets an id property to identify it. This id is usually the full path to the file, but in a REPL session it’s simply <repl>.

Node modules have a one-to-one relation with files on the file-system. We require a module by loading the content of a file into memory.

However, since Node allows many ways to require a file (for example, with a relative path or a pre-configured path), before we can load the content of a file into memory we need to find the absolute location of that file.

When we require a find-me module, without specifying a path:

require('find-me')

Node will look for find-me.js in all the paths specified by module.paths in order.

// in NODE REPL
> module.paths
[ '/Users/samer/learn-node/repl/node_modules',
'/Users/samer/learn-node/node_modules',
'/Users/samer/node_modules',
'/Users/node_modules',
'/node_modules',
'/Users/samer/.node_modules',
'/Users/samer/.node_libraries',
'/usr/local/Cellar/node/7.7.1/lib/node' ]

The paths list is basically a list of node_modules directories under every directory from the current directory to the root directory. It also includes a few legacy directories whose use is not recommended.
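To make the shape of that list concrete, here is a minimal sketch that rebuilds the node_modules part of it for a given directory. The nodeModulesPaths helper is a hypothetical name used only for illustration; it ignores the legacy directories and is not how Node computes module.paths internally.

```javascript
const path = require('path');

// Hypothetical helper: list the node_modules directories from startDir up to the root.
// path.posix keeps the demo output identical on every platform.
function nodeModulesPaths(startDir) {
  const paths = [];
  let dir = path.posix.normalize(startDir);
  while (true) {
    paths.push(path.posix.join(dir, 'node_modules'));
    const parent = path.posix.dirname(dir);
    if (parent === dir) break; // reached the root directory
    dir = parent;
  }
  return paths;
}

const candidates = nodeModulesPaths('/Users/samer/learn-node/repl');
console.log(candidates);
```

For the directory used in the REPL session above, the result matches the first five entries of module.paths.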

If Node can’t find find-me.js in any of these paths, it will throw a “Cannot find module” error.

Requiring a folder

Modules don’t have to be files. We can also create a find-me folder under node_modules and place an index.js file in there. The same require('find-me') line will use that folder’s index.js file.

An index.js file will be used by default when we require a folder, but we can control what file name to start under the folder using the main property in package.json.
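Both folder behaviors can be tried out with throwaway directories. In this sketch the folder names with-index and with-main and the file name start.js are made up for the demo, and the folders are required by absolute path rather than through node_modules to keep the example self-contained.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const root = fs.mkdtempSync(path.join(os.tmpdir(), 'folder-module-'));

// A folder with only an index.js file.
const withIndex = path.join(root, 'with-index');
fs.mkdirSync(withIndex);
fs.writeFileSync(path.join(withIndex, 'index.js'), "module.exports = 'from index.js';");

// A folder whose package.json points "main" at a different file.
const withMain = path.join(root, 'with-main');
fs.mkdirSync(withMain);
fs.writeFileSync(path.join(withMain, 'package.json'), JSON.stringify({ main: 'start.js' }));
fs.writeFileSync(path.join(withMain, 'start.js'), "module.exports = 'from start.js';");

const fromIndex = require(withIndex);
const fromMain = require(withMain);
console.log(fromIndex);
console.log(fromMain);
```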

require.resolve

If you want to only resolve the module and not execute it, you can use the require.resolve function. This behaves exactly the same as the main require function, but does not execute the file.

This can be used, for example, to check whether an optional package is installed or not and only use it when it’s available.
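A minimal sketch of that optional-package check might look like the following. The isInstalled helper and the package name passed to it are hypothetical; the sketch relies on require.resolve throwing an error whose code is MODULE_NOT_FOUND when nothing can be resolved.

```javascript
// Hypothetical helper: report whether a module can be resolved, without executing it.
function isInstalled(name) {
  try {
    require.resolve(name);
    return true;
  } catch (err) {
    if (err.code === 'MODULE_NOT_FOUND') return false;
    throw err; // anything else is unexpected, so let it surface
  }
}

const hasFs = isInstalled('fs'); // a core module always resolves
const hasOptional = isInstalled('a-hypothetical-optional-package-for-this-demo');
console.log(hasFs, hasOptional);
```

A caller would then require the optional package only when the check returns true, and fall back to other behavior otherwise.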

exports, module.exports, and asynchronous loading of modules

In any module, exports is a special object.

The exports variable inside each module is just a reference to module.exports, which manages the exported properties.

When we reassign the exports variable, that reference is lost and we would be introducing a new variable instead of changing the module.exports object.

The module.exports object in every module is what the require function returns when we require that module.

Let’s talk about the loaded attribute on every module. The Module module uses the loaded attribute to track which modules have been loaded (true value) and which modules are still being loaded (false value).

The exports object becomes complete when Node finishes loading the module (and labels it so). The whole process of requiring/loading a module is synchronous. That’s why we were able to see the modules fully loaded after one cycle of the event loop.

This also means that we cannot change the exports object asynchronously.
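The following sketch illustrates both points with a throwaway module. The file name late-export.js is made up for the demo. The module records the value of module.loaded while it is still running, then tries to replace module.exports asynchronously, which happens too late for the caller to see.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'loaded-demo-'));
const file = path.join(dir, 'late-export.js');
fs.writeFileSync(
  file,
  [
    'exports.loadedWhileRunning = module.loaded;',
    'setImmediate(() => {',
    '  module.exports = { replaced: true }; // too late: require() already returned',
    '});',
  ].join('\n')
);

const result = require(file);
const record = require.cache[require.resolve(file)];

console.log(result.loadedWhileRunning); // value seen while the module was still loading
console.log(record.loaded); // value after require() returned

// Even after the module's own setImmediate callback has run,
// the object we received from require() is the original one.
setImmediate(() => {
  console.log(result.replaced);
});
```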

JSON and C/C++ addons

We can natively require JSON files and C++ addon files with the require function. You don’t need to specify the extensions.

If a file extension was not specified, the first thing Node will try to resolve is a .js, then .json, and it will parse the .json file if found as a JSON text file. After that Node will try to find a binary .node file.
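The resolution order can be observed with a small experiment. In this sketch the file names settings and config are made up for the demo: a settings.js and a settings.json sit side by side, and a config.json has no .js sibling.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'ext-order-'));

// Two files with the same base name: .js should win over .json.
fs.writeFileSync(path.join(dir, 'settings.js'), "module.exports = { source: 'js' };");
fs.writeFileSync(path.join(dir, 'settings.json'), JSON.stringify({ source: 'json' }));

// A JSON file with no .js sibling is parsed and returned as an object.
fs.writeFileSync(path.join(dir, 'config.json'), JSON.stringify({ host: 'localhost', port: 8080 }));

const settings = require(path.join(dir, 'settings'));
const config = require(path.join(dir, 'config'));
console.log(settings.source);
console.log(config.host, config.port);
```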

Requiring JSON files is useful if, for example, everything you need to manage in that file is some static configuration values, or some values that you periodically read from an external source. For example, if we had the following config.json file:

{
"host": "localhost",
"port": 8080
}

We can require it directly like this:

const { host, port } = require('./config')

All code you write in Node will be wrapped in functions

Node’s wrapping of modules is often misunderstood. To understand it, let me remind you about the exports/module.exports relation.

We can use the exports object to export properties, but we cannot replace the exports object directly because it’s just a reference to module.exports.

exports.id = 52; // This is ok
exports = { id: 52 }; // This will not work
module.exports = { id: 52 }; // This is ok

How exactly does this exports object, which appears to be global for every module, get defined as a reference on the module object?

In a browser, when we declare a variable in a script like this:

var answer = 12;

That answer variable will be globally available in all scripts after the script that defines it.

This is not the case in Node. When we define a variable in one module, the other modules in the program will not have access to that variable. So how come variables in Node are magically scoped?

The answer is simple. Before compiling a module, Node wraps the module code in a function, which we can inspect using the wrapper property of the module module.

~ $ node
> require('module').wrapper
[ '(function (exports, require, module, __filename, __dirname) { ',
'\n});' ]
>

Node does not execute any code you write in a file directly. It executes this wrapper function which will have your code in its body.

This is what keeps the top-level variables that are defined in any module scoped to that module.
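A quick way to see this scoping in action is to define a top-level variable in one module and look for it from another. The file names defines-answer.js and looks-for-answer.js are made up for this sketch, which assumes nothing else in the process has defined a global named answer.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'scope-demo-'));

// The first module declares a top-level variable and reports what it sees.
fs.writeFileSync(
  path.join(dir, 'defines-answer.js'),
  'var answer = 12;\nexports.seen = typeof answer;\n'
);

// The second module only checks whether that variable is visible to it.
fs.writeFileSync(
  path.join(dir, 'looks-for-answer.js'),
  'exports.seen = typeof answer;\n'
);

const definer = require(path.join(dir, 'defines-answer'));
const observer = require(path.join(dir, 'looks-for-answer'));
console.log(definer.seen); // the variable exists inside its own module
console.log(observer.seen); // but it did not leak into the other module
```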

This wrapper function has 5 arguments: exports, require, module, __filename, and __dirname. This is what makes them appear global when in fact they are specific to each module.

All of these arguments get their values when Node executes the wrapper function. exports is defined as a reference to module.exports prior to that. require and module are both specific to the function to be executed, and the __dirname/__filename variables will contain the wrapped module’s directory path and absolute filename.

Since every module gets wrapped in a function, we can actually access that function’s arguments with the arguments keyword:

~/learn-node $ echo "console.log(arguments)" > index.js
~/learn-node $ node index.js
{ '0': {},
'1':
{ [Function: require]
resolve: [Function: resolve],
main:
Module {
id: '.',
exports: {},
parent: null,
filename: '/Users/samer/index.js',
loaded: false,
children: [],
paths: [Object] },
extensions: { ... },
cache: { '/Users/samer/index.js': [Object] } },
'2':
Module {
id: '.',
exports: {},
parent: null,
filename: '/Users/samer/index.js',
loaded: false,
children: [],
paths: [ ... ] },
'3': '/Users/samer/index.js',
'4': '/Users/samer' }

What happens is roughly equivalent to:

function (require, module, __filename, __dirname) {
let exports = module.exports;
// Your Code...
return module.exports;
}

If we change the whole exports object, it would no longer be a reference to module.exports.
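The difference is easy to observe from the outside. In this sketch, the file names rebind.js and replace.js are made up for the demo: the first module reassigns the exports variable after setting a property, and the second replaces module.exports.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'exports-demo-'));

// Reassigning exports only changes the local variable, not module.exports.
fs.writeFileSync(
  path.join(dir, 'rebind.js'),
  'exports.id = 52;\nexports = { other: true };\n'
);

// Replacing module.exports changes what require() returns.
fs.writeFileSync(
  path.join(dir, 'replace.js'),
  'module.exports = { id: 52, replaced: true };\n'
);

const rebound = require(path.join(dir, 'rebind'));
const replaced = require(path.join(dir, 'replace'));
console.log(rebound); // only the property set before the reassignment survives
console.log(replaced);
```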

The require object

There is nothing special about require. It’s an object that acts mainly as a function that takes a module name or path and returns the module.exports object. We can simply override the require object with our own logic if we want to.

For example, maybe for testing purposes, we want every require call to be mocked by default and just return a fake object instead of the required module’s exports object.

require = function () {
return { mocked: true }
}

After doing the above reassignment of require, every require('something') call in the script will just return the mocked object.

The require object also has properties of its own. We’ve seen the resolve property, which is a function that performs only the resolving step of the require process. We’ve also seen require.extensions above.

There is also require.main which can be helpful to determine if the script is being required or run directly.

Say, for example, that we have this simple printFrame function in print-in-frame.js:

// In print-in-frame.js

const printFrame = (size, header) => {
console.log('*'.repeat(size))
console.log(header)
console.log('*'.repeat(size))
}

We want to use this file in two ways:

  • From the command line directly like this:
node print-in-frame 8 hello
  • With require
const print = require('./print-in-frame.js')
print(5, 'hello')

Those are two different usages. We need a way to determine if the file is being run as a stand-alone script or if it is being required by other scripts.

const printFrame = (size, header) => {
console.log('*'.repeat(size))
console.log(header)
console.log('*'.repeat(size))
}

if (require.main === module) {
// the file is being executed directly
printFrame(process.argv[2], process.argv[3])
} else {
module.exports = printFrame
}

All modules will be cached

Caching is important to understand.

Say that we have the following ascii-art.js file that prints a cool looking header:

#####
#
#####
#
#

We want to display this header every time we require the file, so when we require the file twice, we want the header to show twice:

require('./ascii-art'); // it will show the header
require('./ascii-art'); // it won't show the header

The second require will not show the header because of modules’ caching. Node caches the first call and does not load the file on the second call.

We can see this cache by printing require.cache after the first require. The cache registry is simply an object that has a property for every required module. Those properties’ values are the module objects used for each module. We can simply delete a property from that require.cache object to invalidate that cache. If we do that, Node will reload the module to recache it.
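Here is a minimal sketch of that invalidation, using a throwaway module. The file name header.js is made up for the demo, and object identity stands in for "the file was executed again": a cached require returns the very same exports object, while a reload produces a new one.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'cache-demo-'));
const file = path.join(dir, 'header.js');
fs.writeFileSync(file, "module.exports = { name: 'header' };");

const first = require(file);
const second = require(file); // served from require.cache

// The cache is keyed by the resolved filename.
const key = require.resolve(file);
const wasCached = key in require.cache;

delete require.cache[key]; // invalidate the cached entry
const third = require(file); // the file is loaded and evaluated again

console.log(first === second);
console.log(wasCached);
console.log(first === third);
```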