node-walk
====
A Node.js walk implementation.

This is somewhat of a port of Python's `os.walk`, but using Node.js conventions.

* EventEmitter
* Asynchronous
* Chronological (optionally)
* Built-in flow-control
* includes Synchronous version (same API as Asynchronous)
As few file descriptors are opened at a time as possible.
This is particularly well suited for single hard disks which are not flash or solid state.
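
The flow-control idea can be illustrated with a minimal sketch. This is not the library's implementation: `eachSeries` is a hypothetical helper, introduced here only for illustration, that hands one item at a time to a worker and does not touch the following item until the worker calls `next()`. That is the same contract the walker's `next` callbacks follow.

```javascript
"use strict";

// Hypothetical helper, not part of `walk`: process items strictly one at a time.
function eachSeries(items, worker, done) {
  var index = 0;

  function next() {
    if (index >= items.length) {
      done();
      return;
    }
    var item = items[index];
    index += 1;
    // The worker decides when the following item may start by calling `next`.
    worker(item, next);
  }

  next();
}

// Record the order in which items are handled.
var order = [];
eachSeries(["a", "b", "c"], function (name, next) {
  order.push(name);
  next();
}, function () {
  order.push("done");
});
```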
Installation
----

    npm install walk

Usage
====
Both Asynchronous and Synchronous versions are provided.
```javascript
(function () {
  "use strict";

  var walk = require('walk')
    , fs = require('fs')
    , path = require('path')
    , options
    , walker
    ;

  options = {
    followLinks: false
    // directories with these names will be skipped
  , filters: ["Temp", "_Temp"]
  };

  walker = walk.walk("/tmp", options);

  // OR
  // walker = walk.walkSync("/tmp", options);

  walker.on("names", function (root, nodeNamesArray) {
    nodeNamesArray.sort(function (a, b) {
      if (a > b) return 1;
      if (a < b) return -1;
      return 0;
    });
  });

  walker.on("directories", function (root, dirStatsArray, next) {
    // dirStatsArray is an array of `stat` objects with the additional attributes
    // * type
    // * error
    // * name
    next();
  });

  walker.on("file", function (root, fileStats, next) {
    // fileStats.name is only the file's name, so join it with root
    fs.readFile(path.join(root, fileStats.name), function () {
      // doStuff
      next();
    });
  });

  walker.on("errors", function (root, nodeStatsArray, next) {
    next();
  });

  walker.on("end", function () {
    console.log("all done");
  });
}());
```
### Sync
Note: Due to changes in EventEmitter,
I don't think it's possible to create a truly synchronous walker,
but I believe it will still finish in a single event loop as-is
(due to changes in process.nextTick).
```javascript
(function () {
  "use strict";

  var walk = require('walk')
    , fs = require('fs')
    , path = require('path')
    , options
    , walker
    ;

  options = {
    listeners: {
      names: function (root, nodeNamesArray) {
        nodeNamesArray.sort(function (a, b) {
          if (a > b) return 1;
          if (a < b) return -1;
          return 0;
        });
      }
    , directories: function (root, dirStatsArray, next) {
        // dirStatsArray is an array of `stat` objects with the additional attributes
        // * type
        // * error
        // * name
        next();
      }
    , file: function (root, fileStats, next) {
        // fileStats.name is only the file's name, so join it with root
        fs.readFile(path.join(root, fileStats.name), function () {
          // doStuff
          next();
        });
      }
    , errors: function (root, nodeStatsArray, next) {
        next();
      }
    }
  };

  walker = walk.walkSync("/tmp", options);

  console.log("all done");
}());
```
API
====
Emitted Values

* `on('XYZ', function(root, stats, next) {})`
    * `root` - the directory containing the files to be inspected
    * `stats` - a single `stats` object, or an array of them, with some added attributes
        * `type` - 'file', 'directory', etc
        * `error`
        * `name` - the name of the file, dir, etc
    * `next` - no more files will be read until this is called

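
As a rough illustration of how these arguments fit together, here is a minimal handler sketch. `onFile` and `seenFiles` are hypothetical names, and the object passed in at the bottom is a hand-written stand-in with only the added attributes, not a real `stat` result. It assumes `name` is the bare file name, which is why it is joined with `root`. Such a handler could presumably be attached with `walker.on("file", onFile)`.

```javascript
"use strict";

var path = require("path");

// Collects the full path of every regular file the handler is given.
var seenFiles = [];

function onFile(root, fileStats, next) {
  // Assumption: `name` is the bare file name, so join it with `root`.
  if (!fileStats.error && fileStats.type === "file") {
    seenFiles.push(path.join(root, fileStats.name));
  }
  // No more files are read until `next` is called.
  next();
}

// Hand-written stand-in for what a walker might pass to the handler.
var nextCalls = 0;
onFile("/tmp/example", { name: "notes.txt", type: "file", error: null }, function () {
  nextCalls += 1;
});
```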
Single Events - fired immediately

* `end` - No files, dirs, etc left to inspect
* `directoryError` - `fstat` succeeded, but reading the path failed (probably due to permissions)
* `nodeError` - `fstat` did not succeed
* `node` - a `stats` object for a node of any type
* `file` - includes links when `followLinks` is `true`
    * Note: This feature is broken in the current version, but works in the previous `walk-recursive` version
* `directory`
* `symbolicLink` - always empty when `followLinks` is `true`
* `blockDevice`
* `characterDevice`
* `FIFO`
* `socket`

Events with Array Arguments - fired after all files in the dir have been `stat`ed

* `names` - before any `stat` takes place. Useful for sorting and filtering.
    * Note: the array is an array of `string`s, not `stat` objects
    * Note: the `next` argument is a `noop`
* `errors` - errors encountered by `fs.stat` when reading nodes in a directory
* `nodes` - an array of `stats` of any type
* `files`
* `directories` - modification of this array - sorting, removing, etc - affects traversal
* `symbolicLinks`
* `blockDevices`
* `characterDevices`
* `FIFOs`
* `sockets`

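
The sorting and removing mentioned for `names` and `directories` could look roughly like the sketch below. `sortNames`, `pruneDirectories`, and `skipNames` are hypothetical names chosen for illustration, and the arrays at the bottom are mock data standing in for what a walker would pass. Both handlers edit the array in place, since the list above says it is modification of the array itself that affects traversal.

```javascript
"use strict";

// Sort names in place so nodes are visited in a predictable order.
function sortNames(root, nodeNamesArray) {
  nodeNamesArray.sort(function (a, b) {
    if (a > b) return 1;
    if (a < b) return -1;
    return 0;
  });
}

// Hypothetical list of directory names to leave out of the traversal.
var skipNames = ["node_modules", ".git"];

// Remove unwanted directories in place, then let the walker continue.
function pruneDirectories(root, dirStatsArray, next) {
  // Iterate backwards so splicing does not shift indexes still to be checked.
  for (var i = dirStatsArray.length - 1; i >= 0; i -= 1) {
    if (skipNames.indexOf(dirStatsArray[i].name) !== -1) {
      dirStatsArray.splice(i, 1);
    }
  }
  next();
}

// Mock data standing in for a walker's arguments.
var names = ["b", "c", "a"];
sortNames("/tmp", names);

var dirs = [{ name: "src" }, { name: ".git" }, { name: "node_modules" }, { name: "lib" }];
var pruned = false;
pruneDirectories("/tmp", dirs, function () { pruned = true; });
```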

**Warning:** Beware of infinite loops when `followLinks` is `true` (using the `walk-recurse` variant).

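
One common way to guard against such loops, sketched here under assumptions, is to remember the resolved path of every directory already visited and skip repeats. `createVisitedGuard` and `firstVisit` are hypothetical helpers, not part of `walk`, and how such a guard would be wired into a walker is left open. The sketch relies only on Node's `fs.realpathSync`, which resolves symbolic links.

```javascript
"use strict";

var fs = require("fs");

// Hypothetical helper, not part of `walk`: remembers resolved directory paths.
function createVisitedGuard() {
  var visited = new Set();

  // Returns true the first time a real path is seen, false on repeats.
  return function firstVisit(dirPath) {
    var realPath = fs.realpathSync(dirPath); // resolves symbolic links
    if (visited.has(realPath)) {
      return false;
    }
    visited.add(realPath);
    return true;
  };
}

// The current working directory is used only because it is known to exist.
var firstVisit = createVisitedGuard();
var first = firstVisit(process.cwd());
var second = firstVisit(process.cwd());
```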
Comparisons
====
Tested on my `/System` containing 59,490 (+ self) directories (and lots of files).
The size of the text output was about 6 MB.

`find`:

    time bash -c "find /System -type d | wc"
    59491 97935 6262916

    real 2m27.114s
    user 0m1.193s
    sys 0m14.859s

`find.js`:

Note that `find.js` omits the start directory.

    time bash -c "node examples/find.js /System -type d | wc"
    59490 97934 6262908

    # Test 1
    real 2m52.273s
    user 0m20.374s
    sys 0m27.800s

    # Test 2
    real 2m23.725s
    user 0m18.019s
    sys 0m23.202s

    # Test 3
    real 2m50.077s
    user 0m17.661s
    sys 0m24.008s

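In conclusion, the Node.js asynchronous walk is slower than regular `find`: wall-clock time was comparable in these runs (2m23s to 2m52s versus 2m27s), but it used far more CPU time (roughly 18-20s user versus about 1s).
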
LICENSE
===
`node-walk` is available under the following licenses:
* MIT
* Apache 2
Copyright 2011 - Present AJ ONeal