### Qap

[![NPM VERSION](http://img.shields.io/npm/v/qap.svg?style=flat)](https://www.npmjs.org/package/qap)
[![CODACY BADGE](https://img.shields.io/codacy/b18ed7d95b0a4707a0ff7b88b30d3def.svg?style=flat)](https://www.codacy.com/public/44gatti/qap)
[![CODECLIMATE](http://img.shields.io/codeclimate/github/rootslab/qap.svg?style=flat)](https://codeclimate.com/github/rootslab/qap)
[![CODECLIMATE-TEST-COVERAGE](https://img.shields.io/codeclimate/coverage/github/rootslab/qap.svg?style=flat)](https://codeclimate.com/github/rootslab/qap)
[![LICENSE](http://img.shields.io/badge/license-MIT-blue.svg?style=flat)](https://github.com/rootslab/qap#mit-license)

[![TRAVIS CI BUILD](http://img.shields.io/travis/rootslab/qap.svg?style=flat)](http://travis-ci.org/rootslab/qap)
[![BUILD STATUS](http://img.shields.io/david/rootslab/qap.svg?style=flat)](https://david-dm.org/rootslab/qap)
[![DEVDEPENDENCY STATUS](http://img.shields.io/david/dev/rootslab/qap.svg?style=flat)](https://david-dm.org/rootslab/qap#info=devDependencies)
[![NPM DOWNLOADS](http://img.shields.io/npm/dm/qap.svg?style=flat)](http://npm-stat.com/charts.html?package=qap)

[![NPM GRAPH1](https://nodei.co/npm-dl/qap.png)](https://nodei.co/npm/qap/)

[![NPM GRAPH2](https://nodei.co/npm/qap.png?downloads=true&downloadRank=true&stars=true)](https://nodei.co/npm/qap/)

[![status](https://sourcegraph.com/api/repos/github.com/rootslab/qap/.badges/status.png)](https://sourcegraph.com/github.com/rootslab/qap)
[![views](https://sourcegraph.com/api/repos/github.com/rootslab/qap/.counters/views.png)](https://sourcegraph.com/github.com/rootslab/qap)
[![views 24h](https://sourcegraph.com/api/repos/github.com/rootslab/qap/.counters/views-24h.png)](https://sourcegraph.com/github.com/rootslab/qap)

* __Qap__ is a quick parser for string or buffer patterns.
* It is optimized for pattern strings of at most 255 bytes.
* Better results are achieved with long and sparse patterns.
* It is an implementation of the Quick Search algorithm.

### Main features

> Given an m-length pattern, n-length data and a σ-length alphabet ( σ = 256 ):

- simplification of the Boyer-Moore algorithm ( *see [Bop](https://github.com/rootslab/bop)* ).
- uses only a bad-character shift table.
- preprocessing phase in __O(m+σ)__ time and __O(σ)__ space complexity.
- searching phase in __O(m*n)__ worst-case time complexity.
- very fast in practice for short patterns and large alphabets.

> See __[Lecroq](http://www-igm.univ-mlv.fr/~lecroq/string/node19.html)__ for reference and also __[Bop](https://github.com/rootslab/bop)__, a Boyer-Moore parser.

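The features listed above can be illustrated with a minimal Quick Search sketch. It is not Qap's actual source: the names `buildShiftTable` and `quickSearch` are hypothetical, and a plain Array is assumed for the table. It only shows how a single bad-character table, indexed by the byte just past the current window, drives the shifts.

```javascript
// Build the bad-character shift table: O(m + σ) time, O(σ) space.
function buildShiftTable( pattern ) {
    const m = pattern.length;
    // a byte that never occurs in the pattern allows a jump of m + 1
    const table = new Array( 256 ).fill( m + 1 );
    // later (rightmost) occurrences overwrite earlier ones
    for ( let i = 0; i < m; ++i ) table[ pattern[ i ] ] = m - i;
    return table;
}

// Collect every starting index of `pattern` in `data` (both Buffers).
function quickSearch( pattern, data ) {
    const m = pattern.length;
    const n = data.length;
    const indexes = [];
    if ( m === 0 ) return indexes;
    const table = buildShiftTable( pattern );
    let pos = 0;
    while ( pos <= n - m ) {
        let i = 0;
        while ( i < m && data[ pos + i ] === pattern[ i ] ) ++i;
        if ( i === m ) indexes.push( pos );
        // the last window has no following byte to look up
        if ( pos + m >= n ) break;
        // shift according to the byte right after the window
        pos += table[ data[ pos + m ] ];
    }
    return indexes;
}

const p = 'hellofolks\r\n\r\n';
const t = 'hehe' + p + 'loremipsumhellofolks\r\n' + p;
console.log( quickSearch( Buffer.from( p ), Buffer.from( t ) ) ); // [ 4, 40 ]
```

Because the shift never exceeds m + 1 and never drops below 1, overlapping occurrences are reported as well; whether Qap does the same is not stated here.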
### Install

```bash
$ npm install qap [-g]
```

> __require__:

```javascript
var Qap = require( 'qap' );
```

### Run Tests

```bash
$ cd qap/
$ npm test
```

### Run Benchmarks

```bash
$ cd qap/
$ npm run-script bench
```

### Constructor

> Create an instance with a Buffer or String pattern.

```javascript
Qap( Buffer || String pattern )
// or
new Qap( Buffer || String pattern )
```

### Methods

> List all pattern occurrences in String or Buffer data.
> It returns a new array of indexes, or populates an array passed as the last argument to the parse method.

```javascript
// slower with String
Qap#parse( String data [, Number startFromIndex [, Number limitResultsTo [, Array array ] ] ] ) : Array

// faster with Buffer
Qap#parse( Buffer data [, Number startFromIndex [, Number limitResultsTo [, Array array ] ] ] ) : Array
```

> Change the pattern:

```javascript
Qap#set( Buffer || String pattern ) : Buffer
```

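The optional arguments of the parse signature can be pictured with a small stand-in. This is a hypothetical sketch, not Qap's implementation: `parseAll` is an invented name, it relies on `Buffer#indexOf` instead of Quick Search, and it assumes that a missing or zero limit means "no limit" and that overlapping occurrences are reported.

```javascript
// Hypothetical stand-in mirroring the shape of the parse arguments:
// ( data, startFromIndex, limitResultsTo, array ), plus the pattern.
function parseAll( pattern, data, start, limit, out ) {
    // reuse the caller's array when one is passed, else create a new one
    const indexes = Array.isArray( out ) ? out : [];
    if ( pattern.length === 0 ) return indexes;
    // assumption: a missing or non-positive limit means unlimited results
    const max = limit > 0 ? limit : Infinity;
    let found = 0;
    let pos = data.indexOf( pattern, start || 0 );
    while ( pos !== -1 && found < max ) {
        indexes.push( pos );
        ++found;
        // assumption: overlapping occurrences are reported too
        pos = data.indexOf( pattern, pos + 1 );
    }
    return indexes;
}

const pat = Buffer.from( 'abc' );
const buf = Buffer.from( 'abcabcabc' );

console.log( parseAll( pat, buf ) );       // [ 0, 3, 6 ]
console.log( parseAll( pat, buf, 1 ) );    // [ 3, 6 ]
console.log( parseAll( pat, buf, 0, 2 ) ); // [ 0, 3 ]

// populate an existing array instead of allocating a new one
const acc = [ -1 ];
parseAll( pat, buf, 4, 0, acc );
console.log( acc );                        // [ -1, 6 ]
```

Passing an existing array avoids allocating a new result array on every call, which may matter when the same parser is applied to many chunks of data.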
### Usage Example

```javascript
var log = console.log
    , assert = require( 'assert' )
    , Qap = require( 'qap' )
    , pattern = 'hellofolks\r\n\r\n'
    , text = 'hehe' + pattern + 'loremipsumhellofolks\r\n' + pattern
    , bresults = null
    ;

// create an instance and parse the pattern
var qap = Qap( pattern )
    // parse data from the beginning
    , results = qap.parse( text )
    ;

// set a new Buffer pattern
qap.set( Buffer.from( pattern ) );

// parse data from a Buffer instead of a String
bresults = qap.parse( Buffer.from( text ) );

// parser results ( starting indexes ) [ 4, 40 ]
log( results, bresults );

// results are the same
assert.deepEqual( results, bresults );
```

#### Benchmark for a small pattern ( length <= 255 bytes )

> The parser uses a 256-byte Buffer to build the shift table, then:

> - Pattern parsing / table creation space and time complexity is O(σ).
> - Very low memory footprint.
> - Ultra fast to preprocess the pattern ( = table creation ).

```bash
$ node bench/small-pattern-data-rate
```

By default it:

> - uses a pattern string of 57 bytes/chars
> - builds a data buffer of 700 MB in memory
> - uses a redundancy/distance factor for pattern strings equal to 2. The larger the value, the fewer occurrences of the pattern string there are in the text buffer.

**Custom Usage**:

```bash
# with [testBufferSizeInMB] [distanceFactor] [aStringPattern]
$ node bench/small-pattern-data-rate.js 700 4 "that'sallfolks"
```

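The benchmark setup described above can be mimicked on a much smaller scale. The sketch below is hypothetical and is not the code in the bench directory: `buildData` and `measure` are invented names, the `-` filler byte and the reading of the distance factor as a spacing multiplier are assumptions, and `Buffer#indexOf` stands in for Qap's search.

```javascript
// Build a test buffer holding spaced copies of the pattern.
function buildData( pattern, sizeInBytes, factor ) {
    const data = Buffer.alloc( sizeInBytes, 0x2d ); // filled with '-'
    // assumption: one copy every ( factor * pattern.length ) bytes,
    // so a larger factor yields fewer occurrences
    const step = pattern.length * Math.max( 1, factor );
    let copies = 0;
    for ( let pos = 0; pos + pattern.length <= sizeInBytes; pos += step ) {
        pattern.copy( data, pos );
        ++copies;
    }
    return { data: data, copies: copies };
}

// Count non-overlapping matches and report the scan rate in MB/s.
function measure( pattern, data ) {
    const start = process.hrtime.bigint();
    let matches = 0;
    let pos = data.indexOf( pattern );
    while ( pos !== -1 ) {
        ++matches;
        pos = data.indexOf( pattern, pos + pattern.length );
    }
    const seconds = Number( process.hrtime.bigint() - start ) / 1e9;
    const megabytes = data.length / ( 1024 * 1024 );
    return { matches: matches, mbPerSec: megabytes / seconds };
}

const benchPattern = Buffer.from( "that'sallfolks" );
const built = buildData( benchPattern, 1024 * 1024, 4 ); // 1 MB, factor 4
const report = measure( benchPattern, built.data );
console.log( built.copies, report.matches, report.mbPerSec.toFixed( 1 ) + ' MB/s' );
```

The measured rate depends heavily on the machine and on the buffer size, so figures from such a small buffer are only indicative.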
#### Benchmark for a big pattern ( length > 255 bytes )

> The parser uses one Array to build the shift table for a big pattern, then:

> - The table has 256 elements; every element is an integer value that could be between 0 and the pattern length.
> - Fast to preprocess the pattern ( = table creation ).
> - Low memory footprint.

```bash
$ node bench/big-pattern-data-rate
```

> - it uses a pattern size of 20 MB
> - it builds a data buffer of 300 MB, copying the pattern 12 times

See the __[bench](./bench)__ dir.

### MIT License

> Copyright (c) 2015 < Guglielmo Ferri : 44gatti@gmail.com >

> Permission is hereby granted, free of charge, to any person obtaining
> a copy of this software and associated documentation files (the
> 'Software'), to deal in the Software without restriction, including
> without limitation the rights to use, copy, modify, merge, publish,
> distribute, sublicense, and/or sell copies of the Software, and to
> permit persons to whom the Software is furnished to do so, subject to
> the following conditions:

> __The above copyright notice and this permission notice shall be
> included in all copies or substantial portions of the Software.__

> THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
> EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
> MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
> IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
> CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
> TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
> SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.