| Last Update: | Mon Nov 06 03:24:20 +0100 2006 |
This is a Ruby interface to the CRM114 Controllable Regex Mutilator, an advanced and fast text classifier that uses sparse binary polynomial matching with a Bayesian Chain Rule evaluator and a hidden Markov model to categorize data with up to 99.87% accuracy.
The Ruby wrapper grew out of this:
Requires the CRM114 binaries to be installed. Specifically, the 'crm' binary should be accessible in the current user's PATH environment variable.
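One way to verify this requirement before using the wrapper is to search PATH for the binary from Ruby. The following is a minimal sketch only; find_executable is a hypothetical helper written for this illustration and is not part of the crm114 library.

```ruby
# Returns the full path of the first executable called `name` found in the
# given PATH string, or nil when there is none.
# NOTE: find_executable is a hypothetical helper, not part of crm114.
def find_executable(name, path = ENV['PATH'].to_s)
  path.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

crm_path = find_executable('crm')
if crm_path
  puts "Found crm at #{crm_path}"
else
  warn "The 'crm' binary was not found in PATH; install CRM114 first."
end
```

If the lookup returns nil, the wrapper will most likely be unable to run the classifier at all, so it is worth checking this first when something fails.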
The CRM114 library interface is very similar to that of the Classifier project.
Here follows a brief example:
  require 'crm114'

  crm = Classifier::CRM114.new([:interesting, :boring])

  crm.train! :interesting, 'Some data set with a decent signal to noise ratio.'
  crm.train! :boring, 'Pig latin, as in lorem ipsum dolor sit amet.'

  crm.classify 'Lorem ipsum'
  # => [:boring, 0.99]
  crm.interesting? 'Lorem ipsum'
  # => false
  crm.boring? 'Lorem ipsum'
  # => true
Have a look at the included unit tests for more comprehensive examples.
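To experiment with the shape of this interface without the CRM114 binaries installed, a toy stand-in can be handy. The sketch below is an assumption-laden illustration, not the crm114 implementation: ToyClassifier is a hypothetical class that scores categories by counting shared words, and its use of method_missing for the category predicates is only one plausible way to provide them.

```ruby
# ToyClassifier is a hypothetical stand-in, NOT the crm114 implementation.
# It counts shared words per category, purely to illustrate the interface:
# train!, classify, and one predicate method per category.
class ToyClassifier
  def initialize(categories)
    @words = categories.to_h { |c| [c, Hash.new(0)] }
  end

  def train!(category, text)
    tokenize(text).each { |w| @words.fetch(category)[w] += 1 }
  end

  # Returns [best_category, share_of_all_word_hits_won_by_that_category].
  def classify(text)
    tokens = tokenize(text)
    scores = @words.transform_values { |seen| tokens.sum { |w| seen[w] } }
    best, hits = scores.max_by { |_, s| s }
    total = scores.values.sum
    [best, total.zero? ? 0.0 : hits.to_f / total]
  end

  # Answers predicates such as `boring?` for any known category.
  def method_missing(name, *args)
    category = name.to_s.chomp('?').to_sym
    return super unless name.to_s.end_with?('?') && @words.key?(category)
    classify(*args).first == category
  end

  def respond_to_missing?(name, include_private = false)
    (name.to_s.end_with?('?') && @words.key?(name.to_s.chomp('?').to_sym)) || super
  end

  private

  def tokenize(text)
    text.downcase.scan(/[a-z]+/)
  end
end

toy = ToyClassifier.new([:interesting, :boring])
toy.train! :interesting, 'Some data set with a decent signal to noise ratio.'
toy.train! :boring, 'Pig latin, as in lorem ipsum dolor sit amet.'

p toy.classify('Lorem ipsum')     # => [:boring, 1.0]
p toy.interesting?('Lorem ipsum') # => false
p toy.boring?('Lorem ipsum')      # => true
```

The scores here come from naive word counting, so they are not comparable to the probabilities reported by CRM114; only the calling convention mirrors the example above.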
Arto Bendiken (arto.bendiken@gmail.com) - bendiken.net
Released under the terms of the MIT license. See the accompanying LICENSE file for more information.