Matcher derives from Ruby Quiz #103, the DictionaryMatcher quiz.
Creates a DictionaryMatcher with no words in it.
# File lib/language/matcher.rb, line 25
def initialize
  @trie = {}
  @word_count = 0
end
Determines whether any word in the DictionaryMatcher occurs as a substring
of text. Returns the index of the first match found, or
nil if there is no match.
# File lib/language/matcher.rb, line 88
def =~ text
  internal_match(text) { |md| return md.index }
  nil
end
Adds word to the DictionaryMatcher and returns self, so calls can be chained.
# File lib/language/matcher.rb, line 31
def add(word)
  @word_count += 1
  container = @trie
  containers = []
  i = 0
  word.each_byte do |b|
    container[b] = {} unless container.has_key? b
    container[:depth] = i
    containers << container
    container = container[b]
    i += 1
  end
  containers << container
  container[0] = true # Mark end of word
  container[:depth] = i
  ff = compute_failure_function word
  ff.zip(containers).each do |pointto, container|
    container[:failure] = containers[pointto] if pointto
  end
  self
end
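The compute_failure_function helper called by add is not shown in this listing. The following is a minimal sketch of what such a helper could look like, assuming it is the Knuth-Morris-Pratt prefix function indexed by trie depth, with nil for the root so that add skips it. The name failure_depths is hypothetical and the real helper may differ.

```ruby
# Hypothetical stand-in for compute_failure_function (an assumption, not the
# library's code). Returns an array ff where ff[d] is the depth of the node
# to fall back to from the node at depth d on the word's own path.
# ff[0] is nil because the root has no failure link.
def failure_depths(word)
  bytes = word.bytes
  ff = [nil]
  return ff if bytes.empty?
  ff << 0                  # depth 1 always falls back to the root
  k = 0                    # length of the longest border found so far
  2.upto(bytes.size) do |q|
    # Shrink the border until it can be extended by bytes[q - 1].
    k = ff[k] while k > 0 && bytes[k] != bytes[q - 1]
    k += 1 if bytes[k] == bytes[q - 1]
    ff[q] = k
  end
  ff
end

p failure_depths("abab") # prints [nil, 0, 0, 1, 2]
```

Because add resolves each entry through containers[pointto], every failure link it sets points at a node on the path of the word being added.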
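Determines whether word was previously added to
the trie.
# File lib/language/matcher.rb, line 75
def include?(word)
  container = @trie
  word.each_byte do |b|
    # Stop as soon as a byte is missing. Falling through to the check
    # below would report true for a string that merely extends a stored word.
    return false unless container.has_key? b
    container = container[b]
  end
  container[0] # true only if a word ends at this node
end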
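To make the lookup concrete, here is a small self-contained sketch of the same trie layout: nested hashes keyed by byte, with key 0 marking the end of a word. The class name TinyTrie is hypothetical, and the sketch leaves out the depth and failure bookkeeping that add performs.

```ruby
# Hypothetical stand-in illustrating the trie layout; not the library class.
class TinyTrie
  def initialize
    @trie = {}
  end

  # Walks or creates one nested hash per byte, then marks the end of the word.
  def add(word)
    node = @trie
    word.each_byte { |b| node = (node[b] ||= {}) }
    node[0] = true # same end-of-word marker as in the listing
    self
  end

  # True only if every byte is present and a word ends at the final node.
  def include?(word)
    node = @trie
    word.each_byte do |b|
      return false unless node.key?(b)
      node = node[b]
    end
    node[0] == true
  end
end

trie = TinyTrie.new.add("cat").add("car")
p trie.include?("cat")  # prints true
p trie.include?("ca")   # prints false: a prefix, but no word ends here
p trie.include?("cart") # prints false: runs past the stored word
```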
# File lib/language/matcher.rb, line 20
def inspect
  to_s
end
Determines whether any word in the DictionaryMatcher occurs as a substring
of text. Returns a DictionaryMatcher::MatchData object for the first match
found, or nil if there is no match.
# File lib/language/matcher.rb, line 97
def match text
  internal_match(text) { |md| return md }
  nil
end
Scans text for all occurrences of strings in the
DictionaryMatcher. Overlapping matches are skipped (only the first one is
yielded), and when some strings in the DictionaryMatcher are substrings of
others, only the shortest match at a given position is found.
Without a block, the matches are collected and returned as an array. With a
block, each match is yielded to it and the returned array is empty.
# File lib/language/matcher.rb, line 135
def scan(text, &block)
  matches = []
  block = lambda { |md| matches << md } unless block
  internal_match(text, &block)
  matches
end
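The internal_match method that scan relies on is not shown here. The sketch below is a naive reference for one reading of the documented semantics: take the shortest word at the leftmost matching position, then resume after it so overlapping matches are skipped. The name naive_scan is hypothetical, it works on characters rather than bytes, and it returns [index, word] pairs instead of MatchData objects. The real method may choose differently when a short word sits inside a longer one that starts earlier.

```ruby
# Hypothetical reference for the documented scan semantics (an assumption
# about behaviour, not the library's internal_match).
def naive_scan(words, text)
  matches = []
  pos = 0
  while pos < text.length
    # Among the dictionary words that start exactly at pos, take the shortest.
    hit = words.select { |w| !w.empty? && text[pos, w.length] == w }
               .min_by(&:length)
    if hit
      matches << [pos, hit]
      pos += hit.length # resume after the match, so overlaps are skipped
    else
      pos += 1
    end
  end
  matches
end

p naive_scan(%w[he hers she], "ushers") # prints [[1, "she"]]
p naive_scan(%w[a ab], "abab")          # prints [[0, "a"], [2, "a"]]
```

In the first call, "he" and "hers" both start inside the span already claimed by "she", so neither is reported. In the second, "a" wins over "ab" at each position because it is shorter.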