<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
<title>Class: String</title>
<link rel="stylesheet" href="./rdoc.css" type="text/css" media="screen" />
<script src="./js/jquery.js" type="text/javascript" charset="utf-8"></script>
<script src="./js/thickbox-compressed.js" type="text/javascript" charset="utf-8"></script>
<script src="./js/quicksearch.js" type="text/javascript" charset="utf-8"></script>
<script src="./js/darkfish.js" type="text/javascript" charset="utf-8"></script>
<div id="home-metadata">
<div id="home-section" class="section">
<h3 class="section-header">
<a href="./index.html">Home</a>
<a href="./index.html#classes">Classes</a>
<a href="./index.html#methods">Methods</a>
<div id="file-metadata">
<div id="file-list-section" class="section">
<h3 class="section-header">In Files</h3>
<div class="section-body">
<li><a href="./lib/porter2_rb.html?TB_iframe=true&amp;height=550&amp;width=785"
  class="thickbox" title="lib/porter2.rb">lib/porter2.rb</a></li>
<div id="class-metadata">
<div id="parent-class-section" class="section">
<h3 class="section-header">Parent</h3>
<p class="link">Object</p>
<!-- Namespace Contents -->
<!-- Method Quickref -->
<div id="method-list-section" class="section">
<h3 class="section-header">Methods</h3>
<ul class="link-list">
<li><a href="#method-i-porter2_ends_with_short_syllable%3F">#porter2_ends_with_short_syllable?</a></li>
<li><a href="#method-i-porter2_is_short_word%3F">#porter2_is_short_word?</a></li>
<li><a href="#method-i-porter2_postprocess">#porter2_postprocess</a></li>
<li><a href="#method-i-porter2_preprocess">#porter2_preprocess</a></li>
<li><a href="#method-i-porter2_r1">#porter2_r1</a></li>
<li><a href="#method-i-porter2_r2">#porter2_r2</a></li>
<li><a href="#method-i-porter2_stem">#porter2_stem</a></li>
<li><a href="#method-i-porter2_stem_verbose">#porter2_stem_verbose</a></li>
<li><a href="#method-i-porter2_step0">#porter2_step0</a></li>
<li><a href="#method-i-porter2_step1a">#porter2_step1a</a></li>
<li><a href="#method-i-porter2_step1b">#porter2_step1b</a></li>
<li><a href="#method-i-porter2_step1c">#porter2_step1c</a></li>
<li><a href="#method-i-porter2_step2">#porter2_step2</a></li>
<li><a href="#method-i-porter2_step3">#porter2_step3</a></li>
<li><a href="#method-i-porter2_step4">#porter2_step4</a></li>
<li><a href="#method-i-porter2_step5">#porter2_step5</a></li>
<li><a href="#method-i-porter2_tidy">#porter2_tidy</a></li>
<li><a href="#method-i-stem">#stem</a></li>
<!-- Included Modules -->
<div id="project-metadata">
<div id="classindex-section" class="section project-section">
<h3 class="section-header">Class Index
<span class="search-toggle"><img src="./images/find.png"
  height="16" width="16" alt="[+]"
  title="show/hide quicksearch" /></span></h3>
<form action="#" method="get" accept-charset="utf-8" class="initially-hidden">
<legend>Quicksearch</legend>
<input type="text" name="quicksearch" value=""
  class="quicksearch-field" />
<ul class="link-list">
<li><a href="./Porter2.html">Porter2</a></li>
<li><a href="./String.html">String</a></li>
<li><a href="./TestPorter2.html">TestPorter2</a></li>
<div id="no-class-search-results" style="display: none;">No matching classes.</div>
<div id="documentation">
<h1 class="class">String</h1>
<div id="description">
<h2>The Porter 2 stemmer</h2>
<p>
This is the Porter 2 stemming algorithm, as described at
<a href="http://snowball.tartarus.org/algorithms/english/stemmer.html">snowball.tartarus.org/algorithms/english/stemmer.html</a>.
The original paper is:
</p>
<p>
Porter, 1980, “An algorithm for suffix stripping”,
<em>Program</em>, Vol. 14, no. 3, pp. 130-137
</p>
<p>
Constants for the stemmer are in <a href="Porter2.html">Porter2</a>.
</p>
<p>
Procedures that implement the stemmer are added to the
<a href="String.html">String</a> class.
</p>
<p>
The stemmer algorithm is implemented in the
<a href="String.html#method-i-porter2_stem">porter2_stem</a> procedure.
</p>
<h2>Internationalisation</h2>
<p>
There isn’t much, as this is a stemmer that only works for English.
</p>
<p>
The <tt>gb_english</tt> flag to the various procedures allows the stemmer
to treat the British English ‘-ise’ the same as the American
English ‘-ize’.
</p>
<h2>Longest suffixes</h2>
<p>
Several places in the algorithm require matching the longest suffix of a
word. The regexp engine in Ruby 1.9 seems to handle alternatives in regexps
by finding the alternative that matches at the first position in the
string. As we’re only talking about suffixes, which are anchored at the end
of the word, that first match is also the longest suffix. If the regexp
engine changes, this behaviour may change and break the stemmer.
</p>
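The behaviour described above can be checked with a small sketch in core Ruby. The helper name `longest_suffix` and the suffix lists are illustrative only and are not part of lib/porter2.rb. Because the pattern is anchored at the end of the string, the earliest position at which any alternative can match is the start of the longest matching suffix, whatever order the alternatives are written in.

```ruby
# Illustrative helper (hypothetical, not part of lib/porter2.rb): return the
# suffix matched by an end-anchored alternation, or nil if nothing matches.
def longest_suffix(word, alternatives)
  pattern = /(#{alternatives.map { |s| Regexp.escape(s) }.join('|')})$/
  word[pattern, 1]
end

# The shortest alternative is listed first, yet the longest suffix is found,
# because the engine tries earlier start positions before later ones.
puts longest_suffix("hopefully", %w[ly ully fully])  # prints "fully"
puts longest_suffix("cat's'", ["'", "'s", "'s'"])    # prints "'s'"
```

This relies on the engine scanning start positions from left to right, which is the assumption the paragraph above warns about.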
<div id="public-instance-method-details" class="method-section section">
<h3 class="section-header">Public Instance Methods</h3>
<div id="porter-ends-with-short-syllable--method" class="method-detail ">
<a name="method-i-porter2_ends_with_short_syllable%3F"></a>
<div class="method-heading">
<span class="method-name">porter2_ends_with_short_syllable?</span><span class="method-args">()</span>
<span class="method-click-advice">click to toggle source</span>
<div class="method-description">
<p>
Returns true if the word ends with a short syllable.
</p>
<div class="method-source-code" id="porter-ends-with-short-syllable--source">
<span class="ruby-comment cmt"># File lib/porter2.rb, line 87</span>
<span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_ends_with_short_syllable?</span>
  <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/#{Porter2::SHORT_SYLLABLE}$/</span> <span class="ruby-operator">?</span> <span class="ruby-keyword kw">true</span> <span class="ruby-operator">:</span> <span class="ruby-keyword kw">false</span>
<span class="ruby-keyword kw">end</span></pre>
<div id="porter-is-short-word--method" class="method-detail ">
<a name="method-i-porter2_is_short_word%3F"></a>
<div class="method-heading">
<span class="method-name">porter2_is_short_word?</span><span class="method-args">()</span>
<span class="method-click-advice">click to toggle source</span>
<div class="method-description">
<p>
A word is short if it ends in a short syllable and R1 is empty.
</p>
<div class="method-source-code" id="porter-is-short-word--source">
<span class="ruby-comment cmt"># File lib/porter2.rb, line 93</span>
<span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_is_short_word?</span>
  <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_ends_with_short_syllable?</span> <span class="ruby-keyword kw">and</span> <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_r1</span>.<span class="ruby-identifier">empty?</span>
<span class="ruby-keyword kw">end</span></pre>
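The two predicates above can be tried out with the standalone sketch below. Everything in it is an assumption made for illustration: the module name, the character classes `V` and `C`, and in particular `SHORT_SYLLABLE`, which is one reading of the Snowball rule (a non-vowel, a vowel, then a non-vowel other than w, x or Y; or a vowel at the start of the word followed by a non-vowel). The real `Porter2::SHORT_SYLLABLE` is not shown on this page and may be written differently. The simplified `r1` also omits the special prefixes.

```ruby
module ShortWordSketch
  # Assumed stand-ins for constants defined in Porter2 (not shown on this page).
  V = "[aeiouy]"
  C = "[^aeiouy]"
  SHORT_SYLLABLE = "(#{C}#{V}[^aeiouywxY]|^#{V}#{C})"

  module_function

  # Simplified R1: the text after the first non-vowel that follows a vowel.
  def r1(word)
    word =~ /#{V}#{C}(?<r1>.*)$/ ? Regexp.last_match(:r1) : ""
  end

  def ends_with_short_syllable?(word)
    !!(word =~ /#{SHORT_SYLLABLE}$/)
  end

  # Short word: ends in a short syllable and has an empty R1.
  def short_word?(word)
    ends_with_short_syllable?(word) && r1(word).empty?
  end
end

puts ShortWordSketch.short_word?("bed")    # prints "true"
puts ShortWordSketch.short_word?("embed")  # prints "false" (R1 is "bed")
puts ShortWordSketch.short_word?("bead")   # prints "false" (no short syllable)
```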
<div id="porter-postprocess-method" class="method-detail ">
<a name="method-i-porter2_postprocess"></a>
<div class="method-heading">
<span class="method-name">porter2_postprocess</span><span class="method-args">()</span>
<span class="method-click-advice">click to toggle source</span>
<div class="method-description">
<p>
Turn all Y letters into y.
</p>
<div class="method-source-code" id="porter-postprocess-source">
<span class="ruby-comment cmt"># File lib/porter2.rb, line 289</span>
<span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_postprocess</span>
  <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">gsub</span>(<span class="ruby-regexp re">/Y/</span>, <span class="ruby-value str">'y'</span>)
<span class="ruby-keyword kw">end</span></pre>
<div id="porter-preprocess-method" class="method-detail ">
<a name="method-i-porter2_preprocess"></a>
<div class="method-heading">
<span class="method-name">porter2_preprocess</span><span class="method-args">()</span>
<span class="method-click-advice">click to toggle source</span>
<div class="method-description">
<p>
Preprocess the word. Remove any initial apostrophe, if present. Then set
initial y, or y after a vowel, to Y.
</p>
<p>
(The instruction to ‘establish the regions R1 and R2’ in the
original description is an implementation optimisation that identifies
where the regions start. As no modifications are made to the word that
affect those positions, you may want to cache them now. This implementation
doesn’t do that.)
</p>
<div class="method-source-code" id="porter-preprocess-source">
<span class="ruby-comment cmt"># File lib/porter2.rb, line 53</span>
<span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_preprocess</span>
  <span class="ruby-identifier">w</span> = <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">dup</span>

  <span class="ruby-comment cmt"># remove any initial apostrophe</span>
  <span class="ruby-identifier">w</span>.<span class="ruby-identifier">gsub!</span>(<span class="ruby-regexp re">/^'*(.)/</span>, <span class="ruby-value str">'\1'</span>)

  <span class="ruby-comment cmt"># set initial y, or y after a vowel, to Y</span>
  <span class="ruby-identifier">w</span>.<span class="ruby-identifier">gsub!</span>(<span class="ruby-regexp re">/^y/</span>, <span class="ruby-value str">"Y"</span>)
  <span class="ruby-identifier">w</span>.<span class="ruby-identifier">gsub!</span>(<span class="ruby-node">/(#{Porter2::V})y/</span>, <span class="ruby-value str">'\1Y'</span>)

  <span class="ruby-identifier">w</span>
<span class="ruby-keyword kw">end</span></pre>
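To see the effect of preprocessing on a couple of words, here is a simplified standalone sketch. The module name and the vowel class `V` are assumptions standing in for `Porter2::V`, which is not shown on this page, and the apostrophe handling is written with `sub!` rather than the `gsub!` call used above.

```ruby
module PreprocessSketch
  V = "[aeiouy]"  # assumed stand-in for Porter2::V

  module_function

  def preprocess(word)
    w = word.dup
    w.sub!(/^'+/, "")          # drop any leading apostrophes
    w.sub!(/^y/, "Y")          # an initial y is marked as a consonant
    w.gsub!(/(#{V})y/, '\1Y')  # so is a y directly after a vowel
    w
  end
end

puts PreprocessSketch.preprocess("'youth")  # prints "Youth"
puts PreprocessSketch.preprocess("boyish")  # prints "boYish"
puts PreprocessSketch.preprocess("cry")     # prints "cry"
```

Marking these letters as Y lets the later steps treat them as non-vowels using the same character classes; porter2_postprocess turns them back into y at the end.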
<div id="porter-r--method" class="method-detail ">
<a name="method-i-porter2_r1"></a>
<div class="method-heading">
<span class="method-name">porter2_r1</span><span class="method-args">()</span>
<span class="method-click-advice">click to toggle source</span>
<div class="method-description">
<p>
R1 is the portion of the word after the first non-vowel after the first
vowel (with words beginning ‘gener-’, ‘commun-’,
and ‘arsen-’ treated as special cases).
</p>
<div class="method-source-code" id="porter-r--source">
<span class="ruby-comment cmt"># File lib/porter2.rb, line 69</span>
<span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_r1</span>
  <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/^(gener|commun|arsen)(?&lt;r1&gt;.*)/</span>
    <span class="ruby-constant">Regexp</span>.<span class="ruby-identifier">last_match</span>(<span class="ruby-value">:r1</span>)
  <span class="ruby-keyword kw">else</span>
    <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/#{Porter2::V}#{Porter2::C}(?&lt;r1&gt;.*)$/</span>
    <span class="ruby-constant">Regexp</span>.<span class="ruby-identifier">last_match</span>(<span class="ruby-value">:r1</span>) <span class="ruby-operator">||</span> <span class="ruby-value str">""</span>
  <span class="ruby-keyword kw">end</span>
<span class="ruby-keyword kw">end</span></pre>
<div id="porter-r--method" class="method-detail ">
<a name="method-i-porter2_r2"></a>
<div class="method-heading">
<span class="method-name">porter2_r2</span><span class="method-args">()</span>
<span class="method-click-advice">click to toggle source</span>
<div class="method-description">
<p>
R2 is the portion of R1 (<a href="String.html#method-i-porter2_r1">porter2_r1</a>)
after the first non-vowel after the first vowel.
</p>
<div class="method-source-code" id="porter-r--source">
<span class="ruby-comment cmt"># File lib/porter2.rb, line 80</span>
<span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_r2</span>
  <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_r1</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/#{Porter2::V}#{Porter2::C}(?&lt;r2&gt;.*)$/</span>
  <span class="ruby-constant">Regexp</span>.<span class="ruby-identifier">last_match</span>(<span class="ruby-value">:r2</span>) <span class="ruby-operator">||</span> <span class="ruby-value str">""</span>
<span class="ruby-keyword kw">end</span></pre>
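The two region definitions can be illustrated with a standalone sketch that mirrors the methods above as plain functions. The module name and the character classes `V` and `C` are assumptions standing in for `Porter2::V` and `Porter2::C`, whose definitions are not shown on this page; the vowel set a, e, i, o, u, y follows the Snowball description.

```ruby
module RegionSketch
  # Assumed stand-ins for Porter2::V and Porter2::C (not shown on this page).
  V = "[aeiouy]"
  C = "[^aeiouy]"

  module_function

  # R1: everything after the first non-vowel that follows a vowel,
  # with three prefixes handled as special cases.
  def r1(word)
    if word =~ /^(gener|commun|arsen)(?<r1>.*)/
      Regexp.last_match(:r1)
    else
      word =~ /#{V}#{C}(?<r1>.*)$/
      Regexp.last_match(:r1) || ""
    end
  end

  # R2: the same rule applied to R1 instead of the whole word.
  def r2(word)
    r1(word) =~ /#{V}#{C}(?<r2>.*)$/
    Regexp.last_match(:r2) || ""
  end
end

puts RegionSketch.r1("beautiful")  # prints "iful"
puts RegionSketch.r2("beautiful")  # prints "ul"
puts RegionSketch.r1("generous")   # prints "ous"
```

When no vowel is followed by a non-vowel, the match fails, `Regexp.last_match` returns nil, and the region falls back to the empty string.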
<div id="porter-stem-method" class="method-detail ">
<a name="method-i-porter2_stem"></a>
<div class="method-heading">
<span class="method-name">porter2_stem</span><span class="method-args">(gb_english = false)</span>
<span class="method-click-advice">click to toggle source</span>
<div class="method-description">
<p>
Perform the stemming procedure. If <tt>gb_english</tt> is true, treat
‘-ise’ and similar suffixes as ‘-ize’ in American English.
</p>
<div class="method-source-code" id="porter-stem-source">
<span class="ruby-comment cmt"># File lib/porter2.rb, line 297</span>
<span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_stem</span>(<span class="ruby-identifier">gb_english</span> = <span class="ruby-keyword kw">false</span>)
  <span class="ruby-identifier">preword</span> = <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_tidy</span>
  <span class="ruby-keyword kw">return</span> <span class="ruby-identifier">preword</span> <span class="ruby-keyword kw">if</span> <span class="ruby-identifier">preword</span>.<span class="ruby-identifier">length</span> <span class="ruby-operator">&lt;=</span> <span class="ruby-value">2</span>

  <span class="ruby-identifier">word</span> = <span class="ruby-identifier">preword</span>.<span class="ruby-identifier">porter2_preprocess</span>

  <span class="ruby-keyword kw">if</span> <span class="ruby-constant">Porter2</span><span class="ruby-operator">::</span><span class="ruby-constant">SPECIAL_CASES</span>.<span class="ruby-identifier">has_key?</span> <span class="ruby-identifier">word</span>
    <span class="ruby-constant">Porter2</span><span class="ruby-operator">::</span><span class="ruby-constant">SPECIAL_CASES</span>[<span class="ruby-identifier">word</span>]
  <span class="ruby-keyword kw">else</span>
    <span class="ruby-identifier">w1a</span> = <span class="ruby-identifier">word</span>.<span class="ruby-identifier">porter2_step0</span>.<span class="ruby-identifier">porter2_step1a</span>
    <span class="ruby-keyword kw">if</span> <span class="ruby-constant">Porter2</span><span class="ruby-operator">::</span><span class="ruby-constant">STEP_1A_SPECIAL_CASES</span>.<span class="ruby-identifier">include?</span> <span class="ruby-identifier">w1a</span>
      <span class="ruby-identifier">w1a</span>
    <span class="ruby-keyword kw">else</span>
      <span class="ruby-identifier">w1a</span>.<span class="ruby-identifier">porter2_step1b</span>(<span class="ruby-identifier">gb_english</span>).<span class="ruby-identifier">porter2_step1c</span>.<span class="ruby-identifier">porter2_step2</span>(<span class="ruby-identifier">gb_english</span>).<span class="ruby-identifier">porter2_step3</span>(<span class="ruby-identifier">gb_english</span>).<span class="ruby-identifier">porter2_step4</span>(<span class="ruby-identifier">gb_english</span>).<span class="ruby-identifier">porter2_step5</span>.<span class="ruby-identifier">porter2_postprocess</span>
    <span class="ruby-keyword kw">end</span>
  <span class="ruby-keyword kw">end</span>
<span class="ruby-keyword kw">end</span></pre>
<div class="aliases">
Also aliased as: <a href="String.html#method-i-stem">stem</a>
<div id="porter-stem-verbose-method" class="method-detail ">
<a name="method-i-porter2_stem_verbose"></a>
<div class="method-heading">
<span class="method-name">porter2_stem_verbose</span><span class="method-args">(gb_english = false)</span>
<span class="method-click-advice">click to toggle source</span>
<div class="method-description">
<p>
A verbose version of <a href="String.html#method-i-porter2_stem">porter2_stem</a>
that prints the output of each stage to STDOUT.
</p>
<div class="method-source-code" id="porter-stem-verbose-source">
<span class="ruby-comment cmt"># File lib/porter2.rb, line 316</span>
<span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_stem_verbose</span>(<span class="ruby-identifier">gb_english</span> = <span class="ruby-keyword kw">false</span>)
  <span class="ruby-identifier">preword</span> = <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_tidy</span>
  <span class="ruby-identifier">puts</span> <span class="ruby-node">"Preword: #{preword}"</span>
  <span class="ruby-keyword kw">return</span> <span class="ruby-identifier">preword</span> <span class="ruby-keyword kw">if</span> <span class="ruby-identifier">preword</span>.<span class="ruby-identifier">length</span> <span class="ruby-operator">&lt;=</span> <span class="ruby-value">2</span>

  <span class="ruby-identifier">word</span> = <span class="ruby-identifier">preword</span>.<span class="ruby-identifier">porter2_preprocess</span>
  <span class="ruby-identifier">puts</span> <span class="ruby-node">"Preprocessed: #{word}"</span>

  <span class="ruby-keyword kw">if</span> <span class="ruby-constant">Porter2</span><span class="ruby-operator">::</span><span class="ruby-constant">SPECIAL_CASES</span>.<span class="ruby-identifier">has_key?</span> <span class="ruby-identifier">word</span>
    <span class="ruby-identifier">puts</span> <span class="ruby-node">"Returning #{word} as special case #{Porter2::SPECIAL_CASES[word]}"</span>
    <span class="ruby-constant">Porter2</span><span class="ruby-operator">::</span><span class="ruby-constant">SPECIAL_CASES</span>[<span class="ruby-identifier">word</span>]
  <span class="ruby-keyword kw">else</span>
    <span class="ruby-identifier">r1</span> = <span class="ruby-identifier">word</span>.<span class="ruby-identifier">porter2_r1</span>
    <span class="ruby-identifier">r2</span> = <span class="ruby-identifier">word</span>.<span class="ruby-identifier">porter2_r2</span>
    <span class="ruby-identifier">puts</span> <span class="ruby-node">"R1 = #{r1}, R2 = #{r2}"</span>

    <span class="ruby-identifier">w0</span> = <span class="ruby-identifier">word</span>.<span class="ruby-identifier">porter2_step0</span> ; <span class="ruby-identifier">puts</span> <span class="ruby-node">"After step 0: #{w0} (R1 = #{w0.porter2_r1}, R2 = #{w0.porter2_r2})"</span>
    <span class="ruby-identifier">w1a</span> = <span class="ruby-identifier">w0</span>.<span class="ruby-identifier">porter2_step1a</span> ; <span class="ruby-identifier">puts</span> <span class="ruby-node">"After step 1a: #{w1a} (R1 = #{w1a.porter2_r1}, R2 = #{w1a.porter2_r2})"</span>

    <span class="ruby-keyword kw">if</span> <span class="ruby-constant">Porter2</span><span class="ruby-operator">::</span><span class="ruby-constant">STEP_1A_SPECIAL_CASES</span>.<span class="ruby-identifier">include?</span> <span class="ruby-identifier">w1a</span>
      <span class="ruby-identifier">puts</span> <span class="ruby-node">"Returning #{w1a} as 1a special case"</span>
      <span class="ruby-identifier">w1a</span>
    <span class="ruby-keyword kw">else</span>
      <span class="ruby-identifier">w1b</span> = <span class="ruby-identifier">w1a</span>.<span class="ruby-identifier">porter2_step1b</span>(<span class="ruby-identifier">gb_english</span>) ; <span class="ruby-identifier">puts</span> <span class="ruby-node">"After step 1b: #{w1b} (R1 = #{w1b.porter2_r1}, R2 = #{w1b.porter2_r2})"</span>
      <span class="ruby-identifier">w1c</span> = <span class="ruby-identifier">w1b</span>.<span class="ruby-identifier">porter2_step1c</span> ; <span class="ruby-identifier">puts</span> <span class="ruby-node">"After step 1c: #{w1c} (R1 = #{w1c.porter2_r1}, R2 = #{w1c.porter2_r2})"</span>
      <span class="ruby-identifier">w2</span> = <span class="ruby-identifier">w1c</span>.<span class="ruby-identifier">porter2_step2</span>(<span class="ruby-identifier">gb_english</span>) ; <span class="ruby-identifier">puts</span> <span class="ruby-node">"After step 2: #{w2} (R1 = #{w2.porter2_r1}, R2 = #{w2.porter2_r2})"</span>
      <span class="ruby-identifier">w3</span> = <span class="ruby-identifier">w2</span>.<span class="ruby-identifier">porter2_step3</span>(<span class="ruby-identifier">gb_english</span>) ; <span class="ruby-identifier">puts</span> <span class="ruby-node">"After step 3: #{w3} (R1 = #{w3.porter2_r1}, R2 = #{w3.porter2_r2})"</span>
      <span class="ruby-identifier">w4</span> = <span class="ruby-identifier">w3</span>.<span class="ruby-identifier">porter2_step4</span>(<span class="ruby-identifier">gb_english</span>) ; <span class="ruby-identifier">puts</span> <span class="ruby-node">"After step 4: #{w4} (R1 = #{w4.porter2_r1}, R2 = #{w4.porter2_r2})"</span>
      <span class="ruby-identifier">w5</span> = <span class="ruby-identifier">w4</span>.<span class="ruby-identifier">porter2_step5</span> ; <span class="ruby-identifier">puts</span> <span class="ruby-node">"After step 5: #{w5}"</span>
      <span class="ruby-identifier">wpost</span> = <span class="ruby-identifier">w5</span>.<span class="ruby-identifier">porter2_postprocess</span> ; <span class="ruby-identifier">puts</span> <span class="ruby-node">"After postprocess: #{wpost}"</span>
      <span class="ruby-identifier">wpost</span>
    <span class="ruby-keyword kw">end</span>
  <span class="ruby-keyword kw">end</span>
<span class="ruby-keyword kw">end</span></pre>
<div id="porter-step--method" class="method-detail ">
<a name="method-i-porter2_step0"></a>
<div class="method-heading">
<span class="method-name">porter2_step0</span><span class="method-args">()</span>
<span class="method-click-advice">click to toggle source</span>
<div class="method-description">
<p>
Search for the longest among the suffixes <tt>'</tt>, <tt>'s</tt> and
<tt>'s'</tt>, and remove it if found.
</p>
<div class="method-source-code" id="porter-step--source">
<span class="ruby-comment cmt"># File lib/porter2.rb, line 103</span>
<span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_step0</span>
  <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub!</span>(<span class="ruby-regexp re">/(.)('s'|'s|')$/</span>, <span class="ruby-value str">'\1'</span>) <span class="ruby-operator">||</span> <span class="ruby-keyword kw">self</span>
<span class="ruby-keyword kw">end</span></pre>
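A quick way to see what step 0 does is the standalone sketch below. The function name `strip_possessive` is hypothetical; the regular expression is the one used above. Unlike the method above, which calls `sub!` on the receiver, the sketch uses the non-destructive `sub`, so the argument is left unchanged.

```ruby
# Hypothetical standalone version of step 0: strip a trailing ', 's or 's'
# as long as at least one other character precedes it.
def strip_possessive(word)
  word.sub(/(.)('s'|'s|')$/, '\1')
end

puts strip_possessive("dog's")   # prints "dog"
puts strip_possessive("dogs'")   # prints "dogs"
puts strip_possessive("dog's'")  # prints "dog"
puts strip_possessive("dog")     # prints "dog"
```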
<div id="porter-step-a-method" class="method-detail ">
<a name="method-i-porter2_step1a"></a>
<div class="method-heading">
<span class="method-name">porter2_step1a</span><span class="method-args">()</span>
<span class="method-click-advice">click to toggle source</span>
<div class="method-description">
<p>
Search for the longest among the following suffixes, and perform the action
indicated.
</p>
<table>
<tr><td valign="top">sses</td><td><p>
replace by ss
</p></td></tr>
<tr><td valign="top">ied, ies</td><td><p>
replace by i if preceded by more than one letter, otherwise by ie
</p></td></tr>
<tr><td valign="top">s</td><td><p>
delete if the preceding word part contains a vowel not immediately before
the s
</p></td></tr>
<tr><td valign="top">us, ss</td><td><p>
do nothing
</p></td></tr>
</table>
<div class="method-source-code" id="porter-step-a-source">
<span class="ruby-comment cmt"># File lib/porter2.rb, line 113</span>
<span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_step1a</span>
  <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/sses$/</span>
    <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/sses$/</span>, <span class="ruby-value str">'ss'</span>)
  <span class="ruby-keyword kw">elsif</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/..(ied|ies)$/</span>
    <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/(ied|ies)$/</span>, <span class="ruby-value str">'i'</span>)
  <span class="ruby-keyword kw">elsif</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/(ied|ies)$/</span>
    <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/(ied|ies)$/</span>, <span class="ruby-value str">'ie'</span>)
  <span class="ruby-keyword kw">elsif</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/(us|ss)$/</span>
    <span class="ruby-keyword kw">self</span>
  <span class="ruby-keyword kw">elsif</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/s$/</span>
    <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/(#{Porter2::V}.+)s$/</span>
      <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/s$/</span>, <span class="ruby-value str">''</span>)
    <span class="ruby-keyword kw">else</span>
      <span class="ruby-keyword kw">self</span>
    <span class="ruby-keyword kw">end</span>
  <span class="ruby-keyword kw">else</span>
    <span class="ruby-keyword kw">self</span>
  <span class="ruby-keyword kw">end</span>
<span class="ruby-keyword kw">end</span></pre>
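The rules of step 1a can be exercised with the compact sketch below, which restates the branches above as a `case` expression. The module name and the vowel class `V` are assumptions standing in for `Porter2::V`, which is not shown on this page.

```ruby
module Step1aSketch
  V = "[aeiouy]"  # assumed stand-in for Porter2::V

  module_function

  def step1a(word)
    case word
    when /sses$/        then word.sub(/sses$/, "ss")
    when /..(ied|ies)$/ then word.sub(/(ied|ies)$/, "i")   # more than one letter before
    when /(ied|ies)$/   then word.sub(/(ied|ies)$/, "ie")  # at most one letter before
    when /(us|ss)$/     then word                          # do nothing
    when /#{V}.+s$/     then word.sub(/s$/, "")            # vowel not right before the s
    else word
    end
  end
end

puts Step1aSketch.step1a("caresses")  # prints "caress"
puts Step1aSketch.step1a("cries")     # prints "cri"
puts Step1aSketch.step1a("ties")      # prints "tie"
puts Step1aSketch.step1a("gaps")      # prints "gap"
puts Step1aSketch.step1a("gas")       # prints "gas"
```

The order of the branches matters: the longer suffixes are tested before the bare s, which is how the sketch (like the method above) honours the longest-suffix rule.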
<div id="porter-step-b-method" class="method-detail ">
<a name="method-i-porter2_step1b"></a>
<div class="method-heading">
<span class="method-name">porter2_step1b</span><span class="method-args">(gb_english = false)</span>
<span class="method-click-advice">click to toggle source</span>
<div class="method-description">
<p>
Search for the longest among the following suffixes, and perform the action
indicated.
</p>
<table>
<tr><td valign="top">eed, eedly</td><td><p>
replace by ee if the suffix is also in R1
</p></td></tr>
<tr><td valign="top">ed, edly, ing, ingly</td><td><p>
delete if the preceding word part contains a vowel and, after the
deletion:
</p>
<ul>
<li>if the word ends at, bl or iz: add e, or</li>
<li>if the word ends with a double: remove the last letter, or</li>
<li>if the word is short: add e</li>
</ul>
</td></tr>
</table>
<p>
(If gb_english is <tt>true</tt>, treat the ‘is’ suffix as
‘iz’ above.)
</p>
<div class="method-source-code" id="porter-step-b-source">
<span class="ruby-comment cmt"># File lib/porter2.rb, line 143</span>
<span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_step1b</span>(<span class="ruby-identifier">gb_english</span> = <span class="ruby-keyword kw">false</span>)
  <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/(eed|eedly)$/</span>
    <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_r1</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/(eed|eedly)$/</span>
      <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/(eed|eedly)$/</span>, <span class="ruby-value str">'ee'</span>)
    <span class="ruby-keyword kw">else</span>
      <span class="ruby-keyword kw">self</span>
    <span class="ruby-keyword kw">end</span>
  <span class="ruby-keyword kw">else</span>
    <span class="ruby-identifier">w</span> = <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">dup</span>
    <span class="ruby-keyword kw">if</span> <span class="ruby-identifier">w</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/#{Porter2::V}.*(ed|edly|ing|ingly)$/</span>
      <span class="ruby-identifier">w</span>.<span class="ruby-identifier">sub!</span>(<span class="ruby-regexp re">/(ed|edly|ing|ingly)$/</span>, <span class="ruby-value str">''</span>)
      <span class="ruby-keyword kw">if</span> <span class="ruby-identifier">w</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/(at|bl|iz)$/</span>
        <span class="ruby-identifier">w</span> <span class="ruby-operator">+=</span> <span class="ruby-value str">'e'</span>
      <span class="ruby-keyword kw">elsif</span> <span class="ruby-identifier">w</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/is$/</span> <span class="ruby-keyword kw">and</span> <span class="ruby-identifier">gb_english</span>
        <span class="ruby-identifier">w</span> <span class="ruby-operator">+=</span> <span class="ruby-value str">'e'</span>
      <span class="ruby-keyword kw">elsif</span> <span class="ruby-identifier">w</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/#{Porter2::Double}$/</span>
        <span class="ruby-identifier">w</span>.<span class="ruby-identifier">chop!</span>
      <span class="ruby-keyword kw">elsif</span> <span class="ruby-identifier">w</span>.<span class="ruby-identifier">porter2_is_short_word?</span>
        <span class="ruby-identifier">w</span> <span class="ruby-operator">+=</span> <span class="ruby-value str">'e'</span>
      <span class="ruby-keyword kw">end</span>
    <span class="ruby-keyword kw">end</span>
    <span class="ruby-identifier">w</span>
  <span class="ruby-keyword kw">end</span>
<span class="ruby-keyword kw">end</span></pre>
<div id="porter-step-c-method" class="method-detail ">
<a name="method-i-porter2_step1c"></a>
<div class="method-heading">
<span class="method-name">porter2_step1c</span><span class="method-args">()</span>
<span class="method-click-advice">click to toggle source</span>
<div class="method-description">
<p>
Replace a suffix of y or Y by i if it is preceded by a non-vowel which is
not the first letter of the word.
</p>
<div class="method-source-code" id="porter-step-c-source">
815 <span class=
"ruby-comment cmt"># File lib/porter2.rb, line
171</span>
816 171:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_step1c
</span>
817 172:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/.+#{Porter2::C}(y|Y)$/
</span>
818 173:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/(y|Y)$/
</span>,
<span class=
"ruby-value str">'i'
</span>)
819 174:
<span class=
"ruby-keyword kw">else
</span>
820 175:
<span class=
"ruby-keyword kw">self
</span>
821 176:
<span class=
"ruby-keyword kw">end
</span>
822 177:
<span class=
"ruby-keyword kw">end
</span></pre>
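<p>The rule described for <tt>porter2_step1c</tt> can be tried in isolation with a minimal sketch like the one below. The names <tt>step1c_sketch</tt> and <tt>NON_VOWEL</tt> are hypothetical, and the non-vowel character class is an assumption, since the definition of <tt>Porter2::C</tt> is not shown in this section.</p>

```ruby
# Hypothetical sketch of Step 1c; not the library's own code.
# Assumption: a non-vowel is any character other than a, e, i, o, u, y.
NON_VOWEL = "[^aeiouy]"

def step1c_sketch(word)
  # ".+" requires at least one character before the non-vowel, so the
  # non-vowel can never be the first letter of the word.
  if word =~ /.+#{NON_VOWEL}(y|Y)$/
    word.sub(/(y|Y)$/, 'i')
  else
    word
  end
end

puts step1c_sketch("cry")  # cri
puts step1c_sketch("by")   # by
puts step1c_sketch("say")  # say
```

<p>In "by" the non-vowel is the first letter, and in "say" the y follows a vowel, so both are returned unchanged.</p>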
833 <div id=
"porter-step--method" class=
"method-detail ">
834 <a name=
"method-i-porter2_step2"></a>
836 <div class=
"method-heading">
838 <span class=
"method-name">porter2_step2
</span><span
839 class=
"method-args">(gb_english = false)
</span>
840 <span class=
"method-click-advice">click to toggle source
</span>
844 <div class=
"method-description">
Search for the longest among the suffixes listed in the keys of
Porter2::STEP_2_MAPS. If one is found and that suffix occurs in R1,
replace it with the value found in STEP_2_MAPS.
(Suffixes ‘ogi’ and ‘li’ are treated as special
cases in the procedure.)
(If gb_english is <tt>true</tt>, replace the ‘iser’ and
‘isation’ suffixes with ‘ise’, similarly to how
‘izer’ and ‘ization’ are treated.)
863 <div class=
"method-source-code"
864 id=
"porter-step--source">
866 <span class=
"ruby-comment cmt"># File lib/porter2.rb, line
188</span>
867 188:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_step2
</span>(
<span class=
"ruby-identifier">gb_english
</span> =
<span class=
"ruby-keyword kw">false
</span>)
868 189:
<span class=
"ruby-identifier">r1
</span> =
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r1
</span>
869 190:
<span class=
"ruby-identifier">s2m
</span> =
<span class=
"ruby-constant">Porter2
</span><span class=
"ruby-operator">::
</span><span class=
"ruby-constant">STEP_2_MAPS
</span>.
<span class=
"ruby-identifier">dup
</span>
870 191:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-identifier">gb_english
</span>
871 192:
<span class=
"ruby-identifier">s2m
</span>[
<span class=
"ruby-value str">"iser
"</span>] =
<span class=
"ruby-value str">"ise
"</span>
872 193:
<span class=
"ruby-identifier">s2m
</span>[
<span class=
"ruby-value str">"isation
"</span>] =
<span class=
"ruby-value str">"ise
"</span>
873 194:
<span class=
"ruby-keyword kw">end
</span>
874 195:
<span class=
"ruby-identifier">step_2_re
</span> =
<span class=
"ruby-constant">Regexp
</span>.
<span class=
"ruby-identifier">union
</span>(
<span class=
"ruby-identifier">s2m
</span>.
<span class=
"ruby-identifier">keys
</span>.
<span class=
"ruby-identifier">map
</span> {
<span class=
"ruby-operator">|
</span><span class=
"ruby-identifier">r
</span><span class=
"ruby-operator">|
</span> <span class=
"ruby-constant">Regexp
</span>.
<span class=
"ruby-identifier">new
</span>(
<span class=
"ruby-identifier">r
</span> <span class=
"ruby-operator">+
</span> <span class=
"ruby-value str">"$
"</span>)})
875 196:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-identifier">step_2_re
</span>
876 197:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-identifier">r1
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/#{$
&}$/
</span>
877 198:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-node">/#{$
&}$/
</span>,
<span class=
"ruby-identifier">s2m
</span>[
<span class=
"ruby-node">$
&</span>])
878 199:
<span class=
"ruby-keyword kw">else
</span>
879 200:
<span class=
"ruby-keyword kw">self
</span>
880 201:
<span class=
"ruby-keyword kw">end
</span>
881 202:
<span class=
"ruby-keyword kw">elsif
</span> <span class=
"ruby-identifier">r1
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/li$/
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/(#{Porter2::Valid_LI})li$/
</span>
882 203:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/li$/
</span>,
<span class=
"ruby-value str">''
</span>)
883 204:
<span class=
"ruby-keyword kw">elsif
</span> <span class=
"ruby-identifier">r1
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/ogi$/
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/logi$/
</span>
884 205:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/ogi$/
</span>,
<span class=
"ruby-value str">'og'
</span>)
885 206:
<span class=
"ruby-keyword kw">else
</span>
886 207:
<span class=
"ruby-keyword kw">self
</span>
887 208:
<span class=
"ruby-keyword kw">end
</span>
888 209:
<span class=
"ruby-keyword kw">end
</span></pre>
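<p>The longest-suffix lookup and the R1 check described for <tt>porter2_step2</tt> might be sketched as follows. This is an illustration under stated assumptions, not the library's code: <tt>SAMPLE_STEP_2_MAPS</tt> is a small hypothetical subset rather than the real <tt>Porter2::STEP_2_MAPS</tt>, <tt>r1_sketch</tt> is a simplified stand-in for <tt>porter2_r1</tt>, and the ‘li’ and ‘ogi’ special cases are left out.</p>

```ruby
# Hypothetical sketch of the Step 2 lookup; not the library's own code.
# Assumption: an illustrative subset, not the real Porter2::STEP_2_MAPS.
SAMPLE_STEP_2_MAPS = {
  "ational" => "ate",
  "tional"  => "tion",
  "ization" => "ize",
  "izer"    => "ize"
}

# Simplified R1: everything after the first non-vowel that follows a vowel,
# or "" if there is none. Special prefixes are ignored in this sketch.
def r1_sketch(word)
  m = word.match(/[aeiouy][^aeiouy]/)
  m ? m.post_match : ""
end

def step2_sketch(word, gb_english = false)
  maps = SAMPLE_STEP_2_MAPS.dup
  if gb_english
    maps["iser"] = "ise"
    maps["isation"] = "ise"
  end
  # Longest key first, so "ational" is preferred over "tional".
  suffix = maps.keys.sort_by { |k| -k.length }.find { |k| word.end_with?(k) }
  # Replace only when the whole suffix lies inside R1.
  return word if suffix.nil? || !r1_sketch(word).end_with?(suffix)
  word[0, word.length - suffix.length] + maps[suffix]
end

puts step2_sketch("relational")       # relate
puts step2_sketch("rational")         # rational
puts step2_sketch("organiser", true)  # organise
```

<p>"rational" is left alone because its R1 is "ional", which is too short to contain the matched suffix "ational".</p>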
899 <div id=
"porter-step--method" class=
"method-detail ">
900 <a name=
"method-i-porter2_step3"></a>
902 <div class=
"method-heading">
904 <span class=
"method-name">porter2_step3
</span><span
905 class=
"method-args">(gb_english = false)
</span>
906 <span class=
"method-click-advice">click to toggle source
</span>
910 <div class=
"method-description">
Search for the longest among the suffixes listed in the keys of
Porter2::STEP_3_MAPS. If one is found and that suffix occurs in R1,
replace it with the value found in STEP_3_MAPS.
(Suffix ‘ative’ is treated as a special case in the procedure.)
(If gb_english is <tt>true</tt>, replace the ‘alise’ suffix
with ‘al’, similarly to how ‘alize’ is treated.)
927 <div class=
"method-source-code"
928 id=
"porter-step--source">
930 <span class=
"ruby-comment cmt"># File lib/porter2.rb, line
220</span>
931 220:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_step3
</span>(
<span class=
"ruby-identifier">gb_english
</span> =
<span class=
"ruby-keyword kw">false
</span>)
932 221:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/ative$/
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r2
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/ative$/
</span>
933 222:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/ative$/
</span>,
<span class=
"ruby-value str">''
</span>)
934 223:
<span class=
"ruby-keyword kw">else
</span>
935 224:
<span class=
"ruby-identifier">s3m
</span> =
<span class=
"ruby-constant">Porter2
</span><span class=
"ruby-operator">::
</span><span class=
"ruby-constant">STEP_3_MAPS
</span>.
<span class=
"ruby-identifier">dup
</span>
936 225:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-identifier">gb_english
</span>
937 226:
<span class=
"ruby-identifier">s3m
</span>[
<span class=
"ruby-value str">"alise
"</span>] =
<span class=
"ruby-value str">"al
"</span>
938 227:
<span class=
"ruby-keyword kw">end
</span>
939 228:
<span class=
"ruby-identifier">step_3_re
</span> =
<span class=
"ruby-constant">Regexp
</span>.
<span class=
"ruby-identifier">union
</span>(
<span class=
"ruby-identifier">s3m
</span>.
<span class=
"ruby-identifier">keys
</span>.
<span class=
"ruby-identifier">map
</span> {
<span class=
"ruby-operator">|
</span><span class=
"ruby-identifier">r
</span><span class=
"ruby-operator">|
</span> <span class=
"ruby-constant">Regexp
</span>.
<span class=
"ruby-identifier">new
</span>(
<span class=
"ruby-identifier">r
</span> <span class=
"ruby-operator">+
</span> <span class=
"ruby-value str">"$
"</span>)})
940 229:
<span class=
"ruby-identifier">r1
</span> =
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r1
</span>
941 230:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-identifier">step_3_re
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-identifier">r1
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/#{$
&}$/
</span>
942 231:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-node">/#{$
&}$/
</span>,
<span class=
"ruby-identifier">s3m
</span>[
<span class=
"ruby-node">$
&</span>])
943 232:
<span class=
"ruby-keyword kw">else
</span>
944 233:
<span class=
"ruby-keyword kw">self
</span>
945 234:
<span class=
"ruby-keyword kw">end
</span>
946 235:
<span class=
"ruby-keyword kw">end
</span>
947 236:
<span class=
"ruby-keyword kw">end
</span></pre>
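<p>The ‘ative’ special case and the <tt>gb_english</tt> extension described for <tt>porter2_step3</tt> can be illustrated with the sketch below. All names are hypothetical, the suffix map is a small assumed subset rather than the real <tt>Porter2::STEP_3_MAPS</tt>, and R1 and R2 are computed with a simplified helper that ignores special prefixes.</p>

```ruby
# Hypothetical sketch of Step 3; not the library's own code.

# Simplified region helper: the text after the first non-vowel that follows
# a vowel. Applied once it gives R1; applied to R1 it gives R2.
def region_after_vc(text)
  m = text.match(/[aeiouy][^aeiouy]/)
  m ? m.post_match : ""
end

def step3_sketch(word, gb_english = false)
  r1 = region_after_vc(word)
  r2 = region_after_vc(r1)
  # Special case: "ative" is deleted only when it lies inside R2.
  return word.sub(/ative$/, '') if r2.end_with?("ative")

  # Assumption: illustrative subset of the suffix map.
  maps = { "alize" => "al", "icate" => "ic", "ness" => "" }
  maps["alise"] = "al" if gb_english
  suffix = maps.keys.sort_by { |k| -k.length }.find { |k| word.end_with?(k) }
  # All other suffixes only need to lie inside R1.
  return word if suffix.nil? || !r1.end_with?(suffix)
  word[0, word.length - suffix.length] + maps[suffix]
end

puts step3_sketch("informative")      # inform
puts step3_sketch("relative")         # relative
puts step3_sketch("formalize")        # formal
puts step3_sketch("formalise", true)  # formal
```

<p>In "relative", R2 is only "ive", so ‘ative’ is not in R2 and the word is returned unchanged.</p>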
958 <div id=
"porter-step--method" class=
"method-detail ">
959 <a name=
"method-i-porter2_step4"></a>
961 <div class=
"method-heading">
963 <span class=
"method-name">porter2_step4
</span><span
964 class=
"method-args">(gb_english = false)
</span>
965 <span class=
"method-click-advice">click to toggle source
</span>
969 <div class=
"method-description">
Search for the longest among the suffixes listed in the keys of
Porter2::STEP_4_MAPS. If one is found and that suffix occurs in R2,
replace it with the value found in STEP_4_MAPS.
(Suffix ‘ion’ is treated as a special case in the procedure.)
(If gb_english is <tt>true</tt>, delete the ‘ise’ suffix if
986 <div class=
"method-source-code"
987 id=
"porter-step--source">
989 <span class=
"ruby-comment cmt"># File lib/porter2.rb, line
246</span>
990 246:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_step4
</span>(
<span class=
"ruby-identifier">gb_english
</span> =
<span class=
"ruby-keyword kw">false
</span>)
991 247:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r2
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/ion$/
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/(s|t)ion$/
</span>
992 248:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/ion$/
</span>,
<span class=
"ruby-value str">''
</span>)
993 249:
<span class=
"ruby-keyword kw">else
</span>
994 250:
<span class=
"ruby-identifier">s4m
</span> =
<span class=
"ruby-constant">Porter2
</span><span class=
"ruby-operator">::
</span><span class=
"ruby-constant">STEP_4_MAPS
</span>.
<span class=
"ruby-identifier">dup
</span>
995 251:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-identifier">gb_english
</span>
996 252:
<span class=
"ruby-identifier">s4m
</span>[
<span class=
"ruby-value str">"ise
"</span>] =
<span class=
"ruby-value str">""</span>
997 253:
<span class=
"ruby-keyword kw">end
</span>
998 254:
<span class=
"ruby-identifier">step_4_re
</span> =
<span class=
"ruby-constant">Regexp
</span>.
<span class=
"ruby-identifier">union
</span>(
<span class=
"ruby-identifier">s4m
</span>.
<span class=
"ruby-identifier">keys
</span>.
<span class=
"ruby-identifier">map
</span> {
<span class=
"ruby-operator">|
</span><span class=
"ruby-identifier">r
</span><span class=
"ruby-operator">|
</span> <span class=
"ruby-constant">Regexp
</span>.
<span class=
"ruby-identifier">new
</span>(
<span class=
"ruby-identifier">r
</span> <span class=
"ruby-operator">+
</span> <span class=
"ruby-value str">"$
"</span>)})
999 255:
<span class=
"ruby-identifier">r2
</span> =
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r2
</span>
1000 256:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-identifier">step_4_re
</span>
1001 257:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-identifier">r2
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/#{$
&}/
</span>
1002 258:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-node">/#{$
&}$/
</span>,
<span class=
"ruby-identifier">s4m
</span>[
<span class=
"ruby-node">$
&</span>])
1003 259:
<span class=
"ruby-keyword kw">else
</span>
1004 260:
<span class=
"ruby-keyword kw">self
</span>
1005 261:
<span class=
"ruby-keyword kw">end
</span>
1006 262:
<span class=
"ruby-keyword kw">else
</span>
1007 263:
<span class=
"ruby-keyword kw">self
</span>
1008 264:
<span class=
"ruby-keyword kw">end
</span>
1009 265:
<span class=
"ruby-keyword kw">end
</span>
1010 266:
<span class=
"ruby-keyword kw">end
</span></pre>
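<p>A rough sketch of the R2 test and the ‘ion’ special case described for <tt>porter2_step4</tt> follows. The method names are hypothetical, the suffix map is an assumed subset of <tt>Porter2::STEP_4_MAPS</tt> in which every value is taken to be the empty string, and the region helper is a simplification that ignores special prefixes.</p>

```ruby
# Hypothetical sketch of Step 4; not the library's own code.

# Simplified region helper: the text after the first non-vowel that follows
# a vowel. Applied once it gives R1; applied to R1 it gives R2.
def region_after_vc(text)
  m = text.match(/[aeiouy][^aeiouy]/)
  m ? m.post_match : ""
end

def step4_sketch(word, gb_english = false)
  r2 = region_after_vc(region_after_vc(word))
  # Special case: "ion" is deleted when it is in R2 and follows s or t.
  if r2.end_with?("ion") and word =~ /(s|t)ion$/
    return word.sub(/ion$/, '')
  end

  # Assumption: illustrative subset; each suffix maps to "" (deletion).
  maps = { "ement" => "", "ment" => "", "able" => "", "ize" => "" }
  maps["ise"] = "" if gb_english
  suffix = maps.keys.sort_by { |k| -k.length }.find { |k| word.end_with?(k) }
  # The suffix must lie inside R2, a stricter test than the R1 check.
  return word if suffix.nil? || !r2.end_with?(suffix)
  word[0, word.length - suffix.length] + maps[suffix]
end

puts step4_sketch("adoption")        # adopt
puts step4_sketch("adjustment")      # adjust
puts step4_sketch("moment")          # moment
puts step4_sketch("organise", true)  # organ
```

<p>"moment" ends in ‘ment’, but its R2 is only "t", so nothing is removed.</p>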
1021 <div id=
"porter-step--method" class=
"method-detail ">
1022 <a name=
"method-i-porter2_step5"></a>
1024 <div class=
"method-heading">
1026 <span class=
"method-name">porter2_step5
</span><span
1027 class=
"method-args">()
</span>
1028 <span class=
"method-click-advice">click to toggle source
</span>
1032 <div class=
"method-description">
Search for the following suffixes and, if found, perform the action
1039 <tr><td valign=
"top">e
</td><td><p>
delete if in R2, or in R1 and not preceded by a short syllable
1043 <tr><td valign=
"top">l
</td><td><p>
delete if in R2 and preceded by l
1051 <div class=
"method-source-code"
1052 id=
"porter-step--source">
1054 <span class=
"ruby-comment cmt"># File lib/porter2.rb, line
272</span>
1055 272:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_step5
</span>
1056 273:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/ll$/
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r2
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/l$/
</span>
1057 274:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/ll$/
</span>,
<span class=
"ruby-value str">'l'
</span>)
1058 275:
<span class=
"ruby-keyword kw">elsif
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/e$/
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r2
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/e$/
</span>
1059 276:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/e$/
</span>,
<span class=
"ruby-value str">''
</span>)
1060 277:
<span class=
"ruby-keyword kw">else
</span>
1061 278:
<span class=
"ruby-identifier">r1
</span> =
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r1
</span>
1062 279:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/e$/
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-identifier">r1
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/e$/
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-keyword kw">not
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/#{Porter2::SHORT_SYLLABLE}e$/
</span>
1063 280:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/e$/
</span>,
<span class=
"ruby-value str">''
</span>)
1064 281:
<span class=
"ruby-keyword kw">else
</span>
1065 282:
<span class=
"ruby-keyword kw">self
</span>
1066 283:
<span class=
"ruby-keyword kw">end
</span>
1067 284:
<span class=
"ruby-keyword kw">end
</span>
1068 285:
<span class=
"ruby-keyword kw">end
</span></pre>
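<p>The two rules listed for <tt>porter2_step5</tt> might be exercised with a stand-alone sketch such as this one. The names are hypothetical, and <tt>SHORT_SYLLABLE_SKETCH</tt> is an assumed approximation of a short syllable (a non-vowel, a vowel, then a non-vowel other than w, x or Y, or a word-initial vowel followed by a non-vowel); the real <tt>Porter2::SHORT_SYLLABLE</tt> is not shown in this section.</p>

```ruby
# Hypothetical sketch of Step 5; not the library's own code.

# Assumption: approximate pattern for a short syllable.
SHORT_SYLLABLE_SKETCH = "([^aeiouy][aeiouy][^aeiouywxY]|^[aeiouy][^aeiouy])"

# Simplified region helper: the text after the first non-vowel that follows
# a vowel. Applied once it gives R1; applied to R1 it gives R2.
def region_after_vc(text)
  m = text.match(/[aeiouy][^aeiouy]/)
  m ? m.post_match : ""
end

def step5_sketch(word)
  r1 = region_after_vc(word)
  r2 = region_after_vc(r1)
  if word.end_with?("ll") and r2.end_with?("l")
    word.chop  # final l is in R2 and preceded by l
  elsif word.end_with?("e") and r2.end_with?("e")
    word.chop  # final e is in R2
  elsif r1.end_with?("e") and word !~ /#{SHORT_SYLLABLE_SKETCH}e$/
    word.chop  # final e is in R1 and not preceded by a short syllable
  else
    word
  end
end

puts step5_sketch("controll")  # control
puts step5_sketch("probate")   # probat
puts step5_sketch("cease")     # ceas
puts step5_sketch("rate")      # rate
```

<p>"rate" keeps its final e because "rat" is a short syllable, and "bell" would keep both letters because its R2 is empty.</p>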
1079 <div id=
"porter-tidy-method" class=
"method-detail ">
1080 <a name=
"method-i-porter2_tidy"></a>
1082 <div class=
"method-heading">
1084 <span class=
"method-name">porter2_tidy
</span><span
1085 class=
"method-args">()
</span>
1086 <span class=
"method-click-advice">click to toggle source
</span>
1090 <div class=
"method-description">
Tidy up the word before the algorithm runs: strip surrounding whitespace, downcase it, and map curly apostrophes to straight ones.
1098 <div class=
"method-source-code"
1099 id=
"porter-tidy-source">
1101 <span class=
"ruby-comment cmt"># File lib/porter2.rb, line
35</span>
1102 35:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_tidy
</span>
1103 36:
<span class=
"ruby-identifier">preword
</span> =
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">to_s
</span>.
<span class=
"ruby-identifier">strip
</span>.
<span class=
"ruby-identifier">downcase
</span>
1105 38:
<span class=
"ruby-comment cmt"># map apostrophe-like characters to apostrophes
</span>
1106 39:
<span class=
"ruby-identifier">preword
</span>.
<span class=
"ruby-identifier">gsub!
</span>(
<span class=
"ruby-regexp re">/‘/
</span>,
<span class=
"ruby-value str">"'
"</span>)
1107 40:
<span class=
"ruby-identifier">preword
</span>.
<span class=
"ruby-identifier">gsub!
</span>(
<span class=
"ruby-regexp re">/’/
</span>,
<span class=
"ruby-value str">"'
"</span>)
1109 42:
<span class=
"ruby-identifier">preword
</span>
1110 43:
<span class=
"ruby-keyword kw">end
</span></pre>
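<p>The effect of this tidying step can be checked with a short stand-alone sketch. <tt>tidy_sketch</tt> is a hypothetical name, and a single character class is used here in place of the two separate <tt>gsub!</tt> calls shown in the source.</p>

```ruby
# Hypothetical sketch of the tidy step; not the library's own code.
def tidy_sketch(word)
  # Strip surrounding whitespace, downcase, then map the left and right
  # curly apostrophes (U+2018, U+2019) to a plain ASCII apostrophe.
  word.to_s.strip.downcase.gsub(/[\u2018\u2019]/, "'")
end

puts tidy_sketch("  Don\u2019t ")  # don't
```

<p>Because <tt>to_s</tt> is called first, the sketch also accepts non-string input such as a symbol.</p>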
1121 <div id=
"stem-method" class=
"method-detail method-alias">
1122 <a name=
"method-i-stem"></a>
1124 <div class=
"method-heading">
1126 <span class=
"method-name">stem
</span><span
1127 class=
"method-args">(gb_english = false)
</span>
1128 <span class=
"method-click-advice">click to toggle source
</span>
1132 <div class=
"method-description">
1143 <div class=
"aliases">
1144 Alias for:
<a href=
"String.html#method-i-porter2_stem">porter2_stem
</a>
1156 <div id=
"rdoc-debugging-section-dump" class=
"debugging-section">
1162 <div id=
"validator-badges">
1164 <p><small>Generated with the
<a href=
"http://deveiate.org/projects/Darkfish-Rdoc/">Darkfish
1165 Rdoc Generator
</a> 1.1.6</small>.
</p>