1 <?xml version=
"1.0" encoding=
"utf-8"?>
2 <!DOCTYPE html PUBLIC
"-//W3C//DTD XHTML 1.0 Strict//EN"
3 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
4 <html xmlns=
"http://www.w3.org/1999/xhtml" xml:
lang=
"en" lang=
"en">
6 <meta content=
"text/html; charset=utf-8" http-equiv=
"Content-Type" />
8 <title>Class: String
</title>
10 <link rel=
"stylesheet" href=
"./rdoc.css" type=
"text/css" media=
"screen" />
12 <script src=
"./js/jquery.js" type=
"text/javascript"
13 charset=
"utf-8"></script>
14 <script src=
"./js/thickbox-compressed.js" type=
"text/javascript"
15 charset=
"utf-8"></script>
16 <script src=
"./js/quicksearch.js" type=
"text/javascript"
17 charset=
"utf-8"></script>
18 <script src=
"./js/darkfish.js" type=
"text/javascript"
19 charset=
"utf-8"></script>
25 <div id=
"home-metadata">
26 <div id=
"home-section" class=
"section">
27 <h3 class=
"section-header">
28 <a href=
"./index.html">Home
</a>
29 <a href=
"./index.html#classes">Classes
</a>
30 <a href=
"./index.html#methods">Methods
</a>
35 <div id=
"file-metadata">
36 <div id=
"file-list-section" class=
"section">
37 <h3 class=
"section-header">In Files
</h3>
38 <div class=
"section-body">
41 <li><a href=
"./lib/porter2_implementation_rb.html?TB_iframe=true&height=550&width=785"
42 class=
"thickbox" title=
"lib/porter2_implementation.rb">lib/porter2_implementation.rb
</a></li>
51 <div id=
"class-metadata">
55 <div id=
"parent-class-section" class=
"section">
56 <h3 class=
"section-header">Parent
</h3>
58 <p class=
"link">Object
</p>
63 <!-- Namespace Contents -->
66 <!-- Method Quickref -->
68 <div id=
"method-list-section" class=
"section">
69 <h3 class=
"section-header">Methods
</h3>
70 <ul class=
"link-list">
72 <li><a href=
"#method-i-porter2_ends_with_short_syllable%3F">#porter2_ends_with_short_syllable?
</a></li>
74 <li><a href=
"#method-i-porter2_is_short_word%3F">#porter2_is_short_word?
</a></li>
76 <li><a href=
"#method-i-porter2_postprocess">#porter2_postprocess
</a></li>
78 <li><a href=
"#method-i-porter2_preprocess">#porter2_preprocess
</a></li>
80 <li><a href=
"#method-i-porter2_r1">#porter2_r1
</a></li>
82 <li><a href=
"#method-i-porter2_r2">#porter2_r2
</a></li>
84 <li><a href=
"#method-i-porter2_stem">#porter2_stem
</a></li>
86 <li><a href=
"#method-i-porter2_stem_verbose">#porter2_stem_verbose
</a></li>
88 <li><a href=
"#method-i-porter2_step0">#porter2_step0
</a></li>
90 <li><a href=
"#method-i-porter2_step1a">#porter2_step1a
</a></li>
92 <li><a href=
"#method-i-porter2_step1b">#porter2_step1b
</a></li>
94 <li><a href=
"#method-i-porter2_step1c">#porter2_step1c
</a></li>
96 <li><a href=
"#method-i-porter2_step2">#porter2_step2
</a></li>
98 <li><a href=
"#method-i-porter2_step3">#porter2_step3
</a></li>
100 <li><a href=
"#method-i-porter2_step4">#porter2_step4
</a></li>
102 <li><a href=
"#method-i-porter2_step5">#porter2_step5
</a></li>
104 <li><a href=
"#method-i-porter2_tidy">#porter2_tidy
</a></li>
106 <li><a href=
"#method-i-stem">#stem
</a></li>
112 <!-- Included Modules -->
116 <div id=
"project-metadata">
119 <div id=
"fileindex-section" class=
"section project-section">
120 <h3 class=
"section-header">Files
</h3>
123 <li class=
"file"><a href=
"./Readme_rdoc.html">Readme.rdoc
</a></li>
129 <div id=
"classindex-section" class=
"section project-section">
130 <h3 class=
"section-header">Class Index
131 <span class=
"search-toggle"><img src=
"./images/find.png"
132 height=
"16" width=
"16" alt=
"[+]"
133 title=
"show/hide quicksearch" /></span></h3>
134 <form action=
"#" method=
"get" accept-charset=
"utf-8" class=
"initially-hidden">
136 <legend>Quicksearch
</legend>
137 <input type=
"text" name=
"quicksearch" value=
""
138 class=
"quicksearch-field" />
142 <ul class=
"link-list">
144 <li><a href=
"./Porter2.html">Porter2
</a></li>
146 <li><a href=
"./String.html">String
</a></li>
148 <li><a href=
"./TestPorter2.html">TestPorter2
</a></li>
151 <div id=
"no-class-search-results" style=
"display: none;">No matching classes.
</div>
158 <div id=
"documentation">
159 <h1 class=
"class">String
</h1>
161 <div id=
"description">
163 Implementation of the Porter
2 stemmer.
<a
164 href=
"String.html#method-i-porter2_stem">String#porter2_stem
</a> is the
165 main stemming procedure.
178 <div id=
"public-instance-method-details" class=
"method-section section">
179 <h3 class=
"section-header">Public Instance Methods
</h3>
182 <div id=
"porter-ends-with-short-syllable--method" class=
"method-detail ">
183 <a name=
"method-i-porter2_ends_with_short_syllable%3F"></a>
185 <div class=
"method-heading">
187 <span class=
"method-name">porter2_ends_with_short_syllable?
</span><span
188 class=
"method-args">()
</span>
189 <span class=
"method-click-advice">click to toggle source
</span>
193 <div class=
"method-description">
196 Returns true if the word ends with a short syllable
201 <div class=
"method-source-code"
202 id=
"porter-ends-with-short-syllable--source">
204 <span class=
"ruby-comment cmt"># File lib/porter2_implementation.rb, line
59</span>
205 59:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_ends_with_short_syllable?
</span>
206 60:
<span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/#{Porter2::SHORT_SYLLABLE}$/
</span> <span class=
"ruby-operator">?
</span> <span class=
"ruby-keyword kw">true
</span> <span class=
"ruby-operator">:
</span> <span class=
"ruby-keyword kw">false
</span>
207 61:
<span class=
"ruby-keyword kw">end
</span></pre>
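<p>The value of <tt>Porter2::SHORT_SYLLABLE</tt> is not shown on this page. As a rough sketch, assuming the usual Porter2 definition (a non-vowel, a vowel, then a non-vowel other than w, x or Y; or a two-letter word made of a vowel and a non-vowel), the test might look like the following. The constants <tt>V</tt>, <tt>C</tt> and <tt>SHORT_SYLLABLE</tt> and the helper <tt>ends_with_short_syllable?</tt> are illustrative stand-ins, not the library's own definitions.</p>

```ruby
# Hypothetical stand-ins for the Porter2 module constants (assumed values).
V = "[aeiouy]"    # vowels
C = "[^aeiouy]"   # non-vowels (this class includes the marker letter Y)
SHORT_SYLLABLE = "((#{C}#{V}[^aeiouywxY])|(^#{V}#{C}))"

# True if the word ends with a short syllable under the assumptions above.
def ends_with_short_syllable?(word)
  word.match?(/#{SHORT_SYLLABLE}$/)
end

puts ends_with_short_syllable?("trap")    # true
puts ends_with_short_syllable?("bestow")  # false
```

<p>Excluding w, x and Y from the final position is what keeps words such as "bestow" from counting as ending in a short syllable.</p>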
218 <div id=
"porter-is-short-word--method" class=
"method-detail ">
219 <a name=
"method-i-porter2_is_short_word%3F"></a>
221 <div class=
"method-heading">
223 <span class=
"method-name">porter2_is_short_word?
</span><span
224 class=
"method-args">()
</span>
225 <span class=
"method-click-advice">click to toggle source
</span>
229 <div class=
"method-description">
232 A word is short if it ends in a short syllable, and R1 is null
237 <div class=
"method-source-code"
238 id=
"porter-is-short-word--source">
240 <span class=
"ruby-comment cmt"># File lib/porter2_implementation.rb, line
65</span>
241 65:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_is_short_word?
</span>
242 66:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_ends_with_short_syllable?
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r1
</span>.
<span class=
"ruby-identifier">empty?
</span>
243 67:
<span class=
"ruby-keyword kw">end
</span></pre>
254 <div id=
"porter-postprocess-method" class=
"method-detail ">
255 <a name=
"method-i-porter2_postprocess"></a>
257 <div class=
"method-heading">
259 <span class=
"method-name">porter2_postprocess
</span><span
260 class=
"method-args">()
</span>
261 <span class=
"method-click-advice">click to toggle source
</span>
265 <div class=
"method-description">
268 Turn all Y letters into y
273 <div class=
"method-source-code"
274 id=
"porter-postprocess-source">
276 <span class=
"ruby-comment cmt"># File lib/porter2_implementation.rb, line
261</span>
277 261:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_postprocess
</span>
278 262:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">gsub
</span>(
<span class=
"ruby-regexp re">/Y/
</span>,
<span class=
"ruby-value str">'y'
</span>)
279 263:
<span class=
"ruby-keyword kw">end
</span></pre>
290 <div id=
"porter-preprocess-method" class=
"method-detail ">
291 <a name=
"method-i-porter2_preprocess"></a>
293 <div class=
"method-heading">
295 <span class=
"method-name">porter2_preprocess
</span><span
296 class=
"method-args">()
</span>
297 <span class=
"method-click-advice">click to toggle source
</span>
301 <div class=
"method-description">
304 Preprocess the word. Remove any initial
’, if present. Then, set
305 initial y, or y after a vowel, to Y
308 (The comment to
‘establish the regions R1 and R2
’ in the
309 original description is an implementation optimisation that identifies
310 where the regions start. As no modifications are made to the word that
311 affect those positions, you may want to cache them now. This implementation
312 doesn
’t do that.)
317 <div class=
"method-source-code"
318 id=
"porter-preprocess-source">
320 <span class=
"ruby-comment cmt"># File lib/porter2_implementation.rb, line
25</span>
321 25:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_preprocess
</span>
322 26:
<span class=
"ruby-identifier">w
</span> =
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">dup
</span>
324 28:
<span class=
"ruby-comment cmt"># remove any initial apostrophe
</span>
325 29:
<span class=
"ruby-identifier">w
</span>.
<span class=
"ruby-identifier">gsub!
</span>(
<span class=
"ruby-regexp re">/^'*(.)/
</span>,
<span class=
"ruby-value str">'\1'
</span>)
327 31:
<span class=
"ruby-comment cmt"># set initial y, or y after a vowel, to Y
</span>
328 32:
<span class=
"ruby-identifier">w
</span>.
<span class=
"ruby-identifier">gsub!
</span>(
<span class=
"ruby-regexp re">/^y/
</span>,
<span class=
"ruby-value str">"Y
"</span>)
329 33:
<span class=
"ruby-identifier">w
</span>.
<span class=
"ruby-identifier">gsub!
</span>(
<span class=
"ruby-node">/(#{Porter2::V})y/
</span>,
<span class=
"ruby-value str">'\1Y'
</span>)
331 35:
<span class=
"ruby-identifier">w
</span>
332 36:
<span class=
"ruby-keyword kw">end
</span></pre>
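<p>To make the effect of preprocessing concrete, here is a small standalone sketch of the same three substitutions. It assumes that <tt>Porter2::V</tt> is the character class <tt>[aeiouy]</tt>, which this page does not confirm; the <tt>preprocess</tt> helper and the <tt>V</tt> constant are hypothetical names used only for illustration.</p>

```ruby
V = "[aeiouy]"  # assumed value of Porter2::V

# Strip leading apostrophes, then mark consonantal y as Y.
def preprocess(word)
  w = word.sub(/^'+/, "")    # remove any initial apostrophes
  w = w.sub(/^y/, "Y")       # initial y becomes Y
  w.gsub(/(#{V})y/, '\1Y')   # y directly after a vowel becomes Y
end

puts preprocess("youth")  # Youth
puts preprocess("boy")    # boY
puts preprocess("cry")    # cry
```

<p>The capital Y acts as a marker for a y that behaves as a consonant; <tt>porter2_postprocess</tt> turns it back into y at the end.</p>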
343 <div id=
"porter-r--method" class=
"method-detail ">
344 <a name=
"method-i-porter2_r1"></a>
346 <div class=
"method-heading">
348 <span class=
"method-name">porter2_r1
</span><span
349 class=
"method-args">()
</span>
350 <span class=
"method-click-advice">click to toggle source
</span>
354 <div class=
"method-description">
357 R1 is the portion of the word after the first non-vowel after the first
358 vowel (with words beginning
‘gener-
’,
‘commun-
’,
359 and
‘arsen-
’ treated as special cases).
364 <div class=
"method-source-code"
365 id=
"porter-r--source">
367 <span class=
"ruby-comment cmt"># File lib/porter2_implementation.rb, line
41</span>
368 41:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_r1
</span>
369 42:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/^(gener|commun|arsen)(?
<r1
>.*)/
</span>
370 43:
<span class=
"ruby-constant">Regexp
</span>.
<span class=
"ruby-identifier">last_match
</span>(
<span class=
"ruby-value">:r1
</span>)
371 44:
<span class=
"ruby-keyword kw">else
</span>
372 45:
<span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/#{Porter2::V}#{Porter2::C}(?
<r1
>.*)$/
</span>
373 46:
<span class=
"ruby-constant">Regexp
</span>.
<span class=
"ruby-identifier">last_match
</span>(
<span class=
"ruby-value">:r1
</span>)
<span class=
"ruby-operator">||
</span> <span class=
"ruby-value str">""</span>
374 47:
<span class=
"ruby-keyword kw">end
</span>
375 48:
<span class=
"ruby-keyword kw">end
</span></pre>
386 <div id=
"porter-r--method" class=
"method-detail ">
387 <a name=
"method-i-porter2_r2"></a>
389 <div class=
"method-heading">
391 <span class=
"method-name">porter2_r2
</span><span
392 class=
"method-args">()
</span>
393 <span class=
"method-click-advice">click to toggle source
</span>
397 <div class=
"method-description">
400 R2 is the portion of R1 (
<a
401 href=
"String.html#method-i-porter2_r1">porter2_r1
</a>) after the first
402 non-vowel after the first vowel
407 <div class=
"method-source-code"
408 id=
"porter-r--source">
410 <span class=
"ruby-comment cmt"># File lib/porter2_implementation.rb, line
52</span>
411 52:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_r2
</span>
412 53:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r1
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/#{Porter2::V}#{Porter2::C}(?
<r2
>.*)$/
</span>
413 54:
<span class=
"ruby-constant">Regexp
</span>.
<span class=
"ruby-identifier">last_match
</span>(
<span class=
"ruby-value">:r2
</span>)
<span class=
"ruby-operator">||
</span> <span class=
"ruby-value str">""</span>
414 55:
<span class=
"ruby-keyword kw">end
</span></pre>
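<p>A worked example may help with the R1 and R2 definitions. The sketch below restates both regions as standalone functions, assuming <tt>Porter2::V</tt> and <tt>Porter2::C</tt> are the classes <tt>[aeiouy]</tt> and <tt>[^aeiouy]</tt> (an assumption, since the constants are defined elsewhere). The helpers <tt>r1</tt> and <tt>r2</tt> are hypothetical names, and numbered captures are used in place of the named groups above.</p>

```ruby
V = "[aeiouy]"   # assumed value of Porter2::V
C = "[^aeiouy]"  # assumed value of Porter2::C

# Region after the first non-vowel that follows a vowel,
# with the three special-case prefixes handled first.
def r1(word)
  special = word[/^(?:gener|commun|arsen)(.*)/, 1]
  special || word[/#{V}#{C}(.*)$/, 1] || ""
end

# The same rule applied to R1 itself.
def r2(word)
  r1(word)[/#{V}#{C}(.*)$/, 1] || ""
end

puts r1("beautiful")  # iful
puts r2("beautiful")  # ul
puts r1("generous")   # ous
```

<p>Both regions are empty strings when no vowel followed by a non-vowel exists, which is why the methods above fall back to <tt>""</tt>.</p>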
425 <div id=
"porter-stem-method" class=
"method-detail ">
426 <a name=
"method-i-porter2_stem"></a>
428 <div class=
"method-heading">
430 <span class=
"method-name">porter2_stem
</span><span
431 class=
"method-args">(gb_english = false)
</span>
432 <span class=
"method-click-advice">click to toggle source
</span>
436 <div class=
"method-description">
439 Perform the stemming procedure. If
<tt>gb_english
</tt> is true, treat
440 ’-ise
’ and similar suffixes as
’-ize
’ in American English.
446 <div class=
"method-source-code"
447 id=
"porter-stem-source">
449 <span class=
"ruby-comment cmt"># File lib/porter2_implementation.rb, line
269</span>
450 269:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_stem
</span>(
<span class=
"ruby-identifier">gb_english
</span> =
<span class=
"ruby-keyword kw">false
</span>)
451 270:
<span class=
"ruby-identifier">preword
</span> =
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_tidy
</span>
452 271:
<span class=
"ruby-keyword kw">return
</span> <span class=
"ruby-identifier">preword
</span> <span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-identifier">preword
</span>.
<span class=
"ruby-identifier">length
</span> <span class=
"ruby-operator"><=
</span> <span class=
"ruby-value">2</span>
454 273:
<span class=
"ruby-identifier">word
</span> =
<span class=
"ruby-identifier">preword
</span>.
<span class=
"ruby-identifier">porter2_preprocess
</span>
456 275:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-constant">Porter2
</span><span class=
"ruby-operator">::
</span><span class=
"ruby-constant">SPECIAL_CASES
</span>.
<span class=
"ruby-identifier">has_key?
</span> <span class=
"ruby-identifier">word
</span>
457 276:
<span class=
"ruby-constant">Porter2
</span><span class=
"ruby-operator">::
</span><span class=
"ruby-constant">SPECIAL_CASES
</span>[
<span class=
"ruby-identifier">word
</span>]
458 277:
<span class=
"ruby-keyword kw">else
</span>
459 278:
<span class=
"ruby-identifier">w1a
</span> =
<span class=
"ruby-identifier">word
</span>.
<span class=
"ruby-identifier">porter2_step0
</span>.
<span class=
"ruby-identifier">porter2_step1a
</span>
460 279:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-constant">Porter2
</span><span class=
"ruby-operator">::
</span><span class=
"ruby-constant">STEP_1A_SPECIAL_CASES
</span>.
<span class=
"ruby-identifier">include?
</span> <span class=
"ruby-identifier">w1a
</span>
461 280:
<span class=
"ruby-identifier">w1a
</span>
462 281:
<span class=
"ruby-keyword kw">else
</span>
463 282:
<span class=
"ruby-identifier">w1a
</span>.
<span class=
"ruby-identifier">porter2_step1b
</span>(
<span class=
"ruby-identifier">gb_english
</span>).
<span class=
"ruby-identifier">porter2_step1c
</span>.
<span class=
"ruby-identifier">porter2_step2
</span>(
<span class=
"ruby-identifier">gb_english
</span>).
<span class=
"ruby-identifier">porter2_step3
</span>(
<span class=
"ruby-identifier">gb_english
</span>).
<span class=
"ruby-identifier">porter2_step4
</span>(
<span class=
"ruby-identifier">gb_english
</span>).
<span class=
"ruby-identifier">porter2_step5
</span>.
<span class=
"ruby-identifier">porter2_postprocess
</span>
464 283:
<span class=
"ruby-keyword kw">end
</span>
465 284:
<span class=
"ruby-keyword kw">end
</span>
466 285:
<span class=
"ruby-keyword kw">end
</span></pre>
472 <div class=
"aliases">
473 Also aliased as:
<a href=
"String.html#method-i-stem">stem
</a>
481 <div id=
"porter-stem-verbose-method" class=
"method-detail ">
482 <a name=
"method-i-porter2_stem_verbose"></a>
484 <div class=
"method-heading">
486 <span class=
"method-name">porter2_stem_verbose
</span><span
487 class=
"method-args">(gb_english = false)
</span>
488 <span class=
"method-click-advice">click to toggle source
</span>
492 <div class=
"method-description">
495 A verbose version of
<a
496 href=
"String.html#method-i-porter2_stem">porter2_stem
</a> that prints the
497 output of each stage to STDOUT
502 <div class=
"method-source-code"
503 id=
"porter-stem-verbose-source">
505 <span class=
"ruby-comment cmt"># File lib/porter2_implementation.rb, line
288</span>
506 288:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_stem_verbose
</span>(
<span class=
"ruby-identifier">gb_english
</span> =
<span class=
"ruby-keyword kw">false
</span>)
507 289:
<span class=
"ruby-identifier">preword
</span> =
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_tidy
</span>
508 290:
<span class=
"ruby-identifier">puts
</span> <span class=
"ruby-node">"Preword: #{preword}
"</span>
509 291:
<span class=
"ruby-keyword kw">return
</span> <span class=
"ruby-identifier">preword
</span> <span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-identifier">preword
</span>.
<span class=
"ruby-identifier">length
</span> <span class=
"ruby-operator"><=
</span> <span class=
"ruby-value">2</span>
511 293:
<span class=
"ruby-identifier">word
</span> =
<span class=
"ruby-identifier">preword
</span>.
<span class=
"ruby-identifier">porter2_preprocess
</span>
512 294:
<span class=
"ruby-identifier">puts
</span> <span class=
"ruby-node">"Preprocessed: #{word}
"</span>
514 296:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-constant">Porter2
</span><span class=
"ruby-operator">::
</span><span class=
"ruby-constant">SPECIAL_CASES
</span>.
<span class=
"ruby-identifier">has_key?
</span> <span class=
"ruby-identifier">word
</span>
515 297:
<span class=
"ruby-identifier">puts
</span> <span class=
"ruby-node">"Returning #{word} as special case #{Porter2::SPECIAL_CASES[word]}
"</span>
516 298:
<span class=
"ruby-constant">Porter2
</span><span class=
"ruby-operator">::
</span><span class=
"ruby-constant">SPECIAL_CASES
</span>[
<span class=
"ruby-identifier">word
</span>]
517 299:
<span class=
"ruby-keyword kw">else
</span>
518 300:
<span class=
"ruby-identifier">r1
</span> =
<span class=
"ruby-identifier">word
</span>.
<span class=
"ruby-identifier">porter2_r1
</span>
519 301:
<span class=
"ruby-identifier">r2
</span> =
<span class=
"ruby-identifier">word
</span>.
<span class=
"ruby-identifier">porter2_r2
</span>
520 302:
<span class=
"ruby-identifier">puts
</span> <span class=
"ruby-node">"R1 = #{r1}, R2 = #{r2}
"</span>
522 304:
<span class=
"ruby-identifier">w0
</span> =
<span class=
"ruby-identifier">word
</span>.
<span class=
"ruby-identifier">porter2_step0
</span> ;
<span class=
"ruby-identifier">puts
</span> <span class=
"ruby-node">"After step
0: #{w0} (R1 = #{w0.porter2_r1}, R2 = #{w0.porter2_r2})
"</span>
523 305:
<span class=
"ruby-identifier">w1a
</span> =
<span class=
"ruby-identifier">w0
</span>.
<span class=
"ruby-identifier">porter2_step1a
</span> ;
<span class=
"ruby-identifier">puts
</span> <span class=
"ruby-node">"After step
1a: #{w1a} (R1 = #{w1a.porter2_r1}, R2 = #{w1a.porter2_r2})
"</span>
525 307:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-constant">Porter2
</span><span class=
"ruby-operator">::
</span><span class=
"ruby-constant">STEP_1A_SPECIAL_CASES
</span>.
<span class=
"ruby-identifier">include?
</span> <span class=
"ruby-identifier">w1a
</span>
526 308:
<span class=
"ruby-identifier">puts
</span> <span class=
"ruby-node">"Returning #{w1a} as
1a special case
"</span>
527 309:
<span class=
"ruby-identifier">w1a
</span>
528 310:
<span class=
"ruby-keyword kw">else
</span>
529 311:
<span class=
"ruby-identifier">w1b
</span> =
<span class=
"ruby-identifier">w1a
</span>.
<span class=
"ruby-identifier">porter2_step1b
</span>(
<span class=
"ruby-identifier">gb_english
</span>) ;
<span class=
"ruby-identifier">puts
</span> <span class=
"ruby-node">"After step
1b: #{w1b} (R1 = #{w1b.porter2_r1}, R2 = #{w1b.porter2_r2})
"</span>
530 312:
<span class=
"ruby-identifier">w1c
</span> =
<span class=
"ruby-identifier">w1b
</span>.
<span class=
"ruby-identifier">porter2_step1c
</span> ;
<span class=
"ruby-identifier">puts
</span> <span class=
"ruby-node">"After step
1c: #{w1c} (R1 = #{w1c.porter2_r1}, R2 = #{w1c.porter2_r2})
"</span>
531 313:
<span class=
"ruby-identifier">w2
</span> =
<span class=
"ruby-identifier">w1c
</span>.
<span class=
"ruby-identifier">porter2_step2
</span>(
<span class=
"ruby-identifier">gb_english
</span>) ;
<span class=
"ruby-identifier">puts
</span> <span class=
"ruby-node">"After step
2: #{w2} (R1 = #{w2.porter2_r1}, R2 = #{w2.porter2_r2})
"</span>
532 314:
<span class=
"ruby-identifier">w3
</span> =
<span class=
"ruby-identifier">w2
</span>.
<span class=
"ruby-identifier">porter2_step3
</span>(
<span class=
"ruby-identifier">gb_english
</span>) ;
<span class=
"ruby-identifier">puts
</span> <span class=
"ruby-node">"After step
3: #{w3} (R1 = #{w3.porter2_r1}, R2 = #{w3.porter2_r2})
"</span>
533 315:
<span class=
"ruby-identifier">w4
</span> =
<span class=
"ruby-identifier">w3
</span>.
<span class=
"ruby-identifier">porter2_step4
</span>(
<span class=
"ruby-identifier">gb_english
</span>) ;
<span class=
"ruby-identifier">puts
</span> <span class=
"ruby-node">"After step
4: #{w4} (R1 = #{w4.porter2_r1}, R2 = #{w4.porter2_r2})
"</span>
534 316:
<span class=
"ruby-identifier">w5
</span> =
<span class=
"ruby-identifier">w4
</span>.
<span class=
"ruby-identifier">porter2_step5
</span> ;
<span class=
"ruby-identifier">puts
</span> <span class=
"ruby-node">"After step
5: #{w5}
"</span>
535 317:
<span class=
"ruby-identifier">wpost
</span> =
<span class=
"ruby-identifier">w5
</span>.
<span class=
"ruby-identifier">porter2_postprocess
</span> ;
<span class=
"ruby-identifier">puts
</span> <span class=
"ruby-node">"After postprocess: #{wpost}
"</span>
536 318:
<span class=
"ruby-identifier">wpost
</span>
537 319:
<span class=
"ruby-keyword kw">end
</span>
538 320:
<span class=
"ruby-keyword kw">end
</span>
539 321:
<span class=
"ruby-keyword kw">end
</span></pre>
550 <div id=
"porter-step--method" class=
"method-detail ">
551 <a name=
"method-i-porter2_step0"></a>
553 <div class=
"method-heading">
555 <span class=
"method-name">porter2_step0
</span><span
556 class=
"method-args">()
</span>
557 <span class=
"method-click-advice">click to toggle source
</span>
561 <div class=
"method-description">
Search for the longest among the suffixes ’, ’s and ’s’, and remove it if found.
586 <div class=
"method-source-code"
587 id=
"porter-step--source">
589 <span class=
"ruby-comment cmt"># File lib/porter2_implementation.rb, line
75</span>
590 75:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_step0
</span>
591 76:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub!
</span>(
<span class=
"ruby-regexp re">/(.)('s'|'s|')$/
</span>,
<span class=
"ruby-value str">'\1'
</span>)
<span class=
"ruby-operator">||
</span> <span class=
"ruby-keyword kw">self
</span>
592 77:
<span class=
"ruby-keyword kw">end
</span></pre>
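<p>The following sketch shows the same suffix removal as a standalone function. One difference from the method above is an intentional design choice: the method calls <tt>sub!</tt>, which modifies the receiver in place, whereas this sketch uses the non-mutating <tt>sub</tt> and returns a new string. The name <tt>step0</tt> is hypothetical.</p>

```ruby
# Remove a trailing ', 's or 's' when at least one character precedes it.
def step0(word)
  word.sub(/(.)('s'|'s|')$/, '\1')
end

puts step0("dog's")   # dog
puts step0("dogs'")   # dogs
puts step0("dog's'")  # dog
```

<p>Because the regular expression engine takes the leftmost match that reaches the end of the string, the longest applicable suffix is the one removed.</p>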
603 <div id=
"porter-step-a-method" class=
"method-detail ">
604 <a name=
"method-i-porter2_step1a"></a>
606 <div class=
"method-heading">
608 <span class=
"method-name">porter2_step1a
</span><span
609 class=
"method-args">()
</span>
610 <span class=
"method-click-advice">click to toggle source
</span>
614 <div class=
"method-description">
Search for the longest among the following suffixes, and perform the action indicated.
621 <tr><td valign=
"top">sses
</td><td><p>
replace by ss
625 <tr><td valign=
"top">ied, ies
</td><td><p>
626 replace by i if preceded by more than one letter, otherwise by ie
629 <tr><td valign=
"top">s
</td><td><p>
delete if the preceding word part contains a vowel not immediately before the s
634 <tr><td valign=
"top">us, ss
</td><td><p>
do nothing
642 <div class=
"method-source-code"
643 id=
"porter-step-a-source">
645 <span class=
"ruby-comment cmt"># File lib/porter2_implementation.rb, line
85</span>
646 85:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_step1a
</span>
647 86:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/sses$/
</span>
648 87:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/sses$/
</span>,
<span class=
"ruby-value str">'ss'
</span>)
649 88:
<span class=
"ruby-keyword kw">elsif
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/..(ied|ies)$/
</span>
650 89:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/(ied|ies)$/
</span>,
<span class=
"ruby-value str">'i'
</span>)
651 90:
<span class=
"ruby-keyword kw">elsif
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/(ied|ies)$/
</span>
652 91:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/(ied|ies)$/
</span>,
<span class=
"ruby-value str">'ie'
</span>)
653 92:
<span class=
"ruby-keyword kw">elsif
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/(us|ss)$/
</span>
654 93:
<span class=
"ruby-keyword kw">self
</span>
655 94:
<span class=
"ruby-keyword kw">elsif
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/s$/
</span>
656 95:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/(#{Porter2::V}.+)s$/
</span>
657 96:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/s$/
</span>,
<span class=
"ruby-value str">''
</span>)
658 97:
<span class=
"ruby-keyword kw">else
</span>
659 98:
<span class=
"ruby-keyword kw">self
</span>
660 99:
<span class=
"ruby-keyword kw">end
</span>
661 100:
<span class=
"ruby-keyword kw">else
</span>
662 101:
<span class=
"ruby-keyword kw">self
</span>
663 102:
<span class=
"ruby-keyword kw">end
</span>
664 103:
<span class=
"ruby-keyword kw">end
</span></pre>
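<p>The branch order in step 1a matters, so a compact restatement with sample words may be useful. This is a sketch only: it assumes <tt>Porter2::V</tt> is <tt>[aeiouy]</tt>, and <tt>step1a</tt> is a hypothetical standalone helper rather than the method documented here.</p>

```ruby
V = "[aeiouy]"  # assumed value of Porter2::V

# Apply the first matching rule, longest suffixes first.
def step1a(word)
  case word
  when /sses$/          then word.sub(/sses$/, "ss")
  when /..(ied|ies)$/   then word.sub(/(ied|ies)$/, "i")
  when /(ied|ies)$/     then word.sub(/(ied|ies)$/, "ie")
  when /(us|ss)$/       then word
  when /#{V}.+s$/       then word.chop   # vowel not immediately before the s
  else word
  end
end

puts step1a("caresses")  # caress
puts step1a("ties")      # tie
puts step1a("cries")     # cri
puts step1a("gaps")      # gap
puts step1a("gas")       # gas
```

<p>Testing <tt>us</tt> and <tt>ss</tt> before the bare <tt>s</tt> rule is what protects words such as "focus" from losing their final letter.</p>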
675 <div id=
"porter-step-b-method" class=
"method-detail ">
676 <a name=
"method-i-porter2_step1b"></a>
678 <div class=
"method-heading">
680 <span class=
"method-name">porter2_step1b
</span><span
681 class=
"method-args">(gb_english = false)
</span>
682 <span class=
"method-click-advice">click to toggle source
</span>
686 <div class=
"method-description">
Search for the longest among the following suffixes, and perform the action indicated.
693 <tr><td valign=
"top">eed, eedly
</td><td><p>
694 replace by ee if the suffix is also in R1
697 <tr><td valign=
"top">ed, edly, ing, ingly
</td><td><p>
delete if the preceding word part contains a vowel and, after the deletion:
if the word ends in at, bl or iz: add e, or
709 if the word ends with a double: remove the last letter, or
715 if the word is short: add e
722 (If gb_english is
<tt>true
</tt>, treat the
‘is
’ suffix as
723 ‘iz
’ above.)
728 <div class=
"method-source-code"
729 id=
"porter-step-b-source">
731 <span class=
"ruby-comment cmt"># File lib/porter2_implementation.rb, line
115</span>
732 115:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_step1b
</span>(
<span class=
"ruby-identifier">gb_english
</span> =
<span class=
"ruby-keyword kw">false
</span>)
733 116:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/(eed|eedly)$/
</span>
734 117:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r1
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/(eed|eedly)$/
</span>
735 118:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/(eed|eedly)$/
</span>,
<span class=
"ruby-value str">'ee'
</span>)
736 119:
<span class=
"ruby-keyword kw">else
</span>
737 120:
<span class=
"ruby-keyword kw">self
</span>
738 121:
<span class=
"ruby-keyword kw">end
</span>
739 122:
<span class=
"ruby-keyword kw">else
</span>
740 123:
<span class=
"ruby-identifier">w
</span> =
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">dup
</span>
741 124:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-identifier">w
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/#{Porter2::V}.*(ed|edly|ing|ingly)$/
</span>
742 125:
<span class=
"ruby-identifier">w
</span>.
<span class=
"ruby-identifier">sub!
</span>(
<span class=
"ruby-regexp re">/(ed|edly|ing|ingly)$/
</span>,
<span class=
"ruby-value str">''
</span>)
743 126:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-identifier">w
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/(at|bl|iz)$/
</span>
744 127:
<span class=
"ruby-identifier">w
</span> <span class=
"ruby-operator">+=
</span> <span class=
"ruby-value str">'e'
</span>
745 128:
<span class=
"ruby-keyword kw">elsif
</span> <span class=
"ruby-identifier">w
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/is$/
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-identifier">gb_english
</span>
746 129:
<span class=
"ruby-identifier">w
</span> <span class=
"ruby-operator">+=
</span> <span class=
"ruby-value str">'e'
</span>
747 130:
<span class=
"ruby-keyword kw">elsif
</span> <span class=
"ruby-identifier">w
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/#{Porter2::Double}$/
</span>
748 131:
<span class=
"ruby-identifier">w
</span>.
<span class=
"ruby-identifier">chop!
</span>
749 132:
<span class=
"ruby-keyword kw">elsif
</span> <span class=
"ruby-identifier">w
</span>.
<span class=
"ruby-identifier">porter2_is_short_word?
</span>
750 133:
<span class=
"ruby-identifier">w
</span> <span class=
"ruby-operator">+=
</span> <span class=
"ruby-value str">'e'
</span>
751 134:
<span class=
"ruby-keyword kw">end
</span>
752 135:
<span class=
"ruby-keyword kw">end
</span>
753 136:
<span class=
"ruby-identifier">w
</span>
754 137:
<span class=
"ruby-keyword kw">end
</span>
755 138:
<span class=
"ruby-keyword kw">end
</span></pre>
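<p>Step 1b has the most branches, so the sketch below restates it as a standalone function that follows the description (at, bl or iz; then doubles; then short words). Everything it relies on is an assumption: the values of <tt>V</tt>, <tt>C</tt>, <tt>DOUBLE</tt> and <tt>SHORT_SYLLABLE</tt> are guesses at the <tt>Porter2</tt> constants, the helpers <tt>r1</tt>, <tt>short_word?</tt> and <tt>step1b</tt> are hypothetical, and the simplified <tt>r1</tt> omits the gener/commun/arsen special cases.</p>

```ruby
# Assumed stand-ins for the Porter2 module constants.
V = "[aeiouy]"
C = "[^aeiouy]"
DOUBLE = "(bb|dd|ff|gg|mm|nn|pp|rr|tt)"
SHORT_SYLLABLE = "((#{C}#{V}[^aeiouywxY])|(^#{V}#{C}))"

# Simplified R1: no special-case prefixes.
def r1(word)
  word[/#{V}#{C}(.*)$/, 1] || ""
end

# Short word: ends in a short syllable and has an empty R1.
def short_word?(word)
  word.match?(/#{SHORT_SYLLABLE}$/) and r1(word).empty?
end

def step1b(word, gb_english = false)
  # eed / eedly: replace by ee only when the suffix lies in R1.
  if word.match?(/(eedly|eed)$/)
    return r1(word).match?(/(eedly|eed)$/) ? word.sub(/(eedly|eed)$/, "ee") : word
  end
  # Other suffixes need a vowel somewhere before them.
  return word unless word.match?(/#{V}.*(ingly|edly|ing|ed)$/)

  stem = word.sub(/(ingly|edly|ing|ed)$/, "")
  if stem.match?(/(at|bl|iz)$/) or (gb_english and stem.match?(/is$/))
    stem + "e"     # luxuriat -> luxuriate
  elsif stem.match?(/#{DOUBLE}$/)
    stem.chop      # hopp -> hop
  elsif short_word?(stem)
    stem + "e"     # hop -> hope
  else
    stem
  end
end

puts step1b("hopping")    # hop
puts step1b("hoping")     # hope
puts step1b("troubled")   # trouble
```

<p>Passing <tt>true</tt> as the second argument makes a stem ending in "is" behave like one ending in "iz", so "realising" becomes "realise" in this sketch.</p>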
766 <div id=
"porter-step-c-method" class=
"method-detail ">
767 <a name=
"method-i-porter2_step1c"></a>
769 <div class=
"method-heading">
771 <span class=
"method-name">porter2_step1c
</span><span
772 class=
"method-args">()
</span>
773 <span class=
"method-click-advice">click to toggle source
</span>
777 <div class=
"method-description">
780 Replace a suffix of y or Y by i if it is preceded by a non-vowel which is
781 not the first letter of the word.
786 <div class=
"method-source-code"
787 id=
"porter-step-c-source">
789 <span class=
"ruby-comment cmt"># File lib/porter2_implementation.rb, line
143</span>
790 143:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_step1c
</span>
791 144:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/.+#{Porter2::C}(y|Y)$/
</span>
792 145:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/(y|Y)$/
</span>,
<span class=
"ruby-value str">'i'
</span>)
793 146:
<span class=
"ruby-keyword kw">else
</span>
794 147:
<span class=
"ruby-keyword kw">self
</span>
795 148:
<span class=
"ruby-keyword kw">end
</span>
796 149:
<span class=
"ruby-keyword kw">end
</span></pre>
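<p>The rule above can be tried outside the library with a small standalone sketch. <tt>NON_VOWEL</tt> and <tt>step1c_sketch</tt> are hypothetical names introduced for illustration; <tt>NON_VOWEL</tt> is only an assumed stand-in for <tt>Porter2::C</tt>, whose definition is not shown in this section.</p>

```ruby
# NON_VOWEL is a hypothetical stand-in for Porter2::C; the library's
# actual definition is not shown in this section and may differ.
NON_VOWEL = "[^aeiouy]"

# Standalone version of the step 1c rule, taking the word as an argument
# instead of being defined on String.
def step1c_sketch(word)
  # ".+" forces at least one character before the non-vowel, so the
  # non-vowel cannot be the first letter of the word.
  if word =~ /.+#{NON_VOWEL}(y|Y)$/
    word.sub(/(y|Y)$/, 'i')
  else
    word
  end
end

puts step1c_sketch("cry")  # the "r" before "y" is not the first letter
puts step1c_sketch("by")   # the "b" before "y" is the first letter
puts step1c_sketch("say")  # the "y" follows a vowel
```

<p>Under these assumptions, "cry" becomes "cri", while "by" and "say" are returned unchanged.</p>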
807 <div id=
"porter-step--method" class=
"method-detail ">
808 <a name=
"method-i-porter2_step2"></a>
810 <div class=
"method-heading">
812 <span class=
"method-name">porter2_step2
</span><span
813 class=
"method-args">(gb_english = false)
</span>
818 <div class=
"method-description">
821 Search for the longest among the suffixes listed in the keys of
822 Porter2::STEP_2_MAPS. If one is found and that suffix occurs in R1,
823 replace it with the value found in STEP_2_MAPS.
(Suffixes ‘ogi’ and ‘li’ are treated as special cases in the procedure.)
(If gb_english is <tt>true</tt>, replace the ‘iser’ and ‘isation’ suffixes with ‘ise’, similarly to how ‘izer’ and ‘ization’ are treated.)
837 <div class=
"method-source-code"
838 id=
"porter-step--source">
840 <span class=
"ruby-comment cmt"># File lib/porter2_implementation.rb, line
160</span>
841 160:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_step2
</span>(
<span class=
"ruby-identifier">gb_english
</span> =
<span class=
"ruby-keyword kw">false
</span>)
842 161:
<span class=
"ruby-identifier">r1
</span> =
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r1
</span>
843 162:
<span class=
"ruby-identifier">s2m
</span> =
<span class=
"ruby-constant">Porter2
</span><span class=
"ruby-operator">::
</span><span class=
"ruby-constant">STEP_2_MAPS
</span>.
<span class=
"ruby-identifier">dup
</span>
844 163:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-identifier">gb_english
</span>
845 164:
<span class=
"ruby-identifier">s2m
</span>[
<span class=
"ruby-value str">"iser
"</span>] =
<span class=
"ruby-value str">"ise
"</span>
846 165:
<span class=
"ruby-identifier">s2m
</span>[
<span class=
"ruby-value str">"isation
"</span>] =
<span class=
"ruby-value str">"ise
"</span>
847 166:
<span class=
"ruby-keyword kw">end
</span>
848 167:
<span class=
"ruby-identifier">step_2_re
</span> =
<span class=
"ruby-constant">Regexp
</span>.
<span class=
"ruby-identifier">union
</span>(
<span class=
"ruby-identifier">s2m
</span>.
<span class=
"ruby-identifier">keys
</span>.
<span class=
"ruby-identifier">map
</span> {
<span class=
"ruby-operator">|
</span><span class=
"ruby-identifier">r
</span><span class=
"ruby-operator">|
</span> <span class=
"ruby-constant">Regexp
</span>.
<span class=
"ruby-identifier">new
</span>(
<span class=
"ruby-identifier">r
</span> <span class=
"ruby-operator">+
</span> <span class=
"ruby-value str">"$
"</span>)})
849 168:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-identifier">step_2_re
</span>
850 169:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-identifier">r1
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/#{$
&}$/
</span>
851 170:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-node">/#{$
&}$/
</span>,
<span class=
"ruby-identifier">s2m
</span>[
<span class=
"ruby-node">$
&</span>])
852 171:
<span class=
"ruby-keyword kw">else
</span>
853 172:
<span class=
"ruby-keyword kw">self
</span>
854 173:
<span class=
"ruby-keyword kw">end
</span>
855 174:
<span class=
"ruby-keyword kw">elsif
</span> <span class=
"ruby-identifier">r1
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/li$/
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/(#{Porter2::Valid_LI})li$/
</span>
856 175:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/li$/
</span>,
<span class=
"ruby-value str">''
</span>)
857 176:
<span class=
"ruby-keyword kw">elsif
</span> <span class=
"ruby-identifier">r1
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/ogi$/
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/logi$/
</span>
858 177:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/ogi$/
</span>,
<span class=
"ruby-value str">'og'
</span>)
859 178:
<span class=
"ruby-keyword kw">else
</span>
860 179:
<span class=
"ruby-keyword kw">self
</span>
861 180:
<span class=
"ruby-keyword kw">end
</span>
862 181:
<span class=
"ruby-keyword kw">end
</span></pre>
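<p>A minimal sketch of the longest-suffix lookup used above, assuming a three-entry sample map and a simplified R1. <tt>SAMPLE_STEP_2_MAPS</tt>, <tt>r1_sketch</tt> and <tt>step2_sketch</tt> are hypothetical names; the library's <tt>Porter2::STEP_2_MAPS</tt> and <tt>porter2_r1</tt> may behave differently, and the ‘li’ and ‘ogi’ special cases are left out.</p>

```ruby
# Hypothetical three-entry subset standing in for Porter2::STEP_2_MAPS.
# "tional" is listed before "ational" on purpose: key order does not matter.
SAMPLE_STEP_2_MAPS = {
  "tional"  => "tion",
  "ational" => "ate",
  "izer"    => "ize"
}

# Simplified R1: the text after the first non-vowel that follows a vowel.
# The library's porter2_r1 may treat some words specially.
def r1_sketch(word)
  m = word.match(/[aeiouy][^aeiouy]/)
  m ? m.post_match : ""
end

def step2_sketch(word, gb_english = false)
  maps = SAMPLE_STEP_2_MAPS.dup
  maps["iser"] = "ise" if gb_english
  # Every alternative is anchored at the end of the word, so the leftmost
  # match found by the regexp engine is also the longest matching suffix.
  suffix_re = Regexp.union(maps.keys.map { |key| Regexp.new(key + "$") })
  return word unless word =~ suffix_re
  suffix = $&
  # Replace only when the whole suffix lies inside R1.
  r1_sketch(word).end_with?(suffix) ? word.sub(/#{suffix}$/, maps[suffix]) : word
end

puts step2_sketch("relational")
puts step2_sketch("rational")
puts step2_sketch("organiser", true)
```

<p>With this sample map, "relational" becomes "relate". "rational" is left alone because its longest matching suffix, "ational", does not fit inside the simplified R1 ("ional"). "organiser" becomes "organise" only when <tt>gb_english</tt> is <tt>true</tt>.</p>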
873 <div id=
"porter-step--method" class=
"method-detail ">
874 <a name=
"method-i-porter2_step3"></a>
876 <div class=
"method-heading">
878 <span class=
"method-name">porter2_step3
</span><span
879 class=
"method-args">(gb_english = false)
</span>
884 <div class=
"method-description">
887 Search for the longest among the suffixes listed in the keys of
888 Porter2::STEP_3_MAPS. If one is found and that suffix occurs in R1,
889 replace it with the value found in STEP_3_MAPS.
(Suffix ‘ative’ is treated as a special case in the procedure.)
(If gb_english is <tt>true</tt>, replace the ‘alise’ suffix with ‘al’, similarly to how ‘alize’ is treated.)
901 <div class=
"method-source-code"
902 id=
"porter-step--source">
904 <span class=
"ruby-comment cmt"># File lib/porter2_implementation.rb, line
192</span>
905 192:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_step3
</span>(
<span class=
"ruby-identifier">gb_english
</span> =
<span class=
"ruby-keyword kw">false
</span>)
906 193:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/ative$/
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r2
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/ative$/
</span>
907 194:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/ative$/
</span>,
<span class=
"ruby-value str">''
</span>)
908 195:
<span class=
"ruby-keyword kw">else
</span>
909 196:
<span class=
"ruby-identifier">s3m
</span> =
<span class=
"ruby-constant">Porter2
</span><span class=
"ruby-operator">::
</span><span class=
"ruby-constant">STEP_3_MAPS
</span>.
<span class=
"ruby-identifier">dup
</span>
910 197:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-identifier">gb_english
</span>
911 198:
<span class=
"ruby-identifier">s3m
</span>[
<span class=
"ruby-value str">"alise
"</span>] =
<span class=
"ruby-value str">"al
"</span>
912 199:
<span class=
"ruby-keyword kw">end
</span>
913 200:
<span class=
"ruby-identifier">step_3_re
</span> =
<span class=
"ruby-constant">Regexp
</span>.
<span class=
"ruby-identifier">union
</span>(
<span class=
"ruby-identifier">s3m
</span>.
<span class=
"ruby-identifier">keys
</span>.
<span class=
"ruby-identifier">map
</span> {
<span class=
"ruby-operator">|
</span><span class=
"ruby-identifier">r
</span><span class=
"ruby-operator">|
</span> <span class=
"ruby-constant">Regexp
</span>.
<span class=
"ruby-identifier">new
</span>(
<span class=
"ruby-identifier">r
</span> <span class=
"ruby-operator">+
</span> <span class=
"ruby-value str">"$
"</span>)})
914 201:
<span class=
"ruby-identifier">r1
</span> =
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r1
</span>
915 202:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-identifier">step_3_re
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-identifier">r1
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/#{$
&}$/
</span>
916 203:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-node">/#{$
&}$/
</span>,
<span class=
"ruby-identifier">s3m
</span>[
<span class=
"ruby-node">$
&</span>])
917 204:
<span class=
"ruby-keyword kw">else
</span>
918 205:
<span class=
"ruby-keyword kw">self
</span>
919 206:
<span class=
"ruby-keyword kw">end
</span>
920 207:
<span class=
"ruby-keyword kw">end
</span>
921 208:
<span class=
"ruby-keyword kw">end
</span></pre>
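<p>The ‘ative’ special case depends on R2 rather than R1. The sketch below illustrates only that case, assuming R2 is obtained by applying the same simplified region rule twice. <tt>region_after_first_vc</tt>, <tt>r2_sketch</tt> and <tt>drop_ative_sketch</tt> are hypothetical helpers, not part of the library.</p>

```ruby
# Simplified region helper: the text after the first vowel that is
# immediately followed by a non-vowel. This is an assumption standing in
# for the library's porter2_r1 / porter2_r2 methods.
def region_after_first_vc(text)
  m = text.match(/[aeiouy][^aeiouy]/)
  m ? m.post_match : ""
end

# R2 is taken here as "R1 of R1".
def r2_sketch(word)
  region_after_first_vc(region_after_first_vc(word))
end

# Only the 'ative' special case of step 3 is sketched: the suffix is
# removed when it lies entirely inside R2.
def drop_ative_sketch(word)
  r2_sketch(word).end_with?("ative") ? word.sub(/ative$/, '') : word
end

puts drop_ative_sketch("informative")
puts drop_ative_sketch("native")
```

<p>Here "informative" has a simplified R2 of "mative", so the suffix is removed and "inform" is returned. "native" has an R2 of just "e", so it is returned unchanged.</p>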
932 <div id=
"porter-step--method" class=
"method-detail ">
933 <a name=
"method-i-porter2_step4"></a>
935 <div class=
"method-heading">
937 <span class=
"method-name">porter2_step4
</span><span
938 class=
"method-args">(gb_english = false)
</span>
943 <div class=
"method-description">
946 Search for the longest among the suffixes listed in the keys of
947 Porter2::STEP_4_MAPS. If one is found and that suffix occurs in R2,
948 replace it with the value found in STEP_4_MAPS.
(Suffix ‘ion’ is treated as a special case in the procedure.)
(If gb_english is <tt>true</tt>, delete the ‘ise’ suffix if
960 <div class=
"method-source-code"
961 id=
"porter-step--source">
963 <span class=
"ruby-comment cmt"># File lib/porter2_implementation.rb, line
218</span>
964 218:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_step4
</span>(
<span class=
"ruby-identifier">gb_english
</span> =
<span class=
"ruby-keyword kw">false
</span>)
965 219:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r2
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/ion$/
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/(s|t)ion$/
</span>
966 220:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/ion$/
</span>,
<span class=
"ruby-value str">''
</span>)
967 221:
<span class=
"ruby-keyword kw">else
</span>
968 222:
<span class=
"ruby-identifier">s4m
</span> =
<span class=
"ruby-constant">Porter2
</span><span class=
"ruby-operator">::
</span><span class=
"ruby-constant">STEP_4_MAPS
</span>.
<span class=
"ruby-identifier">dup
</span>
969 223:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-identifier">gb_english
</span>
970 224:
<span class=
"ruby-identifier">s4m
</span>[
<span class=
"ruby-value str">"ise
"</span>] =
<span class=
"ruby-value str">""</span>
971 225:
<span class=
"ruby-keyword kw">end
</span>
972 226:
<span class=
"ruby-identifier">step_4_re
</span> =
<span class=
"ruby-constant">Regexp
</span>.
<span class=
"ruby-identifier">union
</span>(
<span class=
"ruby-identifier">s4m
</span>.
<span class=
"ruby-identifier">keys
</span>.
<span class=
"ruby-identifier">map
</span> {
<span class=
"ruby-operator">|
</span><span class=
"ruby-identifier">r
</span><span class=
"ruby-operator">|
</span> <span class=
"ruby-constant">Regexp
</span>.
<span class=
"ruby-identifier">new
</span>(
<span class=
"ruby-identifier">r
</span> <span class=
"ruby-operator">+
</span> <span class=
"ruby-value str">"$
"</span>)})
973 227:
<span class=
"ruby-identifier">r2
</span> =
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r2
</span>
974 228:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-identifier">step_4_re
</span>
975 229:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-identifier">r2
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/#{$
&}/
</span>
976 230:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-node">/#{$
&}$/
</span>,
<span class=
"ruby-identifier">s4m
</span>[
<span class=
"ruby-node">$
&</span>])
977 231:
<span class=
"ruby-keyword kw">else
</span>
978 232:
<span class=
"ruby-keyword kw">self
</span>
979 233:
<span class=
"ruby-keyword kw">end
</span>
980 234:
<span class=
"ruby-keyword kw">else
</span>
981 235:
<span class=
"ruby-keyword kw">self
</span>
982 236:
<span class=
"ruby-keyword kw">end
</span>
983 237:
<span class=
"ruby-keyword kw">end
</span>
984 238:
<span class=
"ruby-keyword kw">end
</span></pre>
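<p>A rough sketch of the step 4 flow, under the assumption that every suffix in the map is simply deleted. <tt>SAMPLE_STEP_4_MAPS</tt>, <tt>region_after_first_vc</tt> and <tt>step4_sketch</tt> are hypothetical names. Note that the sketch picks the longest suffix with <tt>select</tt> and <tt>max_by</tt> rather than the <tt>Regexp.union</tt> approach used in the source above.</p>

```ruby
# Hypothetical subset standing in for Porter2::STEP_4_MAPS; it is assumed
# here that step 4 suffixes map to the empty string.
SAMPLE_STEP_4_MAPS = { "able" => "", "ment" => "" }

# Simplified region helper (an assumption, not the library's porter2_r2).
def region_after_first_vc(text)
  m = text.match(/[aeiouy][^aeiouy]/)
  m ? m.post_match : ""
end

def step4_sketch(word, gb_english = false)
  r2 = region_after_first_vc(region_after_first_vc(word))
  # Special case: 'ion' is dropped only when it is in R2 and follows 's' or 't'.
  return word.sub(/ion$/, '') if r2.end_with?("ion") && word =~ /(s|t)ion$/
  maps = SAMPLE_STEP_4_MAPS.dup
  maps["ise"] = "" if gb_english
  # Longest key that the word ends with, or nil when none matches.
  suffix = maps.keys.select { |key| word.end_with?(key) }.max_by(&:length)
  suffix && r2.end_with?(suffix) ? word.sub(/#{suffix}$/, maps[suffix]) : word
end

puts step4_sketch("adoption")
puts step4_sketch("opinion")
puts step4_sketch("adjustable")
puts step4_sketch("organise", true)
```

<p>With these assumptions, "adoption" loses ‘ion’ because it follows ‘t’, "opinion" is kept because ‘ion’ follows ‘n’, "adjustable" becomes "adjust", and "organise" becomes "organ" only when <tt>gb_english</tt> is <tt>true</tt>.</p>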
995 <div id=
"porter-step--method" class=
"method-detail ">
996 <a name=
"method-i-porter2_step5"></a>
998 <div class=
"method-heading">
1000 <span class=
"method-name">porter2_step5
</span><span
1001 class=
"method-args">()
</span>
1006 <div class=
"method-description">
Search for the following suffixes, and, if found, perform the action
1013 <tr><td valign=
"top">e
</td><td><p>
1014 delete if in R2, or in R1 and not preceded by a short syllable
1017 <tr><td valign=
"top">l
</td><td><p>
1018 delete if in R2 and preceded by l
1025 <div class=
"method-source-code"
1026 id=
"porter-step--source">
1028 <span class=
"ruby-comment cmt"># File lib/porter2_implementation.rb, line
244</span>
1029 244:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_step5
</span>
1030 245:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/ll$/
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r2
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/l$/
</span>
1031 246:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/ll$/
</span>,
<span class=
"ruby-value str">'l'
</span>)
1032 247:
<span class=
"ruby-keyword kw">elsif
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/e$/
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r2
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/e$/
</span>
1033 248:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/e$/
</span>,
<span class=
"ruby-value str">''
</span>)
1034 249:
<span class=
"ruby-keyword kw">else
</span>
1035 250:
<span class=
"ruby-identifier">r1
</span> =
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">porter2_r1
</span>
1036 251:
<span class=
"ruby-keyword kw">if
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/e$/
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-identifier">r1
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-regexp re">/e$/
</span> <span class=
"ruby-keyword kw">and
</span> <span class=
"ruby-keyword kw">not
</span> <span class=
"ruby-keyword kw">self
</span> <span class=
"ruby-operator">=~
</span> <span class=
"ruby-node">/#{Porter2::SHORT_SYLLABLE}e$/
</span>
1037 252:
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">sub
</span>(
<span class=
"ruby-regexp re">/e$/
</span>,
<span class=
"ruby-value str">''
</span>)
1038 253:
<span class=
"ruby-keyword kw">else
</span>
1039 254:
<span class=
"ruby-keyword kw">self
</span>
1040 255:
<span class=
"ruby-keyword kw">end
</span>
1041 256:
<span class=
"ruby-keyword kw">end
</span>
1042 257:
<span class=
"ruby-keyword kw">end
</span></pre>
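<p>The two step 5 rules can be sketched in standalone form as follows. <tt>SHORT_SYLLABLE_SKETCH</tt> is an assumed pattern standing in for <tt>Porter2::SHORT_SYLLABLE</tt>, and <tt>region_after_first_vc</tt> and <tt>step5_sketch</tt> are hypothetical helpers; the library's own definitions may differ.</p>

```ruby
# Assumed short-syllable pattern: non-vowel, vowel, then a non-vowel other
# than w, x or Y; or a vowel at the start of the word followed by a non-vowel.
# This is a stand-in for Porter2::SHORT_SYLLABLE, not its actual definition.
SHORT_SYLLABLE_SKETCH = "(?:[^aeiouy][aeiouy][^aeiouywxY]|^[aeiouy][^aeiouy])"

# Simplified region helper (an assumption, not the library's porter2_r1).
def region_after_first_vc(text)
  m = text.match(/[aeiouy][^aeiouy]/)
  m ? m.post_match : ""
end

def step5_sketch(word)
  r1 = region_after_first_vc(word)
  r2 = region_after_first_vc(r1)
  e_in_r2 = r2.end_with?("e")
  # 'e' in R1 counts only when no short syllable comes directly before it.
  e_in_r1 = r1.end_with?("e") && word !~ /#{SHORT_SYLLABLE_SKETCH}e$/
  if word.end_with?("ll") && r2.end_with?("l")
    word.chop          # 'l' in R2 preceded by 'l'
  elsif word.end_with?("e") && (e_in_r2 || e_in_r1)
    word.chop          # final 'e' satisfies one of the two conditions
  else
    word
  end
end

puts step5_sketch("debate")
puts step5_sketch("cease")
puts step5_sketch("rate")
puts step5_sketch("controll")
```

<p>In this sketch "debate" loses its final ‘e’ because the ‘e’ is in R2, "cease" loses it because the ‘e’ is in R1 and not preceded by a short syllable, "rate" keeps it because "rat" ends in a short syllable, and "controll" is shortened to "control".</p>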
1053 <div id=
"porter-tidy-method" class=
"method-detail ">
1054 <a name=
"method-i-porter2_tidy"></a>
1056 <div class=
"method-heading">
1058 <span class=
"method-name">porter2_tidy
</span><span
1059 class=
"method-args">()
</span>
1064 <div class=
"method-description">
Tidy up the word before running the algorithm: strip surrounding whitespace, downcase it, and map apostrophe-like characters to apostrophes.
1072 <div class=
"method-source-code"
1073 id=
"porter-tidy-source">
1075 <span class=
"ruby-comment cmt"># File lib/porter2_implementation.rb, line
7</span>
1076 7:
<span class=
"ruby-keyword kw">def
</span> <span class=
"ruby-identifier">porter2_tidy
</span>
1077 8:
<span class=
"ruby-identifier">preword
</span> =
<span class=
"ruby-keyword kw">self
</span>.
<span class=
"ruby-identifier">to_s
</span>.
<span class=
"ruby-identifier">strip
</span>.
<span class=
"ruby-identifier">downcase
</span>
1079 10:
<span class=
"ruby-comment cmt"># map apostrophe-like characters to apostrophes
</span>
1080 11:
<span class=
"ruby-identifier">preword
</span>.
<span class=
"ruby-identifier">gsub!
</span>(
<span class=
"ruby-regexp re">/‘/
</span>,
<span class=
"ruby-value str">"'
"</span>)
1081 12:
<span class=
"ruby-identifier">preword
</span>.
<span class=
"ruby-identifier">gsub!
</span>(
<span class=
"ruby-regexp re">/’/
</span>,
<span class=
"ruby-value str">"'
"</span>)
1083 14:
<span class=
"ruby-identifier">preword
</span>
1084 15:
<span class=
"ruby-keyword kw">end
</span></pre>
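<p>The same normalisation can be sketched as a standalone function. <tt>tidy_sketch</tt> is a hypothetical name, and it uses a single <tt>String#tr</tt> call in place of the two <tt>gsub!</tt> calls shown above; the result is intended to be equivalent for the two curly apostrophes handled here.</p>

```ruby
# Standalone sketch of the tidy step: strip, downcase, then map the left
# and right curly single quotes (U+2018, U+2019) to a plain apostrophe.
def tidy_sketch(word)
  cleaned = word.to_s.strip.downcase
  # tr pads the replacement set with its last character, so both curly
  # quotes are mapped to "'".
  cleaned.tr("\u2018\u2019", "'")
end

puts tidy_sketch("  Don\u2019t ")
puts tidy_sketch("\u2018Tis")
```

<p>Both calls print words containing only the plain ASCII apostrophe: "don't" and "'tis".</p>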
1095 <div id=
"stem-method" class=
"method-detail method-alias">
1096 <a name=
"method-i-stem"></a>
1098 <div class=
"method-heading">
1100 <span class=
"method-name">stem
</span><span
1101 class=
"method-args">(gb_english = false)
</span>
1106 <div class=
"method-description">
1117 <div class=
"aliases">
1118 Alias for:
<a href=
"String.html#method-i-porter2_stem">porter2_stem
</a>
1130 <div id=
"rdoc-debugging-section-dump" class=
"debugging-section">
1136 <div id=
"validator-badges">
1138 <p><small>Generated with the
<a href=
"http://deveiate.org/projects/Darkfish-Rdoc/">Darkfish
1139 Rdoc Generator
</a> 1.1.6</small>.
</p>