Split constants into a module, procedures directly into String.
1 <?xml version="1.0" encoding="utf-8"?>
2 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
3 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
4 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
5 <head>
6 <meta content="text/html; charset=utf-8" http-equiv="Content-Type" />
7
8 <title>Class: String</title>
9
10 <link rel="stylesheet" href="./rdoc.css" type="text/css" media="screen" />
11
12 <script src="./js/jquery.js" type="text/javascript"
13 charset="utf-8"></script>
14 <script src="./js/thickbox-compressed.js" type="text/javascript"
15 charset="utf-8"></script>
16 <script src="./js/quicksearch.js" type="text/javascript"
17 charset="utf-8"></script>
18 <script src="./js/darkfish.js" type="text/javascript"
19 charset="utf-8"></script>
20
21 </head>
22 <body class="class">
23
24 <div id="metadata">
25 <div id="home-metadata">
26 <div id="home-section" class="section">
27 <h3 class="section-header">
28 <a href="./index.html">Home</a>
29 <a href="./index.html#classes">Classes</a>
30 <a href="./index.html#methods">Methods</a>
31 </h3>
32 </div>
33 </div>
34
35 <div id="file-metadata">
36 <div id="file-list-section" class="section">
37 <h3 class="section-header">In Files</h3>
38 <div class="section-body">
39 <ul>
40
41 <li><a href="./lib/porter2_rb.html?TB_iframe=true&amp;height=550&amp;width=785"
42 class="thickbox" title="lib/porter2.rb">lib/porter2.rb</a></li>
43
44 </ul>
45 </div>
46 </div>
47
48
49 </div>
50
51 <div id="class-metadata">
52
53 <!-- Parent Class -->
54
55 <div id="parent-class-section" class="section">
56 <h3 class="section-header">Parent</h3>
57
58 <p class="link">Object</p>
59
60 </div>
61
62
63 <!-- Namespace Contents -->
64
65
66 <!-- Method Quickref -->
67
68 <div id="method-list-section" class="section">
69 <h3 class="section-header">Methods</h3>
70 <ul class="link-list">
71
72 <li><a href="#method-i-porter2_ends_with_short_syllable%3F">#porter2_ends_with_short_syllable?</a></li>
73
74 <li><a href="#method-i-porter2_is_short_word%3F">#porter2_is_short_word?</a></li>
75
76 <li><a href="#method-i-porter2_postprocess">#porter2_postprocess</a></li>
77
78 <li><a href="#method-i-porter2_preprocess">#porter2_preprocess</a></li>
79
80 <li><a href="#method-i-porter2_r1">#porter2_r1</a></li>
81
82 <li><a href="#method-i-porter2_r2">#porter2_r2</a></li>
83
84 <li><a href="#method-i-porter2_stem">#porter2_stem</a></li>
85
86 <li><a href="#method-i-porter2_stem_verbose">#porter2_stem_verbose</a></li>
87
88 <li><a href="#method-i-porter2_step0">#porter2_step0</a></li>
89
90 <li><a href="#method-i-porter2_step1a">#porter2_step1a</a></li>
91
92 <li><a href="#method-i-porter2_step1b">#porter2_step1b</a></li>
93
94 <li><a href="#method-i-porter2_step1c">#porter2_step1c</a></li>
95
96 <li><a href="#method-i-porter2_step2">#porter2_step2</a></li>
97
98 <li><a href="#method-i-porter2_step3">#porter2_step3</a></li>
99
100 <li><a href="#method-i-porter2_step4">#porter2_step4</a></li>
101
102 <li><a href="#method-i-porter2_step5">#porter2_step5</a></li>
103
104 <li><a href="#method-i-porter2_tidy">#porter2_tidy</a></li>
105
106 <li><a href="#method-i-stem">#stem</a></li>
107
108 </ul>
109 </div>
110
111
112 <!-- Included Modules -->
113
114 </div>
115
116 <div id="project-metadata">
117
118
119
120 <div id="classindex-section" class="section project-section">
121 <h3 class="section-header">Class Index
122 <span class="search-toggle"><img src="./images/find.png"
123 height="16" width="16" alt="[+]"
124 title="show/hide quicksearch" /></span></h3>
125 <form action="#" method="get" accept-charset="utf-8" class="initially-hidden">
126 <fieldset>
127 <legend>Quicksearch</legend>
128 <input type="text" name="quicksearch" value=""
129 class="quicksearch-field" />
130 </fieldset>
131 </form>
132
133 <ul class="link-list">
134
135 <li><a href="./Porter2.html">Porter2</a></li>
136
137 <li><a href="./String.html">String</a></li>
138
139 <li><a href="./TestPorter2.html">TestPorter2</a></li>
140
141 </ul>
142 <div id="no-class-search-results" style="display: none;">No matching classes.</div>
143 </div>
144
145
146 </div>
147 </div>
148
149 <div id="documentation">
150 <h1 class="class">String</h1>
151
152 <div id="description">
153 <h2>The Porter 2 stemmer</h2>
154 <p>
155 This is the Porter 2 stemming algorithm, as described at <a
156 href="http://snowball.tartarus.org/algorithms/english/stemmer.html">snowball.tartarus.org/algorithms/english/stemmer.html</a>.
157 The original Porter stemmer, which Porter 2 revises, is described in:
158 </p>
159 <p>
160 Porter, 1980, &#8220;An algorithm for suffix stripping&#8221;,
161 <em>Program</em>, Vol. 14, no. 3, pp 130-137
162 </p>
163 <p>
164 Constants for the stemmer are in the <a href="Porter2.html">Porter2</a>
165 module.
166 </p>
167 <p>
168 Procedures that implement the stemmer are added to the <a
169 href="String.html">String</a> class.
170 </p>
171 <p>
172 The stemmer algorithm is implemented in the <a
173 href="String.html#method-i-porter2_stem">porter2_stem</a> procedure.
174 </p>
175 <h2>Internationalisation</h2>
176 <p>
177 There isn&#8217;t much, as this is a stemmer that only works for English.
178 </p>
179 <p>
180 The <tt>gb_english</tt> flag to the various procedures allows the stemmer
181 to treat the British English &#8217;-ise&#8217; the same as the American
182 English &#8217;-ize&#8217;.
183 </p>
184 <h2>Longest suffixes</h2>
185 <p>
186 Several places in the algorithm require matching the longest suffix of a
187 word. The regexp engine in Ruby 1.9 seems to handle alternatives in regexps
188 by finding the alternative that matches at the earliest position in the
189 string. As the patterns are anchored to the end of the word, that earliest
190 match is also the longest suffix. If the regexp engine changes, this
191 behaviour may change and break the stemmer.
192 </p>
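<p>
A minimal sketch of the behaviour described above, using only core Ruby.
The helper name <tt>matched_suffix</tt> and the suffix list are illustrative
assumptions and are not part of lib/porter2.rb. The alternatives are listed
shortest first on purpose, to show that the end anchor, not the order of the
alternatives, selects the longest suffix.
</p>

```ruby
# Hypothetical helper, not part of lib/porter2.rb: returns the suffix that
# an end-anchored alternation matches, or nil if none matches.
def matched_suffix(word, suffixes)
  # The engine tries each start position from left to right, so the first
  # position at which any alternative reaches the end of the word wins.
  pattern = Regexp.new("(#{suffixes.join('|')})$")
  match = pattern.match(word)
  match ? match[1] : nil
end

suffixes = %w[ed edly ing ingly]   # shortest alternatives listed first
puts matched_suffix("hurriedly", suffixes)    # prints "edly"
puts matched_suffix("charmingly", suffixes)   # prints "ingly"
puts matched_suffix("hoped", suffixes)        # prints "ed"
```

<p>
At the position of the first suffix letter, the shorter alternative fails
because it cannot reach the end of the word, so the engine falls through to
the longer one.
</p>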
193
194 </div>
195
196 <!-- Constants -->
197
198
199 <!-- Attributes -->
200
201
202 <!-- Methods -->
203
204 <div id="public-instance-method-details" class="method-section section">
205 <h3 class="section-header">Public Instance Methods</h3>
206
207
208 <div id="porter-ends-with-short-syllable--method" class="method-detail ">
209 <a name="method-i-porter2_ends_with_short_syllable%3F"></a>
210
211 <div class="method-heading">
212
213 <span class="method-name">porter2_ends_with_short_syllable?</span><span
214 class="method-args">()</span>
215 <span class="method-click-advice">click to toggle source</span>
216
217 </div>
218
219 <div class="method-description">
220
221 <p>
222 Returns true if the word ends with a short syllable
223 </p>
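<p>
As an illustration only, the short-syllable test can be sketched without the
Porter2 constants. The character classes below are assumptions taken from the
Snowball description of the English stemmer (vowels are a, e, i, o, u and y);
the real pattern is Porter2::SHORT_SYLLABLE and may be written differently.
</p>

```ruby
# Hypothetical stand-alone version of the check; the character classes are
# assumed from the Snowball description, not copied from the Porter2 module.
def ends_with_short_syllable?(word)
  # (a) a non-vowel, a vowel, then a non-vowel other than w, x or Y
  return true if word =~ /[^aeiouy][aeiouy][^aeiouywxY]$/
  # (b) a vowel at the start of the word followed by a non-vowel
  return true if word =~ /^[aeiouy][^aeiouy]$/
  false
end

puts ends_with_short_syllable?("trap")     # prints "true"
puts ends_with_short_syllable?("ow")       # prints "true"
puts ends_with_short_syllable?("bestow")   # prints "false"
```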
224
225
226
227 <div class="method-source-code"
228 id="porter-ends-with-short-syllable--source">
229 <pre>
230 <span class="ruby-comment cmt"># File lib/porter2.rb, line 87</span>
231 87: <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_ends_with_short_syllable?</span>
232 88: <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/#{Porter2::SHORT_SYLLABLE}$/</span> <span class="ruby-operator">?</span> <span class="ruby-keyword kw">true</span> <span class="ruby-operator">:</span> <span class="ruby-keyword kw">false</span>
233 89: <span class="ruby-keyword kw">end</span></pre>
234 </div>
235
236 </div>
237
238
239
240
241 </div>
242
243
244 <div id="porter-is-short-word--method" class="method-detail ">
245 <a name="method-i-porter2_is_short_word%3F"></a>
246
247 <div class="method-heading">
248
249 <span class="method-name">porter2_is_short_word?</span><span
250 class="method-args">()</span>
251 <span class="method-click-advice">click to toggle source</span>
252
253 </div>
254
255 <div class="method-description">
256
257 <p>
258 A word is short if it ends in a short syllable and R1 is empty.
259 </p>
260
261
262
263 <div class="method-source-code"
264 id="porter-is-short-word--source">
265 <pre>
266 <span class="ruby-comment cmt"># File lib/porter2.rb, line 93</span>
267 93: <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_is_short_word?</span>
268 94: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_ends_with_short_syllable?</span> <span class="ruby-keyword kw">and</span> <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_r1</span>.<span class="ruby-identifier">empty?</span>
269 95: <span class="ruby-keyword kw">end</span></pre>
270 </div>
271
272 </div>
273
274
275
276
277 </div>
278
279
280 <div id="porter-postprocess-method" class="method-detail ">
281 <a name="method-i-porter2_postprocess"></a>
282
283 <div class="method-heading">
284
285 <span class="method-name">porter2_postprocess</span><span
286 class="method-args">()</span>
287 <span class="method-click-advice">click to toggle source</span>
288
289 </div>
290
291 <div class="method-description">
292
293 <p>
294 Turn all Y letters into y
295 </p>
296
297
298
299 <div class="method-source-code"
300 id="porter-postprocess-source">
301 <pre>
302 <span class="ruby-comment cmt"># File lib/porter2.rb, line 289</span>
303 289: <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_postprocess</span>
304 290: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">gsub</span>(<span class="ruby-regexp re">/Y/</span>, <span class="ruby-value str">'y'</span>)
305 291: <span class="ruby-keyword kw">end</span></pre>
306 </div>
307
308 </div>
309
310
311
312
313 </div>
314
315
316 <div id="porter-preprocess-method" class="method-detail ">
317 <a name="method-i-porter2_preprocess"></a>
318
319 <div class="method-heading">
320
321 <span class="method-name">porter2_preprocess</span><span
322 class="method-args">()</span>
323 <span class="method-click-advice">click to toggle source</span>
324
325 </div>
326
327 <div class="method-description">
328
329 <p>
330 Preprocess the word. Remove any initial apostrophe, if present. Then set
331 initial y, or y after a vowel, to Y.
332 </p>
333 <p>
334 (The comment to &#8216;establish the regions R1 and R2&#8217; in the
335 original description is an implementation optimisation that identifies
336 where the regions start. As no modifications are made to the word that
337 affect those positions, you may want to cache them now. This implementation
338 doesn&#8217;t do that.)
339 </p>
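<p>
The effect of preprocessing can be seen in a small stand-alone sketch. The
name <tt>preprocess_sketch</tt> and the inline vowel class are assumptions
made for illustration; the method documented here uses Porter2::V instead.
</p>

```ruby
# Hypothetical stand-alone version of the preprocessing step.
def preprocess_sketch(word)
  w = word.sub(/^'+/, "")          # drop any leading apostrophes
  w = w.sub(/^y/, "Y")             # an initial y becomes Y
  w.gsub(/([aeiouy])y/, '\1Y')     # a y directly after a vowel becomes Y
end

puts preprocess_sketch("youth")    # prints "Youth"
puts preprocess_sketch("boyish")   # prints "boYish"
puts preprocess_sketch("fly")      # prints "fly"
```

<p>
Marking these letters as Y lets the later steps treat them as non-vowels;
porter2_postprocess turns them back into y at the end.
</p>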
340
341
342
343 <div class="method-source-code"
344 id="porter-preprocess-source">
345 <pre>
346 <span class="ruby-comment cmt"># File lib/porter2.rb, line 53</span>
347 53: <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_preprocess</span>
348 54: <span class="ruby-identifier">w</span> = <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">dup</span>
349 55:
350 56: <span class="ruby-comment cmt"># remove any initial apostrophe</span>
351 57: <span class="ruby-identifier">w</span>.<span class="ruby-identifier">gsub!</span>(<span class="ruby-regexp re">/^'*(.)/</span>, <span class="ruby-value str">'\1'</span>)
352 58:
353 59: <span class="ruby-comment cmt"># set initial y, or y after a vowel, to Y</span>
354 60: <span class="ruby-identifier">w</span>.<span class="ruby-identifier">gsub!</span>(<span class="ruby-regexp re">/^y/</span>, <span class="ruby-value str">&quot;Y&quot;</span>)
355 61: <span class="ruby-identifier">w</span>.<span class="ruby-identifier">gsub!</span>(<span class="ruby-node">/(#{Porter2::V})y/</span>, <span class="ruby-value str">'\1Y'</span>)
356 62:
357 63: <span class="ruby-identifier">w</span>
358 64: <span class="ruby-keyword kw">end</span></pre>
359 </div>
360
361 </div>
362
363
364
365
366 </div>
367
368
369 <div id="porter-r--method" class="method-detail ">
370 <a name="method-i-porter2_r1"></a>
371
372 <div class="method-heading">
373
374 <span class="method-name">porter2_r1</span><span
375 class="method-args">()</span>
376 <span class="method-click-advice">click to toggle source</span>
377
378 </div>
379
380 <div class="method-description">
381
382 <p>
383 R1 is the portion of the word after the first non-vowel after the first
384 vowel (with words beginning &#8216;gener-&#8217;, &#8216;commun-&#8217;,
385 and &#8216;arsen-&#8217; treated as special cases).
386 </p>
387
388
389
390 <div class="method-source-code"
391 id="porter-r--source">
392 <pre>
393 <span class="ruby-comment cmt"># File lib/porter2.rb, line 69</span>
394 69: <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_r1</span>
395 70: <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/^(gener|commun|arsen)(?&lt;r1&gt;.*)/</span>
396 71: <span class="ruby-constant">Regexp</span>.<span class="ruby-identifier">last_match</span>(<span class="ruby-value">:r1</span>)
397 72: <span class="ruby-keyword kw">else</span>
398 73: <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/#{Porter2::V}#{Porter2::C}(?&lt;r1&gt;.*)$/</span>
399 74: <span class="ruby-constant">Regexp</span>.<span class="ruby-identifier">last_match</span>(<span class="ruby-value">:r1</span>) <span class="ruby-operator">||</span> <span class="ruby-value str">&quot;&quot;</span>
400 75: <span class="ruby-keyword kw">end</span>
401 76: <span class="ruby-keyword kw">end</span></pre>
402 </div>
403
404 </div>
405
406
407
408
409 </div>
410
411
412 <div id="porter-r--method" class="method-detail ">
413 <a name="method-i-porter2_r2"></a>
414
415 <div class="method-heading">
416
417 <span class="method-name">porter2_r2</span><span
418 class="method-args">()</span>
419 <span class="method-click-advice">click to toggle source</span>
420
421 </div>
422
423 <div class="method-description">
424
425 <p>
426 R2 is the portion of R1 (<a
427 href="String.html#method-i-porter2_r1">porter2_r1</a>) after the first
428 non-vowel after the first vowel
429 </p>
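<p>
A worked sketch of how R1 and R2 relate, assuming the vowel set a, e, i, o, u
and y. The helper names <tt>region_after_first_vc</tt>, <tt>r1_sketch</tt>
and <tt>r2_sketch</tt> are hypothetical; the documented methods use
Porter2::V and Porter2::C rather than inline character classes.
</p>

```ruby
# Text after the first non-vowel that follows a vowel, or "" if there is none.
def region_after_first_vc(text)
  match = /[aeiouy][^aeiouy](.*)$/.match(text)
  match ? match[1] : ""
end

# R1, with the three special-case prefixes handled first.
def r1_sketch(word)
  special = /^(gener|commun|arsen)(.*)$/.match(word)
  special ? special[2] : region_after_first_vc(word)
end

# R2 applies the same rule again, this time to R1.
def r2_sketch(word)
  region_after_first_vc(r1_sketch(word))
end

puts r1_sketch("beautiful")   # prints "iful"
puts r2_sketch("beautiful")   # prints "ul"
puts r1_sketch("generous")    # prints "ous"
```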
430
431
432
433 <div class="method-source-code"
434 id="porter-r--source">
435 <pre>
436 <span class="ruby-comment cmt"># File lib/porter2.rb, line 80</span>
437 80: <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_r2</span>
438 81: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_r1</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/#{Porter2::V}#{Porter2::C}(?&lt;r2&gt;.*)$/</span>
439 82: <span class="ruby-constant">Regexp</span>.<span class="ruby-identifier">last_match</span>(<span class="ruby-value">:r2</span>) <span class="ruby-operator">||</span> <span class="ruby-value str">&quot;&quot;</span>
440 83: <span class="ruby-keyword kw">end</span></pre>
441 </div>
442
443 </div>
444
445
446
447
448 </div>
449
450
451 <div id="porter-stem-method" class="method-detail ">
452 <a name="method-i-porter2_stem"></a>
453
454 <div class="method-heading">
455
456 <span class="method-name">porter2_stem</span><span
457 class="method-args">(gb_english = false)</span>
458 <span class="method-click-advice">click to toggle source</span>
459
460 </div>
461
462 <div class="method-description">
463
464 <p>
465 Perform the stemming procedure. If <tt>gb_english</tt> is true, treat
466 the British English &#8217;-ise&#8217; and similar suffixes in the same
467 way as the American English &#8217;-ize&#8217;.
468 </p>
469
470
471
472 <div class="method-source-code"
473 id="porter-stem-source">
474 <pre>
475 <span class="ruby-comment cmt"># File lib/porter2.rb, line 297</span>
476 297: <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_stem</span>(<span class="ruby-identifier">gb_english</span> = <span class="ruby-keyword kw">false</span>)
477 298: <span class="ruby-identifier">preword</span> = <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_tidy</span>
478 299: <span class="ruby-keyword kw">return</span> <span class="ruby-identifier">preword</span> <span class="ruby-keyword kw">if</span> <span class="ruby-identifier">preword</span>.<span class="ruby-identifier">length</span> <span class="ruby-operator">&lt;=</span> <span class="ruby-value">2</span>
479 300:
480 301: <span class="ruby-identifier">word</span> = <span class="ruby-identifier">preword</span>.<span class="ruby-identifier">porter2_preprocess</span>
481 302:
482 303: <span class="ruby-keyword kw">if</span> <span class="ruby-constant">Porter2</span><span class="ruby-operator">::</span><span class="ruby-constant">SPECIAL_CASES</span>.<span class="ruby-identifier">has_key?</span> <span class="ruby-identifier">word</span>
483 304: <span class="ruby-constant">Porter2</span><span class="ruby-operator">::</span><span class="ruby-constant">SPECIAL_CASES</span>[<span class="ruby-identifier">word</span>]
484 305: <span class="ruby-keyword kw">else</span>
485 306: <span class="ruby-identifier">w1a</span> = <span class="ruby-identifier">word</span>.<span class="ruby-identifier">porter2_step0</span>.<span class="ruby-identifier">porter2_step1a</span>
486 307: <span class="ruby-keyword kw">if</span> <span class="ruby-constant">Porter2</span><span class="ruby-operator">::</span><span class="ruby-constant">STEP_1A_SPECIAL_CASES</span>.<span class="ruby-identifier">include?</span> <span class="ruby-identifier">w1a</span>
487 308: <span class="ruby-identifier">w1a</span>
488 309: <span class="ruby-keyword kw">else</span>
489 310: <span class="ruby-identifier">w1a</span>.<span class="ruby-identifier">porter2_step1b</span>(<span class="ruby-identifier">gb_english</span>).<span class="ruby-identifier">porter2_step1c</span>.<span class="ruby-identifier">porter2_step2</span>(<span class="ruby-identifier">gb_english</span>).<span class="ruby-identifier">porter2_step3</span>(<span class="ruby-identifier">gb_english</span>).<span class="ruby-identifier">porter2_step4</span>(<span class="ruby-identifier">gb_english</span>).<span class="ruby-identifier">porter2_step5</span>.<span class="ruby-identifier">porter2_postprocess</span>
490 311: <span class="ruby-keyword kw">end</span>
491 312: <span class="ruby-keyword kw">end</span>
492 313: <span class="ruby-keyword kw">end</span></pre>
493 </div>
494
495 </div>
496
497
498 <div class="aliases">
499 Also aliased as: <a href="String.html#method-i-stem">stem</a>
500 </div>
501
502
503
504 </div>
505
506
507 <div id="porter-stem-verbose-method" class="method-detail ">
508 <a name="method-i-porter2_stem_verbose"></a>
509
510 <div class="method-heading">
511
512 <span class="method-name">porter2_stem_verbose</span><span
513 class="method-args">(gb_english = false)</span>
514 <span class="method-click-advice">click to toggle source</span>
515
516 </div>
517
518 <div class="method-description">
519
520 <p>
521 A verbose version of <a
522 href="String.html#method-i-porter2_stem">porter2_stem</a> that prints the
523 output of each stage to STDOUT
524 </p>
525
526
527
528 <div class="method-source-code"
529 id="porter-stem-verbose-source">
530 <pre>
531 <span class="ruby-comment cmt"># File lib/porter2.rb, line 316</span>
532 316: <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_stem_verbose</span>(<span class="ruby-identifier">gb_english</span> = <span class="ruby-keyword kw">false</span>)
533 317: <span class="ruby-identifier">preword</span> = <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_tidy</span>
534 318: <span class="ruby-identifier">puts</span> <span class="ruby-node">&quot;Preword: #{preword}&quot;</span>
535 319: <span class="ruby-keyword kw">return</span> <span class="ruby-identifier">preword</span> <span class="ruby-keyword kw">if</span> <span class="ruby-identifier">preword</span>.<span class="ruby-identifier">length</span> <span class="ruby-operator">&lt;=</span> <span class="ruby-value">2</span>
536 320:
537 321: <span class="ruby-identifier">word</span> = <span class="ruby-identifier">preword</span>.<span class="ruby-identifier">porter2_preprocess</span>
538 322: <span class="ruby-identifier">puts</span> <span class="ruby-node">&quot;Preprocessed: #{word}&quot;</span>
539 323:
540 324: <span class="ruby-keyword kw">if</span> <span class="ruby-constant">Porter2</span><span class="ruby-operator">::</span><span class="ruby-constant">SPECIAL_CASES</span>.<span class="ruby-identifier">has_key?</span> <span class="ruby-identifier">word</span>
541 325: <span class="ruby-identifier">puts</span> <span class="ruby-node">&quot;Returning #{word} as special case #{Porter2::SPECIAL_CASES[word]}&quot;</span>
542 326: <span class="ruby-constant">Porter2</span><span class="ruby-operator">::</span><span class="ruby-constant">SPECIAL_CASES</span>[<span class="ruby-identifier">word</span>]
543 327: <span class="ruby-keyword kw">else</span>
544 328: <span class="ruby-identifier">r1</span> = <span class="ruby-identifier">word</span>.<span class="ruby-identifier">porter2_r1</span>
545 329: <span class="ruby-identifier">r2</span> = <span class="ruby-identifier">word</span>.<span class="ruby-identifier">porter2_r2</span>
546 330: <span class="ruby-identifier">puts</span> <span class="ruby-node">&quot;R1 = #{r1}, R2 = #{r2}&quot;</span>
547 331:
548 332: <span class="ruby-identifier">w0</span> = <span class="ruby-identifier">word</span>.<span class="ruby-identifier">porter2_step0</span> ; <span class="ruby-identifier">puts</span> <span class="ruby-node">&quot;After step 0: #{w0} (R1 = #{w0.porter2_r1}, R2 = #{w0.porter2_r2})&quot;</span>
549 333: <span class="ruby-identifier">w1a</span> = <span class="ruby-identifier">w0</span>.<span class="ruby-identifier">porter2_step1a</span> ; <span class="ruby-identifier">puts</span> <span class="ruby-node">&quot;After step 1a: #{w1a} (R1 = #{w1a.porter2_r1}, R2 = #{w1a.porter2_r2})&quot;</span>
550 334:
551 335: <span class="ruby-keyword kw">if</span> <span class="ruby-constant">Porter2</span><span class="ruby-operator">::</span><span class="ruby-constant">STEP_1A_SPECIAL_CASES</span>.<span class="ruby-identifier">include?</span> <span class="ruby-identifier">w1a</span>
552 336: <span class="ruby-identifier">puts</span> <span class="ruby-node">&quot;Returning #{w1a} as 1a special case&quot;</span>
553 337: <span class="ruby-identifier">w1a</span>
554 338: <span class="ruby-keyword kw">else</span>
555 339: <span class="ruby-identifier">w1b</span> = <span class="ruby-identifier">w1a</span>.<span class="ruby-identifier">porter2_step1b</span>(<span class="ruby-identifier">gb_english</span>) ; <span class="ruby-identifier">puts</span> <span class="ruby-node">&quot;After step 1b: #{w1b} (R1 = #{w1b.porter2_r1}, R2 = #{w1b.porter2_r2})&quot;</span>
556 340: <span class="ruby-identifier">w1c</span> = <span class="ruby-identifier">w1b</span>.<span class="ruby-identifier">porter2_step1c</span> ; <span class="ruby-identifier">puts</span> <span class="ruby-node">&quot;After step 1c: #{w1c} (R1 = #{w1c.porter2_r1}, R2 = #{w1c.porter2_r2})&quot;</span>
557 341: <span class="ruby-identifier">w2</span> = <span class="ruby-identifier">w1c</span>.<span class="ruby-identifier">porter2_step2</span>(<span class="ruby-identifier">gb_english</span>) ; <span class="ruby-identifier">puts</span> <span class="ruby-node">&quot;After step 2: #{w2} (R1 = #{w2.porter2_r1}, R2 = #{w2.porter2_r2})&quot;</span>
558 342: <span class="ruby-identifier">w3</span> = <span class="ruby-identifier">w2</span>.<span class="ruby-identifier">porter2_step3</span>(<span class="ruby-identifier">gb_english</span>) ; <span class="ruby-identifier">puts</span> <span class="ruby-node">&quot;After step 3: #{w3} (R1 = #{w3.porter2_r1}, R2 = #{w3.porter2_r2})&quot;</span>
559 343: <span class="ruby-identifier">w4</span> = <span class="ruby-identifier">w3</span>.<span class="ruby-identifier">porter2_step4</span>(<span class="ruby-identifier">gb_english</span>) ; <span class="ruby-identifier">puts</span> <span class="ruby-node">&quot;After step 4: #{w4} (R1 = #{w4.porter2_r1}, R2 = #{w4.porter2_r2})&quot;</span>
560 344: <span class="ruby-identifier">w5</span> = <span class="ruby-identifier">w4</span>.<span class="ruby-identifier">porter2_step5</span> ; <span class="ruby-identifier">puts</span> <span class="ruby-node">&quot;After step 5: #{w5}&quot;</span>
561 345: <span class="ruby-identifier">wpost</span> = <span class="ruby-identifier">w5</span>.<span class="ruby-identifier">porter2_postprocess</span> ; <span class="ruby-identifier">puts</span> <span class="ruby-node">&quot;After postprocess: #{wpost}&quot;</span>
562 346: <span class="ruby-identifier">wpost</span>
563 347: <span class="ruby-keyword kw">end</span>
564 348: <span class="ruby-keyword kw">end</span>
565 349: <span class="ruby-keyword kw">end</span></pre>
566 </div>
567
568 </div>
569
570
571
572
573 </div>
574
575
576 <div id="porter-step--method" class="method-detail ">
577 <a name="method-i-porter2_step0"></a>
578
579 <div class="method-heading">
580
581 <span class="method-name">porter2_step0</span><span
582 class="method-args">()</span>
583 <span class="method-click-advice">click to toggle source</span>
584
585 </div>
586
587 <div class="method-description">
588
589 <p>
590 Search for the longest among the suffixes,
591 </p>
592 <ul>
593 <li><p>
594 &#8217;
595 </p>
596 </li>
597 <li><p>
598 &#8217;s
599 </p>
600 </li>
601 <li><p>
602 &#8217;s&#8217;
603 </p>
604 </li>
605 </ul>
606 <p>
607 and remove if found.
608 </p>
609
610
611
612 <div class="method-source-code"
613 id="porter-step--source">
614 <pre>
615 <span class="ruby-comment cmt"># File lib/porter2.rb, line 103</span>
616 103: <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_step0</span>
617 104: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub!</span>(<span class="ruby-regexp re">/(.)('s'|'s|')$/</span>, <span class="ruby-value str">'\1'</span>) <span class="ruby-operator">||</span> <span class="ruby-keyword kw">self</span>
618 105: <span class="ruby-keyword kw">end</span></pre>
619 </div>
620
621 </div>
622
623
624
625
626 </div>
627
628
629 <div id="porter-step-a-method" class="method-detail ">
630 <a name="method-i-porter2_step1a"></a>
631
632 <div class="method-heading">
633
634 <span class="method-name">porter2_step1a</span><span
635 class="method-args">()</span>
636 <span class="method-click-advice">click to toggle source</span>
637
638 </div>
639
640 <div class="method-description">
641
642 <p>
643 Search for the longest among the following suffixes, and perform the action
644 indicated.
645 </p>
646 <table>
647 <tr><td valign="top">sses</td><td><p>
648 replace by ss
649 </p>
650 </td></tr>
651 <tr><td valign="top">ied, ies</td><td><p>
652 replace by i if preceded by more than one letter, otherwise by ie
653 </p>
654 </td></tr>
655 <tr><td valign="top">s</td><td><p>
656 delete if the preceding word part contains a vowel not immediately before
657 the s
658 </p>
659 </td></tr>
660 <tr><td valign="top">us, ss</td><td><p>
661 do nothing
662 </p>
663 </td></tr>
664 </table>
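<p>
The table above can be exercised with a stand-alone sketch. The name
<tt>step1a_sketch</tt> and the inline vowel class are assumptions for
illustration; the documented method uses Porter2::V.
</p>

```ruby
# Hypothetical stand-alone version of step 1a.
def step1a_sketch(word)
  if word =~ /sses$/
    word.sub(/sses$/, "ss")
  elsif word =~ /(ied|ies)$/
    # More than one letter before the suffix gives "i", otherwise "ie".
    word.sub(/(ied|ies)$/, word.length > 4 ? "i" : "ie")
  elsif word =~ /(us|ss)$/
    word
  elsif word =~ /[aeiouy].+s$/
    # A vowel occurs somewhere before the letter that precedes the final s.
    word.sub(/s$/, "")
  else
    word
  end
end

puts step1a_sketch("caresses")   # prints "caress"
puts step1a_sketch("ties")       # prints "tie"
puts step1a_sketch("cries")      # prints "cri"
puts step1a_sketch("gas")        # prints "gas"
puts step1a_sketch("gaps")       # prints "gap"
```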
665
666
667
668 <div class="method-source-code"
669 id="porter-step-a-source">
670 <pre>
671 <span class="ruby-comment cmt"># File lib/porter2.rb, line 113</span>
672 113: <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_step1a</span>
673 114: <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/sses$/</span>
674 115: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/sses$/</span>, <span class="ruby-value str">'ss'</span>)
675 116: <span class="ruby-keyword kw">elsif</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/..(ied|ies)$/</span>
676 117: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/(ied|ies)$/</span>, <span class="ruby-value str">'i'</span>)
677 118: <span class="ruby-keyword kw">elsif</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/(ied|ies)$/</span>
678 119: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/(ied|ies)$/</span>, <span class="ruby-value str">'ie'</span>)
679 120: <span class="ruby-keyword kw">elsif</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/(us|ss)$/</span>
680 121: <span class="ruby-keyword kw">self</span>
681 122: <span class="ruby-keyword kw">elsif</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/s$/</span>
682 123: <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/(#{Porter2::V}.+)s$/</span>
683 124: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/s$/</span>, <span class="ruby-value str">''</span>)
684 125: <span class="ruby-keyword kw">else</span>
685 126: <span class="ruby-keyword kw">self</span>
686 127: <span class="ruby-keyword kw">end</span>
687 128: <span class="ruby-keyword kw">else</span>
688 129: <span class="ruby-keyword kw">self</span>
689 130: <span class="ruby-keyword kw">end</span>
690 131: <span class="ruby-keyword kw">end</span></pre>
691 </div>
692
693 </div>
694
695
696
697
698 </div>
699
700
701 <div id="porter-step-b-method" class="method-detail ">
702 <a name="method-i-porter2_step1b"></a>
703
704 <div class="method-heading">
705
706 <span class="method-name">porter2_step1b</span><span
707 class="method-args">(gb_english = false)</span>
708 <span class="method-click-advice">click to toggle source</span>
709
710 </div>
711
712 <div class="method-description">
713
714 <p>
715 Search for the longest among the following suffixes, and perform the action
716 indicated.
717 </p>
718 <table>
719 <tr><td valign="top">eed, eedly</td><td><p>
720 replace by ee if the suffix is also in R1
721 </p>
722 </td></tr>
723 <tr><td valign="top">ed, edly, ing, ingly</td><td><p>
724 delete if the preceding word part contains a vowel and, after the
725 deletion:
726 </p>
727 <ul>
728 <li><p>
729 if the word ends in at, bl or iz: add e, or
730 </p>
731 </li>
732 </ul>
733 <ul>
734 <li><p>
735 if the word ends with a double: remove the last letter, or
736 </p>
737 </li>
738 </ul>
739 <ul>
740 <li><p>
741 if the word is short: add e
742 </p>
743 </li>
744 </ul>
745 </td></tr>
746 </table>
747 <p>
748 (If gb_english is <tt>true</tt>, treat the &#8216;is&#8217; suffix as
749 &#8216;iz&#8217; above.)
750 </p>
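<p>
The rules above can be tried out with a stand-alone sketch. All helper names
here (<tt>r1_of</tt>, <tt>short_word?</tt>, <tt>step1b_sketch</tt>) are
hypothetical, and the vowel class and list of doubles are assumptions taken
from the Snowball description; the documented method relies on Porter2::V,
Porter2::Double, porter2_r1 and porter2_is_short_word? instead.
</p>

```ruby
# R1: text after the first vowel/non-vowel pair, with three special prefixes.
def r1_of(word)
  special = /^(gener|commun|arsen)(.*)$/.match(word)
  return special[2] if special
  m = /[aeiouy][^aeiouy](.*)$/.match(word)
  m ? m[1] : ""
end

# A word is short if it ends in a short syllable and R1 is empty.
def short_word?(word)
  ends_short = (word =~ /[^aeiouy][aeiouy][^aeiouywxY]$/) ||
               (word =~ /^[aeiouy][^aeiouy]$/)
  ends_short ? r1_of(word).empty? : false
end

def step1b_sketch(word, gb_english = false)
  if word =~ /(eed|eedly)$/
    # Replace by "ee" only when the suffix lies inside R1.
    if r1_of(word) =~ /(eed|eedly)$/
      return word.sub(/(eed|eedly)$/, "ee")
    end
    return word
  end
  # The part before the suffix must contain a vowel.
  return word unless word =~ /[aeiouy].*(ed|edly|ing|ingly)$/

  stem = word.sub(/(ed|edly|ing|ingly)$/, "")
  if stem =~ /(at|bl|iz)$/ or (gb_english and stem =~ /is$/)
    stem + "e"
  elsif stem =~ /(bb|dd|ff|gg|mm|nn|pp|rr|tt)$/
    stem.chop                      # remove the last letter of the double
  elsif short_word?(stem)
    stem + "e"
  else
    stem
  end
end

puts step1b_sketch("luxuriated")         # prints "luxuriate"
puts step1b_sketch("hopping")            # prints "hop"
puts step1b_sketch("hoping")             # prints "hope"
puts step1b_sketch("organising", true)   # prints "organise"
```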
751
752
753
754 <div class="method-source-code"
755 id="porter-step-b-source">
756 <pre>
757 <span class="ruby-comment cmt"># File lib/porter2.rb, line 143</span>
758 143: <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_step1b</span>(<span class="ruby-identifier">gb_english</span> = <span class="ruby-keyword kw">false</span>)
759 144: <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/(eed|eedly)$/</span>
760 145: <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_r1</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/(eed|eedly)$/</span>
761 146: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/(eed|eedly)$/</span>, <span class="ruby-value str">'ee'</span>)
762 147: <span class="ruby-keyword kw">else</span>
763 148: <span class="ruby-keyword kw">self</span>
764 149: <span class="ruby-keyword kw">end</span>
765 150: <span class="ruby-keyword kw">else</span>
766 151: <span class="ruby-identifier">w</span> = <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">dup</span>
767 152: <span class="ruby-keyword kw">if</span> <span class="ruby-identifier">w</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/#{Porter2::V}.*(ed|edly|ing|ingly)$/</span>
768 153: <span class="ruby-identifier">w</span>.<span class="ruby-identifier">sub!</span>(<span class="ruby-regexp re">/(ed|edly|ing|ingly)$/</span>, <span class="ruby-value str">''</span>)
769 154: <span class="ruby-keyword kw">if</span> <span class="ruby-identifier">w</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/(at|bl|iz)$/</span>
770 155: <span class="ruby-identifier">w</span> <span class="ruby-operator">+=</span> <span class="ruby-value str">'e'</span>
771 156: <span class="ruby-keyword kw">elsif</span> <span class="ruby-identifier">w</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/is$/</span> <span class="ruby-keyword kw">and</span> <span class="ruby-identifier">gb_english</span>
772 157: <span class="ruby-identifier">w</span> <span class="ruby-operator">+=</span> <span class="ruby-value str">'e'</span>
773 158: <span class="ruby-keyword kw">elsif</span> <span class="ruby-identifier">w</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/#{Porter2::Double}$/</span>
774 159: <span class="ruby-identifier">w</span>.<span class="ruby-identifier">chop!</span>
775 160: <span class="ruby-keyword kw">elsif</span> <span class="ruby-identifier">w</span>.<span class="ruby-identifier">porter2_is_short_word?</span>
776 161: <span class="ruby-identifier">w</span> <span class="ruby-operator">+=</span> <span class="ruby-value str">'e'</span>
777 162: <span class="ruby-keyword kw">end</span>
778 163: <span class="ruby-keyword kw">end</span>
779 164: <span class="ruby-identifier">w</span>
780 165: <span class="ruby-keyword kw">end</span>
781 166: <span class="ruby-keyword kw">end</span></pre>
782 </div>
783
784 </div>
785
786
787
788
789 </div>
790
791
792 <div id="porter-step-c-method" class="method-detail ">
793 <a name="method-i-porter2_step1c"></a>
794
795 <div class="method-heading">
796
797 <span class="method-name">porter2_step1c</span><span
798 class="method-args">()</span>
799 <span class="method-click-advice">click to toggle source</span>
800
801 </div>
802
803 <div class="method-description">
804
805 <p>
806 Replace a suffix of y or Y by i if it is preceded by a non-vowel which is
807 not the first letter of the word.
808 </p>
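<p>
As a rough standalone illustration of this rule (not the library code):
<tt>NON_VOWEL</tt> and <tt>step1c_sketch</tt> are hypothetical names, and the
character class is an assumption about what Porter2::C contains.
</p>
<pre>
```ruby
# Standalone sketch. NON_VOWEL is a hypothetical stand-in for Porter2::C;
# its value here is an assumption.
NON_VOWEL = "[^aeiouy]"

def step1c_sketch(word)
  # ".+" demands at least one letter before the non-vowel, so the
  # non-vowel cannot be the first letter of the word.
  if word =~ /.+#{NON_VOWEL}(y|Y)$/
    word.sub(/(y|Y)$/, 'i')
  else
    word
  end
end

puts step1c_sketch("cry")   # cri
puts step1c_sketch("by")    # by
puts step1c_sketch("say")   # say
```
</pre>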
809
810
811
812 <div class="method-source-code"
813 id="porter-step-c-source">
814 <pre>
815 <span class="ruby-comment cmt"># File lib/porter2.rb, line 171</span>
816 171: <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_step1c</span>
817 172: <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/.+#{Porter2::C}(y|Y)$/</span>
818 173: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/(y|Y)$/</span>, <span class="ruby-value str">'i'</span>)
819 174: <span class="ruby-keyword kw">else</span>
820 175: <span class="ruby-keyword kw">self</span>
821 176: <span class="ruby-keyword kw">end</span>
822 177: <span class="ruby-keyword kw">end</span></pre>
823 </div>
824
825 </div>
826
827
828
829
830 </div>
831
832
833 <div id="porter-step--method" class="method-detail ">
834 <a name="method-i-porter2_step2"></a>
835
836 <div class="method-heading">
837
838 <span class="method-name">porter2_step2</span><span
839 class="method-args">(gb_english = false)</span>
840 <span class="method-click-advice">click to toggle source</span>
841
842 </div>
843
844 <div class="method-description">
845
846 <p>
847 Search for the longest among the suffixes listed in the keys of
848 Porter2::STEP_2_MAPS. If one is found and that suffix occurs in R1,
849 replace it with the value found in STEP_2_MAPS.
850 </p>
851 <p>
852 (Suffixes &#8216;ogi&#8217; and &#8216;li&#8217; are treated as special
853 cases in the procedure.)
854 </p>
855 <p>
856 (If gb_english is <tt>true</tt>, replace the &#8216;iser&#8217; and
857 &#8216;isation&#8217; suffixes with &#8216;ise&#8217;, similarly to how
858 &#8216;izer&#8217; and &#8216;ization&#8217; are treated.)
859 </p>
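<p>
The longest-suffix lookup might be sketched as follows. This is a standalone
illustration under stated assumptions: <tt>STEP_2_EXCERPT</tt> is a
hypothetical three-entry excerpt rather than the real Porter2::STEP_2_MAPS,
<tt>r1_sketch</tt> is a simplified R1 that ignores the special-case prefixes
of the full algorithm, and the &#8216;li&#8217; and &#8216;ogi&#8217; cases
are omitted.
</p>
<pre>
```ruby
# Hypothetical three-entry excerpt; the real Porter2::STEP_2_MAPS is larger.
STEP_2_EXCERPT = { "ization" => "ize", "ation" => "ate", "izer" => "ize" }

# Simplified R1 (an assumption): the text after the first non-vowel
# that follows a vowel, or "" when there is no such position.
def r1_sketch(word)
  m = word.match(/[aeiouy][^aeiouy](.*)$/)
  m ? m[1] : ""
end

def step2_sketch(word, gb_english = false)
  map = STEP_2_EXCERPT.dup
  if gb_english
    map["iser"] = "ise"
    map["isation"] = "ise"
  end
  # Try longer suffixes first so that "ization" wins over "ation".
  suffix = map.keys.sort_by { |k| -k.length }.find { |k| word.end_with?(k) }
  # Replace only when the whole suffix lies inside R1.
  return word unless suffix and r1_sketch(word).end_with?(suffix)
  word[0, word.length - suffix.length] + map[suffix]
end

puts step2_sketch("organization")         # organize
puts step2_sketch("nation")               # nation
puts step2_sketch("organisation", true)   # organise
```
</pre>
<p>
In &#8216;nation&#8217; the suffix &#8216;ation&#8217; is found, but R1 is
only &#8216;ion&#8217;, so the word is returned unchanged.
</p>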
860
861
862
863 <div class="method-source-code"
864 id="porter-step--source">
865 <pre>
866 <span class="ruby-comment cmt"># File lib/porter2.rb, line 188</span>
867 188: <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_step2</span>(<span class="ruby-identifier">gb_english</span> = <span class="ruby-keyword kw">false</span>)
868 189: <span class="ruby-identifier">r1</span> = <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_r1</span>
869 190: <span class="ruby-identifier">s2m</span> = <span class="ruby-constant">Porter2</span><span class="ruby-operator">::</span><span class="ruby-constant">STEP_2_MAPS</span>.<span class="ruby-identifier">dup</span>
870 191: <span class="ruby-keyword kw">if</span> <span class="ruby-identifier">gb_english</span>
871 192: <span class="ruby-identifier">s2m</span>[<span class="ruby-value str">&quot;iser&quot;</span>] = <span class="ruby-value str">&quot;ise&quot;</span>
872 193: <span class="ruby-identifier">s2m</span>[<span class="ruby-value str">&quot;isation&quot;</span>] = <span class="ruby-value str">&quot;ise&quot;</span>
873 194: <span class="ruby-keyword kw">end</span>
874 195: <span class="ruby-identifier">step_2_re</span> = <span class="ruby-constant">Regexp</span>.<span class="ruby-identifier">union</span>(<span class="ruby-identifier">s2m</span>.<span class="ruby-identifier">keys</span>.<span class="ruby-identifier">map</span> {<span class="ruby-operator">|</span><span class="ruby-identifier">r</span><span class="ruby-operator">|</span> <span class="ruby-constant">Regexp</span>.<span class="ruby-identifier">new</span>(<span class="ruby-identifier">r</span> <span class="ruby-operator">+</span> <span class="ruby-value str">&quot;$&quot;</span>)})
875 196: <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-identifier">step_2_re</span>
876 197: <span class="ruby-keyword kw">if</span> <span class="ruby-identifier">r1</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/#{$&amp;}$/</span>
877 198: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-node">/#{$&amp;}$/</span>, <span class="ruby-identifier">s2m</span>[<span class="ruby-node">$&amp;</span>])
878 199: <span class="ruby-keyword kw">else</span>
879 200: <span class="ruby-keyword kw">self</span>
880 201: <span class="ruby-keyword kw">end</span>
881 202: <span class="ruby-keyword kw">elsif</span> <span class="ruby-identifier">r1</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/li$/</span> <span class="ruby-keyword kw">and</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/(#{Porter2::Valid_LI})li$/</span>
882 203: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/li$/</span>, <span class="ruby-value str">''</span>)
883 204: <span class="ruby-keyword kw">elsif</span> <span class="ruby-identifier">r1</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/ogi$/</span> <span class="ruby-keyword kw">and</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/logi$/</span>
884 205: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/ogi$/</span>, <span class="ruby-value str">'og'</span>)
885 206: <span class="ruby-keyword kw">else</span>
886 207: <span class="ruby-keyword kw">self</span>
887 208: <span class="ruby-keyword kw">end</span>
888 209: <span class="ruby-keyword kw">end</span></pre>
889 </div>
890
891 </div>
892
893
894
895
896 </div>
897
898
899 <div id="porter-step--method" class="method-detail ">
900 <a name="method-i-porter2_step3"></a>
901
902 <div class="method-heading">
903
904 <span class="method-name">porter2_step3</span><span
905 class="method-args">(gb_english = false)</span>
906 <span class="method-click-advice">click to toggle source</span>
907
908 </div>
909
910 <div class="method-description">
911
912 <p>
913 Search for the longest among the suffixes listed in the keys of
914 Porter2::STEP_3_MAPS. If one is found and that suffix occurs in R1,
915 replace it with the value found in STEP_3_MAPS.
916 </p>
917 <p>
918 (Suffix &#8216;ative&#8217; is treated as a special case in the procedure.)
919 </p>
920 <p>
921 (If gb_english is <tt>true</tt>, replace the &#8216;alise&#8217; suffix
922 with &#8216;al&#8217;, similarly to how &#8216;alize&#8217; is treated.)
923 </p>
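<p>
The &#8216;ative&#8217; special case, which deletes the suffix only when it
lies in R2, could be sketched like this. The helper
<tt>region_after_first_vc</tt> and the method <tt>step3_ative_sketch</tt> are
hypothetical names, and the region helper is a simplified assumption about
how R1 and R2 are derived (R2 being the same rule applied to R1).
</p>
<pre>
```ruby
# Standalone sketch. The region helper is a hypothetical, simplified
# stand-in for the library's R1/R2 helpers.
def region_after_first_vc(s)
  m = s.match(/[aeiouy][^aeiouy](.*)$/)
  m ? m[1] : ""
end

def step3_ative_sketch(word)
  r2 = region_after_first_vc(region_after_first_vc(word))
  # 'ative' is dropped only when the whole suffix lies inside R2.
  r2.end_with?("ative") ? word.sub(/ative$/, '') : word
end

puts step3_ative_sketch("informative")   # inform
puts step3_ative_sketch("native")        # native
```
</pre>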
924
925
926
927 <div class="method-source-code"
928 id="porter-step--source">
929 <pre>
930 <span class="ruby-comment cmt"># File lib/porter2.rb, line 220</span>
931 220: <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_step3</span>(<span class="ruby-identifier">gb_english</span> = <span class="ruby-keyword kw">false</span>)
932 221: <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/ative$/</span> <span class="ruby-keyword kw">and</span> <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_r2</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/ative$/</span>
933 222: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/ative$/</span>, <span class="ruby-value str">''</span>)
934 223: <span class="ruby-keyword kw">else</span>
935 224: <span class="ruby-identifier">s3m</span> = <span class="ruby-constant">Porter2</span><span class="ruby-operator">::</span><span class="ruby-constant">STEP_3_MAPS</span>.<span class="ruby-identifier">dup</span>
936 225: <span class="ruby-keyword kw">if</span> <span class="ruby-identifier">gb_english</span>
937 226: <span class="ruby-identifier">s3m</span>[<span class="ruby-value str">&quot;alise&quot;</span>] = <span class="ruby-value str">&quot;al&quot;</span>
938 227: <span class="ruby-keyword kw">end</span>
939 228: <span class="ruby-identifier">step_3_re</span> = <span class="ruby-constant">Regexp</span>.<span class="ruby-identifier">union</span>(<span class="ruby-identifier">s3m</span>.<span class="ruby-identifier">keys</span>.<span class="ruby-identifier">map</span> {<span class="ruby-operator">|</span><span class="ruby-identifier">r</span><span class="ruby-operator">|</span> <span class="ruby-constant">Regexp</span>.<span class="ruby-identifier">new</span>(<span class="ruby-identifier">r</span> <span class="ruby-operator">+</span> <span class="ruby-value str">&quot;$&quot;</span>)})
940 229: <span class="ruby-identifier">r1</span> = <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_r1</span>
941 230: <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-identifier">step_3_re</span> <span class="ruby-keyword kw">and</span> <span class="ruby-identifier">r1</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/#{$&amp;}$/</span>
942 231: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-node">/#{$&amp;}$/</span>, <span class="ruby-identifier">s3m</span>[<span class="ruby-node">$&amp;</span>])
943 232: <span class="ruby-keyword kw">else</span>
944 233: <span class="ruby-keyword kw">self</span>
945 234: <span class="ruby-keyword kw">end</span>
946 235: <span class="ruby-keyword kw">end</span>
947 236: <span class="ruby-keyword kw">end</span></pre>
948 </div>
949
950 </div>
951
952
953
954
955 </div>
956
957
958 <div id="porter-step--method" class="method-detail ">
959 <a name="method-i-porter2_step4"></a>
960
961 <div class="method-heading">
962
963 <span class="method-name">porter2_step4</span><span
964 class="method-args">(gb_english = false)</span>
965 <span class="method-click-advice">click to toggle source</span>
966
967 </div>
968
969 <div class="method-description">
970
971 <p>
972 Search for the longest among the suffixes listed in the keys of
973 Porter2::STEP_4_MAPS. If one is found and that suffix occurs in R2,
974 replace it with the value found in STEP_4_MAPS.
975 </p>
976 <p>
977 (Suffix &#8216;ion&#8217; is treated as a special case in the procedure.)
978 </p>
979 <p>
980 (If gb_english is <tt>true</tt>, delete the &#8216;ise&#8217; suffix if
981 found.)
982 </p>
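<p>
A rough standalone sketch of this step follows. <tt>STEP_4_EXCERPT</tt> is a
hypothetical excerpt of the suffix list (the real Porter2::STEP_4_MAPS is not
reproduced here), <tt>region_after_first_vc</tt> is a simplified assumption
about how R1 and R2 are derived, and every matched suffix is assumed to be
deleted outright.
</p>
<pre>
```ruby
# Hypothetical excerpt of the step 4 suffixes; each is simply deleted.
STEP_4_EXCERPT = %w[ement ment ent able]

# Simplified region helper (an assumption): text after the first
# non-vowel that follows a vowel. Applied twice, it yields R2.
def region_after_first_vc(s)
  m = s.match(/[aeiouy][^aeiouy](.*)$/)
  m ? m[1] : ""
end

def step4_sketch(word, gb_english = false)
  r2 = region_after_first_vc(region_after_first_vc(word))
  # Special case: 'ion' goes only when it is in R2 and follows s or t.
  return word.sub(/ion$/, '') if r2 =~ /ion$/ and word =~ /(s|t)ion$/
  suffixes = STEP_4_EXCERPT.dup
  suffixes.push("ise") if gb_english
  # Longest suffix first; it must lie entirely inside R2 to be removed.
  suffix = suffixes.sort_by { |x| -x.length }.find { |x| word.end_with?(x) }
  return word unless suffix and r2.end_with?(suffix)
  word[0, word.length - suffix.length]
end

puts step4_sketch("adoption")         # adopt
puts step4_sketch("replacement")      # replac
puts step4_sketch("opinion")          # opinion
puts step4_sketch("organise", true)   # organ
```
</pre>
<p>
&#8216;opinion&#8217; is unchanged because its &#8216;ion&#8217; is preceded
by n rather than s or t.
</p>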
983
984
985
986 <div class="method-source-code"
987 id="porter-step--source">
988 <pre>
989 <span class="ruby-comment cmt"># File lib/porter2.rb, line 246</span>
990 246: <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_step4</span>(<span class="ruby-identifier">gb_english</span> = <span class="ruby-keyword kw">false</span>)
991 247: <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_r2</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/ion$/</span> <span class="ruby-keyword kw">and</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/(s|t)ion$/</span>
992 248: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/ion$/</span>, <span class="ruby-value str">''</span>)
993 249: <span class="ruby-keyword kw">else</span>
994 250: <span class="ruby-identifier">s4m</span> = <span class="ruby-constant">Porter2</span><span class="ruby-operator">::</span><span class="ruby-constant">STEP_4_MAPS</span>.<span class="ruby-identifier">dup</span>
995 251: <span class="ruby-keyword kw">if</span> <span class="ruby-identifier">gb_english</span>
996 252: <span class="ruby-identifier">s4m</span>[<span class="ruby-value str">&quot;ise&quot;</span>] = <span class="ruby-value str">&quot;&quot;</span>
997 253: <span class="ruby-keyword kw">end</span>
998 254: <span class="ruby-identifier">step_4_re</span> = <span class="ruby-constant">Regexp</span>.<span class="ruby-identifier">union</span>(<span class="ruby-identifier">s4m</span>.<span class="ruby-identifier">keys</span>.<span class="ruby-identifier">map</span> {<span class="ruby-operator">|</span><span class="ruby-identifier">r</span><span class="ruby-operator">|</span> <span class="ruby-constant">Regexp</span>.<span class="ruby-identifier">new</span>(<span class="ruby-identifier">r</span> <span class="ruby-operator">+</span> <span class="ruby-value str">&quot;$&quot;</span>)})
999 255: <span class="ruby-identifier">r2</span> = <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_r2</span>
1000 256: <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-identifier">step_4_re</span>
1001 257: <span class="ruby-keyword kw">if</span> <span class="ruby-identifier">r2</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/#{$&amp;}$/</span>
1002 258: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-node">/#{$&amp;}$/</span>, <span class="ruby-identifier">s4m</span>[<span class="ruby-node">$&amp;</span>])
1003 259: <span class="ruby-keyword kw">else</span>
1004 260: <span class="ruby-keyword kw">self</span>
1005 261: <span class="ruby-keyword kw">end</span>
1006 262: <span class="ruby-keyword kw">else</span>
1007 263: <span class="ruby-keyword kw">self</span>
1008 264: <span class="ruby-keyword kw">end</span>
1009 265: <span class="ruby-keyword kw">end</span>
1010 266: <span class="ruby-keyword kw">end</span></pre>
1011 </div>
1012
1013 </div>
1014
1015
1016
1017
1018 </div>
1019
1020
1021 <div id="porter-step--method" class="method-detail ">
1022 <a name="method-i-porter2_step5"></a>
1023
1024 <div class="method-heading">
1025
1026 <span class="method-name">porter2_step5</span><span
1027 class="method-args">()</span>
1028 <span class="method-click-advice">click to toggle source</span>
1029
1030 </div>
1031
1032 <div class="method-description">
1033
1034 <p>
1035 Search for the following suffixes and, if found, perform the action
1036 indicated.
1037 </p>
1038 <table>
1039 <tr><td valign="top">e</td><td><p>
1040 delete if in R2, or in R1 and not preceded by a short syllable
1041 </p>
1042 </td></tr>
1043 <tr><td valign="top">l</td><td><p>
1044 delete if in R2 and preceded by l
1045 </p>
1046 </td></tr>
1047 </table>
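<p>
The two rules in the table might be sketched as below. This is a standalone
illustration, not the library code: <tt>SHORT_SYLLABLE_SKETCH</tt> is an
assumed approximation of Porter2::SHORT_SYLLABLE, and
<tt>region_after_first_vc</tt> is a simplified, hypothetical helper for
deriving R1 and R2.
</p>
<pre>
```ruby
# Assumed approximation of a short syllable: non-vowel, vowel, then a
# non-vowel other than w, x or Y; or a word-initial vowel and non-vowel.
SHORT_SYLLABLE_SKETCH =
  "(?:[^aeiouy][aeiouy][^aeiouywxY]|^[aeiouy][^aeiouy])"

# Simplified region helper (an assumption): text after the first
# non-vowel that follows a vowel. Applied twice, it yields R2.
def region_after_first_vc(s)
  m = s.match(/[aeiouy][^aeiouy](.*)$/)
  m ? m[1] : ""
end

def step5_sketch(word)
  r1 = region_after_first_vc(word)
  r2 = region_after_first_vc(r1)
  if word =~ /ll$/ and r2 =~ /l$/
    word.chop    # final l is in R2 and preceded by l
  elsif word =~ /e$/ and
        (r2 =~ /e$/ or
         (r1 =~ /e$/ and word !~ /#{SHORT_SYLLABLE_SKETCH}e$/))
    word.chop    # final e is in R2, or in R1 after a non-short syllable
  else
    word
  end
end

puts step5_sketch("controll")   # control
puts step5_sketch("create")     # creat
puts step5_sketch("hope")       # hope
```
</pre>
<p>
&#8216;hope&#8217; keeps its e because the e is in R1 only and follows the
short syllable &#8216;hop&#8217;.
</p>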
1048
1049
1050
1051 <div class="method-source-code"
1052 id="porter-step--source">
1053 <pre>
1054 <span class="ruby-comment cmt"># File lib/porter2.rb, line 272</span>
1055 272: <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_step5</span>
1056 273: <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/ll$/</span> <span class="ruby-keyword kw">and</span> <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_r2</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/l$/</span>
1057 274: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/ll$/</span>, <span class="ruby-value str">'l'</span>)
1058 275: <span class="ruby-keyword kw">elsif</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/e$/</span> <span class="ruby-keyword kw">and</span> <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_r2</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/e$/</span>
1059 276: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/e$/</span>, <span class="ruby-value str">''</span>)
1060 277: <span class="ruby-keyword kw">else</span>
1061 278: <span class="ruby-identifier">r1</span> = <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">porter2_r1</span>
1062 279: <span class="ruby-keyword kw">if</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/e$/</span> <span class="ruby-keyword kw">and</span> <span class="ruby-identifier">r1</span> <span class="ruby-operator">=~</span> <span class="ruby-regexp re">/e$/</span> <span class="ruby-keyword kw">and</span> <span class="ruby-keyword kw">not</span> <span class="ruby-keyword kw">self</span> <span class="ruby-operator">=~</span> <span class="ruby-node">/#{Porter2::SHORT_SYLLABLE}e$/</span>
1063 280: <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">sub</span>(<span class="ruby-regexp re">/e$/</span>, <span class="ruby-value str">''</span>)
1064 281: <span class="ruby-keyword kw">else</span>
1065 282: <span class="ruby-keyword kw">self</span>
1066 283: <span class="ruby-keyword kw">end</span>
1067 284: <span class="ruby-keyword kw">end</span>
1068 285: <span class="ruby-keyword kw">end</span></pre>
1069 </div>
1070
1071 </div>
1072
1073
1074
1075
1076 </div>
1077
1078
1079 <div id="porter-tidy-method" class="method-detail ">
1080 <a name="method-i-porter2_tidy"></a>
1081
1082 <div class="method-heading">
1083
1084 <span class="method-name">porter2_tidy</span><span
1085 class="method-args">()</span>
1086 <span class="method-click-advice">click to toggle source</span>
1087
1088 </div>
1089
1090 <div class="method-description">
1091
1092 <p>
1093 Tidy up the word before the stemming algorithm runs: strip surrounding
whitespace, downcase, and map apostrophe-like characters to apostrophes.
1094 </p>
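<p>
A minimal standalone sketch of the same clean-up, using a hypothetical
<tt>tidy_sketch</tt> method and Unicode escapes for the two curly quote
characters:
</p>
<pre>
```ruby
# Standalone sketch of the tidy step; tidy_sketch is a hypothetical name.
def tidy_sketch(word)
  # \u2018 and \u2019 are the left and right single quotation marks.
  word.to_s.strip.downcase.gsub(/[\u2018\u2019]/, "'")
end

puts tidy_sketch("  Don\u2019t ")   # don't
```
</pre>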
1095
1096
1097
1098 <div class="method-source-code"
1099 id="porter-tidy-source">
1100 <pre>
1101 <span class="ruby-comment cmt"># File lib/porter2.rb, line 35</span>
1102 35: <span class="ruby-keyword kw">def</span> <span class="ruby-identifier">porter2_tidy</span>
1103 36: <span class="ruby-identifier">preword</span> = <span class="ruby-keyword kw">self</span>.<span class="ruby-identifier">to_s</span>.<span class="ruby-identifier">strip</span>.<span class="ruby-identifier">downcase</span>
1104 37:
1105 38: <span class="ruby-comment cmt"># map apostrophe-like characters to apostrophes</span>
1106 39: <span class="ruby-identifier">preword</span>.<span class="ruby-identifier">gsub!</span>(<span class="ruby-regexp re">/‘/</span>, <span class="ruby-value str">&quot;'&quot;</span>)
1107 40: <span class="ruby-identifier">preword</span>.<span class="ruby-identifier">gsub!</span>(<span class="ruby-regexp re">/’/</span>, <span class="ruby-value str">&quot;'&quot;</span>)
1108 41:
1109 42: <span class="ruby-identifier">preword</span>
1110 43: <span class="ruby-keyword kw">end</span></pre>
1111 </div>
1112
1113 </div>
1114
1115
1116
1117
1118 </div>
1119
1120
1121 <div id="stem-method" class="method-detail method-alias">
1122 <a name="method-i-stem"></a>
1123
1124 <div class="method-heading">
1125
1126 <span class="method-name">stem</span><span
1127 class="method-args">(gb_english = false)</span>
1128 <span class="method-click-advice">click to toggle source</span>
1129
1130 </div>
1131
1132 <div class="method-description">
1133
1134
1135
1136
1137
1138 </div>
1139
1140
1141
1142
1143 <div class="aliases">
1144 Alias for: <a href="String.html#method-i-porter2_stem">porter2_stem</a>
1145 </div>
1146
1147 </div>
1148
1149
1150 </div>
1151
1152
1153 </div>
1154
1155
1156 <div id="rdoc-debugging-section-dump" class="debugging-section">
1157
1158 <p>Disabled; run with --debug to generate this.</p>
1159
1160 </div>
1161
1162 <div id="validator-badges">
1163 <p><small><a href="http://validator.w3.org/check/referer">[Validate]</a></small></p>
1164 <p><small>Generated with the <a href="http://deveiate.org/projects/Darkfish-Rdoc/">Darkfish
1165 Rdoc Generator</a> 1.1.6</small>.</p>
1166 </div>
1167
1168 </body>
1169 </html>
1170