annotate src/org/nwoca/ssdt/tools/html2wiki/Html2Wiki.java @ 16:001e43423d5d

added replace transformer to remove garbage character (ESC, unicode 001B) from "screen prints", replaced with a space.
author ferrall@nwoca.org
date Tue, 08 Feb 2011 08:12:33 -0500
parents 494ca5643e1a
children a88e2f8fb117
rev   line source
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
1 package org.nwoca.ssdt.tools.html2wiki;
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
2 /*
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
3 * Html2Wiki.java
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
4 *
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
5 * Created on May 9, 2006, 3:22 PM
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
6 *
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
7 */
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
8
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
9 import java.io.*;
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
10 import java.util.Collection;
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
11 import java.util.ArrayList;
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
12 import java.util.List;
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
13 import org.apache.commons.io.FileUtils;
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
14 import java.util.regex.*;
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
15
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
16 /**
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
17 * Converter to convert HTML documents into MediaWiki test.
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
18 *
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
19 * Heavily customized to handle HTML produced by DEC DOCUMENT
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
20 * SOFTARE doctype. Breaks file into Chapters in the manner done
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
21 * by Document. Needs modification to work with other HTML files.
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
22 *
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
23 * @author SMITH
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
24 */
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
25 public class Html2Wiki {
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
26
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
27 private StringBuffer buffer;
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
28 private Collection<Transformer> transformers;
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
29 private boolean converted = false;
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
30 private static String category;
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
31
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
32 /** Creates a new instance of Html2Wiki. */
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
33 public Html2Wiki(String html) {
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
34 buffer = new StringBuffer(html);
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
35 transformers = new ArrayList<Transformer>();
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
36 transformers.add(new DeleteTransformer("<html>|</html>|<body>|</body>"));
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
37 transformers.add(new DeleteTransformer("<!--.*-->(\\n|\\r)*",true));
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
38 transformers.add(new DeleteTransformer("<a .*?>|</a>"));
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
39 transformers.add(new DeleteTransformer("(?m)^\\*"));
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
40 transformers.add(new DeleteTransformer("(?m)<br>$"));
14
c8442e0eff84 Remove <caption> tags. Generlized {table} around {code} blocks.
smith@nwoca.org
parents: 13
diff changeset
41 transformers.add(new DeleteTransformer("<caption>.*</caption>")); // remove SDML captions (used for TOC)
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
42 transformers.add(new DeleteTransformer("<font .*?>|</font>"));
4
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
43 transformers.add(new CloseTagTransformer("<li>","(\n|\r)*(<li>|</ul>|</ol>|<ul>|<ol>)","</li>"));
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
44 transformers.add(new BadTableDataTransformer());
15
494ca5643e1a adjust h2 transformer for headings with arguments and added transformer for removing the empty contents table that is left as part of the conversion of the manuals.
ferrall@nwoca.org
parents: 14
diff changeset
45 transformers.add(new BadTableRowTransformer());
494ca5643e1a adjust h2 transformer for headings with arguments and added transformer for removing the empty contents table that is left as part of the conversion of the manuals.
ferrall@nwoca.org
parents: 14
diff changeset
46 transformers.add(new ReflowTransformer());
494ca5643e1a adjust h2 transformer for headings with arguments and added transformer for removing the empty contents table that is left as part of the conversion of the manuals.
ferrall@nwoca.org
parents: 14
diff changeset
47 transformers.add(new DeleteTransformer("<p>"));
8
e8ea26ab2cd7 [no commit message]
ferrall@nwoca.org
parents: 7
diff changeset
48 transformers.add(new ReplaceTransformer("\\{","\\{")); // Escape braces
e8ea26ab2cd7 [no commit message]
ferrall@nwoca.org
parents: 7
diff changeset
49 transformers.add(new ReplaceTransformer("\\}","\\}"));
7
a634b4d554d4 Minor fixups &gt;, random smilies :), etc. Fixed blockquote. Handle escaping brackets outside pre tag.
smith@nwoca.org
parents: 6
diff changeset
50
a634b4d554d4 Minor fixups &gt;, random smilies :), etc. Fixed blockquote. Handle escaping brackets outside pre tag.
smith@nwoca.org
parents: 6
diff changeset
51 transformers.add(new ReplaceTransformer("\\[","\\[")); // Escape brackets
a634b4d554d4 Minor fixups &gt;, random smilies :), etc. Fixed blockquote. Handle escaping brackets outside pre tag.
smith@nwoca.org
parents: 6
diff changeset
52 transformers.add(new ReplaceTransformer("\\]","\\]"));
a634b4d554d4 Minor fixups &gt;, random smilies :), etc. Fixed blockquote. Handle escaping brackets outside pre tag.
smith@nwoca.org
parents: 6
diff changeset
53 transformers.add(new PreTagTransformer()); // Unescape brackets inside <pre>
a634b4d554d4 Minor fixups &gt;, random smilies :), etc. Fixed blockquote. Handle escaping brackets outside pre tag.
smith@nwoca.org
parents: 6
diff changeset
54 //
4
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
55 transformers.add(new ReplaceTransformer("<br>","\\\\"));
8
e8ea26ab2cd7 [no commit message]
ferrall@nwoca.org
parents: 7
diff changeset
56
e8ea26ab2cd7 [no commit message]
ferrall@nwoca.org
parents: 7
diff changeset
57 //replace table tag preserving border setting.
10
2fb5084b1564 [no commit message]
ferrall@nwoca.org
parents: 9
diff changeset
58 transformers.add(new TagTransformer("<table\\sborder=(\\d).*?>", true, "{table:border=", "|width=75%}"));
8
e8ea26ab2cd7 [no commit message]
ferrall@nwoca.org
parents: 7
diff changeset
59
4
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
60 transformers.add(new ReplaceTransformer("<table.*?>|</table>","{table}"));
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
61 transformers.add(new ReplaceTransformer("<tr>|</tr>","{tr}"));
5
d34f4d408ef9 [no commit message]
ferrall@nwoca.org
parents: 4
diff changeset
62 transformers.add(new ReplaceTransformer("<td.*?>|</td>","{td}"));
d34f4d408ef9 [no commit message]
ferrall@nwoca.org
parents: 4
diff changeset
63 transformers.add(new ReplaceTransformer("<th.*?>|</th>","{th}"));
4
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
64 transformers.add(new ReplaceTransformer("<ol.*?>|</ol>","{ol}"));
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
65 transformers.add(new ReplaceTransformer("<ul.*?>|</ul>","{ul}"));
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
66 transformers.add(new ReplaceTransformer("<li>","{li}"));
13
cf58f4b9902b clean up ending {li} on line by itself
smith@nwoca.org
parents: 12
diff changeset
67 transformers.add(new ReplaceTransformer("\\n\\s*</li>","{li}\n")); // remove leading space from </li>
cf58f4b9902b clean up ending {li} on line by itself
smith@nwoca.org
parents: 12
diff changeset
68 transformers.add(new ReplaceTransformer("</li>","{li}\n")); // Replace remaining </li>
16
001e43423d5d added replace transformer to remove garbage character (ESC, unicode 001B) from "screen prints", replaced with a space.
ferrall@nwoca.org
parents: 15
diff changeset
69 transformers.add(new ReplaceTransformer("\\u001B"," ")); // Replace ASCII ESC character
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
70
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
71 transformers.add(new ChapterTransformer(category));
4
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
72 transformers.add(new TagTransformer("<pre>(.*?)</pre>", true, "{code}","{code}"));
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
73 transformers.add(new TagTransformer("<center>(.*?)</center>", true, "{center}","{center}"));
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
74 transformers.add(new TagTransformer("<em>(.*?)</em>", "*","*"));
12
ferrall@nwoca.org
parents: 11
diff changeset
75 transformers.add(new TagTransformer("<strong>(.*?)</strong>", true, "*","*"));
9
ccb40d1cb213 [no commit message]
ferrall@nwoca.org
parents: 8
diff changeset
76 transformers.add(new TagTransformer("<u>(.*?)</u>" , "+","+"));
4
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
77 transformers.add(new TagTransformer("(?s)<kbd>(.*?)</kbd>", "{{", "}}"));
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
78 transformers.add(new TagTransformer("<h1>(.*)</h1>", "h1. ", ""));
15
494ca5643e1a adjust h2 transformer for headings with arguments and added transformer for removing the empty contents table that is left as part of the conversion of the manuals.
ferrall@nwoca.org
parents: 14
diff changeset
79 transformers.add(new TagTransformer("<h2.*>(.*)</h2>", "h2. ", ""));
4
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
80 transformers.add(new TagTransformer("<h3>(accessing the program|sample run|sample screens?|sample reports?)</[h|H]3>","h3.", ""));
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
81 transformers.add(new TagTransformer("<h3>(.*)</H3>", "h3. ", ""));
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
82 transformers.add(new TagTransformer("<h3>(.*)</h3>", "h3. ", ""));
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
83 transformers.add(new TagTransformer("<h4>(.*)</h4>", "h4. ", ""));
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
84 transformers.add(new TagTransformer("<h5>(.*)</h5>", "h5. ", ""));
22ed6d93442c Start modifying transformers to Confluence wiki syntax
smith@nwoca.org
parents: 2
diff changeset
85 transformers.add(new TagTransformer("<h6>(.*)</h6>", "h6. ", ""));
8
e8ea26ab2cd7 [no commit message]
ferrall@nwoca.org
parents: 7
diff changeset
86
e8ea26ab2cd7 [no commit message]
ferrall@nwoca.org
parents: 7
diff changeset
87 //Replace Notes with Info tags.
10
2fb5084b1564 [no commit message]
ferrall@nwoca.org
parents: 9
diff changeset
88 transformers.add(new ReplaceTransformer("\\{center}\\n\\{table:border=\\d.*}\\n\\{tr\\}\\n\\s{2}\\{td\\}\\{center\\}\\*Note\\*\\{center\\}","{info}"));
8
e8ea26ab2cd7 [no commit message]
ferrall@nwoca.org
parents: 7
diff changeset
89 transformers.add(new ReplaceTransformer("\\{td\\}\\n\\s{2}\\{tr\\}\\n\\{table\\}\\n\\{center\\}","{info}"));
5
d34f4d408ef9 [no commit message]
ferrall@nwoca.org
parents: 4
diff changeset
90
8
e8ea26ab2cd7 [no commit message]
ferrall@nwoca.org
parents: 7
diff changeset
91 //Remove unnecessary table surrounding code blocks.
14
c8442e0eff84 Remove <caption> tags. Generlized {table} around {code} blocks.
smith@nwoca.org
parents: 13
diff changeset
92 transformers.add(new ReplaceTransformer("\\{table:.*\\}(\\n|\\s|\\{t.\\}|\\*\\S*\\*)*\\{code\\}","{code}"));
c8442e0eff84 Remove <caption> tags. Generlized {table} around {code} blocks.
smith@nwoca.org
parents: 13
diff changeset
93 transformers.add(new ReplaceTransformer("\\{code\\}(\\n|\\{t.\\}|\\s)*\\{table\\}","{code}"));
8
e8ea26ab2cd7 [no commit message]
ferrall@nwoca.org
parents: 7
diff changeset
94
e8ea26ab2cd7 [no commit message]
ferrall@nwoca.org
parents: 7
diff changeset
95 //Change borderStyle of code window for "screenshots" to none.
e8ea26ab2cd7 [no commit message]
ferrall@nwoca.org
parents: 7
diff changeset
96 transformers.add(new TagTransformer("\\{code\\}([\\s\\n]*?_______________)", true, "{code:borderStyle=none}", ""));
e8ea26ab2cd7 [no commit message]
ferrall@nwoca.org
parents: 7
diff changeset
97
e8ea26ab2cd7 [no commit message]
ferrall@nwoca.org
parents: 7
diff changeset
98
e8ea26ab2cd7 [no commit message]
ferrall@nwoca.org
parents: 7
diff changeset
99
7
a634b4d554d4 Minor fixups &gt;, random smilies :), etc. Fixed blockquote. Handle escaping brackets outside pre tag.
smith@nwoca.org
parents: 6
diff changeset
100 transformers.add(new TagTransformer("<blockquote>(.*?)</blockquote>", true, "{quote}", "{quote}"));
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
101 transformers.add(new DeleteTransformer("(?s)<hr.*?>"));
8
e8ea26ab2cd7 [no commit message]
ferrall@nwoca.org
parents: 7
diff changeset
102 transformers.add(new ReflowTransformer("(\\{info\\})([^\\{]*)(\\{info\\})"));
12
ferrall@nwoca.org
parents: 11
diff changeset
103 transformers.add(new ReflowTransformer("(\\{note\\})([^\\{]*)(\\{note\\})"));
ferrall@nwoca.org
parents: 11
diff changeset
104 transformers.add(new ReflowTransformer("(\\{td\\})([^\\{]*)(\\{td\\})"));
ferrall@nwoca.org
parents: 11
diff changeset
105 transformers.add(new ReflowTransformer("(\\{li\\})([^\\{]*)(\\{li\\})"));
7
a634b4d554d4 Minor fixups &gt;, random smilies :), etc. Fixed blockquote. Handle escaping brackets outside pre tag.
smith@nwoca.org
parents: 6
diff changeset
106 transformers.add(new TagTransformer("<sup>(.*?)</sup>", true, "^\\[","\\]^ "));
a634b4d554d4 Minor fixups &gt;, random smilies :), etc. Fixed blockquote. Handle escaping brackets outside pre tag.
smith@nwoca.org
parents: 6
diff changeset
107 transformers.add(new ReplaceTransformer("&lt;","<"));
a634b4d554d4 Minor fixups &gt;, random smilies :), etc. Fixed blockquote. Handle escaping brackets outside pre tag.
smith@nwoca.org
parents: 6
diff changeset
108 transformers.add(new ReplaceTransformer("&gt;",">"));
a634b4d554d4 Minor fixups &gt;, random smilies :), etc. Fixed blockquote. Handle escaping brackets outside pre tag.
smith@nwoca.org
parents: 6
diff changeset
109 transformers.add(new ReplaceTransformer("&quot;","\""));
12
ferrall@nwoca.org
parents: 11
diff changeset
110 transformers.add(new ReplaceTransformer("&amp;","&"));
7
a634b4d554d4 Minor fixups &gt;, random smilies :), etc. Fixed blockquote. Handle escaping brackets outside pre tag.
smith@nwoca.org
parents: 6
diff changeset
111 transformers.add(new ReplaceTransformer(":\\)",": )")); // No smilies...
13
cf58f4b9902b clean up ending {li} on line by itself
smith@nwoca.org
parents: 12
diff changeset
112 transformers.add(new ReplaceTransformer("(\\w)(--)(\\w)"," -- ",2)); // avoid strikeout
15
494ca5643e1a adjust h2 transformer for headings with arguments and added transformer for removing the empty contents table that is left as part of the conversion of the manuals.
ferrall@nwoca.org
parents: 14
diff changeset
113 transformers.add(new ReplaceTransformer("\\{table(.*?)\\}\\n\\s{2}\\{tr\\}\\n\\s{4}\\{td\\}Contents\\{td\\}\\n\\s{2}\\{tr\\}\\n\\{table\\}","")); // remove "contents" table
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
114
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
115 }
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
116
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
117 /**
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
118 * @param args the command line arguments
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
119 */
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
120 public static void main(String[] args) throws IOException {
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
121
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
122 if (args.length == 0) {
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
123 System.out.println("Usage:");
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
124 System.out.println(" Html2Wiki {inputDirectory} [Category]");
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
125 System.out.println(" default is current directory");
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
126 System.out.println(" Processes all *.html files. ");
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
127 System.out.println(" Each 'chapter' written to *.wiki");
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
128 return;
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
129 }
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
130
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
131 File inputs = new File(args[0]);
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
132
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
133 if (args.length > 1) {
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
134 category = args[1];
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
135 }
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
136
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
137 File[] inputFiles = inputs.listFiles(new HtmlFileFilter());
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
138 for (int i = 0; i < inputFiles.length; i++) {
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
139
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
140 process(inputFiles[i]);
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
141
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
142 }
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
143
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
144 }
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
145
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
146 protected static void process(File input) throws IOException {
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
147
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
148 System.out.println(input.getAbsoluteFile());
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
149
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
150 Html2Wiki converter = new Html2Wiki(FileUtils.readFileToString(input, null));
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
151
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
152 WikiChapter[] chapters = converter.getWikiChapters();
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
153
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
154 System.out.format("Writing %d wiki files...\n", chapters.length);
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
155
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
156
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
157 for (int i = 0; i < chapters.length; i++) {
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
158
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
159 FileUtils.writeStringToFile(new File(input.getParent(),
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
160 generateFilename(chapters[i].getChapterName()) + ".wiki"),
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
161 chapters[i].getContents().toString(),
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
162 null);
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
163
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
164 }
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
165
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
166 }
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
167
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
168 public static String generateFilename(String input) {
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
169 return input.replaceAll("\\\\|/|:|\\(|\\)", "-").replace("<br>", "");
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
170
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
171 }
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
172
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
173 public String getWikiText() {
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
174 convert();
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
175 return buffer.toString();
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
176 }
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
177
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
178 public WikiChapter[] getWikiChapters() {
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
179
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
180 convert();
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
181
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
182 List<WikiChapter> chapters = new ArrayList<WikiChapter>();
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
183
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
184 Pattern chapterPat = Pattern.compile("<chapter>");
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
185 Matcher begin = chapterPat.matcher(buffer);
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
186 Matcher end = chapterPat.matcher(buffer);
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
187
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
188 while (begin.find()) {
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
189
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
190
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
191 end.find(begin.end());
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
192
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
193 Pattern chapterNamePat = Pattern.compile("<chapter>(.*?)</chapter>");
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
194
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
195 Matcher chapterNameMatcher = chapterNamePat.matcher(buffer);
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
196
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
197 String chapterName = chapterNameMatcher.find(begin.start()) ? chapterNameMatcher.group(1) : null;
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
198
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
199 CharSequence contents = buffer.subSequence(chapterName == null ? begin.start() : chapterNameMatcher.end(), end.hitEnd() ? buffer.length() : end.start());
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
200
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
201 chapters.add(new WikiChapter(chapterName, contents));
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
202
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
203 }
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
204 return (WikiChapter[]) chapters.toArray(new WikiChapter[]{});
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
205 }
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
206
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
207 private void convert() {
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
208
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
209 if (!converted) {
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
210 for (Transformer t : transformers) {
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
211
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
212 System.out.println(".Applying: " + t);
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
213 t.apply(buffer);
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
214
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
215 }
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
216 }
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
217 converted = true;
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
218 }
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
219
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
220 private static class HtmlFileFilter implements FileFilter {
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
221
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
222 public boolean accept(File pathname) {
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
223 return pathname.getName().toLowerCase().matches("^.*\\.html$");
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
224 }
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
225 }
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
226
2
5da2e67620f9 Upgrade to Ivy configuration and begin clean up of tests. Added FreeBSD license.
smith@nwoca.org
parents: 0
diff changeset
227 protected static class WikiChapter {
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
228
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
229 private String chapterName;
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
230 private CharSequence contents;
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
231
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
232 public WikiChapter(String chapterName, CharSequence contents) {
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
233 this.chapterName = chapterName.replaceAll("\\\\|/|:|\\(|\\)", "-").replaceAll("\\s+", " ").replaceAll("&amp;", "and");
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
234
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
235 this.contents = contents;
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
236 }
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
237
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
238 public String getChapterName() {
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
239 return chapterName;
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
240 }
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
241
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
242 public CharSequence getContents() {
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
243 return contents;
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
244 }
6
99f293bd507f Add "reflow" transformer to reflow paragraphs, list items, etc.
smith@nwoca.org
parents: 5
diff changeset
245
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
246 public String toString() {
2
5da2e67620f9 Upgrade to Ivy configuration and begin clean up of tests. Added FreeBSD license.
smith@nwoca.org
parents: 0
diff changeset
247 return "Chapter: " + chapterName + " Content length: " + contents.length();
0
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
248 }
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
249 }
f8b1ea49d065 Initial version of crude HTML to WikiText converter. Customized for converting HTML files from DEC Document into Wiki markup.
smith@nwoca.org
parents:
diff changeset
250 }