At the Melbourne Cup lunch, Matthew Jensen and I were talking about the Java 4K competition, where you have to implement a game in Java where your resulting .jar file is 4KB or less. The competition harks back to the old 8bit demo scene where l33t – and mostly Eastern European – hackers created awesome graphics demos in pint-sized programs.

We joked about how far you’d get if you tried to create a Wiki in 4k.

This is a joke no more.

Wiki4k

screen1.png

Over the last couple of weekends I put together Wiki4k. It is easy to run, just download all 4069 bytes of wiki4k.jar (use Save As) and from the command line run java -jar wiki4k.jar 8080 and go to http://localhost:8080/ in your browser. Features:

  • A built-in HTTP 1.1 web server.
  • Wiki syntax including *bold*, [page link], headings, paragraphs, list and more (see below)
  • Edit and create wiki pages (full UTF-8 support).
  • Add comments to pages.
  • Pages and comments are persisted to disk.
  • A recent-changes page and a RSS feed.
  • A pluggable macro system.

Wiki Markup

Markup is transformed to html by stepping through the input string using one big regex and an even bigger switch statement. There are probably a variety of inputs that will cause it to barf, but I have a set of unit tests that shows that it works for all the simple cases.

  • Span styles of *bold* _italics_ -strikethrough- +underlined+ ^superscript^ ~subscript~ and {{code}}.
  • Span styles can nest, eg *this is all bold, some _italics_* (but not recursive nesting e.g. _ italics _inner italics_ italics_).
  • [Page Links] (can include spaces, case sensitive).
  • Escape meta-characters with slosh.
  • Html escaping (i.e. no html injection).
  • Blank lines start new paragraphs (as opposed to Confluence where any line is a line break).
  • bq. block quotes.
  • ---- horizontal rule.
  • h1. to h6. for headings.
  • * based nested lists. ** is a second level etc.
  • {macroname}...body...{macroname} Pluggable macros (see below).

Built-in web server

There is something kinda special about HTTP – it is amazing what you can do by just reading the request line from a Socket and bunging it through a regular expression. I wouldn’t claim that Wiki4k is a strictly HTTP 1.1 compliant (clockless) web server, but I’d say it is pretty close. Close enough to fool Firefox and Safari (didn’t try IE).

To reduce the code size, I just accept GET requests and I skip over any and all headers. Parsing POST was just a bit too onerous, so form submits that should normally POST just GET and receive a 307 redirect, which is not a bad hack to get rid of dangling form query parameters.

Everything else I don’t understand gets a 500 response.

Things that work:

  • GETs to page names (page names allow alphanumeric and space characters).
  • A GET to / loads the Home page.
  • GETs /recent-changes and /recent-changes.rss.
  • GETs to /pagename/edit and /pagename/save?page=newtext and /pagename?addcomment=commenttxt.

Things that don’t work so much:

  • The server is single threaded, it can only process one request at a time.
  • There are probably requests that might crash or DOS the server.
  • Early on I was honouring conditional GETs (ETag/If-None-Match) but for some reason FireFox would only send the If-None-Match on every second request (and I couldn’t work out what Safari was doing), so I ditched it and used the code-space for pluggable macros.

It is probably worth noting that Wiki4k doesn’t not output valid HTML. It sacrifices <html> and <head> and <body> at the alter of file-size reduction.

Creating pages

Each page has an Edit Page button that brings up the expected text area. [Page Links] can point to non-existant pages, in which case you get a chance to create that page with the Edit Page button.

Anyone can edit a page, and all edits are anonymous. You can’t delete pages, and saving a page might stomp someone else’s save (no conflict detection).

Comments

Anyone can add comments, they appear in a list at the bottom of the page. Hitting the Add Comment button uses some Javascript to un-hide the add-comment text area. How very Web 2.0!

Recent changes

Each page records the timestamp of its last edit, and this is used to determine a reverse-chronological list of pages. Comments don’t appear in the recent changes, just pages. This list can be retrieved as html or RSS.

There is no <link> for the RSS feed, that chewed up valuable space, and we don’t have a <head> section anyway.

Persistence

I used trusty old java.util.Properties to store everything. I needed a Map anyway, so why not one with load()/store() methods. The wiki state is stored in a file named wiki.properties in the current directory.

Pages are stored in the form pagename=markup and pagename.ts=timestamp. Comments are stored in pagename.comments=commentlist where commentlist is a list of all the comments separated by the ESC (0x1b character). (Why ESC and not something like NUL? You can’t make friends with NUL salad, that’s why.) I didn’t bother trying to escape the ESC character if it appears in a comment.

Macros

Pluggable macros in under 4K … give it up for Spuddy! Macro’s are mapped through the wiki.properties file with macro.macroname=classname.The wiki4j.jar doesn’t ship any macros itself, but there are some in macros.jar. Download macros.jar and this wiki.properties and run:

java -cp macros.jar:wiki4k.jar W 8080

This gives you access to the {{upper}} and {{code}} macros.

The trick here was to find an interface in the JDK that had a single method that takes a string and returns a string. IntelliJ’s structural search for the win, in this instance. It turned up java.net.FileNameMap. Macros need to implement this interface, the macro body is passed as the argument, and html is returned. Macro’s have access to wiki4k’s wiki-to-html escaper for sub rendering.

Keeping below 4K

There is a lot of bloat in a single .class file, so any serious 4K jar contains just one class. That is why I wanted to re-use a JDK interface for macros and not invent my own.

The next biggest source of bloat is the number of unique classes and methods you reference. I could probably have trimmed my use down a little, but doing so becomes a little wearisome.

You also get a lot of mileage out of an obfuscator. I used ProGuard which reduced my jar from 4338 to 4069 bytes.

Presumably writing a 4K game would be a little easier – I had to import Socker, ServerSocket, various streams, regex, etc, where as a Game could get away with little more than Frame and Graphics2D.

The source

In all it’s glory: W.java. You may also be interested in the macro code UpperMacro.java, CodeMacro.java and BaseMacro.java.

W.java

/* Code styles */
.code {
border-width: 1px;
border-style: dashed;
overflow: auto;
}
.code, .preformatted {
background-color: #fff;
}
.code pre, .preformatted pre { /* needs ‘pre’ to override TinyMCE style */
font-family:”Courier New”, Courier, monospace;
line-height: 1.3;
}
.code-keyword {
color: #000091;
background-color: inherit;
}
.code-object {
color: #910091;
background-color: inherit;
}
.code-quote {
color: #009100;
background-color: inherit;
}
.code-comment {
color: #808080;
background-color: inherit;
}
.code-xml .code-keyword {
color: inherit;
font-weight: bold;
}
.code-tag {
color: #000091;
background-color: inherit;
}

import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.OutputStreamWriter;
import java.net.*;
import java.util.Properties;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
public class W {
static Properties PAGES = new Properties();
public static void main(String[] args) throws Exception {
Properties pages = PAGES;
Pattern requestProg = Pattern.compile(
"^((GET)) /" + // 1,2
                        //URL
                        "(?:" +
// pagename/suffix
                        "(?:" +
"(([a-zA-Z0-9+]|%20)*)" + // 3,4
                        "((/edit)" + // 5,6
                        "|(?addcomment=([^ ]*))" + // 7,8
                        "|(/save?page=([^ ]*))" + // 9,10
                        ")?" +
")" +
// recentchanges
                        "|(recent-changes(.rss)?)" + // 11,12
                        ")" +
// HTTP/1.1
                        " .*"
);
Pattern propertyNameProg = Pattern.compile("(.*).ts");
pages.setProperty("Home", "Welcome to wiki4k u221e!");
pages.setProperty("Home.ts", Long.toString(System.currentTimeMillis()));
String storename = "wiki.properties";
try {
FileInputStream sin = new FileInputStream(storename);
pages.load(sin);
sin.close();
} catch (Exception e) {
}
String homeurl = "http://localhost:" + args[0] + "/";

ServerSocket server = new ServerSocket((int) Long.parseLong(args[0]));
while (true) {
Socket s = server.accept();
DataInputStream in = new DataInputStream(s.getInputStream());
String requestline = null;
while (true) {
String line = in.readLine();
//                System.out.println(line);
                if (line == null || line.equals("")) {
break;
}
if (requestline == null) {
requestline = line;
}
}
// some tmp computations
            TreeMap<Long, String> recentChangesMap = new TreeMap<Long, String>();
for (Object o : pages.keySet()) {
String k = (String) o;
Matcher tsm = propertyNameProg.matcher("");
if (tsm.matches()) {
String g1 = tsm.group(1);
String value = pages.getProperty(k);
if (g1 != null) {
long ts = Long.parseLong(value);
recentChangesMap.put(new Long(-ts), g1);
}
}
}
OutputStreamWriter out = new OutputStreamWriter(s.getOutputStream(), "UTF-8");
Matcher m = requestProg.matcher(requestline);
out.write("HTTP/1.1 ");
if (m.matches()) {
if (m.group(11) != null) {
boolean rss = m.group(12) != null;
out.write("200 OKrnContent-Type: "); // TODO reuse with else below?
                    out.write(rss ? "application/rss+xml" : "text/html");
out.write("; charset=UTF-8rnrn");
if (rss) {
out.write("<?xml version="1.0"?>n<rss version="2.0"><channel><title>wiki4k</title><link>");
out.write(homeurl);
out.write("</link><description>wiki4k</description>");
}
if (!rss) {
out.write("<h1>Recent Changes</h1><ul>");
}
for (String spagename : recentChangesMap.values()) {
out.write(rss? "<item><link>" : "<li><a href='");
if (rss) {
out.write(homeurl);
}
out.write(URLEncoder.encode(spagename, "UTF-8"));
out.write(rss? "</link>" : "'>");
out.write(rss? "<title>" : "");
out.write(spagename);
out.write(rss? "</title></item>" : "</a>");
}
if (rss) {
out.write("</channel></rss>");
}
} else {
String upagename = m.group(3); // url page name
                    if (upagename.equals("")) {
upagename = "Home";
}
String spagename = URLDecoder.decode(upagename, "UTF-8"); // decoded pagename

String g8 = m.group(8);
String g9 = m.group(9);
if (g9 != null || g8 != null) {
if (g9 != null) {
// save page
                            String newtext = URLDecoder.decode(m.group(10), "UTF-8");
pages.setProperty(spagename, newtext);
pages.setProperty(spagename + ".ts", Long.toString(System.currentTimeMillis()));
} else {
// add comment
                            String newcomment = URLDecoder.decode(g8, "UTF-8");
String key = spagename + ".comments";
String comments = pages.getProperty(key);
if (comments == null) {
comments = newcomment;
} else {
comments = comments + "u001b" + newcomment;
}
pages.setProperty(key, comments);
}
out.write("307 l8rrnLocation: /");
out.write(upagename);
out.write("rnrn");
FileOutputStream sout = new FileOutputStream(storename);
pages.store(sout, spagename);
sout.close();
} else {
out.write("200 OKrnContent-Type: text/html; charset=UTF-8rnrn");
out.write("<h1>");
out.write(spagename);
out.write("</h1>");
String page = pages.getProperty(spagename);
if (m.group(6) != null) {
// page edit
                            if (page == null) {
page = "All about " + spagename;
}
out.write("<form action='save'><textarea name='page' rows='15' cols='80' accept-charset='UTF-8'>");
out.write(e(page));
out.write("</textarea><br><input type='submit' value='Save'>" +
" (<strong>*bold*</strong> <em>_italics_</em> [Page Link])");
} else {
// page view
                            if (page == null) {
page = "Unknown page" + spagename;
}
out.write("<p>");
w(out, page);
out.write("<hr> " +
"<a href='/recent-changes'>Recent Changes</a> " +
"(<a href='/recent-changes.rss'>rss</a>) " +
"<a href='/");
out.write(upagename);
out.write("/edit'>Edit Page</a><br>Comments:<ul>");
String comments = pages.getProperty(spagename + ".comments", "");
for (String cmt : comments.split("u001b")) {
if (cmt.length() > 0) {
out.write("<li>");
w(out, cmt);
}
}
out.write("</ul>"+
"<a href='#' onclick="getElementById('ac').style.display='block'">Add Comment</a>" +
"<form id='ac' action='' style='display: none'>" +
"<textarea name='addcomment' rows='15' cols='80' accept-charset='UTF-8'>" +
"</textarea><br><input type='submit' value='Save'>");
}
}
}
} else {
out.write("500 wtfrnrn");
}
out.close();
s.close();
}
}
/** wiki escape the page into the stream */
public static void w(OutputStreamWriter out, final String page) throws Exception {
Properties pages = PAGES;
Pattern allprog = Pattern.compile("" +
"([([a-zA-Z0-9 ]+)])" + // 1, 2=links
                "|((?m:^)([*]+) (.*)(?m:rn|r|n|$))" + // 3 lists 4=indents 5=body
                "|(([*_+^~-])((?!7).+?)7)" + // 6  7=style 8=stylebody
                "|((?:rn|r|n)((?:rn|r|n)*))" + // 9=newline, 10=morenrelines
                "|({{(.+?)}})" + // 11 code 12=body
                "|((?m:^)(?:(?:(h[1-6])|(bq)).) +(.*)(?m:rn|r|n|$))" + // 13  14=headings, 15=bq, 16=body
                "|((?m:^) *-{4,} *(?m:rn|r|n|$))" + // 17 --- hr
                "|({([a-zA-Z]+)})((?!18)(?s:.)+)18" + // {macro}...{macro} 18 macro 19 macroname 20 body
                "|()?(.)" // 21=escapenext 22 everything else
        );
Matcher m = allprog.matcher(page);
int offset = 0;
int listLevel = 0;
int endLength = page.length();
TOP:
while (offset < endLength) {
String tag = null;
boolean noEndTag = false;
String href = null;
String innerHtml = null;
String innerWiki = null;
String innerText = null;
int newListLevel = 0;
int group = matchgroup(m, offset, endLength);
switch (group) {
case 1: // links
                    String spagename = m.group(2);
tag = "a";
href = spagename;
innerHtml = spagename;
break;
case 3:
tag = "li";
newListLevel = m.group(4).length();
innerWiki = m.group(5);
break;
case 6: // *bold* etc styles
                    char c = m.group(7).charAt(0);
switch (c) {
case '*' : tag = "strong"; break;
case '_' : tag = "em"; break;
case '+' : tag = "u"; break;
case '^' : tag = "sup"; break;
case '~' : tag = "sub"; break;
case '-' : tag = "del"; break;
}
innerWiki = m.group(8);
break;
case 9:
innerHtml = m.group(10).equals("") ? " " : "<p>";
break;
case 11: // {{code}}
                    tag = "code";
innerText = m.group(12);
break;
case 13: // h1 etc
                    tag = m.group(14);
if (tag == null) {
tag = "blockquote";
}
innerWiki = m.group(16);
break;
case 17: // ----
                    tag = "hr";
noEndTag = true;
break;
case 18:
//comment in/out for macro support
                {
String macname = m.group(19);
try {
String classname = pages.getProperty("macro." + macname);
Class<?> clazz = W.class.getClassLoader().loadClass(classname);
FileNameMap macro = (FileNameMap) clazz.newInstance();
innerHtml = macro.getContentTypeFor(m.group(20));
//TODO what to do about if not found, print macronotfound?
                    } catch (Exception e) {
//                        e.printStackTrace();
                        tag = "code";
innerText = "MACROERROR:" + macname;
}
}
break;
case 21:
default:
innerText = m.group(22);
}
offset = m.end();
while (listLevel < newListLevel) {
out.write("<ul>");
listLevel++;
}
while (listLevel > newListLevel) {
out.write("</ul>");
listLevel--;
}
if (tag != null) {
out.write("<");
out.write(tag);
if (href != null) {
out.write(" href='");
out.write(URLEncoder.encode(href, "UTF-8"));
out.write("'");
}
out.write(">");
}
if (innerWiki != null) {
w(out, innerWiki);
}
if (innerHtml != null) {
out.write(innerHtml);
}
if (innerText != null) {
out.write(e(innerText));
}
if ((tag != null) && !noEndTag) {
out.write("</");
out.write(tag);
out.write(">");
}
}
while (listLevel > 0) {
out.write("</ul>");
listLevel--;
}
}
private static int matchgroup(Matcher m, int offset, int endLength) {
m.useTransparentBounds(true);
m.useAnchoringBounds(false);
m.region(offset, endLength);
if (!m.lookingAt()) {
return 0;
}
int l = m.groupCount();
for (int i = 1; i < l; i++) {
String g = m.group(i);
if (g != null) {
return i;
}
}
return 0;
}
/** html escape the input */
public static String e(String page) {
return page.replaceAll("&", "&amp;")
.replaceAll(">", "&gt;")
.replaceAll("<", "&lt;")
.replaceAll(""", "&quot;")
.replaceAll("'", "&apos;");
}
}

Wiki4k