Wiki Creole Grammar, Schema, and Transformations

Please find below an EBNF grammar, XML schema, and XSLT transformations for Wiki Creole, currently in version 1.0. Wiki Creole is the first (and only) community standard for wiki markup.

These specifications were taken from the following two technical reports:

Wiki Creole EBNF grammar

Wiki Creole XML schema definition

Wiki Creole to XHTML transformation

XML to Wiki Creole transformation

It is pretty likely that despite our test suites, some bugs remain. So, if you find some, please let us know!

The best way to get in contact with us is to address Martin Junghans through the SourceForge WikiCreole project and cc: Dirk Riehle.

21 thoughts on “Wiki Creole Grammar, Schema, and Transformations”

  1. marko

    The EBNF links are broken: “wiki-creole/” must be removed from the first one, and the quotation marks at the end from the latter.

  2. Martien

    Generating code using ANTLRWorks 1.1.7 results in errors in rule text_bolditalcontent.

    I am an absolute novice on the subject and would appreciate your help.

  3. Martin

    @Martien
    ANTLRWorks does not always show the same behavior as ANTLR does. I couldn’t use ANTLRWorks for development once the grammar grew larger. You’ll also probably have to increase the heap size for code generation, e.g. with the JVM option -Xmx1024M.

  4. V

    I’m trying to implement WikiCreole using ocamlyacc or Menhir (an LR(1) parser generator for OCaml). But WikiCreole’s EBNF uses some ANTLR-specific features (I think only this input.LA thing), which makes it difficult to adapt to another parser generator … Does somebody have an idea about that?

  5. Martin

    V, first you have to consider that ANTLR generates LL(*) parsers, in contrast to your LR(1).
    We had to use the explicit input.LA statements since ANTLR chooses a derivation too early, i.e. even while other derivations are still consistent with the current look-ahead. It does not extend the look-ahead as far as necessary to identify the only applicable derivation. Thus it may commit to the wrong derivation, which results in an exception even though the grammar is correct.

    The _grammar_ does not need these explicit queries to the look-ahead. But the _ANTLR grammar specification_ seems to need them as an aid.
    If I recall LR(1) parsers correctly, you can’t rely on the grammar specification alone for your parser generator, because the WikiCreole language needs more than one token of look-ahead.
    Nevertheless, good luck with your work!

  6. Greg

    Hi guys. I’m interested in using this XML format as my intermediate (stored) stage. However, it seems that I’ll have to write the code (Java) to translate from Creole to ‘X-Creole’. I realise there’s an XSD, but would some of you have some example XML files for me to test with? Thanks!

  7. Martin

    Hi Greg, you are right. We do rely on some code to transform the parsed representation of the wiki page into the XML representation. I’m not sure whether you can transform the parse tree (or AST) to XML with ANTLR’s domain-specific language. We implemented a Java DFS parse tree walker that generates the XML file as it walks the tree. Regards, and I’ll mail some files to you.

  8. Dirk Riehle (post author)

    Greg, Martin: I had been wondering about this for a while; we knew that the tree walker is missing from our publications, but then you can’t really publish some Java code.

    What you can do, of course, is create an open source project. So for that reason I created a SourceForge project called wikicreole a while back. Is there any interest in writing tree walker code for that project?

  9. Jose Chillan

    Hi,
    I am very interested in your work. I went to the SourceForge project and saw that no files had been committed. Is there anywhere I could get the tree walker for the ANTLR grammar that produces the XML?
    Regards

  10. Martin

    Hi Jose and everyone,
    The SourceForge project Wiki Creole Parser is available. It generates scanner and parser from an ANTLR grammar file, parses a wiki page and transforms the parse tree to the XML interchange format.
    Please don’t hesitate to contact me with any questions or comments.

  11. Varun

    Hello all,

    I have one question, which might sound trivial to you. I would like to know what the difference is between “with_extension” and “without_extension” in the above files.

    Second, Martin, you mailed Greg some XML documents he asked for (see the comments above). Can you mail me the same, please?

    Thanks!

  12. Dirk Riehle (post author)

    The difference between with_extension and without_extension is that the latter is a grammar (and related files) for the plain original Wiki Creole, while with_extension is prepared, by way of an additional token, to be extended with new syntactic elements. So if all you want is what Wiki Creole 1.0 can give you, use without_extension; if you are thinking about adding your own new syntactic elements, try with_extension. The differences are minimal, just the hook for the extended syntax.

    As to the files, I’ll ask Martin. I think we might have to clean up this page; maybe you can already find what you are looking for at sf.net/projects/wikicreole?

  13. Luke

    Hey people, great work..

    .. but:
    I’ve a problem with ANTLR and this grammar. When I try to generate the code via ANTLR 2.7.7 there’s always a nondeterminism between a huge number of rules..
    Which lookahead (k) should I use to wipe these out, but keep the efficiency high?

    thanks!
    luke

  14. Luke

    ok.. there’s another thing I wanted to know..

    when I look past those warnings and try to compile the .java files in the directory, I get 13 errors, every single one about:

    Cannot find symbol
    Symbol: input

    What do I have to do to avoid this one?
    Why doesn’t
    input.LA(1)==DASH
    or sth. work?

    thanks
    luke

  15. Dave Pawson

    Processing the ‘without extensions’ .g file,
    using ANTLRWorks 1.2.3 I’m getting 5 errors and 180 warnings.

    that’s from the command line.

    Has something changed?

    TIA DaveP

  16. Martin

    Hi Dave,
    we recommend not using ANTLRWorks for generating the scanner/parser. I do not remember why, but using ANTLR from the command line is the proper way to go. Maybe you will also have to increase the memory available for generation.

    Cheers, Martin

  17. tmont

    I’m having trouble compiling this with ANTLR 3.2 (from the command line as recommended with increased heap size).

    The repeated error I’m seeing (in addition to about a hundred warnings) is “error(201): creole10.g:61:2: The following alternatives can never be matched: 2”.

    Is there a previous version of ANTLR that will compile this without errors? Or am I barking up a dead tree?

    1. Dirk Riehle (post author)

      Sorry, we are not actively using ANTLR any longer. We hope to have something much better out for WikiCreole though soon…

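
To make the look-ahead point from comment 5 concrete, here is a minimal Python sketch. It is not taken from the Wiki Creole grammar: the token names, the `tokenize` helper, and the `TokenStream` class are hypothetical, with `LA(k)` only mimicking the 1-based look-ahead call `input.LA(k)` that the ANTLR grammar uses. It assumes a simplified horizontal rule (exactly four dashes followed by a line end) and shows that a single token of look-ahead cannot tell such a rule from ordinary text that starts with dashes.

```python
# Hypothetical token types; the real grammar defines its own.
DASH, NEWLINE, TEXT, EOF = "DASH", "NEWLINE", "TEXT", "EOF"


def tokenize(line):
    """Map each character to a coarse token type (simplified assumption)."""
    kinds = {"-": DASH, "\n": NEWLINE}
    return [kinds.get(ch, TEXT) for ch in line]


class TokenStream:
    """Tiny stand-in for a parser input stream with 1-based look-ahead."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def LA(self, k):
        """Return the type of the k-th token ahead, or EOF past the end."""
        index = self.pos + k - 1
        return self.tokens[index] if index < len(self.tokens) else EOF


def starts_horizontal_rule(stream):
    """Decide with five tokens of look-ahead: four dashes, then a line end."""
    four_dashes = all(stream.LA(k) == DASH for k in range(1, 5))
    return four_dashes and stream.LA(5) in (NEWLINE, EOF)


rule = TokenStream(tokenize("----\n"))
text = TokenStream(tokenize("--- not a rule\n"))

# With one token of look-ahead both inputs look identical ...
print(rule.LA(1) == text.LA(1))
# ... and only the deeper look-ahead separates them.
print(starts_horizontal_rule(rule), starts_horizontal_rule(text))
```

An LR(1) generator such as Menhir would need the same distinction expressed differently, for example by having the lexer emit a dedicated token for the whole rule line; that is a design option, not something the thread confirms.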

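
Comment 7 mentions a depth-first parse tree walker that emits XML. The project's walker was written in Java and is not reproduced here; the following Python sketch only illustrates the general idea. The `Node` class and the element names (`page`, `heading`, `paragraph`, `bold`) are assumptions made for the example and are not taken from the Wiki Creole XML schema.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical parse tree node: a kind, optional text, and children."""
    kind: str
    text: str = ""
    children: list = field(default_factory=list)


def to_xml(node):
    """Walk the tree depth-first and build one XML element per node."""
    element = ET.Element(node.kind)
    if node.text:
        element.text = node.text
    for child in node.children:
        # Each subtree is converted completely before its next sibling.
        element.append(to_xml(child))
    return element


tree = Node("page", children=[
    Node("heading", "Title"),
    Node("paragraph", children=[Node("bold", "hello")]),
])

xml_text = ET.tostring(to_xml(tree), encoding="unicode")
print(xml_text)
```

A walker for the real grammar would map ANTLR parse tree nodes to the element names defined in the XSD instead of reusing the node kind directly.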