wiki2text is the first useful tool I've made in Nim. It extracts plain text from a MediaWiki dump, faster than bunzip2 can unzip it. This makes great input data for machine learning from text.
The available options for doing this before were an unwieldy Java toolkit or a Python 2 script that would take hours to days to run. This version takes mere minutes (especially thanks to def's help).
Feedback is welcomed, including about coding style.
Looking over the code, the only thing I can find that might be changed is some of your variable declarations. Both 'var' and 'let' have block forms that allow declaration of multiple variables.
var
  a = 1
  b = "hello"
let
  c = a
  d = b
On a side note, be aware that slicing strings and sequences copies the contents into a new object.
Hmm. What would be a more efficient way to do comparisons such as if text[pos .. pos+1] == "[[" that's still idiomatic?
Also, I suppose I have to change some -1 indices to ^1, don't I?
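For reference, a minimal sketch of what the ^ form looks like. The string s is made up for illustration and is not taken from wiki2text:

```nim
let s = "hello"

# ^1 counts from the end: it refers to the last element.
doAssert s[^1] == 'o'

# ^ also works as a slice bound: index 1 up to the second-to-last character.
doAssert s[1 .. ^2] == "ell"
```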
continuesWith() from strutils is significantly faster than slices (somewhere around 7 to 8 times faster for me).
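A rough sketch of the swap, assuming the comparison looks like the one quoted above. The text and pos values here are invented for illustration:

```nim
import strutils

let text = "see [[Nim]] for details"
let pos = 4

# Slice comparison: copies two characters into a new string, then compares.
let viaSlice = text[pos .. pos+1] == "[["

# continuesWith checks the characters in place, without the temporary copy.
let viaContinuesWith = text.continuesWith("[[", pos)

doAssert viaSlice == viaContinuesWith
echo viaContinuesWith
```

Both forms give the same answer here, so the change should be a drop-in replacement wherever the slice is only used for an equality check.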
Thanks for the tip ;-)