Hi, Mr Nim Newbie here,
when I run the following code fragment
import xmltree
var e = newElement("elem")
e.add newText("some text")
echo typeof e
echo typeof e.text
echo e
echo e.text
it crashes and I get the following output:
XmlNode
string
<elem>some text</elem>
/Users/_/Projects/my-nim/src/x.nim(8) x
/Users/_/.choosenim/toolchains/nim-1.6.6/lib/pure/xmltree.nim(181) text
/Users/_/.choosenim/toolchains/nim-1.6.6/lib/system/assertions.nim(38) failedAssertImpl
/Users/j_/.choosenim/toolchains/nim-1.6.6/lib/system/assertions.nim(28) raiseAssert
/Users/_/.choosenim/toolchains/nim-1.6.6/lib/system/fatal.nim(53) sysFatal
Error: unhandled exception: /Users/_/.choosenim/toolchains/nim-1.6.6/lib/pure/xmltree.nim(181, 10) n.k in {xnText, xnComment, xnCData, xnEntity} [AssertionDefect]
Error: execution of an external program failed: /Users/_/Projects/my-nim/src/x
It looks like you can't directly access the text portion of an XmlNode type.
Have I misread the docs or is there something else going on?
Thanks
PS: macos arm64 FWIW
To access the text content of a generic XmlNode, you can use innerText.
The node you create is not of kind xnText and so it does not have a text attribute. It does have a child (it can have multiple children), which is of kind xnText. Compare your code with the following:
import xmltree
var e = newElement("elem")
e.add newText("some text")
echo typeof e
echo e.kind
echo e
echo e.innerText
e.add newText(" and other text")
echo e.innerText
echo ""
var f = newText("some text")
echo typeof f
echo f.kind
echo f.text
output:
XmlNode
xnElement
<elem>some text</elem>
some text
some text and other text
XmlNode
xnText
some text
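If you would rather not hit the assertion at all, you can check the node kind before calling text. This is only a minimal sketch: textOrEmpty is a made-up name, not part of xmltree, and the set of kinds is copied from the assertion message in the stack trace above.

```nim
import xmltree

# Hypothetical helper (not part of xmltree): returns the node's own text
# when its kind carries one, and an empty string otherwise.
proc textOrEmpty(n: XmlNode): string =
  if n.kind in {xnText, xnComment, xnCData, xnEntity}:
    result = n.text
  else:
    result = ""

var e = newElement("elem")
e.add newText("some text")

echo e.textOrEmpty      # empty line: e is an xnElement
echo e[0].textOrEmpty   # some text: the first child is an xnText
```

Returning an empty string for elements is a design choice for this sketch; raising an error or returning an Option may suit your code better.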
I opted against innerText because it does indeed return the text of the node and all its children:
import xmltree
var e = newElement("child1")
e.add newText(" jibber")
var f = newElement("child2")
f.add newText(" jabber")
var x = newXmlTree("root", [e, f])
x.add newText(" some text")
echo x
echo x.innerText
gives:
<root><child1> jibber</child1><child2> jabber</child2> some text</root>
jibber jabber some text
and I just want:
some text
So I posit that the text proc is broken or, at the very least, that the xmltree library is missing a nodeText proc that returns the text as a string for that node only.
You can build your own proc that only returns text from children of kind xnText:
import xmltree
var e = newElement("child1")
e.add newText(" jibber")
var f = newElement("child2")
f.add newText(" jabber")
var x = newXmlTree("root", [e, f])
x.add newText(" some text")
proc directText(x: XmlNode): string =
  for child in x:
    if child.kind == xnText:
      result &= child.text
echo x
echo x.innerText
echo x.directText
var g = newElement("child4")
g.add newText(" jobber")
x.add g
x.add newText(" other text")
echo ""
echo x
echo x.innerText
echo x.directText
Output:
<root><child1> jibber</child1><child2> jabber</child2> some text</root>
jibber jabber some text
some text
<root><child1> jibber</child1><child2> jabber</child2> some text<child4> jobber</child4> other text</root>
jibber jabber some text jobber other text
some text other text
I had considered the same workaround. However, the purpose of the `text` proc is stated as: "Gets the associated text with the node n"
TBH I would rather spend a week trying to figure out how to fix xmltree and submit a PR than code a workaround for an API call that should just work as advertised. Just saying.
FYI
// nodejs
const {XMLParser, XMLValidator} = require('fast-xml-parser')
const parser = new XMLParser( {trimValues: false} )
const xml = `<root>some text<child1>jibber</child1><child2> jabber</child2> some more text</root>`
if (XMLValidator.validate(xml)) { // Proof you can have text in the root node
console.log(parser.parse(xml))
} else {
console.log('xml is invalid')
}
gives:
{
root: {
child1: 'jibber',
child2: ' jabber',
'#text': 'some text some more text'
}
}
No more buttered scones for me I'm off to play the xmltree
Thanks for your replies
Cheers
So I found the problem, which was that the text* getter was incomplete.
All node kinds (other than xnElement) have an fText property. That's why there was a guard assertion in place, and that assertion is what caused the crash when I tried to get the text of an xnElement.
xnElement stores its text as one or more xnText nodes in the sequence that's used to track its other child nodes. This was never being accessed by text*. This looks like the original author knew this was a problem, guarded against it, but never came back to fix it. (Probs too busy inventing a new programming language)
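The storage layout described here can be checked through the public API, without touching the private fText or s fields. A small sketch (the tree is made up for illustration):

```nim
import xmltree

# An element with one child element followed by one text node.
var x = newXmlTree("root", [newElement("child1")])
x.add newText(" some text")

echo x.len            # 2: the text node is a child like any other
for child in x:
  echo child.kind     # xnElement, then xnText
```

The text added to root shows up as a second child of kind xnText, next to child1.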
This will be the guts of my PR:
# proc text*(n: XmlNode): lent string {.inline.} =
#   assert n.k in {xnText, xnComment, xnCData, xnEntity}
#   result = n.fText
proc text*(n: XmlNode): string {.inline.} =
  case n.k
  of xnText, xnVerbatimText, xnComment, xnCData, xnEntity:
    result = n.fText
  of xnElement:
    result = ""
    for n in n.s:
      if n.k == xnText:
        result.add(n.fText)
      else:
        discard
  else:
    discard
Which means this:
var e = newElement("child1")
e.add newText(" jibber")
var f = newElement("child2")
f.add newText(" jabber")
var x = newXmlTree("root", [e, f])
x.add newText(" some text")
echo x
echo x.text
results in this:
<root><child1> jibber</child1><child2> jabber</child2> some text</root>
some text
"This looks like the original author knew this was a problem, guarded against it, but never came back to fix it."
Not quite, I realized a more complete solution goes into the realms of custom application logic.
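One way to see why this ends up as application logic: with mixed content, an element can have several text runs separated by child elements, and whether to join them, keep them apart, or only take the first is a per-application decision. A minimal sketch that keeps the runs separate, assuming a made-up helper name directTextParts (not part of xmltree) and the same mixed content as the Node.js example above:

```nim
import xmltree

# Hypothetical helper: collects each direct xnText child as its own item,
# so the caller decides how (or whether) to join them.
proc directTextParts(n: XmlNode): seq[string] =
  if n.kind != xnElement:
    return
  for child in n:
    if child.kind == xnText:
      result.add child.text

var x = newElement("root")
x.add newText("some text")
x.add newElement("child1")
x.add newText(" some more text")

echo x.directTextParts   # @["some text", " some more text"]
```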