nimforum mirror - How can I maintain the original HTML structure/DOM?

gradha [2014-06-15T00:39:34+02:00]

In the following example:


import htmlparser, xmltree, strtabs, streams

const
  test = """
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!--  This file is generated by Nimrod. -->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<body>
<img src="test.txt">
</body></html>
"""

proc mangle() =
  var
    html = test.new_string_stream.parse_html
    DID_CHANGE: bool
  
  for img in html.find_all("img"):
    let src = img.attrs["src"]
    if not src.is_nil:
      img.attrs["src"] = "Something else"
      DID_CHANGE = true
  
  if DID_CHANGE:
    echo "Did change, output:", html

when isMainModule: mangle()

an input HTML string extracted from Nimrod's own documentation generator is modified to change the URLs in its img tags. The output is:


Did change, output:<document>

<!--   This file is generated by Nimrod.  -->
<html xml:lang="en" lang="en" xmlns="http://www.w3.org/1999/xhtml">
<body>
<img src="Something else" />
</body></html>
</document>

The original XML declaration and doctype are removed, and the content is wrapped in a <document> element instead. This breaks the original rendering. How can I preserve the source structure?

Araq [2014-06-15T11:58:09+02:00]

Use something like:


const xmlBoilerplate = """
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
"""

proc renderHtml(n: PXmlNode): string =
  result = xmlBoilerplate
  for child in n: result.add(child)

It's certainly a kludge, but PXmlNode doesn't store doctypes and parseHtml ignores them too. To deal with this properly we would need a doctype module in addition to what we have.
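
A variation on the same kludge, as a minimal sketch rather than a definitive fix: instead of hard-coding the boilerplate, copy everything that precedes the first <html tag from the input and prepend it to the rendered tree. This assumes a current Nim standard library, where the node type is XmlNode rather than PXmlNode. The names splitProlog, renderPreservingProlog and sample are hypothetical and introduced only for illustration.

```nim
import std/[htmlparser, xmltree, strtabs, streams, strutils]

const sample = """
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<!--  This file is generated by Nimrod. -->
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<body>
<img src="test.txt">
</body></html>
"""

proc splitProlog(source: string): tuple[prolog, body: string] =
  ## Hypothetical helper: cuts `source` at the first "<html" (case-insensitive).
  ## Everything before the cut (XML declaration, doctype, comments) is the prolog.
  let idx = source.toLowerAscii.find("<html")
  if idx < 0:
    result = ("", source)  # no <html> tag found, nothing to preserve
  else:
    result = (source[0 ..< idx], source[idx .. ^1])

proc renderPreservingProlog(source, newSrc: string): string =
  ## Rewrites every img src to `newSrc` and re-attaches the original prolog.
  let (prolog, body) = splitProlog(source)
  let root = parseHtml(newStringStream(body))
  result = prolog
  if root.kind != xnElement:
    # Nothing element-like was parsed; emit it unchanged.
    result.add($root)
    return
  for img in root.findAll("img"):
    if img.attr("src").len > 0:  # attr() returns "" when the attribute is absent
      img.attrs["src"] = newSrc
  if root.tag == "document":
    # parseHtml wrapped several top-level nodes; emit only the children.
    for child in root: result.add($child)
  else:
    result.add($root)

when isMainModule:
  echo renderPreservingProlog(sample, "Something else")
```

The split is purely textual, so it would cut in the wrong place if a comment or the doctype itself contained the text <html. The part of the document after the cut is still re-serialized by xmltree, so its whitespace and attribute order may differ from the input.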

Troyholly [2015-01-03T07:09:05+01:00]

In my opinion, when using img src you must give the complete path to where the image is located. There are some code examples with structure at http://www.sitewired.com that may be helpful.
