nimforum mirror - goalkicker.com and httpclient

AIR [2021-08-13T04:49:32+02:00]

Taking a break from working on nim-cocoa, I wanted to see how difficult it would be to pull down all of the free books from goalkicker.com.

Got it working, just wondering if there was anything I could have done differently?

import httpclient, htmlparser, xmltree, strutils, os

let site = "https://books.goalkicker.com/"
var client = newHttpClient()
let src = client.getContent(site)
var fileName: string

createDir("books")

let html = src.parseHtml

for a in html.findAll("div"):
    if a.attr("class").startsWith("bookContainer"):
        for b in a.findAll("a"):
            let bookPage = client.getContent(site & b.attr("href"))
            let book = bookPage.parseHtml
            for c in book.findAll("div"):
                if c.attr("id") == "header":
                    fileName = c.innerText.replace(" book",".pdf")
                    if fileName.startsWith('.'): fileName = "DOT" & fileName
                if c.attr("id") == "footer":
                    var dlFile = c.innerText.split()[0]
                    var pdfFile = site & b.attr("href") & dlFile
                    echo "Downloading $#..." % [fileName]
                    client.downloadFile(pdfFile, "books/" & fileName)

AIR.

xigoi [2021-08-13T07:55:13+02:00]

I recommend using nimquery instead of a bunch of nested ifs.

AIR [2021-08-14T00:55:31+02:00]

Would you be able to show an example of how you would achieve the same result using nimquery?

xigoi [2021-08-14T02:36:22+02:00]

Something like

import nimquery

# Select the links inside every div whose class starts with "bookContainer".
for b in html.querySelectorAll("div[class^='bookContainer'] > a"):
    let bookPage = client.getContent(site & b.attr("href"))
    let book = bookPage.parseHtml
    # The header text gives the title; turn it into a file name.
    let header = book.querySelector("div#header")
    fileName = header.innerText.replace(" book",".pdf")
    if fileName.startsWith('.'): fileName = "DOT" & fileName
    # The first word of the footer text is the PDF's file name on the server.
    let footer = book.querySelector("div#footer")
    let dlFile = footer.innerText.split()[0]
    let pdfFile = site & b.attr("href") & dlFile
    echo "Downloading $#..." % [fileName]
    client.downloadFile(pdfFile, "books/" & fileName)
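
One thing neither version does is check that the header and footer were actually found before reading their text. Below is a minimal offline sketch of that check using only the standard library. The helpers pdfName and divById and the page fragment are made up for illustration and are not part of the thread or of any library. The stated reason for the "DOT" prefix is a guess.

```nim
import htmlparser, xmltree, strutils

# Hypothetical helper (not from the thread): derive the local file name
# from the header text, e.g. "Python book" becomes "Python.pdf".
proc pdfName(headerText: string): string =
  result = headerText.strip.replace(" book", ".pdf")
  # Assumption: the "DOT" prefix is there so that titles starting with '.'
  # do not end up as hidden dotfiles.
  if result.startsWith('.'):
    result = "DOT" & result

# Hypothetical helper: first <div> with the given id, or nil if none exists.
proc divById(root: XmlNode, id: string): XmlNode =
  for d in root.findAll("div"):
    if d.attr("id") == id:
      return d
  return nil

# Made-up fragment standing in for a real book page.
let page = parseHtml("""<html><body>
<div id="header">.NET Framework book</div>
<div id="footer">ExampleNotes.pdf 10 pages</div>
</body></html>""")

let header = page.divById("header")
let footer = page.divById("footer")
if header.isNil or footer.isNil:
  # Skip pages that do not have the expected layout instead of crashing.
  echo "unexpected page layout, skipping"
else:
  echo "local name:  ", pdfName(header.innerText)
  echo "remote name: ", footer.innerText.split()[0]
```

nimquery's querySelector also returns nil when nothing matches, so the same isNil check should carry over to the loop above.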

AIR [2021-08-14T06:47:08+02:00]

That's very cool, @xigoi!

Thanks for sharing, I didn't know about nimquery...

Mirror of forum.nim-lang.org

8321 :: goalkicker.com and httpclient