It has been a while since I last posted some new Nim stuff, so I thought this would be a good one to share. https://github.com/guzba/curly
Curly is a new HTTP client built on top of libcurl. What makes Curly interesting is that it enables running multiple HTTP requests in parallel while controlling how and when you want to block.
Some highlights are:
import curly

let curl = newCurly() # Best to start with a single long-lived instance

let response = curl.post("https://...", headers, body) # blocks until complete

var batch: RequestBatch
batch.post("https://...", headers, body)
batch.get("https://...")

for (response, error) in curl.makeRequests(batch): # blocks until all are complete
  if error == "":
    echo response.code
  else:
    # Something prevented a response from being received, maybe a connection
    # interruption, DNS failure, timeout etc. Error here contains more info.
    echo error
curl.startRequest("GET", "https://...") # doesn't block
# do whatever
var batch: RequestBatch
batch.get(url1)
batch.get(url2)
batch.get(url3)
batch.get(url4)
curl.startRequests(batch) # doesn't block
# do whatever
let (response, error) = curl.waitForResponse() # blocks until a request is complete
if error == "":
echo response.code
else:
echo error
# Or use `let answer = curl.pollForResponse()` and `if answer.isSome:`
By choosing what blocks and what doesn't, you can manage your program's control flow in whatever way makes sense for you.
My Mummy HTTP server mostly makes blocking requests to a handful of endpoints through one Curly instance that is used from many threads. This results in great connection re-use to keep latency as low as possible.
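A rough sketch of that shape (the route, port, and upstream URL here are made up, not the real server code): one shared Curly instance handles blocking calls from Mummy's handler threads.

import curly, mummy, mummy/routers

let curl = newCurly() # one long-lived instance shared by every worker thread

proc handler(request: Request) =
  # Blocking upstream call from the handler thread; connection re-use keeps
  # the added latency low.
  let upstream = curl.get("https://internal-api.example/thing") # placeholder URL
  var headers: HttpHeaders
  headers["Content-Type"] = "text/plain"
  request.respond(200, headers, upstream.body)

var router: Router
router.get("/", handler)

let server = newServer(router)
server.serve(Port(8080))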
I do however have one specific HTTP API call I need to make a lot that does not need to block the Mummy request handler. For this, I created a second Curly instance just for these requests, and use startRequests instead. I then have a thread that blocks reading responses and handles any cleanup necessary.
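A sketch of that second path (again with a made-up URL and only minimal cleanup): the fire-and-forget instance queues requests from handlers, and a dedicated thread drains the responses.

import curly
import std/typedthreads # for createThread; on older Nim this comes with --threads:on

let fireAndForget = newCurly() # separate instance just for these calls

proc responseReader() {.thread.} =
  while true:
    # Blocks until any outstanding request completes.
    let (response, error) = fireAndForget.waitForResponse()
    if error != "":
      echo "background request failed: ", error
    # ... whatever cleanup or logging `response` needs goes here

var readerThread: Thread[void]
createThread(readerThread, responseReader)

# From a request handler, queue the call and carry on without blocking:
fireAndForget.startRequest("GET", "https://internal-api.example/notify") # placeholder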
I have an example you can run to see the time difference between sequential and parallel HTTP requests here.
Running requests in parallel is obviously going to be much faster than running them sequentially.
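If you don't want to run it, the comparison boils down to something like this (the URL and request count are just placeholders, not the repo's benchmark):

import curly, std/[monotimes, times]

let curl = newCurly()
const url = "https://api.ipify.org" # any fast endpoint works
const n = 10

# Sequential: each request blocks before the next one starts.
let seqStart = getMonoTime()
for i in 0 ..< n:
  discard curl.get(url)
echo "sequential: ", getMonoTime() - seqStart

# Parallel: queue everything in a batch, then block until all complete.
var batch: RequestBatch
for i in 0 ..< n:
  batch.get(url)
let parStart = getMonoTime()
for (response, error) in curl.makeRequests(batch):
  discard
echo "parallel:   ", getMonoTime() - parStart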
Thanks for taking a look!
This is amazing work as always!
Not sure if you've documented this elsewhere, but what is the difference between curly and https://github.com/treeform/puppy? Is curly more intended for use on servers (running Linux), or does it work on Windows as well with the libcurl dll? Is it possible to include a similar batching API in Puppy as well?
Puppy's origin story is "I want to make easy cross-platform HTTP requests without -d:ssl and without extra stuff on Windows". Not really server-focused but it certainly works there.
Curly's origin story is "my server needs a good way to do lots of low-latency HTTPS RPC calls". This was a harder requirement.
Imo these are pretty different origins, which makes a difference, at least in where the two stand today. Also, when I worked a bit on Puppy I knew a lot less about all of these ways of solving the HTTPS problem, so there's that.
I think the APIs are moving closer together, but there is one fundamental difference: Curly is libcurl-centric, which means it'll always need a DLL on Windows. Puppy uses OS APIs on Windows to avoid this.
These differences kind of don't matter but kind of do. I've not thought enough about it, but a revisit of Puppy is due with the knowledge gained over the past couple of years. There is no reason it shouldn't have similar batching support etc.
Long vague answer but yeah Puppy has its niche imo and can improve too.
Sorry, I don't have a GitHub account, but I was able to add a proxy this way:
curly.nim
# Follow up to 10 redirects...
discard easyHandle.easy_setopt(OPT_MAXREDIRS, 10)
# New code for proxy (proxy from https://spys.en)
let proxy = "socks5://98.181.137.83:4145" # pass proxy string from main program as batch.get argument?
discard easyHandle.easy_setopt(OPT_PROXY, proxy.cstring)
# many proxies failed without this
discard easyHandle.easy_setopt(OPT_SSL_VERIFYHOST, 0)
discard easyHandle.easy_setopt(OPT_SSL_VERIFYPEER, 0)
main program
import curly, std/times

let curl = newCurly()

var batch: RequestBatch
for i in 0 ..< 3:
  batch.get("https://api.ipify.org")

for (response, error) in curl.makeRequests(batch, timeout = 30):
  if error == "":
    echo response.code, ' ', response.url, ' ', response.body
  else:
    echo error
Something like this, where you can set an independent proxy for each request, would be very useful for scraping:
for i in 0 ..< 3:
  let proxy = "..."
  batch.get("https://api.ipify.org", proxy = proxy)