nimforum mirror - List comprehension

vega [2016-03-21T10:09:46+01:00]

Hi!

Here are some ideas for improving list comprehensions. The first concerns the type of the result. It is annoying to do by hand the work that the compiler should do, so I want to drop the explicit type annotation. Of course, auto doesn't work: it leads to the standard error raised when the compiler tries to infer the element type of the sequence:


lctest.nim(4, 24) Error: A nested proc can have generic parameters only when it is used as an operand to another routine and the types of the generic paramers can be inferred from the expected signature.

And when I try to inspect the code produced by the list comprehension macro from the future module, I get another error. Here it is: https://gist.github.com/vegansk/4ac21b238c4d4bf329da. Am I doing something wrong, or is this a compiler bug?

The second idea is to replace the sequence with an iterator. We can use sequtils.toSeq if we really need the sequence. We can also put heavy computations in lc, and they will be computed only if needed.
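The iterator idea can be illustrated with a minimal sketch. It uses a hand-written inline iterator rather than lc itself, and the names evenSquares, firstFew and materialised are hypothetical, chosen only for this example. It shows that a consumer can stop early without computing the remaining elements, and that sequtils.toSeq turns the same iterator into a seq when one is needed.

```nim
import sequtils

# Hypothetical stand-in for an iterator-producing comprehension:
# elements are computed one at a time, only when the consumer asks.
iterator evenSquares(n: int): int =
  for x in 1..n:
    if x mod 2 == 0:
      yield x * x

# Lazy use: the loop stops at the first value above 50, so the
# remaining elements up to n are never computed.
var firstFew: seq[int] = @[]
for v in evenSquares(1000):
  if v > 50: break
  firstFew.add(v)

# Eager use: materialise the whole sequence only when it is needed.
let materialised = toSeq(evenSquares(10))

echo firstFew      # @[4, 16, 36]
echo materialised  # @[4, 16, 36, 64, 100]
```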

So, what do you think about it?

vega [2016-03-22T10:13:33+01:00]

I figured out that the problem described in the gist is a compiler issue, so here it is: https://github.com/nim-lang/Nim/issues/3991.

But the main question is how to get the type of the result of the list comprehension. We must enumerate all the parts of the lc that unfold into loops and determine the types of their variables, and then infer the result type from these. So, how can we get the type of an expression in a macro? Is it possible?
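One possible direction, sketched here under assumptions: a macro whose parameter is declared typed receives an expression that has already been type-checked, and the macros module can then report its type via getTypeInst. The helper name typeName is hypothetical. This only works for expressions that type-check on their own; it does not by itself solve the lc case, where the yielded expression refers to loop variables that do not exist until the macro has generated the loops.

```nim
import macros

# Hypothetical helper, not part of the standard library: returns the
# static type of an already type-checked expression, rendered as a string.
macro typeName(e: typed): untyped =
  # `e` arrives semantically checked, so its type can be queried.
  newLit(repr(getTypeInst(e)))

echo typeName(1 + 2)             # int
echo typeName(@[(a: 1, b: 2)])   # the sequence's full type, as text
```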

Jehan [2016-03-23T20:06:41+01:00]

I've brought this up before, but this is another reason why I'm not a huge fan of lc for list comprehensions and instead prefer to use a simple template in conjunction with iterators:

template enumerate*(s: untyped): untyped =
  block:
    iterator `~tmp`(): auto = s
    var result = newSeq[type(`~tmp`())]()
    for item in `~tmp`():
      add(result, item)
    result

(Though it is a bit awkward that you can't hide local variables from an untyped statement block and need to resort to identifier obfuscation.)

While this is more verbose than lc, I don't think the verbosity here is necessarily bad and it integrates well with the rest of the language. Importantly, type inference works.

proc main =
  const n = 20
  let triangles = enumerate do:
    for x in 1..n:
      for y in x..n:
        for z in y..n:
          if x*x + y*y == z*z:
            yield (a: x, b: y, c: z)
  
  let even10 = enumerate do:
    for x in 1..10:
      if x mod 2 == 0:
        yield x
  
  echo triangles
  echo even10

main()
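To make the mechanism explicit, here is roughly what the even10 case amounts to when written out by hand. It is a sketch, not the literal expansion the compiler produces. The iterator name evenTmp is a hypothetical stand-in for the obfuscated identifier in the template, and typeof is the current spelling of the type(...) call used above.

```nim
# The yielding block becomes the body of a local inline iterator.
iterator evenTmp(): int =
  for x in 1..10:
    if x mod 2 == 0:
      yield x

# The element type is taken from the iterator itself, so the caller
# never has to spell it out.
var even10 = newSeq[typeof(evenTmp())]()
for item in evenTmp():
  even10.add(item)

echo even10  # @[2, 4, 6, 8, 10]
```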

Arrrrrrrrr [2016-03-23T20:17:21+01:00]

Your enumerate could be included in the future module, simple and useful.

Jehan [2016-03-24T00:00:38+01:00]

Well, one problem would be what to name it. It's fine for my personal stdlib, but it's a very general name to use for very specific functionality.

andrea [2016-03-24T09:35:36+01:00]

What about loop? It seems not to be taken yet and would read nicely.

Arrrrrrrrr [2016-03-24T11:06:25+01:00]

When I read loop, I translate it to "while true". Enumerate (for me) is fine.

cblake [2016-03-24T16:54:29+01:00]

I agree "enumerate" is generic (and used in various other ways by various other languages). Since the construct as-used is really kind of a double/triple -- (templateNameTBD, do:, yield) and also always results in a seq, perhaps "seqYielded" would be the best name? It's only one more character than "enumerate" and about as explicit/obvious as one can get, at least to me -- it directly alludes to both what is being constructed and how.

andrea [2016-03-24T17:52:31+01:00]

I don't think that would read nicely:

let triangles = seqYielded do:
  for x in 1..n:
    for y in x..n:
      for z in y..n:
        if x*x + y*y == z*z:
          yield (a: x, b: y, c: z)

I think in this context loop is less weird than in isolation:

let triangles = loop do:
  for x in 1..n:
    for y in x..n:
      for z in y..n:
        if x*x + y*y == z*z:
          yield (a: x, b: y, c: z)

But enumerate is also fine.

cblake [2016-03-24T20:27:56+01:00]

At its core, this thing is a seq constructor. That always requires a loop, while a loop even with a yield does not always relate to a seq (e.g., inside some actual iterator it is just dynamic values possibly handled one at a time by the caller). So, I think "loop" here is just about as overly generic as "enumerate". We can agree to disagree - I am just backing up my recommendation. Another possibility might be "seqFrom" or "seqFromLoops" or something like that. I don't think brevity of the name is more important than clarity.

andrea [2016-03-25T09:49:17+01:00]

So maybe collect? :-)

Arrrrrrrrr [2016-03-25T10:44:23+01:00]

Collect is perfect for the task it performs.

cblake [2016-03-25T11:48:08+01:00]

collect as a verb is fine - I don't care much about the verb. You could even go all the way to comprehend which might let people find it more easily with partial substring searches of comprehen* in theindex.html.

However, correctly indicating the output type in the name somehow just feels best to me. This is especially true here because the let/var type inference works well in Jehan's approach. So, there may be no other indication to the reader what type of collection is made. Maybe this approach could be extended to be able to do "table comprehensions" or "set comprehensions" or even singly- and doubly-linked list comprehensions from the stdlib's collections.lists. Then we would need like 5 names and seqCollect, tableCollect, etc. would all be a natural family. (In this light, lc/ListComprehension in future.nim is already poorly named - it should be seqComprehension since output is a Nim seq not a Haskell singly-linked list or Python array list.)

Alternately, one simpler name like collect could work, but maybe have users provide the output data structure as an argument to the template. Then maybe any collection type with an appropriate add proc would work with the construct automatically. Reusing/presizing the output might then also become a possibility.
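For readers on a recent Nim, the last variant (a single collect that takes the output container's initializer as an argument) can be tried directly, assuming a release that ships the collect macro in std/sugar. The sketch below is illustrative only: the variable names are hypothetical, and the body's final expression takes the place of yield. A braced value such as {x} builds a set, and a braced pair such as {x: x * x} builds a table.

```nim
import std/[sugar, sets, tables]

const n = 20

# seq: the element type (a named tuple) is inferred from the expression.
let triangles = collect(newSeq):
  for x in 1..n:
    for y in x..n:
      for z in y..n:
        if x*x + y*y == z*z:
          (a: x, b: y, c: z)

# HashSet: a braced single value selects set-style insertion.
let evenSet = collect(initHashSet()):
  for x in 1..10:
    if x mod 2 == 0: {x}

# Table: a braced key/value pair selects table-style insertion; the
# initializer's argument (an initial size) is passed through.
let squares = collect(initTable(4)):
  for x in 1..3: {x: x * x}

echo triangles.len   # 6
echo 4 in evenSet    # true
echo squares[3]      # 9
```

Because the let/var type is still inferred, the initializer in the call is the only place where the reader sees which kind of collection is being built, which is the point raised above about indicating the output type.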
