nimforum mirror - Having problems porting a Python script for computing two cofactors of a number

Yardanico (orginal) [2020-05-13T09:19:11+02:00] view original

Hello! So I decided to start trying to implement MTProto protocol (Telegram) in Nim, did the most basic things so I can get a reply from the server for the first time.

To continue I need to decompose a number pq into two prime cofactors p and q, such as p < q. So basically I need to find the largest prime cofactor of a number and then divide the number by that cofactor to get the second one. The Python version I found runs really fast in under 100ms:

from random import randint
from math import gcd


def brent(N):
    if N % 2 == 0:
        return 2
    # y, c, m = randint(1, N - 1), randint(1, N - 1), randint(1, N - 1)
    y = 398563423414958448
    c = 259203014236224986
    m = 359097073186387306
    
    g, r, q = 1, 1, 1
    while g == 1:
        x = y
        for i in range(r):
            y = ((y * y) % N + c) % N
        k = 0
        while k < r and g == 1:
            ys = y
            for i in range(min(m, r - k)):
                y = ((y * y) % N + c) % N
                q = q * (abs(x - y)) % N
            g = gcd(q, N)
            k = k + m
        r *= 2
    if g == N:
        while True:
            ys = ((ys * ys) % N + c) % N
            g = gcd(abs(x - ys), N)
            if g > 1:
                break
    
    return g

def factors(n):
  a = brent(n)
  b = n // a
  return (a, b) if a < b else (b, a)

print(factors(1243804461473944423))

I tried my hand at porting it to Nim but I guess I failed:

import random, math

randomize()

proc brent*(n: uint64): uint64 =
  if n mod 2 == 0:
    return 2
  
  var
    # y = rand(1'u64 ..< n - 1)
    y = 398563423414958448'u64
    c = 259203014236224986'u64
    m = 359097073186387306'u64
    g, r, q = 1'u64
    x, k, ys = 0'u64
  
  while g == 1:
    x = y
    for i in 0 ..< r:
      y = ((y * y) mod n + c) mod n
    k = 0'u64
    while k < r and g == 1:
      ys = y
      for i in 0 ..< min(m, r - k):
        y = ((y * y) mod n + c) mod n
        q = q * uint64(abs(int64(x - y))) mod n
      g = gcd(q, n)
      k = k + m
    r *= 2
  if g == n:
    while true:
      ys = ((ys * ys) mod n + c) mod n
      g = gcd(uint64(abs(int64(x - ys))), n)
      if g > 1:
        break
  
  result = g

proc factors(n: uint64): (uint64, uint64) =
  let a = brent(n)
  let b = n div a
  if a < b: (a, b) else: (b, a)

echo factors(1243804461473944423'u64)

As far as I can see one of the issues if that there's an overflow when two big uint64's are multiplied. So can someone recommend a good way to port this algorithm, or maybe I can use some other algorithm which won't use such big numbers for multiplication? (I also tried porting https://en.wikipedia.org/wiki/Pollard%27s_rho_algorithm and succeeded, but it's quite slow)

miran (orginal) [2020-05-13T13:02:12+02:00] view original

As far as I can see one of the issues if that there's an overflow when two big uint64's are multiplied.

Since after the multiplication you're taking a modulo, these examples of modular arithmetic might come useful.

mratsim (orginal) [2020-05-13T14:42:16+02:00] view original

From https://github.com/mratsim/number-theory/blob/12c54a7476327cf9ab00989c9964732e3eb68be7/src/modular_arithmetic.nim

Modular addition

proc addmod*[T: SomeInteger](a, b, m: T): T =
  ## Modular addition
  
  let a_m = if a < m: a
            else: a mod m
  if b == 0.T:
    return a_m
  let b_m = if b < m: b
            else: b mod m
  
  # We don't do a + b to avoid overflows
  # But we know that m at least is inferior to biggest T
  
  let b_from_m = m - b_m
  if a_m >= b_from_m:
    return a_m - b_from_m
  return m - b_from_m + a_m

Modular multiplication

proc mulmod*[T: SomeInteger](a, b, m: T): T =
  ## Modular multiplication
  
  var a_m = a mod m
  var b_m = b mod m
  if b_m > a_m:
    swap(a_m, b_m)
  while b_m > 0.T:
    if b_m.isOdd:
      result = addmod(result, a_m, m)
    #a_m = doublemod(a_m, m)
    a_m = (a_m shl 1) - (if a_m >= (m - a): m else: 0)
    b_m = b_m shr 1

Modular inversion (via GCD, so can return the GCD)

proc invmod*[T:SomeInteger](a, m: T): T =
  ## Modular multiplication inverse
  ## Input:
  ##   - 2 positive integers a and m
  ## Result:
  ##   - An integer z that solves `az ≡ 1 mod m`
  # Adapted from Knuth, The Art of Computer Programming, Vol2 p342
  # and Menezes, Handbook of Applied Cryptography (HAC), p610
  # to avoid requiring signed integers
  # http://cacr.uwaterloo.ca/hac/about/chap14.pdf
  
  # Starting from the extended GCD formula (Bezout identity),
  # `ax + by = gcd(x,y)`
  # with input x,y and outputs a, b, gcd
  # We assume a and m are coprimes, i.e. gcd is 1, otherwise no inverse
  # `ax + my = 1`
  # `ax + my ≡ 1 mod m`
  # `ax ≡ 1 mod m``
  # Meaning we can use the Extended Euclid Algorithm
  # `ax + by` with
  # a = a, x = result, b = m, y = 0
  
  var
    a = a
    x = 1.T
    b = m
    y = 0.T
    oddIter = true # instead of requiring signed int, we keep track of even/odd iterations which would be in negative
  
  while b != 0.T:
    let
      (q, r) = divmod(a,b)
      t = x + q * y
    x = y; y = t; a = b; b = r
    oddIter = not oddIter
  
  if a != 1.T:
    # a now holds the gcd(a, m) and should equal 1
    raise newException(ValueError, "No modular inverse exists")
  
  if oddIter:
    return x
  return m - x

The whole point of proof-of-work is to have non-trivial computation. Factorizing small number is very fast hence you need them at a decent bit-width but if you're really desperate, I have an arbitrary bigint factorization algorithm in BrainFuck here ;): https://github.com/numforge/laser/blob/d1e6ae61/examples/ex07_jit_brainfuck_src/factor.bf

pietroppeter (orginal) [2020-05-13T19:06:28+02:00] view original

Hi, out of curiosity I decided to do a naive "port" of the algorithm to use https://github.com/def-/nim-bigints

Apart from the fact that doing that I ran into an issue with bigints (https://github.com/def-/nim-bigints/issues/27) the result (https://play.nim-lang.org/#ix=2lU5) works and on my machine compiled with -d:danger it takes a bit more than 500ms.

It might also be interesting to use https://github.com/status-im/nim-stint instead of bigints.

mratsim (orginal) [2020-05-13T22:41:46+02:00] view original

You need to use in-place addition and multiplication instead of out-of-place otherwise you will bottleneck on memory allocation/deallocation during all those loops with nim-bigint.

If it fits in an uint64, stint is not needed, but otherwise as it's all stack allocated you won't suffer from allocation, the tradeoff is that you cannot have an arbitrary size.

pietroppeter (orginal) [2020-05-14T00:44:35+02:00] view original

Yes, there is definitely room for improvements. I just wanted to check if using bigints with minimal changes would be working. And I am actually surprised it does, since I was not able to find a gcd prox overloaded for BigInt (I was expecting to have to provide an implementation). Not sure what is being used there. Also, I I think in place addition and multiplication are not exported in bigints.

Regarding the usage of stint, I would guess the actual Telegram protocol might go beyond uint64 and might have a specific size that one can work with.

Mirror of forum.nim-lang.org

6327 :: Having problems porting a Python script for computing two cofactors of a number