I am asked to introduce to my classmates the the Williams' p+1 algorithm to factor integers and I don't think I get it right. As far as I understand, this algorithm takes an integer N that factors into primes p,q (N=pq) where p+1 is B-smooth. I understand why, starting from these premises, the algorithm works (I've written proof), but I don't get how to correctly implement and use it. I think it must be implemented as follows:
- I take a, randomly, in interval [1,N-1]
- I compute x=gcd(a,N). If x !=1, then I return x (I don't get why we don't check first if x is prime because we don't actually know if N really is equal to p*q and x could be composed, right?)
- Normally, x == 1, so I have to compute y = gcd(V_M-2,N) where V_0 = 2, V_1 = a, V_n= aV_(n-1) - V_(n-2) . I have found a way to compute V_n doing matrix power modulus N but I don't know which M should I use (I've copied the Pollard's one, but I don't know if that works and why).
- If y!=1 and y!=N, I return y (again, as with the x, I think we should check that y is prime, am I right?). Else, just try another random a and start again.
So, that is mainly my question about the implementation, overall concerning the building of M, which might be related to the fact of p+1 B-smoothness I guess.
Regarding the usage, I really don't get in which cases I should use this method and which B should I take. I am going to leave my code in Python3 here and a case example that really is turning me crazy to see if you can give me a hand.
import random
from math import floor, log, gcd
def is_prime(n): #funcion que determina si un numero es primo
for d in range(2,n):
if n%d == 0:
return False
return True
def primes_leq(B): #funcion para obtener los primos que son menor o igual que B
l=[]
for i in range(2,B+1):
if is_prime(i):
l.append(i)
return l
def matrix_square(A, mod):
return mat_mult(A,A,mod)
def mat_mult(A,B, mod):
if mod is not None:
return [[(A[0][0]*B[0][0] + A[0][1]*B[1][0])%mod, (A[0][0]*B[0][1] + A[0][1]*B[1][1])%mod],
[(A[1][0]*B[0][0] + A[1][1]*B[1][0])%mod, (A[1][0]*B[0][1] + A[1][1]*B[1][1])%mod]]
def matrix_pow(M, power, mod):
#Special definition for power=0:
if power <= 0:
return [[1,0],[0,1]]
powers = list(reversed([True if i=="1" else False for i in bin(power)[2:]])) #Order is 1,2,4,8,16,...
matrices = [None for _ in powers]
matrices[0] = M
for i in range(1,len(powers)):
matrices[i] = matrix_square(matrices[i-1], mod)
result = None
for matrix, power in zip(matrices, powers):
if power:
if result is None:
result = matrix
else:
result = mat_mult(result, matrix, mod)
return result
def williams(N, B):
flag = False
while not flag :
a = random.randint(1,N-1)
print("a : " + str(a))
x = gcd(a,N)
print("x : " + str(x))
if x != 1:
return x
else :
M = 1
A = [[0,1],[-1,a]]
for p in primes_leq(B):
M *= p **(floor(log(N,p)))
print("voy por aquí")
C = matrix_pow(A,M,N)
V = 2*C[0][0]+ a*C[0][1]
y = gcd(V-2,N)
print("y : " + str(y))
if y != 1 and y != N:
flag = True
return y
Trying to test my implementation, I've tried to follow some examples to check if my factorization is working alright. For example, I've looked at https://members.loria.fr/PZimmermann/records/Pplus1.html and I've tried williams(2**439-1,10**5) and I'm getting 104110607 but I understand I should be getting 122551752733003055543 (as in the webpage). As far as I understand, both are primes that factor N=2**439-1, but isn't this fact just contradicting the hypothesis of N being a product of two primes p*q?
Thanks for your help, it'll be appreciated
2439−1 has more than two prime factors. If you’re not getting the one you want, you should divide by the one that you got and keep going on the quotient.