Hyperelliptic curve point counting record: 254 bit Jacobian
June 2008
Thanks to Dan Bernstein and
Nikki Pitcher for their suggestions.
Curve data
Let p=2^127-1 and C the genus 2 curve over
GF(p) defined by
y^2 = 31375376347971734085670496609836615726 +
27953605038214221253645981475511570657 x +
62420003626849852428332554437765277161 x^2 +
88005954939527387244239849284058473679 x^3 +
155062477294917469622604436777931982527 x^4 +
x^5
The characteristic polynomial of its Frobenius endomorphism
is
Z(T) = T^4 - s1 T^3 + s2 T^2 - p s1 T + p^2
with
s1 = -15671660075779706640
and
s2 = 86154286096042006774781271889300357630
Thus the number of points on the Jacobian of C and on
its twist are
N = 28948022309329048858559141044255323173240361153110759925351350586742398190080
Nt =
28948022309329048853226351460088630752886375018059729181000254746129230922240
This curve has been more or less randomly chosen, according to the
following requisites:
- The Kummer surface associated to C is rational,
when using the Theta-coordinates of [Gaudry07].
The corresponding parameters are
a = 70749273537019715197487696660857318100
b = 13297111293698997518530805629493053456
c = 31962724788629373720348362255515893454
d = 39961846205204383608694313460795530917
This implies, in particular that the group order of the Jacobian
is divisible by 16. On the other hand, this allows fast
arithmetic (see also Bernstein's ECC
talk).
- The prime p=2^127-1 is chosen so that the finite
field arithmetic is facilitated. This was not used in this
record-computation, but would make sense for using the curve in
cryptography.
- The curve C was selected among half a dozen of
curves, so that the divisions by 2,3,5,7 algorithms were
facilitated. This bias should be removed for proper cryptographic
curve construction.
Previous records
- Previous record was held by Sutherland:
in 2007, he computed some 188-bit genus 2 Jacobian over a prime
field. Note that this is not exactly point counting,
since only a curve whose quadratic twist has a Jacobian with a
smooth order can be counted with his method.
- The previous record was held by Gaudry
and Schost: in 2004, they
computed genus 2 Jacobians of 164 bits, over a prime field with
p=5*10^24+8503491 ~ 2^82.
- Previously, the record was held by Gaudry
and Harley: in
2000, they computed genus 2 Jacobians of 126 bits, over a prime
field with p=10^19+51 ~ 2^63.
Algorithm used
The algorithm used is a collection of variants and generalizations of
Schoof's algorithm for genus 2. The characteristic polynomial is
computed modulo several primes or prime powers. This modular information is
collected using the Chinese Remainder Theorem. Once the missing part
is small enough, a generic «square-root» algorithm is used. Most of
these algorithms are based on the work on 2004 with several
improvements in almost every place.
Everything is implemented in C++, using the NTL and the GMP libraries.
Powers of 2
Elements of 2^k-torsion are computed using a novel algorithm, that
makes use of the Kummer surface parametrization. This simplifies a
bit the lifting. We no longer use a general purpose Gröbner basis
implementation.
Powers of 3
Elements of 3^k-torsion are computed using a novel algorithm. At
each step, the algebraic system that has to be inverted is solved
using deformation techniques and Newton lifting of power series.
Again, we no longer use a general purpose Gröbner basis
implementation.
Powers of 5, 7
Elements of 5^k- and 7^k-torsion are computed. At each step, the
algebraic system that has to be inverted is solved using resultant
computations.
Other primes
Computations modulo primes have been enhanced; now it proceeds as
follows. First the full, exact, l-torsion ideal is computed. The
"refining of resultant" step has been modified to save time and
space, allowing larger torsions to be computed. Thereafter, the
Frobenius equation is tested against the full ideal, without trying
to factor it. The theoretical complexity is better, and this is also
better in practice for current sizes.
Birthday paradox algorithm
For the record computation, we did more modular computation than
needed, so that the 2-dimensional version of the algorithm was not
used. The classical "kangaroo" algorithm was used.
Timings
All times are in seconds on a single CPU of an AMD Opteron 250 at 2.4
GHz. Memory usage peaked around 8GB (the maximun available on our
computers).
2^k-torsion
Torsion |
Degree |
Halving |
Schoof |
Get Z mod |
2^5 |
16 |
0.19 |
0.04 |
2^2 |
2^6 |
32 |
0.25 |
0.29 |
2^3 |
2^7 |
64 |
0.7 |
0.75 |
2^4 |
2^8 |
128 |
2.0 |
1.8 |
2^6 |
2^9 |
256 |
6.5 |
4.6 |
2^7 |
2^10 |
512 |
20 |
12 |
2^8 |
2^11 |
1024 |
70 |
30 |
2^9 |
2^12 |
2048 |
243 |
71 |
2^10 |
2^13 |
4096 |
903 |
168 |
2^11 |
2^14 |
8192 |
3554 |
425 |
2^12 |
2^15 |
16384 |
13329 |
1026 |
2^13 |
2^16 |
32768 |
58751 |
2753 |
2^14 |
2^17 |
65536 |
257425 |
9842 |
2^15 |
3^k-torsion
Torsion |
Degree |
Halving |
Schoof |
Get Z mod |
3^2 |
6 |
543 |
0.1 |
3^2 |
3^3 |
6 |
57 |
0.1 |
3^3 |
3^4 |
18 |
59 |
0.2 |
3^4 |
3^5 |
54 |
265 |
1 |
3^5 |
3^6 |
162 |
1330 |
4 |
3^6 |
3^7 |
486 |
4905 |
11 |
3^7 |
3^8 |
1458 |
22642 |
44 |
3^8 |
3^9 |
4374 |
134530 |
211 |
3^9 |
5^k-torsion
Torsion |
Degree |
Halving |
Schoof |
Get Z mod |
5^2 |
10 |
1639 |
0.1 |
5^2 |
5^3 |
50 |
12334 |
2 |
5^3 |
5^4 |
250 |
221740 |
17 |
5^4 |
7^k-torsion
Torsion |
Degree |
Halving |
Schoof |
Get Z mod |
7^2 |
56 |
95975 |
2.7 |
7^2 |
Other primes
ell |
resultants |
reduction of res |
Frobeniuses |
end of Schoof |
11 | 974 | 411 | 148 | 45 |
13 | 3173 | 1174 | 312 | 203 |
17 | 19374 | 3685 | 1183 | 927 |
19 | 39198 | 12443 | 3382 | 2245 |
23 | 150776 | 40729 | 14581 | 12725 |
29 | 612602 | 134993 | 31456 | 37285 |
31 | 874886 | 178431 | 38213 | 44493 |
Combining this modular information gives s1 exactly, and s2 was
recovered using kangaroos in about 2 hours.
It is to be noted that computing so much modular information is far
from optimal: putting more pressure on the square-root part would
save some time. These timing tables are also benches for better
tuning in next computations.
A conclusion of these experiments is that counting one curve over
GF(2^127-1) can be done in about one CPU-month.
Future computations
Secure Kummer surface of 254 bits
This new record computation is the beginning of a search for a
secure Kummer surface over GF(2^127-1), possibly having
small coefficients, so that the pseudo-group law gets as fast as
possible. We also require both Jacobians of C and of its
twist to have an order that is 16 times a prime.
Larger record
With our current software, and a computer with 32 GB of RAM, we
expect to be able to count points of a Jacobian with about 330 bits.
Thanks
- INRIA for invitation of É. Schost in Spring 2008
- LORIA for invitation of D. J. Bernstein in Fall 2007
- SHARCNET for providing computer resources
- NSERC and the Canada Research
Chair Program for funding.
... and a few logos: