First download gmp-5.0.1, uncompress it, and go in the gmp-5.0.1 directory. $ patch -p 0 -i /tmp/fft_patch.gmp-5.0.1 (assuming the patch is in /tmp) patching file mpn/generic/mul_fft.c patching file gmp-h.in patching file tfft.c $ ./configure; make $ gcc -I. tfft.c .libs/libgmp.a $ ./a.out 1000 1000 1500 # performs a product of 1000 limbs by 1000 limbs, # with a result mod B^r+1 with r>=1500 # and B=2^BITS_PER_MP_LIMB r=1536 # prints the actual value of r This patch adds the following interface to GMP (it does not change anything in the existing interface): A new structure "mp_fft_t" is defined: typedef struct { mp_size_t n; /* multiplication is performed mod 2^(n*GMP_NUMB_BITS)+1 */ unsigned long k; /* fft length is K = 2^k */ unsigned long l; /* allow to accumulate up to 2^l products */ unsigned long M; mp_size_t nprime; mp_ptr *d; /* array of K arrays of nprime+1 limbs */ } __mp_fft_struct; typedef __mp_fft_struct mp_fft_t[1]; Seven new functions are exported (their interface is preliminary and may change, please send any comments to zimmerma at loria dot fr): mp_size_t mpn_fft_init (mp_fft_t r, mp_size_t n0, unsigned long l) Initializes a FFT structure r, to perform multiplication modulo 2^(n*GMP_NUMB_BITS)+1, with n >= n0. Return the value of n. The 3rd argument indicates that up to 2^l products may be accumulated in the Fourier domain; in usual cases l = 0. void mpn_fft_transform (mp_fft_t r, mp_ptr ap, mp_size_t an) Puts in the FFT structure r the transform of {ap, an}. We should have an <= n, where n is the value returned when initializing r. void mpn_fft_mult (mp_fft_t r, mp_fft_t s) Puts in the FFT structure r the term to term product of r and s. void mpn_fft_add (mp_fft_t r, mp_fft_t s) Puts in the FFT structure r the term to term sum of r and s. void mpn_fft_sub (mp_fft_t r, mp_fft_t s) Puts in the FFT structure r the term to term difference of r and s. void mpn_fft_itransform (mp_ptr ap, mp_fft_t r) Puts in {ap, n+1} the inverse transform of the FFT structure r, where n is the value returned when initializing r. void mpn_fft_clear (mp_fft_t r) Frees the memory allocated by the FFT structure r. Examples: 1) a multiplication call mpn_mul (rp, ap, an, bp, bn) can be replaced by: mp_fft_t at, bt; mp_ptr tp; mp_size_t m; m = mpn_fft_init (at, an + bn, 0); mpn_fft_init (bt, an + bn, 0); mpn_fft_transform (at, ap, an); mpn_fft_transform (bt, bp, bn); mpn_fft_mult (at, bt); tp = malloc ((m + 1) * sizeof (mp_limb_t)); mpn_fft_itransform (tp, at); MPN_COPY (rp, tp, an + bn); mpn_fft_clear (at); mpn_fft_clear (bt); free (tp); 2) in some cases, we want to reuse several times a transformed representation. However, mpn_fft_mult() destroys its first argument. Here is a copy function, which assumes that both r and s were initialized with the same 2nd argument of mpn_fft_init: /* copy s into r, assumes that r was initialized with mpn_fft_init(r, n), with the same 'n' than for s */ void mpn_fft_copy (mp_fft_t r, mp_fft_t s) { MPN_COPY (r->d[0], s->d[0], s->K * (s->nprime + 1)); }