Skip to main content

Fast Fourier Transform (FFT)

Fast Fourier Transform (FFT)

This section describes how the Cooley-Tukey fast Fourier transform works. As we learned in the previous section, the key is to select evaluation points that yield an efficient FFT algorithm.

Specifically, say we have ωF\omega \in F such that ωn=1\omega^{n} = 1, and ωr1\omega^r \neq 1 for any 0<r<n0 < r < n.

Put another way, all the values 1,ω,ω2,ω3,,ωn11, \omega, \omega^2, \omega^3, \dots, \omega^{n-1} are distinct and ωn=1\omega^n = 1.

Put yet another way, the group generated by ω\omega inside F×F^\times (written ω\langle \omega \rangle) has size nn.

We call such an ω\omega a primitive nn-th root of unity.

Suppose we have an ω\omega which is a primitive 2k2^kth root of unity and let Ak={1,ω,,ω2k1}A_k = \{ 1, \omega, \dots, \omega^{2^k - 1} \}.

The FFT algorithm will let us compute interpAk\mathsf{interp}_{A_k} for this set.

Actually, it is easier to see how it will let us compute the evalAk\mathsf{eval}_{A_k} algorithm efficiently.

We will describe an algorithm FFT(k,ω,f)\mathsf{FFT}(k, \omega, f) that takes as input

  • kNk \in \N
  • ωF\omega \in F a primitive 2k2^kth root of unity
  • fF[x]<2kf \in F[x]_{< 2^k} in dense coefficients form (i.e., as a vector of coefficients of length nn).

and outputs the vector of evaluations

[f(1),f(ω),f(ω2),f(ω2k1)][f(1), f(\omega), f(\omega^2) \dots, f(\omega^{2^k - 1})]

and does it in time O(k2k)O(k 2^k) (which is to say, nlognn \log n if n=2kn = 2^k).

Notice that naively, computing each evaluation f(ωi)f(\omega^i) using the coefficients of ff would require time O(n)O(n), and so computing all nn of them would require time O(n2)O(n^2).

The algorithm FFT(k,ω,f)\mathsf{FFT}(k, \omega, f) can be defined recursively as follows.

If k=0k = 0, then ω\omega is a primitive 11st root of unity, and ff is a polynomial of degree 00. That means ω=1\omega = 1 and also ff is a constant cFc \in F. So, we can immediately output the array of evaluations [c]=[f(1)][c] = [f(1)].

If k>0k > 0, then we will split ff into two polynomials, recursively call FFT\mathsf{FFT} on them, and reconstruct the result from the recursive calls.

To that end, define f0f_0 to be the polynomial whose coefficients are all the even-index coefficients of ff and f1f_1 the polynomial whose coefficients are all the odd-index coefficients of ff. In terms of the array representation, this just means splitting out every other entry into two arrays. So that can be done in time O(n)O(n).

Write f=i<2kcixif = \sum_{i < 2^k} c_i x^i, so that f0=i<2k1c2ixif_0 = \sum_{i < 2^{k - 1}} c_{2i} x^i and f1=i<2k1c2i+1xif_1 = \sum_{i < 2^{k-1}} c_{2i + 1} x^i. Then

f(x)=i<2kcixi=i<2k1c2ix2i+i<2k1c2i+1x2i+1=i<2k1c2i(x2)i+i<2k1c2i+1x(x2)i=i<2k1c2i(x2)i+xi<2k1c2i+1(x2)i=f0(x2)+xf1(x2)\begin{aligned} f(x) &= \sum_{i < 2^k} c_i x^i \\ &= \sum_{i < 2^{k-1}} c_{2i} x^{2i} + \sum_{i < 2^{k-1}} c_{2i + 1} x^{2i + 1} \\ &= \sum_{i < 2^{k-1}} c_{2i} (x^2)^i+ \sum_{i < 2^{k-1}} c_{2i + 1} x \cdot (x^2)^i \\ &= \sum_{i < 2^{k-1}} c_{2i} (x^2)^i+ x \sum_{i < 2^{k-1}} c_{2i + 1} (x^2)^i \\ &= f_0(x^2) + x f_1(x^2) \end{aligned}

Now, notice that if ω\omega is a 2k2^kth root of unity, then ω2\omega^2 is a 2k12^{k - 1}th root of unity. Thus we can recurse with FFT(k1,ω2,f0)\mathsf{FFT}(k - 1, \omega^2, f_0) and similarly for f1f_1. Let

[e0,0,,e0,2k11]=FFT(k1,ω2,f0)[e1,0,,e1,2k11]=FFT(k1,ω2,f1)\begin{aligned} [e_{0, 0}, \dots, e_{0, 2^{k-1} - 1}] &= \mathsf{FFT}(k-1, \omega^2, f_0) \\ [e_{1, 0}, \dots, e_{1, 2^{k-1} - 1}] &= \mathsf{FFT}(k-1, \omega^2, f_1) \end{aligned}

By assumption ei,j=fi((ω2)j)e_{i, j} = f_i((\omega^2)^j). So, for any jj we have

f(ωj)=f0((ω2)j)+ωjf1((ω2)j)\begin{aligned} f(\omega^j) &= f_0((\omega^2)^j) + \omega^j f_1((\omega^2)^j) \end{aligned}

Now, since jj may be larger than 2k112^{k-1} - 1, we need to reduce it mod 2k12^{k-1}, relying on the fact that if τ\tau is an nnth root of unity then τj=τjmodn\tau^j = \tau^{j \mod n} since τn=1\tau^n = 1. Thus, (ω2)j=(ω2)jmod2k1(\omega^2)^j = (\omega^2)^{j \mod 2^{k-1}} and so we have

f(ωj)=f0((ω2)jmod2k1)+ωjf1((ω2)jmod2k1)=e0,jmod2k1+ωje1,jmod2k1\begin{aligned} f(\omega^j) &= f_0((\omega^2)^{j \mod 2^{k-1}} ) + \omega^j f_1((\omega^2)^{j \mod 2^{k-1}}) \\ &= e_{0, j \mod 2^{k-1}} + \omega^j e_{1, j \mod 2^{k-1}} \end{aligned}

We can compute the array W=[1,ω,,ω2k1]W = [ 1, \omega, \dots, \omega^{2^k - 1}] in time O(n)O(n) (since each entry is the previous entry times ω\omega). Then we can compute each entry of the output in O(1)O(1) as

f(ωj)=e0,jmod2k1+W[j]e1,jmod2k1\begin{aligned} f(\omega^j) &= e_{0, j \mod 2^{k-1}} + W[j] \cdot e_{1, j \mod 2^{k-1}} \end{aligned}

There are nn such entries, so this takes time O(n)O(n).

This concludes the recursive definition of the algorithm FFT(k,ω,f)\mathsf{FFT}(k, \omega, f).

Algorithm: computing evalAk\mathsf{eval}_{A_k}

  • Input f=[c0,,c2k1]\mathsf{Input~} f = [c_0, \ldots, c_{2^k - 1}] the coefficients of polynomial f(x)=i<2kcixif(x) = \sum_{i < 2^k} c_i x^i
  • Compute W[1,ω,ω2,...,ω2k1]\mathsf{Compute~} W \gets \left[1, \omega, \omega^2, ..., \omega^{2^k - 1}\right]
  • FFT(k,ω,f)[f(1),f(ω),f(ω2),f(ω2k1)]\mathsf{FFT}(k, \omega, f) \rightarrow \left[f(1), f(\omega), f(\omega^2) \dots, f(\omega^{2^k - 1})\right]
    • if k==0\mathtt{if~} k == 0
      • return f\mathtt{return~} f
    • else\mathtt{else}
      • Compute f0=[c0,c2,...,c2k2]\mathsf{Compute~} f_0 = [c_0, c_2, ..., c_{2^k - 2}] the even coefficients of f,f, corresponding to f0(x)=i<2k1c2ixif_0(x) = \sum_{i < 2^{k - 1}} c_{2i} x^i
      • Compute f1=[c1,c3,...,c2k1]\mathsf{Compute~} f_1 = [c_1, c_3, ..., c_{2^k - 1}] the odd coefficients of f,f, corresponding to f1(x)=i<2k1c2i+1xif_1(x) = \sum_{i < 2^{k - 1}} c_{2i + 1} x^i
      • e0FFT(k1,ω2,f0)e_0 \gets \mathsf{FFT}(k - 1, \omega^2, f_0)
      • e1FFT(k1,ω2,f1)e_1 \gets \mathsf{FFT}(k - 1, \omega^2, f_1)
      • for j[0,2k1]\mathtt{for~} j \in [0, 2^k - 1]
        • Fje0,jmod2k1+W[j]e1,jmod2k1F_j \gets e_{0, j \mod 2^{k - 1}} + W[j] \cdot e_{1, j \mod 2^{k - 1}}
      • return F\mathtt{return~} F

Now let's analyze the time complexity. Let T(n)T(n) be the complexity on an instance of size nn (that is, for n=2kn = 2^k).

Looking back at what we have done, we have done

  • O(n)O(n) for computing f0f_0 and f1f_1
  • two recursive calls, each of size n/2n / 2
  • O(n)O(n) for computing the powers of ω\omega
  • O(n)O(n) for combining the results of the recursive calls

In total, this is O(n)+2T(n/2)O(n) + 2 T(n / 2). Solving this recurrence yields T(n)=O(n)logn=O(nlogn)T(n) = O(n) \cdot log n = O(n \log n). Basically, there are logn\log n recursions before we hit the base case, and each step takes time O(n)O(n). \square

Now, in practice there are ways to describe this algorithm non-recursively that have better concrete performance, but that's out of scope for this document. Read the code if you are interested.

Using the FFT algorithm to compute interpAk\mathsf{interp}_{A_k}

So far we have a fast way to compute evalAk(f)\mathsf{eval}_{A_k}(f) all at once, where AkA_k is the set of powers of a 2k2^kth root of unity ω\omega. For convenience let n=2kn = 2^k.

Now we want to go the other way and compute a polynomial given an array of evaluations. Specifically, nn evaluations [f(x0),f(x1),...,f(xn1)]\left[f(x_0), f(x_1), ..., f(x_{n - 1})\right] uniquely define a degree n1n - 1 polynomial. This can be written as a system of nn equations

f(x0)=c0+c1x0++cn1x0n1f(x1)=c0+c1x1++cn1x1n1f(xn1)=c0+c1xn1++cn1xn1n1,\begin{aligned} f(x_0) &= c_0 + c_1x_0 + \ldots + c_{n - 1}x_0^{n - 1} \\ f(x_1) &= c_0 + c_1x_1 + \ldots + c_{n - 1}x_1^{n - 1} \\ \vdots\\ f(x_{n - 1}) &= c_0 + c_1x_{n - 1} + \ldots + c_{n - 1}x_{n - 1}^{n - 1}, \\ \end{aligned}

which can be rewritten as a matrix vector product.

[f(x0)f(x1)f(xn1)]=[1x0x0n11x1x1n11xn1xn1n1]×[c0c1cn1]\begin{bmatrix} f(x_0) \\ f(x_1) \\ \vdots \\ f(x_{n - 1}) \end{bmatrix} = \begin{bmatrix} 1 & x_0 & \cdots & x_0^{n - 1} \\ 1 & x_1 & \cdots & x_1^{n - 1} \\ \vdots & \vdots & \ddots & \vdots\\ 1 & x_{n - 1} & \cdots & x_{n - 1}^{n - 1} \\ \end{bmatrix} \times \begin{bmatrix} c_{0} \\ c_{1} \\ \vdots \\ c_{n - 1} \end{bmatrix}

This n×nn \times n matrix is a Vandermonde matrix and it just so happens that square Vandermonde matrices are invertible, iff the xix_i are unique. Since we purposely selected our xix_i to be the powers of ω\omega, a primitive nn-th root of unity, by definition xi=ωix_i = \omega^i are unique.

Therefore, to compute the polynomial given the corresponding array of evaluations (i.e. interpolation) we can solve for the polynomial's coefficients using the inverse of the matrix.

[c0c1cn1]=[111n11ωωn11ωn1ω(n1)(n1)]1×[f(1)f(ω)f(ωn1)]\begin{bmatrix} c_{0} \\ c_{1} \\ \vdots \\ c_{n - 1} \end{bmatrix} = \begin{bmatrix} 1 & 1 & \cdots & 1^{n - 1} \\ 1 & \omega & \cdots & \omega^{n - 1} \\ \vdots & \vdots & \ddots & \vdots\\ 1 & \omega^{n - 1} & \cdots & \omega^{(n - 1)(n - 1)} \\ \end{bmatrix}^{-1} \times \begin{bmatrix} f(1) \\ f(\omega) \\ \vdots \\ f(\omega^{n - 1}) \end{bmatrix}

All we need now is the inverse of this matrix, which is slightly complicated to compute. I'm going to skip it for now, but if you have the details please make a pull request.

Substituting in the inverse matrix we obtain the equation for interpolation.

[c0c1cn1]=1n[111n11ω1ω(n1)1ω(n1)ω(n1)(n1)]×[f(1)f(ω)f(ωn1)]\begin{bmatrix} c_{0} \\ c_{1} \\ \vdots \\ c_{n - 1} \end{bmatrix} = \frac{1}{n} \begin{bmatrix} 1 & 1 & \cdots & 1^{n - 1} \\ 1 & \omega^{-1} & \cdots & \omega^{-(n - 1)} \\ \vdots & \vdots & \ddots & \vdots\\ 1 & \omega^{-(n - 1)} & \cdots & \omega^{-(n - 1)(n - 1)} \\ \end{bmatrix} \times \begin{bmatrix} f(1) \\ f(\omega) \\ \vdots \\ f(\omega^{n - 1}) \end{bmatrix}

Observe that this equation is nearly identical to the original equation for evaluation, except with the following substitution.

ωi1nω1i\omega^i \Rightarrow \frac{1}{n}\omega^{-1i}

Consequently and perhaps surprisingly, we can reuse the FFT algorithm evalAk\mathsf{eval}_{A_k} in order to compute the inverse-- interpAk\mathsf{interp}_{A_k}.

So, suppose we have an array [a0,,an1][a_0, \dots, a_{n-1}] of field elements (which you can think of as a function AkFA_k \to F) and we want to compute the coefficients of a polynomial ff with f(ωi)=aif(\omega^i) = a_i.

To this end, define a polynomial gg by g=j<najxjg = \sum_{j < n} a_j x^j. That is, the polynomial whose coefficients are the evaluations in our array that we're hoping to interpolate.

Now, let [e0,,en1]=FFT(k,ω1,g)[e_0, \dots, e_{n-1}] = \mathsf{FFT}(k, \omega^{-1}, g).

That is, we're going to feed gg into the FFT algorithm defined above with ω1\omega^{-1} as the 2k2^kth root of unity. It is not hard to check that if ω\omega is an n-th root of unity, so is ω1\omega^{-1}. Remember: the resulting values are the evaluations of gg on the powers of ω1\omega^{-1}, so ei=g(ωi)=j<najωije_i = g(\omega^{-i}) = \sum_{j < n} a_j \omega^{-ij}.

Now, let h=i<neixih = \sum_{i < n} e_i x^i. That is, re-interpret the values eie_i returned by the FFT as the coefficients of a polynomial. I claim that hh is almost the polynomial we are looking for. Let's calculate what values hh takes on at the powers of ω\omega.

h(ωs)=i<neiωsi=i<nωsij<najωij=i<nj<najωsiij=j<naji<nωi(sj)\begin{aligned} h(\omega^s) &= \sum_{i < n } e_i \omega^{si} \\ &= \sum_{i < n} \omega^{si} \sum_{j < n} a_j \omega^{-ij} \\ &= \sum_{i < n} \sum_{j < n} a_j \omega^{si-ij} \\ &= \sum_{j < n} a_j \sum_{i < n} \omega^{i(s-j)} \\ \end{aligned}

Now, let's examine the quantity cj:=i<nωi(sj)c_j := \sum_{i < n} \omega^{i(s-j)}. We claim that if j=sj = s, then cj=nc_j = n, and if jsj \neq s, then cj=0c_j = 0. The first claim is clear since

cs=i<nωi(ss)=i<nω0=i<n1=nc_s = \sum_{i<n} \omega^{i(s-s)} = \sum_{i < n} \omega^0 = \sum_{i<n} 1 = n

For the second claim, we will prove that ωsjcj=cj\omega^{s-j} c_j = c_j. This implies that (1ωsj)cj=0(1 - \omega^{s-j}) c_j = 0. So either 1ωsj=01 - \omega^{s-j} = 0 or cj=0c_j = 0. The former cannot be the case since it implies ωsj=1\omega^{s-j} =1 which in turn implies s=js = j which is impossible since we are in the case jsj \neq s. Thus we have cj=0c_j = 0 as desired.

So let's show that cjc_j is invariant under multiplication by ωsj\omega^{s-j}. Basically, it will come down to the fact that ωn=ω0\omega^n = \omega^0.

ωsjcj=ωsji<nωi(sj)=i<nωi(sj)+(sj)=i<n(ωi+1)sj=(ω0+1)sj+(ω1+1)sj++(ω(n1)+1)sj=(ω1)sj+(ω2)sj++(ωn)sj=(ω1)sj+(ω2)sj++(ω0)sj=(ω0)sj+(ω1)sj++(ωn1)sj=i<n(ωi)sj=cj\begin{aligned} \omega^{s-j} c_j &= \omega^{s-j} \sum_{i<n} \omega^{i(s-j)} \\ &= \sum_{i < n} \omega^{i(s-j) + (s-j)} \\ &= \sum_{i < n} (\omega^{i+1})^{s-j} \\ &= (\omega^{0+1})^{s-j} + (\omega^{1+1})^{s-j} + \dots + (\omega^{(n-1)+1})^{s-j} \\ &= (\omega^{1})^{s-j} + (\omega^{2})^{s-j} + \dots + (\omega^{n})^{s-j} \\ &= (\omega^{1})^{s-j} + (\omega^{2})^{s-j} + \dots + (\omega^{0})^{s-j} \\ &= (\omega^{0})^{s-j} + (\omega^{1})^{s-j} + \dots + (\omega^{n-1})^{s-j} \\ &= \sum_{i < n} (\omega^i)^{s-j} \\ &= c_j \end{aligned}

So now we know that

h(ωs)=j<najcj=asn+jsaj0=asn\begin{aligned} h(\omega^s) &= \sum_{j < n} a_j c_j \\ &= a_s \cdot n + \sum_{j \neq s} a_j \cdot 0 \\ &= a_s \cdot n \end{aligned}

So if we define f=h/nf = h / n, then f(ωs)=asf(\omega^s) = a_s for every ss as desired. Thus we have our interpolation algorithm, sometimes called an inverse FFT or IFFT:

Algorithm: computing interpAk\mathsf{interp}_{A_k}

  1. Input: [a0,,an1][a_0, \dots, a_{n-1}] the points we want to interpolate and ω\omega a nnth root of unity.
  2. Interpret the input array as the coefficients of a polynomial g=i<naixng = \sum_{i < n} a_i x^n.
  3. Let [e0,,en]=FFT(k,ω1,g)[e_0, \dots, e_n] = \mathsf{FFT}(k, \omega^{-1}, g).
  4. Output the polynomial i<n(ei/n)xi\sum_{i < n}(e_i / n) x^i. I.e., in terms of the dense-coefficients form, output the vector [e0/n,,en1/n][e_0 / n, \dots, e_{n - 1}/n].

Note that this algorithm also takes time O(nlogn)O(n \log n)

Takeaways

  • Polynomials can be represented as a list of coefficients or a list of evaluations on a set AA

  • If the set AA is the set of powers of a root of unity, there are time O(nlogn)O(n \log n) algorithms for converting back and forth between those two representations

  • In evaluations form, polynomials can be added and multiplied in time O(n)O(n)

    • TODO: caveat about hitting degree

Exercises

  • Implement types DensePolynomial<F: FfftField> and Evaluations<F: FftField> that wrap a Vec<F> and implement the FFT algorithms described above for converting between them

  • Familiarize yourself with the types and functions provided by ark_poly