This section describes how the Cooley-Tukey fast Fourier transform works. As we
learned in the previous section, the key is to select evaluation points that
yield an efficient FFT algorithm.
Specifically, say we have $\omega \in F$ such that $\omega^n = 1$ and $\omega^r \neq 1$ for any $0 < r < n$.
Put another way, all the values $1, \omega, \omega^2, \omega^3, \dots, \omega^{n-1}$ are distinct and $\omega^n = 1$.
Put yet another way, the group generated by $\omega$ inside $F^\times$ (written $\langle \omega \rangle$) has size $n$.
We call such an $\omega$ a primitive $n$-th root of unity.
Suppose we have an $\omega$ which is a primitive $2^k$-th root of unity, and let
$A_k = \{1, \omega, \dots, \omega^{2^k - 1}\}$.
The FFT algorithm will let us compute $\mathsf{interp}_{A_k}$ for this set.
Actually, it is easier to first see how it will let us compute $\mathsf{eval}_{A_k}$ efficiently.
We will describe an algorithm $\mathsf{FFT}(k, \omega, f)$ that takes as input

- $k \in \mathbb{N}$
- $\omega \in F$, a primitive $2^k$-th root of unity
- $f \in F[x]_{< 2^k}$ in dense coefficient form (i.e., as a vector of coefficients of length $n = 2^k$)

and outputs the vector of evaluations

$$[f(1), f(\omega), f(\omega^2), \dots, f(\omega^{2^k - 1})]$$

and does it in time $O(k 2^k)$ (which is to say, $O(n \log n)$ if $n = 2^k$).
Notice that naively, computing each evaluation $f(\omega^i)$ using the coefficients of $f$ would require time $O(n)$, and so computing all $n$ of them would require time $O(n^2)$.
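For comparison, the naive approach can be sketched as follows (a minimal Rust sketch over plain integers; the helper names are illustrative, not from any library):

```rust
// Naive multipoint evaluation: Horner's rule gives O(n) per point,
// so evaluating at all n points costs O(n^2) overall.
fn eval(coeffs: &[i64], x: i64) -> i64 {
    // Horner's rule: c0 + x*(c1 + x*(c2 + ...)).
    coeffs.iter().rev().fold(0, |acc, &c| acc * x + c)
}

fn eval_all(coeffs: &[i64], points: &[i64]) -> Vec<i64> {
    points.iter().map(|&x| eval(coeffs, x)).collect()
}

fn main() {
    let f = [1i64, 0, 2]; // f(x) = 1 + 2x^2
    assert_eq!(eval_all(&f, &[0, 1, 2, 3]), vec![1, 3, 9, 19]);
    println!("naive evaluation: O(n) per point, O(n^2) total");
}
```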
The algorithm $\mathsf{FFT}(k, \omega, f)$ can be defined recursively as follows.

If $k = 0$, then $\omega$ is a primitive 1st root of unity, and $f$ is a polynomial of degree 0. That means $\omega = 1$ and $f$ is a constant $c \in F$. So, we can immediately output the array of evaluations $[c] = [f(1)]$.
If $k > 0$, then we will split $f$ into two polynomials, recursively call FFT on them, and reconstruct the result from the recursive calls.

To that end, define $f_0$ to be the polynomial whose coefficients are all the even-index coefficients of $f$, and $f_1$ the polynomial whose coefficients are all the odd-index coefficients of $f$. In terms of the array representation, this just means splitting out every other entry into two arrays, so it can be done in time $O(n)$.

Write $f = \sum_{i < 2^k} c_i x^i$, so that $f_0 = \sum_{i < 2^{k-1}} c_{2i} x^i$ and $f_1 = \sum_{i < 2^{k-1}} c_{2i+1} x^i$. Then

$$f(x) = f_0(x^2) + x \cdot f_1(x^2)$$

Note that $\omega^2$ is a primitive $2^{k-1}$-th root of unity, so we can recursively compute $e_0 = \mathsf{FFT}(k-1, \omega^2, f_0)$ and $e_1 = \mathsf{FFT}(k-1, \omega^2, f_1)$. By assumption, $e_{i,j} = f_i((\omega^2)^j)$. So, for any $j$ we have

$$f(\omega^j) = f_0((\omega^2)^j) + \omega^j \cdot f_1((\omega^2)^j)$$
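The even/odd split and the identity $f(x) = f_0(x^2) + x \cdot f_1(x^2)$ can be checked numerically. Here is a minimal Rust sketch over plain integers (the helper names are illustrative assumptions):

```rust
// Check that f(x) = f0(x^2) + x * f1(x^2), where f0 holds the even-index
// coefficients of f and f1 the odd-index ones.
fn eval(coeffs: &[i64], x: i64) -> i64 {
    // Horner's rule over the integers.
    coeffs.iter().rev().fold(0, |acc, &c| acc * x + c)
}

fn main() {
    let f = [1i64, 2, 3, 4]; // f(x) = 1 + 2x + 3x^2 + 4x^3
    // Splitting out every other entry takes one O(n) pass.
    let f0: Vec<i64> = f.iter().copied().step_by(2).collect(); // [1, 3]
    let f1: Vec<i64> = f.iter().copied().skip(1).step_by(2).collect(); // [2, 4]
    for x in -3..=3 {
        assert_eq!(eval(&f, x), eval(&f0, x * x) + x * eval(&f1, x * x));
    }
    println!("split identity holds");
}
```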
Now, since $j$ may be larger than $2^{k-1} - 1$, we need to reduce it mod $2^{k-1}$, relying on the fact that if $\tau$ is an $n$-th root of unity then $\tau^j = \tau^{j \bmod n}$, since $\tau^n = 1$. Thus $(\omega^2)^j = (\omega^2)^{j \bmod 2^{k-1}}$, and so we have

$$f(\omega^j) = f_0((\omega^2)^{j \bmod 2^{k-1}}) + \omega^j \cdot f_1((\omega^2)^{j \bmod 2^{k-1}})$$

We can compute the array $W = [1, \omega, \dots, \omega^{2^k - 1}]$ in time $O(n)$ (since each entry is the previous entry times $\omega$). Then we can compute each entry of the output in time $O(1)$ as

$$f(\omega^j) = e_{0,\, j \bmod 2^{k-1}} + W[j] \cdot e_{1,\, j \bmod 2^{k-1}}$$

There are $n$ such entries, so this takes time $O(n)$.
This concludes the recursive definition of the algorithm $\mathsf{FFT}(k, \omega, f)$.
Algorithm: computing $\mathsf{eval}_{A_k}$

Input: $f = [c_0, \dots, c_{2^k - 1}]$, the coefficients of the polynomial $f(x) = \sum_{i < 2^k} c_i x^i$

$\mathsf{FFT}(k, \omega, f) \to [f(1), f(\omega), f(\omega^2), \dots, f(\omega^{2^k - 1})]$:

- if $k = 0$:
  - return $f$
- else:
  - Compute $W \leftarrow [1, \omega, \omega^2, \dots, \omega^{2^k - 1}]$
  - Compute $f_0 = [c_0, c_2, \dots, c_{2^k - 2}]$, the even coefficients of $f$, corresponding to $f_0(x) = \sum_{i < 2^{k-1}} c_{2i} x^i$
  - Compute $f_1 = [c_1, c_3, \dots, c_{2^k - 1}]$, the odd coefficients of $f$, corresponding to $f_1(x) = \sum_{i < 2^{k-1}} c_{2i+1} x^i$
  - $e_0 \leftarrow \mathsf{FFT}(k-1, \omega^2, f_0)$
  - $e_1 \leftarrow \mathsf{FFT}(k-1, \omega^2, f_1)$
  - for $j \in \{0, \dots, 2^k - 1\}$:
    - $F_j \leftarrow e_{0,\, j \bmod 2^{k-1}} + W[j] \cdot e_{1,\, j \bmod 2^{k-1}}$
  - return $F$
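As a concrete illustration, here is a direct transcription of the recursive algorithm in Rust, using the small prime field $\mathbb{F}_{17}$ as a stand-in; the modulus, the helper names, and the particular roots of unity are assumptions made for this sketch:

```rust
// A sketch of the recursive FFT over the prime field F_17, represented as
// u64 values reduced mod P. In F_17, 13 is a primitive 4th root of unity
// (13^2 = 16 = -1 and 13^4 = 1) and 9 is a primitive 8th root of unity.
const P: u64 = 17;

fn add(a: u64, b: u64) -> u64 { (a + b) % P }
fn mul(a: u64, b: u64) -> u64 { (a * b) % P }

/// FFT(k, w, f): evaluate f, given by its 2^k coefficients, at the points
/// 1, w, w^2, ..., w^(2^k - 1), where w is a primitive 2^k-th root of unity.
fn fft(k: u32, w: u64, f: &[u64]) -> Vec<u64> {
    let n = 1usize << k;
    assert_eq!(f.len(), n);
    if k == 0 {
        // Base case: a constant polynomial; its only evaluation is f(1).
        return f.to_vec();
    }
    // Split into even- and odd-index coefficients (one O(n) pass).
    let f0: Vec<u64> = f.iter().copied().step_by(2).collect();
    let f1: Vec<u64> = f.iter().copied().skip(1).step_by(2).collect();
    // w^2 is a primitive 2^(k-1)-th root of unity.
    let e0 = fft(k - 1, mul(w, w), &f0);
    let e1 = fft(k - 1, mul(w, w), &f1);
    // Combine: f(w^j) = e0[j mod 2^(k-1)] + w^j * e1[j mod 2^(k-1)].
    let half = n / 2;
    let mut wj = 1u64; // running power w^j, i.e., the array W built lazily
    let mut out = Vec::with_capacity(n);
    for j in 0..n {
        out.push(add(e0[j % half], mul(wj, e1[j % half])));
        wj = mul(wj, w);
    }
    out
}

fn main() {
    // f(x) = 3 + 2x + x^2 over F_17, evaluated at 1, 13, 13^2 = 16, 13^3 = 4.
    let evals = fft(2, 13, &[3, 2, 1, 0]);
    assert_eq!(evals, vec![6, 11, 2, 10]);
    println!("{:?}", evals);
}
```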
Now let's analyze the time complexity. Let $T(n)$ be the complexity on an instance of size $n$ (that is, for $n = 2^k$). Looking back at what we have done, we have

- $O(n)$ for computing $f_0$ and $f_1$
- two recursive calls, each on an instance of size $n/2$
- $O(n)$ for computing the powers of $\omega$
- $O(n)$ for combining the results of the recursive calls

In total, this gives the recurrence $T(n) = O(n) + 2T(n/2)$. Solving it yields $T(n) = O(n) \cdot \log n = O(n \log n)$. Basically, there are $\log n$ levels of recursion before we hit the base case, and the work at each level totals $O(n)$.
□
Now, in practice there are ways to describe this algorithm non-recursively that
have better concrete performance, but that's out of scope for this document.
Read the code if you are interested.
So far we have a fast way to compute $\mathsf{eval}_{A_k}(f)$ all at once, where $A_k$ is the set of powers of a $2^k$-th root of unity $\omega$. For convenience, let $n = 2^k$.
Now we want to go the other way and compute a polynomial given an array of evaluations. Specifically, $n$ evaluations $[f(x_0), f(x_1), \dots, f(x_{n-1})]$ uniquely define a polynomial of degree at most $n - 1$. This can be written as a system of $n$ equations:

$$
\begin{pmatrix}
1 & x_0 & x_0^2 & \cdots & x_0^{n-1} \\
1 & x_1 & x_1^2 & \cdots & x_1^{n-1} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
1 & x_{n-1} & x_{n-1}^2 & \cdots & x_{n-1}^{n-1}
\end{pmatrix}
\begin{pmatrix}
c_0 \\ c_1 \\ \vdots \\ c_{n-1}
\end{pmatrix}
=
\begin{pmatrix}
f(x_0) \\ f(x_1) \\ \vdots \\ f(x_{n-1})
\end{pmatrix}
$$

This $n \times n$ matrix is a Vandermonde matrix, and it just so happens that a square Vandermonde matrix is invertible if and only if the $x_i$ are distinct. Since we purposely selected our $x_i = \omega^i$ to be the powers of $\omega$, a primitive $n$-th root of unity, they are indeed distinct.

Therefore, to compute the polynomial given the corresponding array of evaluations (i.e., to interpolate), we can solve for the polynomial's coefficients using the inverse of this matrix.
All we need now is the inverse of this matrix, which is slightly complicated to
compute. I'm going to skip it for now, but if you have the details please make a
pull request.
Substituting in the inverse matrix, we obtain the equation for interpolation. Observe that this equation is nearly identical to the original equation for evaluation, except with the following substitution:

$$\omega^i \Rightarrow \frac{1}{n} \omega^{-i}$$

Consequently, and perhaps surprisingly, we can reuse the FFT algorithm for $\mathsf{eval}_{A_k}$ in order to compute the inverse, $\mathsf{interp}_{A_k}$.
So, suppose we have an array $[a_0, \dots, a_{n-1}]$ of field elements (which you can think of as a function $A_k \to F$) and we want to compute the coefficients of a polynomial $f$ with $f(\omega^i) = a_i$.

To this end, define a polynomial $g$ by $g = \sum_{j < n} a_j x^j$. That is, $g$ is the polynomial whose coefficients are the evaluations in the array we are hoping to interpolate.

Now, let $[e_0, \dots, e_{n-1}] = \mathsf{FFT}(k, \omega^{-1}, g)$. That is, we feed $g$ into the FFT algorithm defined above with $\omega^{-1}$ as the $2^k$-th root of unity. It is not hard to check that if $\omega$ is a primitive $n$-th root of unity, then so is $\omega^{-1}$. Remember: the resulting values are the evaluations of $g$ on the powers of $\omega^{-1}$, so

$$e_i = g(\omega^{-i}) = \sum_{j < n} a_j \omega^{-ij}$$
Now, let $h = \sum_{i < n} e_i x^i$. That is, re-interpret the values $e_i$ returned by the FFT as the coefficients of a polynomial. I claim that $h$ is almost the polynomial we are looking for. Let's calculate the values $h$ takes on at the powers of $\omega$:

$$h(\omega^s) = \sum_{i < n} e_i \omega^{is} = \sum_{i < n} \sum_{j < n} a_j \omega^{-ij} \omega^{is} = \sum_{j < n} a_j \sum_{i < n} \omega^{i(s - j)}$$
Now, let's examine the quantity $c_j := \sum_{i < n} \omega^{i(s-j)}$. We claim that if $j = s$, then $c_j = n$, and if $j \neq s$, then $c_j = 0$. The first claim is clear, since

$$c_s = \sum_{i < n} \omega^{i(s-s)} = \sum_{i < n} \omega^0 = \sum_{i < n} 1 = n$$

For the second claim, we will prove that $\omega^{s-j} c_j = c_j$. This implies that $(1 - \omega^{s-j}) c_j = 0$. So either $1 - \omega^{s-j} = 0$ or $c_j = 0$. The former cannot be the case, since it implies $\omega^{s-j} = 1$, which in turn implies $s = j$, which is impossible since we are in the case $j \neq s$. Thus we have $c_j = 0$, as desired.
So let's show that $c_j$ is invariant under multiplication by $\omega^{s-j}$. Basically, it comes down to the fact that $\omega^n = \omega^0$:

$$\omega^{s-j} c_j = \sum_{i < n} \omega^{(i+1)(s-j)} = \sum_{0 < i \le n} \omega^{i(s-j)} = \sum_{i < n} \omega^{i(s-j)} = c_j$$

where the last step re-indexes the $i = n$ term as the $i = 0$ term, using $\omega^{n(s-j)} = (\omega^n)^{s-j} = 1 = \omega^{0 \cdot (s-j)}$. Putting it all together, $h(\omega^s) = \sum_{j < n} a_j c_j = n \cdot a_s$. So the polynomial $f := \frac{1}{n} h$, whose coefficients are $\frac{1}{n} e_i$, satisfies $f(\omega^s) = a_s$ for every $s$, which is exactly what we wanted.
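The round trip — evaluate at the powers of $\omega$, then interpolate with $\omega^{-1}$ and a division by $n$ — can be checked numerically. Here is a minimal Rust sketch over $\mathbb{F}_{17}$ with $n = 4$ and $\omega = 13$ (the field, the root of unity, and the helper names are assumptions for this sketch; a naive $O(n^2)$ evaluation stands in for the FFT, since the identity being checked is the same):

```rust
// Interpolation as evaluation: recover the coefficients of f from its
// evaluations a_i = f(w^i) by computing c_j = (1/n) * g(w^{-j}), where
// g is the polynomial whose coefficients are the a_i.
const P: u64 = 17;

fn eval(coeffs: &[u64], x: u64) -> u64 {
    // Horner's rule mod P.
    coeffs.iter().rev().fold(0, |acc, &c| (acc * x + c) % P)
}

fn main() {
    let f = [5u64, 0, 7, 2]; // f(x) = 5 + 7x^2 + 2x^3 over F_17
    // In F_17: 13 * 4 = 52 = 1, so w^{-1} = 4; and 4 * 13 = 1, so n^{-1} = 13.
    let (w, w_inv, n_inv) = (13u64, 4u64, 13u64);

    // Forward direction: a_i = f(w^i).
    let mut a = Vec::new();
    let mut x = 1u64;
    for _ in 0..4 {
        a.push(eval(&f, x));
        x = x * w % P;
    }

    // Inverse direction: c_j = n^{-1} * g(w^{-j}), with g's coefficients = a.
    let mut c = Vec::new();
    let mut y = 1u64;
    for _ in 0..4 {
        c.push(n_inv * eval(&a, y) % P);
        y = y * w_inv % P;
    }
    assert_eq!(c, f.to_vec());
    println!("recovered coefficients: {:?}", c);
}
```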
- Polynomials can be represented as a list of coefficients or as a list of evaluations on a set $A$.
- If the set $A$ is the set of powers of a root of unity, there are time $O(n \log n)$ algorithms for converting back and forth between those two representations.
- In evaluations form, polynomials can be added and multiplied in time $O(n)$.
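The last point is worth seeing concretely: in evaluations form, multiplication is just a pointwise product. A tiny Rust sketch over plain integers (illustrative helper names; note the domain must have at least $\deg(fg) + 1$ points for the product to be recoverable):

```rust
// In evaluations form, (f * g)(x) = f(x) * g(x), so multiplying two
// polynomials is an O(n) pointwise product of their evaluation vectors.
fn eval(coeffs: &[i64], x: i64) -> i64 {
    coeffs.iter().rev().fold(0, |acc, &c| acc * x + c)
}

fn main() {
    let f = [1i64, 2]; // f = 1 + 2x
    let g = [3i64, 1]; // g = 3 + x
    let fg = [3i64, 7, 2]; // (1 + 2x)(3 + x) = 3 + 7x + 2x^2
    let points = [0i64, 1, 2]; // deg(fg) + 1 = 3 points suffice
    let f_evals: Vec<i64> = points.iter().map(|&x| eval(&f, x)).collect();
    let g_evals: Vec<i64> = points.iter().map(|&x| eval(&g, x)).collect();
    // Pointwise product of the evaluation vectors...
    let prod: Vec<i64> = f_evals.iter().zip(&g_evals).map(|(a, b)| a * b).collect();
    // ...matches the evaluations of the product polynomial.
    let fg_evals: Vec<i64> = points.iter().map(|&x| eval(&fg, x)).collect();
    assert_eq!(prod, fg_evals);
    println!("pointwise multiplication checks out");
}
```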
- Implement types `DensePolynomial<F: FftField>` and `Evaluations<F: FftField>` that wrap a `Vec<F>` and implement the FFT algorithms described above for converting between them.
- Familiarize yourself with the types and functions provided by `ark_poly`.