Fast Fourier Transform (FFT)

This section describes how the Cooley-Tukey fast Fourier transform works. As we learned in the previous section, the key is to select evaluation points that yield an efficient FFT algorithm.

Specifically, say we have $ω \in F$ such that $ω^{n} = 1$, and $ω^{r} \neq 1$ for any $0 < r < n$.

Put another way, all the values $1, ω, ω^{2}, ω^{3}, \dots, ω^{n - 1}$ are distinct and $ω^{n} = 1$ .

Put yet another way, the group generated by $ω$ inside $F^{\times}$ (written $⟨ ω ⟩$ ) has size $n$ .

We call such an $ω$ a primitive $n$ -th root of unity.
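
As a concrete illustration, here is a minimal Python sketch that checks this definition in a small prime field. The field $F_{17}$ and the helper names `is_primitive_root_of_unity` and `powers_of` are assumptions made for this example only; they are not taken from any library.

```python
P = 17  # assumed small prime; the nonzero elements of F_17 form a group of order 16

def is_primitive_root_of_unity(w, n, p=P):
    """Return True if w^n = 1 and w^r != 1 for every 0 < r < n (all mod p)."""
    powers = [pow(w, r, p) for r in range(1, n + 1)]
    return powers[-1] == 1 and all(x != 1 for x in powers[:-1])

def powers_of(w, n, p=P):
    """Return [1, w, w^2, ..., w^(n-1)] mod p."""
    return [pow(w, i, p) for i in range(n)]
```

For instance, `is_primitive_root_of_unity(9, 8)` holds in this field, while `16` (which is $-1$) satisfies $16^{8} = 1$ but already has $16^{2} = 1$, so it is not a primitive $8$-th root.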

Suppose we have an $ω$ which is a primitive $2^{k}$-th root of unity and let $A_{k} = \{1, ω, \dots, ω^{2^{k} - 1}\}$.

The FFT algorithm will let us compute $interp_{A_{k}}$ for this set.

Actually, it is easier to see how it will let us compute the $eval_{A_{k}}$ algorithm efficiently.

We will describe an algorithm $FFT (k, ω, f)$ that takes as input

$k \in N$
$ω \in F$ a primitive $2^{k}$ th root of unity
$f \in F [x]_{< 2^{k}}$ in dense coefficients form (i.e., as a vector of coefficients of length $n = 2^{k}$).

and outputs the vector of evaluations $[f (1), f (ω), f (ω^{2}), \dots, f (ω^{2^{k} - 1})]$ in time $O (k 2^{k})$ (which is to say, $O (n \log n)$ for $n = 2^{k}$).

Notice that naively, computing each evaluation $f (ω^{i})$ using the coefficients of $f$ would require time $O (n)$ , and so computing all $n$ of them would require time $O (n^{2})$ .
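
For comparison, the naive approach might look like the following Python sketch. It assumes arithmetic in the small prime field $F_{17}$ (an illustrative choice), and the names `horner` and `naive_eval_all` are hypothetical.

```python
P = 17  # assumed small prime field F_17 (illustrative choice)

def horner(coeffs, x, p=P):
    """Evaluate c_0 + c_1 x + ... + c_{n-1} x^{n-1} mod p in O(n) steps."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def naive_eval_all(coeffs, w, p=P):
    """Evaluate at 1, w, ..., w^{n-1}: n evaluations of O(n) each, so O(n^2)."""
    n = len(coeffs)
    return [horner(coeffs, pow(w, i, p), p) for i in range(n)]
```

This is useful mainly as a reference implementation to test a faster algorithm against.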

The algorithm $FFT (k, ω, f)$ can be defined recursively as follows.

If $k = 0$ , then $ω$ is a primitive $1$ st root of unity, and $f$ is a polynomial of degree $0$ . That means $ω = 1$ and also $f$ is a constant $c \in F$ . So, we can immediately output the array of evaluations $[c] = [f (1)]$ .

If $k > 0$ , then we will split $f$ into two polynomials, recursively call $FFT$ on them, and reconstruct the result from the recursive calls.

To that end, define $f_{0}$ to be the polynomial whose coefficients are all the even-index coefficients of $f$ and $f_{1}$ the polynomial whose coefficients are all the odd-index coefficients of $f$ . In terms of the array representation, this just means splitting out every other entry into two arrays. So that can be done in time $O (n)$ .

Write $f = \sum_{i < 2^{k}} c_{i} x^{i}$ , so that $f_{0} = \sum_{i < 2^{k - 1}} c_{2 i} x^{i}$ and $f_{1} = \sum_{i < 2^{k - 1}} c_{2 i + 1} x^{i}$ . Then

$\begin{aligned}
f (x) &= \sum_{i < 2^{k}} c_{i} x^{i} \\
&= \sum_{i < 2^{k - 1}} c_{2 i} x^{2 i} + \sum_{i < 2^{k - 1}} c_{2 i + 1} x^{2 i + 1} \\
&= \sum_{i < 2^{k - 1}} c_{2 i} (x^{2})^{i} + \sum_{i < 2^{k - 1}} c_{2 i + 1} x \cdot (x^{2})^{i} \\
&= \sum_{i < 2^{k - 1}} c_{2 i} (x^{2})^{i} + x \sum_{i < 2^{k - 1}} c_{2 i + 1} (x^{2})^{i} \\
&= f_{0} (x^{2}) + x f_{1} (x^{2})
\end{aligned}$
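
The identity $f (x) = f_{0} (x^{2}) + x f_{1} (x^{2})$ can be sanity-checked numerically. The sketch below assumes the small prime field $F_{17}$ and an arbitrary example polynomial; `evaluate` and `split_even_odd` are hypothetical helper names.

```python
P = 17  # assumed small prime field F_17 (illustrative choice)

def evaluate(coeffs, x, p=P):
    """Evaluate the polynomial with the given coefficient list at x, mod p."""
    return sum(c * pow(x, i, p) for i, c in enumerate(coeffs)) % p

def split_even_odd(coeffs):
    """Split a coefficient array into its even-index and odd-index entries."""
    return coeffs[0::2], coeffs[1::2]

f = [3, 1, 4, 1, 5, 9, 2, 6]   # arbitrary example coefficients c_0..c_7
f0, f1 = split_even_odd(f)     # f0 = [3, 4, 5, 2], f1 = [1, 1, 9, 6]

# Check f(x) = f0(x^2) + x * f1(x^2) at every element of the field.
identity_holds = all(
    evaluate(f, x) == (evaluate(f0, x * x % P) + x * evaluate(f1, x * x % P)) % P
    for x in range(P)
)
```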

Now, notice that if $ω$ is a primitive $2^{k}$-th root of unity, then $ω^{2}$ is a primitive $2^{k - 1}$-th root of unity. Thus we can recurse with $FFT (k - 1, ω^{2}, f_{0})$ and similarly for $f_{1}$. Let

$\begin{aligned}
[e_{0, 0}, \dots, e_{0, 2^{k - 1} - 1}] &= FFT (k - 1, ω^{2}, f_{0}) \\
[e_{1, 0}, \dots, e_{1, 2^{k - 1} - 1}] &= FFT (k - 1, ω^{2}, f_{1})
\end{aligned}$

By assumption $e_{i, j} = f_{i} ((ω^{2})^{j})$ . So, for any $j$ we have

$f (ω^{j}) = f_{0} ((ω^{2})^{j}) + ω^{j} f_{1} ((ω^{2})^{j})$

Now, since $j$ may be larger than $2^{k - 1} - 1$, we need to reduce it mod $2^{k - 1}$, relying on the fact that if $τ$ is an $n$-th root of unity then $τ^{j} = τ^{j \bmod n}$ since $τ^{n} = 1$. Thus, $(ω^{2})^{j} = (ω^{2})^{j \bmod 2^{k - 1}}$ and so we have

$f (ω^{j}) = f_{0} ((ω^{2})^{j \bmod 2^{k - 1}}) + ω^{j} f_{1} ((ω^{2})^{j \bmod 2^{k - 1}}) = e_{0, j \bmod 2^{k - 1}} + ω^{j} e_{1, j \bmod 2^{k - 1}}$

We can compute the array $W = [1, ω, \dots, ω^{2^{k} - 1}]$ in time $O (n)$ (since each entry is the previous entry times $ω$). Then we can compute each entry of the output in $O (1)$ as

$f (ω^{j}) = e_{0, j \bmod 2^{k - 1}} + W [j] \cdot e_{1, j \bmod 2^{k - 1}}$

There are $n$ such entries, so this takes time $O (n)$ .
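
As a small worked example of the combining step, take $k = 1$, so $n = 2$ and $f (x) = c_{0} + c_{1} x$. Assuming the field does not have characteristic $2$, the only primitive $2$nd root of unity is $ω = -1$. The recursive calls return $e_{0} = [c_{0}]$ and $e_{1} = [c_{1}]$, and $W = [1, -1]$, so

$\begin{aligned}
f (ω^{0}) &= e_{0, 0} + W [0] \cdot e_{1, 0} = c_{0} + c_{1} \\
f (ω^{1}) &= e_{0, 1 \bmod 1} + W [1] \cdot e_{1, 1 \bmod 1} = c_{0} - c_{1}
\end{aligned}$

which agrees with evaluating $f$ directly at $1$ and $-1$.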

This concludes the recursive definition of the algorithm $FFT (k, ω, f)$ .

Algorithm: computing $eval_{A_{k}}$

Input: $f = [c_{0}, \dots, c_{2^{k} - 1}]$, the coefficients of the polynomial $f (x) = \sum_{i < 2^{k}} c_{i} x^{i}$

$FFT (k, ω, f) \to [f (1), f (ω), f (ω^{2}), \dots, f (ω^{2^{k} - 1})]$

$\textbf{if } k = 0$

$\quad \textbf{return } f$

$\textbf{else}$

$\quad W \leftarrow [1, ω, ω^{2}, \dots, ω^{2^{k} - 1}]$, the powers of the $ω$ passed to this call

$\quad f_{0} \leftarrow [c_{0}, c_{2}, \dots, c_{2^{k} - 2}]$, the even coefficients of $f$, corresponding to $f_{0} (x) = \sum_{i < 2^{k - 1}} c_{2 i} x^{i}$

$\quad f_{1} \leftarrow [c_{1}, c_{3}, \dots, c_{2^{k} - 1}]$, the odd coefficients of $f$, corresponding to $f_{1} (x) = \sum_{i < 2^{k - 1}} c_{2 i + 1} x^{i}$

$\quad e_{0} \leftarrow FFT (k - 1, ω^{2}, f_{0})$

$\quad e_{1} \leftarrow FFT (k - 1, ω^{2}, f_{1})$

$\quad \textbf{for } j \in [0, 2^{k} - 1]$

$\quad \quad F_{j} \leftarrow e_{0, j \bmod 2^{k - 1}} + W [j] \cdot e_{1, j \bmod 2^{k - 1}}$

$\quad \textbf{return } F$
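
The algorithm above can be sketched in Python as follows. This is a minimal illustration rather than a production implementation: it assumes arithmetic in the small prime field $F_{17}$, recomputes the powers of $ω$ in every call exactly as the prose describes, and uses the hypothetical function name `fft`.

```python
P = 17  # assumed small prime field F_17 (illustrative choice)

def fft(k, w, f, p=P):
    """Evaluate f (a list of 2^k coefficients) at 1, w, ..., w^(2^k - 1) mod p.

    w is assumed to be a primitive 2^k-th root of unity mod p.
    """
    n = 1 << k
    assert len(f) == n
    if k == 0:
        return list(f)                 # a constant polynomial: [f(1)] = [c]
    f0, f1 = f[0::2], f[1::2]          # even-index and odd-index coefficients
    w2 = w * w % p                     # a primitive 2^(k-1)-th root of unity
    e0 = fft(k - 1, w2, f0, p)
    e1 = fft(k - 1, w2, f1, p)
    W = [1] * n                        # W[j] = w^j, each entry from the previous one
    for j in range(1, n):
        W[j] = W[j - 1] * w % p
    half = n // 2
    return [(e0[j % half] + W[j] * e1[j % half]) % p for j in range(n)]
```

For example, with $ω = 13$ (a primitive $4$-th root of unity mod $17$), `fft(2, 13, [1, 2, 3, 4])` evaluates $1 + 2x + 3x^{2} + 4x^{3}$ at $1, 13, 16, 4$.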

Now let’s analyze the time complexity. Let $T (n)$ be the complexity on an instance of size $n$ (that is, for $n = 2^{k}$ ).

Looking back at the algorithm, each call does the following work:

$O (n)$ for computing $f_{0}$ and $f_{1}$
two recursive calls, each of size $n /2$
$O (n)$ for computing the powers of $ω$
$O (n)$ for combining the results of the recursive calls

In total, this is $T (n) = O (n) + 2 T (n /2)$. Solving this recurrence yields $T (n) = O (n) \cdot \log n = O (n \log n)$. Basically, there are $\log n$ levels of recursion before we hit the base case, and the work across each level takes time $O (n)$ in total. $□$
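
To see where the $\log n$ factor comes from, the recurrence can be unrolled. Here $c$ is a hypothetical constant (not from the text above) bounding the non-recursive work of a call on size $n$ by $c n$:

$\begin{aligned}
T (n) &\leq 2 T (n /2) + c n \\
&\leq 4 T (n /4) + 2 c n \\
&\leq 8 T (n /8) + 3 c n \\
&\;\;\vdots \\
&\leq 2^{i} T (n /2^{i}) + i c n
\end{aligned}$

Taking $i = k = \log_{2} n$ gives $T (n) \leq n T (1) + c n \log_{2} n = O (n \log n)$.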

Now, in practice there are ways to describe this algorithm non-recursively that have better concrete performance, but that’s out of scope for this document. Read the code if you are interested.

Using the FFT algorithm to compute $interp_{A_{k}}$

So far we have a fast way to compute $eval_{A_{k}} (f)$ all at once, where $A_{k}$ is the set of powers of a $2^{k}$ th root of unity $ω$ . For convenience let $n = 2^{k}$ .

Now we want to go the other way and compute a polynomial given an array of evaluations. Specifically, $n$ evaluations $[f (x_{0}), f (x_{1}), \dots, f (x_{n - 1})]$ at distinct points uniquely define a polynomial of degree at most $n - 1$. This can be written as a system of $n$ equations

$\begin{aligned}
f (x_{0}) &= c_{0} + c_{1} x_{0} + \dots + c_{n - 1} x_{0}^{n - 1} \\
f (x_{1}) &= c_{0} + c_{1} x_{1} + \dots + c_{n - 1} x_{1}^{n - 1} \\
&\;\;\vdots \\
f (x_{n - 1}) &= c_{0} + c_{1} x_{n - 1} + \dots + c_{n - 1} x_{n - 1}^{n - 1},
\end{aligned}$

which can be rewritten as a matrix vector product.

$\begin{bmatrix} f (x_{0}) \\ f (x_{1}) \\ \vdots \\ f (x_{n - 1}) \end{bmatrix} = \begin{bmatrix} 1 & x_{0} & \dots & x_{0}^{n - 1} \\ 1 & x_{1} & \dots & x_{1}^{n - 1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n - 1} & \dots & x_{n - 1}^{n - 1} \end{bmatrix} \times \begin{bmatrix} c_{0} \\ c_{1} \\ \vdots \\ c_{n - 1} \end{bmatrix}$

This $n \times n$ matrix is a Vandermonde matrix and it just so happens that square Vandermonde matrices are invertible iff the $x_{i}$ are unique. Since we purposely selected our $x_{i}$ to be the powers of $ω$, a primitive $n$-th root of unity, by definition $x_{i} = ω^{i}$ are unique.

Therefore, to compute the polynomial given the corresponding array of evaluations (i.e. interpolation) we can solve for the polynomial’s coefficients using the inverse of the matrix.

$\begin{bmatrix} c_{0} \\ c_{1} \\ \vdots \\ c_{n - 1} \end{bmatrix} = \begin{bmatrix} 1 & 1 & \dots & 1 \\ 1 & ω & \dots & ω^{n - 1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & ω^{n - 1} & \dots & ω^{(n - 1) (n - 1)} \end{bmatrix}^{- 1} \times \begin{bmatrix} f (1) \\ f (ω) \\ \vdots \\ f (ω^{n - 1}) \end{bmatrix}$

All we need now is the inverse of this matrix, which is slightly complicated to compute. I’m going to skip it for now, but if you have the details please make a pull request.

Substituting in the inverse matrix we obtain the equation for interpolation.

$\begin{bmatrix} c_{0} \\ c_{1} \\ \vdots \\ c_{n - 1} \end{bmatrix} = \frac{1}{n} \begin{bmatrix} 1 & 1 & \dots & 1 \\ 1 & ω^{- 1} & \dots & ω^{- (n - 1)} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & ω^{- (n - 1)} & \dots & ω^{- (n - 1) (n - 1)} \end{bmatrix} \times \begin{bmatrix} f (1) \\ f (ω) \\ \vdots \\ f (ω^{n - 1}) \end{bmatrix}$

Observe that this equation is nearly identical to the original equation for evaluation, except with the following substitution.

$ω^{i} \Rightarrow \frac{1}{n} ω^{- i}$

Consequently and perhaps surprisingly, we can reuse the FFT algorithm $eval_{A_{k}}$ in order to compute the inverse, $interp_{A_{k}}$.
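
The claimed form of the inverse can at least be checked numerically on a small instance. The sketch below assumes the prime field $F_{17}$ with $n = 8$ and $ω = 9$ (illustrative choices); `vandermonde` and `mat_mul` are hypothetical helpers. It is a sanity check on one example, not a proof.

```python
P = 17  # assumed small prime field F_17 (illustrative choice)

def vandermonde(w, n, p=P):
    """n x n matrix whose (i, j) entry is w^(i*j) mod p."""
    return [[pow(w, i * j, p) for j in range(n)] for i in range(n)]

def mat_mul(A, B, p=P):
    """Product of two square matrices with entries reduced mod p."""
    n = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(n)) % p for j in range(n)]
            for i in range(n)]

n, w = 8, 9                  # 9 is a primitive 8th root of unity mod 17
w_inv = pow(w, P - 2, P)     # inverse via Fermat's little theorem
n_inv = pow(n, P - 2, P)

V = vandermonde(w, n)
# Candidate inverse: entries (1/n) * w^(-i*j).
V_inv = [[n_inv * x % P for x in row] for row in vandermonde(w_inv, n)]
identity = [[int(i == j) for j in range(n)] for i in range(n)]
product = mat_mul(V, V_inv)  # expected to equal the identity matrix
```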

So, suppose we have an array $[a_{0}, \dots, a_{n - 1}]$ of field elements (which you can think of as a function $A_{k} \to F$ ) and we want to compute the coefficients of a polynomial $f$ with $f (ω^{i}) = a_{i}$ .

To this end, define a polynomial $g$ by $g = \sum_{j < n} a_{j} x^{j}$ . That is, the polynomial whose coefficients are the evaluations in our array that we’re hoping to interpolate.

Now, let $[e_{0}, \dots, e_{n - 1}] = FFT (k, ω^{- 1}, g)$ .

That is, we’re going to feed $g$ into the FFT algorithm defined above with $ω^{- 1}$ as the $2^{k}$-th root of unity. It is not hard to check that if $ω$ is a primitive $n$-th root of unity, so is $ω^{- 1}$. Remember: the resulting values are the evaluations of $g$ on the powers of $ω^{- 1}$, so $e_{i} = g (ω^{- i}) = \sum_{j < n} a_{j} ω^{- ij}$.

Now, let $h = \sum_{i < n} e_{i} x^{i}$ . That is, re-interpret the values $e_{i}$ returned by the FFT as the coefficients of a polynomial. I claim that $h$ is almost the polynomial we are looking for. Let’s calculate what values $h$ takes on at the powers of $ω$ .

$\begin{aligned}
h (ω^{s}) &= \sum_{i < n} e_{i} ω^{s i} \\
&= \sum_{i < n} ω^{s i} \sum_{j < n} a_{j} ω^{- ij} \\
&= \sum_{i < n} \sum_{j < n} a_{j} ω^{s i - ij} \\
&= \sum_{j < n} a_{j} \sum_{i < n} ω^{i (s - j)}
\end{aligned}$

Now, let’s examine the quantity $c_{j} := \sum_{i < n} ω^{i (s - j)}$ for a fixed $s$ (this $c_{j}$ is unrelated to the coefficients $c_{i}$ used earlier). We claim that if $j = s$, then $c_{j} = n$, and if $j \neq s$, then $c_{j} = 0$. The first claim is clear since

$c_{s} = \sum_{i < n} ω^{i (s - s)} = \sum_{i < n} ω^{0} = \sum_{i < n} 1 = n$

For the second claim, we will prove that $ω^{s - j} c_{j} = c_{j}$. This implies that $(1 - ω^{s - j}) c_{j} = 0$. So either $1 - ω^{s - j} = 0$ or $c_{j} = 0$. The former cannot be the case since it implies $ω^{s - j} = 1$, which in turn implies $s = j$ (because $ω$ is primitive and $|s - j| < n$), which is impossible since we are in the case $j \neq s$. Thus we have $c_{j} = 0$ as desired.

So let’s show that $c_{j}$ is invariant under multiplication by $ω^{s - j}$ . Basically, it will come down to the fact that $ω^{n} = ω^{0}$ .

$\begin{aligned}
ω^{s - j} c_{j} &= ω^{s - j} \sum_{i < n} ω^{i (s - j)} \\
&= \sum_{i < n} ω^{i (s - j) + (s - j)} \\
&= \sum_{i < n} (ω^{i + 1})^{s - j} \\
&= (ω^{0 + 1})^{s - j} + (ω^{1 + 1})^{s - j} + \dots + (ω^{(n - 1) + 1})^{s - j} \\
&= (ω^{1})^{s - j} + (ω^{2})^{s - j} + \dots + (ω^{n})^{s - j} \\
&= (ω^{1})^{s - j} + (ω^{2})^{s - j} + \dots + (ω^{0})^{s - j} \\
&= (ω^{0})^{s - j} + (ω^{1})^{s - j} + \dots + (ω^{n - 1})^{s - j} \\
&= \sum_{i < n} (ω^{i})^{s - j} \\
&= c_{j}
\end{aligned}$
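
The claim about these power sums is easy to check numerically on a small instance. The sketch below assumes the prime field $F_{17}$ with $n = 8$ and $ω = 9$ (illustrative choices); `power_sum` is a hypothetical helper name.

```python
P = 17        # assumed small prime field F_17 (illustrative choice)
n, w = 8, 9   # 9 is a primitive 8th root of unity mod 17

def power_sum(s, j, p=P):
    """Compute the sum over i < n of w^(i*(s-j)) mod p.

    The exponent can be negative, so it is reduced mod n first,
    which is valid because w^n = 1.
    """
    return sum(pow(w, (i * (s - j)) % n, p) for i in range(n)) % p

# Table of sums: n on the diagonal (s == j), 0 everywhere else.
table = [[power_sum(s, j) for j in range(n)] for s in range(n)]
```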

So now we know that

$h (ω^{s}) = \sum_{j < n} a_{j} c_{j} = a_{s} \cdot n + \sum_{j \neq s} a_{j} \cdot 0 = a_{s} \cdot n$

So if we define $f = h / n$ , then $f (ω^{s}) = a_{s}$ for every $s$ as desired. Thus we have our interpolation algorithm, sometimes called an inverse FFT or IFFT:

Algorithm: computing $interp_{A_{k}}$

Input: $[a_{0}, \dots, a_{n - 1}]$ the points we want to interpolate and $ω$ a primitive $n$-th root of unity, where $n = 2^{k}$.

Interpret the input array as the coefficients of a polynomial $g = \sum_{i < n} a_{i} x^{i}$.

Let $[e_{0}, \dots, e_{n - 1}] = FFT (k, ω^{- 1}, g)$.

Output the polynomial $\sum_{i < n} (e_{i} / n) x^{i}$ . I.e., in terms of the dense-coefficients form, output the vector $[e_{0} / n, \dots, e_{n - 1} / n]$ .

Note that this algorithm also takes time $O (n \log n)$.
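
A minimal Python sketch of this interpolation algorithm, again assuming the small prime field $F_{17}$; the names `fft` and `ifft` are hypothetical, and `fft` is repeated here so the example is self-contained.

```python
P = 17  # assumed small prime field F_17 (illustrative choice)

def fft(k, w, f, p=P):
    """Evaluate f (2^k coefficients) at the powers of w, a primitive 2^k-th root."""
    n = 1 << k
    if k == 0:
        return list(f)
    e0 = fft(k - 1, w * w % p, f[0::2], p)
    e1 = fft(k - 1, w * w % p, f[1::2], p)
    half, out, wj = n // 2, [], 1
    for j in range(n):
        out.append((e0[j % half] + wj * e1[j % half]) % p)
        wj = wj * w % p                 # wj runs through 1, w, w^2, ...
    return out

def ifft(k, w, a, p=P):
    """Return coefficients of the polynomial f with f(w^i) = a[i] for all i."""
    n = 1 << k
    w_inv = pow(w, p - 2, p)            # inverses via Fermat's little theorem
    n_inv = pow(n, p - 2, p)
    e = fft(k, w_inv, a, p)             # run the FFT with w^(-1) in place of w
    return [x * n_inv % p for x in e]   # divide every entry by n
```

Round-tripping through `fft` and then `ifft` with the same $ω$ should return the original coefficient vector.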

Takeaways

Polynomials can be represented as a list of coefficients or a list of evaluations on a set $A$
If the set $A$ is the set of powers of a root of unity, there are time $O (n \log n)$ algorithms for converting back and forth between those two representations
In evaluations form, polynomials can be added and multiplied in time $O (n)$
- TODO: caveat about hitting degree
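
One possible reading of the degree caveat, offered here as an assumption rather than as the author's intended note: pointwise multiplication of $n$ evaluations only recovers the product polynomial if its degree is below $n$; otherwise the result wraps around modulo $x^{n} - 1$. The Python sketch below illustrates this in the small prime field $F_{17}$, with hypothetical helper names.

```python
P = 17  # assumed small prime field F_17 (illustrative choice)

def fft(k, w, f, p=P):
    """Evaluate f (2^k coefficients) at the powers of w, a primitive 2^k-th root."""
    n = 1 << k
    if k == 0:
        return list(f)
    e0 = fft(k - 1, w * w % p, f[0::2], p)
    e1 = fft(k - 1, w * w % p, f[1::2], p)
    half, out, wj = n // 2, [], 1
    for j in range(n):
        out.append((e0[j % half] + wj * e1[j % half]) % p)
        wj = wj * w % p
    return out

def ifft(k, w, a, p=P):
    """Interpolate: run the FFT with w^(-1), then divide every entry by n."""
    n_inv = pow(1 << k, p - 2, p)
    return [x * n_inv % p for x in fft(k, pow(w, p - 2, p), a, p)]

def multiply_via_evaluations(f, g, k, w, p=P):
    """Zero-pad f and g to length 2^k, multiply pointwise, interpolate back."""
    n = 1 << k
    fa = fft(k, w, f + [0] * (n - len(f)), p)
    ga = fft(k, w, g + [0] * (n - len(g)), p)
    return ifft(k, w, [x * y % p for x, y in zip(fa, ga)], p)

# (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2 has degree 2, so it needs n >= 3 points.
enough_points = multiply_via_evaluations([1, 2], [3, 4], 2, 13)   # n = 4
too_few_points = multiply_via_evaluations([1, 2], [3, 4], 1, 16)  # n = 2
```

With $n = 4$ the product is recovered exactly; with $n = 2$ the $8x^{2}$ term folds onto the constant term, since $x^{2} = 1$ on the evaluation set.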

Exercises

Implement types DensePolynomial<F: FftField> and Evaluations<F: FftField> that wrap a Vec<F> and implement the FFT algorithms described above for converting between them
Familiarize yourself with the types and functions provided by ark_poly

Mina book