Groups are an algebraic structure which
are used in most widely deployed signature schemes
are used in most widely deployed zk-SNARKs
generalize the notion of a collection of symmetries of an object
generalize certain aspects of numbers
I will not touch on the last two too much.
First let’s do a quick definition and then some examples.
Definition. A group is a set with a binary operation , an identity for that operation, and an inverse map such that
is associative: for all .
is an identity for . I.e., for all .
is an inverse map for with identity . I.e., for all .
So basically, an invertible binary operation. Definition in hand, we can see some examples:
The integers with , as identity, and negation as inverse.
For any natural number , the integers mod . This can be thought of as the set of numbers with the operation being followed by taking remainder mod . It can also be thought of as the group of rotations of an -gon.
For any set , the set of invertible functions (a.k.a permutations) with function composition as the operation, the identity function as the identity, and function inverse as the inverse operation.
The set of translations of a plane with composition of translations (since the composition of translations is a translation). The identity is translating by and inverse is reversing the translation. This is equivalent to the group with coordinate-wise addition as the operation and coordinate-wise negation as the inverse.
The set of rotations of a sphere with composition as the operation, doing nothing as the identity, and performing the rotation in reverse as the inverse map.
For any field , the set , with field multiplication as the operation, as the identity, and as the inverse map. This is called the group of units of .
Sometimes, instead of we simply write when is clear from context.
An abelian group is a group in which is commutative, meaning . Typically, in that case, we write the group operation as instead and we write the identity as .
Some examples are
the integers with addition
the integers mod with addition
the group of units of a field
the real numbers with addition
the rational numbers with addition
the real numbers with multiplication
the rational numbers with multiplication
vectors of integers of some fixed dimension with coordinate-wise addition
The set of polynomials over a ring , with addition
Abelian groups are equivalently described as what are called -modules. A -module is like a vector space, but with the integers instead of a field.
Basically a -module (or equivalently an Abelian group) is a structure where you can add elements and where it makes sense to multiply a group element by an integer.
If is an abelian group, we can define this “multiplication by an integer” as follows. If , then for , we can define by
and if , then define . Or equivalently.,
This is the general sense of what is called scalar-multiplication or sometimes exponentiation in cryptography.
A cyclic group is a special kind of abelian group. It is an abelian group generated by a single element . That is, a cyclic group (generated by ) is one in which for every we have for some .
Many cryptographic protocols are defined generically with respect to any abelian group that has a “hard discrete log”.
be a cyclic group
a generator of
a probability distribution on
a set of algorithms of type with runtime and memory usage bounds. In other words, a set of tuples where is a program of type , is a bound on the time that program can be run, and is a bound on the amount of memory that program can use.
In practice you fix this to be something like, the set of all computations that can be run for less than 1 trillion $.
a probability, usually taken to be something close to 0 like
Then we can define the -computational discrete-log assumption which says:
For any , if you sample h from G according to , then the probability that is smaller than , assuming successfully runs within the given resource bounds.
Sometimes this is called the computational Diffie–Helman assumption. Basically what it’s saying is that for a group element sampled randomly, you can’t practically compute how many times you have to add to itself to get .
Another really important assumption is the no-relation assumption (TODO: name).
Basically what this says is that randomly selected group elements have no efficiently computable linear relations between them. Formally, letting be a group and a probability distribution on , and a set of programs (with resource bounds) that take in a list of group elements as inputs and outputs a list of integers of the same length.
Then the -no-relation assumption says for all , for any , if you sample according to , letting , it is not the case that
except with probability (assuming program runs in time with memory ).
Now, what are some concrete groups that we can safely make the no-relation or computational discrete log hardness assumptions about?
Well, the most efficiently implementable groups that people have come up with – and that we believe satisfy the above assumptions for being the class of realistic computations and being something like – are elliptic curves over finite fields.
Giving a complete definition of what an elliptic curve is requires a lot of math, and is not very useful from the point of view of cryptography. So we will give a definition that is not complete, but more useful.
An elliptic curve over a field is a set of the form
for some , plus an additional point which is called the point at infinity. Basically it’s a set of pairs satisfying some equation of that form. The data of the elliptic curve itself is just the field together with the constants and .
What is special about elliptic curves is that there is a way of giving them a group structure with simple to compute operations that we believe satisfy the hardness assumptions described above.
Group negation is defined by
so we just negate the coordinate.
The identity for the group is , the point at infinity. For that point we may also therefore write it as .
Group addition is more complicated to define, so we will not, but here is what’s worth knowing about how to compute
There are three cases
but . In this case it turns out and so the two points are inverse, and we return
In cases 1 and 2, the algorithm to compute the result just performs a simple sequence of some field operations (multiplications, additions, divisions, subtractions) with the input values. In other words, there is a simple arithmetic formula for computing the result.
Given a curve point we know and thus is one of the two square roots of . If is a given square root, the other square root is since . So, if we know , then we almost know the whole curve point: we just need a single bit to tell us which of the two possible values (i.e., the two square roots of ) is the y coordinate.
Here is how this is commonly done over prime order fields , assuming . Remember that we represent elements of as integers in the range . In this representation, for a field element , is the integer (or if ). Thus, if is odd, then is even (since is odd and an odd minus an odd is even). Similarly, if is even, then is odd (since an odd minus an even is odd).
So, for any , unless is 0, and have opposite parities. Parity can easily be computed: it’s just the least significant bit. Thus, we have an easy way of encoding a curve point . Send
in its entirety
The least significant bit of
Given this, we can reconstruct as follows.
Compute any square root of
If the least significant bit of is equal to , then , otherwise,
In the case of Mina, our field elements require 255 bits to store. Thus, we can encode a curve point in bits, or 32 bytes. At the moment this optimized encoding is not implemented for messages on the network. It would be a welcome improvement.
The above form of representing curve points as pairs – which is called affine form – is sometimes sub-optimal from an efficiency perspective. There are several other forms in which curve points can be represented, which are called projective forms.
The simple projective form represents a curve as above as the set
If you think about it, this is saying that is a point on the original curve in affine form. In other words, in projective form we let the first two coordinates get scaled by some arbitrary scaling factor , but we keep track of it as the third coordinate.
To be clear, this means curve points have many different representations. If is a curve point in projective coordinates, and is any element of , then is another representation of the same curve point.
This means curve points require more space to store, but it makes the group operation much more efficient to compute, as we can avoid having to do any field divisions, which are expensive.
There is another form, which is also sometimes called a projective form, which is known as the jacobian form. There, a curve would be represented as
so the triple corresponds to the affine point . These are the fastest for computing group addition on a normal computer.
Use affine coordinates when the cost of division doesn’t matter and saving space is important. Specific contexts to keep in mind:
- Working with elliptic curve points inside a SNARK circuit
Use Jacobian coordinates for computations on normal computers, or other circumstances where space usage is not that costly and division is expensive.
Check out this website for the formulas for implementing the group operation.
- When cryptographers talk about “groups”, they usually mean a “computational group”, which is a group equipped with efficient algorithms for computing the group operation and inversion. This is different because a group in the mathematical sense need not have its operations be computable at all.
Implement a type
JacobianPoint<F:Field>and functions for computing the group operation
Familiarize yourself with the types and traits in
fn decompress<F: SquareRootField>(c: (F, bool)) -> (F, F)