Skip to main content

Groups

Groups are an algebraic structure which

  • are used in most widely deployed signature schemes

  • are used in most widely deployed zk-SNARKs

  • generalize the notion of a collection of symmetries of an object

  • generalize certain aspects of numbers

I will not touch on the last two too much.

First let's do a quick definition and then some examples.

Definition. A group is a set GG with a binary operation  ⁣:G×GG* \colon G \times G \to G, an identity eGe \in G for that operation, and an inverse map ()1 ⁣:GG(\quad)^{-1} \colon G \to G such that

  • * is associative: x(yz)=(xy)zx*(y*z) = (x*y)*z for all x,y,zx, y,z.

  • ee is an identity for *. I.e., ex=xe=xe * x = x * e = x for all xx.

  • ()1(\quad)^{-1} is an inverse map for * with identity ee. I.e., xx1=x1x=ex * x^{-1} = x^{-1} * x = e for all xx.

So basically, an invertible binary operation. Definition in hand, we can see some examples:

  • The integers Z\mathbb{Z} with ++, 00 as identity, and negation as inverse.

  • For any natural number nn, the integers mod nn. This can be thought of as the set of numbers {0,,n1}\{0, \dots, n - 1\} with the operation being ++ followed by taking remainder mod nn. It can also be thought of as the group of rotations of an nn-gon.

  • For any set XX, the set of invertible functions XXX \to X (a.k.a permutations) with function composition as the operation, the identity function as the identity, and function inverse as the inverse operation.

  • The set of translations of a plane R×R\mathbb{R} \times \mathbb{R} with composition of translations (since the composition of translations is a translation). The identity is translating by (0,0)(0, 0) and inverse is reversing the translation. This is equivalent to the group R×R\mathbb{R} \times \mathbb{R} with coordinate-wise addition as the operation and coordinate-wise negation as the inverse.

  • The set of rotations of a sphere with composition as the operation, doing nothing as the identity, and performing the rotation in reverse as the inverse map.

  • For any field FF, the set F×:=F{0}F^{\times} := F \setminus \{0\}, with field multiplication as the operation, 11 as the identity, and x1/xx \mapsto 1 / x as the inverse map. This is called the group of units of FF.

Sometimes, instead of xyx * y we simply write xyxy when * is clear from context.

Abelian groups

An abelian group is a group in which * is commutative, meaning xy=yxx * y = y * x. Typically, in that case, we write the group operation as ++ instead and we write the identity as 00.

Some examples are

  • the integers with addition

  • the integers mod nn with addition

  • the group of units of a field FF

  • the real numbers with addition

  • the rational numbers with addition

  • the real numbers with multiplication

  • the rational numbers with multiplication

  • vectors of integers of some fixed dimension with coordinate-wise addition

  • The set of polynomials over a ring RR, R[x]R[x] with addition

Abelian groups are equivalently described as what are called Z\Z-modules. A Z\Z-module is like a vector space, but with the integers instead of a field.

Basically a Z\mathbb{Z}-module (or equivalently an Abelian group) is a structure where you can add elements and where it makes sense to multiply a group element by an integer.

If GG is an abelian group, we can define this "multiplication by an integer" as follows. If n0n \geq 0, then for gGg \in G, we can define ngn \cdot g by

ng:=g++gnn \cdot g := \underbrace{g + \dots + g}_{n}

and if n<0n < 0, then define ng:=n(g)n \cdot g := |n| \cdot (-g). Or equivalently.,

ng=(g)++(g)n=(g++gn)n \cdot g = \underbrace{(-g) + \dots + (-g)}_{|n|} = -\left( \underbrace{g + \dots + g}_{|n|} \right)

This is the general sense of what is called scalar-multiplication or sometimes exponentiation in cryptography.

Cyclic groups

A cyclic group GG is a special kind of abelian group. It is an abelian group generated by a single element gGg \in G. That is, a cyclic group GG (generated by gGg \in G) is one in which for every hGh \in G we have h=ngh = n g for some nZn \in \mathbb{Z}.

Groups in cryptography

Many cryptographic protocols are defined generically with respect to any abelian group that has a "hard discrete log".

Let

  • GG be a cyclic group

  • gg a generator of GG

  • μ\mu a probability distribution on GG

  • A\mathcal{A} a set of algorithms of type GZG \to \mathbb{Z} with runtime and memory usage bounds. In other words, a set of tuples (P,t,m)(P, t, m) where PP is a program of type GZG \to \Z, tt is a bound on the time that program can be run, and mm is a bound on the amount of memory that program can use.

    In practice you fix this to be something like, the set of all computations that can be run for less than 1 trillion $.

  • ε[0,1]\varepsilon \in [0, 1] a probability, usually taken to be something close to 0 like 1/21281 / 2^{128}

Then we can define the (G,g,μ,A,ε)(G, g, \mu, \mathcal{A}, \varepsilon)-computational discrete-log assumption which says:

For any (P,t,m)A(P, t, m) \in \mathcal{A}, if you sample h from G according to μ\mu, then the probability that P(h)h=gP(h) \cdot h = g is smaller than ε\varepsilon, assuming PP successfully runs within the given resource bounds.

Sometimes this is called the computational Diffie--Helman assumption. Basically what it's saying is that for a group element hh sampled randomly, you can't practically compute how many times you have to add gg to itself to get hh.

Another really important assumption is the no-relation assumption (TODO: name).

Basically what this says is that randomly selected group elements have no efficiently computable linear relations between them. Formally, letting GG be a group and μ\mu a probability distribution on GG, and A\mathcal{A} a set of programs (with resource bounds) that take in a list of group elements as inputs and outputs a list of integers of the same length.

Then the (G,μ,A,N,ε)(G, \mu, \mathcal{A}, N,\varepsilon)-no-relation assumption says for all (P,t,m)A(P,t,m) \in \mathcal{A}, for any nNn \leq N, if you sample g1,,gng_1, \dots, g_n according to μ\mu, letting [e1,,en]=P([g1,,gn])[e_1, \dots, e_n] = P([g_1, \dots, g_n]), it is not the case that

e1g1++engn=0e_1 \cdot g_1 + \dots + e_n \cdot g_n = 0

except with probability ε\varepsilon (assuming program PP runs in time tt with memory mm).

Generic group model

Elliptic curves

Now, what are some concrete groups that we can safely make the no-relation or computational discrete log hardness assumptions about?

Well, the most efficiently implementable groups that people have come up with -- and that we believe satisfy the above assumptions for A\mathcal{A} being the class of realistic computations and ε\varepsilon being something like 1/21281/2^{128} -- are elliptic curves over finite fields.

Giving a complete definition of what an elliptic curve is requires a lot of math, and is not very useful from the point of view of cryptography. So we will give a definition that is not complete, but more useful.

An elliptic curve EE over a field FF is a set of the form

{(x,y)F×Fy2=x3+ax+b}\{ (x, y) \in F \times F \mid y^2 = x^3 + ax + b \}

for some a,bFa, b \in F, plus an additional point OO which is called the point at infinity. Basically it's a set of pairs satisfying some equation of that form. The data of the elliptic curve itself is just the field FF together with the constants aa and bb.

What is special about elliptic curves is that there is a way of giving them a group structure with simple to compute operations that we believe satisfy the hardness assumptions described above.

Group negation () ⁣:EE(-) \colon E \to E is defined by

(x0,y0)=(x0,y0)-(x_0, y_0) = (x_0, - y_0)

so we just negate the yy coordinate.

The identity for the group is OO, the point at infinity. For that point we may also therefore write it as 00.

Group addition is more complicated to define, so we will not, but here is what's worth knowing about how to compute (x0,y0)+(x1,y1)(x_0, y_0) + (x_1, y_1)

  • There are three cases

    1. x0x1x_0 \neq x_1

    2. x0=x1x_0 = x_1 and y0=y1y_0 = y_1

    3. x0=x1x_0 = x_1 but y0y1y_0 \neq y_1. In this case it turns out y0=y1y_0 = -y_1 and so the two points are inverse, and we return OO

    In cases 1 and 2, the algorithm to compute the result just performs a simple sequence of some field operations (multiplications, additions, divisions, subtractions) with the input values. In other words, there is a simple arithmetic formula for computing the result.

Elliptic curves in Rust

Elliptic curves in Sage

Serializing curve points

Given a curve point (x,y)(x,y) we know y2=x3+ax+by^2 = x^3 + ax + b and thus yy is one of the two square roots of x3+ax+bx^3 + ax + b. If yy is a given square root, the other square root is y-y since y2=(y)2y^2 = (-y)^2. So, if we know xx, then we almost know the whole curve point: we just need a single bit to tell us which of the two possible values (i.e., the two square roots of x3+ax+bx^3+ax+b) is the y coordinate.

Here is how this is commonly done over prime order fields Fp\mathbb{F}_p, assuming p2p \neq 2. Remember that we represent elements of Fp\mathbb{F}_p as integers in the range [0,p1][0, p-1]. In this representation, for a field element yy, y-y is the integer pyp - y (or 00 if y=0y = 0). Thus, if yy is odd, then y-y is even (since pp is odd and an odd minus an odd is even). Similarly, if yy is even, then y- y is odd (since an odd minus an even is odd).

So, for any yy, unless yy is 0, yy and y-y have opposite parities. Parity can easily be computed: it's just the least significant bit. Thus, we have an easy way of encoding a curve point (x,y)(x, y). Send

  • xx in its entirety

  • The least significant bit b0b_0 of yy

Given this, we can reconstruct yy as follows.

  1. Compute any square root YY of x3+ax+bx^3 + ax + b

  2. If the least significant bit of YY is equal to b0b_0, then y=Yy = Y, otherwise, y=Yy = -Y

In the case of Mina, our field elements require 255 bits to store. Thus, we can encode a curve point in 255+1=256255 + 1 = 256 bits, or 32 bytes. At the moment this optimized encoding is not implemented for messages on the network. It would be a welcome improvement.

Projective form / projective coordinates

The above form of representing curve points as pairs -- which is called affine form -- is sometimes sub-optimal from an efficiency perspective. There are several other forms in which curve points can be represented, which are called projective forms.

The simple projective form represents a curve EE as above as the set

{(X,Y,Z)F3(Y/Z)2=(X/Z)3+a(X/Z)+b}\{ (X, Y, Z) \in F^3 \mid (Y/Z)^2 = (X/Z)^3 + a(X/Z) + b \}

If you think about it, this is saying that (X/Z,Y/Z)(X/Z, Y/Z) is a point on the original curve in affine form. In other words, in projective form we let the first two coordinates get scaled by some arbitrary scaling factor ZZ, but we keep track of it as the third coordinate.

To be clear, this means curve points have many different representations. If (x,y,z)(x, y, z) is a curve point in projective coordinates, and ss is any element of FF, then (sx,sy,sz)(sx,sy,sz) is another representation of the same curve point.

This means curve points require more space to store, but it makes the group operation much more efficient to compute, as we can avoid having to do any field divisions, which are expensive.

Jacobian form / Jacobian coordinates

There is another form, which is also sometimes called a projective form, which is known as the jacobian form. There, a curve EE would be represented as

{(X,Y,Z)F3(X/Z2,Y/Z3)E}\{ (X, Y, Z) \in F^3 \mid (X / Z^2, Y / Z^3) \in E \}

so the triple (X,Y,Z)(X, Y, Z) corresponds to the affine point (X/Z2,Y/Z3)(X/Z^2, Y/Z^3). These are the fastest for computing group addition on a normal computer.

Take aways

  • Use affine coordinates when the cost of division doesn't matter and saving space is important. Specific contexts to keep in mind:

    • Working with elliptic curve points inside a SNARK circuit
  • Use Jacobian coordinates for computations on normal computers, or other circumstances where space usage is not that costly and division is expensive.

  • Check out this website for the formulas for implementing the group operation.

More things to know

  • When cryptographers talk about "groups", they usually mean a "computational group", which is a group equipped with efficient algorithms for computing the group operation and inversion. This is different because a group in the mathematical sense need not have its operations be computable at all.

Exercises

  • Implement a type JacobianPoint<F:Field> and functions for computing the group operation

  • Familiarize yourself with the types and traits in ark_ec. Specifically,

    • todo
  • Implement fn decompress<F: SquareRootField>(c: (F, bool)) -> (F, F)