## Congruence Group Γ(2)

The modular group Γ is the one that generated the hyperbolic tessellations seen previously. It can be generated by the two transformations S: z → -1/z and T: z → z + 1.

Or in matrix form: [0,1;-1,0], [1,1;0,1]

The Congruence Group Γ(2) can be generated by these matrices:

[1,2;0,1], [1,0;-2,1]
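As a sanity check (my own sketch in plain JavaScript, not part of the post), one can multiply words in these two generators and confirm that every product keeps a and d odd and b and c even, i.e. stays congruent to the identity mod 2:

```javascript
// Matrices are flat arrays [a, b, c, d] for [a,b;c,d].
function mul(A, B) {
  return [
    A[0] * B[0] + A[1] * B[2], A[0] * B[1] + A[1] * B[3],
    A[2] * B[0] + A[3] * B[2], A[2] * B[1] + A[3] * B[3],
  ];
}
const A = [1, 2, 0, 1];   // z -> z + 2
const B = [1, 0, -2, 1];  // z -> z / (-2z + 1)

// Membership test for Γ(2): determinant 1, a and d odd, b and c even.
function inGamma2(M) {
  const [a, b, c, d] = M;
  return a * d - b * c === 1 &&
    Math.abs(a % 2) === 1 && Math.abs(d % 2) === 1 &&
    b % 2 === 0 && c % 2 === 0;
}
const word = mul(mul(A, B), mul(B, A)); // an arbitrary word in the generators
console.log(inGamma2(word)); // → true
```

This works because reduction mod 2 is a homomorphism, so any word in matrices congruent to I mod 2 is itself congruent to I mod 2.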

As before, the orange region is the fundamental domain. It and all of the other tiles are 4-sided and touch the rim at 4 points. Here is another representation of the fundamental domain from Wolfram MathWorld[1]:

The 4 corners of this fundamental domain are -1, 0, 1, and ∞. The red letters indicate the initial tiling by 1-letter words (e.g. T, T⁻¹, S, S⁻¹). These areas correspond to the yellow quadrilaterals in the tiling above.

[1] Weisstein, Eric W. “Modular Group Lambda.” From *MathWorld*–A Wolfram Web Resource. http://mathworld.wolfram.com/ModularGroupLambda.html

## Ford Circles and Farey Graphs

Here is an image of a tessellation or tiling of the upper half-plane. Under some group of symmetries, an initial triangle (for instance the one in orange) covers the plane without overlaps. (The initial triangle is not ideal because only one of its angles is 0.) The group of symmetries used here is the modular group. See my previous post where I go into this in greater detail.

Note that each triangle has a vertex on the rim of the circle. The rim of the circle is the real projective number line.

The orange initial triangle, aka the fundamental domain, touches the real number line at the point at infinity. The green triangle right below the orange triangle touches the real number line at the point 0. The green triangle = S(orange triangle), where S: z → -1/z. Every other vertex touching the rim of the disc can be obtained by applying some combination of S and T (T: z → z + 1).

If we place a circle of radius 1/2 passing through *i* and 0, and then apply S and T repeatedly, we get the Ford circles. The set of points where the circles touch the real number line is the set of rational numbers, Q. This is the same set of points where the vertices of the tessellated initial triangles touch the rim.

“The Farey graph is the PSL(2,Z) images of the imaginary axis, and so contains all of Q* as its points at infinity. It is not hard to see that p/q and r/s are connected by an edge iff ps-qr = +/-1, so it is in some sense a geometric recording of the matrix group.” [1]
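Here is a small sketch (mine, not from the quoted notes) tying the two ideas together. It uses the standard fact, not stated above, that the Ford circle at p/q in lowest terms has radius 1/(2q²), and checks numerically that two Ford circles are tangent exactly when |ps − qr| = 1, the Farey-edge condition from the quote:

```javascript
// Ford circle at p/q: center (p/q, 1/(2q^2)), radius 1/(2q^2).
function fordRadius(q) { return 1 / (2 * q * q); }

// The quoted edge condition for the Farey graph.
function fareyNeighbors(p, q, r, s) {
  return Math.abs(p * s - q * r) === 1;
}

// Two circles are externally tangent when the distance between their
// centers equals the sum of their radii.
function fordTangent(p, q, r, s) {
  const dx = p / q - r / s;
  const centerDist2 = dx * dx + (fordRadius(q) - fordRadius(s)) ** 2;
  const radiiSum = fordRadius(q) + fordRadius(s);
  return Math.abs(centerDist2 - radiiSum * radiiSum) < 1e-12;
}

console.log(fareyNeighbors(1, 2, 1, 3), fordTangent(1, 2, 1, 3)); // → true true
console.log(fareyNeighbors(1, 4, 1, 2), fordTangent(1, 4, 1, 2)); // → false false
```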

Here the images of the imaginary axis are highlighted and the Farey graph is evident:

[1] http://mduchin.math.tufts.edu/notes/hyp-groups-course.pdf

[2] “Indra’s Pearls”, pages 210-213.

[3] http://homepages.warwick.ac.uk/~masbb/HypGeomandCntdFractions-2.pdf

## Tessellation of the Hyperbolic Plane on the Riemann Sphere

I’m interested in learning more about the modular group and this article looking into hyperbolic tessellations represents my initial efforts to collect my thoughts on the topic.

We are familiar with hyperbolic tessellations from the artwork of M.C. Escher and from numerous renderings of the Poincaré disk (which was really invented by Beltrami, not Poincaré):

The disk model is not the only model of hyperbolic space. There are also the Klein model, the hyperboloid model, the upper half plane model, and probably some others. Let’s consider the upper half plane model. One way to look at the Poincaré half plane model is as a view of that part of the complex plane where the imaginary part is positive. So we’re only considering half of the complex plane, the upper half. Here is what a tiling of the upper half plane looks like:

###### By Fropuff (from en wikipedia) [GFDL (http://www.gnu.org/copyleft/fdl.html) or CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0/)], via Wikimedia Commons

To quote wikipedia: “The projective linear group PGL(2,**C**) acts on the Riemann sphere by the Möbius transformations. The subgroup that maps the upper half-plane, **H**, onto itself is PSL(2,**R**)”. So Möbius transformations with real coefficients preserve the upper half-plane: any such transformation keeps points that are in the upper half-plane within the upper half-plane. (And incidentally, the real numbers are mapped to themselves.)

If we look at the upper half plane model this way (as the upper half of the complex plane), then when we render it on the Riemann Sphere, it looks like this:

In this depiction, the tiled portion is the upper half plane. The red hemisphere is not the upper half plane; it is the lower half plane. The green axis points to (∞, ∞i), the blue axis intercepts the sphere at (-1,0i), and the red axis intercepts the sphere at (0,-i).

The tessellation depicted above is of the modular group, PSL(2,Z), where Z indicates the set of integers. So in other words the modular group Γ consists of those Möbius transformations with integer coefficients and determinant 1. It is a subgroup of PSL(2,R).

“The modular group can be shown to be generated by the two transformations S: z → -1/z and T: z → z + 1, so that every element in the modular group can be represented (in a non-unique way) by the composition of powers of *S* and *T*. Geometrically, *S* represents inversion in the unit circle followed by reflection with respect to the imaginary axis, while *T* represents a unit translation to the right.”

So we can use those 2 transformations and apply them over and over. What region should we start with as our initial tile (or to use the technical terminology: what *fundamental domain* should we choose)? A common choice is the grey region depicted in the upper half plane figure above. It is the region where |z| > 1 and |Re(z)| < 1/2. The orange triangle in the spherical tiling above is the very same region.

You can view my Three.js version of this tessellation here where you can spin and zoom it as you please. Here is an alternative coloring.

Note that we are drawing this diagram using WebGL shaders. Pixel shaders (aka fragment shaders) are called once for each pixel. In our case a pixel maps to a complex coordinate. So our code is called repeatedly and handed a different complex coordinate on each call. The pixel shader has to decide what color to use for the given pixel/complex coordinate. So it has to do the Möbius transforms in reverse. (I discuss this issue in greater detail here). This shows how the tessellation works [3]:

Thus, in code, the logic is:

if |z| < 1, apply S⁻¹; else if Re(z) < -0.5, apply T; else if Re(z) > 0.5, apply T⁻¹; repeat until z lies in the fundamental domain.

(remember we’re doing everything backwards because we’re working with pixel shaders).
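In plain JavaScript (a sketch of the idea for illustration, not the actual shader code), the reduction loop looks like this:

```javascript
// Repeatedly apply T^±1: z -> z ± 1 and S: z -> -1/z until z = x + iy
// lands in the fundamental domain |z| >= 1, |Re(z)| <= 1/2.
function reduce(x, y) {
  for (let i = 0; i < 100; i++) {        // bail out after 100 steps
    if (x < -0.5) { x += 1; continue; }  // apply T
    if (x > 0.5)  { x -= 1; continue; }  // apply T^-1
    const r2 = x * x + y * y;
    if (r2 < 1) { x = -x / r2; y = y / r2; continue; } // apply S: -1/z
    break;                               // inside the fundamental domain
  }
  return [x, y];
}
const [x, y] = reduce(3.3, 0.4);
console.log(Math.abs(x) <= 0.5 && x * x + y * y >= 1 - 1e-9); // → true
```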

By a suitable placement of the camera you can get a view that looks just like the Poincaré disk:

This provides an intuition for believing that there is a simple mapping from the upper half plane model to the disk model, and indeed there is, it is called the Cayley transform.

In case it wasn’t clear, the rim of the upper half plane is the real line, or to be precise the projective real line (the real line plus a point at infinity). A cool thing about the modular group is that each point where the tessellation intersects the real line is a rational number. That makes sense because any combination of powers of S and T, applied to the cusp at ∞, can only generate a rational number (or ∞).

In other words, the points where the translates of the fundamental domain hit the boundary are those points s ∈ R ∪ {∞} that are fixed by a parabolic element of Γ. These points are precisely Q ∪ {∞}, where Q stands for the set of rational numbers.

One such parabolic element is the transform T above. You can tell it is parabolic because the trace of the 2×2 matrix corresponding to the Möbius transform for T is 2.
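The trace test can be made concrete. For a determinant-1 matrix, the standard classification is |trace| < 2 elliptic, |trace| = 2 parabolic, |trace| > 2 hyperbolic (a sketch of mine, not code from the post):

```javascript
// Classify a Möbius transform by the trace of its det = 1 matrix [a,b;c,d].
function classify([a, b, c, d]) {
  const t = Math.abs(a + d);
  if (t < 2) return 'elliptic';
  if (t === 2) return 'parabolic';
  return 'hyperbolic';
}
console.log(classify([1, 1, 0, 1]));  // T: z -> z + 1 → 'parabolic'
console.log(classify([0, 1, -1, 0])); // S: z -> -1/z  → 'elliptic'
```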

So what does this have to do with hyperbolic tessellations? We’ve explained how a special set of Möbius transformations, those that generate the modular group, can tile the upper half plane. The upper half plane is a model of hyperbolic geometry. This was the famous insight that came to Poincaré in a flash:

*“At the moment when I put my foot on the step the idea came to me, without anything in my former thoughts seeming to have paved the way for it, that the transformations I had used to define the Fuchsian functions were identical with those of non-Euclidean geometry”* L’Invention Mathématique, Henri Poincaré

Or as wikipedia says: “Möbius transformations are also isometries of the hyperbolic plane“. This seems a strange statement. Möbius transformations are conformal on the complex plane: they preserve angles, but they don’t preserve lengths. Isometry means ‘length preserving’. This is possible because in hyperbolic space, in marked contrast to Euclidean space, equal angles guarantee that corresponding side lengths are equal in the hyperbolic metric.[1] A hyperbolic triangle’s size is determined by its angles.

As a final note, since we’re working with the Riemann sphere we can do all sorts of fun mappings from 360 video footage, as I detailed elsewhere. So piping video through the tessellation gives you:

Video 5 from Robert Woodley on Vimeo.


—————-

[1] http://www.mathematica-journal.com/issue/v9i3/contents/ModularGroup/ModularGroup.pdf

[2] https://math.dartmouth.edu/~m125x15/quat-book-chap29-072315.pdf

[3] http://www.math.uconn.edu/~kconrad/blurbs/grouptheory/SL(2,Z).pdf

## Crib Sheet on Mathematical Groups

The **general linear group** of degree *n* is the set of *n* × *n* invertible matrices, together with the operation of ordinary matrix multiplication. This forms a group, because the product of two invertible matrices is again invertible, and the inverse of an invertible matrix is invertible. The group is so named because the columns of an invertible matrix are linearly independent, hence the vectors/points they define are in general linear position, and matrices in the general linear group take points in general linear position to points in general linear position.

The general linear group over **R** (the set of real numbers) is the group of *n* × *n* invertible matrices of real numbers, and is denoted by GL_{n}(**R**) or GL(*n*, **R**).

The **special linear group**, written SL(*n*, *F*) or SL_{n}(*F*), is the subgroup of GL(*n*, *F*) consisting of matrices with a determinant of 1, with the group operations of ordinary matrix multiplication and matrix inversion. Matrices of this type form a group since the determinant of the product of two matrices is the product of the determinants of each matrix. The special linear group SL(*n*, **R**) can be characterized as the group of *volume and orientation preserving* linear transformations of **R**^{n}; this corresponds to the interpretation of the determinant as measuring change in volume and orientation.

**SL(2,R)** or SL_{2}(**R**) is the group of 2 × 2 real matrices with determinant one. SL(2,**R**) acts on the complex upper half-plane by fractional linear transformations. This extends to an action of **SL(2,R)** on **C** ∪ {∞}, which takes **R** ∪ {∞} to itself and stabilizes the upper and lower half-planes.

The **Möbius group** is the projective linear group PGL(2,**C**). **PGL(2,C)** consists of 2×2 complex matrices with non-zero determinant, up to a scalar factor. Möbius transformations are defined on the extended complex plane, **C***. Stereographic projection identifies **C*** with a sphere, which is then called the Riemann sphere; alternatively, **C*** can be thought of as the complex projective line **CP1**. The Möbius transformations are exactly the bijective conformal maps from the Riemann sphere to itself, i.e., the automorphisms of the Riemann sphere as a complex manifold; alternatively, they are the automorphisms of **CP1** as an algebraic variety. Therefore, the set of all Möbius transformations forms a group under composition. This group is called the Möbius group, and is sometimes denoted **Aut(C*)**.

**Kleinian Group** – a discrete subgroup of PSL(2,**C**). (There is no ambiguity with PGL(2,**C**): over **C** the two groups coincide, since every nonzero complex scalar has a square root.) Discreteness implies that points in *B*^{3} have finite stabilizers and discrete orbits under the group *G*. But the orbit *Gp* of a point *p* will typically accumulate on the boundary of the closed ball **B**^{3}. Let *T* be a periodic tessellation of hyperbolic 3-space. The group of symmetries of the tessellation is a Kleinian group.

**Fuchsian Group** – Any Fuchsian group (a discrete subgroup of SL(2,**R**)) is a Kleinian group, and conversely any Kleinian group preserving the real line (in its action on the Riemann sphere) is a Fuchsian group. More generally, any Kleinian group preserving a circle or straight line in the Riemann sphere is conjugate to a Fuchsian group. “Indra’s Pearls” says that any group which can be conjugated to a group meeting the above requirements is also a Fuchsian group, which seems reasonable. In other words, the traces of Fuchsian matrices must be real; traces don’t change under conjugation.

A **quasi-Fuchsian group** is a Kleinian group whose limit set is contained in an invariant Jordan curve. In other words the limit set divides the ordinary set into an inside and an outside. The limit set can be crinkly. The special case when the Jordan curve is a circle or line is called a Fuchsian group.

The **modular group** is the projective special linear group *PSL*(2,**Z**) of 2 x 2 matrices with integer coefficients and unit determinant. It acts on the upper half of the complex plane because its integer entries are real, so it is a subgroup of PSL(2,**R**). Some authors *define* the modular group to be PGL(2,**Z**), and still others define the modular group to be the larger group SL(2,**Z**).

The modular group acts on **H** (the upper half plane). It can be generated by the two transformations *S* and *T*. Geometrically, *S* represents inversion in the unit circle followed by reflection with respect to the imaginary axis, while *T* represents a unit translation to the right.

Γ(*N*) is a normal subgroup of the modular group Γ. The group Γ(*N*) is given as the set of all modular transformations (az + b)/(cz + d) for which *a* ≡ *d* ≡ ±1 (mod *N*) and *b* ≡ *c* ≡ 0 (mod *N*).
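As a sketch (mine, not from the crib sheet's sources), the Γ(N) condition is easy to test directly on a matrix:

```javascript
// Non-negative remainder, since JS % keeps the sign of the dividend.
const mod = (x, n) => ((x % n) + n) % n;

// a ≡ d ≡ ±1 (mod N), b ≡ c ≡ 0 (mod N), determinant 1.
function inGammaN([a, b, c, d], N) {
  const unit = (mod(a, N) === mod(1, N) && mod(d, N) === mod(1, N)) ||
               (mod(a, N) === mod(-1, N) && mod(d, N) === mod(-1, N));
  return a * d - b * c === 1 && unit && mod(b, N) === 0 && mod(c, N) === 0;
}
console.log(inGammaN([1, 2, 0, 1], 2)); // → true  (a generator of Γ(2))
console.log(inGammaN([1, 1, 0, 1], 2)); // → false (T is not in Γ(2))
```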

The principal congruence subgroup of level 2, Γ(2), is also called the **modular group Λ**. Since PSL(2, **Z**/2**Z**) is isomorphic to *S*_{3}, Λ is a subgroup of index 6. The group Λ consists of all modular transformations for which *a* and *d* are odd and *b* and *c* are even.

The subgroup Γ₀(*n*), sometimes called the *Hecke congruence subgroup* of level *n*, is defined as the preimage under reduction mod *n* of the group of upper triangular matrices; that is, the matrices with *c* ≡ 0 (mod *n*).

The subgroups Γ₁(*n*) are the preimage of the subgroup of unipotent upper triangular matrices: *a* ≡ *d* ≡ 1 (mod *n*) and *c* ≡ 0 (mod *n*).

The *theta subgroup* Γ_θ is the congruence subgroup of SL(2,**Z**) defined as the preimage of the cyclic group of order two generated by the image of *S* in SL(2,**Z**/2**Z**). It is of index 3.

It is closely related to the modular group Lambda above: an element of Γ_θ is congruent mod 2 either to the identity (*a* and *d* odd, *b* and *c* even, as above) or to *S*, so Λ sits inside Γ_θ with index 2.

**SU(2)** is the group of unit quaternions. They can be used to define regular polyhedra. A Möbius transformation is an isometry of the Riemann sphere (for the chordal metric) if, and only if, it is represented by a 2×2 matrix that is a member of **SU(2)**. This chordal metric is nothing more than the Euclidean distance between the two points that z1 and z2 are mapped to on the unit sphere by stereographic projection.

## Möbius Transformations on Spherical Photos and Videos

Spherical Video presents some interesting challenges, for instance, how do you zoom? Or how do you rotate on any arbitrary axis? As the mathematician Henry Segerman pointed out in a post for EleVR, you can achieve both of the above using Möbius transformations. The transformations are conformal (they preserve angles), and they map circles to circles (considering a line to be a circle of infinite radius). Professor Segerman’s work is the inspiration for the project I am describing here.

I wrote an implementation using Möbius transformations to manipulate spherical images in WebGL/Three.js. The implementation is here. Click on the ‘?’ for help or watch this video.

So what is going on here?

### The mathematical procedure:

Spherical cameras such as the Ricoh Theta save images in an equirectangular format. This is then wrapped around a sphere to give the spherical effect. (With three.js you create a material whose texture is the saved image from the Theta and then create a mesh using this material and a sphere geometry.) The X and Y of the equirectangular format map to the longitude and latitude of the sphere.

The Riemann sphere is a representation of the complex plane as a sphere using reverse stereographic projection. Möbius transformations are transformations of the complex plane. Our three.js sphere is a normal sphere in 3-d cartesian space (R3); each point on its surface has an (x,y,z) coordinate where x, y and z are real numbers. We can convert our x,y,z point to a point in the complex plane if we take our sphere to be a Riemann sphere. Or in jargon we can say equivalently: P¹(C) is diffeomorphic to the sphere S². The formulas are here (and elsewhere).

So for instance the south pole is (0,0,-1) in cartesian/R3 space and (0,0i) in complex space. The north pole is (0,0,1) in cartesian space/R3 and (∞, ∞i) in complex space. (The Riemann sphere is really the complex plane plus one point, which is the point at the north pole, but I digress).
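A sketch of the conversion under this convention, i.e. stereographic projection from the north pole (my own helper functions, not the ones in the app):

```javascript
// south pole (0,0,-1) <-> 0, north pole (0,0,1) <-> ∞.
function sphereToComplex(x, y, z) {
  return [x / (1 - z), y / (1 - z)];       // [re, im]; blows up at the north pole
}
function complexToSphere(re, im) {
  const s = re * re + im * im;             // |w|^2
  return [2 * re / (s + 1), 2 * im / (s + 1), (s - 1) / (s + 1)];
}
console.log(sphereToComplex(0, 0, -1));    // → [ 0, 0 ]  (south pole -> 0)
console.log(complexToSphere(0, 0));        // → [ 0, 0, -1 ]
console.log(complexToSphere(1, 0));        // → [ 1, 0, 0 ] (point on the equator)
```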

### The technical procedure:

To implement our transformations, we are going to move pixels around on the texture. We can only do this in the shader since doing it in javascript would be way too slow. Specifically we are going to do it in the fragment shader. Our vertex shader will be short and boring, doing only what is necessary to build the sphere’s vertices. Nothing special there. The fragment shader is where the work is done.

The fragment shader is called when the graphics layer needs to know what color to put at a given UV coordinate. We will pluck that pixel from our transformed texture. The steps to execute on each call to the fragment shader are:

1 – Get the UV coordinates. ‘UV’ is industry nomenclature for the X and Y coordinates of the texture that the graphics layer is trying to draw.

2 – We know that the rectangular texture must map to a sphere. The top of the rectangle maps to the north pole, the bottom to the south pole. So we can calculate the X, Y, and Z coordinate of the sphere corresponding to the UV coordinate. That is, we know which point of the sphere we are drawing.

3 – This sphere is also a Riemann sphere as mentioned above, so we calculate the point (a + bi) on the complex plane corresponding to the cartesian point (X,Y,Z) on the sphere, using the formulas mentioned above.

4 – We now know our location on the complex plane and we apply as many transformations as we’d like. When we are done we have our new complex point (c + di).

5 – We reverse the above steps and calculate the cartesian X,Y,Z corresponding to (c + di). Then we calculate the UV corresponding to this X,Y,Z and we use the color at that pixel, UV.
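Here is a plain JavaScript sketch of steps 1–5 (the real work happens in GLSL; the equirectangular conventions below are my assumptions for illustration). With the identity transform, the round trip should return the original UV:

```javascript
// Assumed convention: u = longitude / 2π; v = 0 at the south pole, 1 at the north.
function uvToSphere(u, v) {
  const lon = u * 2 * Math.PI, lat = (v - 0.5) * Math.PI;
  return [Math.cos(lat) * Math.cos(lon), Math.cos(lat) * Math.sin(lon), Math.sin(lat)];
}
function sphereToUv([x, y, z]) {
  const lon = Math.atan2(y, x), lat = Math.asin(z);
  return [((lon / (2 * Math.PI)) + 1) % 1, lat / Math.PI + 0.5];
}
// Stereographic projection from the north pole, as in the text.
function sphereToComplex([x, y, z]) { return [x / (1 - z), y / (1 - z)]; }
function complexToSphere([re, im]) {
  const s = re * re + im * im;
  return [2 * re / (s + 1), 2 * im / (s + 1), (s - 1) / (s + 1)];
}
const identity = (w) => w;                 // swap in any Möbius transform here
function shade(u, v, transform) {
  // steps 2-5: uv -> sphere -> complex -> transform -> sphere -> uv
  return sphereToUv(complexToSphere(transform(sphereToComplex(uvToSphere(u, v)))));
}
const [u2, v2] = shade(0.25, 0.6, identity);
console.log(Math.abs(u2 - 0.25) < 1e-9 && Math.abs(v2 - 0.6) < 1e-9); // → true
```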

The only Möbius transformations I’ve implemented for now are those to do rotation and zoom. By default the fixed points are antipodal though you can explicitly set the fixed points using the Epsilon 1 and Epsilon 2 buttons. The video linked to above gives a good overview of what all the different buttons do.

This is a rotation around 2 non-antipodal fixed points:

In addition to Möbius transforms to do rotation and zoom, there are some other complex transformation options.

You may have noticed that we start with the point on the complex plane (a + bi) for which we need a pixel. We then transform (a + bi) into (c + di) to get the pixel we will draw. Thus the transformation is really a reverse transformation.

Infinities of volleyball players on North Avenue Beach (Works at 60FPS on video!):

This article by Henry Segerman gives a comprehensive overview of the math behind these transformations and includes more examples of other transformations you can do.

## Thought Bubbles

A window is a two-dimensional hole in a two-dimensional plane that allows you to see into a three-dimensional world. So what if we could make three-dimensional holes?

The original idea came from a three.js demo by altered qualia where he was demo’ing fresnel shaders:

I didn’t care much about fresnel shaders, but was intrigued by the bubbles. I realized that rendering the outer sphere was not needed to render the bubbles:

These bubbles are three-dimensional windows that can be taken anywhere. Spherical holograms.

To render the above image, you need 2 cameras, each one looking at a separate texture. Now look at this next image. If we flood Michigan Avenue we need 2 cameras as well: one for the surrounding sphere and one to capture the reflection.

We can view the scene of a flooded Michigan Avenue at home, via one of our holo-bubbles from above. The camera count is now 3, and you have to deal with the fact that while you want them all to turn in sync, some will need to be wider angle than others.

We can make the inner camera have such a wide angle that it captures the zenith and the nadir, and the cursor both zooms and moves in space. It is very sensitive. (Try it!)

This final image is one of the strangest I’ve made to date. Only one camera, with the sphere in the middle seemingly refracting the surroundings; but even though the image in the sphere is reversed, it tracks the outer image as they rotate. How is this possible?

Most of the above images can and should be clicked to see the rendering in WebGL.

## Complex Surfaces

I wrote a separate version of formula toy that handles complex functions. So, you can type in a function like: f=sqrt(g). Both f and g are expected to be complex functions:

g = u + iv

and

f = w + ix

where u is the real component of g, v is the imaginary component of g, etc.
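For instance, evaluating f = sqrt(g) at one grid point might look like this (a sketch using the principal branch; formula toy's actual branch handling may differ):

```javascript
// Principal square root of g = u + iv, returned as f = [w, x].
function csqrt(u, v) {
  const r = Math.hypot(u, v);                        // |g|
  const w = Math.sqrt((r + u) / 2);                  // real part of sqrt(g)
  const x = Math.sign(v || 1) * Math.sqrt((r - u) / 2); // imaginary part
  return [w, x];
}
const [fR, fI] = csqrt(0, 2);              // sqrt(2i) = 1 + i
console.log(fR.toFixed(3), fI.toFixed(3)); // → 1.000 1.000
```

The quadruple (gR, gI, fR, fI) is then distributed over the 3 spatial axes plus the color gradient, as described below.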

To draw this surface we need 4 axes, so formula toy uses a color gradient for the 4th axis. We name these as follows:

– fR – the real component of f.

– fI – the imaginary component of f.

– gR – the real component of g.

– gI – the imaginary component of g.

And you can choose which complex axis maps to which cartesian axis.

This flexible way of doing the mappings allows for simple multi-valued functions/Riemann surfaces:

Click on the images to open the surface in formula toy. More details here.

## Torus Knots

Formula Toy is a simple and free WebGL app I wrote that allows you to enter 3-d formulas and see the resulting surface. Sort of like Desmos, but for 3D.

When I first wrote it you could express your formulas in 3 different coordinate systems: cartesian, spherical (polar), and cylindrical. I recently added toroidal, which is not very useful except for drawing tori:

radius=.5+sin(phi)/(15)*3

AND I added parametric surfaces. If you choose this option, the system is expecting formulas for X, Y, and Z. You are given U and V. For instance a helicoid could be expressed as:

u=u*(2*pi); v=v*(6*pi); x=u*cos(v); y=u*sin(v); z=v;

Parametric equations that are functions of a single variable (t) instead of 2 variables (u,v), don’t display because they have zero thickness. For instance the parametric formula for a trefoil knot is a function of 1 variable. To make this visible in formula toy, you need a tube geometry so that the knot doesn’t have zero thickness/volume. Three.js offers a tube geometry but you can also achieve the same effect using the parametric geometry, but with a little extra math. We can start with a torus, and make that our ‘tube’ that will be bent into a trefoil. Here is the parametric formula for a torus:

u=u*(2*pi); v=v*(2*pi); x=(4+cos(v))*cos(u); y=(4+cos(v))*sin(u); z=sin(v);

So to plot a trefoil:

// Setup
u=u*(2*pi); v=v*(2*pi); phi=u;

// Parametric formula for a trefoil
pp=2; // A trefoil is a (2,3) torus knot
qq=3;
rr=cos(qq*phi)+2;
// xx,yy,zz are the formula for the trefoil, a function of one variable (phi)
xx=rr*cos(pp*phi); yy=rr*sin(pp*phi); zz=-3*sin(qq*phi);

// Modified torus formula to bend a torus into a trefoil shape:
x=(4*xx+cos(v)*xx/rr); y=(4*yy+cos(v)*yy/rr); z=sin(v)+zz/rr;

A trefoil is just one example of a torus knot. It is a (2,3) torus knot, which means it winds 3 times around a circle in the interior of the torus, and 2 times around the torus’ axis of rotational symmetry. There is a whole family of torus knots (*p,q*). *p* and *q* correspond to **pp** and **qq** in the parametric formula above.
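The centerline formulas above can be sampled directly; this sketch (mine, mirroring the xx, yy, zz formulas) checks that the (2,3) knot closes up after phi runs from 0 to 2π:

```javascript
// Centerline of a (p,q) torus knot, matching xx, yy, zz above.
function knotPoint(p, q, phi) {
  const r = Math.cos(q * phi) + 2;
  return [r * Math.cos(p * phi), r * Math.sin(p * phi), -3 * Math.sin(q * phi)];
}
const a = knotPoint(2, 3, 0);
const b = knotPoint(2, 3, 2 * Math.PI);
console.log(a.every((c, i) => Math.abs(c - b[i]) < 1e-9)); // → true (closed curve)
```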

For instance, here is a (*3,7*) knot:

Below is another example which shows a (*5,6*) torus knot winding around a torus. This was not done in Formula Toy since it draws 2 surfaces (the knot and the torus), and Formula Toy only draws one surface. (but it was all done with three.js).

## Rotations, Transformations – Geometries and Meshes

I was driving myself batty trying to get all my rotations and transformations to behave correctly in three.js. In these kinds of cases one must always pare down to essentials. As follows:

Let’s create a simple mesh and place it on the scene:

// Example 1
var geo = new THREE.BoxGeometry(5,5,20,32);
_mesh = new THREE.Mesh(geo, new THREE.MeshNormalMaterial());

Then in the render loop, we rotate it around the Z axis (the blue axis):

_mesh.rotation.z = -_tick * Math.PI/256;

The result:

This is just a screen grab, an animated gif. Hence the jerk when the animation restarts.

Now what happens if we rotate the mesh in the initial setup:

// Example 2
var geo = new THREE.BoxGeometry(5,5,20,32);
_mesh = new THREE.Mesh(geo, new THREE.MeshNormalMaterial());
_mesh.rotateY(Math.PI/2);

Code in the render loop is the same as before:

_mesh.rotation.z = -_tick * Math.PI/256;

Now the block rotates along the X (red) axis, even though we told it to rotate along the Z axis.

If we specify rotation order in the initial setup, then it will rotate around the Z axis:

// Example 3
_mesh.rotation.order = 'ZXY';

The result:

Now let’s try some translation. Initial setup:

// Example 4
var geo = new THREE.BoxGeometry(5,5,20,32);
_mesh = new THREE.Mesh(geo, new THREE.MeshNormalMaterial());
_mesh.rotation.order = 'ZXY';
_mesh.rotateY(Math.PI/2);
_mesh.position.set(0,-6,0);
_scene.add( _mesh);

Render loop is the same, rotate around Z axis. The result is that the translation is applied and the object rotates around its new local axis, which was also shifted downwards:

Three.js also supports the axis/angle method of specifying rotations:

// Example 5
var geo = new THREE.BoxGeometry(5,5,20,32);
_mesh = new THREE.Mesh(geo, new THREE.MeshNormalMaterial());
// these have no effect because in render() we will directly modify the internal rotation matrix
//_mesh.rotateY(Math.PI/4);
//_mesh.rotation.order = 'ZXY';

Render loop:

var axis = new THREE.Vector3( 0, 0, 1 );
var angle = _tick * Math.PI / 256;
// _matrix is a THREE.Matrix4()
_matrix.makeRotationAxis( axis.normalize(), angle );
_mesh.rotation.setFromRotationMatrix( _matrix );

The result is identical to what we achieved above in Example 1 with mesh.rotation. But often it is easier to conceptualize an axis of rotation rather than a succession of Euler angles.

We can mimic the Example 3 result using makeRotationY and applying it to the *geometry*.

// Example 6
var geo = new THREE.BoxGeometry(5,5,20,32);
geo.applyMatrix( new THREE.Matrix4().makeRotationY( Math.PI/2 ) );
_mesh = new THREE.Mesh(geo, new THREE.MeshNormalMaterial());
_scene.add( _mesh);

Render loop is same as Example 5, with the desired result.

Now why did that work, whereas _mesh.rotateY was ignored? Because we rotated the geometry; the vertices were changed. This will be clearer if we translate the geometry:

// Example 7
var geo = new THREE.BoxGeometry(5,5,20,32);
geo.applyMatrix( new THREE.Matrix4().makeRotationY( Math.PI/2 ) );
geo.applyMatrix( new THREE.Matrix4().makeTranslation(0,-6,0) );
_mesh = new THREE.Mesh(geo, new THREE.MeshNormalMaterial());
_scene.add( _mesh);
// put sphere at mesh origin
var sphere = new THREE.Mesh(
    new THREE.SphereGeometry(1,20,20),
    new THREE.MeshNormalMaterial());
sphere.position.set(0, 0, 0); // we could put the sphere anywhere
                              // and the box would rotate around it

Render loop code is the same as the previous 2 examples:

var axis = new THREE.Vector3( 0, 0, 1 );
var angle = -_tick * Math.PI / 256;
// _matrix is a THREE.Matrix4()
_matrix.makeRotationAxis( axis.normalize(), angle );
_mesh.rotation.setFromRotationMatrix( _matrix );

The result:

The sphere represents the center of our geometry, its origin. The vertices have been rotated and translated from the local origin, where the sphere is.

If we’re rotating geometries, it is helpful to create a set of axes to show the local rotation axes of the geometry.

// Example 8:
// put axes at mesh origin with mesh rotation
// drawAxes is my routine which is based on THREE.AxisUtils(). See animation link below.
_scene.add(drawAxes(10, _mesh.position, _mesh.rotation));

Since we’re just rotating the mesh around the Z (blue) axis, we can use this more compact syntax in the render loop:

_mesh.rotation.z = -_tick * Math.PI/128;

The result:

So to review: we wanted a geometry to rotate around a point that was external to the geometry. We did that by transforming the vertices of the geometry using applyMatrix.
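A tiny 2-d sketch (plain JavaScript, not three.js) of why baking the translation into the vertices moves the pivot: rotating a vertex that was first translated by (0,-6) swings it around the origin, whereas rotating first and translating afterwards just spins it in place.

```javascript
// Rotate a 2-d point by angle a around the origin.
function rotZ([x, y], a) {
  return [x * Math.cos(a) - y * Math.sin(a), x * Math.sin(a) + y * Math.cos(a)];
}
const v = [2, 0];
// applyMatrix-style: translation baked into the vertex, then rotated
const bakedThenRotated = rotZ([v[0], v[1] - 6], Math.PI);
// mesh.position-style: rotate about the local origin, then translate
const rotatedThenMoved = (([x, y]) => [x, y - 6])(rotZ(v, Math.PI));
console.log(bakedThenRotated.map(Math.round)); // → [ -2, 6 ]  (orbits the origin)
console.log(rotatedThenMoved.map(Math.round)); // → [ -2, -6 ] (spins in place)
```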

There is another way to accomplish the same result: attach the box mesh to a parent mesh. Set the child’s position (that is to say, the box’s position) to be relative to the parent. And then place the parent where one wants, and rotate the parent if appropriate:

// Example 9:
var geo = new THREE.BoxGeometry(5,5,20,32);
_mesh = new THREE.Mesh(geo, new THREE.MeshNormalMaterial());
_mesh.rotation.y = -Math.PI/2;
_mesh.position.set(0,-6,0);  // _mesh will be the child
_sphere = new THREE.Mesh(    // _sphere will be the parent
    new THREE.SphereGeometry(1,20,20),
    new THREE.MeshNormalMaterial());
_sphere.position.set(0,-12,0);
_sphere.rotation.x -= Math.PI/8;
_sphere.add(_mesh);          // add child to parent
_scene.add(_sphere);
// put axes at parent origin with parent rotation
_scene.add(drawAxes(10, _sphere.position, _sphere.rotation));

And in the render loop, rotate the _sphere, not the _mesh.

When is this method (building a parent child relationship) preferable to changing the geometry? I couldn’t see much difference until I had to work with physi.js to apply physics to some of my meshes. Physi.js only works with meshes that have certain simple geometries (sphere, cone, box, etc). It will however work with compound meshes and thus with more complicated geometries, but in this case a parent/child relationship between the meshes is required.

Full three.js animation is here.

Using r69 of three.js.

## Drawing Pentatope Cross-Sections in three.js

A triangle is the simplest regular figure in 2 dimensions. Its 3 dimensional analogue is a tetrahedron. Its 4 dimensional analogue is the pentatope. Another term for these 3 geometries is: simplex. Many articles online explain these further.

I wanted to see what it would look like if a pentatope passed through our 3-dimensional space. There are many examples of what the projection of a pentatope onto 3-d space would look like (i.e. its shadow), but I didn’t want to do that because while projections can be pretty, they seem less intuitive than cross-sections.

To summarize, the plan was: take 3-dimensional cross-sections of the pentatope (a 4 dimensional shape). The actual drawing will be done using the 3-dimensional drawing toolkit called three.js.

Our equation for the 3-dimensional cross-section is:

ax+by+cz+K = w

This is a linear equation; we’re not dealing with curved cross-sections for now, though that would be fun. a,b, and c are constants. x,y,z, and w are the 4 axes. K is a constant that we change to get different cross-sections.

But why do we need 4 variables to describe a 3-dimensional space? Because we are locating that 3-d space in 4 dimensions. Just as the equation for a plane in 3-d space requires 3 variables (e.g. x + y = z describes a plane in 3-d space).

What about our equation for the pentatope? Well, we don’t have an equation, but we have a collection of points. Our pentatope has 5 corners. Lines connect each point to every other point, resulting in 10 lines. These 10 lines in turn form 10 faces (2-d planes). These 10 faces in turn form 5 3-d tetrahedrons.

Note that:

– The intersection of a 4-d line and a 3-d space is a point.

– The intersection of a 4-d plane and a 3-d space is a line.

– The intersection of a 4-d volume and a 3-d space is a plane.

Our cross-section is going to look like a wire-frame, in other words a collection of lines. So we will capture the intersection of the pentatope’s faces with our cross-sectional space.

Our pseudo-code would look like this:

```
foreach (face in faces) {    // 10 of these
    // Each face is a triangle. Let's name the vertices p0, p1, p2.
    // These points have 4-d coords.
    // point1 and point2 are the points we use to draw a line in 3-d space
    // using three.js. They have 3-d coords of course.
    point1 = calculateIntersectionOfLineWithSpace(p0, p1);
    point2 = calculateIntersectionOfLineWithSpace(p0, p2);
    drawLineUsingThreeJS(point1, point2);
}
```

So the interesting bit is in ‘calculateIntersectionOfLineWithSpace()’:

We must use parametric equations to work with lines in dimensions higher than 2. The parametric form for a line defined by the points p0 and p1 is:

p = p0 + t*v (1) where v = p1 - p0.

We need to calculate the value of t at the intersection with our cross-sectional space. (p0 and p1 are two vertices of one of the faces of our pentatope.) Substituting the parametric form (1) into the equation of our space, ax+by+cz+K = w, and solving for t gives:

t = (W0 - aX0 - bY0 - cZ0 - K) / (aXv + bYv + cZv - Wv)

where P0 = (X0, Y0, Z0, W0) and V = (Xv, Yv, Zv, Wv).

Plugging t into equation (1) above we calculate the point of intersection of our 4-d line with our 3-d space. We return (x,y,z) from ‘calculateIntersectionOfLineWithSpace()’ above, discarding w.
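
In JavaScript the whole calculation fits in one small function. This is a sketch of my own (the name matches the pseudo-code above; it is not necessarily the exact code in the linked demo):

```javascript
// Cross-sectional space: a*x + b*y + c*z + K = w, with constants a, b, c, K.
// p0 and p1 are 4-d points [x, y, z, w] -- two vertices of one face of the pentatope.
// Returns the 3-d point [x, y, z] where the line through p0 and p1 meets
// the space (w is discarded), or null if the line is parallel to the space.
function calculateIntersectionOfLineWithSpace(p0, p1, a, b, c, K) {
    // direction vector v = p1 - p0
    var v = p1.map(function (coord, i) { return coord - p0[i]; });
    var denom = a * v[0] + b * v[1] + c * v[2] - v[3];
    if (denom === 0) return null;   // line is parallel to the cross-sectional space
    var t = (p0[3] - a * p0[0] - b * p0[1] - c * p0[2] - K) / denom;
    return [p0[0] + t * v[0], p0[1] + t * v[1], p0[2] + t * v[2]];
}
```

For example, with a = b = c = 0 and K = 0 the space is simply w = 0, and the function returns where each edge crosses it.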

And we simply change K to see different cross-sections. Here is the animated result:

http://rwoodley.org/MyContent/WIP/30-Simplex/index.html

A still:

## Minecraft Menger Sponge – STEAM project.

What has zero volume and an infinite surface area? A Menger Sponge of course. If your child is a Minecraft fan like my 7 year old, then this simple fractal can be used to make a good afternoon STEAM project that teaches math and programming concepts. (Note you should already know how to modify Minecraft code. That would take more than an afternoon to get a grip on. And of course you should know Java.)

First we built a Level 1 Menger cube with snap cubes. Then we came up with the X,Y, and Z coordinates for each of the 20 blocks that make up the Level 1 cube. This was a major goal of our project: increased fluency with 3-d coordinates.

Then it was not a difficult leap to see how our Java code used these 20 coordinates to place blocks.

```java
public static void drawblock(World world, int x, int y, int z, int startx, int starty, int startz) {
    int metadata = world.getBlockMetadata(x, y, z);
    Block block = Block.getBlockById(35);
    boolean res = world.setBlock(startx + x, starty + y, startz + z + 5, block, metadata, 3);
    System.out.println("Placing block at " + x + "," + y + ", res = " + res);
}

public boolean onItemUse(ItemStack par1ItemStack, EntityPlayer par2EntityPlayer, World world,
        int startx, int starty, int startz, int par7, float par8, float par9, float par10) {
    Menger1Item.drawLevel1Cube(world, startx, starty, startz);
    return true;
}

public static void drawLevel1Cube(World world, int startx, int starty, int startz) {
    drawblock(world, 0, 0, 0, startx, starty, startz);
    drawblock(world, 1, 0, 0, startx, starty, startz);
    drawblock(world, 2, 0, 0, startx, starty, startz);
    drawblock(world, 0, 1, 0, startx, starty, startz);
    drawblock(world, 2, 1, 0, startx, starty, startz);
    drawblock(world, 0, 2, 0, startx, starty, startz);
    drawblock(world, 1, 2, 0, startx, starty, startz);
    drawblock(world, 2, 2, 0, startx, starty, startz);
    drawblock(world, 0, 0, 1, startx, starty, startz);
    drawblock(world, 0, 2, 1, startx, starty, startz);
    drawblock(world, 2, 0, 1, startx, starty, startz);
    drawblock(world, 2, 2, 1, startx, starty, startz);
    drawblock(world, 0, 0, 2, startx, starty, startz);
    drawblock(world, 1, 0, 2, startx, starty, startz);
    drawblock(world, 2, 0, 2, startx, starty, startz);
    drawblock(world, 0, 1, 2, startx, starty, startz);
    drawblock(world, 2, 1, 2, startx, starty, startz);
    drawblock(world, 0, 2, 2, startx, starty, startz);
    drawblock(world, 1, 2, 2, startx, starty, startz);
    drawblock(world, 2, 2, 2, startx, starty, startz);
}
```

Here is the Level 1 Cube in Minecraft:

At this point we wanted to make higher level cubes. A simple recursive algorithm was called for. This was too much for my son, not surprisingly. It is excerpted below if you want to do something similar. The main concept I tried to convey was that each level was a new power of 20.

Level 1 = 20^1 = 20. Length of side: 3 blocks.

Level 2 = 20^2 = 400. Length of side: 9 blocks.

Level 3 = 20^3 = 8000. Length of side: 27 blocks.

Level 4 = 20^4 = 160,000. Length of side: 81 blocks.

Level 5 = 20^5 = 3,200,000. Length of side 243 blocks.
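
These counts and side lengths are easy to check with a couple of lines of JavaScript (a throwaway sketch, not part of the mod):

```javascript
// A level-n Menger sponge keeps 20 of the 27 sub-cubes at each step:
// 20^n blocks, side length 3^n.
function mengerStats(level) {
    var blocks = Math.pow(20, level);
    var side = Math.pow(3, level);
    return { blocks: blocks, side: side, fillFraction: blocks / (side * side * side) };
}
```

The fill fraction works out to (20/27)^n, which heads toward zero as the level grows — the “zero volume” of the sponge.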

At this point you could point out how the filled fraction of the volume shrinks with each level, e.g. 3200000/243^3 < 400/9^3. Over time the volume will approach zero! Ah, the mysteries of fractals and limits.

Back to the concrete: We started with a level 2 cube. Exploring this next level up was important to understand recursion. My son kindly illuminated the structure with torches as night fell:

We skipped right to Level 4 which was beautiful and impressive. When night fell in our Minecraft world, the local wild-life (mobs) colonized our structure and served to give a nice sense of scale and depth.

Take a tour of this level 4 Menger Cube:

Level 4 Menger Cube from Robert Woodley on Vimeo.

Could Minecraft handle a Level 5 cube? That was the question on our minds. 3,200,000 blocks. It took about 5 minutes and the laptop labored mightily. But it worked! There were nice glitch effects as Minecraft struggled to build out the structure. Also the top was truncated as Minecraft prevented us from going into outer space.

This incredible fractal structure is best understood through video:

Level 5 Menger Cube in Minecraft from Robert Woodley on Vimeo.

Complete Java class to build a level 5 Menger Cube:

```java
package com.example.examplemod;

import net.minecraft.block.Block;
import net.minecraft.entity.player.EntityPlayer;
import net.minecraft.item.Item;
import net.minecraft.creativetab.CreativeTabs;
import net.minecraft.item.ItemStack;
import net.minecraft.world.World;

public class Menger1Item extends Item {
    public Menger1Item() {
        setMaxStackSize(64);
        setCreativeTab(CreativeTabs.tabMisc);
        setUnlocalizedName("MengaLevel1Cube");
    }

    public static void drawblock(World world, int x, int y, int z, int startx, int starty, int startz) {
        int metadata = world.getBlockMetadata(x, y, z);
        Block block = Block.getBlockById(35);
        boolean res = world.setBlock(startx + x, starty + y, startz + z + 5, block, metadata, 3);
        System.out.println("Placing block at " + x + "," + y + ", res = " + res);
    }

    public boolean onItemUse(ItemStack par1ItemStack, EntityPlayer par2EntityPlayer, World world,
            int startx, int starty, int startz, int par7, float par8, float par9, float par10) {
        Menger1Item.drawLevel1Cube(world, startx, starty, startz);
        return true;
    }

    public static void drawLevel1Cube(World world, int startx, int starty, int startz) {
        int[][] i5coords = returnCoords(81);
        for (int i5 = 0; i5 < 20; i5++) {
            int[][] i4coords = returnCoords(27);
            for (int i4 = 0; i4 < 20; i4++) {
                int[][] i3coords = returnCoords(9);
                for (int i3 = 0; i3 < 20; i3++) {
                    int[][] i2coords = returnCoords(3);
                    for (int i2 = 0; i2 < 20; i2++) {
                        int[][] coords = returnCoords(1);
                        for (int i1 = 0; i1 < 20; i1++) {
                            drawblock(world,
                                i5coords[i5][0] + i4coords[i4][0] + i3coords[i3][0] + i2coords[i2][0] + coords[i1][0],
                                i5coords[i5][1] + i4coords[i4][1] + i3coords[i3][1] + i2coords[i2][1] + coords[i1][1],
                                i5coords[i5][2] + i4coords[i4][2] + i3coords[i3][2] + i2coords[i2][2] + coords[i1][2],
                                startx, starty, startz);
                        }
                    }
                }
            }
        }
    }

    public static int[][] returnCoords(int level) {
        int[][] coords = {
            {0, 0, 0}, {1, 0, 0}, {2, 0, 0}, {0, 1, 0}, {2, 1, 0}, {0, 2, 0}, {1, 2, 0}, {2, 2, 0},
            {0, 0, 1}, {0, 2, 1}, {2, 0, 1}, {2, 2, 1},
            {0, 0, 2}, {1, 0, 2}, {2, 0, 2}, {0, 1, 2}, {2, 1, 2}, {0, 2, 2}, {1, 2, 2}, {2, 2, 2},
        };
        for (int i = 0; i < 20; i++) {
            for (int j = 0; j < 3; j++) coords[i][j] *= level;
        }
        return coords;
    }
}
```
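
For the curious, the same structure can be generated recursively, which is closer to the concept I was trying to convey to my son. A JavaScript sketch of my own (not the mod code, which is Java):

```javascript
// The 20 unit offsets of a level-1 Menger cube (same table as returnCoords).
var UNIT = [
    [0,0,0],[1,0,0],[2,0,0],[0,1,0],[2,1,0],[0,2,0],[1,2,0],[2,2,0],
    [0,0,1],[0,2,1],[2,0,1],[2,2,1],
    [0,0,2],[1,0,2],[2,0,2],[0,1,2],[2,1,2],[0,2,2],[1,2,2],[2,2,2]
];

// Returns the coordinates of every block in a level-n sponge.
// Each recursion step scales the 20 offsets by the side length of the sub-sponge.
function mengerCoords(level) {
    if (level === 0) return [[0, 0, 0]];
    var inner = mengerCoords(level - 1);
    var scale = Math.pow(3, level - 1);
    var result = [];
    UNIT.forEach(function (u) {
        inner.forEach(function (c) {
            result.push([u[0] * scale + c[0], u[1] * scale + c[1], u[2] * scale + c[2]]);
        });
    });
    return result;
}
```

mengerCoords(2) yields the 400 block positions of a level 2 cube, with coordinates running from 0 to 8 on each axis.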

## CVision: Computer Vision Workbench

CVision: a computer vision workbench and utilities written in WinForms that wraps OpenCV. I wrote it for my own purposes, so there is plenty to improve: https://github.com/rwoodley/CVision.

Features:

- Has all OpenCV color options and color maps.
- Histogram Equalization
- 4 Blurs
- Many morph modes: ERODE, DILATE, OPEN, CLOSE, GRADIENT, TOPHAT, BLACKHAT. You can specify the morph shape (RECT, CROSS, ELLIPSE) as well as the kernel size.
- Other transformations: NOT, SOBELX, SOBELY, LAPLACIAN, SCHARRX1, SCHARRY1, CANNY, THRESHOLD
- Ability to group operations into ‘recipes’.
- Boolean operations.
- Intelligent contours and rotation.
- Ability to process in batch mode across a directory full of images.
- All intermediate transformations are saved.

## The Visual Representation of High Dimension Spaces

Our brains struggle to visualize spaces with more than 3 dimensions. This is a problem we tried to address in our FaceCloud and FaceField projects. We present here a possible solution using fractional dimensions to represent higher dimensions. The examples here are all based on the Eigenfaces face recognition algorithm, where we were dealing with high dimension PCA spaces. But the visualization methodology is not specific to PCA.

The model we used for the Face Field project is a 60-dimensional ‘face space’. That is, we work with 60 Eigenvectors (or Eigenfaces) for all of our operations, be it classic face recognition, the anti-face calculation, or synthetic face generation.

We’ve repeatedly tried to create visualizations of this 60-dimensional space. Working with images makes this easier because one can see immediately what many of the Eigenvectors encode. Working with faces is even better because we’re so tuned to reading faces.

For instance the first Eigenface codes for both lighting and gender.

In this picture we see the ‘mean face’ or ‘average face’ in the center, and the results of shifting the first eigenface’s value (that is, its eigenvalue) from -1 to 1. Note that the face on the left is: female, white face on dark background. The face on the right is: male, dark face on light background.

The coordinates for the mean face are {0,0,0,0,0,…..,0} – zeroes for all 60 eigen values.

The coordinates for the face on the left are {-1,0,0,0,0,…..,0} – minus one for the first eigen value, 0 for the remaining 59.

The coordinates for the face on the right are {1,0,0,0,0,…..,0} – one for the first eigen value, 0 for the remaining 59.
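
In code, any face in this space is just the mean face plus a weighted sum of the Eigenfaces. A minimal JavaScript sketch (tiny arrays stand in for the 200×200-pixel images; the function name is my own):

```javascript
// meanFace and each entry of eigenfaces are flat arrays of pixel values.
// weights[i] is the i-th coordinate (eigenvalue) of the face in face space.
function synthesizeFace(meanFace, eigenfaces, weights) {
    return meanFace.map(function (pixel, p) {
        var sum = pixel;
        for (var i = 0; i < weights.length; i++) {
            sum += weights[i] * eigenfaces[i][p];
        }
        return sum;
    });
}
```

All-zero weights reproduce the mean face; {-1,0,0,…} and {1,0,0,…} give the two faces at the ends of the first axis.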

—-

Let’s take this a step further and look at the first 3 Eigenfaces. All of the faces below have Eigenvalues of 0 for all Eigenfaces beyond the 3rd one. The values for the first 3 are displayed on the diagram. We have just discussed the first Eigenface in our preceding paragraph. It is represented here by the axis that goes from the upper-left to the lower-right. The second Eigenface seems to encode for side lighting. The third Eigenface encodes for top-vs-bottom lighting, which also incidentally encodes for hair.

But wait, you say, where is the face corresponding to these eigenvalues: {1,1,1}. or {1,1,0} or {-1,0,1}? These are indeed not displayed. One could imagine a cube with the X, Y and Z axes being the first 3 eigen vectors and these other missing faces being the vertices and edges of the cube. So go imagine that because I haven’t got a graphic handy that displays that. Even if I did, what would we have accomplished? Simply displaying 3 dimensions of data in 3 dimensions. The topic of this article concerns High Dimensional Spaces.

We can take the ‘Face Sextant’ above a step further and use Adelheid Mers’ fractal 3-line matrix to browse a 6 dimensional face space:

I only typed in the coordinates for a few faces. Hopefully you can easily divine what the other values would be. Once again we are missing the edges and vertices of the fractal cube (i.e. the face at {1,1,1,1,1,1} is not on this diagram).

So this process could be repeated over and over again, adding 3 new dimensions to our visualization on each iteration. It is a lossless method of representing higher dimensions using smaller and smaller fractal spaces.

In terms of actually making a usable visualization, the faces would get smaller of course so we would need a pan and zoom capability. Here is a visualization using this approach:

Face Cloud from Robert Woodley on Vimeo.

It displays faces with non-zero eigenvalues for the first 12 eigenfaces. In other words it is a visualization of 12-dimensional space. Note also it is using a tetrahedral approach as opposed to a cubic approach. That is, the coordinates for the first 4 faces to emanate from the Mean Face are: {1,1,1}, {-1,-1,1}, {1,1,-1}, {-1,-1,-1}. This reduces the number of faces, thereby reducing the visual clutter and giving the graphics card a chance to keep up with the calculations.

That was a video. The original was written in javascript using three.js and can be run here using Chrome on a computer with a good graphics card.

—–

Separately, many of you have seen the Synthetic Face machine. It allows you to tweak all 60 eigen values to get any face you want from the face space, in real-time:

So it is not a way to visualize high dimension spaces, but it is useful for browsing those spaces and pulling faces at random from them. Indeed this was how we generated the faces on display at the “Enter The Matrix” show at the Chicago Cultural Center.

## Face Field Update

10 Synthetic Faces are now on display at the Chicago Cultural Center through August, as part of Adelheid Mers’ “Enter the Matrix” exhibit. They look great in large format and are quite powerful. Some photos below.

These are faces pulled at random from the 60-dimensional PCA space that we have been working with for some time now. You can create your own Synthetic Face at http://facefield.org/SynthFace.aspx. (Click on the face that appears for options.)

## Formula Toy

Back at the age of 17, thanks to the liberal access policies of Indiana University’s Wrubel Computing Center, I was able to write short little computer programs (on cards!) that would graph 3 dimensional surfaces using a pen plotter. The little nerd that was me was thrilled at the results, and my bedroom wall was plastered with graphs of various mathematical functions of my creation. And incidentally it was an extremely helpful way to get a visual understanding of mathematical geometries, including alternative coordinate systems – cartesian, spherical, and cylindrical.

These days, there is no shortage of packages that will draw 3-D mathematical surfaces. However, nothing that I found was totally simple. All required a learning curve, or a download or a plugin. I wanted to build something whereby you would simply type in a formula, hit enter, and see the surface. Hence was born Formula Toy. It uses the amazing three.js library, which is a wrapper around WebGL. The dependence on WebGL means that it won’t run on every single browser because WebGL is still an emerging standard. However it should work on most Macs and most Windows desktops, at least if you use a modern browser like Chrome.

You can either go directly to http://formulatoy.net or you can look at some examples and click on any that intrigue you, which will pop you straight into Formula Toy. There is some help text here.

## The Hairy Blob, 800 Ping Pong Balls, and a Mindstorms RoboCam

#### The Hairy Blob:

At the Hairy Blob exhibition at the Hyde Park Art Center last spring, visitors were invited to draw an image of time on a ping pong ball and toss it into a net that was suspended from the ceiling.

#### 800 Ping Pong Balls:

What to do with over 800 ping pong balls?

How to document 800 three dimensional objects in less than 5 years?

#### Mindstorms RoboCam:

Our ping pong cam was an NXT Mindstorms robot (which rotated the balls) driven by a laptop that was simultaneously taking pictures. Controlling the robot from an external device was surprisingly difficult. NHK.MindSqualls did the job, but only just.

Ping Pong Robo Cam and Laptop setup:

We did the scanning over Thanksgiving weekend at the Roger Brown house in New Buffalo, MI.

#### The installation:

Installed at the Mers Micro Museum, a Raspberry Pi drives the display. Some javascript randomly selects from the 800, and then starts a few of them spinning. First it shows a random batch of the daytime balls and then a random batch of the nighttime balls. And so on, indefinitely.

You can also spin the balls online.

## Anti-Face Model Specification and Calculation Details

**Update 10/2013**: We have implemented this in a free iPhone app which is available here:

https://itunes.apple.com/us/app/anti-face/id690376775

Our site FaceField.org (and now the iPhone app) uses the EigenFaces methodology as implemented in Open CV to calculate a special kind of face that we have labelled an ‘Anti Face’.

Model specification:

– Over 1000 faces were used.

– The faces were all facing the camera straight-on. We used a specially designed Haar classifier to ensure that we excluded faces looking to the side.

– The faces that we fed into the PCA calculation are slightly larger than what the Haar classifier detected so that we didn’t cut off the chins.

– The faces were all subject to histogram equalization.

– The faces were all sized to 200×200 pixels.

– We took the first 60 eigenvectors.

The Antiface calculation is simple: we do a subspace projection of the uploaded face into the 60-dimensional EigenFace space. It is interesting to view this ‘reconstructed face’ as we call it. In a normal face recognition calculation you would then compute the nearest face to this reconstructed face. The anti-face is simply the reconstructed face except that every weight is multiplied by -1.

In other words:

Let Ω ∈ R⁶⁰ be the subspace projection of the uploaded face.

Ω=(w₁,w₂,..w₆₀) where wᵢ= uᵢᵀ(Γ-Ψ) for i=1…60.

uᵢ is the i-th Eigenface (a vector with the same dimension as the face images). Γ is the uploaded face image and Ψ is the mean face.

Then the antiface, Ω’ is (-1*w₁,-1*w₂,..-1*w₆₀).

So prominent features in the reconstructed faces (with high weighting, meaning far from the mean) would be equally prominent, though opposite, in the anti-face.
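
The whole Anti-Face calculation can be sketched in a few lines of JavaScript (flattened pixel arrays stand in for the images; the function names are my own, not the Open CV API):

```javascript
// Flattened pixel arrays stand in for the 200x200 images.
function dot(a, b) {
    var s = 0;
    for (var i = 0; i < a.length; i++) s += a[i] * b[i];
    return s;
}

// w_i = u_i^T (Gamma - Psi): project the face, minus the mean face,
// onto each eigenface.
function projectionWeights(face, meanFace, eigenfaces) {
    var diff = face.map(function (g, i) { return g - meanFace[i]; });
    return eigenfaces.map(function (u) { return dot(u, diff); });
}

// The anti-face just negates every weight.
function antiFaceWeights(weights) {
    return weights.map(function (w) { return -w; });
}
```

Feeding the negated weights back through the reconstruction (mean face plus weighted sum of eigenfaces) yields the anti-face image.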

Why 60 dimensions? Originally we tried higher numbers like 200 because at that point the reconstructed face looks indistinguishable from the original face. However the anti-face then looks very little like a face and is highly distorted and muddied. It seems that in such a high-dimension space not everything looks like a face; in a lower dimension space like 60, however, you’re likely to get a face no matter where you land.

Two Face/Anti-Face pairs as examples:

## Visualizing Factors and Prime Factorization.

**Update 8/2014:** The calculator discussed below is also available in the Chrome store here (for free of course).

I’ve previously discussed Brent Yorgey’s factor diagrams. As the father of a 6 year old, I’ve found they are a great way to introduce the concepts of primes and factorization.

Since then, I dabbled with the javascript animations by Sean Seefried to create 2 related products:

1. a calculator, and

2. a factorization game.

**Factor Diagram Calculator**

The calculator does multiplication and division and allows the young ‘uns to explore the diagrams. I also recently added the ability to do exponentiation after watching Mike Lawler’s video on powers of 3 and Sierpinski’s triangle.

Multiplication:

Division:

Calculator is here.

**Factorization Game**

Kids learn by playing, that is well known. So how to make a game out of all of this? I scripted up something simple whereby you’re presented with a large number and have to factor it while the clock is ticking. Do this a few times, get a score. Then compare with friends, collect badges, etc. That last bit (prizes, badges) is not written and is a whole separate app of course.

Game is here.

So as it now stands it is simple, a germ of an idea really. Any thoughts on how to improve the learning experience?

## Factor Dominoes

Many have discovered Brent Yorgey’s very cool factorization diagrams. They seem like a great way to teach multiplication and factorization to children.

We also were excited by the domino game suggested by Malke Rosenfeld on her blog and decided to take her ideas and see if we could create a viable game that would also be a fun way to think about factors and primes. To this end we brought together a math geek, a visual artist, and a 6 year old ninja and started playing. Through trial and error we arrived at a set of rules.

### Simple version of the rules:

Start by printing out a deck of cards. (We have attached a pdf to this post with numbers up to 24 that can be printed on card stock and cut up.)

Each player gets 6 cards. One card is turned face up in the middle. The first player tries to match the turned up card, following the match rules below. After that, players can choose to match the card at either end of the growing chain. If you can’t match then you draw another card. First person to run out of cards wins.

The whole game comes down to the matching rules:

– The number 1 matches anything

– The number 2 matches any even number

– Primes match primes (or as you can explain to a child: circles match circles)

– Other numbers have a major and minor group. For instance 9 has a major group of 3 and a minor group of 3. 15 has a major group of 5 and a minor group of 3. 18 has a major group of 3 and a minor group of 6.

You can match based on major or minor group. If the card you want to play has the same major group or same minor group as the card to be matched, then you can play it. So 10 can match 15 (major group of 5 matches), and 9 can match 15 (minor group of 3 matches). A prime number can match either the minor or the major group, thus 5 can match 10 (major group), but 3 can match 15 (minor group).
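
The matching rules can be sketched as a predicate in JavaScript, assuming (consistently with the examples above) that a number’s major group is its largest prime factor and its minor group is the number divided by that factor:

```javascript
// Largest prime factor of n -- the "major group" in the diagrams.
function largestPrimeFactor(n) {
    var f = 1;
    for (var p = 2; p <= n; p++) {
        while (n % p === 0) { f = p; n /= p; }
    }
    return f;
}

function isPrime(n) { return n > 1 && largestPrimeFactor(n) === n; }

// The matching rules from the text.
function canMatch(a, b) {
    if (a === 1 || b === 1) return true;           // 1 matches anything
    if (a === 2) return b % 2 === 0;               // 2 matches any even number
    if (b === 2) return a % 2 === 0;
    if (isPrime(a) && isPrime(b)) return true;     // circles match circles
    var majA = largestPrimeFactor(a), minA = a / majA;
    var majB = largestPrimeFactor(b), minB = b / majB;
    // a prime can match either group; composites match on major or minor
    if (isPrime(a)) return a === majB || a === minB;
    if (isPrime(b)) return b === majA || b === minA;
    return majA === majB || minA === minB;
}
```

So canMatch(10, 15) and canMatch(9, 15) are true, as in the examples, while canMatch(4, 9) is false.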

### Beyond the simple rules:

We started to think about a rule set that might work if you had numbers higher than 24.

Types of number visualizations:

Primes are represented as circles.

Here are the types of Minor Groups:

Simple Minor groups are 1 and the low primes:

1, 2, 3, 5, 7

Compounded minor groups:

4 (2×2), 6 (3×2)

Doubly compounded minor groups:

8 (2x2x2), 9 (3x3), 16 (4x4), 18 (2x3x3)

Triple compounded minor groups:

24 (2x2x2x3), etc.

Major groups

A major group is a combination of N copies of one of these minor or compounded groups.

So here are the generalized matching rules:

– 1 matches anything

– 2 matches any even number

– Minor group types can be matched with each other within types, but not across.

– Major groups can be matched if they have an equal number of elements. For that, minor group types do not have to match.

### Attachment:

Here is the pdf you can print out on card stock to make your own set of cards:

## Constrain Face Detection for Better Face Recognition

Getting good results with PCA-based facial recognition algorithms depends on correcting for differences in lighting and alignment between the faces. Widely used techniques for correcting for lighting include using histogram equalization or discarding the first eigen vector. Techniques for correcting for alignment differences often involve locating facial features such as the eyes and then rotating the face.

I found these techniques problematic. The corrections for lighting can indeed reduce the impact of overall illumination effects but don’t work for side-lighting scenarios. Face alignment methods are complicated and error-prone. In the case of eye alignment one would presumably use additional haar cascades to locate the right eye and the left eye, which in turn are error-prone and repeat many of the correction problems we have with faces.

It seems that face recognition problems stem from an overly permissive face detection algorithm. The haar cascade that comes with Open CV (called ‘frontalface_alt2’) is indeed very good at detecting faces, including rotated faces. It seems it was trained with a sample set that included rotated faces and faces in all different illumination conditions.

Thus constraining face detection so that it only detects horizontal well lit faces would make face recognition much more accurate. That was my hypothesis at any rate and I decided to give it a try.

To this end I used the Open CV tools to build a training set that included only horizontal, full frontal and well lit faces. In the negative sample set I included faces rotated along the Z axis. A better negative set would have included faces rotated along the X and Y axes as well, but I didn’t do that. (By Z axis I mean an axis running vertically through the face).

The resultant cascade is available on https://github.com/rwoodley/iOS-projects/blob/master/FaceDetectionAlgoCompare/OpenCVTest/haarcascade_constrainedFrontalFace.xml. There is a sample iOS application there as well that allows you to compare this cascade with frontalface_alt2 as well as one other cascade.

I was quite pleased with the results. The detection only works on well lit, well aligned faces. This puts the onus on the user to submit a good face. I realize this scenario won’t work with everyone’s system requirements, but in my case it was quite useful.

See the sample output below from my test iPhone application. I aim the camera at a photo of Joe Biden. Above his face are 3 smaller faces in grey-scale. These represent the faces detected by an LBP cascade, the Open CV alt2 cascade and my constrained cascade respectively.

Now see the sample below where Joe Biden’s photo is tilted. The first 2 cascades still find his face, but the 3rd one (the constrained one) does not.