Jul 18

Euclidean space geometry: scalar product, norm and distance

Euclidean space geometry: scalar product, norm and distance

Learning this material has spillover effects for Stats because everything in this section has analogs for means, variances and covariances.

Scalar product

Definition 1. The scalar product of two vectors x,y\in R^n is defined by x\cdot y=\sum_{i=1}^nx_iy_i. The motivation has been provided earlier.

Remark. If matrix notation is of essence and x,y are written as column vectors, we have x\cdot y=x^Ty. The first notation is better when we want to emphasize symmetry x\cdot y=y\cdot x.

Linearity. The scalar product is linear in the first argument when the second argument is fixed: for any vectors x,y,z and numbers a,b one has

(1) (ax+by)\cdot z=a(x\cdot z)+b(y\cdot z).

Proof. (ax+by)\cdot z=\sum_{i=1}^n(ax_i+by_i)z_i=\sum_{i=1}^n(ax_iz_i+by_iz_i)

=a\sum_{i=1}^nx_iz_i+b\sum_{i=1}^ny_iz_i=ax\cdot z+by\cdot z.

Special cases. 1) Homogeneity: by setting b=0 we get (ax)\cdot z=a(x\cdot  z). 2) Additivity: by setting a=b=1 we get (x+y)\cdot z=x\cdot z+y\cdot  z.

Exercise 1. Formulate and prove the corresponding properties of the scalar product with respect to the second argument.

Definition 2. The vectors x,y are called orthogonal if x\cdot y=0.

Exercise 2. 1) The zero vector is orthogonal to any other vector. 2) If x,y are orthogonal, then any vectors proportional to them are also orthogonal. 3) The unit vectors in R^n are defined by e_i=(0,...,1,...,0) (the unit is in the ith place, all other components are zeros), i=1,...,n. Check that they are pairwise orthogonal.


Exercise 3. On the plane find the distance between a point x and the origin.

Figure 1. Pythagoras theorem

Figure 1. Pythagoras theorem

Once I introduce the notation on a graph (Figure 1), everybody easily finds the distance to be \text{dist}(0,x)=\sqrt{x_1^2+x_2^2} using the Pythagoras theorem. Equally easily, almost everybody fails to connect this simple fact with the ensuing generalizations.

Definition 3. The norm in R^n is defined by \left\Vert x\right\Vert=\sqrt{\sum_{i=1}^nx_i^2}. It is interpreted as the distance from point x to the origin and also the length of the vector x.

Exercise 4. 1) Can the norm be negative? We know that, in general, there are two square roots of a positive number: one is positive and the other is negative. The positive one is called an arithmetic square root. Here we are using the arithmetic square root.

2) Using the norm can you define the distance between points x,y\in R^n?

3) The relationship between the norm and scalar product:

(2) \left\Vert x\right\Vert =\sqrt{x\cdot x}.

True or wrong?

4) Later on we'll prove that \Vert x+y\Vert\leq\Vert x\Vert+\Vert{ y}\Vert . Explain why this is called a triangle inequality. For this, you need to recall the parallelogram rule.

5) How much is \left\Vert 0\right\Vert ? If \left\Vert x\right\Vert =0, what can you say about x?

Norm of a linear combination. For any vectors x,y and numbers a,b one has

(3) \left\Vert ax+by\right\Vert^2=a^2\left\Vert x\right\Vert^2+2ab(x\cdot y)+b^2\left\Vert y\right\Vert^2.

Proof. From (2) we have

\left\Vert ax+by\right\Vert^2=\left(ax+by\right)\cdot\left(ax+by\right)     (using linearity in the first argument)

=ax\cdot\left(ax+by\right)+by\cdot\left(ax+by\right)         (using linearity in the second argument)

=a^2x\cdot x+abx\cdot y+bay\cdot x+b^2y\cdot y (applying symmetry of the scalar product and (2))

=a^2\left\Vert x\right\Vert^2+2ab(x\cdot y)+b^2\left\Vert y\right\Vert^2.

Pythagoras theorem. If x,y are orthogonal, then \left\Vert x+y\right\Vert^2=\left\Vert x\right\Vert^2+\left\Vert y\right\Vert^2.

This is immediate from (3).

Norm homogeneity. Review the definition of the absolute value and the equation |a|=\sqrt{a^2}. The norm is homogeneous of degree 1:

\left\Vert ax\right\Vert=\sqrt{(ax)\cdot (ax)}=\sqrt{{a^2x\cdot x}}=|a|\left\Vert x\right\Vert.

Nov 16

Statistical measures and their geometric roots

Variance, covariancestandard deviation and correlation: their definitions and properties are deeply rooted in the Euclidean geometry.

Here is the why: analogy with Euclidean geometry

euclidEuclid axiomatically described the space we live in. What we have known about the geometry of this space since the ancient times has never failed us. Therefore, statistical definitions based on the Euclidean geometry are sure to work.

   1. Analogy between scalar product and covariance

Geometry. See Table 2 here for operations with vectors. The scalar product of two vectors X=(X_1,...,X_n),\ Y=(Y_1,...,Y_n) is defined by

(X,Y)=\sum X_iY_i.

Statistical analog: Covariance of two random variables is defined by


Both the scalar product and covariance are linear in one argument when the other argument is fixed.

   2. Analogy between orthogonality and uncorrelatedness

Geometry. Two vectors X,Y are called orthogonal (or perpendicular) if

(1) (X,Y)=\sum X_iY_i=0.

Exercise. How do you draw on the plane the vectors X=(1,0),\ Y=(0,1)? Check that they are orthogonal.

Statistical analog: Two random variables are called uncorrelated if Cov(X,Y)=0.

   3. Measuring lengths


Figure 1. Length of a vector

Geometry: the length of a vector X=(X_1,...,X_n) is \sqrt{\sum X_i^2}, see Figure 1.

Statistical analog: the standard deviation of a random variable X is


This explains the square root in the definition of the standard deviation.

   4. Cauchy-Schwarz inequality

Geometry|(X,Y)|\le\sqrt{\sum X_i^2}\sqrt{\sum Y_i^2}.

Statistical analog|Cov(X,Y)|\le\sigma(X)\sigma(Y). See the proof here. The proof of its geometric counterpart is similar.

   5. Triangle inequality


Figure 2. Triangle inequality

Geometry\sqrt{\sum (X_i+Y_i)^2}\le\sqrt{\sum X_i^2}+\sqrt{\sum X_i^2}, see Figure 2 where the length of X+Y does not exceed the sum of lengths of X and Y.

Statistical analog: using the Cauchy-Schwarz inequality we have





   4. The Pythagorean theorem

Geometry: In a right triangle, the squared hypotenuse is equal to the sum of the squares of the two legs. The illustration is similar to Figure 2, except that the angle between X and Y should be right.

Proof. Taking two orthogonal vectors X,Y as legs, we have

Squared hypotenuse = \sum(X_i+Y_i)^2

(squaring out and using orthogonality (1))

=\sum X_i^2+2\sum X_iY_i+\sum Y_i^2=\sum X_i^2+\sum Y_i^2 = Sum of squared legs

Statistical analog: If two random variables are uncorrelated, then variance of their sum is a sum of variances Var(X+Y)=Var(X)+Var(Y).

   5. The most important analogy: measuring angles

Geometry: the cosine of the angle between two vectors X,Y is defined by

Cosine between X,Y = \frac{\sum X_iY_i}{\sqrt{\sum X_i^2\sum Y_i^2}}.

Statistical analog: the correlation coefficient between two random variables is defined by


This intuitively explains why the correlation coefficient takes values between -1 and +1.

Remark. My colleague Alisher Aldashev noticed that the correlation coefficient is the cosine of the angle between the deviations X-EX and Y-EY and not between X,Y themselves.