Understanding Matrices | Part 1: Matrix-Vector Multiplication

Matrices are fundamental objects in numerous fields of modern computer science and mathematics, including but not limited to linear algebra, machine learning, and computer graphics.

In the current series of 4 stories, I will present a way of interpreting algebraic matrices so that the physical meaning of various matrix analysis formulas becomes clearer. For example, the formula for multiplying 2 matrices:

\begin{equation}
c_{i,j} = \sum_{k=1}^{p} a_{i,k}*b_{k,j}
\end{equation}

or the formula for inverting a product of matrices:

\begin{equation}
(ABC)^{-1} = C^{-1}B^{-1}A^{-1}
\end{equation}

Probably for most of us, when we were learning matrix-related definitions and formulas for the first time, questions like the following ones arose:

– what does a matrix actually represent,
– what is the physical meaning of multiplying a matrix by a vector,
– why is the multiplication of two matrices performed by such a non-standard formula,
– why, for multiplication, must the number of columns of the first matrix be equal to the number of rows of the second,
– what is the meaning of transposing a matrix,
– why, for certain types of matrices, inversion equals transposition,
– … and so on.

In this series, I plan to present a way of answering most of the listed questions. So let’s dive in!

But before starting, here are a couple of notation rules that I use throughout this series:

– matrices are denoted by uppercase letters (like A, B), while vectors and scalars are denoted by lowercase letters (like x, y or m, n),
– a_i,j is the value at the i-th row and j-th column of matrix ‘A’,
– x_i is the i-th value of vector ‘x’.

Multiplication of a matrix by a vector

Let’s put aside for now the simplest operations on matrices, which are addition and subtraction. The next simplest manipulation is probably the multiplication of a matrix by a vector:

\begin{equation}
y = Ax
\end{equation}

We know that the result of such an operation is another vector ‘y’, which has a length equal to the number of rows of ‘A’, while the length of ‘x’ must be equal to the number of columns of ‘A’.

Let’s consider “n*n” square matrices for now (those with equal numbers of rows and columns). We will observe the behavior of rectangular matrices a bit later.

The formula for calculating y_i is:

\begin{equation}
y_i = \sum_{j=1}^{n} a_{i,j}*x_j
\end{equation}

… which, if written in the expanded form, is:

\begin{equation}
\begin{cases}
y_1 = a_{1,1}x_1 + a_{1,2}x_2 + \dots + a_{1,n}x_n \\
y_2 = a_{2,1}x_1 + a_{2,2}x_2 + \dots + a_{2,n}x_n \\
\;\;\;\;\; \vdots \\
y_n = a_{n,1}x_1 + a_{n,2}x_2 + \dots + a_{n,n}x_n
\end{cases}
\end{equation}

Such expanded notation clearly shows that every cell a_i,j is present in the system of equations only once. More precisely, a_i,j is present as the factor of x_j, and participates only in the sum of y_i. This leads us to the following interpretation:

In the product of a matrix by a vector “y = Ax”, a certain cell a_i,j describes how much the output value y_i is affected by the input value x_j.
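This interpretation can be checked numerically. Below is a minimal sketch in plain Python; the helper name `matvec` and the concrete values of ‘A’ and ‘x’ are my own assumptions, chosen only for illustration. It increases a single input value x_3 by 1 and observes that every output y_i changes by exactly a_i,3.

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x: y_i = sum over j of a_ij * x_j."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("length of x must equal the number of columns of A")
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


# Hypothetical 3x3 matrix and input vector (illustrative values only).
A = [[2, 5, 7],
     [9, 4, 6],
     [3, 1, 8]]
x = [1, 2, 3]

y = matvec(A, x)

# Increase only x_3 by 1 and look at how much every output value moves.
x_bumped = [1, 2, 4]
y_bumped = matvec(A, x_bumped)
delta = [after - before for before, after in zip(y, y_bumped)]

print(y)      # [33, 35, 29]
print(delta)  # [7, 6, 8], which is the 3rd column of A
```

The change of y_i equals a_i,3 for every i, which is exactly the "influence of x_j on y_i" reading of the cell a_i,j.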

Having said that, we can draw the matrix geometrically, in the following way:

Geometrical interpretation of a 3×3 matrix “A” (written on the left). The right stack (purple items) corresponds to the inputs of the matrix, which are the values of vector ‘x’. The left stack (green items) corresponds to the outputs of the matrix, which are the values of vector ‘y’. Every arrow starting at ‘x_j’ and ending at ‘y_i’ corresponds to a certain cell “a_i,j”.

And as we are going to interpret matrix ‘A’ as influences of values x_j on values y_i, it is reasonable to attach the values of ‘x’ to the right stack, which will result in the values of ‘y’ being present on the left stack.

*Placing values of an input vector “x = (x₁, x₂, x₃)” on the right stack clearly shows how the values of the output vector “y = (y₁, y₂, y₃)” are obtained on the left stack.*

I would like to call this interpretation of matrices the “X-way interpretation”, as the placement of the presented arrows looks like the English letter “X”. And for a certain matrix ‘A’, I would like to call such a drawing the “X-diagram” of ‘A’.

Such interpretation clearly shows that the input vector ‘x’ goes through some kind of transformation, from right to left, and becomes vector ‘y’. This is the reason why in linear algebra, matrices are also called “transformation matrices”.

If we look at any k-th item of the left stack, we can see how all the values of ‘x’ are being accumulated towards it, while being multiplied by coefficients a_k,j (which form the k-th row of the matrix).

The accumulation of all input values (x₁, x₂, x₃) towards the output value ‘y₂’ is highlighted with red arrows. The input values are multiplied by coefficients (9, 4, 6) respectively, which are the 2nd row of matrix ‘A’.

At the same time, if we look at any k-th item of the right stack, we can see how the value x_k is being distributed over all values of ‘y’, while being multiplied by coefficients a_i,k (which now form the k-th column of the matrix).

The distribution of the input value ‘x₃’ towards all output values (y₁, y₂, y₃) is highlighted with red arrows. The input value ‘x₃’ is multiplied by coefficients (7, 6, 8) respectively, which are the 3rd column of matrix ‘A’.

This already gives us another insight: when interpreting a matrix in the X-way, the left stack can be associated with the rows of the matrix, while the right stack can be associated with its columns.
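The two readings of the X-diagram can be sketched as two different loop orders that produce the same vector ‘y’. In the snippet below, the 2nd row (9, 4, 6) and the 3rd column (7, 6, 8) of ‘A’ follow the captions above; the remaining cells and the input vector are made-up values, used only for illustration.

```python
# 2nd row and 3rd column follow the captions; the other cells are assumed.
A = [[2, 5, 7],
     [9, 4, 6],
     [3, 1, 8]]
x = [1, 2, 3]
n = len(x)

# Row view (left stack): every output y_i accumulates all inputs,
# weighted by the coefficients of the i-th row.
y_by_rows = [sum(A[i][j] * x[j] for j in range(n)) for i in range(n)]

# Column view (right stack): every input x_j is distributed over all outputs,
# weighted by the coefficients of the j-th column.
y_by_columns = [0] * n
for j in range(n):
    for i in range(n):
        y_by_columns[i] += A[i][j] * x[j]

print(y_by_rows)     # [33, 35, 29]
print(y_by_columns)  # [33, 35, 29]
```

Both loops visit the same 9 arrows of the X-diagram, only grouped differently: by their destination in the row view, and by their origin in the column view.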

Certainly, if we are interested in locating some value a_i,j, looking at the X-diagram is not as convenient as looking at the matrix in its ordinary form, as a rectangular table of numbers. But, as we will see later and in the next stories of this series, the X-way interpretation explicitly presents the meaning of various algebraic operations over matrices.

Rectangular matrices

Multiplication of the form “y = Ax” is allowed only if the length of vector ‘x’ is equal to the number of columns of matrix ‘A’. At the same time, the result vector ‘y’ will have a length equal to the number of rows of ‘A’. So, if ‘A’ is a rectangular matrix, vector ‘x’ will change its length while passing through the transformation. We can observe it by looking at the X-way interpretation:

X-way interpretation of a 3*4 matrix ‘A’. We see that its left stack has a height of 3 (the count of rows of ‘A’), while its right stack has a height of 4 (the count of columns of ‘A’). There are 3*4=12 arrows overall, each corresponding to a single cell a_i,j.

Now it is clear why we can multiply ‘A’ only by such a vector ‘x’, the length of which is equal to the number of columns of ‘A’: otherwise the vector ‘x’ will simply not fit on the right side of the X-diagram.

Similarly, it is clear why the length of the result vector “y = Ax” is equal to the number of rows of ‘A’.

Viewing rectangular matrices in the X-way strengthens the insight we made previously: items of the left stack of the X-diagram correspond to rows of the illustrated matrix, while items of its right stack correspond to columns.
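The length rules for rectangular matrices can be sketched in a few lines of plain Python. The helper `matvec` and the 3*4 matrix below are hypothetical, introduced only for illustration: a vector of length 4 fits the right stack and produces 3 outputs, while a vector of length 3 is rejected.

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x, checking that x fits the right stack."""
    if any(len(row) != len(x) for row in A):
        raise ValueError("length of x must equal the number of columns of A")
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


# Hypothetical 3*4 matrix: 3 rows (left stack), 4 columns (right stack).
A = [[1, 0, 2, 0],
     [0, 3, 0, 1],
     [4, 0, 0, 5]]

y = matvec(A, [1, 2, 3, 4])
print(y)  # [7, 10, 24], so 4 inputs became 3 outputs

# A vector of length 3 does not fit the right stack of this X-diagram.
try:
    matvec(A, [1, 2, 3])
    fits = True
except ValueError:
    fits = False
print(fits)  # False
```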

Observing several special matrices in X-way interpretation

Let’s see how the X-way interpretation helps us understand the behavior of certain special matrices:

Scale / diagonal matrix

A scale matrix is a square matrix that has all cells of its main diagonal equal to some value ‘s’, while all other cells are equal to 0. Multiplying a vector “x” by such a matrix results in every value of “x” being multiplied by ‘s’:

\begin{equation*}
\begin{pmatrix}
y_1 \\ y_2 \\ \vdots \\ y_{n-1} \\ y_n
\end{pmatrix}
=
\begin{bmatrix}
s & 0 & \dots & 0 & 0 \\
0 & s & \dots & 0 & 0 \\
& & \vdots & & \\
0 & 0 & \dots & s & 0 \\
0 & 0 & \dots & 0 & s
\end{bmatrix}
*
\begin{pmatrix}
x_1 \\ x_2 \\ \vdots \\ x_{n-1} \\ x_n
\end{pmatrix}
=
\begin{pmatrix}
s x_1 \\ s x_2 \\ \vdots \\ s x_{n-1} \\ s x_n
\end{pmatrix}
\end{equation*}

The X-way interpretation of a scale matrix shows its physical meaning. As the only non-zero cells here are the ones on the diagonal (a_i,i), the X-diagram will have arrows only between corresponding pairs of input and output values, which are x_i and y_i.

X-diagram of a scale matrix. Every output value ‘y_i’ is affected only by the input value ‘x_i’, which is why all the arrows in the diagram are strictly horizontal. The scale matrix multiplies the values of input vector “x” by ‘s’, which is why the coefficients near all arrows are equal to ‘s’.

A special case of a scale matrix is the identity matrix, often denoted with the letters “E” or “I” (we will use “E” in the current writing). It is a scale matrix with the parameter “s=1”.

\begin{equation*}
\begin{pmatrix}
y_1 \\ y_2 \\ \vdots \\ y_{n-1} \\ y_n
\end{pmatrix}
=
\begin{bmatrix}
1 & 0 & \dots & 0 & 0 \\
0 & 1 & \dots & 0 & 0 \\
& & \vdots & & \\
0 & 0 & \dots & 1 & 0 \\
0 & 0 & \dots & 0 & 1
\end{bmatrix}
*
\begin{pmatrix}
x_1 \\ x_2 \\ \vdots \\ x_{n-1} \\ x_n
\end{pmatrix}
=
\begin{pmatrix}
x_1 \\ x_2 \\ \vdots \\ x_{n-1} \\ x_n
\end{pmatrix}
\end{equation*}

*Identity matrix “E” is a scale matrix with the value “s=1” on the main diagonal.*

We see that doing the multiplication “y = Ex” will just leave the vector ‘x’ unchanged, as every value x_i is simply multiplied by 1.
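As a small illustration of the scale and identity matrices, here is a sketch in plain Python. The helper names `matvec` and `scale_matrix`, as well as the sample values, are assumptions made for this example. It also counts the non-zero cells, which correspond to the arrows of the X-diagram.

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def scale_matrix(n, s):
    """Build an n*n matrix with 's' on the main diagonal and 0 everywhere else."""
    return [[s if i == j else 0 for j in range(n)] for i in range(n)]


x = [1, 2, 3]

S = scale_matrix(3, 5)
scaled = matvec(S, x)
print(scaled)  # [5, 10, 15], every value of x multiplied by s=5

E = scale_matrix(3, 1)  # the identity matrix is the case s=1
unchanged = matvec(E, x)
print(unchanged)  # [1, 2, 3]

# Only the n diagonal cells are non-zero, so the X-diagram has n horizontal arrows.
arrows = sum(1 for row in S for cell in row if cell != 0)
print(arrows)  # 3
```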

90° rotation matrix

A matrix that rotates a given point (x₁, x₂) around the zero-point (0,0) by 90 degrees counter-clockwise has a simple form:

\begin{equation*}
\begin{pmatrix}
y_1 \\ y_2
\end{pmatrix}
=
\begin{bmatrix}
0 & -1 \\
1 & \phantom{-}0
\end{bmatrix}
*
\begin{pmatrix}
x_1 \\ x_2
\end{pmatrix}
=
\begin{pmatrix}
-x_2 \\ \phantom{-}x_1
\end{pmatrix}
\end{equation*}

*Counter-clockwise rotation on a plane. We see that if the original (red) point has coordinates (x1, x2), then the rotated (blue) point’s coordinates are (y1, y2) = (-x2, x1).*

The X-way interpretation of the 90° rotation matrix shows that behavior:

*The X-way interpretation of the 90° rotation matrix shows the exchange between the x₁ and x₂ coordinates on the plane.*
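The rotation can be verified with a short sketch in plain Python. The helper `matvec` and the sample point (3, 1) are assumptions chosen for illustration. Applying the matrix once swaps the coordinates and negates the first one; applying it four times makes a full turn and returns the original point.

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


# 90-degree counter-clockwise rotation matrix from the formula above.
R = [[0, -1],
     [1,  0]]

point = [3, 1]  # hypothetical point (x1, x2)
rotated = matvec(R, point)
print(rotated)  # [-1, 3], which is (-x2, x1)

# Four consecutive rotations by 90 degrees give 360 degrees in total.
full_turn = point
for _ in range(4):
    full_turn = matvec(R, full_turn)
print(full_turn)  # [3, 1]
```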

Exchange matrix

An exchange matrix ‘J’ is a matrix that has 1s on its anti-diagonal, and 0s at all other cells. Multiplying it by a vector ‘x’ results in reversing the order of the values of ‘x’:

\begin{equation*}
\begin{pmatrix}
y_1 \\ y_2 \\ \vdots \\ y_{n-1} \\ y_n
\end{pmatrix}
=
\begin{bmatrix}
0 & 0 & \dots & 0 & 1 \\
0 & 0 & \dots & 1 & 0 \\
& & \vdots & & \\
0 & 1 & \dots & 0 & 0 \\
1 & 0 & \dots & 0 & 0
\end{bmatrix}
*
\begin{pmatrix}
x_1 \\ x_2 \\ \vdots \\ x_{n-1} \\ x_n
\end{pmatrix}
=
\begin{pmatrix}
x_n \\ x_{n-1} \\ \vdots \\ x_2 \\ x_1
\end{pmatrix}
\end{equation*}

This fact is explicitly shown in the X-way interpretation of the exchange matrix ‘J’:

X-way interpretation of ‘J’ shows that the i-th from the top value of input vector “x” goes only to the i-th from the bottom value of output vector “y”. The coefficients of those arrows are always 1. That is why vector “y” becomes the reverse of sequence “x”.

The 1s reside only on the anti-diagonal here, which means that output value y₁ is affected only by input value x_n, then y₂ is affected only by x_n-1, and so on, with y_n being affected only by x₁. This is visible on the X-diagram of the exchange matrix ‘J’.
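The reversing behavior can be sketched in plain Python as follows. The helper names `matvec` and `exchange_matrix` and the sample vector are my own assumptions for this illustration. Since every value travels to the mirrored position, applying ‘J’ twice restores the original order.

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def exchange_matrix(n):
    """Build an n*n matrix with 1s on the anti-diagonal and 0s elsewhere."""
    return [[1 if j == n - 1 - i else 0 for j in range(n)] for i in range(n)]


J = exchange_matrix(4)
x = [10, 20, 30, 40]

y = matvec(J, x)
print(y)  # [40, 30, 20, 10]

# Reversing twice brings every value back to its original position.
x_again = matvec(J, y)
print(x_again)  # [10, 20, 30, 40]
```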

Shift matrix

A shift matrix is a matrix that has 1s on some diagonal parallel to the main diagonal, and 0s at all remaining cells:

\begin{equation*}
\begin{pmatrix}
y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5
\end{pmatrix}
=
\begin{bmatrix}
0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
*
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
=
\begin{pmatrix}
x_2 \\ x_3 \\ x_4 \\ x_5 \\ 0
\end{pmatrix}
\end{equation*}

Multiplying such a matrix by a vector “x” results in the same vector but with all values shifted by ‘k’ positions up or down, and with the vacated positions filled with 0s. ‘k’ is equal to the distance between the diagonal with 1s and the main diagonal. In the presented example, we have “k=1” (the diagonal with 1s is just one position above the main diagonal). If the diagonal with 1s is in the upper-right triangle, as it is in the presented example, then the shift of the values of “x” is performed upwards. Otherwise, the shift of values is performed downwards.

A shift matrix can also be illustrated explicitly in the X-way:

The X-diagram of a shift matrix shows that every input value “x_i” is just being transferred to the output value “y_i-k”, where ‘k’ is the distance between the diagonal with 1s and the main diagonal. This results in the values of the input vector “x” being shifted by ‘k’ positions. Here we have “k=1”.
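Both shift directions can be sketched in plain Python. The helper names `matvec` and `shift_matrix` are hypothetical, and using a negative ‘k’ for a diagonal below the main one is a convention I assume here only for illustration.

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def shift_matrix(n, k):
    """Build an n*n matrix with 1s on the diagonal lying k positions above
    the main diagonal (below it when k is negative) and 0s elsewhere."""
    return [[1 if j - i == k else 0 for j in range(n)] for i in range(n)]


x = [1, 2, 3, 4, 5]

up = matvec(shift_matrix(5, 1), x)
print(up)  # [2, 3, 4, 5, 0], values moved one position up

down = matvec(shift_matrix(5, -1), x)
print(down)  # [0, 1, 2, 3, 4], values moved one position down
```

In both cases one input value has no outgoing arrow (its column contains only 0s), so it is lost, and one output value has no incoming arrow, so it becomes 0.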

Permutation matrix

A permutation matrix is a matrix composed of 0s and 1s, which rearranges all values of the input vector “x” in a certain way. The effect is that when multiplied by such a matrix, the values of “x” are permuted.

To achieve that, the n*n-sized permutation matrix ‘P’ must have ‘n’ 1s, while all other cells must be 0. Also, no two 1s may appear in the same row or the same column. An example of a permutation matrix is:

\begin{equation*}
\begin{pmatrix}
y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5
\end{pmatrix}
=
\begin{bmatrix}
0 & 0 & 0 & 1 & 0 \\
1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0
\end{bmatrix}
*
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4 \\ x_5
\end{pmatrix}
=
\begin{pmatrix}
x_4 \\ x_1 \\ x_5 \\ x_3 \\ x_2
\end{pmatrix}
\end{equation*}

If we draw the X-diagram of the mentioned permutation matrix ‘P’, we will see the explanation of such behavior:

*X-diagram of the permutation matrix presented above.*

Given that the matrix has exactly ‘n’ 1s, the constraint that no two 1s may appear in the same column means that exactly one arrow departs from every item of the right stack. The constraint that no two 1s may appear in the same row means that exactly one arrow arrives at every item of the left stack. Finally, the constraint that all the non-zero cells of a permutation matrix must be 1 means that a certain input value x_j, while arriving at output value y_i, is multiplied only by 1 and so remains unchanged. All this results in the values of vector “x” being rearranged in a certain manner.
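The constraints and the resulting rearrangement can be checked with a short sketch in plain Python. The matrix ‘P’ is the one from the formula above; the helper `matvec` and the numeric values of ‘x’ are assumptions used only for illustration.

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


# The permutation matrix from the example above.
P = [[0, 0, 0, 1, 0],
     [1, 0, 0, 0, 0],
     [0, 0, 0, 0, 1],
     [0, 0, 1, 0, 0],
     [0, 1, 0, 0, 0]]

# One arrow arrives at every output (row sums) and
# one arrow departs from every input (column sums).
row_sums = [sum(row) for row in P]
column_sums = [sum(column) for column in zip(*P)]
print(row_sums)     # [1, 1, 1, 1, 1]
print(column_sums)  # [1, 1, 1, 1, 1]

x = [10, 20, 30, 40, 50]  # hypothetical input values x1..x5
y = matvec(P, x)
print(y)  # [40, 10, 50, 30, 20], which is (x4, x1, x5, x3, x2)
```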

Triangular matrix

A triangular matrix is a matrix that has 0s at all cells either below or above its main diagonal. Let’s observe upper-triangular matrices (where the 0s are below the main diagonal), as the lower-triangular ones have similar properties.

\begin{equation*}
\begin{pmatrix}
y_1 \\ y_2 \\ y_3 \\ y_4
\end{pmatrix}
=
\begin{bmatrix}
a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} \\
0 & a_{2,2} & a_{2,3} & a_{2,4} \\
0 & 0 & a_{3,3} & a_{3,4} \\
0 & 0 & 0 & a_{4,4}
\end{bmatrix}
*
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ x_4
\end{pmatrix}
=
\begin{pmatrix}
\begin{aligned}
a_{1,1}x_1 + a_{1,2}x_2 + a_{1,3}x_3 + a_{1,4}x_4 \\
a_{2,2}x_2 + a_{2,3}x_3 + a_{2,4}x_4 \\
a_{3,3}x_3 + a_{3,4}x_4 \\
a_{4,4}x_4
\end{aligned}
\end{pmatrix}
\end{equation*}

Such an expanded notation illustrates that any output value y_i is affected only by input values with greater or equal indexes, which are x_i, x_i+1, x_i+2, …, x_n. If we draw the X-diagram of the mentioned upper-triangular matrix, that fact becomes obvious:

In the X-diagram of an upper-triangular matrix, all arrows are either horizontal or directed upwards, which illustrates the fact that any output value ‘y_i’ is affected only by input values with the same or greater index: ‘x_i’, ‘x_i+1’, ‘x_i+2’, …, ‘x_n’.
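This one-directional influence can be sketched in plain Python. The helper `matvec` and the concrete numbers of the upper-triangular matrix ‘U’ are made up for illustration. Changing x_1 alters only y_1, while changing x_4 can alter every output value.

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


# Hypothetical upper-triangular matrix: all cells below the main diagonal are 0.
U = [[1, 2, 3, 4],
     [0, 5, 6, 7],
     [0, 0, 8, 9],
     [0, 0, 0, 10]]

base = matvec(U, [1, 1, 1, 1])
print(base)  # [10, 18, 17, 10]

# x_1 has a single arrow, the horizontal one towards y_1.
first_changed = matvec(U, [100, 1, 1, 1])
print(first_changed)  # [109, 18, 17, 10]

# x_4 has arrows towards all of y_1 ... y_4.
last_changed = matvec(U, [1, 1, 1, 2])
print(last_changed)  # [14, 25, 26, 20]
```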

Conclusion

In the first story of the series, which is devoted to the interpretation of algebraic matrices, we looked at how matrices can be presented geometrically, and called it the “X-way interpretation”. Such interpretation explicitly highlights various properties of matrix-vector multiplication, as well as the behavior of matrices of several special types.

In the next story of this series, we will explore an interpretation of the multiplication of two matrices by operating on their X-diagrams, so stay tuned for the second arrival!

My gratitude to:
– Roza Galstyan, for careful review of the draft,
– Asya Papyan, for the precise design of all the used illustrations ( https://www.behance.net/asyapapyan ).

If you enjoyed reading this story, feel free to connect with me on LinkedIn, where, among other things, I will also post updates ( https://www.linkedin.com/in/tigran-hayrapetyan-cs/ ).

All used images, unless otherwise noted, are designed by request of the author.
