Solving Linear Systems

Section 2.1 Solving Linear Systems

Subsection Elementary Operations

Our first discussion of linear algebra will cover the ideas of efficiently solving a system of linear equations and matrix operations.

A system of \(m\) linear equations in \(n\) variables can be written:

\begin{equation*} \begin{array}{rcrcrcrcr} a_{11} x_1 \amp +\amp a_{12} x_2 \amp +\amp ... \amp +\amp a_{1n}x_n \amp =\amp b_1 \\ a_{21} x_1 \amp +\amp a_{22} x_2 \amp +\amp ... \amp +\amp a_{2n}x_n \amp =\amp b_2 \\ \vdots \amp \amp \vdots \amp \amp \amp \amp \vdots \amp \amp \\ a_{m1} x_1 \amp +\amp a_{m2} x_2 \amp +\amp ... \amp +\amp a_{mn}x_n \amp =\amp b_m \end{array} \end{equation*}

The term \(a_{ij}\) is the coefficient of the \(j\)-th variable (denoted \(x_j\)) in the \(i\)-th equation. In these notes, we will only consider real values for the coefficients of our linear systems, i.e. \(a_{ij} \in \mathbb{R}\text{.}\) A solution is a choice of variable values that satisfies all equations in the system. A solution is not a value for one particular variable; it must include a choice for every variable in the system. The solution set for a system of equations is the set of all possible solutions. We will have many ways to describe solutions to a system this semester, but they all specify the values of \(x_1\text{,}\) \(x_2\text{,}\) \(\ldots\text{,}\) and \(x_n\text{,}\) typically as an ordered \(n\)-tuple \((x_1, x_2, \ldots, x_n)\text{.}\)

Activity 2.1.

Is \((1,2,3)\) a solution to the following system?

\begin{equation*} \begin{array}{rcrcrcrcr} 1 x_1 \amp +\amp 2 x_2 \amp +\amp 3 x_3 \amp =\amp 14 \\ 2 x_1 \amp -\amp 3 x_2 \amp +\amp 2 x_3 \amp =\amp 0 \\ x_1 \amp \amp \amp +\amp 7 x_3 \amp =\amp 0 \end{array} \end{equation*}

The previous problem shows how easy it is to check whether a set of variable values is a solution. Finding a solution, or the set of all solutions, is harder, but it is very important to many problems. Generally speaking, the process of finding the solution set for a system of equations is to trade the system of equations you have for an equivalent system (a system with the same solution set).
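The check described above is mechanical enough to sketch in a few lines of Python. The function name `is_solution` and the small system used here are hypothetical illustrations, not part of the notes. The sketch substitutes the candidate values into each equation and compares both sides.

```python
from fractions import Fraction


def is_solution(coefficients, constants, candidate):
    """Return True if `candidate` satisfies every equation of the system.

    coefficients[i][j] is the coefficient of the (j+1)-th variable in the
    (i+1)-th equation; constants[i] is the right side of that equation.
    """
    for row, b in zip(coefficients, constants):
        if len(row) != len(candidate):
            raise ValueError("a solution needs one value per variable")
        # Left side of the equation after substituting the candidate values.
        lhs = sum(Fraction(a) * Fraction(x) for a, x in zip(row, candidate))
        if lhs != Fraction(b):
            return False
    return True


# Hypothetical system (not from the notes):
#    x_1 + x_2 = 3
#   2x_1 - x_2 = 0
A = [[1, 1], [2, -1]]
b = [3, 0]

print(is_solution(A, b, (1, 2)))  # True
print(is_solution(A, b, (3, 0)))  # False: the second equation fails
```

Exact fractions are used so that the comparison does not depend on floating-point rounding.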

Activity 2.2.

For each pair of equations given, state whether \(E_1\) is equivalent to \(E_2\text{.}\)

(a)

\(E_1: x^2-1=0\) and \(E_2: x-1=0\)

(b)

\(E_1: x^2-2x+1=0\) and \(E_2: x-1=0\)

(c)

\(E_1: e^x=1\) and \(E_2: x^3+x^2+x=0\)

Hopefully it will be easier to explicitly write the solution set of the new equivalent system. An elementary operation on a system of equations is an operation of the form:

multiplying an equation by a non-zero scalar
switching two equations
adding a multiple of one equation to another equation
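As a small illustration of these operations, consider the following system, which is a hypothetical example and not one of the systems from these notes:

\begin{align*} x+y\amp =3\\ 2x-y\amp =0 \end{align*}

Adding \(-2\) times the first equation to the second equation gives

\begin{align*} x+y\amp =3\\ -3y\amp =-6 \end{align*}

Multiplying the second equation by the non-zero scalar \(-\frac{1}{3}\) gives

\begin{align*} x+y\amp =3\\ y\amp =2 \end{align*}

Finally, adding \(-1\) times the second equation to the first equation gives

\begin{align*} x\amp =1\\ y\amp =2 \end{align*}

The last system makes its only solution, \((1,2)\text{,}\) easy to read off. Substituting back into the first system confirms that \(1+2=3\) and \(2(1)-2=0\text{.}\)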

Activity 2.3.

For this question, we will consider the following system of linear equations:

\begin{align*} a_1 x_1+a_2x_2+a_3x_3\amp =a_4\\ b_1 x_1+b_2x_2+b_3x_3\amp =b_4 \end{align*}

(a)

Multiply the second equation in our system by negative three and state the new system of equations.

(b)

Write a few sentences about why the new system of equations given in the previous part is equivalent to the original system.

(c)

Write a few sentences about why switching the order in which equations are presented in a system does not change the set of solutions.

(d)

Write out the equation obtained by multiplying the second equation in the original system by a non-zero scalar (which we will call \(k\)) and adding the result to the first equation.

(e)

Replace the second equation in the original system with your answer to the previous part; we will call the resulting system System 2. Prove that System 2 is equivalent to the original system. In other words, you need to show that \((c_1,c_2,c_3)\) is a solution of the original system \(S_1\text{:}\)

\begin{align*} a_1 x_1+a_2x_2+a_3x_3\amp =a_4\\ b_1 x_1+b_2x_2+b_3x_3\amp =b_4 \end{align*}

if and only if \((c_1,c_2,c_3)\) is a solution to System 2.

Exercise 2.1.

Solve the following systems just using elementary operations. Remember to show your work.

(a)

\begin{align*} 2y+z\amp =4\\ x-3y+2z\amp =5\\ 2x+y \amp =-2 \end{align*}

(b)

\begin{align*} 3x-2y-z\amp =0\\ 2x+y+z\amp =10\\ x+4y+3z\amp =20 \end{align*}

(c)

\begin{align*} 3x-2y-z\amp =0\\ 2x+y+z\amp =10\\ x+4y+3z\amp=10 \end{align*}

A system of equations is consistent if there exists at least one solution to the system. In other words, a consistent system of equations has a nonempty solution set. A system that is not consistent is said to be inconsistent.

In Exercise 2.1, note that you didn’t change anything but the coefficients in the system of equations as you traded one system for another. Some of the coefficients probably became zero, but you didn’t really eliminate any variables or consider a totally different problem. We will use matrices to efficiently store and manipulate the coefficients in a system of linear equations, since they are all that matter for now. Matrices will have many uses in this and other courses, and we will use capital letters like \(A\) and \(B\) to denote matrices. Matrices will be rectangular arrays with the same number of entries in each row and the same number of entries in each column. The size of a matrix is given (in order) as the number of rows by the number of columns, so a \(3\) by \(2\) matrix has \(3\) rows and \(2\) columns.

In order to specify what entry we are referring to in a matrix, we need an ordered pair of indices telling us the number of the row and number of the column to look in respectively. For instance, if

\begin{equation*} B=\begin{bmatrix} 1\amp 5\amp 0\\\heartsuit\amp \bigstar \amp \blacklozenge \\ £\amp \circledR\amp \maltese \end{bmatrix}\text{,} \end{equation*}

then the \((3,2)\) entry of \(B\) is in the third row and second column. You could also write this as \(B_{3,2}= \circledR\text{.}\) The \(i\)-th row of a matrix \(A\) will be denoted \(row_i(A)\) and the \(j\)-th column will be denoted \(column_j(A)\text{.}\)
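The indexing convention can be sketched in Python, with one caveat: the notes count rows and columns starting from 1, while Python lists count from 0. The helper names `entry`, `row`, and `column` and the matrix `C` below are hypothetical and chosen only for illustration.

```python
def entry(matrix, i, j):
    """Return the (i, j) entry, with rows and columns counted from 1."""
    return matrix[i - 1][j - 1]


def row(matrix, i):
    """Return the i-th row (counted from 1) as a list."""
    return list(matrix[i - 1])


def column(matrix, j):
    """Return the j-th column (counted from 1) as a list."""
    return [r[j - 1] for r in matrix]


# Hypothetical 3 by 3 matrix stored as a list of rows.
C = [[1, 5, 0],
     [2, 7, 4],
     [3, 8, 6]]

print(entry(C, 3, 2))  # 8, the entry in the third row and second column
print(row(C, 2))       # [2, 7, 4]
print(column(C, 3))    # [0, 4, 6]
```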

In order to distinguish vectors (as being more than just \(n\) by \(1\) matrices), we will use the arrow notation and lower case symbols like \(\vec{u}\) and \(\vec{v}\) to denote vectors. Unless otherwise stated, we will use column vectors. For instance, if \(\vec{v} =\begin{bmatrix}v_1\\ v_2\\\vdots\\ v_m\end{bmatrix}\text{,}\) then the second component of \(\vec{v}\) is the scalar \(v_2\text{.}\) The size of a vector in \(\mathbb{R}^n\) is the number of components the vector has. In Chapter 2, we will deal with a much more general notion of vectors that will not have components like vectors in \(\mathbb{R}^n\text{.}\)

The coefficient matrix of a linear system of \(m\) equations in \(n\) variables is an \(m\) by \(n\) matrix whose \((i,j)\) entry is the coefficient of the \(j\)-th variable, \(x_j\text{,}\) in the \(i\)-th equation of the system. The augmented matrix of a linear system of \(m\) equations in \(n\) variables is an \(m\) by \((n+1)\) matrix whose first \(n\) columns are the coefficient matrix of the system and whose last column consists of the constant terms from the right side of each equation.

The system

\begin{equation*} \begin{array}{rcrcrcrcr} a_{11} x_1 \amp +\amp a_{12} x_2 \amp +\amp ... \amp +\amp a_{1n}x_n \amp =\amp b_1 \\ a_{21} x_1 \amp +\amp a_{22} x_2 \amp +\amp ... \amp +\amp a_{2n}x_n \amp =\amp b_2 \\ \vdots \amp \amp \vdots \amp \amp \amp \amp \vdots \amp \amp \vdots \\ a_{m1} x_1 \amp +\amp a_{m2} x_2 \amp +\amp ... \amp +\amp a_{mn}x_n \amp =\amp b_m \end{array} \end{equation*}

has a coefficient matrix

\begin{equation*} A=\begin{bmatrix} a_{11} \amp a_{12} \amp ... \amp a_{1n} \\ a_{21}\amp a_{22}\amp ... \amp a_{2n} \\ \vdots \amp \vdots \amp \amp \vdots \\ a_{m1}\amp a_{m2} \amp ... \amp a_{mn} \end{bmatrix} \end{equation*}

and an augmented matrix of

\begin{equation*} [A|b]=\begin{bmatrix} a_{11} \amp a_{12} \amp ... \amp a_{1n} \amp |\amp b_1\\ a_{21}\amp a_{22}\amp ... \amp a_{2n} \amp |\amp b_2\\ \vdots \amp \vdots \amp \amp \vdots \amp |\amp \vdots \\ a_{m1}\amp a_{m2} \amp ... \amp a_{mn} \amp |\amp b_m \end{bmatrix} \end{equation*}

For some properties of the system of equations, we need only look at the coefficient matrix but others will need the augmented matrix. It is important to know the difference and be able to state which corresponding matrix you are using in your work.
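To make the difference between the two matrices concrete, here is a minimal Python sketch. It assumes a matrix is stored as a list of rows; the function name `augmented_matrix` and the system are hypothetical examples rather than anything from the notes.

```python
def augmented_matrix(coefficients, constants):
    """Append the constant terms as one extra column on the right."""
    if len(coefficients) != len(constants):
        raise ValueError("need exactly one constant term per equation")
    return [list(row) + [b] for row, b in zip(coefficients, constants)]


# Hypothetical system:
#    x_1 + x_2 = 3
#   2x_1 - x_2 = 0
A = [[1, 1], [2, -1]]   # coefficient matrix, 2 by 2
b = [3, 0]              # constant terms

M = augmented_matrix(A, b)
print(M)                        # [[1, 1, 3], [2, -1, 0]]
print(len(M), "by", len(M[0]))  # 2 by 3
```

The coefficient matrix has one column per variable, while the augmented matrix has one more column than there are variables.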

Activity 2.4.

Consider the various systems that have appeared earlier in this section when answering the following questions. That means each question will have several answers. Clearly indicate which earlier system each belongs to.

(a)

What is the coefficient matrix for the previous systems?

(b)

What is the augmented matrix for the previous systems?

The elementary operations on equations outlined above will correspond to elementary row operations on matrices as well. Specifically, an elementary row operation on a matrix is an operation of the form:

multiplying a row by a non-zero scalar
switching two rows
adding a multiple of one row to another row
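The three elementary row operations can be sketched as small Python functions. The function names, the 1-based row numbering, and the sample augmented matrix are assumptions made for this illustration. Each function returns a new matrix and leaves its input unchanged.

```python
from fractions import Fraction


def scale_row(matrix, i, k):
    """Multiply row i (counted from 1) by the non-zero scalar k."""
    if k == 0:
        raise ValueError("the scalar must be non-zero")
    result = [list(r) for r in matrix]
    result[i - 1] = [Fraction(k) * x for x in result[i - 1]]
    return result


def swap_rows(matrix, i, j):
    """Switch rows i and j (counted from 1)."""
    result = [list(r) for r in matrix]
    result[i - 1], result[j - 1] = result[j - 1], result[i - 1]
    return result


def add_multiple(matrix, source, target, k):
    """Add k times row `source` to row `target` (rows counted from 1)."""
    if source == target:
        raise ValueError("source and target must be different rows")
    result = [list(r) for r in matrix]
    result[target - 1] = [
        t + Fraction(k) * s
        for s, t in zip(result[source - 1], result[target - 1])
    ]
    return result


# Hypothetical augmented matrix for x_1 + x_2 = 3, 2x_1 - x_2 = 0.
M = [[1, 1, 3], [2, -1, 0]]
M = add_multiple(M, source=1, target=2, k=-2)   # second row becomes 0, -3, -6
M = scale_row(M, 2, Fraction(-1, 3))            # second row becomes 0, 1, 2
M = add_multiple(M, source=2, target=1, k=-1)   # first row becomes 1, 0, 1
print(M == [[1, 0, 1], [0, 1, 2]])  # True
```

The final matrix corresponds to the system \(x_1 = 1\text{,}\) \(x_2 = 2\text{.}\)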

We now have operations to trade our system of equations for an equivalent system, but we have not stated a way to make sure that the solution set will be easy to explicitly state from our new equivalent system. The following matrix forms will be useful for determining solution sets and various other properties of the corresponding system of equations.

Definition 2.2.

A rectangular matrix is in row echelon form if it has the following three properties:

All nonzero rows are above any rows of all zeros.
Each leading entry (being the first non-zero entry) of a row is in a column to the right of the leading entry of the row above it.
All entries in a column below a leading entry are zeros.

If a matrix in row echelon form satisfies the following additional properties, then we say the matrix is in reduced row echelon form:

The leading entry in each nonzero row is 1.
Each leading 1 is the only nonzero entry in its column.

The leading entry in a nonzero row of the row echelon form is called a pivot. The column in which a pivot occurs is called a pivot column and the corresponding variable is a basic variable or pivot variable. A variable corresponding to a column in which the coefficient matrix does not have a pivot is called a free variable. While the echelon form is needed to find where pivots will occur, we will sometimes refer to pivot positions of a matrix even when the matrix is not in echelon form.
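The definitions above can be checked mechanically. The following Python sketch tests each listed property in turn and reads off the pivot columns of a matrix that is already in row echelon form. All function names and the sample matrices are hypothetical.

```python
def leading_index(row):
    """0-based position of the first nonzero entry, or None for a zero row."""
    for j, x in enumerate(row):
        if x != 0:
            return j
    return None


def is_row_echelon(matrix):
    previous = -1
    seen_zero_row = False
    for i, row in enumerate(matrix):
        lead = leading_index(row)
        if lead is None:
            seen_zero_row = True
            continue
        if seen_zero_row:
            return False  # a nonzero row sits below a row of all zeros
        if lead <= previous:
            return False  # leading entry is not to the right of the one above
        if any(r[lead] != 0 for r in matrix[i + 1:]):
            return False  # nonzero entry below a leading entry
        previous = lead
    return True


def is_reduced_row_echelon(matrix):
    if not is_row_echelon(matrix):
        return False
    for i, row in enumerate(matrix):
        lead = leading_index(row)
        if lead is None:
            continue
        if row[lead] != 1:
            return False  # leading entry is not 1
        if any(r[lead] != 0 for k, r in enumerate(matrix) if k != i):
            return False  # leading 1 is not alone in its column
    return True


def pivot_columns(echelon_matrix):
    """Pivot columns (counted from 1) of a matrix in row echelon form."""
    leads = [leading_index(row) for row in echelon_matrix]
    return [lead + 1 for lead in leads if lead is not None]


E = [[2, 1, 0, 3],
     [0, 0, 5, 1],
     [0, 0, 0, 0]]
print(is_row_echelon(E))          # True
print(is_reduced_row_echelon(E))  # False
print(pivot_columns(E))           # [1, 3]
```

If `E` is read as a coefficient matrix in four variables, the pivot variables are \(x_1\) and \(x_3\text{,}\) and the free variables are \(x_2\) and \(x_4\text{.}\)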

Theorem 2.3.

The reduced row echelon form of a rectangular matrix is unique.

It is important to note that the row echelon form of a matrix is not unique.
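A short Python sketch can illustrate both statements. The function `rref` below is one common way to organize Gauss-Jordan elimination and is not necessarily the procedure used in class. The matrices `E1` and `E2` are hypothetical examples. They are two different row echelon forms related by row scaling, yet they reduce to the same reduced row echelon form.

```python
from fractions import Fraction


def rref(matrix):
    """Reduce a matrix to reduced row echelon form with exact fractions."""
    M = [[Fraction(x) for x in row] for row in matrix]
    rows = len(M)
    cols = len(M[0]) if M else 0
    pivot_row = 0
    for col in range(cols):
        if pivot_row == rows:
            break
        # Look for a row at or below pivot_row with a nonzero entry here.
        candidates = [r for r in range(pivot_row, rows) if M[r][col] != 0]
        if not candidates:
            continue  # no pivot in this column
        r = candidates[0]
        M[pivot_row], M[r] = M[r], M[pivot_row]            # switch two rows
        lead = M[pivot_row][col]
        M[pivot_row] = [x / lead for x in M[pivot_row]]    # scale to a leading 1
        for other in range(rows):
            if other != pivot_row and M[other][col] != 0:
                factor = M[other][col]
                # Add a multiple of the pivot row to clear this column entry.
                M[other] = [a - factor * p
                            for a, p in zip(M[other], M[pivot_row])]
        pivot_row += 1
    return M


E1 = [[1, 2, 3], [0, 1, 1]]
E2 = [[2, 4, 6], [0, 3, 3]]   # rows of E1 multiplied by 2 and by 3

print(E1 != E2)                       # True: two different echelon forms
print(rref(E1) == rref(E2))           # True: one reduced echelon form
print(rref(E1) == [[1, 0, 1], [0, 1, 1]])  # True
```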

Activity 2.5.

Give an example of a matrix \(M\) that has the following properties. If such a matrix cannot exist, explain why.

(a)

\(M\) satisfies the first two properties of row echelon form but does not satisfy the third.

(b)

\(M\) satisfies the first and third properties of row echelon form but does not satisfy the second.

(c)

\(M\) satisfies the second and third properties of row echelon form but does not satisfy the first.

(d)

\(M\) satisfies the three properties of row echelon form but does not satisfy the first property of reduced row echelon form.

(e)

\(M\) satisfies the properties of row echelon form and the first property of reduced row echelon form but does not satisfy the second property of reduced row echelon form.

Exercise 2.4.

List out all possible row echelon forms of 3 by 4 matrices using the symbols \(\blacksquare\) for a pivot, \(*\) for a non-pivot entry (possibly \(0\)), and \(0\) (when an entry must be \(0\)). For each of these, list out which variables are pivot variables and which are free variables.

Hint.

There are 15 possible forms.

Exercise 2.5.

List out all possible reduced row echelon forms of 3 by 4 matrices using the symbols \(\blacksquare\) for a pivot, \(*\) for a non-pivot entry (possibly \(0\)), and \(0\) (when an entry must be \(0\)). What value must the \(\blacksquare\) entries be? For each of these, list out which variables are pivot variables and which are free variables.

Exercise 2.6.

Solve each of the following systems by converting to an augmented matrix and using elementary row operations to reduce the augmented matrix to reduced row echelon form. With each reduced row echelon form, put a box around all pivot entries. Use the system of equations corresponding to the reduced row echelon form to write out the solution set for each system.

(a)

\begin{align*} 3x_1-2x_2\amp =6\\ 2x_1-2x_2\amp =-2\\ -x_1+x_2\amp =1 \end{align*}

(b)

\begin{align*} 3x_1-2x_2\amp =6\\ 2x_1-2x_2\amp=6\\ -x_1+x_2\amp =1 \end{align*}

(c)

\begin{align*} 4x-y+3z\amp =5\\ 3x-y+2z\amp =7 \end{align*}

(d)

\begin{align*} 7x-11y-2z\amp =3\\ 8x-2y+3z\amp =1 \end{align*}

(e)

\begin{align*} 3r-5s+t\amp =2\\ -6r+10s-2t\amp =3 \end{align*}

Question 2.6.

Once you have the augmented matrix for a system of linear equations in reduced row-echelon form, how do you use it to determine the solution set for the system? Write a step-by-step procedure that is general enough to be used on any system of linear equations. Be aware of any implicit assumptions you’re making (and try to avoid them).

Two of the most important questions we will consider this semester are:

Is the system consistent?
If a solution exists, is the solution unique?

Question 2.7.

Look back at your results so far and try to figure out what properties of the system (or corresponding matrices) will help us answer question 1 and which properties of the system will help us answer question 2. Write a conjecture about each question.

In class, we came up with statements of the following two theorems:

Theorem 2.7. Consistency Theorem.

A system of equations is consistent if and only if the row echelon form of its augmented matrix has no pivot entries in the rightmost column. Equivalently, a system of equations is inconsistent if and only if the row echelon form of its augmented matrix has a pivot entry in the rightmost column.

Theorem 2.8. Uniqueness Theorem.

A system of \(m\) equations with \(n\) variables has a unique solution if and only if its augmented matrix has \(n\) pivot entries and no pivot entry in the rightmost column.

For the moment, proving these theorems is beyond our proof-writing skills. We may return to proving these theorems at a later stage, however.
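Although the proofs are postponed, the two statements are easy to apply by machine. The Python sketch below finds the pivot columns of an augmented matrix by forward elimination and then applies the Consistency Theorem and the Uniqueness Theorem as stated above. The function names `pivot_columns` and `classify` and the three sample systems are hypothetical.

```python
from fractions import Fraction


def pivot_columns(matrix):
    """0-based pivot columns, found by reducing to a row echelon form."""
    M = [[Fraction(x) for x in row] for row in matrix]
    rows, cols = len(M), len(M[0])
    pivots = []
    pivot_row = 0
    for col in range(cols):
        if pivot_row == rows:
            break
        r = next((r for r in range(pivot_row, rows) if M[r][col] != 0), None)
        if r is None:
            continue  # no pivot in this column
        M[pivot_row], M[r] = M[r], M[pivot_row]
        for below in range(pivot_row + 1, rows):
            factor = M[below][col] / M[pivot_row][col]
            M[below] = [a - factor * p
                        for a, p in zip(M[below], M[pivot_row])]
        pivots.append(col)
        pivot_row += 1
    return pivots


def classify(augmented):
    """Classify a system from its augmented matrix."""
    n = len(augmented[0]) - 1          # number of variables
    pivots = pivot_columns(augmented)
    if n in pivots:                    # pivot in the rightmost column
        return "inconsistent"
    if len(pivots) == n:               # n pivots, none in the last column
        return "unique solution"
    return "multiple solutions"


print(classify([[1, 1, 3], [2, -1, 0]]))  # unique solution
print(classify([[1, 1, 3], [2, 2, 6]]))   # multiple solutions
print(classify([[1, 1, 3], [2, 2, 7]]))   # inconsistent
```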

Activity 2.8.

Using the statement of the Consistency Theorem and Uniqueness Theorem, treat each of your answers to Exercise 2.4 as an augmented matrix of a linear system of equations and state:

whether each corresponding system of equations will be consistent, inconsistent, or you can’t tell.
whether each corresponding system of equations will have a unique solution, multiple solutions, no solutions, or you can’t tell.

Activity 2.9.

Using the statement of the Consistency Theorem and Uniqueness Theorem, treat each of your answers to Exercise 2.4 as a coefficient matrix of a linear system of equations and state:

whether each corresponding system of equations will be consistent, inconsistent, or you can’t tell.
whether each corresponding system of equations will have a unique solution, multiple solutions, no solutions, or you can’t tell.

Hint.

You will probably need to restate the theorems or think about how coefficient matrices are different to augmented matrices!

Geometric Interpretation of a Solution Set.

Recall from earlier that the solution set of a linear equation in two variables was a line in \(\mathbb{R}^2\) (the plane) and that the solution set of a system of two equations in two variables was possibly a point, a line, or empty. Similarly, the solution set for a linear equation in three variables will be a plane in 3-space (\(\mathbb{R}^3\)).
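The three possibilities for two lines in the plane can be told apart with a short computation. The Python sketch below is a hypothetical illustration. It assumes each line is given as a triple \((a, b, c)\) meaning \(ax+by=c\) with \(a\) and \(b\) not both zero, and it uses the quantity \(a_1b_2-a_2b_1\) to decide whether the lines cross at a single point.

```python
from fractions import Fraction


def intersect_two_lines(line1, line2):
    """Describe the solution set of two linear equations in two variables.

    Each line is (a, b, c), meaning a*x + b*y = c with (a, b) != (0, 0).
    """
    a1, b1, c1 = map(Fraction, line1)
    a2, b2, c2 = map(Fraction, line2)
    det = a1 * b2 - a2 * b1
    if det != 0:
        # The lines are not parallel, so they meet in exactly one point.
        x = (c1 * b2 - c2 * b1) / det
        y = (a1 * c2 - a2 * c1) / det
        return ("point", (x, y))
    # Parallel lines: either the same line or no common point.
    if a1 * c2 == a2 * c1 and b1 * c2 == b2 * c1:
        return ("line", None)
    return ("empty", None)


print(intersect_two_lines((1, 1, 3), (2, -1, 0))[0])  # point
print(intersect_two_lines((1, 1, 3), (2, 2, 6))[0])   # line
print(intersect_two_lines((1, 1, 3), (2, 2, 7))[0])   # empty
```

In the first example the common point is \((1,2)\text{.}\)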

Activity 2.10.

List out all the possible ways two planes can intersect in a three dimensional space.
List out all the possible ways three planes can intersect in a three dimensional space.
List out all the possible ways four planes can intersect in a three dimensional space.
List out all the possible ways five planes can intersect in a three dimensional space.

We don’t usually draw what a solution set of a linear equation in four variables looks like because drawing in four dimensions is difficult. The graph of a single linear equation in four variables would be called a hyperplane in \(4\)-space. Although we don’t draw \(m\) hyperplanes in \(n\)-space, the intersections of hyperplanes will work very similarly to the pictures we can draw in 3-space (also known as \(\mathbb{R}^3\)).

Question 2.11.

Why does the graph of a linear equation in \(n\) variables need to be a flat \(n-1\) dimensional hyperplane?

We can use the open source computer algebra system SageMath to plot things, and we can even do it right here in the course notes. Click the button to plot a plane below.

[Plot: the planes \(3x-2y-z=0\text{,}\) \(2x+y+z=10\text{,}\) and \(x+4y+3z=20\) drawn in red, yellow, and green respectively.]

Question 2.12.

Does your answer to Exercise 2.1(b) make sense with this plot? Explain.

Exercise 2.9.

For each of the systems in Exercise 2.6, use SageMath to draw a plot of each of the equations in the system and write a sentence for each system about why the plot and your answer to Exercise 2.6 make sense.

Hint.

You can edit the code block above and click the button again, and it will update the graph.

If you remember parametric equations of lines and planes in space from multivariable calculus, keep them in mind; we will return to those ideas soon.
