
Thursday, November 3 

Inner product review
If x=(x_{1},x_{2},...,x_{n}) and y=(y_{1},y_{2},...,y_{n}) are vectors in
R^{n}, then
<x,y>=SUM_{j=1}^{n}x_{j}y_{j}.
There are some very useful fundamental properties of the inner product.
Vectors x and y in R^{n} are orthogonal or perpendicular if <x,y>=0.
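These definitions are easy to try on a machine. A minimal sketch in Python (the function names are my own, not from class):

```python
def inner(x, y):
    """<x,y> = sum of x_j * y_j."""
    return sum(xj * yj for xj, yj in zip(x, y))

def orthogonal(x, y):
    """x and y are perpendicular exactly when <x,y> = 0."""
    return inner(x, y) == 0

print(inner((1, 2, 3), (4, 5, 6)))        # 4 + 10 + 18 = 32
print(orthogonal((1, 0, -1), (1, 2, 1)))  # 1 + 0 - 1 = 0, so True
```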
Significant example
I wanted to diagonalize the matrix A given by

( 0 -1  1)
(-1  1  2)
( 1  2  1)

I will need (if possible, but it will be possible!) to find an invertible matrix C and a diagonal matrix D so that C^{-1}AC=D. The diagonal entries in D will be the eigenvalues of A, in order, and the columns of C will be corresponding eigenvectors of A. The characteristic polynomial of A is

    ( x    1   -1 )
det ( 1   x-1  -2 )
    (-1   -2   x-1)

which I computed somehow and it was what I thought it would be. Maple has the following command:

>charpoly(A,x);
x^3 - 2x^2 - 5x + 6
I found one eigenvector, and students, as their QotD, found the others.
If λ=1, take (2,-1,1).
If λ=-2, take (1,1,-1).
If λ=3, take (0,1,1).
I asked students to look at these eigenvectors. They did, and we
didn't seem to learn much.
The C matrix is

( 2  1  0)
(-1  1  1)
( 1 -1  1)

Now look at C^{t}:

( 2 -1  1)
( 1  1 -1)
( 0  1  1)

We computed the matrix product C^{t}C. The result was

(6 0 0)
(0 3 0)
(0 0 2)

The 0's were caused by the orthogonality of the different vectors in the basis of eigenvectors. If the result had diagonal entries 1, 1, and 1, then the transpose would be the inverse. But I can make this happen by choosing slightly different eigenvectors: multiplying them by a scalar ("normalizing" them) to have length 1. So choose C to be, instead,

( 2/sqrt(6)  1/sqrt(3)  0        )
(-1/sqrt(6)  1/sqrt(3)  1/sqrt(2))
( 1/sqrt(6) -1/sqrt(3)  1/sqrt(2))

and then the transpose of C will be C^{-1}. This really is remarkable.
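A quick machine check of the whole computation; NumPy is standing in for Maple here, and the signs of A and of the eigenvectors are as I read them from the class example:

```python
import numpy as np

# the symmetric matrix from class, signs as I read them
A = np.array([[0., -1., 1.],
              [-1., 1., 2.],
              [1., 2., 1.]])
# the eigenvectors for eigenvalues 1, -2, 3, as columns
C = np.array([[2., 1., 0.],
              [-1., 1., 1.],
              [1., -1., 1.]])
C = C / np.sqrt((C ** 2).sum(axis=0))   # normalize each column to length 1
print(np.allclose(C.T @ C, np.eye(3)))  # True: C^t really is C^{-1}
print(np.round(C.T @ A @ C, 10))        # diag(1, -2, 3)
```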
The major result
We can always diagonalize any symmetric matrix by using a matrix of
normalized eigenvectors. The eigenvectors are orthogonal, and the
resulting matrix used for changing bases has the wonderful property
that its inverse is its transpose. A matrix C such that
C^{-1}=C^{t} is called orthogonal.
Here's another version:
Every symmetric matrix has a basis of orthogonal eigenvectors, which, when normalized, form an orthogonal matrix. The matrix of eigenvectors and its inverse (which is just its transpose) diagonalize the symmetric matrix.
This result is not easy to prove, and is usually the triumph (?) of our Math 250 course.
Why is an orthonormal basis (length of each basis element 1 and each perpendicular to the others) nice to compute with? Well, look:

( 3 7 | 1)   ( 1 7/3  | 1/3)   ( 1 0 | 5/29)
(-2 5 | 0) ~ ( 0 29/3 | 2/3) ~ ( 0 1 | 2/29)

( 3 7 | 0)   ( 1 7/3  | 0)   ( 1 0 | -7/29)
(-2 5 | 1) ~ ( 0 29/3 | 1) ~ ( 0 1 |  3/29)

( 3 7 | 8)    ( 1 7/3  | 8/3 )   ( 1 0 | -30/29)
(-2 5 | 10) ~ ( 0 29/3 | 46/3) ~ ( 0 1 |  46/29)

Every time I try arithmetic in public ... well, this really isn't too bad.
Thinking about what we did
Maybe we should think about what we did. Certainly a number of
questions arise. I can think of two questions, immediately.
And now some more
Well, I could think a bit about what's going on. We are really
studying a linear system. Here it is:
3x_{1}+7x_{2}=y_{1}
-2x_{1}+5x_{2}=y_{2}
Maybe look at the idea:
y_{1} --> [ SIMPLE ] --> x_{1}
y_{2} --> [ LINEAR ] --> x_{2}
          [ SYSTEM ]

The system of linear equations reflects some sort of "device", if you wish. The inputs, the pair of y's, push some information out to the x's. Everything is linear, a very simple model which is hardly ever fulfilled in real life (if you believe in Hooke's Law, pull a rubber band for twenty feet!). But if we multiply the y's (in this model) by, say, 365, then the x's will get multiplied by 365. If, also, we have two different pairs of y's and add them, and compare the pairs of x's, the result should be the sum of the outputs, the x's, from the two (pairs of) inputs.
Well, the x's which correspond to the input of

(1)                (5/29)
(0)  are the pair  (2/29)

and the x's which correspond to the input of

(0)                (-7/29)
(1)  are the pair  ( 3/29)

Now in the third example, we want the x's which correspond to

( 8)    (1)    (0)
(10) = 8(0)+10(1)

but if you believe in what I wrote above, this must be

 (5/29)   (-7/29)   ({40-70}/29)   (-30/29)
8(2/29)+10( 3/29) = ({16+30}/29) = ( 46/29)

Hey, hey: this is the answer we got. It certainly should be.
The inverse of a matrix
Of course this should all be systematized. Well, the answer can
be recognized as a matrix product:

( 5/29  -7/29) ( 8)
( 2/29   3/29) (10)

and the 2-by-2 matrix which appears there is called A inverse, usually written A^{-1}. If you multiply A by this matrix, the result is the matrix

(1 0)
(0 1)

which is called the 2-by-2 identity matrix.
Definition Suppose A is an n-by-n matrix. Then the inverse of A, usually written A^{-1}, is an n-by-n matrix whose product with A is the n-by-n identity matrix, I_{n} (a square matrix with diagonal entries equal to 1 and off-diagonal entries equal to 0).
Now if

A = ( 3 4 )   and, say,   C = ( -3    2  )
    ( 5 6 )                   ( 5/2 -3/2 )

you can check that

AC = ( 1 0 ) = I_{2}
     ( 0 1 )

so that C is A^{-1}. Solving the linear system AX=B where

X = (x)   and   B = (u)
    (y)             (v)

can be done by multiplying AX=B by C on the left, and we get X=I_{2}X=(CA)X=C(AX)=CB.
An algorithm for finding inverses
If A is a square matrix then augment A by I_{n}, an identity
matrix of the same size: (A|I_{n}). Use row reduction to get
(I_{n}|C). Then C will be A^{-1}. If row reduction is
not successful (the I_{n} doesn't appear), A does not have an
inverse (this can happen: rank(A) could be less than n).
How much work is computing A^{-1} this way? Here I used the word "work" in a rather elementary sense. I tried to convince people that a really rough bound on the number of arithmetic operations (add, multiply, divide, etc.) to find A^{-1} using this method is, maybe, 6n^{3}.
Where did this come from? Each time we clear a column to make another column of the identity matrix, we might need to divide or multiply a whole row by a number and add or subtract it from another row. That is about 2n operations per row (it is actually closer to 6n if you count a bit more carefully), there are n rows to fix, and there are n columns to clear. So there should be about 6n^{3} work. In fact, if you are careful and clever, the coefficient can be reduced fairly easily (but not a lot). More significant in the "real world" is the exponent, which really pushes the computational growth. If you are very clever, the exponent can be reduced, but not by a lot. I also mentioned that of course the storage for this computation is about 2n^{2} (the size of the augmented matrix). All of this actually isn't too bad. Finding A^{-1} this way is a problem which can be computed in polynomial time. Such problems are actually supposed to be rather nice computationally.
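Here is a sketch of the algorithm in Python (my own code, not anything from class): augment by the identity, row reduce, and read off the inverse.

```python
import numpy as np

def inverse_by_rref(A):
    """Row reduce (A | I_n) to (I_n | C); then C = A^{-1}."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    M = np.hstack([A, np.eye(n)])            # the augmented matrix (A | I_n)
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))
        if np.isclose(M[pivot, col], 0.0):   # no pivot available in this column
            raise ValueError("rank(A) < n: no inverse")
        M[[col, pivot]] = M[[pivot, col]]    # swap the pivot row up
        M[col] /= M[col, col]                # make a leading 1
        for row in range(n):
            if row != col:                   # clear the rest of the column
                M[row] -= M[row, col] * M[col]
    return M[:, n:]

print(np.round(inverse_by_rref([[3, 4], [5, 6]]), 10))  # the C from the example
```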
Vocabulary
A matrix which has no inverse is called singular. A matrix
which has an inverse is called invertible or
nonsingular or regular.
How can it fail
The algorithm can fail if there is a column which has only 0's where
there should be something nonzero. This occurs if the rank of the
matrix is not n.
AX=B, where A is an n-by-n matrix. The considerations:

Full rank: regular; nonsingular; invertible.
  AX=0 has only the trivial solution.
  AX=Y has a unique solution for all Y. If you know A^{-1}, then X=A^{-1}Y.

Singular.
  AX=0 has infinitely many solutions in addition to the trivial solution.
  There are Y's for which AX=Y has no solution; for all other Y's, AX=Y has infinitely many solutions.

There are really two problems here. Suppose that A is

( 3 1 3)
( 2 0 1)
( 2 2 2)

and that Y =

( 5)
( 2)
( 3).

Find X solving AX=Y.

Thursday, October 13 

( BLOCK OF 1's & 0's, THE 1's  |      | linear stuff  )
( MOVING DOWN AND TO THE RIGHT | JUNK | from original )
(                              |      | right sides   )
( MAYBE HERE SEVERAL ROWS OF 0's      | more such     )
(                                     | linear stuff  )

Please observe that the lower right-hand corner now plays the part of the compatibility conditions which must be satisfied. All of those linear "fragments" must be equal to 0 if the original system has solutions. Now if these are 0, we can "read off" solutions in much the same manner as the example. The block labeled JUNK in fact tells us with its width how many free parameters there are in the solutions. Notice that the JUNK block could have size 0 (for example, consider the silly system x_{1}=567,672, already in RREF!) and in that case the system would have only one solution.
Problem
Are the vectors (4,3,2) and (3,2,3) and (-4,4,3) and (5,2,1) in
R^{3} linearly independent? Now I wrote the vector
equation:
A(4,3,2)+B(3,2,3)+C(-4,4,3)+D(5,2,1)=0 (this is (0,0,0) for this
instantiation of "linear independence") which gives me the system:
4A+3B-4C+5D=0 (from the first components of the vectors)
3A+2B+4C+2D=0 (from the second components of the vectors)
2A+3B+3C+1D=0 (from the third components of the vectors)
and therefore I would need to row reduce

( 4 3 -4 5 )
( 3 2  4 2 )
( 2 3  3 1 )

I started to do this, but then ... thought a bit. My goal was to get the RREF of this matrix, and use that to argue about whether the original system had solutions other than the trivial solution.
What can these RREF's look like? Let me write out all of the possible 3 by 4 RREF's.

Those RREF's which have three initial 1's:

( 1 0 0 * )  ( 1 0 * 0 )  ( 1 * 0 0 )  ( 0 1 0 0 )
( 0 1 0 * )  ( 0 1 * 0 )  ( 0 0 1 0 )  ( 0 0 1 0 )
( 0 0 1 * )  ( 0 0 0 1 )  ( 0 0 0 1 )  ( 0 0 0 1 )

Those RREF's which have two initial 1's:

( 1 0 * * )  ( 1 * 0 * )  ( 1 * * 0 )  ( 0 1 0 * )  ( 0 1 * 0 )  ( 0 0 1 0 )
( 0 1 * * )  ( 0 0 1 * )  ( 0 0 0 1 )  ( 0 0 1 * )  ( 0 0 0 1 )  ( 0 0 0 1 )
( 0 0 0 0 )  ( 0 0 0 0 )  ( 0 0 0 0 )  ( 0 0 0 0 )  ( 0 0 0 0 )  ( 0 0 0 0 )

Those RREF's which have one initial 1:

( 1 * * * )  ( 0 1 * * )  ( 0 0 1 * )  ( 0 0 0 1 )
( 0 0 0 0 )  ( 0 0 0 0 )  ( 0 0 0 0 )  ( 0 0 0 0 )
( 0 0 0 0 )  ( 0 0 0 0 )  ( 0 0 0 0 )  ( 0 0 0 0 )

Finally, the only RREF with no initial 1:

( 0 0 0 0 )
( 0 0 0 0 )
( 0 0 0 0 )

In all of these 3 by 4 matrices, the entry * stands for something which could be any number, zero or nonzero. I hope I have them all. The point of this silly exercise was to base my conclusions upon some structural element of the RREF rather than to do particular arithmetic with the matrix. I am interested in whether the homogeneous system represented by these matrices has nontrivial solutions.
Let's look at the first answer:
( 1 0 0 * )
( 0 1 0 * )
( 0 0 1 * )

Here the system is A+*D=0 and B+*D=0 and C+*D=0. There is no constraint on what D must be. Therefore there are actually nontrivial solutions. So any RREF with any *'s in it represents a homogeneous system with nontrivial solutions.
Now look at this one:

( 0 0 1 0 )
( 0 0 0 1 )
( 0 0 0 0 )

Now the system is C=0 and D=0. The variables A and B are assigned columns which are all 0's, and therefore they have nothing to do with the system. Their values are unconstrained. Thus there are again nontrivial solutions to the original homogeneous system.
Look at all of the RREF's displayed. They have *'s or columns which are all 0's or, sometimes, both. In all cases, the homogeneous system has nontrivial solutions. The maximum number of initial 1's is 3, but there are 4 columns, and that inequality (4>3) produces either *'s or columns of 0's. This holds in general.
A homogeneous system with more variables than equations always has an infinite number of nontrivial solutions. 
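The boxed fact is easy to watch happen numerically. A sketch (the matrix is the one from the problem above, with the signs as I read them; the SVD is one standard way to produce a null vector):

```python
import numpy as np

A = np.array([[4., 3., -4., 5.],
              [3., 2., 4., 2.],
              [2., 3., 3., 1.]])   # 3 equations, 4 unknowns
# the last right singular vector always lies in the null space when there
# are more columns than the rank (here rank <= 3 < 4)
_, _, Vt = np.linalg.svd(A)
x = Vt[-1]
print(np.allclose(A @ x, 0), np.linalg.norm(x) > 0)  # a nontrivial solution
```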
Many problems in engineering and science turn out to be successfully modeled by systems of linear equations. My examples so far have mostly been "inspired" by the things I believe you will see in studying ODE's and PDE's, and have also been influenced by some of the primary objects in numerical analysis. The textbook has a wonderful diagram, a sort of flow chart, for solving linear systems on p. 366. I think our discussions in class have been sufficient for you to verify the diagram, and the diagram contains just about everything you need to know about the theory behind solution of systems of linear equations.
My class metaphor here was the reduced row echelon monster, whose name is 'Tilda. 'Tilda roams the linear algebra woods, and eats matrices, and excretes them in RREF. 'Tilda is a large and peaceful lizard, with a much simpler internal construction than, say, the derivative monster. The internal organs of 'Tilda only include add, subtract, multiply, and divide, for use in the row operations. The derivative monster is much more ill-tempered, because it must deal with many different digestive processes.
The only remaining definition you need is that of the rank of a matrix. The rank is the number of nonzero rows in the RREF of the matrix. The 3 by 4 RREF's displayed above are shown with rank=3, then rank=2, then rank=1, and finally, rank=0. Here is my version of the textbook's diagram:
For m linear equations in n unknowns, AX=B. Two cases: B=0 and B not 0. Let rank(A)=r. The red letters refer to examples which I will give to illustrate each outcome. Also (vocabulary!), consistent means the system has solutions, and inconsistent means there are none. Here is a listing of the alternatives.

The B=0 case, the homogeneous system, always has the trivial solution (used for, say, deciding linear independence). So the B=0 case is always consistent. Two alternatives can occur:

AX=0: m equations in n unknowns; B=0
  I.  Unique sol'n, X=0: rank(A)=n.  [A]
  II. Infinite number of sol'ns: rank(A)<n, with n-r arbitrary parameters in the sol'ns.  [B]

When B is not zero then we've got:

AX=B: m equations in n unknowns; B not 0
  Consistent: rank(A)=rank(A|B).
    III. Unique solution: rank(A)=n.  [C]
    IV.  Infinite number of sol'ns: rank(A)<n, with n-r arbitrary parameters in the sol'ns.  [D]
  Inconsistent: rank(A)<rank(A|B).
    V.   No solutions.  [E]
By the way, I do not recommend that you memorize this information. No one I know has done this, not even the most compulsive. But everyone I know who uses linear algebra has this installed in their brains. As I mentioned in class, I thought that a nice homework assignment would be for students to find examples of each of these (in fact, there have already been examples of each of these in the lectures!). The problem with any examples done "by hand" is that they may not reflect reality. To me, reality might begin with 20 equations in 30 unknowns, or maybe 2,000 equations ....
The examples follow, and are (I hope!) simple. They are different from the examples I gave in class, since I certainly left my brain home before class. Remember: m=number of equations; n=number of variables; r=rank(A), where A is the coefficient matrix:
( 2 3)   ( 1 3/2 )   ( 1 0 )
(-5 7) ~ ( 0 29/2) ~ ( 0 1 )

I think you could already have seen that the rows of this 2-by-2 matrix were linearly independent just by looking at it (don't try "just looking at" a random 500-by-300 matrix!), so r=2. There is a unique solution, x=0 and y=0, the trivial solution.
Here's another example of this case:
2x+3y=0
-5x+7y=0
4x+5y=0
m=3; n=2; r=2. r=2 since r is at least 2 using the previous row
reduction, and r can be at most 2 since the number of variables is 2.
There is a
unique solution, x=0 and y=0, the trivial solution.
( 2 3 | 1)   ( 1 3/2  | 1/2)   ( 1 0 | 1/2-(3/2)·(9/29))
(-5 7 | 2) ~ ( 0 29/2 | 9/2) ~ ( 0 1 | 9/29 )

There's exactly one solution, which row reduction has produced: x=1/2-(3/2)·(9/29)=1/29 and y=9/29.
( 2 3 | 1)   ( 1 3/2  | 1/2)   ( 1 0 | 1/2-(3/2)·(9/29))
(-5 7 | 2) ~ ( 0 29/2 | 9/2) ~ ( 0 1 | 9/29 )
( 4 5 | 3)   ( 0 1    | -1 )   ( 0 0 | -1-(9/29) )

I am lazy and I know that -1-(9/29) is not 0, so the row reduction showed that rank(A)=2<rank(A|B)=3: case V, with no solutions.
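The rank test for consistency is mechanical. A sketch checking the last example with NumPy (signs of the matrix as I read them):

```python
import numpy as np

A = np.array([[2., 3.], [-5., 7.], [4., 5.]])
B = np.array([[1.], [2.], [3.]])
rank_A = np.linalg.matrix_rank(A)
rank_AB = np.linalg.matrix_rank(np.hstack([A, B]))
print(rank_A, rank_AB)   # 2 3: rank(A) < rank(A|B), so case V, no solutions
```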
HOMEWORK
Please read section 8.3 and hand in these problems on Monday: 8.3: 9, 13, 16, 19.

Thursday, October 5 

I began by writing the following on the side board.
A linear combination of vectors is a sum of scalar multiples of the
vectors.
A collection of vectors is spanning if every vector can be written as the linear combination of vectors in the collection. A collection of vectors is linearly independent if, whenever a linear combination of the vectors is the zero vector, then every scalar coefficient of that linear combination must be zero. 
The language and ideas of linear algebra are used everywhere in applied science and engineering. Basic calculus deals fairly nicely with functions defined by formulas involving standard functions. I asked how we could understand more realistic data points. I will simplify, because I am lazy. I will assume that we measure some quantity at one unit intervals. Maybe we get the following:
[Tabular presentation of the data | Graphical presentation of the data: the table and the plot are not reproduced here.]

We could look at the tabular data and try to understand it, or we could plot the data because the human brain has lots of processing ability for visual information. But a bunch of dots is not good enough: we want to connect the dots. O.k., in practice this means that we would like to fit our data to some mathematical model. For example, we could try (not here!) the best fitting straight line or exponential or ... lots of things. But we could try to do something simpler. We could try to understand this data as just points on a piecewise linear graph. So I will interpolate the data with line segments, and even "complete" the graph by pushing down the ends to 0. The result is something like what's drawn to the right. I will call the function whose graph is drawn F(x).
Meet the tent centered at the integer j
First, here is complete information about the function T_{j}(x), which
you are supposed to get from the graph of T_{j}(x) (j should be an
integer; the graph itself is not reproduced here).
This is a peculiar function. It is 0 for x<j1 and for x>j+1. It has height 1 at j, and interpolates linearly through the points (j1,0), (j,1), and (j+1,0). I don't much want an explicit formula for T_{j}(x): we could clearly get one, although it would be a piecewise thing. Actually, T_{j}(x) could be written nicely in Laplace transform language!
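Although I said I don't much want a formula, there is a compact one matching the description above: T_{j}(x)=max(0, 1-|x-j|). A sketch in Python:

```python
def T(j, x):
    """The tent centered at j: 0 outside (j-1, j+1), peak value 1 at j,
    linear interpolation in between."""
    return max(0.0, 1.0 - abs(x - j))

print(T(3, 3), T(3, 2.5), T(3, 2), T(3, 7))   # 1.0 0.5 0.0 0.0
```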
We can write F(x) in terms of these T_{j}(x)'s. Moving (as we
were accustomed in Laplace transforms!) from left to right, we first
"need" T_{1}(x). In fact, consider 5T_{1}(x) and
compare it to F(x). I claim that these functions are exactly
the same for x<=1. Well, they are both 0 for x<=0. Both of these
functions linearly interpolate between the points (0,0) and (1,5), so
in the interval from 0 to 1 the graphs must be the same (two points
still do determine a unique line!). Now consider
5T_{1}(x)+6T_{2}(x) and F(x) compared for x's less
than 2. Since T_{2}(x) is 0 for x<1, there is no interference in
the interval (-infinity,1]. But between 1 and 2, both of the "pieces"
T_{1}(x) and T_{2}(x) are nonzero. But the sum
5T_{1}(x)+6T_{2}(x) and F(x) match up at (1,5) and
(2,6) because we chose the coefficients so that they would. And both
of the "tents" are degree 1 polynomials so that their sum is also, and
certainly the graph of a degree 1 polynomial is a straight line, so
(again: lines are determined by two points!) the sum
5T_{1}(x)+6T_{2}(x) and F(x) totally agree in the
interval from 1 to 2. Etc. What do I mean? Well, I mean that
F(x)=5T_{1}(x)+6T_{2}(x)+2T_{3}(x)+4T_{4}(x)+2T_{5}(x)+4T_{6}(x).
These functions agree for all x.
Linear combinations of the tents span these piecewise linear
functions
If we piecewise linearly interpolate data given at
the integers, then the resulting function can be written as a linear
combination of the T_{j}'s. Such linear combinations can be
useful in many ways (for example, the definite integral of F(x) is the
sum of constants multiplied by the integrals of the T_{j}(x),
each of which has total area equal to 1!). The T_{j}(x)'s are
enough to span all of these piecewise linear functions.
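The "coefficients are exactly the data values" phenomenon can be sketched directly (the data values here are the ones I read off from the class example):

```python
def T(j, x):
    """The tent centered at the integer j."""
    return max(0.0, 1.0 - abs(x - j))

data = {1: 5, 2: 6, 3: 2, 4: 4, 5: 2, 6: 4}   # values at the integers

def F(x):
    """The piecewise linear interpolation: a linear combination of tents
    whose coefficients are the data values themselves."""
    return sum(value * T(j, x) for j, value in data.items())

print(F(2), F(1.5))   # 6.0 5.5: F hits the data and interpolates linearly
```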
But maybe we don't need all of the T_{j}'s. What if someone
came up to you and said, "Hey, you don't need T_{33}(x) because:"
T_{33}(x)=53T_{12}(x)-4T_{5}(x)+9T_{14}(x)
Is this possibly correct? If it were correct, then the function
T_{33}(x) would be redundant (extra, superfluous) in our
descriptions of the piecewise linear interpolations, and we wouldn't
need it in our linear combinations. But if
T_{33}(x)=53T_{12}(x)-4T_{5}(x)+9T_{14}(x)
were correct, it should be true for all x's. This means we can
pick any x we like to evaluate the functions, and the resulting
equation of numbers should be true. Hey: let's try x=33. This is not
an especially inspired choice, but it does make T_{33}'s value
equal to 1, and the value of the other "tents" in the equation equal
to 0. The equation then becomes 1=0 which is currently
false.
Therefore we can't throw out
T_{33}(x). In fact, we need every T_{j}(x) (for each
integer j) to be able to write piecewise linear interpolations.
We have no extra T_{j}(x)'s: they are all
needed.
Let me rephrase stuff using some linear algebra language. Our "vectors" will be piecewise linear interpolations of data given at integer points, like F(x). If we consider the family of "vectors" given by the T_{j}(x)'s, for each integer j, then: the family is spanning (every such interpolation is a linear combination of the T_{j}'s), and the family is linearly independent (no T_{j} is redundant).
I emphasized that one reason I wanted to consider this example first is because we use linear algebra ideas constantly, and presenting them in a more routine setting may discourage noticing this. My second major example does, however, present things in a more routine setting, at least at first.
My "vectors" will be all polynomials of degree less than or equal to 2. So one example is 5x^{2}-6x+(1/2). Another example is (Pi)x^{2}+0x+223.67, etc. What can we say about this stuff?
I claim that every polynomial of degree at most 2 can be written as a sum of
multiples of 1 and x and x^{2} and the Clark polynomial, which was something like
C(x)=3x^{2}-9x+2. It was generous of Mr. Clark to volunteer his polynomial. I
verified that, say, the polynomial 17x^{2}+44x-98 could indeed
be written as a sum of multiples of 1 and x and x^{2} and the (fabulous)
Clark polynomial,
C(x)=3x^{2}-9x+2 (well, it was something like
this!). Thus I need to find numbers filling the empty spaces in the
equation below, and the numbers should make the equation correct.
17x^{2}+44x-98=[ ]1+[ ]x+[ ]x^{2}+[ ]C(x)
Indeed, through great computational difficulties I wrote
17x^{2}+44x-98=-102·1+62x+11x^{2}+2C(x)
(I think this is correct, but again I am relying upon carbon-based
computation, not silicon-based computation!)
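Silicon-based verification of that equation, representing each polynomial by its coefficient vector (constant, x, x^{2}):

```python
import numpy as np

one = np.array([1., 0., 0.])
x = np.array([0., 1., 0.])
x2 = np.array([0., 0., 1.])
clark = np.array([2., -9., 3.])        # C(x) = 3x^2 - 9x + 2

combo = -102 * one + 62 * x + 11 * x2 + 2 * clark
print(combo)   # [-98.  44.  17.], i.e. 17x^2 + 44x - 98
```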
We discussed this and concluded that the span of 1 and x and
x^{2} and C(x) is all the polynomials of degree less
than or equal to 2. But, really, do we need all of these? That is, do
we need the four numbers (count them!) we wrote in the equation above,
or can we have fewer? Well, 1 and x and x^{2} and C(x)
are not linearly independent. They are, in fact, linearly
dependent. There is a linear combination of these four which is
zero. Look:
1C(x)+(-3)x^{2}+(9)x+(-2)1=0.
so that
C(x)=3x^{2}-9x+2·1
and we can "plug" this into the equation
17x^{2}+44x-98=-102·1+62x+11x^{2}+2C(x)
to get
17x^{2}+44x-98=-102·1+62x+11x^{2}+2(3x^{2}-9x+2·1)
But a linear combination of linear combinations is also a linear combination (they concatenate well). So {1,x,x^{2},C(x)} certainly span the quadratic polynomials, but so does {1,x,x^{2}}. We also notice that {1,x,C(x)} spans everything. But are there any more redundancies? Does {1,C(x)} span all quadratic polynomials? Mr. Mostiero suggested that I try to write 1+x^{2} as (scalar)1+(another scalar)C(x). In order to get the x^{2} term, I need "another scalar" to be not zero. But then the sum (scalar)1+(another scalar)C(x) must have a term involving x, and therefore cannot be equal to 1+x^{2}.
We could sort of diagram what's going on. There seem to be various levels. The top level, where the collections of "vectors" are largest, are spanning sets. Certainly any set bigger than a spanning set is also a spanning set! Now shrink the set to try eliminate redundancies, extra "vectors" that you really don't need for descriptions. There may be various ways to do this shrinking. Clearly (?) you can continue shrinking until you eliminate linear dependencies, and the result will be a linearly independent set. Once the set is linearly independent, then even smaller sets will be linearly independent. But if you continue shrinking, you may lose the spanning property. You may not be able to describe everything in terms of linear combinations of the designated "vectors".
Spanning (with redundancy):     {1, x, x^{2}, C(x)}          SPANNING!
                                    /          \
Spanning, no redundancy        {1, x, x^{2}}  {1, x, C(x)}   SPANNING! and
(linearly independent):             \          /             LINEARLY INDEPENDENT!
                                     \        /
Not spanning (not big enough):     {1, C(x)}                 LINEARLY INDEPENDENT!

There's sort of one level where spanning and linearly independent overlap. That's the collections of "vectors" where everything can be described by linear combinations and there are no redundancies.
Some weird polynomials to change our point of view
Look at these polynomials of degree 2:
P(x)=(x-1)(x-2)
Q(x)=(x-1)(x-3)
R(x)=(x-2)(x-3)
Why these polynomials? Who would care
about such silly polynomials?
Are these polynomials linearly independent?
This was the QotD. I remarked that I was asking students to
show that if
A P(x)+B Q(x)+C R(x)=0
for all x, then the students would need to deduce that A=0 and B=0 and
C=0.
1A+1B+1C=0
-3A-4B-5C=0
2A+3B+6C=0
and then I row reduced (by hand!) the coefficient matrix:

( 1  1  1 )   ( 1 1 1 )   ( 1 0 -1 )   ( 1 0 0 )
(-3 -4 -5 ) ~ ( 0 1 2 ) ~ ( 0 1  2 ) ~ ( 0 1 0 )
( 2  3  6 )   ( 0 1 4 )   ( 0 0  2 )   ( 0 0 1 )

This shows that the original system was row equivalent to the system A=0 and B=0 and C=0 (remember that "row equivalent"<==>"same solution set"), therefore there are no solutions to the equation A P(x)+B Q(x)+C R(x)=0 except the trivial solution. And therefore P(x) and Q(x) and R(x) are linearly independent: none of them are "redundant".
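The same check by machine: the coefficient matrix has rank 3, so the only solution of the homogeneous system is the trivial one (signs as I read them):

```python
import numpy as np

M = np.array([[1., 1., 1.],      # coefficients of x^2 in A*P + B*Q + C*R
              [-3., -4., -5.],   # coefficients of x
              [2., 3., 6.]])     # constant terms
print(np.linalg.matrix_rank(M))  # 3: only the trivial solution A=B=C=0
```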
But can I describe all of the deg<=2 polys this way?
I assert that every polynomial of degree less than or equal to
2 can be described as a linear combination of P(x) and Q(x) and
R(x). How would I verify this claim? Please note that
I am more interested in the logic than the computational details
here!
I should be able to write x^{2} as sum of P(x) and Q(x) and
R(x). This means I should be able to solve the equation
A P(x)+B Q(x)+C R(x)=x^{2}.
Just as above, this leads to an augmented matrix which looks like:
( 1  1  1 | 1 )         ( 1 0 0 | FRED    )
(-3 -4 -5 | 0 ) ~ ~ ~   ( 0 1 0 | MARISSA )
( 2  3  6 | 0 )         ( 0 0 1 | IRVING  )

I know this is true, since I already did the row operations above. Right now I am not totally interested in the values of FRED and MARISSA and IRVING but I know that the row operations just involve adding and multiplying and interchanging, so there must be such numbers. And therefore there are numbers which satisfy the equation:
( 1  1  1 | 0 )
(-3 -4 -5 | 1 )
( 2  3  6 | 0 )

and just as before there will be solutions, and so x is in the span of P(x) and Q(x) and R(x). And so is 1. Since the linear combinations of x^{2} and x and 1 are all of the polynomials of degree 2 or less, and each of x^{2} and x and 1 is a linear combination of P(x) and Q(x) and R(x), I know that the span of P(x) and Q(x) and R(x) is all polynomials of degree 2 or less. So each of P(x) and Q(x) and R(x) is needed and there are "enough" of them. Notice that all I needed to do was verify that the RREF of the matrix above was in the 1 0 0 etc. form. Then everything automatically followed!

So P(x) and Q(x) and R(x) are a basis of the polynomials of degree less than or equal to 2.
Why would we look at P(x) and Q(x) and
R(x)?
Suppose again I have data points, let's say (1,13) and (2,18) and
(3,9). I could do linear interpolation as we did above. If I want to be
a bit more sophisticated, and get maybe something smoother, I could
try to get a polynomial Ax^{2}+Bx+C which fits these data
points. Here is the polynomial:
[9/2]P(x)+[-18]Q(x)+[13/2]R(x). How was this remarkable
computation done? Well, I know certain function values
The function    Its value at x=1    at x=2    at x=3
P(x)                  0                0         2
Q(x)                  0               -1         0
R(x)                  2                0         0
so when I "plug in" x=1 and x=2 and x=3 in the linear combination A P(x)+B Q(x)+C R(x) and I want to get 13 and 18 and 9, respectively, the structure of the table makes it very easy to find A and B and C. If we want to interpolate quadratically, I would get a function defined by a simple formula using this basis. In fact, these functions are very useful in quadratic interpolation, and in the use of splines, a valuable technique for numerical approximation of solutions of ordinary and partial differential equations.
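A sketch checking the interpolation (note Q(2)=(2-1)(2-3)=-1, so matching the data value 18 forces the middle coefficient to be -18):

```python
def P(x): return (x - 1) * (x - 2)
def Q(x): return (x - 1) * (x - 3)
def R(x): return (x - 2) * (x - 3)

# P(3)=2, Q(2)=-1, R(1)=2, and all other table entries are 0, so each
# coefficient can be read off one at a time from the data 13, 18, 9:
def f(x):
    return (9 / 2) * P(x) + (-18) * Q(x) + (13 / 2) * R(x)

print(f(1), f(2), f(3))   # 13.0 18.0 9.0: the data points are matched
```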
HOMEWORK
An exam is coming. Please look at the review
material. I will be in Hill 525 at 4 PM on Sunday.

Monday, October 3 

3x_{1}+2x_{2}+x_{3}-x_{4}=A
4x_{1}-1x_{2}+5x_{3}+x_{4}=B
2x_{1}+5x_{2}-3x_{3}-3x_{4}=C
These questions will be important: does the system have solutions and, if it does, how many solutions are there?

( 3  2  1 -1 | A )
( 4 -1  5  1 | B )
( 2  5 -3 -3 | C )

The vertical bar is used to distinguish between the two sides of the equations. It is useful to change the collection of linear equations so that the system becomes easier to understand. These changes should all be reversible, so that these equivalent systems will have the same set of solutions.
Here are two examples of matrices in RREF, one 3 by 4 and one 2 by 7:

( 1 0 0 0 )     ( 1 0 0 0 0 4 4 )
( 0 0 1 5 )     ( 0 0 0 0 1 2 3 )
( 0 0 0 0 )

And here is essentially a complete list of all possible 3 by 3 RREF matrices:
( 1 0 0 )  ( 1 0 * )  ( 1 * 0 )  ( 0 1 0 )  ( 1 * * )  ( 0 1 * )  ( 0 0 1 )  ( 0 0 0 )
( 0 1 0 )  ( 0 1 * )  ( 0 0 1 )  ( 0 0 1 )  ( 0 0 0 )  ( 0 0 0 )  ( 0 0 0 )  ( 0 0 0 )
( 0 0 1 )  ( 0 0 0 )  ( 0 0 0 )  ( 0 0 0 )  ( 0 0 0 )  ( 0 0 0 )  ( 0 0 0 )  ( 0 0 0 )

The entries * may be any number, 0 or nonzero. There aren't very many and they are easy to understand. Notice that each of these is essentially different from the others: they represent coefficient matrices with very different solution sets.
My example
I then tried to use row operations on the augmented matrix of my
system of linear equations so that the coefficient matrix was in
RREF. I did this by hand. What follows is taken from a Maple
session (I did remove my errors!). Maple does have a
rref command, but it also allows row operations one at a
time. I will comment in this manner
about each command.
>with(linalg):
This loads the linear algebra package.

>M:=matrix(3,5,[3,2,1,-1,A,4,-1,5,1,B,2,5,-3,-3,C]);
         [ 3   2   1  -1  A ]
    M := [ 4  -1   5   1  B ]
         [ 2   5  -3  -3  C ]
This command just creates a matrix of specified size with the listed entries.

>mulrow(M,1,1/3);
    [ 1  2/3  1/3  -1/3  A/3 ]
    [ 4  -1   5     1    B   ]
    [ 2   5  -3    -3    C   ]
Multiply row 1 by a third. Creates an initial 1.

>addrow(%,1,2,-4);
    [ 1   2/3    1/3  -1/3  A/3       ]
    [ 0  -11/3  11/3   7/3  -4A/3 + B ]
    [ 2   5     -3    -3    C         ]
Add -4 times row 1 to row 2. This "pivots" and makes the (2,1) entry of the coefficient matrix 0.

>addrow(%,1,3,-2);
    [ 1   2/3    1/3   -1/3  A/3       ]
    [ 0  -11/3  11/3    7/3  -4A/3 + B ]
    [ 0  11/3  -11/3   -7/3  -2A/3 + C ]
Add -2 times row 1 to row 3. So the (3,1) entry becomes 0.

>mulrow(%,2,-3/11);
    [ 1  2/3    1/3   -1/3   A/3           ]
    [ 0  1     -1     -7/11  4A/11 - 3B/11 ]
    [ 0  11/3  -11/3  -7/3   -2A/3 + C     ]
Makes another leading 1.

>addrow(%,2,1,-2/3);
    [ 1  0      1      1/11  A/11 + 2B/11  ]
    [ 0  1     -1     -7/11  4A/11 - 3B/11 ]
    [ 0  11/3  -11/3  -7/3   -2A/3 + C     ]
Makes the (1,2) entry equal to 0.

>addrow(%,2,3,-11/3);
    [ 1  0   1   1/11   A/11 + 2B/11  ]
    [ 0  1  -1  -7/11   4A/11 - 3B/11 ]
    [ 0  0   0   0      -2A + B + C   ]
And now the (3,2) entry is 0. The coefficient matrix is now in RREF.

Well, this went a heck of a lot better than when I did it in class. I will try, as I said, to avoid doing very much row reduction in class. I am totally inept at it.
Now back to the questions:
What if I know 2A+B+C=0? It turns out that there are solutions: the system is consistent. Let me check this claim by choosing some values of A and B and C which will make 2A+B+C=0 true. How about A=4 and B=3 and C=5? Then 2A+B+C=2(4)+3+5=0 (I hope). The RREF system then becomes (inserting these values of A and B and C [I did this in my head so there may be ... errors]):
    [1  0  1  -1/11  10/11]
    [0  1  1  -7/11  -7/11]
    [0  0  0      0      0]
Then the first equation (unabbreviated) is x_{1}+x_{3}-(1/11)x_{4}=10/11 so that x_{1}=10/11-x_{3}+(1/11)x_{4}, and the second equation is x_{2}+x_{3}-(7/11)x_{4}=-7/11 so that x_{2}=-7/11-x_{3}+(7/11)x_{4}.
Be sure to look carefully at the signs, to check on what I've written. The equations have been written this way so that you can see that x_{3} and x_{4} are free. That is, I can give any values for these variables. Then the other variables (x_{1} and x_{2}) will have their values specified by what is already given. So: we select A and B and C satisfying the compatibility condition. Then there always will be a two-parameter family of solutions to the original system of linear equations. Notice that we get solutions exactly when the compatibility condition is satisfied: there are solutions if and only if (as math folks might say) the compatibility condition is correct.
The logic here is actually "easy". Since all the computational steps we performed are reversible, I know that the assertions I just made are correct. What is more wonderful is that the general situation will always be much like this.
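For the curious, the whole reduction can be checked with exact rational arithmetic. Here is a sketch in Python (not Maple, and not something we did in class) of a small rref routine; the matrix is the class example with A=4, B=3, C=5, with the entry signs as I reconstruct them from the session above.

```python
from fractions import Fraction

def rref(rows):
    """Row-reduce a matrix (a list of rows) to reduced row echelon form."""
    M = [[Fraction(x) for x in row] for row in rows]
    nrows, ncols = len(M), len(M[0])
    pivot = 0
    for col in range(ncols):
        # find a row at or below `pivot` with a nonzero entry in this column
        pr = next((r for r in range(pivot, nrows) if M[r][col] != 0), None)
        if pr is None:
            continue
        M[pivot], M[pr] = M[pr], M[pivot]            # swap rows
        lead = M[pivot][col]
        M[pivot] = [x / lead for x in M[pivot]]      # create the leading 1
        for r in range(nrows):                       # clear the rest of the column
            if r != pivot and M[r][col] != 0:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[pivot])]
        pivot += 1
        if pivot == nrows:
            break
    return M

# The augmented matrix from class with A=4, B=3, C=5 (so -2A+B+C=0 holds)
R = rref([[3, -2, 1, 1, 4],
          [4, 1, 5, -1, 3],
          [2, -5, -3, 3, 5]])
for row in R:
    print(row)   # two pivot rows, then a row of zeros: the system is consistent
```

Because everything is a Fraction, there is no roundoff: the output is exactly the RREF displayed above, zero row and all.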
What RREF does in general
Take your original augmented matrix, and put
the coefficient matrix into RREF. Then you get something like
( BLOCK OF 1's & 0's, WITH THE 1's  |        |  Linear stuff from )
( MOVING DOWN AND TO THE RIGHT      |  JUNK  |  the original      )
(                                   |        |  right sides       )
(-----------------------------------|--------|--------------------)
(   MAYBE HERE, SEVERAL ROWS OF 0's          |  More such         )
(                                            |  linear stuff      )
Please observe that the lower right-hand corner now plays the part of the compatibility conditions which must be satisfied. All of those linear "fragments" must be equal to 0 if the original system has solutions. Now if these are 0, we can "read off" solutions in much the same manner as the example. The block labeled JUNK in fact tells us with its width how many free parameters there are in the solutions. Notice that the JUNK block could have size 0 (for example, consider the silly system x_{1}=567,672, already in RREF!) and in that case the system would have only one solution.
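To make the "reading off" concrete, here is a little Python sketch of my own (the function name pivot_and_free_columns is just made up for this note) that finds the pivot columns and the free columns of an augmented matrix already in RREF. The width of the JUNK block is exactly the number of free columns.

```python
def pivot_and_free_columns(R):
    """Given an augmented matrix already in RREF, return (pivot_cols, free_cols)
    for the coefficient part (every column except the last, the right-hand side)."""
    ncols = len(R[0]) - 1
    pivots = []
    for row in R:
        for j in range(ncols):
            if row[j] != 0:          # first nonzero entry of a nonzero row: a leading 1
                pivots.append(j)
                break
    free = [j for j in range(ncols) if j not in pivots]
    return pivots, free

# RREF of the class example with A=4, B=3, C=5
R = [[1, 0, 1, -1 / 11, 10 / 11],
     [0, 1, 1, -7 / 11, -7 / 11],
     [0, 0, 0, 0, 0]]
piv, free = pivot_and_free_columns(R)
print(piv, free)   # columns 0 and 1 carry leading 1's; columns 2 and 3 are free
```

Two free columns means a two-parameter family of solutions, matching what we read off by hand.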
I then started to examine other systems of linear equations. I had several purposes, and one was to introduce some additional linear algebra vocabulary.
Linear combination
Is (1,2,3,4) a linear combination of (1,1,-1,0) and (1,0,1,1) and (1,0,0,1)?
Suppose v is a vector and w_{1},
w_{2}, w_{3}, ..., w_{k}
are other vectors. Then v is a linear combination of
w_{1}, w_{2}, ...,w_{k} if
there are scalars c_{1}, c_{2}, ..., c_{k} so
that v=SUM_{j=1}^{k}c_{j}w_{j}
Here v corresponds to (1,2,3,4), and k=3 (there are three
w_{j}'s). Are there scalars (here, just real numbers) so that
(1,2,3,4)=c_{1}(1,1,-1,0)+c_{2}(1,0,1,1)+c_{3}(1,0,0,1)?
Please recognize that this is the same as asking for solutions of
1c_{1}+1c_{2}+1c_{3}=1
1c_{1}+0c_{2}+0c_{3}=2
-1c_{1}+1c_{2}+0c_{3}=3
0c_{1}+1c_{2}+1c_{3}=4
This isn't quite what I did in class (I did something easier there). I will change the coefficient matrix to RREF form.
( 1  1  1 | 1 )     ( 1  0  0 | 2 )     ( 1  0  0 |  2 )     ( 1  0  0 |  2 )
( 1  0  0 | 2 )  ~  ( 1  1  1 | 1 )  ~  ( 0  1  1 | -1 )  ~  ( 0  1  1 | -1 )
(-1  1  0 | 3 )  A  (-1  1  0 | 3 )  B  ( 0  1  0 |  5 )  C  ( 0  0 -1 |  6 )
( 0  1  1 | 4 )     ( 0  1  1 | 4 )     ( 0  1  1 |  4 )     ( 0  0  0 |  5 )
The tilde, ~, is frequently used to indicate that
systems of equations are equivalent.
A The first row operation I used switched the first and second rows.
B I used the first row to zero out (?) the other entries in the first column. Then
C I used multiples of the second row to zero out the other entries
in the second column.
I decided to stop here even though I didn't complete the row
reduction. Look at the last "equation", or rather, let's look at the
equation represented by the last row. It is
0c_{1}+0c_{2}+0c_{3}=5
This system of linear equations has no solution. The system is
inconsistent. The compatibility condition is not satisfied.
Therefore the answer to the original question:
Is (1,2,3,4) a linear combination of (1,1,-1,0) and (1,0,1,1) and (1,0,0,1)?
is No.
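The inconsistency can also be detected by comparing ranks: the system has a solution exactly when the coefficient matrix and the augmented matrix have the same rank. Here is a Python sketch of that check (with the vectors' signs as I read them from the notes, including the minus in the first one); it is my own illustration, not something done in class.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix, by Gaussian elimination with exact fractions."""
    M = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        piv = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if piv is None:
            continue
        M[r], M[piv] = M[piv], M[r]                  # bring a pivot up
        for i in range(r + 1, len(M)):               # clear entries below it
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

w = [(1, 1, -1, 0), (1, 0, 1, 1), (1, 0, 0, 1)]
v = (1, 2, 3, 4)
coeff = [[w[j][i] for j in range(3)] for i in range(4)]   # 4x3: columns are the w's
aug = [coeff[i] + [v[i]] for i in range(4)]               # augment with v
print(rank(coeff), rank(aug))  # the ranks differ, so the system is inconsistent
```

The coefficient matrix has rank 3 but the augmented matrix has rank 4: that extra rank is exactly the failed compatibility condition 0=5 in the row reduction above.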
Another linear combination
Can t^{2} be written as a linear combination of t(t+1) and
t(t-1) and (t-1)(t+1)?
Here there are again three "vectors", w_{1} and
w_{2} and w_{3}, but the vectors are
polynomials. Are there numbers c_{1} and c_{2} and
c_{3} so that
t^{2}=c_{1}t(t+1)+c_{2}t(t-1)+c_{3}(t-1)(t+1)?
I can translate this into a system of linear equations:
The t^{2} coefficients: 1 = 1c_{1}+1c_{2}+1c_{3}
The t^{1} coefficients: 0 = 1c_{1}-1c_{2}+0c_{3}
The t^{0} coefficients: 0 = 0c_{1}+0c_{2}-1c_{3}
I could do row reduction, but the last equation tells me that c_{3}=0, and then the second equation tells me that c_{1}=c_{2}, and the first equation says their sum is 1, so that c_{1}=1/2 and c_{2}=1/2 finish a solution to this system of equations.
Therefore the answer to the original question:
Can t^{2} be written as a linear combination of t(t+1) and
t(t-1) and (t-1)(t+1)?
is Yes.
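Since a degree-2 polynomial is pinned down by its values at three points, checking the claimed combination at a few values of t is a legitimate verification. A quick Python check with exact fractions (my own sketch):

```python
from fractions import Fraction

# From back-substitution: -c3 = 0, c1 - c2 = 0, c1 + c2 + c3 = 1
c1, c2, c3 = Fraction(1, 2), Fraction(1, 2), Fraction(0)

# A quadratic is determined by its values at three points, so agreeing
# at four values of t is more than enough to prove the identity.
for t in [Fraction(-2), Fraction(0), Fraction(3), Fraction(7, 5)]:
    lhs = t * t
    rhs = c1 * t * (t + 1) + c2 * t * (t - 1) + c3 * (t - 1) * (t + 1)
    assert lhs == rhs
print("t^2 = (1/2)t(t+1) + (1/2)t(t-1) + 0(t-1)(t+1) checks out")
```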
Linear independence
Are t(t+1) and t(t-1) and (t-1)(t+1) linearly independent?
Suppose w_{1}, w_{2},
w_{3}, ..., w_{k} are vectors. Then
w_{1}, w_{2}, ...,w_{k}
are linearly independent if whenever
SUM_{j=1}^{k}c_{j}w_{j}=0,
then all of the scalars c_{j} MUST
be 0. In other words, the vector equation SUM_{j=1}^{k}c_{j}w_{j}=0
has only the trivial solution.
This may be the most important definition in linear algebra. It contains an "If ... then ..." statement which should be understood. In geometric terms, you could think that linearly independent vectors point off into different directions. Maybe this makes you happy, but my basic "geometric" intuition has definitely deserted me above 47 dimensions!
So I need to consider the equation
c_{1}t(t+1)+c_{2}t(t-1)+c_{3}(t-1)(t+1)=0
and then try to see what this tells me about the c_{j}'s. The most
important ingredient here is LOGIC. First let me convert this
polynomial equation into a system of linear equations.
The t^{2} coefficients: 0 = 1c_{1}+1c_{2}+1c_{3}
The t^{1} coefficients: 0 = 1c_{1}-1c_{2}+0c_{3}
The t^{0} coefficients: 0 = 0c_{1}+0c_{2}-1c_{3}
The third equation tells me that c_{3} must be 0. The second equation tells me that c_{2}=c_{1}, and then the first equation says that 2c_{1}=0 so c_{1} must be 0 also. And so c_{2} must be 0. Wow.
Or we could go to the RREF. Here is what Maple says (this is a homogeneous system so I only need the coefficient matrix):
>with(linalg):
>A:=matrix(3,3,[1,1,1,1,-1,0,0,0,-1]);
         [1   1   1]
    A := [1  -1   0]
         [0   0  -1]
>rref(A);
    [1  0  0]
    [0  1  0]
    [0  0  1]
Therefore all of the c_{j}'s must be zero.
Therefore the answer to the original question:
Are t(t+1) and t(t-1) and (t-1)(t+1) linearly independent?
is Yes.
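Another way to see this: the coefficient matrix of the homogeneous system is square, so the c_{j}'s are forced to be 0 exactly when its determinant is nonzero. A small Python check of my own, with the cofactor expansion written out by hand:

```python
# Coefficient matrix of the homogeneous system from matching coefficients of
# c1*t(t+1) + c2*t(t-1) + c3*(t-1)(t+1) = 0
A = [[1, 1, 1],
     [1, -1, 0],
     [0, 0, -1]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

print(det3(A))  # nonzero, so the only solution is c1 = c2 = c3 = 0
```

A nonzero determinant is also why Maple's rref produced the identity matrix above.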
Just one more ...
Are the functions cos(t), sin(t), and cos(t-78) linearly independent?
So I'm asking if there are scalars so that
c_{1}cos(t)+c_{2}sin(t)+c_{3}cos(t-78)=0, and the
implication of the question is that this should be true for all t.
There were many complaints. Engineering
students were complaining about the absurdity of 78. There are
many sillier numbers. Also there was little understanding of the
question. Huh ... well, an inspiration arose ... cos(t-78): we can
regress to a more primitive form of life, and recall a trig formula
(which is on the formula sheet
for the first exam!):
cos(A+B)=cos(A)cos(B)-sin(A)sin(B)
If A=t and B=-78, then cos(t-78)=cos(t)cos(-78)-sin(t)sin(-78). So
look at
c_{1}cos(t)+c_{2}sin(t)+c_{3}cos(t-78)=0
and realize (with cos(-78)=cos(78) and sin(-78)=-sin(78)) that
-cos(78)cos(t)-sin(78)sin(t)+cos(t-78)=0
Therefore the answer to the original question:
Are the functions
cos(t), sin(t), and cos(t78) linearly independent?
is
NO. You could take c_{1}=-cos(78) and
c_{2}=-sin(78) and c_{3}=1. I hope I got the signs right.
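If you don't trust the signs either, a numerical spot-check settles it. Here is a Python sketch (treating 78 as an angle in radians; the dependence holds in any unit, as long as you are consistent, and the signs are as I reconstruct them):

```python
import math

# Candidate dependence: c1*cos(t) + c2*sin(t) + c3*cos(t-78) = 0 for all t
c1, c2, c3 = -math.cos(78), -math.sin(78), 1.0
for t in [0.0, 1.0, -2.5, 10.0, 78.0]:
    value = c1 * math.cos(t) + c2 * math.sin(t) + c3 * math.cos(t - 78)
    assert abs(value) < 1e-12   # zero, up to floating-point roundoff
print("the combination vanishes at every t tested")
```

Agreement at a handful of points isn't a proof by itself, of course; the proof is the angle-subtraction formula. The computation just confirms we got the signs right.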
Word of the day regress
1. To go back; move backward.
2. To return to a previous, usually worse or less developed state.
QotD
The QotD represents an opportunity for people to earn 5 points on next
week's exam. Also, it allows students to convince me that they can
convert a matrix to RREF. I gave everyone a randomly chosen 3 by 5
matrix, with integer entries between -3 and 3. I asked people to tell
me the RREF of the matrix. Anyone who has not earned 5 points for the
exam can try again out of class (in my office, Hill 542, for example).
Here, for the curious, is the Maple code which created these problems:
>with(linalg):
>z:=rand(-3..3);
z := proc() (proc() option builtin; 391 end proc)(6, 7, 3) - 3 end proc
Creates a "random number" procedure, with output -3,-2,-1,0,1,2,3.
>ww:=(p, q) -> matrix(p, q, [seq(z(), j = 1 .. p*q)]);
ww := (p, q) -> matrix(p, q, [seq(z(), j = 1 .. p*q)])
Creates a p by q matrix with entries random outputs of the procedure z previously defined.
>zz:=proc(p, q, A, B) C := ww(p, q); B := rref(C); A := C; return end;
Warning, `C` is implicitly declared local to procedure `zz`
zz := proc(p, q, A, B) local C; C := ww(p, q); B := linalg:-rref(C); A := C; return end proc
Creates a matrix and gets its RREF.
>zz(3,5,frog,toad); eval(frog); eval(toad);
    [1  2  1  1  2]
    [2  0  1  3  3]
    [0  2  1  2  3]

    [1  0  0    1   5]
    [0  1  0  3/2   5]
    [0  0  1    1  13]
One example of the final procedure.
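The same problem generator is easy to mimic outside of Maple. Here is a Python sketch (my own, not the class code; the seed and the helper name random_matrix are just for this note):

```python
import random

random.seed(2005)  # only so the sketch is reproducible

def random_matrix(p, q, lo=-3, hi=3):
    """A p-by-q matrix with integer entries drawn uniformly from lo..hi,
    mimicking the Maple rand(-3..3) generator used for the QotD."""
    return [[random.randint(lo, hi) for _ in range(q)] for _ in range(p)]

M = random_matrix(3, 5)
for row in M:
    print(row)
```

Pair it with any rref routine (like the Fraction-based one sketched earlier in these notes) and you can manufacture practice problems, and their answers, all day.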
HOMEWORK
Prepare for the exam. Review suggestions are
available.

Thursday, September 29 

A system of linear equations is a collection of linear equations. So
41x_{1}-.003x_{2}+Pi x_{3}=98
x_{1}+0x_{2}-441,332x_{3}=0
is a system of linear equations.
Frequently we will abbreviate a system of linear equations. The actual variable names (x_{3} and x_{44}, etc.) may not be important. The coefficients of the variables (as an ordered set!) and the constants are important. The system
41x_{1}-.003x_{2}+Pi x_{3}=98
x_{1}+0x_{2}-441,332x_{3}=0
has coefficient matrix equal to
( 41  -.003        Pi )
(  1      0  -441,332 )
and its augmented matrix is
( 41  -.003        Pi | 98 )
(  1      0  -441,332 |  0 )
The use of the vertical bar (|) to separate the coefficients and the constants is conventional in your textbook and many other places, but not everywhere.
Matrix algebra
A matrix is a rectangular array. If the number of rows is n and the number of columns is m,
the matrix is said to be nxm (read as "n by m"). If A=(a_{ij}) is a matrix with
entries a_{ij}, then i refers to the row number and j refers to the column number.
Matrices of the same size can be added.
( 2  3  9   1 )   (  0   9  -4  Pi )   (  2  12  5  Pi+1 )
( 3  4  0   7 ) + ( 10  -1   7   0 ) = ( 13   3  7     7 )
( 5  0  4  12 )   (  3   3   3   3 )   (  8   3  7    15 )
so if A and B are both n-by-m matrices, the ij^{th} entry of A+B is a_{ij}+b_{ij}.
Matrices can be multiplied by scalars, so that
  ( 9  4 )   ( 63  28 )
7 ( Q  P ) = ( 7Q  7P )
  ( 3  1 )   ( 21   7 )
The ij^{th} entry in cA is ca_{ij}.
Matrix multiplication is more mysterious, and why it occurs the way it does may not be completely obvious. Suppose somehow that quantities x_{1} and x_{2} are controlled by y_{1} and y_{2} by the equations
3x_{1}-2x_{2}=y_{1}
5x_{1}+7x_{2}=y_{2}
Maybe this is some kind of reaction or linkage or something. Also now suppose that y_{1} and y_{2} are in turn connected to quantities z_{1} and z_{2}, and this connection is also expressed by a system of equations:
4y_{1}+15y_{2}=z_{1}
8y_{1}-7y_{2}=z_{2}
How can we express the relationship between the pair z_{1} and z_{2} and the pair x_{1} and x_{2} more directly? Well, we can plug the expressions for the y's into the equations for the z's:
4(3x_{1}-2x_{2})+15(5x_{1}+7x_{2})=z_{1}
8(3x_{1}-2x_{2})-7(5x_{1}+7x_{2})=z_{2}
This sort of concatenation of control does occur quite often in many mathematical models. If we multiply through, we see that the equations have become:
(4·3+15·5)x_{1}+(4·(-2)+15·7)x_{2}=z_{1}
(8·3-7·5)x_{1}+(8·(-2)-7·7)x_{2}=z_{2}
The coefficients combine in sort of a strange way.
    A     ·     B     =  this, which is equal to     this
( 4  15 )   ( 3  -2 )   ( 4·3+15·5   4·(-2)+15·7 )   (  87   97 )
( 8  -7 )   ( 5   7 )   ( 8·3-7·5    8·(-2)-7·7  )   ( -11  -65 )
Matrix multiplication is defined only when the "inner dimensions" coincide. So A·B is defined when A is an n-by-m matrix and B is an m-by-p matrix. The result, C, is an n-by-p matrix, and c_{ik}=SUM_{t=1}^{m}a_{it}b_{tk}: it is the dot product of the i^{th} row of A with the k^{th} column of B. This "operation" is important and occurs frequently enough that many chip designers have it "hardwired".
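The formula c_{ik}=SUM_{t}a_{it}b_{tk} is short to code. A Python sketch (the helper name matmul is made up for this note), checked against the 2 by 2 "control" example above:

```python
def matmul(A, B):
    """C = A·B with c_ik = sum over t of a_it * b_tk; A is n-by-m, B is m-by-p."""
    n, m, p = len(A), len(B), len(B[0])
    assert all(len(row) == m for row in A), "inner dimensions must coincide"
    return [[sum(A[i][t] * B[t][k] for t in range(m)) for k in range(p)]
            for i in range(n)]

# The coefficients of the z-from-y and y-from-x systems combine exactly this way
A = [[4, 15], [8, -7]]
B = [[3, -2], [5, 7]]
print(matmul(A, B))  # [[87, 97], [-11, -65]]
```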
A solution to a system of linear equations is an n-tuple of scalars which, when substituted into each equation of the system, makes the equation true. The phrase n-tuple is used since I want the scalars ordered as they are substituted. So consider the system
5x_{1}-x_{2}+2x_{3}=23
7x_{1}+3x_{2}-x_{3}=10
The 3-tuple (triple) of numbers (2,1,7) is a solution to this system. The triple (0,0,2) is not a solution, and neither is the triple (1,7,2).
Here are some linear algebra questions:
Systems of linear equations are frequently analyzed by changing them to equivalent systems. Two systems are equivalent if they have the same set of solutions. So, for example, the systems
5x_{1}-x_{2}+9x_{3}=17
2x_{1}+33x_{2}-x_{3}=3
and
10x_{1}-2x_{2}+18x_{3}=34
6x_{1}+99x_{2}-3x_{3}=9
are equivalent (the second system is just the first with one equation multiplied by 2 and the other by 3).
I want to describe an algorithm which allows us to "read off" whether
solutions of systems exist, and, when they exist, to describe them
efficiently. The algorithm uses
Elementary row operations
These are:
1. Swap two rows.
2. Multiply a row by a nonzero number.
3. Add a multiple of one row to another row.
I will show how to use these operations to change a matrix to reduced row echelon form (RREF). The algorithm is called Gaussian elimination or row reduction or ...
HOMEWORK
Please hand in these problems on Monday. I hope they will be graded
and returned to you on Thursday. Maybe this will help with your exam
preparation. So: 4.5: 11, 4.6: 11, 8.1: 37, and 8.2: 6. I strongly
suggest that you don't look in the back of the book for the
answers. Do the problems first!
Further information about the exam will be available soon. There will
be a review session on Sunday, October 9. Further information on that
will also be available soon.
Maintained by greenfie@math.rutgers.edu and last modified 9/2/2005.