Find unique values in a pseudo inverse based on SVD

8 views (last 30 days)
Meng Xi Zhu
Meng Xi Zhu on 10 Dec 2021
Edited: John D'Errico on 11 Dec 2021
For the equation Ax = B, I can use the pseudo inverse of A * B to get the best estimate for x
Now A is not full rank and there's linearly dependent columns, so when performing pinv(A) * B, some of the X values may not be unique,
For example, consider that
a =
1 2 0 0 0 0
2 4 0 0 0 0
0 0 1 0 0 1
0 0 0 1 0 0
0 0 0 0 1 0
b =
1
3
3
4
5
Now if we calculate x by taking pinv of a
x = pinv(a) * b
x =
0.2800
0.5600
1.5000
4.0000
5.0000
1.5000
But we know that the 1,2,3, and 6th values of x have no unique answers. How do we identify which values of x are not unique, preferably using SVD because we use SVD to perform the pseudo inverse anyways and it would save calculation steps for us.

Accepted Answer

John D'Errico
John D'Errico on 10 Dec 2021
Edited: John D'Errico on 11 Dec 2021
Um, in general, ALL of the values in the computation will be non-unique. In some simple, RARE cases, that will not be true. How can we know? By understanding the linear algebra. When you have a singular problem with a rank deficiency, what does the solution look like? For example, with:
A = [1 2 0 0 0 0
2 4 0 0 0 0
0 0 1 0 0 1
0 0 0 1 0 0
0 0 0 0 1 0];
b =[1;3; 3; 4; 5];
First, what is the numerical rank of A? A is a 5x6 array.
rank(A)
ans = 4
This tells us that one of the rows of A can be written as a linear combination of the other rows, but also that two of the columns of A are linear combinations of the others. We wish to try to solve the problem A*X==b. Of course, there will often be no exact solution, but there is a simple test.
rank([A,b])
ans = 5
And this tells us that B does not live in the column space of A. So no linear combination of the columns of A can reproduce b exactly. The simple solution to finding the best approximation here is just
x0 = pinv(A)*b
x0 = 6×1
0.2800 0.5600 1.5000 4.0000 5.0000 1.5000
But that will not be exact, as I claimed before.
[A*x0,b]
ans = 5×2
1.4000 1.0000 2.8000 3.0000 3.0000 3.0000 4.0000 4.0000 5.0000 5.0000
Close, but no cigar. Since A is rank deficent by 1, but A has more columns than rows, the general solution for all possible solutions will be of the form
pinv(A)*b + s*N1 + t*N2
where s and t are unknown parameters, and N1 and N2 are vectors that span the nullspace of A. The functino null gives us that nullspade.
format long
Anull = null(A)
Anull = 6×2
-0.894427190999916 0 0.447213595499958 0 0 -0.707106781186547 0 0 0 0 0 0.707106781186548
So N1 and N2 are the two columns of Anull. As you can see, two rows of N1 and N2 are EXACTLY zero. We can therefore write the full solution to your problem as:
syms s t
x0 + Anull*[s;t]
ans = 
ANY other (equally good) solution can be derived from that expression for some specific values of s and t. However the solution with minimum norm (what pinv yields) is the one where s=t=0. And there you should see that elements 3 and 4 of the complete solution are fully known. But all other elements will vary, essentially non-unique.
Is this the expected case, that is, is this usual? NO!!!!!! Almost always, you will expect to see all elements of your solution are not unique, because it will be rare for the nullspace vector(s) to have exactly zero rows. Here, we could identify those elements in advance, merely by looking at the nullspace vectors. Again, it is nothing you would normally expect.
For example, I'll create a randomly generated 5x6 array that has rank 4.
Ar = randn(5,4)*randn(4,6);
You can see that because A is the product of two arrays that are each at most rank 4, the product will also be rank 4 at most. Since they are randomly generated, the probability they will be of rank less than 4 has measure 0.
rank(Ar)
ans =
4
WHEW! That worked. ;-) Now lets look at the nullspace vectors for Ar.
null(Ar)
ans = 6×2
0.330678680955099 0.771575969482887 0.448751202199362 -0.337226202731890 -0.178720976978177 0.324796024549724 -0.216544906883725 0.006231106457684 -0.445094614481126 0.383184822969187 0.642130725270958 0.196436042269879
And here you see that these vectors do not have rows that are entirely 0. And therefore, the solution will have no elements that are non-unique. I need not even generate that solution, but if it makes you happy...
vpa(pinv(Ar)*b + null(Ar)*[s;t],10)
ans = 
where you should see that ALL elements of the general solution have multiples of both s and t in EVERY element. This is the result you would normally expect from a problem that was not probably cooked up as can be found in A.
Finally, could I have looked at A in advance, and understanding the linear algebra, have seen this to be the case by inspection? Well, yes. Technically, yes. But that would also rarely be the case, and since the simple solution is to just compute the nullspace vectors of A, there is no need to waste time trying to work it out by inspection.
  2 Comments
John D'Errico
John D'Errico on 11 Dec 2021
Close. If an element of the nullspace is zero, that corresponds to an element that will be fixed, so that element WILL be unique in the solution. Those elements cannot vary.
Honestly, I would probably not use the SVD, just use null. That is because null gives you all the information you need. Yes, null is itself based on the SVD, and null picks out the column of V that it should. But, can you use the SVD as you are saying? Well, yes.
A = [1 2 0 0 0 0
2 4 0 0 0 0
0 0 1 0 0 1
0 0 0 1 0 0
0 0 0 0 1 0];
Now, look at the matrices returned, comparing the last two columns of the matrix V from the SVD, then to the result produced by null.
[U,S,V] = svd(A);
V
V = 6×6
-0.4472 0 0 0 -0.8944 0 -0.8944 0 0 0 0.4472 0 0 -0.7071 0 0 0 -0.7071 0 0 1.0000 0 0 0 0 0 0 1.0000 0 0 0 -0.7071 0 0 0 0.7071
nullA = null(A)
nullA = 6×2
-0.8944 0 0.4472 0 0 -0.7071 0 0 0 0 0 0.7071
What you need to see is that null returns those last two columns. So what have you gained by calling the SVD, and then having to figure out which columns of V should be used? Nothing. The nullspace of A is given as those last two columns. So just use null.
Fially, you ask igf you can learn which elements are related to each other. Well, again, for almost ANY normal real life problem, ALL of the elements will be "connected", so related to the others. The only reason the first two elements are connected to each other and only each other, is the matrix A has the property of being block diagonal. That is, if you look at A, you will see the block A(1:2,1:2).
A(1:2,1:2)
ans = 2×2
1 2 2 4
Now look to see that A has no other non-zero elements in the first two rows. It has NO other non-zeros in the first two columns. And that is probably not a common circumstance. The normal case is like the second example I gave, where everything talks to everything else.

Sign in to comment.

More Answers (1)

Jon
Jon on 10 Dec 2021
I think that you can identify the "non-unique" components in your specific example, by looking at the last two columns of V (or equivalently the last two rows of V', where using the svd we have A = U*S*V'. The non-zero elements in the last two rows of V' are the "non-unique" components.
  5 Comments
John D'Errico
John D'Errico on 10 Dec 2021
As I said in my answer, almost always all elements can vary in the set of all "solutions". Only for certain simple problems, will certain specific elements be fixed in value. And those problems will be usually constructed to have that property. If you spend a little time, you can probably see when that will happen. (THINK ABOUT IT! Eventually you might say aha! when the insight hits you.) But it is also true that null will also make it obvious when that is the case. So just use null to tell you exactly when it will happen. At the same time, don't expect this to happen often. I'd even speculate the answer is almost never in the realm of real world problems.

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!