Looking for precision improvements when solving the linear system Ax=0

Hi, I'm solving a system of equations A*x = 0 using
x=null(A);
The problem is that, due to a very wide range of values in the A matrix, roundoff errors result in unacceptably large relative errors in the solution. How can I improve the numerical precision? The matrix is relatively small (30x30), so speed is not a big limitation. Also, all the large values are in the same columns.
Edit2: clarifying the question and objectives
To clarify my issue, my understanding is that null sort of ends up minimising ||A*x||.
It's limited to double precision, so even if you plug in the exact solution, I can only hope to have about
A*x <= length(x)*eps(max(A.*x'))
When I plot A*x, I get something closer to
A*x = length(x)*eps(max(A.*x',[],'all'))
This is probably to be expected, as minimisation of ||A*x|| "distributes the error" over all components equally?
In my case, I care about the relative errors of individual coordinates. I want them to be small, and at least to know when a value is smaller than machine precision.
Can I be more precise in calculating the small values, at the cost of the larger ones if necessary, and up to what threshold can I trust these values?
A good improvement was proposed:
s = max(abs(A),[],1); %rescaling
x = null(A./s)./(s');
This is better but not quite sufficient, and I still miss the key aspect of knowing which values are no longer reasonably accurate. I initially used x>eps(1) to check for that. This was wrong, but my following attempts were not any better.
----- Edit: Correcting the question following Matt's advice. -----
I benchmark the relative error over each element of the solution with:
relative_error = abs((A*x)./(vecnorm(A)'.*x));
I have a kinda dirty solution using a combination of rescaling, null(), and fsolve(). You can see below the results for four different approaches.
x1 = null(A);
x1 = x1/sum(x1(:,1));
s = max(abs(A),[],1); %rescaling
x2 = null(A./s)./(s');
x2 = x2/sum(x2(:,1));
x3 = fsolve(@(y) A*y,x2,optimoptions('fsolve','Display','off'));
x3=x3/sum(x3);
precision = length(A)*eps(max(abs(A.*(x2')),[],2));
x4 = fsolve(@(y) A*y./max(abs(x2),precision),x2,optimoptions('fsolve','Display','off'));
x4 = x4/sum(x4);
figure
hold on
plot(x4,'k--')
relative_error = abs((A*x1)./(vecnorm(A)'.*x1));
plot(relative_error)
relative_error = abs((A*x2)./(vecnorm(A)'.*x2));
plot(relative_error)
relative_error = abs((A*x3)./(vecnorm(A)'.*x3));
plot(relative_error)
relative_error = abs((A*x4)./(vecnorm(A)'.*x4));
plot(relative_error)
set(gca,'Yscale','log')
legend({'x_4','err 1','err 2','err 3','err 4'})
The method is probably a bit slow and dirty but the relative error reached is close to eps.

6 Comments

Please attach A in a .mat file.
It depends on a parameter; let's just say you can build one with the following script:
n = 30;
A = normrnd(-2,10,n,n);    % exponents: mean -2, std 10
A = 10.^A;                 % entries span many orders of magnitude
for i = 1:n
    A(i,i) = 0;
    A(i,i) = -sum(A(:,i)); % each column now sums to zero
end
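A quick sanity check (a sketch, not from the thread): because each diagonal entry is set to minus the sum of the rest of its column, every column of A sums to zero, so ones(1,n) is a left null vector and A is singular, which guarantees that a (right) null vector exists.

```matlab
% Sketch: verify the generated A is singular by construction.
% Every column sums to zero, so ones(1,n)*A = 0 up to roundoff.
n = 30;
A = normrnd(-2,10,n,n);
A = 10.^A;
for i = 1:n
    A(i,i) = 0;
    A(i,i) = -sum(A(:,i));
end
max(abs(ones(1,n)*A))   % tiny relative to max(abs(A(:))): A is numerically singular
```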
Combining everything so far, the accurate way to solve this is:
load('A.mat');
n=size(A,1);
s=-A(1:n+1:n*n);
B=vpa(A./s,128);
B(1:n+1:n*n)=zeros(1,n);
B(1:n+1:n*n)=-sum(B);
[V,D]=eig(B);
x = double(V(:,1));
x = x./s';
x = x/sum(x); % my normalisation criterion
x = double(x);
norm(A*x)
Thank you to everybody for the solutions and suggestions.
Hi vanni,
A few thoughts to consider.
Here is the original code:
load('A.mat');
n=size(A,1);
s=-A(1:n+1:n*n);
B=vpa(A./s,128);
B(1:n+1:n*n)=zeros(1,n);
B(1:n+1:n*n)=-sum(B);
[V,D]=eig(B);
x = double(V(:,1));
x = x./s';
x = x/sum(x); % my normalisation criterion
x = double(x);
norm(A*x)
ans = 6.8898e-20
The code assumes that the first eigenvalue returned by eig is the smallest, but I don't see anything on that doc page that states that as a fact. So it might be more robust to actually find which eigenvalue is the smallest.
If going the VPA route, it's probably better to convert A and s to VPA and then divide
B = vpa(A,128)./vpa(s,128);
B(1:n+1:n*n)=zeros(1,n);
B(1:n+1:n*n)=-sum(B);
[V,D]=eig(B);
[mineig,index] = min(abs(diag(D)));
mineig
mineig = 
5.8980671129533164155466150759502e-42
x = double(V(:,index));
x = x./s';
x = x/sum(x); % my normalisation criterion
x = double(x);
norm(A*x)
ans = 4.7959e-20
I noticed that the code converts x back to double immediately after the call to eig, and thought further improvement might be gained by keeping things in VPA until the end.
B = vpa(A,128)./vpa(s,128);
B(1:n+1:n*n)=zeros(1,n);
B(1:n+1:n*n)=-sum(B);
[V,D]=eig(B);
[mineig,index] = min(abs(diag(D)));
mineig
mineig = 
5.8980671129533164155466150759502e-42
x = (V(:,index));
x = x./vpa(s,128)';
x = x/sum(x); % my normalisation criterion
x = double(x);
norm(A*x)
ans = 1.0077e-19
Alas, that's not the case. But maybe that's because the diagonal of B isn't really the diagonal of A. If we force the diagonal of A to meet the same constraints as B, then
AA = A;
AA(1:n+1:n*n) = 0;
AA(1:n+1:n*n) = -sum(AA);
norm(AA*x)
ans = 4.6937e-20


Answers (3)

Hi vanni,
My comment of March 4 discussed relative error calculations in solving for x in A*x = 0. (Some boring details are mentioned there but not here)
You mentioned that column elements of A tend to have the same size, meaning that scaling the columns of A can help increase the precision.
In the comment, a proposed solution is based on the fact that for A*x = z, the nth element of z is found by multiplying the nth row of A by x. For each row vector Arow, Arow*x should be zero, i.e. every Arow is orthogonal to x. A common expression for orthogonality relative error is
(Arow*x)/(norm(Arow)*norm(x)) % row norms of A.
Orthogonality and scaling lead to the following code, which uses row norms only. The process works (see notes below).
% unscaled
xa = null(A);
xa = xa/norm(xa);
rela = (A*xa)./(vecnorm(A,2,2)*norm(xa)); % relative error w/ row norms
% use scaling
s = max(abs(A )); % max by column
xbs = null(A./s); % scaled null vector
xb = xbs./(s'); % scale back
xb = xb/norm(xb);
relb = (A*xb)./(vecnorm(A,2,2)*norm(xb)); % relative error w/ row norms
You posted three A matrices altogether, one on the day after you posted the question (I will call this A1) and two more, which I think are identical, in the last day or so (A2). Both matrices have all negative values on the diagonal and all positive values (or zeros) off the diagonal. So I imagine you could make a good case that the x you are solving for should have all positive entries (not counting that you could multiply x by an overall minus sign and still have a solution).
Sometimes the calculated xa (unscaled A) or xb (scaled A) have negative values. The only bad case would occur when xa or xb is negative but significantly larger in absolute value than its error estimate.
You have said that bad cases are happening for A2, but sticking to the code above I am not finding any bad cases for the scaled version xb, although there are bad cases for the unscaled version xa. (xa and xb needed to be multiplied by an overall minus sign in order to get them into the right form.)
Matrix A1 comes up with one negative value for xb but it is not a bad case.
various comments:
oo For Ax = 0, the normalization of x does not matter, so for consistency it makes sense to use normalized x, x = x/norm(x). In all these cases the first element of x is very close to 1, and the rest of the elements are small to very small.
oo Looking for max accuracy with the posted A matrices could be difficult, because for all I know they may have been calculated to 15 significant figures, but the shipping process came up with 5 significant figures on this site.
oo This problem is not set up for great numerical success since the condition numbers of A1 and A2 are 1.3067e+21 and 1.7792e+22 respectively.
oo If you mentioned that the columns of A are intended to sum to 1, I missed it.
oo The comment employed column norms as well as row norms, with the idea of showing that column norms were unwanted.
I'm not an expert in this type of thing, so take the following with caution, but perhaps try vpa to give more precision? Something like
vpaA = vpa(A,128);
x = null(vpaA)

2 Comments

I would not recommend this approach. I would suggest instead
sA = sym(A);
x = null(sA);
possibly followed by double(x) or vpa(x)
Ok, so I'm not familiar with vpa. If I do this, I have an empty nullspace. This comes from the fact that my matrix elements are not exact either:
A = vpa(A,128)
A(1:n+1:n*n)=zeros(1,n);
A(1:n+1:n*n)=-sum(A); % this step introduces roundoff
But when I solve it, the nullspace is empty due to the roundoff error. And I cannot add a tolerance to symbolic null:
>> null(A,1e-25)
Error using sym/null
Too many input arguments.
This also fails:
>> null(vpa(A,64))
Empty sym: 30-by-0


s=max(abs(A),[],1);
x = null(A./s)./(s')
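A minimal usage sketch of this scaling (the test matrix here is illustrative, not one of the matrices from the thread):

```matlab
% Sketch: column scaling before null() on a matrix whose columns
% have wildly different magnitudes (illustrative test matrix).
rng(0);                              % reproducibility
A = randn(6,5)*randn(5,6);           % rank 5 -> one-dimensional null space
A = A .* [1e9 1 1e-4 1e5 1 1];       % badly scaled columns
x1 = null(A);                        % plain null(), already unit norm
s  = max(abs(A),[],1);               % per-column scale factors
x2 = null(A./s)./(s');               % null vector of scaled system, mapped back
x2 = x2/norm(x2);
[norm(A*x1), norm(A*x2)]             % residuals; the scaled version is often smaller
```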

18 Comments

So I tried normalizing as well. My original system is the scaled one, (A./s)*y = 0, then transforming back to x = y./(s').
I did not, however, make the mistake of taking s=max(abs(A),[],2);
Wait, no: I used
s=max(abs(A),[],1);
But I'm supposed to use
s=max(abs(A),[],2);
x = null(A./s')./(s)
which indeed already improves things quite a bit.
In your OP, you said that the columns of the matrix are different orders of magnitude. So, I don't understand why max(abs(A),[],2) makes any sense.
You're right,
s=max(abs(A),[],1);
x = null(A./s)./(s')
is correct.
The relative error, however, has not improved. The precision I expect from this implementation is:
precision = length(A)*eps(max(abs(A.*(x')),[],2));
However, my relative error is now on the order of 1e+12.
I don't really understand your relative error metric,
relative_error = abs((A*x)./max(x,100*precision));
which is essentially abs((A*x))./x. Why should there be any comparability between A*x and x element by element?
some components of x are about 6 orders of magnitude greater than the real solution
This seems to suggest that you know the ground truth null vector xtrue. If so, one would normally measure the relative error as,
relative_error = norm(x-xtrue,p)./norm(xtrue,p);
From your graphs, and because you said A is 30x30, you also seem to be assuming that there is a unique null-vector to compare with, i.e., that the nullity of A is 1. Perhaps you are extracting the singular vector with the smallest singular value?
There is no a priori reason for us to believe that the nullity is 1. We haven't been shown the 2nd smallest singular value to see how it compares to the smallest and largest.
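One way to check this numerically (a sketch; `A` is the posted matrix) is to look at the gap between the two smallest singular values:

```matlab
% Sketch: estimate the numerical nullity from the singular value spectrum.
sv = svd(A);              % singular values in descending order
sv(end)/sv(1)             % smallest relative to largest
sv(end-1)/sv(1)           % second smallest relative to largest
% If sv(end) is far below sv(end-1), a nullity of 1 is plausible, and the
% right singular vector for sv(end) approximates the null vector:
[~,~,V] = svd(A);
x = V(:,end);
```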
That's right; the justification for the existence and uniqueness of a single zero singular value comes from the physical system, which I didn't provide.
I don't have access to xtrue, obviously. But you are right, my relative error metric is kinda wrong as well. In fact, it's so wrong I'll have to edit the original post.
Hi vanni,
I don't believe at all that the relative error calculations are correct, assuming they are of the form as in the original answer,
relative_error = abs((A*x)./(vecnorm(A)'.*x)); % where x is x1,x2, etc. (a)
(I looked at what has been done on this thread and I don't see any code that is doing it differently).
[1] The function vecnorm(A) is looking at the norm of each column. (then the transpose is done to line up with element-by-element multiplication by x, but it's still the norm of each column). However, the columns of A have nothing to do with obtaining A*x. To find A*x = z, the nth element of z is found by multiplying the nth row of A by x. For each row vector Arow you want Arow*x to be zero, and the relative error is usually taken to be something like
(Arow*x)/(norm(Arow)*norm(x))
[2] The use of individual elements of x in the denominator is not correct. What if some element of x turned out to be close to zero basically by chance? The numerator in (a) is still not small, so you would get a large, incorrect relative error. A good expression is
relative_error = abs((A*x)./(vecnorm(A,2,2)*norm(x))); (2)
where the second '2' is for dimension 2.
You have said that in your matrix M, values of similar size tend to go by column, but to see what is going on it doesn't help that M is more or less symmetric. The code below uses a matrix that concentrates diverse values into columns and is very much not symmetric. The idea is to show the difference between (2) and an incorrect use of column norms,
relative_error = abs((A*x)./(vecnorm(A,2,1)'*norm(x))); (1)
so
A = randn(3,4).*[1e9 1 1e-4 1e5];
A = [A(2,:);A]; % duplicate a row so that there is one null vector
A = rand(4,4)*A; % linear combination of rows, preserve column sizes
xa = null(A)
rela1 = (A*xa)./(vecnorm(A,2,1)'*norm(xa)) % using column norms
rela2 = (A*xa)./(vecnorm(A,2,2)*norm(xa)) % using row norms
% use scaling
s = max(abs(A)); % max by column
xbs = null(A./s) % null vector of scaled A
xb = xbs./(s'); % scale back
relb1 = (A*xb)./(vecnorm(A,2,1)'*norm(xb)) % using column norms
relb2 = (A*xb)./(vecnorm(A,2,2)*norm(xb)) % using row norms
results in
rela1 =
5.39126623218795e-26 % col norms
7.92019792627436e-17
5.22933649831754e-12
2.24659840451935e-21
rela2 =
8.48983300421776e-26 % row norms
1.43624902854945e-25
3.0692906086727e-25
1.63971271254854e-25
relb1 = % col norms, scaled
2.2486619072629e-30
5.85905556234354e-21
2.17556203924493e-16
1.48401834434605e-25
relb2 = % row norms, scaled
3.54105385514604e-30
1.06248138467817e-29
1.2769176620758e-29
1.08313249933072e-29
As you can see, using row norms gives consistent results for relative error, and using column norms gives results that are all over the map.
Note that scaling provides about four orders of magnitude improvement in accuracy.
Without scaling, xa has to work pretty hard and has widely ranging values. The scaled version has a much easier time of it:
xa =
4.73207765930223e-14
-5.92226778105561e-05
-0.999999998246337
3.26583652739487e-10
xbs = % scaled
0.741821754956099
-0.494545909549514
-0.348326612516448
0.289470893587789
I'm not confident in my metric either, but I don't quite understand yours either.
To clarify my issue, my understanding is that null sort of ends up minimising ||A*x||.
It's limited to double precision, so even if you plug in the exact solution, I can only hope to have about
A*x <= length(x)*eps(max(A.*x'))
When I plot A*x, I get something closer to
A*x = length(x)*eps(max(A.*x',[],'all'))
In my case, I care about the relative errors of individual coordinates. I want them to be small, and at least to know when a value is smaller than machine precision.
Can I be more precise in calculating the small values, at the cost of the larger ones if necessary, and up to what threshold can I trust these values?
I initially used x>eps(1) to check for that. This was wrong, but my following attempts were not any better.
The scaling suggestion is a good starting point, but I need to push it a bit further, and most importantly I need to know when my accuracy limit is reached.
@David Goodmanson My metric is probably indeed wrong, but I don't think yours is correct either. I ran it with my matrix attached above. It gives me negative values greater than relb2. I know from the physics that all values are positive, and using fsolve indeed gives me a positive result.
Hi vanni,
the matrix A that you posted has all negative values on the diagonal, and all positive values (or zeros) off the diagonal. So I imagine you could make a good case that the x you are solving for should have all positive entries (not counting that you could multiply x by an overall minus sign and still have a solution). For Ax = 0, the normalization of x does not matter, so one might as well use normalized x, x = x/norm(x).
I ran A through the code that I used. Both normalized xa (the unscaled solution) and normalized xb (the scaled solution), have x(1) very close to 1, another element = 0.0015574 and all the other elements on the order of 10^-4 or less, most of them a lot less.
xa has all positive elements. xb has all positive elements except a single negative element, x(30). But that element is -1.7582e-13 which is quite small and consistent with the relative error term relb2(30) = 3.7067e-13. So I don't believe the method is flawed, especially since it uses a common expression for orthogonality relative error,
Arow*x/(norm(Arow)*norm(x))
Of course you can go with any method that to you makes more sense.
Hi David,
Thank you for your time and explanation. I do agree with your comment on normalisation.
In the following case, I do have negative values significantly larger than relb2, hence my confusion about when the values xb are trustworthy.
The nullspace of your matrix is 0-dimensional (only contains the 0-vector).
load("A.mat")
eig(vpa(A))
ans = (list of eigenvalues omitted)
That may be true numerically, but you can see that the first eigenvalue is very close to zero.
I'm trying to avoid these errors, as I have
Should I do this?
n=size(A,1);
A=vpa(A)
A(1:n+1:n*n)=zeros(1,n);
A(1:n+1:n*n)=-sum(vpa(A));
eig(A)
I get a smallest eigenvalue that is essentially zero at this precision. I mean, I can't expect anything closer to 0.
How do I add a tolerance to the vpa solution?
n=size(A,1);
A=vpa(A)
A(1:n+1:n*n)=zeros(1,n);
A(1:n+1:n*n)=-sum(vpa(A));
x=null(A,tol)
returns an error
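Since sym/null takes no tolerance argument, one workaround (a sketch in double precision; the tolerance value is illustrative) is to take the right singular vector of the smallest singular value and apply your own threshold:

```matlab
% Sketch: approximate null vector with an explicit (absolute) tolerance,
% using SVD instead of the unsupported null(A,tol).
tol = 1e-25;                 % illustrative threshold
[~,S,V] = svd(A);            % singular values sorted in descending order
if S(end,end) < tol
    x = V(:,end);            % approximate null vector
else
    x = zeros(size(A,2),0);  % no singular value below tol: empty nullspace
end
```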
I am not very smart; I just have to take the first eigenvector.
Hi vanni,
After taking a look at this, I feel good enough about the method to post the material as an answer.


Release: R2024b
Asked: 27 Feb 2025
Commented: 30 Mar 2025
