Looking for precision improvements when solving the linear system Ax = 0
Hi, I'm solving a system of equations A*x = 0 with
x = null(A);
The problem is that due to a very wide range of values in the A matrix, the round-off errors result in unacceptably large relative errors in the solution. How can I improve the numerical precision? The matrix is relatively small (30x30), so speed is not a big limitation. Also, all the large values are in the same columns.
Edit2: clarifying the question and objectives
To clarify my issue: my understanding is that null essentially ends up minimising norm(A*x). It is limited by double precision, and even if you plug in the exact solution, I can only hope to have about
A*x <= length(x)*eps(max(A.*x'))
When I plot A*x, I get something closer to
A*x = length(x)*eps(max(A.*x',[],'all'))
This is probably to be expected, as minimisation of norm(A*x) "distributes the error" over all components equally?
In my case, I care about the relative errors of individual coordinates. I want them to be small, and at least to know when a value is below machine precision.
Can I be more precise in calculating the small values, at the cost of the larger ones if necessary, and up to what threshold can I trust such a value?
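To make the threshold question concrete, here is a small sketch in NumPy (np.spacing is the analogue of MATLAB's eps(v); the noise_floor helper and its name are my own, modelled on the length(A)*eps(max(abs(A.*x'),[],2)) estimate used later in this question):

```python
import numpy as np

# np.spacing(v) is the gap to the next representable double above v,
# i.e. MATLAB's eps(v).
print(np.spacing(1.0) == 2.0 ** -52)   # True
print(np.spacing(1e9))                 # 2^-23 ~ 1.19e-7: absolute precision scales with magnitude

# Hypothetical per-row noise-floor estimate, in the spirit of
# length(A)*eps(max(abs(A.*x'),[],2)): residual components (A*x)_i below
# this floor are indistinguishable from an exact zero in double precision.
def noise_floor(A, x):
    n = A.shape[1]                     # number of terms summed per row of A*x
    return n * np.spacing(np.max(np.abs(A * x), axis=1))

print(noise_floor(np.eye(2), np.ones(2)))   # [2^-51, 2^-51]
```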
A good improvement was proposed:
s = max(abs(A),[],1); %rescaling
x = null(A./s)./(s');
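For experimenting with this rescaling trick outside MATLAB, here is a hedged NumPy sketch on a made-up badly column-scaled matrix (the construction and names are my own; null() is emulated via the SVD, which is also what MATLAB's null uses):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy badly column-scaled matrix with a one-dimensional null space:
# duplicate a row, then take row combinations (this preserves the null space).
B = rng.standard_normal((4, 5)) * np.array([1e9, 1.0, 1e-4, 1e5, 1.0])
A = np.vstack([B[1], B])
A = rng.random((5, 5)) @ A

def null_vector(M):
    # Right singular vector of the smallest singular value, i.e. what null()
    # returns for a numerically rank-deficient square matrix.
    return np.linalg.svd(M)[2][-1]

x1 = null_vector(A)                    # plain null vector

s = np.max(np.abs(A), axis=0)          # column maxima, cf. max(abs(A),[],1)
x2 = null_vector(A / s) / s            # solve the rescaled system, scale back
x2 /= np.linalg.norm(x2)

# Row-norm relative errors for both solutions.
row_norms = np.linalg.norm(A, axis=1)
rel1 = np.abs(A @ x1) / (row_norms * np.linalg.norm(x1))
rel2 = np.abs(A @ x2) / (row_norms * np.linalg.norm(x2))
print(rel1.max(), rel2.max())          # rescaling typically wins by several orders
```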
This is better but not quite sufficient, and I still miss the key aspect of knowing when a value is below reasonable accuracy. I initially used x > eps(1) to check for that. This was wrong, but my following attempts were not any better.
----- Edit: Correcting the question following Matt's advice. -----
I benchmark the relative error over each element of the solution with
relative_error = abs((A*x)./(vecnorm(A)'.*x));
I have a somewhat dirty solution using a combination of rescaling, null(), and fsolve(). You can see below the results for four different approaches.

x1 = null(A);
x1 = x1/sum(x1(:,1));
s = max(abs(A),[],1); %rescaling
x2 = null(A./s)./(s');
x2 = x2/sum(x2(:,1));
x3 = fsolve(@(y) A*y,x2,optimoptions('fsolve','Display','off'));
x3=x3/sum(x3);
precision = length(A)*eps(max(abs(A.*(x2')),[],2));
x4 = fsolve(@(y) A*y./max(abs(x2),precision),x2,optimoptions('fsolve','Display','off'));
x4 = x4/sum(x4);
figure
hold on
plot(x4,'k--')
relative_error = abs((A*x1)./(vecnorm(A)'.*x1));
plot(relative_error)
relative_error = abs((A*x2)./(vecnorm(A)'.*x2));
plot(relative_error)
relative_error = abs((A*x3)./(vecnorm(A)'.*x3));
plot(relative_error)
relative_error = abs((A*x4)./(vecnorm(A)'.*x4));
plot(relative_error)
set(gca,'Yscale','log')
legend({'x_4','err 1','err 2','err 3','err 4'})
The method is probably a bit slow and dirty, but the relative error reached is close to eps.
6 Comments
Catalytic
on 28 Feb 2025
Please attach A in a .mat file.
vanni meier
on 28 Feb 2025
Edited: vanni meier
on 28 Feb 2025
vanni meier
on 28 Feb 2025
vanni meier
on 6 Mar 2025
Hi vanni,
A few thoughts to consider.
Here is the original code:
load('A.mat');
n=size(A,1);
s=-A(1:n+1:n*n);
B=vpa(A./s,128);
B(1:n+1:n*n)=zeros(1,n);
B(1:n+1:n*n)=-sum(B);
[V,D]=eig(B);
x = double(V(:,1));
x = x./s';
x = x/sum(x); % my normalisation criterion
x = double(x);
norm(A*x)
The code assumes that the first eigenvalue returned by eig is the smallest, but I don't see anything on that doc page that states that as a fact. So it might be more robust to actually find which eigenvalue is the smallest.
If going the VPA route, it's probably better to convert A and s to VPA and then divide
B = vpa(A,128)./vpa(s,128);
B(1:n+1:n*n)=zeros(1,n);
B(1:n+1:n*n)=-sum(B);
[V,D]=eig(B);
[mineig,index] = min(abs(diag(D)));
mineig
x = double(V(:,index));
x = x./s';
x = x/sum(x); % my normalisation criterion
x = double(x);
norm(A*x)
I noticed that the code converts x back to double immediately after the call to eig, and thought further improvement might be gained by keeping things in VPA until the end.
B = vpa(A,128)./vpa(s,128);
B(1:n+1:n*n)=zeros(1,n);
B(1:n+1:n*n)=-sum(B);
[V,D]=eig(B);
[mineig,index] = min(abs(diag(D)));
mineig
x = (V(:,index));
x = x./vpa(s,128)';
x = x/sum(x); % my normalisation criterion
x = double(x);
norm(A*x)
Alas, that's not the case. But maybe that's because the diagonal of B isn't really the diagonal of A. If we force the diagonal of A to meet the same constraints as B, then
AA = A;
AA(1:n+1:n*n) = 0;
AA(1:n+1:n*n) = -sum(AA);
norm(AA*x)
Answers (3)
David Goodmanson
on 7 Mar 2025
Edited: David Goodmanson
on 30 Mar 2025
Hi vanni,
My comment of March 4 discussed relative error calculations in solving for x in A*x = 0. (Some boring details are mentioned there but not here)
You mentioned that column elements of A tend to have the same size, meaning that scaling the columns of A can help increase the precision.
In the comment, a proposed solution is based on the fact that for A*x = z, the nth element of z is found by multiplying the nth row of A by x. For each row vector Arow, Arow*x should be zero, i.e. every Arow is orthogonal to x. A common expression for orthogonality relative error is
(Arow*x)/(norm(Arow)*norm(x)) % row norms of A.
Orthogonality and scaling lead to the following code, which assumes row norms only. The process works (see notes below).
% unscaled
xa = null(A);
xa = xa/norm(xa);
rela = (A*xa)./(vecnorm(A,2,2)*norm(xa)); % relative error w/ row norms
% use scaling
s = max(abs(A )); % max by column
xbs = null(A./s); % scaled null vector
xb = xbs./(s'); % scale back
xb = xb/norm(xb);
relb = (A*xb)./(vecnorm(A,2,2)*norm(xb)); % relative error w/ row norms
You posted three A matrices all together, one on the day after you posted the question (I will call this A1) and two more, which I think are identical, in the last day or so (A2). Both matrices have all negative values on the diagonal and all positive values (or zeros) off the diagonal. So I imagine you could make a good case that the x you are solving for should have all positive entries (not counting that you could multiply x by an overall minus sign and still have a solution).
Sometimes the calculated xa (unscaled A) or xb (scaled A) have negative values. The only bad case would occur when xa or xb is negative but significantly larger in absolute value than its error estimate.
You have said that bad cases are happening for A2, but sticking to the code above I am not finding any bad cases for the scaled version xb, although there are bad cases for the unscaled version xa. (xa and xb needed to be multiplied by an overall minus sign in order to get them into the right form.)
Matrix A1 comes up with one negative value for xb but it is not a bad case.
various comments:
oo For Ax = 0, the normalization of x does not matter, so for consistency it makes sense to use normalized x, x = x/norm(x). In all these cases the first element of x is very close to 1, and the rest of the elements are small to very small.
oo Looking for max accuracy with the posted A matrices could be difficult, because for all I know they may have been calculated to 15 significant figures, but the shipping process came up with 5 significant figures on this site.
oo This problem is not set up for great numerical success since the condition numbers of A1 and A2 are 1.3067e+21 and 1.7792e+22 respectively.
oo If you mentioned that the columns of A are intended to sum to 1, I missed it.
oo The comment employed column norms as well as row norms, with the idea of showing that column norms were unwanted.
Mike Croucher
on 27 Feb 2025
I'm not an expert in this type of thing so take the following with caution but perhaps try vpa to give more precision? Something like
vpaA = vpa(A,128);
x = null(vpaA)
2 Comments
Walter Roberson
on 27 Feb 2025
I would not recommend this approach. I would suggest instead
sA = sym(A);
x = null(sA);
possibly followed by double(x) or vpa(x)
vanni meier
on 6 Mar 2025
Edited: vanni meier
on 6 Mar 2025
s=max(abs(A),[],1);
x = null(A./s)./(s')
18 Comments
vanni meier
on 27 Feb 2025
Edited: vanni meier
on 27 Feb 2025
vanni meier
on 27 Feb 2025
Matt J
on 28 Feb 2025
In your OP, you said that the columns of the matrix are different orders of magnitude. So, I don't understand why max(abs(A),[],2) makes any sense.
vanni meier
on 28 Feb 2025
vanni meier
on 28 Feb 2025
I don't really understand your relative error metric,
relative_error = abs((A*x)./max(x,100*precision));
which is essentially abs((A*x))./x. Why should there be any comparability between A*x and x element by element?
some components of x are about 6 orders of magnitude greater than the real solution
This seems to suggest that you know the ground truth null vector xtrue. If so, one would normally measure the relative error as,
relative_error = norm(x-xtrue,p)./norm(xtrue,p);
Matt J
on 28 Feb 2025
From your graphs, and because you said A is 30x30, you also seem to be assuming that there is a unique null-vector to compare with, i.e., that the nullity of A is 1. Perhaps you are extracting the singular vector with the smallest singular value?
There is no a priori reason for us to believe that the nullity is 1. We haven't been shown the 2nd smallest singular value, nor how it compares to the smallest and largest.
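That check is cheap to run. Assuming the SVD route (which is what null and rank are built on), one can look at the gap between the two smallest singular values; here is a NumPy toy example of my own:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 6x6 matrix whose nullity is exactly 1: five independent Gaussian
# rows plus one exact repeat of the first row.
B = rng.standard_normal((5, 6))
A = np.vstack([B, B[0]])

s = np.linalg.svd(A, compute_uv=False)  # singular values, descending
print(s[-1] / s[0])   # ~ machine eps: one numerical null direction
print(s[-2] / s[0])   # O(1): the nullity really is 1, not 2 or more
```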
vanni meier
on 28 Feb 2025
David Goodmanson
on 4 Mar 2025
Edited: David Goodmanson
on 4 Mar 2025
Hi vanni,
I don't believe at all that the relative error calculations are correct, assuming they are of the form as in the original answer,
relative_error = abs((A*x)./(vecnorm(A)'.*x)); % where x is x1,x2, etc. (a)
(I looked at what has been done on this thread and I don't see any code that is doing it differently).
[1] The function vecnorm(A) is looking at the norm of each column. (then the transpose is done to line up with element-by-element multiplication by x, but it's still the norm of each column). However, the columns of A have nothing to do with obtaining A*x. To find A*x = z, the nth element of z is found by multiplying the nth row of A by x. For each row vector Arow you want Arow*x to be zero, and the relative error is usually taken to be something like
(Arow*x)/(norm(Arow)*norm(x))
[2] The use of individual elements of x in the denominator is not correct. What if some element of x turned out to be close to zero basically by chance? The numerator in (a) is still not small, so you would get a large, incorrect relative error. A good expression is
relative_error = abs((A*x)./(vecnorm(A,2,2)*norm(x))); (2)
where the second '2' is for dimension 2.
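The hazard described in [2] is easy to see numerically. A NumPy sketch with hand-picked toy numbers of my own (the residual here is deliberately not small; the point is only how the two denominators behave when an element of x is near zero):

```python
import numpy as np

A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.1]])
x = np.array([1.0, 1e-14, 1.0])   # one element near zero "by chance"
r = A @ x                          # all components O(1) here

bad = np.abs(r) / np.abs(x)        # per-element denominator, as in (a)
good = np.abs(r) / (np.linalg.norm(A, axis=1) * np.linalg.norm(x))  # row norms, as in (2)
print(bad[1] / good[1])            # the per-element metric explodes on x(2)
```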
You have said that in your matrix M, values of similar size tend to go by column, but to see what is going on it doesn't help that M is more or less symmetric. The code below uses a matrix that concentrates diverse values into columns and is very much not symmetric. The idea is to show the difference between (2) and an incorrect use of column norms,
relative_error = abs((A*x)./(vecnorm(A,2,1)'*norm(x))); (1)
so
A = randn(3,4).*[1e9 1 1e-4 1e5];
A = [A(2,:);A]; % duplicate a row so that there is one null vector
A = rand(4,4)*A; % linear combination of rows, preserve column sizes
xa = null(A)
rela1 = (A*xa)./(vecnorm(A,2,1)'*norm(xa)) % using column norms
rela2 = (A*xa)./(vecnorm(A,2,2)*norm(xa)) % using row norms
% use scaling
s = max(abs(A)); % max by column
xbs = null(A./s) % null vector of scaled A
xb = xbs./(s'); % scale back
relb1 = (A*xb)./(vecnorm(A,2,1)'*norm(xb)) % using column norms
relb2 = (A*xb)./(vecnorm(A,2,2)*norm(xb)) % using row norms
results in
rela1 =
5.39126623218795e-26 % col norms
7.92019792627436e-17
5.22933649831754e-12
2.24659840451935e-21
rela2 =
8.48983300421776e-26 % row norms
1.43624902854945e-25
3.0692906086727e-25
1.63971271254854e-25
relb1 = % col norms, scaled
2.2486619072629e-30
5.85905556234354e-21
2.17556203924493e-16
1.48401834434605e-25
relb2 = % row norms, scaled
3.54105385514604e-30
1.06248138467817e-29
1.2769176620758e-29
1.08313249933072e-29
As you can see, using row norms gives consistent results for relative error, and using column norms gives results that are all over the map.
Note that scaling provides about four orders of magnitude improvement in accuracy.
Without scaling, xa has to work pretty hard and has widely ranging values. The scaled version has a much easier time of it:
xa =
4.73207765930223e-14
-5.92226778105561e-05
-0.999999998246337
3.26583652739487e-10
xbs = % scaled
0.741821754956099
-0.494545909549514
-0.348326612516448
0.289470893587789
vanni meier
on 5 Mar 2025
Edited: vanni meier
on 5 Mar 2025
vanni meier
on 5 Mar 2025
vanni meier
on 5 Mar 2025
David Goodmanson
on 6 Mar 2025
Hi vanni,
the matrix A that you posted has all negative values on the diagonal, and all positive values (or zeros) off the diagonal. So I imagine you could make a good case that the x you are solving for should have all positive entries (not counting that you could multiply x by an overall minus sign and still have a solution). For Ax = 0, the normalization of x does not matter, so one might as well use normalized x, x = x/norm(x).
I ran A through the code that I used. Both normalized xa (the unscaled solution) and normalized xb (the scaled solution), have x(1) very close to 1, another element = 0.0015574 and all the other elements on the order of 10^-4 or less, most of them a lot less.
xa has all positive elements. xb has all positive elements except a single negative element, x(30). But that element is -1.7582e-13 which is quite small and consistent with the relative error term relb2(30) = 3.7067e-13. So I don't believe the method is flawed, especially since it uses a common expression for orthogonality relative error,
Arow*x/(norm(Arow)*norm(x))
Of course, you can go with any method that makes more sense to you.
vanni meier
on 6 Mar 2025
The nullspace of your matrix is 0-dimensional (only contains the 0-vector).
load("A.mat")
eig(vpa(A))
vanni meier
on 6 Mar 2025
vanni meier
on 6 Mar 2025
David Goodmanson
on 7 Mar 2025
Edited: David Goodmanson
on 8 Mar 2025
Hi vanni,
After taking a look at this, I feel good enough about the method to post the material as an answer.