Resultant matrix is rotated 90 whether I use MEX or MatLab function

1 view (last 30 days)
Dear all,
I am puzzled by something odd. I have written a MEX function using C of a Matlab function that contains 2 nested loops to calculate stress on a 2D grid.
The inputs of the MEX function and Matlab function are identical and are in the form of a 1D array (of the 2D grid).
The outputs of the MEX function and the Matlab function are not identical in the sense that if I reshape into a 2D array using reshape(answer,m,n) and surf() I get that the MEX and Matlab answers are rotated by 90 degrees!
I literally copy and pasted the C function into Matlab, so I am 99% certain that there are no computational differences (apart from the minor changes like brackets etc.)
My code is structured as follows:
for (coord=0; coord<N; coord++) {
for (seg=0; seg<S; seg++) {
...accessing arrays with index coord: px,py,pz
...accessing arrays with index seg: p1x,p1y,p1z,p2x,p2y,p2z,bx,by,bz
sxx[coord] += ...maths
}
}
Similarly in Matlab,
for coord=1:N
for seg=1:S
...accessing arrays with index coord: px,py,pz
...accessing arrays with index seg: p1x,p1y,p1z,p2x,p2y,p2z,bx,by,bz
sxx(coord) = sxx(coord) + ...maths
end
end
Both functions have the same identical nomenclature and variable order. I use the same input arrays in the workspace, and yet the 2D grids after reshape are rotated!
My mexFunction for the C code is: void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) { int N,S; double *px,*py,*pz; double *p1x,*p1y,*p1z,*p2x,*p2y,*p2z; double *bx,*by,*bz; double a,MU,NU; double *sxx,*syy,*szz,*sxy,*syz,*sxz; int i;
//get the scalar inputs N,S,a,MU,NU
//create pointers to the input matrices
N=mxGetScalar(prhs[0]);
S=mxGetScalar(prhs[1]);
px=(double *) mxGetPr(prhs[2]);
py=(double *) mxGetPr(prhs[3]);
pz=(double *) mxGetPr(prhs[4]);
p1x=(double *) mxGetPr(prhs[5]);
p1y=(double *) mxGetPr(prhs[6]);
p1z=(double *) mxGetPr(prhs[7]);
p2x=(double *) mxGetPr(prhs[8]);
p2y=(double *) mxGetPr(prhs[9]);
p2z=(double *) mxGetPr(prhs[10]);
bx=(double *) mxGetPr(prhs[11]);
by=(double *) mxGetPr(prhs[12]);
bz=(double *) mxGetPr(prhs[13]);
a=mxGetScalar(prhs[14]);
MU=mxGetScalar(prhs[15]);
NU=mxGetScalar(prhs[16]);
//call the output pointer to the output matrix
plhs[0]=mxCreateDoubleMatrix(N,1,mxREAL);
plhs[1]=mxCreateDoubleMatrix(N,1,mxREAL);
plhs[2]=mxCreateDoubleMatrix(N,1,mxREAL);
plhs[3]=mxCreateDoubleMatrix(N,1,mxREAL);
plhs[4]=mxCreateDoubleMatrix(N,1,mxREAL);
plhs[5]=mxCreateDoubleMatrix(N,1,mxREAL);
//create a C pointer to a copy of the output matrix
sxx=(double *) mxGetPr(plhs[0]);
syy=(double *) mxGetPr(plhs[1]);
szz=(double *) mxGetPr(plhs[2]);
sxy=(double *) mxGetPr(plhs[3]);
syz=(double *) mxGetPr(plhs[4]);
sxz=(double *) mxGetPr(plhs[5]);
//call the C subroutine
StressDueToSeg(N,S, px, py, pz,p1x, p1y, p1z,
p2x, p2y, p2z, bx, by, bz,
a, MU, NU,
sxx, syy, szz, sxy, syz, sxz);
}
I wonder whether some kind of indexing difference in the mexFunction leads to this problem? I know I could just rotate the answer by 90 degrees, but seeing as I want to use the MEX function to speed up my computation, rotating large 2D arrays doesn't sound very efficient.
Thank you,
Kind Regards,
F

Answers (2)

James Tursa
James Tursa on 30 Jul 2013
It is the code that you don't show that is likely at the heart of the problem ... the indexing you use and how it relates to the mxArray variables you are passing in and returning to the workspace. E.g., are any of your inputs 2D? If so, how do you index into them in your mex function? Can you show us the actual indexing code?
  1 Comment
Francesco
Francesco on 30 Jul 2013
Hello,
My inputs are vectors, so are my outputs. I reshape() in Matlab once I get the outputs from the MEX or Matlab function.
Here is the MEX function. It's a bit long.
#include <math.h>
#include <mex.h>
static void StressDueToSeg(int N, int S, double *px, double *py, double *pz,
double *p1x, double *p1y, double *p1z,
double *p2x, double *p2y, double *p2z,
double *bx, double *by, double *bz,
double a, double MU, double NU,
double *sxx, double *syy, double *szz,
double *sxy, double *syz, double *sxz);
void mexFunction( int nlhs, mxArray *plhs[],
int nrhs, const mxArray *prhs[])
{
int N,S;
double *px,*py,*pz;
double *p1x,*p1y,*p1z,*p2x,*p2y,*p2z;
double *bx,*by,*bz;
double a,MU,NU;
double *sxx,*syy,*szz,*sxy,*syz,*sxz;
int i;
//get the scalar inputs N,S,a,MU,NU
//create pointers to the input matrices
N=mxGetScalar(prhs[0]);
S=mxGetScalar(prhs[1]);
px=(double *) mxGetPr(prhs[2]);
py=(double *) mxGetPr(prhs[3]);
pz=(double *) mxGetPr(prhs[4]);
p1x=(double *) mxGetPr(prhs[5]);
p1y=(double *) mxGetPr(prhs[6]);
p1z=(double *) mxGetPr(prhs[7]);
p2x=(double *) mxGetPr(prhs[8]);
p2y=(double *) mxGetPr(prhs[9]);
p2z=(double *) mxGetPr(prhs[10]);
bx=(double *) mxGetPr(prhs[11]);
by=(double *) mxGetPr(prhs[12]);
bz=(double *) mxGetPr(prhs[13]);
a=mxGetScalar(prhs[14]);
MU=mxGetScalar(prhs[15]);
NU=mxGetScalar(prhs[16]);
//call the output pointer to the output matrix
plhs[0]=mxCreateDoubleMatrix(N,1,mxREAL);
plhs[1]=mxCreateDoubleMatrix(N,1,mxREAL);
plhs[2]=mxCreateDoubleMatrix(N,1,mxREAL);
plhs[3]=mxCreateDoubleMatrix(N,1,mxREAL);
plhs[4]=mxCreateDoubleMatrix(N,1,mxREAL);
plhs[5]=mxCreateDoubleMatrix(N,1,mxREAL);
//create a C pointer to a copy of the output matrix
sxx=(double *) mxGetPr(plhs[0]);
syy=(double *) mxGetPr(plhs[1]);
szz=(double *) mxGetPr(plhs[2]);
sxy=(double *) mxGetPr(plhs[3]);
syz=(double *) mxGetPr(plhs[4]);
sxz=(double *) mxGetPr(plhs[5]);
//call the C subroutine
StressDueToSeg(N,S, px, py, pz,p1x, p1y, p1z,
p2x, p2y, p2z, bx, by, bz,
a, MU, NU,
sxx, syy, szz, sxy, syz, sxz);
}
void StressDueToSeg(int N, int S, double *px, double *py, double *pz,
double *p1x, double *p1y, double *p1z,
double *p2x, double *p2y, double *p2z,
double *bx, double *by, double *bz,
double a, double MU, double NU,
double *sxx, double *syy, double *szz,
double *sxy, double *syz, double *sxz)
{
double oneoverLp, common;
double vec1x, vec1y, vec1z;
double tpx, tpy, tpz;
double Rx, Ry, Rz, Rdt;
double ndx, ndy, ndz;
double d2, s1, s2, a2, a2_d2, a2d2inv;
double Ra, Rainv, Ra3inv, sRa3inv;
double s_03a, s_13a, s_05a, s_15a, s_25a;
double s_03b, s_13b, s_05b, s_15b, s_25b;
double s_03, s_13, s_05, s_15, s_25;
double m4p, m8p, m4pn, mn4pn, a2m8p;
double txbx, txby, txbz;
double dxbx, dxby, dxbz;
double dxbdt, dmdxx, dmdyy, dmdzz, dmdxy, dmdyz, dmdxz;
double tmtxx, tmtyy, tmtzz, tmtxy, tmtyz, tmtxz;
double tmdxx, tmdyy, tmdzz, tmdxy, tmdyz, tmdxz;
double tmtxbxx, tmtxbyy, tmtxbzz, tmtxbxy, tmtxbyz, tmtxbxz;
double dmtxbxx, dmtxbyy, dmtxbzz, dmtxbxy, dmtxbyz, dmtxbxz;
double tmdxbxx, tmdxbyy, tmdxbzz, tmdxbxy, tmdxbyz, tmdxbxz;
double I_03xx, I_03yy, I_03zz, I_03xy, I_03yz, I_03xz;
double I_13xx, I_13yy, I_13zz, I_13xy, I_13yz, I_13xz;
double I_05xx, I_05yy, I_05zz, I_05xy, I_05yz, I_05xz;
double I_15xx, I_15yy, I_15zz, I_15xy, I_15yz, I_15xz;
double I_25xx, I_25yy, I_25zz, I_25xy, I_25yz, I_25xz;
int coord, seg;
//precompute some constants
m4p = 0.25 * MU / M_PI;
m8p = 0.5 * m4p;
m4pn = m4p / (1 - NU);
mn4pn = m4pn * NU;
a2 = a * a;
a2m8p = a2 * m8p;
for (coord=0; coord<N; coord++) { //loop over the grid points
for (seg=0; seg<S; seg++) { //loop over the segments
vec1x = p2x[seg] - p1x[seg];
vec1y = p2y[seg] - p1y[seg];
vec1z = p2z[seg] - p1z[seg];
oneoverLp = 1 / sqrt(vec1x*vec1x + vec1y*vec1y + vec1z*vec1z);
tpx = vec1x * oneoverLp;
tpy = vec1y * oneoverLp;
tpz = vec1z * oneoverLp;
Rx = px[coord] - p1x[seg];
Ry = py[coord] - p1y[seg];
Rz = pz[coord] - p1z[seg];
Rdt = Rx*tpx + Ry*tpy + Rz*tpz;
ndx = Rx - Rdt*tpx;
ndy = Ry - Rdt*tpy;
ndz = Rz - Rdt*tpz;
d2 = ndx*ndx + ndy*ndy + ndz*ndz;
s1 = -Rdt;
s2 = -((px[coord]-p2x[seg])*tpx + (py[coord]-p2y[seg])*tpy + (pz[coord]-p2z[seg])*tpz);
a2_d2 = a2 + d2;
a2d2inv = 1 / a2_d2;
Ra = sqrt(a2_d2 + s1*s1);
Rainv = 1 / Ra;
Ra3inv = Rainv * Rainv * Rainv;
sRa3inv = s1 * Ra3inv;
s_03a = s1 * Rainv * a2d2inv;
s_13a = -Rainv;
s_05a = (2*s_03a + sRa3inv) * a2d2inv;
s_15a = -Ra3inv;
s_25a = s_03a - sRa3inv;
Ra = sqrt(a2_d2 + s2*s2);
Rainv = 1 / Ra;
Ra3inv = Rainv * Rainv * Rainv;
sRa3inv = s2 * Ra3inv;
s_03b = s2 * Rainv * a2d2inv;
s_13b = -Rainv;
s_05b = (2*s_03b + sRa3inv) * a2d2inv;
s_15b = -Ra3inv;
s_25b = s_03b - sRa3inv;
s_03 = s_03b - s_03a;
s_13 = s_13b - s_13a;
s_05 = s_05b - s_05a;
s_15 = s_15b - s_15a;
s_25 = s_25b - s_25a;
txbx = tpy*bz[seg] - tpz*by[seg];
txby = tpz*bx[seg] - tpx*bz[seg];
txbz = tpx*by[seg] - tpy*bx[seg];
dxbx = ndy*bz[seg] - ndz*by[seg];
dxby = ndz*bx[seg] - ndx*bz[seg];
dxbz = ndx*by[seg] - ndy*bx[seg];
dxbdt = dxbx*tpx + dxby*tpy + dxbz*tpz;
dmdxx = ndx * ndx;
dmdyy = ndy * ndy;
dmdzz = ndz * ndz;
dmdxy = ndx * ndy;
dmdyz = ndy * ndz;
dmdxz = ndx * ndz;
tmtxx = tpx * tpx;
tmtyy = tpy * tpy;
tmtzz = tpz * tpz;
tmtxy = tpx * tpy;
tmtyz = tpy * tpz;
tmtxz = tpx * tpz;
tmdxx = 2 * tpx * ndx;
tmdyy = 2 * tpy * ndy;
tmdzz = 2 * tpz * ndz;
tmdxy = tpx*ndy + tpy*ndx;
tmdyz = tpy*ndz + tpz*ndy;
tmdxz = tpx*ndz + tpz*ndx;
tmtxbxx = 2 * tpx * txbx;
tmtxbyy = 2 * tpy * txby;
tmtxbzz = 2 * tpz * txbz;
tmtxbxy = tpx*txby + tpy*txbx;
tmtxbyz = tpy*txbz + tpz*txby;
tmtxbxz = tpx*txbz + tpz*txbx;
dmtxbxx = 2 * ndx * txbx;
dmtxbyy = 2 * ndy * txby;
dmtxbzz = 2 * ndz * txbz;
dmtxbxy = ndx*txby + ndy*txbx;
dmtxbyz = ndy*txbz + ndz*txby;
dmtxbxz = ndx*txbz + ndz*txbx;
tmdxbxx = 2 * tpx * dxbx;
tmdxbyy = 2 * tpy * dxby;
tmdxbzz = 2 * tpz * dxbz;
tmdxbxy = tpx*dxby + tpy*dxbx;
tmdxbyz = tpy*dxbz + tpz*dxby;
tmdxbxz = tpx*dxbz + tpz*dxbx;
common = m4pn * dxbdt;
I_03xx = common + m4pn*dmtxbxx - m4p*tmdxbxx;
I_03yy = common + m4pn*dmtxbyy - m4p*tmdxbyy;
I_03zz = common + m4pn*dmtxbzz - m4p*tmdxbzz;
I_03xy = m4pn*dmtxbxy - m4p*tmdxbxy;
I_03yz = m4pn*dmtxbyz - m4p*tmdxbyz;
I_03xz = m4pn*dmtxbxz - m4p*tmdxbxz;
I_13xx = -mn4pn * tmtxbxx;
I_13yy = -mn4pn * tmtxbyy;
I_13zz = -mn4pn * tmtxbzz;
I_13xy = -mn4pn * tmtxbxy;
I_13yz = -mn4pn * tmtxbyz;
I_13xz = -mn4pn * tmtxbxz;
I_05xx = common*(a2+dmdxx) - a2m8p*tmdxbxx;
I_05yy = common*(a2+dmdyy) - a2m8p*tmdxbyy;
I_05zz = common*(a2+dmdzz) - a2m8p*tmdxbzz;
I_05xy = common*dmdxy - a2m8p*tmdxbxy;
I_05yz = common*dmdyz - a2m8p*tmdxbyz;
I_05xz = common*dmdxz - a2m8p*tmdxbxz;
I_15xx = a2m8p*tmtxbxx - common*tmdxx;
I_15yy = a2m8p*tmtxbyy - common*tmdyy;
I_15zz = a2m8p*tmtxbzz - common*tmdzz;
I_15xy = a2m8p*tmtxbxy - common*tmdxy;
I_15yz = a2m8p*tmtxbyz - common*tmdyz;
I_15xz = a2m8p*tmtxbxz - common*tmdxz;
I_25xx = common * tmtxx;
I_25yy = common * tmtyy;
I_25zz = common * tmtzz;
I_25xy = common * tmtxy;
I_25yz = common * tmtyz;
I_25xz = common * tmtxz;
sxx[coord] += I_03xx*s_03 + I_13xx*s_13 + I_05xx*s_05 +
I_15xx*s_15 + I_25xx*s_25;
syy[coord] += I_03yy*s_03 + I_13yy*s_13 + I_05yy*s_05 +
I_15yy*s_15 + I_25yy*s_25;
szz[coord] += I_03zz*s_03 + I_13zz*s_13 + I_05zz*s_05 +
I_15zz*s_15 + I_25zz*s_25;
sxy[coord] += I_03xy*s_03 + I_13xy*s_13 + I_05xy*s_05 +
I_15xy*s_15 + I_25xy*s_25;
syz[coord] += I_03yz*s_03 + I_13yz*s_13 + I_05yz*s_05 +
I_15yz*s_15 + I_25yz*s_25;
sxz[coord] += I_03xz*s_03 + I_13xz*s_13 + I_05xz*s_05 +
I_15xz*s_15 + I_25xz*s_25;
}
}
return;
}
This is the equivalent Matlab function.
function [sxx, syy, szz, sxy, sxz, syz] = StressDueToSeg_test(N,S,px,py,pz,p1x,p1y,p1z,p2x,p2y,p2z,bx,by,bz,a,MU,NU)
m4p = 0.25 * MU / pi;
m8p = 0. * m4p;
m4pn = m4p / (1.0 - NU);
mn4pn = m4pn * NU;
a2 = a * a;
a2m8p = a2 * m8p;
sxx=zeros(N,1);
syy=zeros(N,1);
szz=zeros(N,1);
sxy=zeros(N,1);
sxz=zeros(N,1);
syz=zeros(N,1);
for coord=1:N
for seg=1:S
vec1x = p2x(seg) - p1x(seg);
vec1y = p2y(seg) - p1y(seg);
vec1z = p2z(seg) - p1z(seg);
oneoverLp = 1 / sqrt(vec1x*vec1x + vec1y*vec1y + vec1z*vec1z);
tpx = vec1x * oneoverLp;
tpy = vec1y * oneoverLp;
tpz = vec1z * oneoverLp;
Rx = px(coord) - p1x(seg);
Ry = py(coord) - p1y(seg);
Rz = pz(coord) - p1z(seg);
Rdt = Rx*tpx + Ry*tpy + Rz*tpz;
ndx = Rx - Rdt*tpx;
ndy = Ry - Rdt*tpy;
ndz = Rz - Rdt*tpz;
d2 = ndx*ndx + ndy*ndy + ndz*ndz;
s1 = -Rdt;
s2 = -((px(coord)-p2x(seg))*tpx + (py(coord)-p2y(seg))*tpy + (pz(coord)-p2z(seg))*tpz);
a2_d2 = a2 + d2;
a2d2inv = 1 / a2_d2;
Ra = sqrt(a2_d2 + s1*s1);
Rainv = 1 / Ra;
Ra3inv = Rainv * Rainv * Rainv;
sRa3inv = s1 * Ra3inv;
s_03a = s1 * Rainv * a2d2inv;
s_13a = -Rainv;
s_05a = (2*s_03a + sRa3inv) * a2d2inv;
s_15a = -Ra3inv;
s_25a = s_03a - sRa3inv;
Ra = sqrt(a2_d2 + s2*s2);
Rainv = 1 / Ra;
Ra3inv = Rainv * Rainv * Rainv;
sRa3inv = s2 * Ra3inv;
s_03b = s2 * Rainv * a2d2inv;
s_13b = -Rainv;
s_05b = (2*s_03b + sRa3inv) * a2d2inv;
s_15b = -Ra3inv;
s_25b = s_03b - sRa3inv;
s_03 = s_03b - s_03a;
s_13 = s_13b - s_13a;
s_05 = s_05b - s_05a;
s_15 = s_15b - s_15a;
s_25 = s_25b - s_25a;
txbx = tpy*bz(seg) - tpz*by(seg);
txby = tpz*bx(seg) - tpx*bz(seg);
txbz = tpx*by(seg) - tpy*bx(seg);
dxbx = ndy*bz(seg) - ndz*by(seg);
dxby = ndz*bx(seg) - ndx*bz(seg);
dxbz = ndx*by(seg) - ndy*bx(seg);
dxbdt = dxbx*tpx + dxby*tpy + dxbz*tpz;
dmdxx = ndx * ndx;
dmdyy = ndy * ndy;
dmdzz = ndz * ndz;
dmdxy = ndx * ndy;
dmdyz = ndy * ndz;
dmdxz = ndx * ndz;
tmtxx = tpx * tpx;
tmtyy = tpy * tpy;
tmtzz = tpz * tpz;
tmtxy = tpx * tpy;
tmtyz = tpy * tpz;
tmtxz = tpx * tpz;
tmdxx = 2 * tpx * ndx;
tmdyy = 2 * tpy * ndy;
tmdzz = 2 * tpz * ndz;
tmdxy = tpx*ndy + tpy*ndx;
tmdyz = tpy*ndz + tpz*ndy;
tmdxz = tpx*ndz + tpz*ndx;
tmtxbxx = 2 * tpx * txbx;
tmtxbyy = 2 * tpy * txby;
tmtxbzz = 2 * tpz * txbz;
tmtxbxy = tpx*txby + tpy*txbx;
tmtxbyz = tpy*txbz + tpz*txby;
tmtxbxz = tpx*txbz + tpz*txbx;
dmtxbxx = 2 * ndx * txbx;
dmtxbyy = 2 * ndy * txby;
dmtxbzz = 2 * ndz * txbz;
dmtxbxy = ndx*txby + ndy*txbx;
dmtxbyz = ndy*txbz + ndz*txby;
dmtxbxz = ndx*txbz + ndz*txbx;
tmdxbxx = 2 * tpx * dxbx;
tmdxbyy = 2 * tpy * dxby;
tmdxbzz = 2 * tpz * dxbz;
tmdxbxy = tpx*dxby + tpy*dxbx;
tmdxbyz = tpy*dxbz + tpz*dxby;
tmdxbxz = tpx*dxbz + tpz*dxbx;
common = m4pn * dxbdt;
I_03xx = common + m4pn*dmtxbxx - m4p*tmdxbxx;
I_03yy = common + m4pn*dmtxbyy - m4p*tmdxbyy;
I_03zz = common + m4pn*dmtxbzz - m4p*tmdxbzz;
I_03xy = m4pn*dmtxbxy - m4p*tmdxbxy;
I_03yz = m4pn*dmtxbyz - m4p*tmdxbyz;
I_03xz = m4pn*dmtxbxz - m4p*tmdxbxz;
I_13xx = -mn4pn * tmtxbxx;
I_13yy = -mn4pn * tmtxbyy;
I_13zz = -mn4pn * tmtxbzz;
I_13xy = -mn4pn * tmtxbxy;
I_13yz = -mn4pn * tmtxbyz;
I_13xz = -mn4pn * tmtxbxz;
I_05xx = common*(a2+dmdxx) - a2m8p*tmdxbxx;
I_05yy = common*(a2+dmdyy) - a2m8p*tmdxbyy;
I_05zz = common*(a2+dmdzz) - a2m8p*tmdxbzz;
I_05xy = common*dmdxy - a2m8p*tmdxbxy;
I_05yz = common*dmdyz - a2m8p*tmdxbyz;
I_05xz = common*dmdxz - a2m8p*tmdxbxz;
I_15xx = a2m8p*tmtxbxx - common*tmdxx;
I_15yy = a2m8p*tmtxbyy - common*tmdyy;
I_15zz = a2m8p*tmtxbzz - common*tmdzz;
I_15xy = a2m8p*tmtxbxy - common*tmdxy;
I_15yz = a2m8p*tmtxbyz - common*tmdyz;
I_15xz = a2m8p*tmtxbxz - common*tmdxz;
I_25xx = common * tmtxx;
I_25yy = common * tmtyy;
I_25zz = common * tmtzz;
I_25xy = common * tmtxy;
I_25yz = common * tmtyz;
I_25xz = common * tmtxz;
sxx(coord) = sxx(coord) + I_03xx*s_03 + I_13xx*s_13 + I_05xx*s_05 +...
I_15xx*s_15 + I_25xx*s_25;
syy(coord) = syy(coord) + I_03yy*s_03 + I_13yy*s_13 + I_05yy*s_05 +...
I_15yy*s_15 + I_25yy*s_25;
szz(coord) = szz(coord) + I_03zz*s_03 + I_13zz*s_13 + I_05zz*s_05 +...
I_15zz*s_15 + I_25zz*s_25;
sxy(coord) = sxy(coord) + I_03xy*s_03 + I_13xy*s_13 + I_05xy*s_05 +...
I_15xy*s_15 + I_25xy*s_25;
syz(coord) = syz(coord) + I_03yz*s_03 + I_13yz*s_13 + I_05yz*s_05 +...
I_15yz*s_15 + I_25yz*s_25;
sxz(coord) = sxz(coord) + I_03xz*s_03 + I_13xz*s_13 + I_05xz*s_05 +...
I_15xz*s_15 + I_25xz*s_25;
end
end

Sign in to comment.


dpb
dpb on 30 Jul 2013
Edited: dpb on 30 Jul 2013
Matlab and C differ in internal storage order--Matlab is column major whereas C is row major. IOW, the first index in 2D array in Matlab goes down the first column whereas in C the same index traverses the first row.
The result you see is directly the result of that. For speed you want to operate on arrays in linear memory order so you would reorder the indices between the two languages to make that happen. So, in the C mex-function, order the array loops to traverse by row.
  7 Comments
dpb
dpb on 31 Jul 2013
Edited: dpb on 2 Aug 2013
OK, I did paste the code into an editor so could pare it down to the essence--
The problem has to be as IA says in what you haven't shown...the declarations and calling sequences and how you then manipulated the results.
You've somehow transposed the data before going into the routine (the round translucent orb tells me... :) )
Shows us the minimum steps you used to illustrate the problem from the start...specifically, the form/shape of the input and all steps prior to and including the call and then any manipulations afterwards up until the point at which you think there's a problem (for both solutions, of course).

Sign in to comment.

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!