Resultant matrix is rotated 90 whether I use MEX or MatLab function

Question

Francesco on 30 Jul 2013

0
Link

Direct link to this question

https://ch.mathworks.com/matlabcentral/answers/83559-resultant-matrix-is-rotated-90-whether-i-use-mex-or-matlab-function

Dear all,

I am puzzled by something odd. I have written a MEX function using C of a Matlab function that contains 2 nested loops to calculate stress on a 2D grid.

The inputs of the MEX function and Matlab function are identical and are in the form of a 1D array (of the 2D grid).

The outputs of the MEX function and the Matlab function are not identical in the sense that if I reshape into a 2D array using reshape(answer,m,n) and surf() I get that the MEX and Matlab answers are rotated by 90 degrees!

I literally copy and pasted the C function into Matlab, so I am 99% certain that there are no computational differences (apart from the minor changes like brackets etc.)

My code is structured as follows:

for (coord=0; coord<N; coord++) { 
        for (seg=0; seg<S; seg++) {
        ...accessing arrays with index coord: px,py,pz
        ...accessing arrays with index seg: p1x,p1y,p1z,p2x,p2y,p2z,bx,by,bz
        sxx[coord] += ...maths
        }
   }

Similarly in Matlab,

for coord=1:N  
        for seg=1:S
          ...accessing arrays with index coord: px,py,pz
          ...accessing arrays with index seg: p1x,p1y,p1z,p2x,p2y,p2z,bx,by,bz
          sxx(coord) = sxx(coord) + ...maths
        end
  end

Both functions have the same identical nomenclature and variable order. I use the same input arrays in the workspace, and yet the 2D grids after reshape are rotated!

My mexFunction for the C code is: void mexFunction( int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) { int N,S; double *px,*py,*pz; double *p1x,*p1y,*p1z,*p2x,*p2y,*p2z; double *bx,*by,*bz; double a,MU,NU; double *sxx,*syy,*szz,*sxy,*syz,*sxz; int i;

      //get the scalar inputs N,S,a,MU,NU
      //create pointers to the input matrices
      N=mxGetScalar(prhs[0]);
      S=mxGetScalar(prhs[1]);
      px=(double *) mxGetPr(prhs[2]);
      py=(double *) mxGetPr(prhs[3]);
      pz=(double *) mxGetPr(prhs[4]);
      p1x=(double *) mxGetPr(prhs[5]);
      p1y=(double *) mxGetPr(prhs[6]);
      p1z=(double *) mxGetPr(prhs[7]);
      p2x=(double *) mxGetPr(prhs[8]);
      p2y=(double *) mxGetPr(prhs[9]);
      p2z=(double *) mxGetPr(prhs[10]); 
      bx=(double *) mxGetPr(prhs[11]);
      by=(double *) mxGetPr(prhs[12]);
      bz=(double *) mxGetPr(prhs[13]);
      a=mxGetScalar(prhs[14]);
      MU=mxGetScalar(prhs[15]);
      NU=mxGetScalar(prhs[16]);
      //call the output pointer to the output matrix
      plhs[0]=mxCreateDoubleMatrix(N,1,mxREAL);
      plhs[1]=mxCreateDoubleMatrix(N,1,mxREAL);
      plhs[2]=mxCreateDoubleMatrix(N,1,mxREAL);
      plhs[3]=mxCreateDoubleMatrix(N,1,mxREAL);
      plhs[4]=mxCreateDoubleMatrix(N,1,mxREAL);
      plhs[5]=mxCreateDoubleMatrix(N,1,mxREAL);
      //create a C pointer to a copy of the output matrix
    sxx=(double *) mxGetPr(plhs[0]);
      syy=(double *) mxGetPr(plhs[1]);
      szz=(double *) mxGetPr(plhs[2]);
      sxy=(double *) mxGetPr(plhs[3]);
      syz=(double *) mxGetPr(plhs[4]);
      sxz=(double *) mxGetPr(plhs[5]);
      //call the C subroutine
      StressDueToSeg(N,S, px, py, pz,p1x, p1y, p1z,
                  p2x, p2y, p2z, bx, by, bz,
                  a, MU, NU,
                  sxx, syy, szz, sxy, syz, sxz);
      }

I wonder whether some kind of indexing difference in the mexFunction leads to this problem? I know I could just rotate the answer by 90 degrees, but seeing as I want to use the MEX function to speed up my computation, rotating large 2D arrays doesn't sound very efficient.

Thank you,

Kind Regards,

F

0 Comments
Show -2 older commentsHide -2 older comments

Sign in to comment.

Sign in to answer this question.

Answer 1

James Tursa on 30 Jul 2013

2
Link

Direct link to this answer

https://ch.mathworks.com/matlabcentral/answers/83559-resultant-matrix-is-rotated-90-whether-i-use-mex-or-matlab-function#answer_93132

It is the code that you don't show that is likely at the heart of the problem ... the indexing you use and how it relates to the mxArray variables you are passing in and returning to the workspace. E.g., are any of your inputs 2D? If so, how do you index into them in your mex function? Can you show us the actual indexing code?

1 Comment
Show -1 older commentsHide -1 older comments

Francesco on 30 Jul 2013

Hello,

My inputs are vectors, so are my outputs. I reshape() in Matlab once I get the outputs from the MEX or Matlab function.

Here is the MEX function. It's a bit long.

#include <math.h>
#include <mex.h>
static void StressDueToSeg(int N, int S, double *px, double *py, double *pz,
                    double *p1x, double *p1y, double *p1z,
                    double *p2x, double *p2y, double *p2z,
                    double *bx, double *by, double *bz,
                    double a, double MU, double NU,
                    double *sxx, double *syy, double *szz,
                    double *sxy, double *syz, double *sxz);
void mexFunction( int nlhs, mxArray *plhs[],
                  int nrhs, const mxArray *prhs[])
    {
    int N,S;
    double *px,*py,*pz;
    double *p1x,*p1y,*p1z,*p2x,*p2y,*p2z;
    double *bx,*by,*bz;
    double a,MU,NU;
    double *sxx,*syy,*szz,*sxy,*syz,*sxz;
    int i;
      //get the scalar inputs N,S,a,MU,NU
      //create pointers to the input matrices
      N=mxGetScalar(prhs[0]);
      S=mxGetScalar(prhs[1]);
      px=(double *) mxGetPr(prhs[2]);
      py=(double *) mxGetPr(prhs[3]);
      pz=(double *) mxGetPr(prhs[4]);
      p1x=(double *) mxGetPr(prhs[5]);
      p1y=(double *) mxGetPr(prhs[6]);
      p1z=(double *) mxGetPr(prhs[7]);
      p2x=(double *) mxGetPr(prhs[8]);
      p2y=(double *) mxGetPr(prhs[9]);
      p2z=(double *) mxGetPr(prhs[10]); 
      bx=(double *) mxGetPr(prhs[11]);
      by=(double *) mxGetPr(prhs[12]);
      bz=(double *) mxGetPr(prhs[13]);
      a=mxGetScalar(prhs[14]);
      MU=mxGetScalar(prhs[15]);
      NU=mxGetScalar(prhs[16]);
      //call the output pointer to the output matrix
      plhs[0]=mxCreateDoubleMatrix(N,1,mxREAL);
      plhs[1]=mxCreateDoubleMatrix(N,1,mxREAL);
      plhs[2]=mxCreateDoubleMatrix(N,1,mxREAL);
      plhs[3]=mxCreateDoubleMatrix(N,1,mxREAL);
      plhs[4]=mxCreateDoubleMatrix(N,1,mxREAL);
      plhs[5]=mxCreateDoubleMatrix(N,1,mxREAL);
      //create a C pointer to a copy of the output matrix
    sxx=(double *) mxGetPr(plhs[0]);
      syy=(double *) mxGetPr(plhs[1]);
      szz=(double *) mxGetPr(plhs[2]);
      sxy=(double *) mxGetPr(plhs[3]);
      syz=(double *) mxGetPr(plhs[4]);
      sxz=(double *) mxGetPr(plhs[5]);
      //call the C subroutine
      StressDueToSeg(N,S, px, py, pz,p1x, p1y, p1z,
                  p2x, p2y, p2z, bx, by, bz,
                  a, MU, NU,
                  sxx, syy, szz, sxy, syz, sxz);
      }
void StressDueToSeg(int N, int S, double *px, double *py, double *pz,
                    double *p1x, double *p1y, double *p1z,
                    double *p2x, double *p2y, double *p2z,
                    double *bx, double *by, double *bz,
                    double a, double MU, double NU,
                    double *sxx, double *syy, double *szz,
                    double *sxy, double *syz, double *sxz)
{
    double   oneoverLp, common;
    double   vec1x, vec1y, vec1z;
    double   tpx, tpy, tpz;
    double   Rx, Ry, Rz, Rdt;
    double   ndx, ndy, ndz;
    double   d2, s1, s2, a2, a2_d2, a2d2inv;
    double   Ra, Rainv, Ra3inv, sRa3inv;
    double   s_03a, s_13a, s_05a, s_15a, s_25a;
    double   s_03b, s_13b, s_05b, s_15b, s_25b;
    double   s_03, s_13, s_05, s_15, s_25;
    double   m4p, m8p, m4pn, mn4pn, a2m8p;
    double   txbx, txby, txbz;
    double   dxbx, dxby, dxbz;
    double   dxbdt, dmdxx, dmdyy, dmdzz, dmdxy, dmdyz, dmdxz;
    double   tmtxx, tmtyy, tmtzz, tmtxy, tmtyz, tmtxz;
    double   tmdxx, tmdyy, tmdzz, tmdxy, tmdyz, tmdxz;
    double   tmtxbxx, tmtxbyy, tmtxbzz, tmtxbxy, tmtxbyz, tmtxbxz;
    double   dmtxbxx, dmtxbyy, dmtxbzz, dmtxbxy, dmtxbyz, dmtxbxz;
    double   tmdxbxx, tmdxbyy, tmdxbzz, tmdxbxy, tmdxbyz, tmdxbxz;
    double   I_03xx, I_03yy, I_03zz, I_03xy, I_03yz, I_03xz;
    double   I_13xx, I_13yy, I_13zz, I_13xy, I_13yz, I_13xz;
    double   I_05xx, I_05yy, I_05zz, I_05xy, I_05yz, I_05xz;
    double   I_15xx, I_15yy, I_15zz, I_15xy, I_15yz, I_15xz;
    double   I_25xx, I_25yy, I_25zz, I_25xy, I_25yz, I_25xz;
    int     coord, seg;
      //precompute some constants
      m4p = 0.25 * MU / M_PI;
      m8p = 0.5 * m4p;
      m4pn = m4p / (1 - NU);
      mn4pn = m4pn * NU;
      a2 = a * a;
      a2m8p = a2 * m8p;
      for (coord=0; coord<N; coord++) { //loop over the grid points
          for (seg=0; seg<S; seg++) { //loop over the segments
          vec1x = p2x[seg] - p1x[seg];
          vec1y = p2y[seg] - p1y[seg];
          vec1z = p2z[seg] - p1z[seg];
          oneoverLp = 1 / sqrt(vec1x*vec1x + vec1y*vec1y + vec1z*vec1z);
          tpx = vec1x * oneoverLp;
          tpy = vec1y * oneoverLp;
          tpz = vec1z * oneoverLp;
          Rx = px[coord] - p1x[seg];
          Ry = py[coord] - p1y[seg];
          Rz = pz[coord] - p1z[seg];
          Rdt = Rx*tpx + Ry*tpy + Rz*tpz;
          ndx = Rx - Rdt*tpx;
          ndy = Ry - Rdt*tpy;
          ndz = Rz - Rdt*tpz;
          d2 = ndx*ndx + ndy*ndy + ndz*ndz;
          s1 = -Rdt;
          s2 = -((px[coord]-p2x[seg])*tpx + (py[coord]-p2y[seg])*tpy + (pz[coord]-p2z[seg])*tpz);
          a2_d2 = a2 + d2;
          a2d2inv = 1 / a2_d2;
          Ra = sqrt(a2_d2 + s1*s1);
          Rainv = 1 / Ra;
          Ra3inv = Rainv * Rainv * Rainv;
          sRa3inv = s1 * Ra3inv;
          s_03a = s1 * Rainv * a2d2inv;
          s_13a = -Rainv;
          s_05a = (2*s_03a + sRa3inv) * a2d2inv;
          s_15a = -Ra3inv;
          s_25a = s_03a - sRa3inv;
          Ra = sqrt(a2_d2 + s2*s2);
          Rainv = 1 / Ra;
          Ra3inv = Rainv * Rainv * Rainv;
          sRa3inv = s2 * Ra3inv;
          s_03b = s2 * Rainv * a2d2inv;
          s_13b = -Rainv;
          s_05b = (2*s_03b + sRa3inv) * a2d2inv;
          s_15b = -Ra3inv;
          s_25b = s_03b - sRa3inv;
          s_03 = s_03b - s_03a;
          s_13 = s_13b - s_13a;
          s_05 = s_05b - s_05a;
          s_15 = s_15b - s_15a;
          s_25 = s_25b - s_25a;
          txbx = tpy*bz[seg] - tpz*by[seg];
          txby = tpz*bx[seg] - tpx*bz[seg];
          txbz = tpx*by[seg] - tpy*bx[seg];
          dxbx = ndy*bz[seg] - ndz*by[seg];
          dxby = ndz*bx[seg] - ndx*bz[seg];
          dxbz = ndx*by[seg] - ndy*bx[seg];
          dxbdt = dxbx*tpx + dxby*tpy + dxbz*tpz;
          dmdxx = ndx * ndx;
          dmdyy = ndy * ndy;
          dmdzz = ndz * ndz;
          dmdxy = ndx * ndy;
          dmdyz = ndy * ndz;
          dmdxz = ndx * ndz;
          tmtxx = tpx * tpx;
          tmtyy = tpy * tpy;
          tmtzz = tpz * tpz;
          tmtxy = tpx * tpy;
          tmtyz = tpy * tpz;
          tmtxz = tpx * tpz;
          tmdxx = 2 * tpx * ndx;
          tmdyy = 2 * tpy * ndy;
          tmdzz = 2 * tpz * ndz;
          tmdxy = tpx*ndy + tpy*ndx;
          tmdyz = tpy*ndz + tpz*ndy;
          tmdxz = tpx*ndz + tpz*ndx;
          tmtxbxx = 2 * tpx * txbx;
          tmtxbyy = 2 * tpy * txby;
          tmtxbzz = 2 * tpz * txbz;
          tmtxbxy = tpx*txby + tpy*txbx;
          tmtxbyz = tpy*txbz + tpz*txby;
          tmtxbxz = tpx*txbz + tpz*txbx;
          dmtxbxx = 2 * ndx * txbx;
          dmtxbyy = 2 * ndy * txby;
          dmtxbzz = 2 * ndz * txbz;
          dmtxbxy = ndx*txby + ndy*txbx;
          dmtxbyz = ndy*txbz + ndz*txby;
          dmtxbxz = ndx*txbz + ndz*txbx;
          tmdxbxx = 2 * tpx * dxbx;
          tmdxbyy = 2 * tpy * dxby;
          tmdxbzz = 2 * tpz * dxbz;
          tmdxbxy = tpx*dxby + tpy*dxbx;
          tmdxbyz = tpy*dxbz + tpz*dxby;
          tmdxbxz = tpx*dxbz + tpz*dxbx;
          common = m4pn * dxbdt;
          I_03xx = common + m4pn*dmtxbxx - m4p*tmdxbxx;
          I_03yy = common + m4pn*dmtxbyy - m4p*tmdxbyy;
          I_03zz = common + m4pn*dmtxbzz - m4p*tmdxbzz;
          I_03xy = m4pn*dmtxbxy - m4p*tmdxbxy;
          I_03yz = m4pn*dmtxbyz - m4p*tmdxbyz;
          I_03xz = m4pn*dmtxbxz - m4p*tmdxbxz;
          I_13xx = -mn4pn * tmtxbxx;
          I_13yy = -mn4pn * tmtxbyy;
          I_13zz = -mn4pn * tmtxbzz;
          I_13xy = -mn4pn * tmtxbxy;
          I_13yz = -mn4pn * tmtxbyz;
          I_13xz = -mn4pn * tmtxbxz;
          I_05xx = common*(a2+dmdxx) - a2m8p*tmdxbxx;
          I_05yy = common*(a2+dmdyy) - a2m8p*tmdxbyy;
          I_05zz = common*(a2+dmdzz) - a2m8p*tmdxbzz;
          I_05xy = common*dmdxy - a2m8p*tmdxbxy;
          I_05yz = common*dmdyz - a2m8p*tmdxbyz;
          I_05xz = common*dmdxz - a2m8p*tmdxbxz;
          I_15xx = a2m8p*tmtxbxx - common*tmdxx;
          I_15yy = a2m8p*tmtxbyy - common*tmdyy;
          I_15zz = a2m8p*tmtxbzz - common*tmdzz;
          I_15xy = a2m8p*tmtxbxy - common*tmdxy;
          I_15yz = a2m8p*tmtxbyz - common*tmdyz;
          I_15xz = a2m8p*tmtxbxz - common*tmdxz;
          I_25xx = common * tmtxx;
          I_25yy = common * tmtyy;
          I_25zz = common * tmtzz;
          I_25xy = common * tmtxy;
          I_25yz = common * tmtyz;
          I_25xz = common * tmtxz;
          sxx[coord] += I_03xx*s_03 + I_13xx*s_13 + I_05xx*s_05 +
          I_15xx*s_15 + I_25xx*s_25;
          syy[coord] += I_03yy*s_03 + I_13yy*s_13 + I_05yy*s_05 +
          I_15yy*s_15 + I_25yy*s_25;
          szz[coord] += I_03zz*s_03 + I_13zz*s_13 + I_05zz*s_05 +
          I_15zz*s_15 + I_25zz*s_25;
          sxy[coord] += I_03xy*s_03 + I_13xy*s_13 + I_05xy*s_05 +
          I_15xy*s_15 + I_25xy*s_25;
          syz[coord] += I_03yz*s_03 + I_13yz*s_13 + I_05yz*s_05 +
          I_15yz*s_15 + I_25yz*s_25;
          sxz[coord] += I_03xz*s_03 + I_13xz*s_13 + I_05xz*s_05 +
          I_15xz*s_15 + I_25xz*s_25;
          }
      }
      return;
  }

This is the equivalent Matlab function.

function [sxx, syy, szz, sxy, sxz, syz] = StressDueToSeg_test(N,S,px,py,pz,p1x,p1y,p1z,p2x,p2y,p2z,bx,by,bz,a,MU,NU)    
m4p = 0.25 * MU / pi;
m8p = 0. * m4p;
m4pn = m4p / (1.0 - NU);
mn4pn = m4pn * NU;
a2 = a * a;
a2m8p = a2 * m8p;
sxx=zeros(N,1);
syy=zeros(N,1);
szz=zeros(N,1);
sxy=zeros(N,1);
sxz=zeros(N,1);
syz=zeros(N,1);
for coord=1:N  
        for seg=1:S
          vec1x = p2x(seg) - p1x(seg);
          vec1y = p2y(seg) - p1y(seg);
          vec1z = p2z(seg) - p1z(seg);
          oneoverLp = 1 / sqrt(vec1x*vec1x + vec1y*vec1y + vec1z*vec1z);
          tpx = vec1x * oneoverLp;
          tpy = vec1y * oneoverLp;
          tpz = vec1z * oneoverLp;
          Rx = px(coord) - p1x(seg);
          Ry = py(coord) - p1y(seg);
          Rz = pz(coord) - p1z(seg);
          Rdt = Rx*tpx + Ry*tpy + Rz*tpz;
          ndx = Rx - Rdt*tpx;
          ndy = Ry - Rdt*tpy;
          ndz = Rz - Rdt*tpz;
          d2 = ndx*ndx + ndy*ndy + ndz*ndz;
          s1 = -Rdt;
          s2 = -((px(coord)-p2x(seg))*tpx + (py(coord)-p2y(seg))*tpy + (pz(coord)-p2z(seg))*tpz);
          a2_d2 = a2 + d2;
          a2d2inv = 1 / a2_d2;
          Ra = sqrt(a2_d2 + s1*s1);
          Rainv = 1 / Ra;
          Ra3inv = Rainv * Rainv * Rainv;
          sRa3inv = s1 * Ra3inv;
          s_03a = s1 * Rainv * a2d2inv;
          s_13a = -Rainv;
          s_05a = (2*s_03a + sRa3inv) * a2d2inv;
          s_15a = -Ra3inv;
          s_25a = s_03a - sRa3inv;
          Ra = sqrt(a2_d2 + s2*s2);
          Rainv = 1 / Ra;
          Ra3inv = Rainv * Rainv * Rainv;
          sRa3inv = s2 * Ra3inv;
          s_03b = s2 * Rainv * a2d2inv;
          s_13b = -Rainv;
          s_05b = (2*s_03b + sRa3inv) * a2d2inv;
          s_15b = -Ra3inv;
          s_25b = s_03b - sRa3inv;
          s_03 = s_03b - s_03a;
          s_13 = s_13b - s_13a;
          s_05 = s_05b - s_05a;
          s_15 = s_15b - s_15a;
          s_25 = s_25b - s_25a;
          txbx = tpy*bz(seg) - tpz*by(seg);
          txby = tpz*bx(seg) - tpx*bz(seg);
          txbz = tpx*by(seg) - tpy*bx(seg);
          dxbx = ndy*bz(seg) - ndz*by(seg);
          dxby = ndz*bx(seg) - ndx*bz(seg);
          dxbz = ndx*by(seg) - ndy*bx(seg);
          dxbdt = dxbx*tpx + dxby*tpy + dxbz*tpz;
          dmdxx = ndx * ndx;
          dmdyy = ndy * ndy;
          dmdzz = ndz * ndz;
          dmdxy = ndx * ndy;
          dmdyz = ndy * ndz;
          dmdxz = ndx * ndz;
          tmtxx = tpx * tpx;
          tmtyy = tpy * tpy;
          tmtzz = tpz * tpz;
          tmtxy = tpx * tpy;
          tmtyz = tpy * tpz;
          tmtxz = tpx * tpz;
          tmdxx = 2 * tpx * ndx;
          tmdyy = 2 * tpy * ndy;
          tmdzz = 2 * tpz * ndz;
          tmdxy = tpx*ndy + tpy*ndx;
          tmdyz = tpy*ndz + tpz*ndy;
          tmdxz = tpx*ndz + tpz*ndx;
          tmtxbxx = 2 * tpx * txbx;
          tmtxbyy = 2 * tpy * txby;
          tmtxbzz = 2 * tpz * txbz;
          tmtxbxy = tpx*txby + tpy*txbx;
          tmtxbyz = tpy*txbz + tpz*txby;
          tmtxbxz = tpx*txbz + tpz*txbx;
          dmtxbxx = 2 * ndx * txbx;
          dmtxbyy = 2 * ndy * txby;
          dmtxbzz = 2 * ndz * txbz;
          dmtxbxy = ndx*txby + ndy*txbx;
          dmtxbyz = ndy*txbz + ndz*txby;
          dmtxbxz = ndx*txbz + ndz*txbx;
          tmdxbxx = 2 * tpx * dxbx;
          tmdxbyy = 2 * tpy * dxby;
          tmdxbzz = 2 * tpz * dxbz;
          tmdxbxy = tpx*dxby + tpy*dxbx;
          tmdxbyz = tpy*dxbz + tpz*dxby;
          tmdxbxz = tpx*dxbz + tpz*dxbx;
          common = m4pn * dxbdt;
          I_03xx = common + m4pn*dmtxbxx - m4p*tmdxbxx;
          I_03yy = common + m4pn*dmtxbyy - m4p*tmdxbyy;
          I_03zz = common + m4pn*dmtxbzz - m4p*tmdxbzz;
          I_03xy = m4pn*dmtxbxy - m4p*tmdxbxy;
          I_03yz = m4pn*dmtxbyz - m4p*tmdxbyz;
          I_03xz = m4pn*dmtxbxz - m4p*tmdxbxz;
          I_13xx = -mn4pn * tmtxbxx;
          I_13yy = -mn4pn * tmtxbyy;
          I_13zz = -mn4pn * tmtxbzz;
          I_13xy = -mn4pn * tmtxbxy;
          I_13yz = -mn4pn * tmtxbyz;
          I_13xz = -mn4pn * tmtxbxz;
          I_05xx = common*(a2+dmdxx) - a2m8p*tmdxbxx;
          I_05yy = common*(a2+dmdyy) - a2m8p*tmdxbyy;
          I_05zz = common*(a2+dmdzz) - a2m8p*tmdxbzz;
          I_05xy = common*dmdxy - a2m8p*tmdxbxy;
          I_05yz = common*dmdyz - a2m8p*tmdxbyz;
          I_05xz = common*dmdxz - a2m8p*tmdxbxz;
          I_15xx = a2m8p*tmtxbxx - common*tmdxx;
          I_15yy = a2m8p*tmtxbyy - common*tmdyy;
          I_15zz = a2m8p*tmtxbzz - common*tmdzz;
          I_15xy = a2m8p*tmtxbxy - common*tmdxy;
          I_15yz = a2m8p*tmtxbyz - common*tmdyz;
          I_15xz = a2m8p*tmtxbxz - common*tmdxz;
          I_25xx = common * tmtxx;
          I_25yy = common * tmtyy;
          I_25zz = common * tmtzz;
          I_25xy = common * tmtxy;
          I_25yz = common * tmtyz;
          I_25xz = common * tmtxz;
          sxx(coord) = sxx(coord) + I_03xx*s_03 + I_13xx*s_13 + I_05xx*s_05 +...
          I_15xx*s_15 + I_25xx*s_25;
          syy(coord) = syy(coord) + I_03yy*s_03 + I_13yy*s_13 + I_05yy*s_05 +...
          I_15yy*s_15 + I_25yy*s_25;
          szz(coord) = szz(coord) + I_03zz*s_03 + I_13zz*s_13 + I_05zz*s_05 +...
          I_15zz*s_15 + I_25zz*s_25;
          sxy(coord) = sxy(coord) + I_03xy*s_03 + I_13xy*s_13 + I_05xy*s_05 +...
          I_15xy*s_15 + I_25xy*s_25;
          syz(coord) = syz(coord) + I_03yz*s_03 + I_13yz*s_13 + I_05yz*s_05 +...
          I_15yz*s_15 + I_25yz*s_25;
          sxz(coord) = sxz(coord) + I_03xz*s_03 + I_13xz*s_13 + I_05xz*s_05 +...
          I_15xz*s_15 + I_25xz*s_25;
          end
  end

Sign in to comment.

Answer 2

dpb on 30 Jul 2013

2
Link

Direct link to this answer

https://ch.mathworks.com/matlabcentral/answers/83559-resultant-matrix-is-rotated-90-whether-i-use-mex-or-matlab-function#answer_93133

Edited: dpb on 30 Jul 2013

Matlab and C differ in internal storage order--Matlab is column major whereas C is row major. IOW, the first index in 2D array in Matlab goes down the first column whereas in C the same index traverses the first row.

The result you see is directly the result of that. For speed you want to operate on arrays in linear memory order so you would reorder the indices between the two languages to make that happen. So, in the C mex-function, order the array loops to traverse by row.

7 Comments
Show 5 older commentsHide 5 older comments

Francesco on 31 Jul 2013

Thanks for the interest. I'm not entirely convinced myself...

dpb on 31 Jul 2013

Edited: dpb on 2 Aug 2013

OK, I did paste the code into an editor so could pare it down to the essence--

The problem has to be as IA says in what you haven't shown...the declarations and calling sequences and how you then manipulated the results.

You've somehow transposed the data before going into the routine (the round translucent orb tells me... :) )

Shows us the minimum steps you used to illustrate the problem from the start...specifically, the form/shape of the input and all steps prior to and including the call and then any manipulations afterwards up until the point at which you think there's a problem (for both solutions, of course).

Sign in to comment.

Resultant matrix is rotated 90 whether I use MEX or MatLab function

0 Comments
Show -2 older commentsHide -2 older comments

Answers (2)

1 Comment
Show -1 older commentsHide -1 older comments

7 Comments
Show 5 older commentsHide 5 older comments

See Also

Categories

Tags

Community Treasure Hunt

Resultant matrix is rotated 90 whether I use MEX or MatLab function

0 Comments Show -2 older commentsHide -2 older comments

Answers (2)

1 Comment Show -1 older commentsHide -1 older comments

7 Comments Show 5 older commentsHide 5 older comments

See Also

Categories

Tags

Community Treasure Hunt

0 Comments
Show -2 older commentsHide -2 older comments

1 Comment
Show -1 older commentsHide -1 older comments

7 Comments
Show 5 older commentsHide 5 older comments