what is the difference between assigning with and without range?

Question

Ernst Reißner on 23 Jul 2021


https://ch.mathworks.com/matlabcentral/answers/884454-what-is-the-difference-between-assigning-with-and-without-range

Commented: Chunru on 25 Jul 2021

I have variables a and b both holding columns with same length N.

Is there a difference between assigning

a=b

and

a(1:N)=b

?

Maybe there is a difference in performance??

a is preassigned with zeros(N, 1, 'double')

2 Comments

Ernst Reißner on 23 Jul 2021

It seems to me that if the number of elements is known, and so numel is not needed, it is even faster with indexing.

I had the idea that it is faster with indexing because no additional memory is required

and garbage collection is circumvented.

But seemingly no.

Chunru on 23 Jul 2021

I think a MATLAB array object stores its element count as a property (numel or something similar), so the overhead of calling numel is minimal and not worth worrying about.


Answer 1

John D'Errico on 23 Jul 2021


https://ch.mathworks.com/matlabcentral/answers/884454-what-is-the-difference-between-assigning-with-and-without-range#answer_752334


Yes. There is a difference, and a fundamental one. In the first case, the assignment a=b COMPLETELY replaces a. The variable is overwritten, if it already exists. If not, a new variable is created with that name. What a was before is completely irrelevant. Even the class of a is replaced. For example:

a = uint8(1:3);
b = rand(1,4);
whos a b
  Name      Size            Bytes  Class     Attributes

  a         1x3                 3  uint8               
  b         1x4                32  double              

As you can see, the two variables are not even the same classes. But now when we use a = b, a has been replaced. It has a new size. And a is now double precision.

a = b
a = 1×4
    0.0366    0.9518    0.5077    0.6811
whos a b
  Name      Size            Bytes  Class     Attributes

  a         1x4                32  double              
  b         1x4                32  double              

Now, let's try the second example, where we use indexing.

a = uint8(1:3)
a = 1×3
   1   2   3
b = rand(1,4)
b = 1×4
    0.3677    0.6780    0.7891    0.8548

Now use indexing:

a(1:4) = b
a = 1×4
   0   1   1   1
whos a b
  Name      Size            Bytes  Class     Attributes

  a         1x4                 4  uint8               
  b         1x4                32  double              

Here, elements of a are selectively replaced with elements of b, but a class conversion happens first. Because a remains uint8, each element of b was rounded to the nearest integer as it was stored. Note that a also grew from 3 to 4 elements, since the index range 1:4 extends past its original end.
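
A minimal sketch of the conversion rule, using made-up values rather than the random ones above. It assumes nothing beyond base MATLAB: an indexed assignment keeps the class of the left-hand side, so doubles are rounded to the nearest integer and saturated to the uint8 range, while a plain assignment simply takes over the class of the right-hand side.

```matlab
% Illustrative values (not from the thread) chosen to show rounding and saturation.
b = [0.4 1.5 300.7 -2];

a = zeros(1, 4, 'uint8');
a(1:4) = b;          % indexed: a stays uint8, values of b are converted
disp(class(a))       % uint8
disp(a)              % values: 0 2 255 0

c = zeros(1, 4, 'uint8');
c = b;               % plain: c is replaced and becomes double
disp(class(c))       % double
```

The 300.7 and -2 entries are clipped to 255 and 0, which is easy to miss when the destination array was preallocated with an integer class.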

Is one operation faster than the other? The index operation must certainly be slower. But this is not a slow thing. So I'll put the operations into a function, then use timeit.

a = ones(1,3e7,'uint8');
b = rand(1,3e7);
timeit(@() speedtest1(a,b))
ans = 1.1380e-05
timeit(@() speedtest2(a,b))
ans = 0.2106

So, where there was a class conversion, speedtest1 is way faster. In the next test, there will be no class conversion.

a = randn(1,3e7);
b = rand(1,3e7);
timeit(@() speedtest1(a,b))
ans = 4.5789e-06
timeit(@() speedtest2(a,b))
ans = 0.2247

So here the replacement was way faster. In both cases, when you do an insert of selected elements, MATLAB spends a lot of time, first, generating the index vector. Then it needs to overwrite those selected elements, making sure any class conversion is done if needed.

function a = speedtest1(a,b)
  a = b;
end
function a = speedtest2(a,b)
  a(1:numel(b)) = b;
end

So, is there a difference? Yes. There must be one.

10 Comments

John D'Errico on 23 Jul 2021


Using tic and toc is a terrible way to test time!!!!!! Learn to use timeit.

You are correct, in that MATLAB does a lazy copy if it can. With the simple a = b, MATLAB figures out that it can just link a to b, without completely copying the elements over.

When you force one element to change, then MATLAB does force the variable to be copied. But even there, see this comparison:

a = ones(1,3e7,'uint8');
b = rand(1,3e7);
timeit(@() speedtest1(a,b))
ans = 1.3400e-05
timeit(@() speedtest2(a,b))
ans = 0.2210
timeit(@() speedtest3(a,b))
ans = 0.1935

So there is some difference, and that difference IS repeatable. Does it lie in the class conversion? The next test has no class conversion.

a = ones(1,3e7);
b = rand(1,3e7); % Both are doubles now
timeit(@() speedtest1(a,b))
ans = 6.3956e-06
timeit(@() speedtest2(a,b))
ans = 0.2337
timeit(@() speedtest3(a,b))
ans = 0.1937

Again, the copy a=b is more efficient, even if we force MATLAB to resolve the copy, by modifying one element.

function a = speedtest1(a,b)
  a = b;
end
function a = speedtest2(a,b)
  a(1:numel(b)) = b;
end
function a = speedtest3(a,b)
  a = b;
  a(1) = a(1) + 0;
end
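
The lazy copy does not change what the code observes, only when the copying cost is paid. A small sketch with illustrative values: after a = b, writing into a leaves b untouched, because that write is the point at which the real copy is made.

```matlab
b = [1 2 3];
a = b;         % cheap: a may simply share b's data at this point
a(1) = 10;     % first write to a forces the actual copy
disp(a)        % values: 10 2 3
disp(b)        % values: 1 2 3, b is unaffected
```

This is why speedtest1 on its own mostly measures bookkeeping, and why speedtest3 adds a write to force the copy.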

Walter Roberson on 25 Jul 2021


It is known that

x = 1 : 100;
for k = x

causes 1 : 100 to be executed and the result placed into an array, and then the for loop to iterate over elements of the stored array.

It is known that

for k = 1 : 100

does not cause 1 : 100 to be executed immediately, with instead the initial and final value and increments being stored in hidden locations, and the increment being added as needed. This can be demonstrated by performance timings, and in particular it can be demonstrated by using for with so many iterations requested that the memory required to store the loop values would exceed available memory.
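
A minimal sketch of the memory argument, assuming a machine that cannot hold 1e12 doubles (about 8 TB). The loop below starts immediately and stops after three iterations, which would not be possible if the range were built as an array first.

```matlab
count = 0;
for k = 1:1e12       % range is stepped through, not stored as a vector
    count = count + 1;
    if k >= 3
        break        % leave early; only three values were ever produced
    end
end
disp(count)          % 3
```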

So now... what about indexing? If you have

A(1:100) = 5

then does it do the equivalent of

 temp13103 = 1:100;
 A(temp13103) = 5;
 clear temp13103

or does it process the range internally without generating the vector? You could possibly tease that out with timing tests.

The model, with subsref() and subsasgn(), is that the vector is actually generated. User-provided subsref() and subsasgn() do not need to process the general A:B:C colon operator, and instead receive an already-instantiated vector or else the literal ':' (which subsref() and subsasgn() supposedly only get passed when the entire dimension is specified as a colon by itself).
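
A short sketch of that model, using the functional form of indexed assignment on an ordinary double array. It shows only the documented interface (substruct and subsasgn) and says nothing about what the execution engine does internally when you write A(1:3) = 5 directly.

```matlab
A = zeros(1, 5);

% A(1:3) = 5 in functional form: the index arrives as a real vector.
S = substruct('()', {1:3});
A = subsasgn(A, S, 5);
disp(S.subs{1})           % values: 1 2 3
disp(A)                   % values: 5 5 5 0 0

% A(:) = 7 passes the colon through as the character ':'.
S = substruct('()', {':'});
A = subsasgn(A, S, 7);
disp(ischar(S.subs{1}))   % 1 (true)
disp(A)                   % values: 7 7 7 7 7
```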

... But the model for how user object classes work, is not necessarily the same as how MATLAB handles the Execution Engine.

A small number of releases ago, MATLAB started keeping hidden copies of "small enough" vectors, so if you wrote

 A = 1:50;
 B = 1:50;
 

then A and B might end up with the same data pointer and the normal reference count might not act the same as before. James (I think it was) showed that if you were to do an in-place operation on A then that could affect B even though they are supposedly not linked.

Exactly how the code was written affected whether sharing could take place. Spacing and comments were important.

I seem to recall 500 bytes being mentioned as the upper bound on when this internal sharing happened.

That leads me to wonder whether now some of that private sharing is going on for vectors used for index operations.

 A(1:50) = 5;
 B(1:50) = 7;
 

Does this involve 1:50 being generated as an actual vector at run-time, twice (once for each of the lines)? Or does the parser now generate 1:50 internally and "private share" it with the 1:50 of the second line, a point that could be important for timing purposes? If so, does the same thing happen for larger index vectors?

To be sure, if I had coded

 A(1:100000) = X;
 B(1:100000) = Y;

I would tend to think it was a Good Thing if MATLAB did not generate an actual index vector twice, either because MATLAB internally recodes it in terms of start and stop position or because it shares the vector. But it becomes important that we know how this does (or does not) work when we try to do timing tests of indexing: we might be timing the wrong thing.

Chunru on 25 Jul 2021


If I were the one to implement indexing such as a(2:2:1000) internally, I would prefer not to generate the index vector 2:2:1000. Instead, I would use something similar to a Python iterator to generate the indices as needed (without taking up a large amount of memory). My bet is that MATLAB does not generate the index vector explicitly here.

a = randn(1,1e6);
timeit(@() indexing1(a))
ans = 0.0012
timeit(@() indexing2(a))
ans = 0.0031

It shows that a(1:5e5) is faster, indicating that the index is likely generated implicitly.

For the code like

A(1:100000) = X;
B(1:100000) = Y;

I guess there is no good reason to share 1:100000 if it has never been generated explicitly.

subsref() and subsasgn() are designed for a quite different purpose, and we don't know whether MATLAB can manage index generation implicitly for them.

function indexing1(a)
    a(1:5e5) = 1;
end
function indexing2(a)
    ind = 1:5e5;
    a(ind) = 1;
end


Answer 2

Chunru on 23 Jul 2021


https://ch.mathworks.com/matlabcentral/answers/884454-what-is-the-difference-between-assigning-with-and-without-range#answer_752324


There should not be a major performance difference between "a=b" and "a(1:N)=b". Assignment with a range allows partial assignment of an array, for example:

a = zeros(100, 1);
b = randn(20, 1);
a(1:20) = b;
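
One more difference that matters for the column vectors in the question, sketched with illustrative values: an indexed assignment keeps the shape of the destination and leaves the elements outside the range alone, while a plain assignment takes the shape of the source.

```matlab
b = [1 2 3];             % row with fewer elements than the destination

a = zeros(5, 1);         % column, as in the question
a(1:3) = b;              % fills the first three entries, a stays 5x1
disp(size(a))            % values: 5 1
disp(a.')                % values: 1 2 3 0 0

c = zeros(5, 1);
c = b;                   % c is replaced and is now a 1x3 row
disp(size(c))            % values: 1 3
```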

2 Comments

John D'Errico on 23 Jul 2021

Edited: John D'Errico on 23 Jul 2021

Really? Not much of a major difference? See my example cases, where there is a factor of 10000 to 1 difference. And even if you force the copy to be resolved, there is STILL a significant difference.

Chunru on 23 Jul 2021

@John D'Errico See my example below, where I have explained the performance difference. A timeit measurement of merely creating a pointer may not truly reflect the difference, so the factor of 10000 to 1 is not a fair comparison in the sense I have explained below. I agree that type conversion will take more time (I did not think of that from the question and assumed the same type here).
