How to use memmapfiles safely for inter-process communication?

Question

Igor Dimitrijevic on 6 Jun 2017

https://ch.mathworks.com/matlabcentral/answers/343521-how-to-use-memmapfiles-safely-for-inter-process-communication

Commented: Igor Dimitrijevic on 15 Jun 2017

I am interested in using memory-mapped files to implement inter-process communication between a MATLAB process and a foreign process. Portability (Windows / Linux) is a concern, but my main concern is reliability.

Looking at the example in Share Memory Between Applications, I am surprised the code is that simple. Does the code actually work? The shared byte m.Data(1) controls which process is allowed to access the shared data, but m.Data(1) itself doesn't seem to be protected against data races. To implement this example in C++, one would typically add some synchronization object, either a locking one (mutex, semaphore or condition variable) or a lock-free one (involving some kind of memory barrier). The Boost library provides some good examples of such mechanisms.

Can we use such synchronization objects with MATLAB's memmapfile objects? Or is there some kind of magic that the MATLAB Compiler adds behind the scenes and that makes my concern pointless?

Edit: I am specifically concerned with the compiled code of this example.
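
For readers unfamiliar with the locking style mentioned above (mutex plus condition variable), here is a minimal in-process C++ sketch of a one-slot handoff. The names `Mailbox`, `put`, `take` and `full` are hypothetical. Note the assumption: `std::mutex` and `std::condition_variable` only synchronize threads inside one process, so they stand in here for process-shared primitives (for example those provided by Boost.Interprocess) that a real two-process setup would need.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <utility>

// One-slot mailbox: the producer waits while the slot is full,
// the consumer waits while it is empty.
class Mailbox {
public:
    // Blocks until the slot is empty, then stores msg.
    void put(std::string msg) {
        std::unique_lock<std::mutex> lock(mutex_);
        changed_.wait(lock, [this] { return !full_; });
        assert(!full_);  // guaranteed by the wait above
        slot_ = std::move(msg);
        full_ = true;
        changed_.notify_all();
    }

    // Blocks until the slot is full, then removes and returns the message.
    std::string take() {
        std::unique_lock<std::mutex> lock(mutex_);
        changed_.wait(lock, [this] { return full_; });
        std::string msg = std::move(slot_);
        slot_.clear();
        full_ = false;
        changed_.notify_all();
        return msg;
    }

    // Reports whether a message is currently waiting in the slot.
    bool full() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return full_;
    }

private:
    mutable std::mutex mutex_;
    std::condition_variable changed_;
    std::string slot_;
    bool full_ = false;
};
```

The mutex protects both the flag and the payload, so the consumer can never observe the flag set before the data is complete. That is the guarantee the question is asking about for m.Data(1).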


Answer 1

Philip Borghesani on 8 Jun 2017

https://ch.mathworks.com/matlabcentral/answers/343521-how-to-use-memmapfiles-safely-for-inter-process-communication#answer_270065

On Intel (x86) platforms, I believe this code is safe in MATLAB and will work correctly. Because the communication is token-based and designed to work between only two processes, a simple mechanism is possible. If this were done in C with a memory-mapped file, the only change needed would be to declare the first memory location atomic.
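
To make the "declare the first memory location atomic" remark concrete, here is a minimal C++ sketch of a token-based handoff. It rests on several assumptions: the names `SharedBlock`, `try_send` and `try_receive` and the 255-byte payload size are hypothetical, byte 0 is taken to hold the token (0 means empty, any other value is the message length), and the struct would have to be placed inside the mapped region by the foreign process. The sketch covers only the C++ side and says nothing about how MATLAB orders its own accesses to m.Data(1).

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical layout: one token byte followed by the payload bytes.
constexpr std::size_t kPayloadSize = 255;  // largest length a uint8 token can carry

struct SharedBlock {
    // 0: slot is empty, the writer may fill it.
    // n > 0: slot holds n payload bytes, the reader may consume them.
    std::atomic<std::uint8_t> token{0};
    char payload[kPayloadSize];
};

// Sharing an atomic between processes only makes sense if it is lock-free.
static_assert(ATOMIC_CHAR_LOCK_FREE == 2, "byte-sized atomics must be lock-free");

// Writer side: returns false if the previous message has not been consumed yet.
bool try_send(SharedBlock& blk, const std::string& msg) {
    assert(!msg.empty() && msg.size() <= kPayloadSize);
    if (blk.token.load(std::memory_order_acquire) != 0) {
        return false;
    }
    std::memcpy(blk.payload, msg.data(), msg.size());
    // Release: the payload bytes become visible before the token changes.
    blk.token.store(static_cast<std::uint8_t>(msg.size()), std::memory_order_release);
    return true;
}

// Reader side: returns false if no message is waiting.
bool try_receive(SharedBlock& blk, std::string& out) {
    // Acquire: pairs with the writer's release store.
    const std::uint8_t len = blk.token.load(std::memory_order_acquire);
    if (len == 0) {
        return false;
    }
    out.assign(blk.payload, len);
    // Release: the payload has been copied out before the slot is handed back.
    blk.token.store(0, std::memory_order_release);
    return true;
}
```

The release store and acquire load on the token constrain both the compiler and the processor, which is why no extra lock is needed for one writer and one reader.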

Also note that this is not the most efficient way to do this type of thing. The sleep calls dictate the maximum call throughput, and the polling mechanism is inefficient. Doing this with one or two MEX files and proper inter-process communication would give much better performance.
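
To illustrate why the sleep interval caps throughput, here is a rough C++ sketch of a sleep-based polling loop. The names `poll_until_set` and `max_handoffs_per_second` are hypothetical, and the throughput figure is only an upper bound under the assumption that every handoff waits through at least one sleep.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Polls `token` until it becomes nonzero, sleeping `interval` between checks.
// Returns the number of sleeps performed, or -1 once max_sleeps sleeps
// have elapsed without the token being set.
int poll_until_set(const std::atomic<std::uint8_t>& token,
                   std::chrono::milliseconds interval,
                   int max_sleeps) {
    assert(max_sleeps >= 0);
    for (int sleeps = 0;; ++sleeps) {
        if (token.load(std::memory_order_acquire) != 0) {
            return sleeps;
        }
        if (sleeps == max_sleeps) {
            return -1;
        }
        std::this_thread::sleep_for(interval);
    }
}

// Rough upper bound on handoffs per second, assuming each handoff
// has to wait through at least one sleep of length `interval`.
double max_handoffs_per_second(std::chrono::milliseconds interval) {
    assert(interval.count() > 0);
    return 1000.0 / static_cast<double>(interval.count());
}
```

A change made just after a check is not noticed until the next wake-up, so the worst-case latency is about one full interval regardless of how fast the other side is.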

11 Comments

Igor Dimitrijevic on 13 Jun 2017

I don't see the relevance of the amount of code executed between two reads. If this amount is large, the reads and writes to m.Data(1) are surely very rare compared with the rest of the code. But out-of-order execution can potentially apply to any memory access. So the probability that it affects m.Data(1) in a way that breaks the expected global behavior depends only on the immediate execution context when m.Data(1) is accessed and on the platform's specific rules for out-of-order execution.

As far as I understand, the MATLAB system guarantees only that no instruction reordering is done at compile time when dealing with memmapfile objects. That's a good point, but it is not enough to get proper synchronization. Still, I don't get why, even for a simple 1 producer 1 consumer setup, C/C++ (and other) programmers carefully use synchronization objects to guarantee data consistency, while MATLAB users would just have to say "it just isn't going to happen".

Considering that the scientists at my office do care about the consistency of their data, what would you recommend?

Is there any safe way to use MATLAB memmapfile objects for a simple 1 producer 1 consumer task within the MATLAB language, or do we have to do the actual I/O in C++ via MEX files to guarantee data consistency?

Igor Dimitrijevic on 14 Jun 2017

The confusion, if any, can't be mine: I am not concerned with C/C++ code, which I am familiar with, and if I wanted to ask anything about C/C++ code, I surely would ask somewhere else.

When I mentioned C or C++ code, it was to give some examples of how people deal with synchronization in concurrent programming in other languages, since many concepts in this field are the same no matter the language.

And since Philip Borghesani asked me, I said that by compiled code I meant whatever the MATLAB Compiler generates, NOT what you could obtain by using MATLAB Coder and compiling the generated C++ code yourself.

Philip Borghesani already clarified to me the fact that the MATLAB compiler runs code identically to how it runs in MATLAB. That's an important point to know, but doesn't help much regarding my concern.

He also already explained that no instruction reordering is done at compile time (at least when dealing with memmapfile objects), which is a very good point regarding data-race safety. It is, though, only part of what is needed to get proper synchronization, and the easiest part, by the way. More details can be found by following the link I gave to a related question on Stack Overflow.

Walter Roberson on 14 Jun 2017

If you were to access m.Data(I,J) with vector I or J, then the order of the accesses to the elements is not defined and could differ for large enough arrays (because the pattern of copying out to call the high-performance libraries could be different).

Igor Dimitrijevic on 15 Jun 2017

@Philip Borghesani

I misunderstood your point about the intervening instructions that MATLAB generates. I get it now.

I am in no way an expert in this field, but it seems to me that, regarding data-race issues with out-of-order execution, one should not compare the number of intervening instructions with the size of the execution pipeline or of the instruction decode pipeline. It seems to me that the intervening instructions should be compared with the sizes of the reorder buffer or the reservation station. These can be fairly large: for example, on the Intel Skylake the reorder buffer has 224 uop entries and the reservation station has 97 uop entries.

One should note that, even if "only" one word of data is altered, this could lead to huge data losses: that one word could be an index needed to interpret other data.

Another thing to keep in mind is that either of the two processes can be halted for a short time if the processor that runs it decides to switch to some urgent task. This could potentially increase data loss or corruption.
