Definitive answer for hyperthreading and the Parallel Computing Toolbox (PCT)?
Assume that the computations are amenable to a parfor loop and are compute-bound rather than memory-bound.
Given my budget, I could buy a 4/8 (4 physical cores / 8 virtual cores with hyperthreading) i7 CPU, a slower 6/12 Xeon, or an 8-core AMD without hyperthreading (which tests suggest is much inferior to the i7 in various non-MATLAB benchmarks). If hyperthreaded cores count the same as physical cores in this context, then the i7 would seem to be the way to go.
From what I've read, I can't figure out what the actual story is regarding the extent to which PCT does or doesn't use hyperthreading. If anyone has an answer, i.e. a roadmap for determining which algorithms would and wouldn't benefit, I'd like to see it.
Frankly, I'd love it if MathWorks would publish a comprehensive benchmark across the various CPUs and the various PCT parallelizations, with appropriate demo algorithms.
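In the absence of a published benchmark, you can measure this on your own machine. The sketch below (hypothetical pool sizes and problem sizes; adjust to your CPU) times a compute-bound parfor loop at different local pool sizes, so you can see directly whether workers beyond the physical core count help:

```matlab
% Sketch: time a compute-bound parfor loop at different pool sizes.
% poolSizes is an assumption -- e.g. [4 8] for a 4-core/8-thread i7.
poolSizes = [2 4 8];
N = 200;                          % matrix size per iteration (arbitrary)
for p = poolSizes
    delete(gcp('nocreate'));      % shut down any existing pool
    parpool('local', p);
    tic
    parfor k = 1:64
        A = rand(N);
        e = eig(A * A');          % compute-bound work per iteration
    end
    fprintf('pool size %d: %.2f s\n', p, toc);
end
delete(gcp('nocreate'));
```

If the timing stops improving once the pool size exceeds your physical core count, hyperthreading is buying you nothing for that workload.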
Walter Roberson on 25 Jun 2013
Hyperthreading causes two processes to share the same CPU core, with fast context switching between them. No additional computational resources are made available in this mode, so if both processes want access to the core, the two will contend for it and, on long-term average, each will probably get a little under half the work done that they would have with dedicated cores.
The circumstances under which this kind of sharing is a gain are those in which a core would otherwise sit idle waiting for a resource while the second process has all the resources it needs available. "Waiting for a resource" often means waiting for I/O to finish; it might also include waiting for an interrupt to occur (I do not know whether it is implemented to do this). On systems where memory is shared among a pool of CPUs, or better yet on integrated clusters with shared memory, waiting for a resource could include waiting for memory to become available from a different CPU.
A sample situation in which there would be a benefit is when one thread is an interrupt handler (e.g., DAQ or GPIB input or output) and the other thread is compute-bound. When the interrupt becomes ready, the core can switch quickly to the second process, service it briefly, and switch back to computing. You would probably do even better devoting a complete core to each thread, but at a higher cost: full context switches add up when two threads are serviced by one core splitting the load. To invent figures, with hyperthreading a single core might spend 96% of its time computing, 3% on I/O, and 1% on switching waste, whereas without hyperthreading the switching cost could be (say) 14%, leaving 83% compute, 3% I/O, and 14% switching waste.
Now, if you are not doing that kind of work on all cores -- if most of your cores are compute-bound rather than waiting for I/O or memory access -- hyperthreading on those cores is not useful and will slow progress; if I understand correctly, leaving hyperthreading on with an inappropriate job mix will slow down computations.
It is difficult to create the kind of guide you mention, because the benefits depend on what else is happening. Generally speaking, file operations, imread(), video decompression and decoding, video encoding, serial, and DAQ can benefit -- but they might not benefit enough to be worthwhile if you have heavy computations. If you want to do video encoding and decoding, an i3 processor can be a better choice than an i7, as the i3 has that capability built in (H.264 is what comes to mind). I have forgotten whether the i5 has the encode/decode hardware; my memory says it is available either way.
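The "compute-bound cores don't benefit" claim is easy to check empirically for MATLAB's implicit multithreading. This sketch (the core counts are assumptions; substitute your machine's) times a multithreaded BLAS operation with the computational thread count limited to physical cores versus allowed onto all logical cores:

```matlab
% Sketch: compare a compute-bound multithreaded operation at the physical
% core count versus the logical (hyperthreaded) core count.
physicalCores = 4;                 % assumed; use your CPU's values
logicalCores  = 8;
origN = maxNumCompThreads;         % remember the original setting
A = rand(3000);
for n = [physicalCores logicalCores]
    maxNumCompThreads(n);
    tic
    B = A * A;                     % compute-bound, implicitly multithreaded
    fprintf('%d threads: %.2f s\n', n, toc);
end
maxNumCompThreads(origN);          % restore the original setting
```

If the 8-thread timing is no better (or is worse) than the 4-thread timing, that matches the behavior described above for a purely compute-bound job mix.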
Edric Ellis on 25 Jun 2013
Edited: Edric Ellis on 13 Oct 2014
By default, MATLAB and Parallel Computing Toolbox consider only real cores, not hyperthreaded cores. You can override this choice in MATLAB using the maxNumCompThreads function. You can override this choice in Parallel Computing Toolbox by modifying the 'local' cluster configuration in the cluster profile manager (you can run up to a maximum of 12 local workers in R2013b and earlier; more in R2014a and later).
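Both overrides described above can be sketched as follows. The worker count of 8 is just an example for a 4-core/8-thread machine; pick the value appropriate to your CPU (and note that R2013b and earlier cap the local cluster at 12 workers):

```matlab
% 1) Implicit multithreading in MATLAB itself:
maxNumCompThreads(8);     % allow up to 8 computational threads

% 2) Parallel Computing Toolbox local workers:
c = parcluster('local');  % get the 'local' cluster profile
c.NumWorkers = 8;         % allow one worker per logical core
saveProfile(c);           % persist the change in the profile
parpool(c, 8);            % open a pool with 8 workers
```

The same NumWorkers change can also be made interactively in the cluster profile manager, as described above.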
Whether hyperthreading provides any benefit depends on the nature of your algorithm. The reason the default is not to consider hyperthreading is that it was found generally not to be beneficial for most numerically intensive workloads.
There isn't currently a single all-encompassing benchmark, but there is a series of benchmarks of Parallel Computing Toolbox functionality here.