Distributed toolbox and NUMA server

Tyler Warner on 22 May 2018
Edited: Chris Hooper on 5 Jun 2022
Hi, I am helping an organization process many files (several terabytes in total). I have set it up so that the script runs a `parfor` loop over the files to be processed. Each worker runs a function that processes the file identified by a file number. There should therefore be little or no communication between workers, because the only thing passed to each one is the file number.
My problem is that the code appears to use only half of the physical cores. We have a NUMA server with two 10-core (20 logical) Intel processors.
I am currently trying different manually set pool sizes (14 to 20 workers) to see whether that makes any difference in the time taken to process a single file. When it runs, starting with parpool(14), Windows Resource Monitor shows NUMA node 1 almost maxed out (both physical and logical cores), while NUMA node 0 sees minimal use (less than 20% in total on average). Do I need to set up a distributed, albeit local, cluster to make both NUMA nodes run?
I'm not simply trying to push every processor to 100%; I would like to process these files in less time.
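For readers who want to reproduce the experiment, here is a minimal sketch of the setup described above: a `parfor` loop that passes only a file number to each worker, timed for several pool sizes. The names `timePoolSizes`, `processFile`, and `numFiles` are hypothetical placeholders, not taken from the original script, and the stub body of `processFile` stands in for the real per-file work.

```matlab
function results = timePoolSizes(poolSizes, numFiles)
% Time the same parfor workload for several pool sizes.
% poolSizes: vector of worker counts to try, e.g. 14:2:20 (range from the question)
% numFiles:  number of files to process per run (assumed value, e.g. 100)
elapsed = zeros(numel(poolSizes), 1);
for k = 1:numel(poolSizes)
    delete(gcp('nocreate'));          % close any pool left open from a previous run
    parpool(poolSizes(k));            % open a pool on the default profile
    t = tic;
    parfor fileNum = 1:numFiles
        processFile(fileNum);         % only the file number is sent to the worker
    end
    elapsed(k) = toc(t);
end
delete(gcp('nocreate'));
results = table(poolSizes(:), elapsed, 'VariableNames', {'Workers', 'Seconds'});
end

function processFile(fileNum)
% Hypothetical stand-in for the real per-file processing.
pause(0.01 * mod(fileNum, 3));        % simulate a small, uneven amount of work
end
```

Calling `timePoolSizes(14:2:20, 100)` returns one row per pool size, which makes it easier to compare elapsed time against what Resource Monitor reports for each NUMA node.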
  1 Comment
Chris Hooper on 5 Jun 2022
Edited: Chris Hooper on 5 Jun 2022
I am using Azure with the AMD EPYC architecture and two NUMA groups, and I am having what sounds like a similar problem: I can only create a pool of 60 workers, even though 120 physical cores are available.
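One thing that may be worth checking in a situation like this is the worker limit stored in the cluster profile, since the pool size cannot exceed it. The sketch below only inspects the default profile. The commented-out lines show how the limit could be raised, on the assumption (not confirmed anywhere in this thread) that the profile limit rather than the operating system is what caps the pool.

```matlab
% Inspect the default cluster profile and its worker limit.
c = parcluster;                        % default profile ('local' or 'Processes', depending on release)
fprintf('Profile: %s\n', c.Profile);
fprintf('NumWorkers allowed by the profile: %d\n', c.NumWorkers);

% Assumption: if the profile limit is the bottleneck, raising it may allow a larger pool.
% c.NumWorkers = 120;                  % 120 is the core count quoted in the comment above
% saveProfile(c);                      % persist the change to the profile
% parpool(c, 120);
```

If `NumWorkers` already matches the physical core count and the pool still stops at half, the limit is probably coming from somewhere other than the profile.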


Answers (0)

