Parpool Fail 2015a HPC
Hi,
We have recently purchased an HPC machine for simulation research. It has 64 cores (4 AMD Opteron CPUs) and 256 GB of RAM. The OS is CentOS 6 Linux.
We have installed MATLAB R2015a 64-bit on the machine.
I am trying to open 64 parallel workers with the parpool(64) command, but it fails with an error, and even the cluster profile cannot be validated. With 25 workers it works, but any number above 25 produces the error. The error output is attached. I would really appreciate any help with this. It is worth mentioning that on the same machine under Windows 10 I can use all 32 cores and 32 hyperthreads (Windows only detects 2 physical CPUs), so in Windows I can open up to 64 workers.
Stage: SPMD job test (createCommunicatingJob)
Status: Failed
Description:The job errored or did not reach state finished.
Command Line Output:(none)
Error Report:(none)
Debug Log:
LOG FILE OUTPUT:
[14] < M A T L A B (R) >
[14] Copyright 1984-2015 The MathWorks, Inc.
[14] R2015a (8.5.0.197613) 64-bit (glnxa64)
[14] February 12, 2015
[21] < M A T L A B (R) >
[21] Copyright 1984-2015 The MathWorks, Inc.
[21] R2015a (8.5.0.197613) 64-bit (glnxa64)
[21] February 12, 2015
[6] < M A T L A B (R) >
[6] Copyright 1984-2015 The MathWorks, Inc.
[6] R2015a (8.5.0.197613) 64-bit (glnxa64)
[6] February 12, 2015
[16] < M A T L A B (R) >
[16] Copyright 1984-2015 The MathWorks, Inc.
[16] R2015a (8.5.0.197613) 64-bit (glnxa64)
[16] February 12, 2015
[30] < M A T L A B (R) >
[30] Copyright 1984-2015 The MathWorks, Inc.
[30] R2015a (8.5.0.197613) 64-bit (glnxa64)
[30] February 12, 2015
[7] < M A T L A B (R) >
[7] Copyright 1984-2015 The MathWorks, Inc.
[7] R2015a (8.5.0.197613) 64-bit (glnxa64)
[7] February 12, 2015
[24] < M A T L A B (R) >
[24] Copyright 1984-2015 The MathWorks, Inc.
[24] R2015a (8.5.0.197613) 64-bit (glnxa64)
[24] February 12, 2015
[2] < M A T L A B (R) >
[2] Copyright 1984-2015 The MathWorks, Inc.
[2] R2015a (8.5.0.197613) 64-bit (glnxa64)
[2] February 12, 2015
[12] < M A T L A B (R) >
[12] Copyright 1984-2015 The MathWorks, Inc.
[12] R2015a (8.5.0.197613) 64-bit (glnxa64)
[12] February 12, 2015
[0] < M A T L A B (R) >
[0] Copyright 1984-2015 The MathWorks, Inc.
[0] R2015a (8.5.0.197613) 64-bit (glnxa64)
[0] February 12, 2015
[20] < M A T L A B (R) >
[20] Copyright 1984-2015 The MathWorks, Inc.
[20] R2015a (8.5.0.197613) 64-bit (glnxa64)
[20] February 12, 2015
[31] < M A T L A B (R) >
[31] Copyright 1984-2015 The MathWorks, Inc.
[31] R2015a (8.5.0.197613) 64-bit (glnxa64)
[31] February 12, 2015
[9] < M A T L A B (R) >
[9] Copyright 1984-2015 The MathWorks, Inc.
[9] R2015a (8.5.0.197613) 64-bit (glnxa64)
[9] February 12, 2015
[10] < M A T L A B (R) >
[10] Copyright 1984-2015 The MathWorks, Inc.
[10] R2015a (8.5.0.197613) 64-bit (glnxa64)
[10] February 12, 2015
[28] < M A T L A B (R) >
[28] Copyright 1984-2015 The MathWorks, Inc.
[28] R2015a (8.5.0.197613) 64-bit (glnxa64)
[28] February 12, 2015
[18] < M A T L A B (R) >
[18] Copyright 1984-2015 The MathWorks, Inc.
[18] R2015a (8.5.0.197613) 64-bit (glnxa64)
[18] February 12, 2015
[13] < M A T L A B (R) >
[13] Copyright 1984-2015 The MathWorks, Inc.
[13] R2015a (8.5.0.197613) 64-bit (glnxa64)
[13] February 12, 2015
[23] < M A T L A B (R) >
[23] Copyright 1984-2015 The MathWorks, Inc.
[23] R2015a (8.5.0.197613) 64-bit (glnxa64)
[23] February 12, 2015
[25] < M A T L A B (R) >
[25] Copyright 1984-2015 The MathWorks, Inc.
[25] R2015a (8.5.0.197613) 64-bit (glnxa64)
[25] February 12, 2015
[15] < M A T L A B (R) >
[15] Copyright 1984-2015 The MathWorks, Inc.
[15] R2015a (8.5.0.197613) 64-bit (glnxa64)
[15] February 12, 2015
[26] < M A T L A B (R) >
[26] Copyright 1984-2015 The MathWorks, Inc.
[26] R2015a (8.5.0.197613) 64-bit (glnxa64)
[26] February 12, 2015
[5] < M A T L A B (R) >
[5] Copyright 1984-2015 The MathWorks, Inc.
[5] R2015a (8.5.0.197613) 64-bit (glnxa64)
[5] February 12, 2015
[11] < M A T L A B (R) >
[11] Copyright 1984-2015 The MathWorks, Inc.
[11] R2015a (8.5.0.197613) 64-bit (glnxa64)
[11] February 12, 2015
[17] < M A T L A B (R) >
[17] Copyright 1984-2015 The MathWorks, Inc.
[17] R2015a (8.5.0.197613) 64-bit (glnxa64)
[17] February 12, 2015
[19] < M A T L A B (R) >
[19] Copyright 1984-2015 The MathWorks, Inc.
[19] R2015a (8.5.0.197613) 64-bit (glnxa64)
[19] February 12, 2015
[27] < M A T L A B (R) >
[27] Copyright 1984-2015 The MathWorks, Inc.
[27] R2015a (8.5.0.197613) 64-bit (glnxa64)
[27] February 12, 2015
[8] < M A T L A B (R) >
[8] Copyright 1984-2015 The MathWorks, Inc.
[8] R2015a (8.5.0.197613) 64-bit (glnxa64)
[8] February 12, 2015
[22] < M A T L A B (R) >
[22] Copyright 1984-2015 The MathWorks, Inc.
[22] R2015a (8.5.0.197613) 64-bit (glnxa64)
[22] February 12, 2015
[29] < M A T L A B (R) >
[29] Copyright 1984-2015 The MathWorks, Inc.
[29] R2015a (8.5.0.197613) 64-bit (glnxa64)
[29] February 12, 2015
[3] < M A T L A B (R) >
[3] Copyright 1984-2015 The MathWorks, Inc.
[3] R2015a (8.5.0.197613) 64-bit (glnxa64)
[3] February 12, 2015
[4] < M A T L A B (R) >
[4] Copyright 1984-2015 The MathWorks, Inc.
[4] R2015a (8.5.0.197613) 64-bit (glnxa64)
[4] February 12, 2015
[1] < M A T L A B (R) >
[1] Copyright 1984-2015 The MathWorks, Inc.
[1] R2015a (8.5.0.197613) 64-bit (glnxa64)
[1] February 12, 2015
[21]
[14]
[6]
[21]To get started, type one of these: helpwin, helpdesk, or demo.
[21]For product information, visit www.mathworks.com.
[21]
[14]To get started, type one of these: helpwin, helpdesk, or demo.
[14]For product information, visit www.mathworks.com.
[14]
[16]
[6]To get started, type one of these: helpwin, helpdesk, or demo.
[6]For product information, visit www.mathworks.com.
[6]
[30]
[2]
[16]To get started, type one of these: helpwin, helpdesk, or demo.
[16]For product information, visit www.mathworks.com.
[16]
[30]To get started, type one of these: helpwin, helpdesk, or demo.
[30]For product information, visit www.mathworks.com.
[30]
[0]
[20]
[9]
[28]
[2]To get started, type one of these: helpwin, helpdesk, or demo.
[2]For product information, visit www.mathworks.com.
[2]
[7]
[31]
[10]
[0]To get started, type one of these: helpwin, helpdesk, or demo.
[0]For product information, visit www.mathworks.com.
[0]
[24]
[25]
[12]
[23]
[9]To get started, type one of these: helpwin, helpdesk, or demo.
[9]For product information, visit www.mathworks.com.
[9]
[26]
[20]To get started, type one of these: helpwin, helpdesk, or demo.
[20]For product information, visit www.mathworks.com.
[20]
[28]To get started, type one of these: helpwin, helpdesk, or demo.
[28]For product information, visit www.mathworks.com.
[28]
[17]
[31]To get started, type one of these: helpwin, helpdesk, or demo.
[31]For product information, visit www.mathworks.com.
[31]
[5]
[13]
[7]To get started, type one of these: helpwin, helpdesk, or demo.
[7]For product information, visit www.mathworks.com.
[7]
[11]
[18]
[10]To get started, type one of these: helpwin, helpdesk, or demo.
[10]For product information, visit www.mathworks.com.
[10]
[19]
[8]
[15]
[25]To get started, type one of these: helpwin, helpdesk, or demo.
[24]To get started, type one of these: helpwin, helpdesk, or demo.
[25]For product information, visit www.mathworks.com.
[25]
[24]For product information, visit www.mathworks.com.
[24]
[12]To get started, type one of these: helpwin, helpdesk, or demo.
[12]For product information, visit www.mathworks.com.
[12]
[29]
[23]To get started, type one of these: helpwin, helpdesk, or demo.
[26]To get started, type one of these: helpwin, helpdesk, or demo.
[23]For product information, visit www.mathworks.com.
[23]
[26]For product information, visit www.mathworks.com.
[26]
[17]To get started, type one of these: helpwin, helpdesk, or demo.
[17]For product information, visit www.mathworks.com.
[17]
[27]
[13]To get started, type one of these: helpwin, helpdesk, or demo.
[13]For product information, visit www.mathworks.com.
[13]
[19]To get started, type one of these: helpwin, helpdesk, or demo.
[19]For product information, visit www.mathworks.com.
[19]
[11]To get started, type one of these: helpwin, helpdesk, or demo.
[11]For product information, visit www.mathworks.com.
[11]
[5]To get started, type one of these: helpwin, helpdesk, or demo.
[5]For product information, visit www.mathworks.com.
[5]
[18]To get started, type one of these: helpwin, helpdesk, or demo.
[3]
[18]For product information, visit www.mathworks.com.
[18]
[22]
[8]To get started, type one of these: helpwin, helpdesk, or demo.
[8]For product information, visit www.mathworks.com.
[8]
[15]To get started, type one of these: helpwin, helpdesk, or demo.
[15]For product information, visit www.mathworks.com.
[15]
[29]To get started, type one of these: helpwin, helpdesk, or demo.
[29]For product information, visit www.mathworks.com.
[29]
[1]
[4]
[27]To get started, type one of these: helpwin, helpdesk, or demo.
[27]For product information, visit www.mathworks.com.
[27]
[3]To get started, type one of these: helpwin, helpdesk, or demo.
[3]For product information, visit www.mathworks.com.
[3]
[22]To get started, type one of these: helpwin, helpdesk, or demo.
[22]For product information, visit www.mathworks.com.
[22]
[1]To get started, type one of these: helpwin, helpdesk, or demo.
[1]For product information, visit www.mathworks.com.
[1]
[4]To get started, type one of these: helpwin, helpdesk, or demo.
[4]For product information, visit www.mathworks.com.
[4]
[14] Academic License
[21] Academic License
[6] Academic License
[14]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[21]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[0] Academic License
[16] Academic License
[14]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core
[14]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core/iSetup
[14]2015-11-07 17:01:29 | This process will exit on any fault.
[21]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core
[21]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core/iSetup
[30] Academic License
[14]2015-11-07 17:01:29 | This process will exit when its parent process dies.
[21]2015-11-07 17:01:29 | This process will exit on any fault.
[14]2015-11-07 17:01:29 | About to initialize MPI.
[21]2015-11-07 17:01:29 | This process will exit when its parent process dies.
[21]2015-11-07 17:01:29 | About to initialize MPI.
[6]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[9] Academic License
[2] Academic License
[13] Academic License
[0]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[28] Academic License
[12] Academic License
[31] Academic License
[10] Academic License
[6]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core
[6]2015-11-07 17:01:29 | Enter distcomp_evaluate_filetask_core/iSetup
[6]2015-11-07 17:01:29 | This process will exit on any fault.
[29] Academic License
[6]2015-11-07 17:01:29 | This process will exit when its parent process dies.
[16]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[26] Academic License
[6]2015-11-07 17:01:30 | About to initialize MPI.
[19] Academic License
[30]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[5] Academic License
[0]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[0]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[9]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[0]2015-11-07 17:01:30 | This process will exit on any fault.
[0]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[2]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[25] Academic License
[0]2015-11-07 17:01:30 | About to initialize MPI.
[7] Academic License
[13]2015-11-07 17:01:29 | About to evaluate task with DistcompEvaluateFileTask
[20] Academic License
[16]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[16]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[8] Academic License
[18] Academic License
[16]2015-11-07 17:01:30 | This process will exit on any fault.
[30]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[28]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[31]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[30]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[24] Academic License
[12]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[16]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[30]2015-11-07 17:01:30 | This process will exit on any fault.
[16]2015-11-07 17:01:30 | About to initialize MPI.
[23] Academic License
[10]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[9]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[15] Academic License
[30]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[9]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[17] Academic License
[29]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[9]2015-11-07 17:01:30 | This process will exit on any fault.
[30]2015-11-07 17:01:30 | About to initialize MPI.
[9]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[2]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[2]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[9]2015-11-07 17:01:30 | About to initialize MPI.
[2]2015-11-07 17:01:30 | This process will exit on any fault.
[26]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[19]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[2]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[11] Academic License
[13]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[13]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[2]2015-11-07 17:01:30 | About to initialize MPI.
[5]2015-11-07 17:01:30 | About to evaluate task with DistcompEvaluateFileTask
[13]2015-11-07 17:01:30 | This process will exit on any fault.
[28]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[28]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[31]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[31]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[28]2015-11-07 17:01:30 | This process will exit on any fault.
[12]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[13]2015-11-07 17:01:30 | Unexpected error setting up process monitor. Error returned:
[13]Unexpected Standard exception from MEX file.
[13]What() is:boost::thread_resource_error
[13]..
[13]Error in distcomp_evaluate_filetask_core>iSetupProcessMonitoringThreads (line 622)
[13] dct_psfcns('pidwatch', pidToWatch)
[13]Error in distcomp_evaluate_filetask_core>iMaybeSetupProcessMonitoringThreads (line 256)
[13] iSetupProcessMonitoringThreads;
[13]Error in distcomp_evaluate_filetask_core>iSetup (line 506)
[13]iMaybeSetupProcessMonitoringThreads();
[13]Error in distcomp_evaluate_filetask_core (line 25)
[13] runprop = iSetup(handlers, mdceDebugEnabled, outputWriterStack, isSyncTaskEvaluation, varargin);
[12]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[13]2015-11-07 17:01:30 | About to exit with code: 1
[31]2015-11-07 17:01:30 | This process will exit on any fault.
[28]2015-11-07 17:01:30 | This process will exit when its parent process dies.
[12]2015-11-07 17:01:30 | This process will exit on any fault.
[10]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core
[10]2015-11-07 17:01:30 | Enter distcomp_evaluate_filetask_core/iSetup
[28]2015-11-07 17:01:30 | About to initialize MPI.
[31]2015-11-07 17:01:30 | This process will exit when its parent process dies.
job aborted:
rank: node: exit code[: error message]
0: 127.0.0.1: -2
1: 127.0.0.1: -2
2: 127.0.0.1: -2
3: 127.0.0.1: -2
4: 127.0.0.1: -2
5: 127.0.0.1: -2
6: 127.0.0.1: -2
7: 127.0.0.1: -2
8: 127.0.0.1: -2
9: 127.0.0.1: -2
10: 127.0.0.1: -2
11: 127.0.0.1: -2
12: 127.0.0.1: -2
13: 127.0.0.1: -2: process 13 exited without calling init while other processes have called init
14: 127.0.0.1: -2
15: 127.0.0.1: -2
16: 127.0.0.1: -2
17: 127.0.0.1: -2
18: 127.0.0.1: -2
19: 127.0.0.1: -2
20: 127.0.0.1: -2
21: 127.0.0.1: -2
22: 127.0.0.1: -2
23: 127.0.0.1: -2
24: 127.0.0.1: -2
25: 127.0.0.1: -2
26: 127.0.0.1: -2
27: 127.0.0.1: -2
28: 127.0.0.1: -2
29: 127.0.0.1: -2
30: 127.0.0.1: -2
31: 127.0.0.1: -2
Stage: Pool job test (createCommunicatingJob)
Status: Skipped
Description:Validation skipped due to previous failure.
Command Line Output:(none)
Error Report:(none)
Debug Log:(none)
Stage: Parallel pool test (parpool)
Status: Skipped
Description:Validation skipped due to previous failure.
Command Line Output:(none)
Error Report:(none)
Debug Log:(none)
Answers (1)
Edric Ellis
on 9 Nov 2015
This looks like your machine ran out of resources while trying to start up the workers. Do you have any ulimit in effect?
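For anyone wanting to check this, here is a rough sketch of how the per-user process limit could be inspected from the shell that launches MATLAB. The `boost::thread_resource_error` in the log is the kind of error raised when a new thread cannot be created, and on Linux threads count against the "max user processes" limit, so that is the first value worth looking at. The helper name `nproc_limit` is made up for this example, the `90-nproc.conf` path is where I recall stock CentOS 6 installs setting a default soft limit (it may differ on your system), and the value 4096 in the comment is purely illustrative.

```shell
# Print the per-user process/thread limit; $1 is -S (soft) or -H (hard).
# bash spells the option -u; the fallback to -p covers shells such as dash.
nproc_limit() {
    ulimit "$1" -u 2>/dev/null || ulimit "$1" -p
}

echo "max user processes: soft=$(nproc_limit -S) hard=$(nproc_limit -H)"

# Assumed location of the distribution default on CentOS 6; skip if absent.
conf=/etc/security/limits.d/90-nproc.conf
if [ -r "$conf" ]; then
    echo "defaults from $conf:"
    cat "$conf"
fi

# To experiment, the soft limit can be raised (up to the hard limit) in the
# same shell before starting MATLAB, for example in bash:
#   ulimit -S -u 4096    # illustrative value, not a recommendation
```

Limits are inherited by child processes, so the check only means something when run in the same shell (or job script) that starts MATLAB.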
4 Comments
Darwin
on 17 Oct 2016
I manage MATLAB on Linux HPC machines and can use a number of workers equal to the number of cores on one node with parpool. Hyperthreading does not work right under CentOS.
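To see how many physical cores a node reports, as opposed to logical CPUs, something along these lines might help. It is only a sketch: `physical_cores` is a hypothetical helper that multiplies the `Socket(s)` and `Core(s) per socket` fields of `lscpu` output, and it assumes `lscpu` (util-linux) and `nproc` (coreutils) are installed and that the labels are printed in English.

```shell
# Multiply "Socket(s)" by "Core(s) per socket" from lscpu-style text on stdin.
physical_cores() {
    awk -F: '
        /^Socket\(s\)/          { sockets = $2 }
        /^Core\(s\) per socket/ { cores = $2 }
        END                     { print sockets * cores }
    '
}

# Typical use on the node itself; skipped if the tools are not installed.
if command -v lscpu >/dev/null 2>&1 && command -v nproc >/dev/null 2>&1; then
    echo "logical CPUs:   $(nproc)"
    echo "physical cores: $(LC_ALL=C lscpu | physical_cores)"
fi
```

If the two numbers differ, the smaller one is the worker count that matches the advice above when calling parpool.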