Questions about building a computer cluster and MATLAB requirements

We decided in our department (biology/physics) to build a computer cluster to run MATLAB: six fairly good PCs, one 6-core machine as a central 'server' and five quad-cores. So I have a few questions regarding MATLAB.
The first is: how does the distributed computing toolbox work? Do we need just one license for MATLAB and the other needed toolboxes? (That is how it sounds, but I just want to make sure.)
How will we be able to send jobs to the workers? Only from the PC with MATLAB installed, or from any of the PCs in the cluster? Controlling everything from the central server alone would be quite inconvenient.
Now a question about the number of workers we need. I know that if we have fully parallel code written with the Parallel Computing Toolbox, we need one worker per core (not per CPU). The very important question is this: if we just want to send batch jobs without parallelizing the code, relying only on the implicit multithreading that MATLAB is capable of, will one worker take advantage of all the cores on the CPU or not? In short, can each worker act as a normal full MATLAB with regard to implicit multithreading?
And a question about the operating system. Does each PC in the cluster need the same operating system, or can they differ (Linux + Windows)? And in the case of Windows 7, which edition is needed (Home, Professional, etc.), since the editions differ with regard to networking?
I think these are all the questions for now. Thank you in advance.

Answers (1)

Konrad Malkowski on 18 Apr 2011
Hi Sbiera,
Some of these answers depend on your intended software setup. I assume here that you are going to use Parallel Computing Toolbox (PCT) + MATLAB Distributed Computing Server (MDCS) with the MathWorks Job Manager. Other configurations could change the answers to some of these questions.
Q1) Do we need just one license for MATLAB and the other needed toolboxes?
You will need an MDCS license for each worker in the cluster. For example, if you have 16 cores, and want 16 workers, then you will need 16 MDCS licenses.
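As a rough worked example for the hardware described in the question (one 6-core machine plus five quad-cores), the count can be sketched as below. This assumes one worker per core on every machine, including the central one; the variable names are purely illustrative.

```matlab
% Hardware from the question: one 6-core machine and five quad-core machines.
coresPerHost = [6 4 4 4 4 4];

% Assumption: one worker per core on every host, including the central one.
totalCores = sum(coresPerHost);
workerLicenses = totalCores;   % one MDCS license per worker

fprintf('Cores: %d, worker licenses needed: %d\n', totalCores, workerLicenses);
```

Under that assumption the cluster has 26 cores and would need 26 worker licenses. Running fewer workers than cores reduces the license count accordingly.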
Each of the client MATLABs (the MATLAB installations on the user machines) will need to have Parallel Computing Toolbox installed in order to connect to the MDCS cluster. In general, any code that you can run on the client MATLAB, you will be able to run on the MDCS cluster.
Q2) How will we be able to send jobs to the workers?
The ideal setup is as follows. A user starts MATLAB on their client machine and submits a job to the cluster. The job runs on the cluster, completes, and the results become available to the user in the client MATLAB (this is, of course, a simplified description).
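From the client side, that workflow might look roughly like the sketch below. It assumes the client's default cluster profile has already been configured to point at your cluster, and it uses the `parcluster` interface found in newer releases; older releases used a different interface (`findResource`). The submitted function, `rand`, is just a stand-in for your own code.

```matlab
% Assumption: the default profile on this client points at the cluster.
c = parcluster();

% Submit a job that evaluates rand(3) on one worker and returns 1 output.
j = batch(c, @rand, 1, {3});

% Block until the job has finished, then collect the results on the client.
wait(j);
out = fetchOutputs(j);   % cell array with one entry per requested output
A = out{1};

% Remove the finished job from the cluster's job storage.
delete(j);
```

Any client machine with Parallel Computing Toolbox and a suitable profile can submit jobs this way, so nothing has to be driven from the central machine itself.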
Q3) How many workers do we need?
That is up to you. Ideally, one per core, or at minimum one per CPU. Keep in mind that MDCS workers are always single-threaded.
Q4) Can each worker act as a normal full MATLAB in regards to implicit multi-threading?
No, each MATLAB worker is single-threaded (when you are using MDCS + PCT).
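One way to check this on your own setup is to ask a worker how many computational threads it is allowed to use. A minimal sketch, again assuming the client's default cluster profile points at the cluster and that the `parcluster` interface from newer releases is available:

```matlab
c = parcluster();                          % default profile (assumption)

% Run maxNumCompThreads on a worker; with no inputs it returns the
% current limit on computational threads in that MATLAB session.
j = batch(c, @maxNumCompThreads, 1, {});
wait(j);
out = fetchOutputs(j);
fprintf('Computational threads available to the worker: %d\n', out{1});

delete(j);
```

Comparing that number with the result of calling `maxNumCompThreads` in a desktop MATLAB session on the same machine shows how a worker differs from a full MATLAB in this respect.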
Q5) Do the operating systems have to be the same?
If you are using MDCS + PCT and the MathWorks Job Manager scheduler, then you can mix and match operating systems. If you are using a third-party scheduler, then all of the machines in the cluster have to run the same OS.
  2 Comments
Jason Ross on 18 Apr 2011
I have one add-on recommendation for Q5. Although you can mix and match operating systems in a cluster, I would recommend that you pick the same one for all the hosts. As for which operating system, I'd pick the one that you (or your IT people/department) are most familiar with.
The reasons for these recommendations are as follows:
- You can re-use existing OS deployment tools.
- You can re-use patching tools.
- You can re-use administration tools.
- You can re-use monitoring tools.
- You can re-use institutional knowledge regarding troubleshooting problems with the OS in your environment.
- You will integrate nicely with the existing security model.
- Your institution might have an agreement with Microsoft (or RedHat or Novell or someone else) already that you can leverage for licensing and support.
- If you need to change a setting on one host, you just need to do the same thing five times in the same way rather than figuring out something again.
Jason Ross on 18 Apr 2011
One other recommendation: one worker per core is a starting point. A follow-on recommendation is at least 1 GB of RAM per worker. You might find that more or fewer workers result in faster computation, depending on what you are doing.
Networking, disk speed and file server performance can also play significant roles in the overall speed of computation.
