Thank you for your suggestions, Matt J and Walter Roberson. I think I need to detail my question. I was actually going to convert my codes to a parallel version running on multiple computers so that the the efficiency could be dramatically enhanced.
My codes are about solving a pair of differential equations (the Brusselator model), in other words, 2D gird-simulation in high resolution.
    % Simulate 2D Brusselator model by Euler algorithm
    clear;clc;close all
    a = 5;
    b = 19;
    Dx = 5;
    Dy = 40;
    time_end = 150;    % length of simulation, in sec
    %%dimensions for the full-resolution grid
    [Nx Ny] = deal(60);  % no. of sampling points along each axis
    [Lx Ly] = deal(60);        % square cortex (cm)
    [dx dy] = deal(Lx/Nx, Ly/Ny);  % spatial resolution (cm)
    %%time resolution and time-base
    dt = 1*1e-3;      % Nx = Ny =  60; D2 = 1.0 cm^2   
    Nsteps = round(time_end/dt)
    %%3x3 Laplacian matrix (used in grid convolution calculations)
    Laplacian = [0 1 0; 1 -4 1; 0 1 0];
    %%set up storage vectors and grids
    [U_grid  V_grid]               = deal(0.001*randn(Nx, Ny));
    % save UV_inis U_grid  V_grid
    U0 = a; V0 = b/a;
    % initialize the grids at steady-state values
    [U_grid  V_grid]             = deal(U0 + U_grid, V0 + V_grid);
    % diffusion multipliers (depend on step size)
    Dx = Dx/dx^2;
    Dy = Dy/dx^2;
    %%Simulation
    stride2 = 100;          % iterations per screen update
    time = [0:Nsteps-1]'*dt;  % timebase
    ii = 1;
    for i = 1: Nsteps
        i
        U_grid = U_grid + dt*(a-(b+1)*U_grid + U_grid.^2.*V_grid + ...
            Dx*convolve2(U_grid, Laplacian, 'wrap')); 
        V_grid = V_grid + dt*(b*U_grid - U_grid.^2.*V_grid + ...
            Dy*convolve2(V_grid, Laplacian, 'wrap'));
        if (mod(i, stride2) == 1 || i == Nsteps)        
            mesh(x, y, U_grid);
            drawnow;
            U_save(:,:, ii) = U_grid;
            ii = ii + 1;
        end
    end
In the codes, "U_save" is the final results I want to have. If the "Nx" and "Ny" are set to 300, meaning high resolution, the computing speed will be very slow.
I have Parallel Computing Toolbox and Distributed Computing Toolbox. May I have your suggestions for an efficient calculation? Many thanks



