Import Large Data from MongoDB

This example shows how to import a large set of flight data from a MongoDB® collection into the MATLAB® workspace using the Database Toolbox™ interface for MongoDB. To avoid out-of-memory issues with the Java® heap when retrieving many documents, use a loop to import large data in batches.

To run this example, you must first install the Database Toolbox interface for MongoDB. For details, see Database Toolbox Interface for MongoDB Installation.

Connect to MongoDB

Create a MongoDB connection to the database mongotest. Here, the database server dbtb01 hosts this database using port number 27017.

server = "dbtb01";
port = 27017;
dbname = "mongotest";
conn = mongo(server,port,dbname)
conn = 

  mongo with properties:

               Database: 'mongotest'
               UserName: ''
                 Server: {'dbtb01'}
                   Port: 27017
        CollectionNames: {'airlinesmall', 'employee', 'largedata' ... and 3 more}
         TotalDocuments: 23485919

conn is the mongo object that contains the MongoDB connection. The object properties contain information about the connection and the database.

  • The database name is mongotest.

  • The user name is blank.

  • The database server is dbtb01.

  • The port number is 27017.

  • This database contains six document collections. The first three collection names are airlinesmall, employee, and largedata.

  • This database contains 23,485,919 documents.

Verify the MongoDB connection.

isopen(conn)
ans = 

  logical

   1

The isopen function returns 1, so the database connection is open. A return value of 0 would mean that the connection is closed.

Determine Number of Documents to Import

Find the total number of documents totaldocs in the airlinesmall collection for the years 1997 through 2010. Use a MongoDB query to filter the flight data for the specified years.

collection = "airlinesmall";
mongoquery = '{"Year":{$gte:1997,$lte:2010}}';
totaldocs = count(conn,collection,'Query',mongoquery);
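If the year range changes often, one option is to build the query string from variables instead of editing it by hand. The following is a minimal sketch; the variable names startyear and endyear are hypothetical and do not appear elsewhere in this example.

```matlab
% Hypothetical variables holding the year range
startyear = 1997;
endyear = 2010;

% Build the same MongoDB query string used above
mongoquery = sprintf('{"Year":{$gte:%d,$lte:%d}}',startyear,endyear);
```

The resulting character vector can be passed to count and find through the 'Query' name-value argument in the same way as the literal query.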

Retrieve Large Data in Batches

Set the batch size to 15,000 documents. Initialize the MATLAB workspace variable flightdata, which stores the retrieved data.

batchsize = 15000;
flightdata = [];

You can change the batch size depending on the performance and memory capacity of your system.
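To see how a batch size translates into round trips to the server, you can compute the number of batches ahead of time. This sketch assumes totaldocs equals 75,603, the number of structures this example reports after the import; in practice, use the value returned by count. The names numbatches, offsets, and lastbatch are illustrative only.

```matlab
% Assumed value, taken from the size of flightdata reported in this example
totaldocs = 75603;
batchsize = 15000;

% Number of batches needed to cover all documents
numbatches = ceil(totaldocs/batchsize);      % 6

% Number of documents to skip at the start of each batch
offsets = 0:batchsize:totaldocs-1;

% The final batch is usually smaller than batchsize
lastbatch = totaldocs - offsets(end);        % 603
```

A smaller batch size lowers peak memory use per retrieval but increases the number of queries.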

Use a while loop to retrieve flight data from the collection. The variable flightdata accumulates each batch of retrieved data.

% Track number of documents read
index = 0;

while index < totaldocs
    % Retrieve documents in a batch
    localdata = find(conn,collection,'Query',mongoquery, ...
        'Skip',index,'Limit',batchsize);
    % Store retrieved documents locally
    flightdata = [flightdata; localdata];
    % Move to the next batch
    index = index + batchsize;
end
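Growing flightdata inside the loop copies the array on every iteration. A possible alternative, sketched here under stated assumptions, is to collect each batch in a preallocated cell array and concatenate once at the end. So that the sketch runs without a server, the function handle fetchbatch is a hypothetical stand-in that slices a small local structure array; the commented lines show how it might wrap find, assuming the 'Skip' and 'Limit' name-value arguments and that every batch returns a structure array with the same fields.

```matlab
% Stand-in data: seven fake documents with a single Year field
alldocs = struct('Year',num2cell((1997:2003)'));
fetchbatch = @(skip,limit) alldocs(skip+1:min(skip+limit,numel(alldocs)));
% With a live connection, the handle might instead be:
% fetchbatch = @(skip,limit) find(conn,collection,'Query',mongoquery, ...
%     'Skip',skip,'Limit',limit);

totaldocs = numel(alldocs);
batchsize = 3;
offsets = 0:batchsize:totaldocs-1;

% Preallocate one cell per batch
batches = cell(numel(offsets),1);
for k = 1:numel(offsets)
    batches{k} = fetchbatch(offsets(k),batchsize);
end

% Concatenate all batches in a single step
flightdata = vertcat(batches{:});
```

Whether this is faster than the loop above depends on the document size and the number of batches, so treat it as something to measure, not a guaranteed improvement.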

Display information about the flightdata variable. The retrieved data is a structure array that contains 75,603 structures. Each structure contains 30 fields of flight data.

whos flightdata
  Name                Size                Bytes  Class     Attributes

  flightdata      75603x1             285102752  struct  
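For analysis, a structure array is often easier to work with as a table. The sketch below converts a small stand-in structure array with struct2table. The three sample documents are made up for illustration, and the field name ArrDelay is an assumption about the collection; only Year is confirmed by the query in this example.

```matlab
% Made-up stand-in for a few retrieved documents
sample = struct('Year',{1997;2003;2010},'ArrDelay',{8;-3;15});

% Convert the 3-by-1 structure array to a table with one row per document
T = struct2table(sample);

% Table variables can then be used directly in computations
latest = max(T.Year);
```

If some documents have missing or nonscalar values in a field, struct2table may return that variable as a cell array, which needs cleaning before numeric operations.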

Close MongoDB Connection

close(conn)
