The datastore function creates a datastore, which is a
repository for collections of data that are too large to fit in memory. A
datastore allows you to read and process data stored in multiple files on a
disk, a remote location, or a database as a single entity. If the data is
too large to fit in memory, you can manage the incremental import of data,
create a tall array to work with the data, or use the
datastore as an input to
mapreduce for further
processing. For more information, see Getting Started with Datastore.
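As a minimal sketch of the incremental import described above, the following code writes two small CSV files to a temporary folder and reads them through one datastore. The file names, variable names, and values are illustrative assumptions, not taken from any shipped data set.

```matlab
% Write two small CSV files to a temporary folder (illustrative data only).
folder = tempname;
mkdir(folder);
part1 = table((1:3)', [10; 20; 30], 'VariableNames', {'Id', 'Value'});
part2 = table((4:6)', [40; 50; 60], 'VariableNames', {'Id', 'Value'});
writetable(part1, fullfile(folder, 'part1.csv'));
writetable(part2, fullfile(folder, 'part2.csv'));

% Treat every file in the folder as a single collection.
ds = datastore(folder);

% Import the data incrementally, keeping only a running total in memory.
total = 0;
while hasdata(ds)
    chunk = read(ds);                    % table holding the next portion of data
    total = total + sum(chunk.Value);
end
```

Only the running total stays in memory between reads, so the same loop shape applies when the files are much larger than available memory.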
|datastore|Create datastore for large collections of data|
|tabularTextDatastore|Datastore for tabular text files|
|spreadsheetDatastore|Datastore for spreadsheet files|
|imageDatastore|Datastore for image data|
|parquetDatastore|Datastore for collection of Parquet files|
|fileDatastore|Datastore with custom file reader|
|arrayDatastore|Datastore for in-memory data|
Read and Write from Datastore
Partition and Shuffle Datastore
Combine or Transform Datastores
Develop Custom Datastore
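|matlab.io.Datastore|Base datastore class|
|matlab.io.datastore.Partitionable|Add parallelization support to datastore|
|matlab.io.datastore.HadoopLocationBased|Add Hadoop support to datastore|
|matlab.io.datastore.Shuffleable|Add shuffling support to datastore|
|matlab.io.datastore.DsFileSet|File-set object for collection of files in datastore|
|matlab.io.datastore.DsFileReader|File-reader object for files in a datastore|
|matlab.io.datastore.FileWritable|Add file writing support to datastore|
|matlab.io.datastore.FoldersPropertyProvider|Add Folder property support to datastore|
|matlab.io.datastore.FileSet|File-set for collection of files in datastore|
|matlab.io.datastore.BlockedFileSet|Blocked file-set for collection of blocks within file|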
A datastore is an object for reading a single file or a collection of files or data.
Choose the right datastore based on the file format of your data or application.
This example shows how to create a datastore for a large text file containing tabular data, and then read and process the data one block at a time or one file at a time.
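A small sketch of the two reading modes, assuming a generated 10-row CSV file in place of a genuinely large one. The file and its variable names are hypothetical. The ReadSize property switches between block-wise and file-wise reads.

```matlab
% Create a sample CSV file with 10 rows (illustrative data only).
file = [tempname '.csv'];
sample = table((1:10)', (101:110)', 'VariableNames', {'Id', 'Delay'});
writetable(sample, file);

ds = tabularTextDatastore(file);

% Read one block at a time: each call to read returns at most 4 rows.
ds.ReadSize = 4;
blockHeights = [];
while hasdata(ds)
    T = read(ds);
    blockHeights(end+1) = height(T); %#ok<SAGROW>
end

% Read one file at a time: each call to read returns a complete file.
reset(ds);
ds.ReadSize = 'file';
whole = read(ds);
```

Block-wise reads bound the memory used per iteration, while file-wise reads suit cases where each file is small enough to process in one piece.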
This example shows how to create a datastore for a collection of images, read the image files, and find the images with the maximum average hue, saturation, and brightness (HSV).
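The idea can be sketched with three generated solid-color images instead of a real image collection. The file names and colors below are assumptions chosen for illustration.

```matlab
% Write three small solid-color RGB images to a temporary folder.
folder = tempname;
mkdir(folder);
colors = uint8([200 50 50; 50 100 250; 255 255 255]);   % one RGB triple per row
names  = {'a_red.png', 'b_blue.png', 'c_white.png'};
for k = 1:3
    img = repmat(reshape(colors(k, :), 1, 1, 3), 8, 8);  % 8-by-8-by-3 image
    imwrite(img, fullfile(folder, names{k}));
end

% Create the datastore and compute the average H, S, and V of each image.
imds = imageDatastore(folder);
n = numel(imds.Files);
avgHSV = zeros(n, 3);
for k = 1:n
    hsv = rgb2hsv(readimage(imds, k));        % double values in [0, 1]
    avgHSV(k, :) = mean(reshape(hsv, [], 3), 1);
end

% Index of the image with the maximum average hue, saturation, and brightness.
[~, maxIdx] = max(avgHSV);
```

Each entry of maxIdx indexes into imds.Files, so imds.Files{maxIdx(3)} is the path of the brightest image.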
This example shows how to create a datastore for key-value pair data in a MAT-file that is the output of mapreduce.
This example shows how to create a datastore for a Sequence file containing key-value data.
Work with remote data in Amazon S3™, Azure® Blob Storage, or HDFS™.
Set up a datastore on your machine that can be loaded and processed on another machine or cluster.
Create a fully customized datastore for your custom or proprietary data.
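As a rough sketch of what a custom datastore involves, the hypothetical class below subclasses matlab.io.Datastore and serves an in-memory numeric vector in fixed-size chunks. The class name VectorDatastore and its properties are invented for illustration. A real custom datastore would typically read from files instead. Save the code as VectorDatastore.m on the MATLAB path.

```matlab
classdef VectorDatastore < matlab.io.Datastore
    % Hypothetical example class: serves a numeric vector in fixed-size chunks.

    properties (Access = private)
        Values      % column vector of data to serve
        ChunkSize   % number of elements returned by each read
        Position    % index of the next element to read
    end

    methods
        function ds = VectorDatastore(values, chunkSize)
            ds.Values = values(:);
            ds.ChunkSize = chunkSize;
            reset(ds);
        end

        function tf = hasdata(ds)
            % True while unread elements remain.
            tf = ds.Position <= numel(ds.Values);
        end

        function [data, info] = read(ds)
            % Return the next chunk and advance the read position.
            if ~hasdata(ds)
                error('VectorDatastore:noMoreData', ...
                    'No more data to read. Use reset to start over.');
            end
            last = min(ds.Position + ds.ChunkSize - 1, numel(ds.Values));
            data = ds.Values(ds.Position:last);
            info.Offset = ds.Position;
            info.Size = numel(data);
            ds.Position = last + 1;
        end

        function reset(ds)
            % Return to the start of the data.
            ds.Position = 1;
        end
    end

    methods (Hidden = true)
        function frac = progress(ds)
            % Fraction of the data read so far, between 0 and 1.
            if isempty(ds.Values)
                frac = 1;
            else
                frac = (ds.Position - 1) / numel(ds.Values);
            end
        end
    end
end
```

With the class on the path, `ds = VectorDatastore(1:10, 4)` followed by a `while hasdata(ds)` loop around `read(ds)` steps through the vector, and the inherited readall method collects every chunk in one call.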
This example shows how to develop a custom datastore that supports writing operations.
After implementing your custom datastore, follow this test procedure to qualify it.