Engineering and IT teams use MATLAB to build advanced big data analytics systems, ranging from predictive maintenance and telematics to advanced driver assistance systems and sensor analytics. Teams select MATLAB because it offers essential capabilities not found in business intelligence systems or open-source languages:

Physical-world data: MATLAB has native support for sensor, image, video, telemetry, binary, and other real-time formats. Explore this data using MATLAB MapReduce functionality for Hadoop, and by connecting to ODBC/JDBC databases.

Machine learning, neural networks, statistics, and beyond: MATLAB offers a full set of statistics and machine learning functionality, plus advanced methods such as nonlinear optimization and system identification, and thousands of prebuilt algorithms for image and video processing, financial modeling, and control system design.

High-speed processing of large data sets: MATLAB’s numeric routines scale directly to parallel processing on clusters and in the cloud.

Online and real-time deployment: MATLAB integrates into enterprise systems, clusters, and clouds, and can be targeted to real-time embedded hardware.

"No matter what industry our client is in, and no matter what data they ask us to analyze - text, audio, images, or video - MATLAB code enables us to provide clear results faster."

Dr. G. Subrahamanya VRK Rao, Cognizant

Accessing and Exploring Data

The first step in data analytics is to access the available data so that you can explore patterns and develop deeper insights. From a single integrated environment, MATLAB helps you access data from a wide variety of sources and formats, including:

  • Databases (ODBC and JDBC compliant), data warehouses, and distributed file systems (Hadoop)
  • Financial data servers to access live and historical market data
  • Internet of Things devices
  • OPC servers to access live and historical industrial plant data
  • File I/O including text, spreadsheet, XML, CDF/HDF, image, audio, video, geospatial, and web content

Preprocessing and Data Munging

When working with data from numerous sources and repositories, engineers and scientists need to preprocess and prepare data before developing predictive models. For example, data might have missing values or erroneous values, or it might use different timestamp formats. MATLAB helps you simplify what might otherwise be time-consuming tasks such as:

  • Cleaning data that has errors, outliers, or duplicates
  • Handling missing data by discarding, filtering, or imputing values
  • Removing noise from sensor data with advanced signal processing techniques
  • Merging and time-aligning data with different sample rates
  • Selecting features to reduce high-dimensional data and improve a model's predictive power
  • Extracting and transforming features for dimensionality reduction
  • Performing domain-specific analysis such as signal, image, and video processing
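
Two of the tasks listed above, imputing missing values and time-aligning signals with different sample rates, can be sketched in a few lines. The example below is a minimal illustration written in plain Python rather than MATLAB so that it is self-contained. The function names, the sensor names, and the sample values are all hypothetical and invented for this sketch; it is not how any MATLAB function is implemented.

```python
import bisect

def fill_missing_linear(values):
    """Impute None entries by linear interpolation between known neighbors.

    Gaps at the start or end are filled with the nearest known value.
    """
    known = [i for i, v in enumerate(values) if v is not None]
    if not known:
        raise ValueError("no known values to interpolate from")
    filled = list(values)
    for i, v in enumerate(values):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled[i] = values[right]
        elif right is None:
            filled[i] = values[left]
        else:
            frac = (i - left) / (right - left)
            filled[i] = values[left] + frac * (values[right] - values[left])
    return filled

def align_to_times(times, values, target_times):
    """Resample (times, values) onto target_times by linear interpolation.

    `times` must be sorted ascending; targets outside the sampled range
    are clamped to the first or last value.
    """
    aligned = []
    for t in target_times:
        if t <= times[0]:
            aligned.append(values[0])
        elif t >= times[-1]:
            aligned.append(values[-1])
        else:
            j = bisect.bisect_right(times, t)  # times[j-1] <= t < times[j]
            frac = (t - times[j - 1]) / (times[j] - times[j - 1])
            aligned.append(values[j - 1] + frac * (values[j] - values[j - 1]))
    return aligned

# Hypothetical data: temperature sampled every second with two dropouts,
# pressure sampled every two seconds.
temp_times = [0, 1, 2, 3, 4]
temp = [20.0, None, 22.0, 23.0, None]
press_times = [0, 2, 4]
press = [100.0, 104.0, 102.0]

temp_clean = fill_missing_linear(temp)
press_aligned = align_to_times(press_times, press, temp_times)
```

After this step both signals share the same timestamps, so they can be merged row by row into a single table for modeling. Linear interpolation is only one possible choice; discarding incomplete rows or holding the previous value may suit other data better.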

Developing Predictive Models

Prototype and build predictive models directly from data to forecast and predict the probabilities of future outcomes. You can compare machine learning approaches such as logistic regression, classification trees, support vector machines, and ensemble methods, and use model refinement and reduction tools to create an accurate model that best captures the predictive power of your data. Use flexible tools for processing financial, signal, image, video, and mapping data to create analytics for a variety of fields within the same development environment.
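
The comparison workflow described above (fit several candidate models, score each on held-out data, keep the best) can be sketched as follows. This is a minimal plain-Python illustration under stated assumptions, not MATLAB code and not a production implementation: the single feature (a vibration amplitude), the labels, and all function names are hypothetical, the logistic regression is fitted with simple gradient descent, and the "classification tree" is reduced to a one-split stump.

```python
import math

def fit_logistic(xs, ys, lr=0.5, epochs=2000):
    """One-feature logistic regression fitted by batch gradient descent."""
    mean_x = sum(xs) / len(xs)  # center the feature to help convergence
    w = b = 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * (x - mean_x) + b)))
            grad_w += (p - y) * (x - mean_x)
            grad_b += p - y
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return lambda x: 1 if w * (x - mean_x) + b >= 0 else 0

def fit_stump(xs, ys):
    """One-split classification tree: choose the threshold with the
    fewest training errors, predicting class 1 above it."""
    pts = sorted(set(xs))
    candidates = [(a + b) / 2 for a, b in zip(pts, pts[1:])]

    def training_errors(t):
        return sum((x > t) != bool(y) for x, y in zip(xs, ys))

    best = min(candidates, key=training_errors)
    return lambda x: 1 if x > best else 0

def accuracy(model, xs, ys):
    """Fraction of samples the model labels correctly."""
    return sum(model(x) == y for x, y in zip(xs, ys)) / len(xs)

# Hypothetical data: vibration amplitude -> failure (1) or healthy (0).
train_x = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]
test_x = [0.8, 1.8, 2.7, 3.6]
test_y = [0, 0, 1, 1]

models = {
    "logistic regression": fit_logistic(train_x, train_y),
    "classification stump": fit_stump(train_x, train_y),
}
scores = {name: accuracy(m, test_x, test_y) for name, m in models.items()}
best_model = max(scores, key=scores.get)
```

Scoring on data the models never saw during fitting is the important design choice here: training accuracy alone tends to favor the most flexible model rather than the one that generalizes best.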


Integrating Analytics with Systems

Integrate analytics developed in MATLAB into production IT environments without having to recode or create custom infrastructure. MATLAB analytics can be packaged as deployable components compatible with a wide range of development environments such as Java, Microsoft .NET, Excel, Python, and C/C++. You can share standalone MATLAB applications or run MATLAB analytics as part of web, database, desktop, and enterprise applications. For low-latency, scalable production applications, you can manage MATLAB analytics running as a centralized service that is callable from many diverse applications.