Text Analytics Toolbox

Analyze and model text data

MATLAB code that extracts text data from Microsoft Word documents into a datastore.

Import and Visualize Text

Import text data into MATLAB from single files or large collections of files, including PDF, HTML, and Microsoft® Word files. Visually explore text data sets using word clouds and text scatter plots.

Screenshot of the Preprocess Text Data Live Editor task with results displayed as a word cloud.

Clean and Preprocess Text

Apply high-level filtering functions to remove extraneous content, such as URLs, HTML tags, and punctuation. Correct spelling, filter stop words, and normalize words to root form.
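A minimal sketch of such a cleaning pipeline, assuming Text Analytics Toolbox is installed. The sample strings are made up for illustration, and the order of the steps is one reasonable choice rather than a prescribed recipe.

```matlab
% Hypothetical raw text containing a URL, HTML tags, and punctuation.
str = [
    "Visit https://www.example.com for <b>more</b> details!"
    "The pumps were leaking, and the valves are stuck."];

% Remove URLs and HTML tags from the raw strings.
str = eraseURLs(str);
str = eraseTags(str);

% Tokenize, then tag parts of speech so that lemmatization has more context.
documents = tokenizedDocument(str);
documents = addPartOfSpeechDetails(documents);

% Filter stop words, reduce words to root (lemma) form, and drop punctuation.
documents = removeStopWords(documents);
documents = normalizeWords(documents, 'Style', 'lemma');
documents = erasePunctuation(documents);

disp(documents)
```

Each step returns a new `tokenizedDocument` array, so steps can be added, dropped, or reordered to suit the data.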

MATLAB code for creating a text scatter plot, shown with the resulting t-SNE plot of a word embedding.

Convert Text to Structured Format

Extract linguistic features by using a tokenization algorithm, calculate word frequency statistics to represent text data numerically, and train word embedding models, such as word2vec with the skip-gram architecture.
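As a rough illustration of the numeric representation, the sketch below builds a bag-of-words model and a TF-IDF matrix from a few tokenized documents. It assumes Text Analytics Toolbox is installed; the three short maintenance notes are invented for the example.

```matlab
% Hypothetical, already-cleaned maintenance notes.
documents = tokenizedDocument([
    "pump leaking oil"
    "valve stuck"
    "oil leak near pump"]);

% Count how often each vocabulary word occurs in each document.
bag = bagOfWords(documents);

% List the most frequent words across the collection.
tbl = topkwords(bag, 3);
disp(tbl)

% Term frequency-inverse document frequency matrix (documents x words).
M = tfidf(bag);
disp(size(M))
```

The resulting matrix has one row per document and one column per vocabulary word, which makes it usable as input to standard machine learning functions.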

Workflow for performing transfer learning with the FinBERT transformer model on text data to identify positive and negative sentiment.

Apply AI to Text Analytics

Fit a machine learning or deep learning model to text data, such as latent semantic analysis (LSA), latent Dirichlet allocation (LDA), or a long short-term memory (LSTM) network. Leverage transformer models, such as BERT, FinBERT, and GPT-2, to perform transfer learning with text data.
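For the LDA case, a minimal sketch might look like the following. It assumes Text Analytics Toolbox is installed; the documents and the choice of two topics are illustrative assumptions, and a corpus this small will not produce meaningful topics.

```matlab
% Hypothetical toy corpus; real topic models need far more text.
documents = tokenizedDocument([
    "pump leaking oil seal"
    "oil leak near pump"
    "software update failed"
    "controller software crashed after update"]);

bag = bagOfWords(documents);

% Fit an LDA model; the number of topics is an assumption for this sketch.
numTopics = 2;
mdl = fitlda(bag, numTopics, 'Verbose', 0);

% Highest-probability words for the first topic.
tbl = topkwords(mdl, 3, 1);
disp(tbl)

% Per-document topic mixtures (documents x topics).
topicMixtures = mdl.DocumentTopicProbabilities;
disp(topicMixtures)
```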

Large Language Models

Connect MATLAB to the OpenAI™ Chat Completions API. Leverage the natural language processing capabilities of GPT models within your MATLAB environment, for tasks such as text summarization and chatting.

Illustration of cleaning text data for natural language processing. On the left: word cloud of raw data. On the right: word cloud of cleaned data.

Text Analytics for Engineers

Develop predictive maintenance schedules based on sensor data and text logs. Automate requirement formalization and compliance checking.

Use text analytics to summarize multiple documents into one document.

Document Analysis

Analyze text with topic modeling to discover and visualize underlying patterns, trends, and complex relationships. Summarize documents, extract keywords, and evaluate document importance and similarity.

Word clouds separated into positive and negative words.

Sentiment Analysis

Identify the attitudes and opinions expressed in text data to categorize statements as being positive, neutral, or negative. Build models that can predict sentiment in real time.
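One way to sketch this is with the rule-based VADER scores available in Text Analytics Toolbox. The sample sentences are invented, and the cutoffs of ±0.05 on the compound score are a commonly used convention adopted here as an assumption, not a toolbox default.

```matlab
% Hypothetical statements to categorize.
documents = tokenizedDocument([
    "The new firmware works great and I love it."
    "The device arrived on Tuesday."
    "Terrible battery life, very disappointing."]);

% Compound sentiment score for each document, between -1 and 1.
scores = vaderSentimentScores(documents);

% Map scores to categories using assumed thresholds of +/-0.05.
labels = repmat("neutral", size(scores));
labels(scores > 0.05) = "positive";
labels(scores < -0.05) = "negative";

disp(table(scores, labels))
```

For domain-specific text, a trained classifier may suit better than a general-purpose lexicon.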

Word cloud of generated text from the novel Pride and Prejudice.

Text Generation and Classification

Use deep learning to generate new text based on observed text, and to classify text descriptions into categories using word embeddings.
