Text Analytics Toolbox provides algorithms and visualizations for preprocessing, analyzing, and modeling text data. Models created with the toolbox can be used in applications such as sentiment analysis, predictive maintenance, and topic modeling.
Text Analytics Toolbox includes tools for processing raw text from sources such as equipment logs, news feeds, surveys, operator reports, and social media. You can extract text from popular file formats, preprocess raw text, extract individual words, convert text into numerical representations, and build statistical models.
Using machine learning techniques such as latent semantic analysis (LSA), latent Dirichlet allocation (LDA), and word embeddings, you can find clusters and create features from high-dimensional text data sets. Features created with Text Analytics Toolbox can be combined with features from other data sources to build machine learning models that take advantage of textual, numeric, and other types of data.
Import and Visualize Text
Import text data into MATLAB from single files or large collections of files, including PDF, HTML, and Microsoft® Word files. Visually explore text data sets using word clouds and text scatter plots.
Clean and Preprocess Text
Apply high-level filtering functions to remove extraneous content, such as URLs, HTML tags, and punctuation. Correct spelling, filter stop words, and normalize words to root form.
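To make the filtering steps concrete, here is a minimal, language-neutral sketch in Python. It is not the toolbox's API or implementation. The function name `clean_text` and the tiny `STOP_WORDS` set are hypothetical, chosen only for illustration, and the sketch leaves out spelling correction and root-form normalization.

```python
import re

# Hypothetical, deliberately tiny stop-word list for illustration only.
STOP_WORDS = {"a", "an", "and", "the", "is", "at", "for", "of", "to", "in"}

def clean_text(raw):
    """Strip HTML tags, URLs, and punctuation, then lowercase and drop stop words."""
    text = re.sub(r"<[^>]+>", " ", raw)        # remove HTML tags
    text = re.sub(r"https?://\S+", " ", text)  # remove URLs
    text = re.sub(r"[^\w\s]", " ", text)       # remove punctuation
    tokens = text.lower().split()              # split on whitespace
    return [t for t in tokens if t not in STOP_WORDS]

example = "<p>Pump failed at the inlet! See https://example.com/log for details.</p>"
print(clean_text(example))
```

The order matters here: tags are removed before URLs so that a closing tag directly after a link is not swallowed by the URL pattern.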
Convert Text to Structured Format
Extract linguistic features by using a tokenization algorithm, calculate word frequency statistics to represent text data numerically, and train word embedding models such as word2vec with the skip-gram architecture.
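As an illustration of representing text numerically with word frequency statistics, the following Python sketch builds a bag-of-words count matrix and applies one common TF-IDF weighting, count times log(N / document frequency). It is a sketch under stated assumptions, not the toolbox's implementation: the function names `bag_of_words` and `tf_idf` are hypothetical, and the toolbox may use a different weighting or normalization.

```python
import math
from collections import Counter

def bag_of_words(docs):
    """Return (vocabulary, count matrix) for a list of tokenized documents."""
    vocab = sorted({tok for doc in docs for tok in doc})
    counts = []
    for doc in docs:
        freq = Counter(doc)
        counts.append([freq[w] for w in vocab])  # one row per document
    return vocab, counts

def tf_idf(counts):
    """Weight raw counts by log(N / document frequency), one common variant."""
    n_docs = len(counts)
    n_words = len(counts[0]) if counts else 0
    # Document frequency: number of documents containing each word.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(n_words)]
    return [[row[j] * math.log(n_docs / df[j]) for j in range(n_words)]
            for row in counts]

docs = [["pump", "failed", "pump"], ["pump", "restarted"]]
vocab, counts = bag_of_words(docs)
weights = tf_idf(counts)
print(vocab)
print(counts)
print(weights)
```

A word that appears in every document, such as "pump" in this example, receives a weight of zero, which is what makes TF-IDF useful for downweighting uninformative terms.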
Apply AI to Text Analytics
Fit machine learning or deep learning models, such as LSA, LDA, or LSTM networks, to text data. Leverage transformer models, such as BERT, FinBERT, and GPT-2, to perform transfer learning with text data.
Large Language Models
Connect MATLAB to the OpenAI™ Chat Completions API. Leverage the natural language processing capabilities of GPT models within your MATLAB environment, for tasks such as text summarization and chatting.
Text Analytics for Engineers
Develop predictive maintenance schedules based on sensor data and text logs. Automate requirement formalization and compliance checking.
Document Analysis
Analyze text with topic modeling to discover and visualize underlying patterns, trends, and complex relationships. Summarize documents, extract keywords, and evaluate document importance and similarity.
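One simple way to evaluate document similarity is cosine similarity between word-count vectors. The Python sketch below illustrates that idea only. The function name `cosine_similarity` and the sample documents are hypothetical, and this is not presented as the method the toolbox uses.

```python
import math
from collections import Counter

def cosine_similarity(tokens_a, tokens_b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    a, b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())  # shared words only
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

doc1 = ["pump", "failed", "pump"]
doc2 = ["pump", "restarted"]
doc3 = ["valve", "stuck"]
print(cosine_similarity(doc1, doc2))  # documents sharing "pump"
print(cosine_similarity(doc1, doc3))  # documents with no words in common
```

Scores range from 0 for documents with no shared words to 1 for documents with identical word proportions, so the measure does not depend on document length.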
Sentiment Analysis
Identify the attitudes and opinions expressed in text data to categorize statements as being positive, neutral, or negative. Build models that can predict sentiment in real time.
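The three-way categorization can be illustrated with a toy lexicon-based scorer in Python. This is a minimal sketch, not the toolbox's approach: the `LEXICON` entries, their scores, and the function name `classify_sentiment` are all hypothetical, and practical systems also handle negation, intensity, and context.

```python
# Hypothetical toy lexicon mapping words to sentiment scores.
LEXICON = {
    "good": 1, "great": 2, "reliable": 1,
    "bad": -1, "terrible": -2, "failed": -1,
}

def classify_sentiment(tokens, threshold=0):
    """Sum per-word scores and map the total to positive, neutral, or negative."""
    score = sum(LEXICON.get(t, 0) for t in tokens)  # unknown words score 0
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"

print(classify_sentiment(["great", "pump"]))
print(classify_sentiment(["pump", "failed"]))
print(classify_sentiment(["pump", "restarted"]))
```

Raising `threshold` widens the neutral band, which can help when many statements carry only weak sentiment.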
Text Generation and Classification
Use deep learning to generate new text based on observed text and to classify text descriptions into categories using word embeddings.