Write word embedding file



writeWordEmbedding(emb,filename) writes the word embedding emb to the file filename. The function writes the vocabulary in UTF-8 in word2vec text format.



Train a word embedding and write it to a text file.

Load the example data. The file sonnetsPreprocessed.txt contains preprocessed versions of Shakespeare's sonnets. The file contains one sonnet per line, with words separated by a space. Extract the text from sonnetsPreprocessed.txt, split the text into documents at newline characters, and then tokenize the documents.

filename = "sonnetsPreprocessed.txt";
str = extractFileText(filename);
textData = split(str,newline);
documents = tokenizedDocument(textData);

Train a word embedding using trainWordEmbedding.

emb = trainWordEmbedding(documents)
Training: 100% Loss: 3.31459  Remaining time: 0 hours 0 minutes.
emb = 
  wordEmbedding with properties:

     Dimension: 100
    Vocabulary: [1x401 string]

Write the word embedding to a text file.

filename = "exampleSonnetsEmbedding.vec";
writeWordEmbedding(emb,filename)

Read the word embedding file using readWordEmbedding.

emb = readWordEmbedding(filename)
emb = 
  wordEmbedding with properties:

     Dimension: 100
    Vocabulary: [1x401 string]
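
To see what the written file looks like, you can print its first few lines. The snippet below is a minimal sketch that assumes exampleSonnetsEmbedding.vec has already been written to the current folder with writeWordEmbedding(emb,filename). The variable names numLines and firstLines are chosen here for illustration only. In the word2vec text format as commonly described, the first line holds the vocabulary size and the embedding dimension, and each following line holds a word followed by its vector components. Treat that layout as an assumption and confirm it against the printed lines.

```matlab
% Assumes exampleSonnetsEmbedding.vec was already written to the current
% folder by writeWordEmbedding(emb,filename).
filename = "exampleSonnetsEmbedding.vec";

% Open the file as UTF-8 text, the encoding that writeWordEmbedding uses.
fid = fopen(filename,"r","n","UTF-8");
assert(fid ~= -1,"Could not open %s",filename);

% Read the first few lines. numLines is an arbitrary choice for illustration.
numLines = 3;
firstLines = strings(numLines,1);
for k = 1:numLines
    firstLines(k) = string(fgetl(fid));
end
fclose(fid);

% Display the lines for inspection.
disp(firstLines)
```

Each word line is long because it contains one number per embedding dimension (100 in this example).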

Input Arguments


emb: Input word embedding, specified as a wordEmbedding object.

filename: Name of the file to write, specified as a string scalar or character vector.

Data Types: string | char

Introduced in R2017b