Umlaut characters are interpreted differently depending on the MATLAB version
I have both MATLAB R2020b and R2018b installed on my machine (Windows 10). If I save a text file in R2020b containing an umlaut character, char(246), and open that file in R2018b, it appears and is read as two characters: Ã¶. If I open the same file in R2020b, it appears as ö. Why is that? How can I save the file in R2020b to avoid this issue?
feature('locale') returns the same struct in both Matlab versions. Here is the code:
fileID = fopen('test_2020.txt','w');
fprintf(fileID,'%s\n', char(246));
fclose(fileID);
Answers (1)
Rik
on 4 Jan 2021
Edited: Rik
on 4 Jan 2021
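The reason is that MATLAB switched the default encoding to UTF-8 in R2020a. The file itself contains the same binary data no matter which release opens it; the only difference is how MATLAB interprets those bytes. R2020b wrote ö as a two-byte UTF-8 sequence, and R2018b decodes each of those bytes as a separate character using the legacy system encoding. The reasoning behind UTF-8 is that the rest of the text remains readable, even if the special characters get messed up.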
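To make the byte-level picture concrete, here is a minimal sketch using unicode2native and native2unicode. It assumes the legacy system encoding on the machine is windows-1252, which is typical for a Western-language Windows 10 install but is not stated in the question.

```matlab
% The character that was written to the file
c = char(246);                                    % ö

% What R2020b puts on disk by default: the UTF-8 encoding of ö
bytes = unicode2native(c, 'UTF-8');               % two bytes: 195 182

% How a release that defaults to the legacy encoding decodes those bytes
% (assumption: that encoding is windows-1252)
asAnsi = native2unicode(bytes, 'windows-1252');   % two characters: Ã¶

% How a release that defaults to UTF-8 decodes the same bytes
asUtf8 = native2unicode(bytes, 'UTF-8');          % one character: ö
```

The bytes never change between the two decodes; only the encoding passed to native2unicode does.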
I ended up writing a function, readfile, to handle text files. It automatically determines the encoding (with a decent success rate). For text containing many emoji it even performs better than the readlines function introduced in R2020b (until MathWorks fixes that bug).
2 Comments
Rik
on 4 Jan 2021
I'm not aware of a way to change the default, but you can still save individual files with the ANSI encoding (you can select it in the 'Save as type' dropdown in the save menu).
You could even use an external program like Notepad++ to do this conversion for existing files.
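For files written from code rather than from the editor, one possible approach is to pass the encoding explicitly as the fourth argument of fopen, so the result no longer depends on each release's default. This is only a sketch: the file names are made up for illustration, and windows-1252 is an assumption about which ANSI code page the machine uses.

```matlab
% Hypothetical file names, placed in the temp folder for this example
ansiFile = fullfile(tempdir, 'test_ansi.txt');
utf8File = fullfile(tempdir, 'test_utf8.txt');

% Option 1: write from R2020b with an explicit legacy encoding, so that
% R2018b reads it correctly with its default settings
% (assumption: windows-1252 is the system's ANSI code page)
fileID = fopen(ansiFile, 'w', 'n', 'windows-1252');
fprintf(fileID, '%s\n', char(246));
fclose(fileID);

% Option 2: keep UTF-8, but name the encoding on both the write and the
% read, so every release decodes the file the same way
fileID = fopen(utf8File, 'w', 'n', 'UTF-8');
fprintf(fileID, '%s\n', char(246));
fclose(fileID);

fileID = fopen(utf8File, 'r', 'n', 'UTF-8');
txt = fgetl(fileID);     % first line, without the newline
fclose(fileID);
```

Option 1 keeps older code working unchanged; Option 2 requires touching the reading code as well, but it also handles characters that the ANSI code page cannot represent.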