Node.js write stream produces a file with invalid symbols


I'm reading text from a big file and writing some parts of it to a new text file:

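var fs = require('fs');

// words is built earlier while reading the input file; each entry has two fields.
var ws = fs.createWriteStream('output.txt', {flags: 'w', encoding: 'utf8'});
for (var i = 0; i < words.length; i++) {
    ws.write(words[i][0].toString() + "\t" + words[i][1].toString() + "\n");
}
// end() flushes pending writes and then closes the file.
ws.end();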

However, when I try to open the created file, the editor (EDIT: xed on Linux) refuses to open it. It says that there is something wrong with the encoding. What can I do? Should I sanitize the string before writing? If so, how would I do that? Which symbols are problematic for a write stream?

Dennis On

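By default, fs.createWriteStream() uses the utf8 encoding, which can represent every Unicode character, so ordinary text is not a problem for the write stream itself. What can go wrong is the content of the strings. An unpaired surrogate (half of a character outside the Basic Multilingual Plane) is replaced by U+FFFD when it is encoded. Text that was decoded with the wrong encoding when it was read, or that contains control characters such as NUL, is written out as-is; it may then show up as Chinese symbols or other unintelligible characters, or make an editor reject the file.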
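
To see what the utf8 encoder actually does with different kinds of strings, here is a minimal sketch using Buffer.from(), which applies the same string-to-bytes conversion. The sample strings and the roundTrip helper are made up for illustration; they are not from the question.

```javascript
// Sample strings (assumptions for illustration only).
const samples = {
  ascii: 'hello',
  accented: 'caf\u00e9',        // "é" is outside ASCII
  cjk: '\u4e2d\u6587',          // two CJK characters
  emoji: '\u{1F600}',           // stored as a surrogate pair in a JS string
  loneSurrogate: 'x\ud83dy',    // half of a surrogate pair: not valid text
};

// Hypothetical helper: encode to UTF-8 bytes and decode again.
function roundTrip(str) {
  return Buffer.from(str, 'utf8').toString('utf8');
}

for (const [name, str] of Object.entries(samples)) {
  const bytes = Buffer.from(str, 'utf8');
  // A string survives the round trip unless it contains an unpaired surrogate.
  console.log(name, bytes.length + ' bytes', 'unchanged:', roundTrip(str) === str);
}
```

Only the loneSurrogate sample should come back changed, with the stray half replaced by U+FFFD. Even then the bytes on disk are still valid UTF-8, so an editor complaining about the encoding more likely points at control characters or at how the input was decoded.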

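Switching the output encoding does not by itself avoid this problem: utf16le, for example, covers exactly the same characters as utf8 and only changes how they are stored as bytes. Instead, make sure the input is decoded with the encoding it was actually saved in (for example, pass 'utf16le' or 'latin1' when reading if that is what the source file uses), and check the strings for control characters before writing them.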

Bonus: You can check whether a string contains non-ASCII characters with the snippet below.

function hasNonASCIIChars(str) {
  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i);
    if (code > 127) {
      return true;
    }
  }
  return false;
}
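
On the "sanitize before writing" part of the question: non-ASCII characters as such are fine in a utf8 file, so a more targeted check looks for control characters and unpaired surrogates instead. The following is only a sketch. findSuspiciousChars and sanitize are hypothetical helpers, not part of Node, and which characters to treat as suspicious is an assumption you may need to adjust.

```javascript
// Hypothetical helper: list code units that commonly make editors reject a file.
function findSuspiciousChars(str) {
  const found = [];
  for (let i = 0; i < str.length; i++) {
    const code = str.charCodeAt(i);
    // Control characters except tab, line feed and carriage return, plus DEL.
    const isControl =
      (code < 0x20 && code !== 0x09 && code !== 0x0a && code !== 0x0d) || code === 0x7f;
    const isHigh = code >= 0xd800 && code <= 0xdbff;
    const isLow = code >= 0xdc00 && code <= 0xdfff;
    if (isHigh) {
      const next = str.charCodeAt(i + 1); // NaN at the end of the string
      if (next >= 0xdc00 && next <= 0xdfff) {
        i++; // valid surrogate pair, skip its second half
        continue;
      }
      found.push({ index: i, code: code });
    } else if (isLow || isControl) {
      found.push({ index: i, code: code });
    }
  }
  return found;
}

// Hypothetical helper: drop everything findSuspiciousChars reports.
function sanitize(str) {
  const bad = new Set(findSuspiciousChars(str).map((hit) => hit.index));
  let out = '';
  for (let i = 0; i < str.length; i++) {
    if (!bad.has(i)) out += str[i];
  }
  return out;
}
```

In the loop from the question, you could log findSuspiciousChars(words[i][0].toString()) for a few entries first. If NUL (code 0) shows up between ordinary letters, that usually suggests the input file was read with the wrong encoding, and fixing the read side is better than stripping characters on the write side.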