Python print raises an encoding error when its output is piped (|) or redirected (>) in PowerShell on Windows

Asked by link89 on 21 September 2023 at 09:17

Given the following Python script, test.py:

# -*- coding: utf-8 -*-
print("Er, Süleyman")

Here are the results of running this script in PowerShell:

python test.py  # success
Er, Süleyman


python test.py > tmp.txt
    print("Er, Süleyman")
UnicodeEncodeError: 'gbk' codec can't encode character '\u0308' in position 6: illegal multibyte sequence

python test.py | Out-File -Encoding utf8 tmp.txt
    print("Er, Süleyman")
UnicodeEncodeError: 'gbk' codec can't encode character '\u0308' in position 6: illegal multibyte sequence
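
The error message points at '\u0308' in position 6, which suggests the "ü" in the script is stored as a plain "u" followed by a combining diaeresis. A minimal sketch to check this outside of PowerShell, assuming the string really is in that decomposed form (the helper names `can_encode` and `describe` are made up for illustration):

```python
import unicodedata

# The string from test.py, spelled with an explicit combining diaeresis
# (U+0308). This is an assumption based on the error message above.
TEXT = "Er, Su\u0308leyman"


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character of `text` can be encoded in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def describe(text: str):
    """List (index, code point, Unicode name) for each character of `text`."""
    return [
        (i, f"U+{ord(ch):04X}", unicodedata.name(ch, "?"))
        for i, ch in enumerate(text)
    ]


if __name__ == "__main__":
    # Only ASCII is printed here, so this runs under any stdout encoding.
    print(describe(TEXT)[6])
    print("gbk:", can_encode(TEXT, "gbk"))
    print("utf-8:", can_encode(TEXT, "utf-8"))
```

If the gbk check fails while the utf-8 check passes, the failure comes from the encoding Python picks for a redirected stdout, not from print itself.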

I have no idea how to work around this.

The default language of my laptop is Chinese. Running the following code prints cp936:

import locale
print(locale.getpreferredencoding())
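
The locale encoding alone does not show which encoding print actually uses. As far as I understand, Python uses UTF-8 when stdout is attached to the Windows console, but falls back to the locale encoding when stdout is a pipe or a file. A small sketch to compare the two cases (the function name `stdout_info` is hypothetical); it reports to stderr so the result stays visible when stdout is redirected:

```python
import locale
import sys


def stdout_info(stream=None):
    """Collect the encoding-related facts about a text stream (default: sys.stdout)."""
    stream = sys.stdout if stream is None else stream
    isatty = getattr(stream, "isatty", None)
    return {
        # Encoding print() will use for this stream.
        "stdout_encoding": getattr(stream, "encoding", None),
        # True when the stream is an interactive console, False for pipes/files.
        "is_console": bool(isatty()) if callable(isatty) else False,
        # The locale encoding, e.g. cp936 on a Chinese Windows system.
        "preferred_encoding": locale.getpreferredencoding(False),
    }


if __name__ == "__main__":
    # stderr is not affected by `>` or `|`, so the report always shows up.
    print(stdout_info(), file=sys.stderr)
```

Running it once as `python script.py` and once as `python script.py > tmp.txt` should make the difference in `stdout_encoding` visible.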

I also tried writing to the buffer directly. The error is gone, but the content is incorrect:

# -*- coding: utf-8 -*-
import sys
sys.stdout.buffer.write("Er, Süleyman".encode('utf-8'))

python test.py > test.txt
cat test.txt
Er, Su虉leyman
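
My guess is that the garbled character appears because Python writes UTF-8 bytes while PowerShell decodes the program's output with the console code page (cp936 here) before writing the file. A rough sketch of that round trip in pure Python, assuming this is what happens (the function names are made up, and gbk stands in for cp936):

```python
# The string from test.py with an explicit combining diaeresis (assumption).
TEXT = "Er, Su\u0308leyman"


def simulate_misdecoding(text, written_as="utf-8", read_as="gbk"):
    """Encode with one codec and decode with another, as a mismatched reader would."""
    return text.encode(written_as).decode(read_as)


def undo_misdecoding(garbled, written_as="utf-8", read_as="gbk"):
    """Reverse the mismatch: recover the original bytes, then decode them correctly."""
    return garbled.encode(read_as).decode(written_as)


if __name__ == "__main__":
    garbled = simulate_misdecoding(TEXT)
    # ascii() keeps the output printable whatever encoding stdout uses.
    print(ascii(garbled))
    print(ascii(undo_misdecoding(garbled)))
```

If the original text can be recovered this way, the bytes in the pipe were correct UTF-8 and only the decoding step on the PowerShell side was wrong.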

Update

I found a solution, which changes the default encoding of PowerShell, in this answer: Using UTF-8 Encoding (CHCP 65001) in Command Prompt / Windows Powershell (Windows 10)

In PSv5.1 or higher, where > and >> are effectively aliases of Out-File, you can set the default encoding for > / >> / Out-File via the $PSDefaultParameterValues preference variable:

$PSDefaultParameterValues['Out-File:Encoding'] = 'utf8'

Although this is not a Python problem, it is hard to find the right solution without knowing how the default encodings of Python's stdout, Windows, and PowerShell differ. For example, Setting the correct encoding when piping stdout in Python tries to fix this from the Python side, but that introduces a new problem: forcing stdout to use the gbk encoding causes unsupported characters to be displayed incorrectly. So I think this question still has value for those who are not familiar with such a complicated situation.
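
To illustrate the new problem mentioned above: forcing a gbk stdout only avoids the exception if an error handler such as "replace" is used, and then the unsupported character is silently lost. A minimal sketch that mimics a redirected stdout with an in-memory stream (`write_through` is a hypothetical helper, not part of any library):

```python
import io

# The string from test.py with an explicit combining diaeresis (assumption).
TEXT = "Er, Su\u0308leyman"


def write_through(text, encoding, errors="strict"):
    """Return the bytes a text stream with this encoding/error handler would emit."""
    raw = io.BytesIO()
    stream = io.TextIOWrapper(raw, encoding=encoding, errors=errors, newline="")
    stream.write(text)
    stream.flush()
    data = raw.getvalue()
    stream.detach()  # release the buffer without closing it
    return data


if __name__ == "__main__":
    # The combining diaeresis is replaced by "?" under gbk with errors="replace".
    print(write_through(TEXT, "gbk", errors="replace"))
    # UTF-8 keeps every character.
    print(write_through(TEXT, "utf-8"))
```

This is why changing the encoding on the PowerShell side, rather than forcing gbk in Python, seems to be the safer fix here.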
