[gentoo-user] python3 question

public inbox for gentoo-user@lists.gentoo.org

* [gentoo-user] python3 question
@ 2021-01-13 18:31 n952162
  2021-01-13 18:57 ` n952162
  2021-01-13 18:59 ` Grant Edwards
  0 siblings, 2 replies; 10+ messages in thread
From: n952162 @ 2021-01-13 18:31 UTC
  To: Gentoo User list

Hello.  In python3, how do you do this?

tgt = 'gebuchte Umsätze;'

In python2, you could do this:

tgt = unicode ('gebuchte Umsätze;'.decode ('latin1'))

but that gives:

SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0xe4 in
position 12: invalid continuation byte

In fact, any constant with ä in it will give you that.
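The error message can be reproduced outside the source file. The sketch below assumes the file was saved as Latin-1, which is what the 0xe4 byte suggests: in Latin-1 the ä is the single byte 0xE4, while UTF-8 stores it as the two bytes 0xC3 0xA4. The variable names are only for illustration.

```python
# "\u00e4" is the letter a-umlaut, escaped so this sketch is pure ASCII.
text = "gebuchte Ums\u00e4tze;"

latin1_bytes = text.encode("latin1")   # a-umlaut becomes the single byte 0xE4
utf8_bytes = text.encode("utf-8")      # a-umlaut becomes the bytes 0xC3 0xA4

# Decoding the Latin-1 bytes as UTF-8 fails at the a-umlaut.
try:
    latin1_bytes.decode("utf-8")
    error_position = None
    error_reason = None
except UnicodeDecodeError as exc:
    error_position = exc.start
    error_reason = exc.reason

print(hex(latin1_bytes[12]), utf8_bytes[12:14])
print(error_position, error_reason)
```

The failing position is the index of the ä within the string, which matches the "position 12" in the SyntaxError above.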





* Re: [gentoo-user] python3 question
  2021-01-13 18:31 [gentoo-user] python3 question n952162
@ 2021-01-13 18:57 ` n952162
  2021-01-13 19:41   ` n952162
  2021-01-13 18:59 ` Grant Edwards
  1 sibling, 1 reply; 10+ messages in thread
From: n952162 @ 2021-01-13 18:57 UTC
  To: gentoo-user

On 1/13/21 7:31 PM, n952162 wrote:
> Hello.  In python3, how do you do this?
>
> tgt = 'gebuchte Umsätze;'
>
> In python2, you could do this:
>
> tgt = unicode ('gebuchte Umsätze;'.decode ('latin1'))
>
> but that gives:
>
> SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0xe4 in
> position 12: invalid continuation byte
>
> In fact, any constant with ä in it will give you that.
>
>

Okay, I see that if your locale is not C, you can do:

tgt = 'gebuchte Umsätze;'





* Re: [gentoo-user] python3 question
  2021-01-13 18:57 ` n952162
@ 2021-01-13 19:41   ` n952162
  2021-01-13 19:43     ` [gentoo-user] python3 question [RESOLVED] n952162
  2021-01-13 19:57     ` [gentoo-user] Re: python3 question Grant Edwards
  0 siblings, 2 replies; 10+ messages in thread
From: n952162 @ 2021-01-13 19:41 UTC
  To: gentoo-user

On 1/13/21 7:57 PM, n952162 wrote:
> On 1/13/21 7:31 PM, n952162 wrote:
>> Hello.  In python3, how do you do this?
>>
>> tgt = 'gebuchte Umsätze;'
>>
>> In python2, you could do this:
>>
>> tgt = unicode ('gebuchte Umsätze;'.decode ('latin1'))
>>
>> but that gives:
>>
>> SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0xe4 in
>> position 12: invalid continuation byte
>>
>> In fact, any constant with ä in it will give you that.
>>
>>
>
> Okay, I see that if your locale is not C, you can do:
>
> tgt = 'gebuchte Umsätze;'
>
>
>

Okay, I see I had this bit of magic in line 2:

# -*- coding: utf-8 -*- [ this has to be in line1 or line2!!! ]

I've removed that and the error msg is somewhat different:

SyntaxError: Non-UTF-8 code starting with '\xe4' in file test.py on line
89, but no encoding declared; see http://python.org/dev/peps/pep-0263/
for details

Note that line 89 is a *comment* (with a ä)

So, I'm looking into that...

Oh, I think that gave me a solution!

# -*- coding: latin1 -*- [ this has to be in line1 or line2!!! ]

seems to work.  At least, I got some other errors now ;-)
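The effect of the coding line (PEP 263) can be checked without touching any file, since compile() accepts source as bytes and honours the declaration. This is only a sketch: run_source is a made-up helper, and the byte strings stand in for a test.py saved as Latin-1.

```python
def run_source(source_bytes):
    """Compile and run source given as raw bytes, return the value of tgt."""
    namespace = {}
    exec(compile(source_bytes, "test.py", "exec"), namespace)
    return namespace["tgt"]

# The same Latin-1 encoded statement, with two different declarations.
body = b"tgt = 'gebuchte Ums\xe4tze;'\n"
with_latin1 = b"# -*- coding: latin1 -*-\n" + body
with_utf8 = b"# -*- coding: utf-8 -*-\n" + body

# Declaration matches the bytes: the literal is read correctly.
tgt = run_source(with_latin1)

# Declaration does not match the bytes: Python refuses to compile.
try:
    run_source(with_utf8)
    utf8_ok = True
except (SyntaxError, UnicodeDecodeError):
    utf8_ok = False

print(repr(tgt), utf8_ok)
```

So the declaration does not convert anything. It only tells Python how the bytes on disk are to be decoded, and it has to agree with what the editor actually wrote.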





* Re: [gentoo-user] python3 question [RESOLVED]
  2021-01-13 19:41   ` n952162
@ 2021-01-13 19:43     ` n952162
  2021-01-13 19:57     ` [gentoo-user] Re: python3 question Grant Edwards
  1 sibling, 0 replies; 10+ messages in thread
From: n952162 @ 2021-01-13 19:43 UTC
  To: gentoo-user

On 1/13/21 8:41 PM, n952162 wrote:
> On 1/13/21 7:57 PM, n952162 wrote:
>> On 1/13/21 7:31 PM, n952162 wrote:
>>> Hello.  In python3, how do you do this?
>>>
>>> tgt = 'gebuchte Umsätze;'
>>>
>>> In python2, you could do this:
>>>
>>> tgt = unicode ('gebuchte Umsätze;'.decode ('latin1'))
>>>
>>> but that gives:
>>>
>>> SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0xe4 in
>>> position 12: invalid continuation byte
>>>
>>> In fact, any constant with ä in it will give you that.
>>>
>>>
>>
>> Okay, I see that if your locale is not C, you can do:
>>
>> tgt = 'gebuchte Umsätze;'
>>
>>
>>
>
> Okay, I see I had this bit of magic in line 2:
>
> # -*- coding: utf-8 -*- [ this has to be in line1 or line2!!! ]
>
> I've removed that and the error msg is somewhat different:
>
> SyntaxError: Non-UTF-8 code starting with '\xe4' in file test.py on line
> 89, but no encoding declared; see http://python.org/dev/peps/pep-0263/
> for details
>
> Note that line 89 is a *comment* (with a ä)
>
> So, I'm looking into that...
>
> Oh, I think that gave me a solution!
>
> # -*- coding: latin1 -*- [ this has to be in line1 or line2!!! ]
>
> seems to work.  At least, I got some other errors now ;-)
>
>

Yes, indeed, this works now, even without setting my locale:

     tgt = 'gebuchte Umsätze;'





* [gentoo-user] Re: python3 question
  2021-01-13 19:41   ` n952162
  2021-01-13 19:43     ` [gentoo-user] python3 question [RESOLVED] n952162
@ 2021-01-13 19:57     ` Grant Edwards
  2021-01-13 20:06       ` n952162
  1 sibling, 1 reply; 10+ messages in thread
From: Grant Edwards @ 2021-01-13 19:57 UTC
  To: gentoo-user

On 2021-01-13, n952162 <n952162@web.de> wrote:

> # -*- coding: utf-8 -*- [ this has to be in line1 or line2!!! ]

If you have that line in your source code, make sure your editor is
saving the file in UTF-8 encoding.

> Oh, I think that gave me a solution!
>
> # -*- coding: latin1 -*- [ this has to be in line1 or line2!!! ]
>
> seems to work. At least, I got some other errors now ;-)

What encoding is your editor using?

--
Grant
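
If the file turns out to be Latin-1 and the utf-8 declaration is to be kept, one option is to re-save the file as UTF-8. A minimal sketch, assuming the current encoding really is Latin-1; resave_as_utf8 is a hypothetical helper and the demo runs on a throwaway temp file, not on a real script:

```python
import os
import tempfile

def resave_as_utf8(path, source_encoding="latin1"):
    """Rewrite the file at path from source_encoding to UTF-8, in place."""
    # newline="" keeps the original line endings untouched.
    with open(path, "r", encoding=source_encoding, newline="") as f:
        text = f.read()
    with open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text)

# Demo on a throwaway file that holds Latin-1 bytes.
fd, demo_path = tempfile.mkstemp(suffix=".py")
with os.fdopen(fd, "wb") as f:
    f.write(b"tgt = 'gebuchte Ums\xe4tze;'\n")

resave_as_utf8(demo_path)

with open(demo_path, "rb") as f:
    converted = f.read()
os.remove(demo_path)

print(converted)
```

After the conversion the ä is stored as 0xC3 0xA4, and a utf-8 coding line (or no coding line at all in Python 3) matches the file contents.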






* Re: [gentoo-user] Re: python3 question
  2021-01-13 19:57     ` [gentoo-user] Re: python3 question Grant Edwards
@ 2021-01-13 20:06       ` n952162
  2021-01-13 20:22         ` Victor Ivanov
  0 siblings, 1 reply; 10+ messages in thread
From: n952162 @ 2021-01-13 20:06 UTC
  To: gentoo-user

On 1/13/21 8:57 PM, Grant Edwards wrote:
> On 2021-01-13, n952162 <n952162@web.de> wrote:
>
>> # -*- coding: utf-8 -*- [ this has to be in line1 or line2!!! ]
> If you have that line in your source code, make sure your editor is
> saving the file in UTF-8 encoding.
>
>> Oh, I think that gave me a solution!
>>
>> # -*- coding: latin1 -*- [ this has to be in line1 or line2!!! ]
>>
>> seems to work. At least, I got some other errors now ;-)
> What encoding is your editor using?
>
> --
> Grant
>
>
>
>

vi?  How would I determine that?  My locale is C




* Re: [gentoo-user] Re: python3 question
  2021-01-13 20:06       ` n952162
@ 2021-01-13 20:22         ` Victor Ivanov
  2021-01-13 21:31           ` n952162
  0 siblings, 1 reply; 10+ messages in thread
From: Victor Ivanov @ 2021-01-13 20:22 UTC
  To: gentoo-user



On 13/01/2021 20:06, n952162 wrote:
>> What encoding is your editor using?
> 
> vi?  How would I determine that?  My locale is C
> 

You could use:

   :set fenc

to display the current encoding used for the file, or

   :set fenc=utf8

to force UTF-8 or any other encoding of your choosing. You can also add a
magic line with fenc to the file to always ensure that the specified
encoding is used, assuming you also have magic lines enabled in vimrc.

- Victor
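
The encoding of a file can also be checked from outside the editor by looking at its bytes. The sketch below is a rough heuristic and not a real detector; guess_encoding is a made-up name. Latin-1 cannot be positively identified this way, because every byte sequence decodes as Latin-1 without error.

```python
def guess_encoding(data):
    """Classify a bytes object as 'ascii', 'utf-8' or 'not utf-8'."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Some 8-bit encoding such as Latin-1, but which one is a guess.
        return "not utf-8"

print(guess_encoding(b"tgt = 'plain'"))
print(guess_encoding(b"tgt = 'Ums\xc3\xa4tze'"))   # a-umlaut stored as UTF-8
print(guess_encoding(b"tgt = 'Ums\xe4tze'"))       # a-umlaut stored as Latin-1
```

For a real file, the bytes would come from open(path, "rb").read().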




* Re: [gentoo-user] Re: python3 question
  2021-01-13 20:22         ` Victor Ivanov
@ 2021-01-13 21:31           ` n952162
  0 siblings, 0 replies; 10+ messages in thread
From: n952162 @ 2021-01-13 21:31 UTC
  To: gentoo-user

On 1/13/21 9:22 PM, Victor Ivanov wrote:
> On 13/01/2021 20:06, n952162 wrote:
>>> What encoding is your editor using?
>>
>> vi?  How would I determine that?  My locale is C
>>
>
> You could use:
>
>   :set fenc
>
> to display the current encoding used for the file, or
>
>   :set fenc=utf8
>
> to force UTF-8 or any other encoding of your choosing. You can also add
> a magic line with fenc to the file to always ensure that the specified
> encoding is used, assuming you also have magic lines enabled in vimrc.
>
> - Victor
>

   fileencoding= 195,3         55%


I suspect that's really useful for languages other than latin1.




* [gentoo-user] Re: python3 question
  2021-01-13 18:31 [gentoo-user] python3 question n952162
  2021-01-13 18:57 ` n952162
@ 2021-01-13 18:59 ` Grant Edwards
  2021-01-13 19:09   ` n952162
  1 sibling, 1 reply; 10+ messages in thread
From: Grant Edwards @ 2021-01-13 18:59 UTC
  To: gentoo-user

On 2021-01-13, n952162 <n952162@web.de> wrote:

> Hello. In python3, how do you do this?

Please explain what "this" is trying to accomplish, and we can tell
you how to do it in Python3. Are you trying to convert from Unicode to
Latin1 and back to Unicode?

  Python 3.8.6 (default, Jan  2 2021, 20:25:58)
  [GCC 9.3.0] on linux
  Type "help", "copyright", "credits" or "license" for more information.
  >>> 'gebuchte Umsätze;'.encode('latin1').decode('latin1')
  'gebuchte Umsätze;'


> tgt = 'gebuchte Umsätze;'
>
> In python2, you could do this:
>
> tgt = unicode ('gebuchte Umsätze;'.decode ('latin1'))
>
> but that gives:
>
> SyntaxError: (unicode error) 'utf-8' codec can't decode byte 0xe4 in
> position 12: invalid continuation byte
>
> In fact, any constant with ä in it will give you that.





* Re: [gentoo-user] Re: python3 question
  2021-01-13 18:59 ` Grant Edwards
@ 2021-01-13 19:09   ` n952162
  0 siblings, 0 replies; 10+ messages in thread
From: n952162 @ 2021-01-13 19:09 UTC
  To: gentoo-user


On 1/13/21 7:59 PM, Grant Edwards wrote:
> On 2021-01-13, n952162 <n952162@web.de> wrote:
>
>> Hello. In python3, how do you do this?
> Please explain what "this" is trying to accomplish, and we can tell
> you how to do it in Python3. Are you trying to convert from Unicode to
> Latin1 and back to Unicode?
>
>    Python 3.8.6 (default, Jan  2 2021, 20:25:58)
>    [GCC 9.3.0] on linux
>    Type "help", "copyright", "credits" or "license" for more information.
>    >>> 'gebuchte Umsätze;'.encode('latin1').decode('latin1')
>    'gebuchte Umsätze;'
>
>
I'm trying to search for a string in a file.  I don't know why there
needs to be any conversion going on.

Just running python3 in interactive mode, I can input the literal when
the locale is right:

    12/lcl/data/f/b>LC_ALL=de_DE python3
    Python 3.7.9 (default, Nov 16 2020, 00:32:07)
    [GCC 9.3.0] on linux
    Type "help", "copyright", "credits" or "license" for more information.
    Could not open PYTHONSTARTUP
    FileNotFoundError: [Errno 2] No such file or directory:
    '/home/mellman/lib/python/rpnrc'
     >>> s = "gebuchte Umsätze"
     >>> print (s)
    gebuchte Umsätze
     >>>

but it doesn't work from within my pgm...

With python2, I presume there was conversion going on because ... a
string can't have unicode chars, so it must be a unicode string that has
to be decoded.

    tgt = unicode ('gebuchte Umsätze;'.decode ('latin1'))

But python3 is supposed to make all that superfluous ... I thought that
was a major driving factor for python3 ... that everything was unicode,
conversion wouldn't be necessary.
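
For the stated goal of searching for a string in a file, a minimal sketch follows. It assumes the data file is Latin-1 encoded, which the thread does not confirm; find_lines and the sample contents are invented for illustration. In Python 3 the str is Unicode, but the bytes in a file still have to be decoded, and open() without encoding= falls back to a locale-dependent default in the Python versions shown in this thread, so the encoding is passed explicitly.

```python
import os
import tempfile

def find_lines(path, needle, encoding="latin1"):
    """Return the 1-based numbers of the lines in path that contain needle."""
    with open(path, encoding=encoding) as f:
        return [number for number, line in enumerate(f, 1) if needle in line]

# Demo data: a small Latin-1 encoded file (contents are made up).
fd, demo_path = tempfile.mkstemp(suffix=".csv")
with os.fdopen(fd, "wb") as f:
    f.write(b"Datum;Betrag\ngebuchte Ums\xe4tze;\n")

# "\u00e4" is the a-umlaut, escaped so this sketch does not depend on
# how the script itself is saved.
hits = find_lines(demo_path, "gebuchte Ums\u00e4tze;")
os.remove(demo_path)

print(hits)
```

With the encoding given explicitly, the search works the same way whether the locale is C or de_DE.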




