Issue135

Title Encoding problem, "idna" missing
Priority bug Status testing
Superseder Nosy List desmoulinmichel, kayhayen
Assigned To kayhayen Keywords wrong_execution

Created on 2014-05-12.13:14:18 by desmoulinmichel, last changed by kayhayen.

Messages
msg730 (view) Author: kayhayen Date: 2014-07-21.09:37:10
Can you check? The latest release and pre-release are supposed to be good. I am
putting this to testing, because the code now embeds "idna" in any case.
msg677 (view) Author: desmoulinmichel Date: 2014-06-03.06:55:39
pip install path.py.

It's a lib abstracting os and shutil and adding utility functions to list,
filter and manipulate file paths.

But I don't see why it would need idna.
msg675 (view) Author: kayhayen Date: 2014-06-03.06:41:37
I cannot reproduce the issue with requests, but I also couldn't resolve "from
path import path", so maybe that is it. Can you point me to which package that is?
msg667 (view) Author: kayhayen Date: 2014-05-24.09:58:27
That should have been good, sorry. It seems I fixed a branch that is not used.
Actually, due to using "free-stdlib" as part of standalone, the issue shouldn't
even be possible currently.

You have provided what it takes to reproduce, I think. So I will have to revisit
the issue.
msg666 (view) Author: desmoulinmichel Date: 2014-05-22.12:02:21
Ok, I pip installed this:

http://nuitka.net/releases/Nuitka-0.5.2pre5.tar.gz

But I still got the error with idna. Is that the proper code I should install,
or is the patch failing?
msg664 (view) Author: desmoulinmichel Date: 2014-05-21.06:58:52
I will try to test that today.

This is a broader problem than just IDNA: it means that encodings can implicitly
import modules with the same name, so in order to work, Nuitka must check all
decode/encode calls for dependencies.
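
The implicit import described above can be observed from a plain interpreter. A minimal sketch, assuming CPython's standard codec registry; the helper name codec_module_loaded is made up for illustration:

```python
import codecs
import sys


def codec_module_loaded(encoding):
    """Look up a codec by name, then report whether encodings.<name> is loaded.

    Hypothetical helper for illustration; it assumes the codec name maps
    directly to a module name, which holds for "idna".
    """
    # The lookup imports the codec module by name; no import statement is involved.
    codecs.lookup(encoding)
    return "encodings." + encoding in sys.modules


if __name__ == "__main__":
    print(codec_module_loaded("idna"))
```

Because the module name only exists as a string at run time, a tool that follows import statements cannot see the dependency.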
msg663 (view) Author: kayhayen Date: 2014-05-21.06:55:26
I see, I read that much in Wikipedia, but I somehow thought that it was your 
locale causing it. 

So this fix is likely to also address issues with "requests" module that were 
reported to me repeatedly. This must have been the missing implicit dependency 
there.

I will close the issue once it hits stable, unless you report its failure.
msg662 (view) Author: desmoulinmichel Date: 2014-05-20.10:50:29
Hi, 

I apologize for taking so much time to answer. About the lang :

⟩ echo $LANG
fr_FR.UTF-8

IDNA is an encoding that turns a UTF-8 encoded Internet resource string into an
ASCII-compatible string, so it can be used with protocols that support nothing
other than ASCII.

In Python, you use it that way :

>>> u"çùëô".encode('idna')
'xn--7cai1as'
>>> print('xn--7cai1as'.decode('idna'))
çùëô
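
The same round trip, written as a sketch that should run unchanged on Python 2.7 and Python 3. The escaped literal stands for the string above, and the function names are illustrative only:

```python
def to_idna(label):
    """Encode a unicode host label to its ASCII-compatible form."""
    return label.encode("idna")


def from_idna(ascii_label):
    """Decode an ASCII-compatible label back to unicode."""
    return ascii_label.decode("idna")


if __name__ == "__main__":
    label = u"\xe7\xf9\xeb\xf4"  # the four characters from the example above
    encoded = to_idna(label)
    print(encoded)
    print(from_idna(encoded) == label)
```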

So I guess my example program uses it because it's Web related. It would be good
to include it by default if any Internet module is loaded (ssh, imap, urllib, etc).

I'll try the new release ASAP.
msg654 (view) Author: kayhayen Date: 2014-05-17.06:45:05
I added "idna" in the latest pre-release. But I still would be glad if you
provided instructions, or tried it out.
msg652 (view) Author: kayhayen Date: 2014-05-13.06:08:26
Hello,

the workaround may not help with constants initialisation in a standalone
binary, which would explain the second issue. How do I get the "idna" encoding
in my shell? Can you give me your LANG variables? Otherwise I can include it,
but not really test it.

Yours,
Kay
msg649 (view) Author: desmoulinmichel Date: 2014-05-12.13:14:18
Hi,

I ran into two issues with encoding that I managed to solve, but you may want to
hear about them.

First, I wrote this script to test Nuitka's capabilities. It includes several
dependencies in a virtualenv and runs fine with Python 2.7 on a 64-bit
Ubuntu 14.04:

http://0bin.net/paste/mVZxvIKG35UJf0v7#XenGX0FVKsZRO1qcJ71C+fTwniOPRzLcp9WosC9IFiQ=

Then, I ran:

nuitka --recurse-all downloader.py --standalone

After compilation, I tried :

./downloader.exe http://google.com site.html

It raised :

unknown encoding: idna

The pure Python version didn't raise this and worked fine.
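
For reference, this message is what Python's codec registry raises as a LookupError when it cannot find a codec module for a name. A small sketch that checks for needed codecs up front; the helper name and the deliberately bogus codec name are assumptions for illustration:

```python
import codecs


def missing_codecs(names):
    """Return the codec names that the running interpreter cannot look up."""
    missing = []
    for name in names:
        try:
            codecs.lookup(name)
        except LookupError:
            # Same exception type that carries "unknown encoding: ..."
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "no-such-codec-xyz" is a made-up name used to trigger the failure path.
    print(missing_codecs(["idna", "utf-8", "no-such-codec-xyz"]))
```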

I solved it by adding this at the beginning of the file :

import encodings.idna

I ran the compilation again, but got a second encoding problem. This time a
classic UnicodeDecodeError pointing to my last print().

Again, this doesn't happen with the pure Python version.

I solved it by using encoding best practices, such as declaring the encoding at
the top of the file, making the string a unicode one and decoding to utf8 before
printing.

The resulting file looks like :

http://0bin.net/paste/yp3pWOJrM8vb5z7-#oZHVGvujyVpw8PsKCadbbNocSdwmi4/K5rK8SdHhlfw=
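
The pastes are not reproduced here. A minimal sketch of the practices listed above, assuming the final step means turning the unicode string into UTF-8 bytes explicitly before writing it out; the sample text and function names are hypothetical:

```python
# -*- coding: utf-8 -*-
import sys


def to_utf8_bytes(text):
    """Encode a unicode string to UTF-8 bytes explicitly."""
    return text.encode("utf-8")


def write_line(text):
    """Write text as UTF-8 bytes, independent of the terminal's locale."""
    # Python 3 exposes the byte stream as sys.stdout.buffer;
    # Python 2's sys.stdout accepts bytes directly.
    stream = getattr(sys.stdout, "buffer", sys.stdout)
    stream.write(to_utf8_bytes(text) + b"\n")


if __name__ == "__main__":
    write_line(u"t\xe9l\xe9chargement termin\xe9")
```

Encoding explicitly means the output no longer depends on whatever default encoding the interpreter or the compiled binary picks up.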

Compiling this and running the resulting binary works in the same environment
in which it was compiled.

However, there is clearly a difference in how the encoding is dealt with before
and after compilation. You probably want to do something about it, even if it's
just mentioning it in the docs, since it really catches you off guard.
History
Date                 User             Action  Args
2014-07-21 09:37:10  kayhayen         set     status: in-progress -> testing
                                              messages: + msg730
2014-06-03 06:55:40  desmoulinmichel  set     messages: + msg677
2014-06-03 06:41:37  kayhayen         set     messages: + msg675
2014-05-24 09:58:27  kayhayen         set     status: testing -> in-progress
                                              messages: + msg667
2014-05-22 12:02:21  desmoulinmichel  set     messages: + msg666
2014-05-21 06:58:52  desmoulinmichel  set     messages: + msg664
2014-05-21 06:55:26  kayhayen         set     messages: + msg663
2014-05-20 10:50:29  desmoulinmichel  set     messages: + msg662
2014-05-17 06:53:51  kayhayen         set     title: Encoding problem -> Encoding problem, "idna" missing
2014-05-17 06:45:05  kayhayen         set     status: chatting -> testing
                                              messages: + msg654
2014-05-13 06:08:26  kayhayen         set     status: unread -> chatting
                                              assignedto: kayhayen
                                              messages: + msg652
                                              nosy: + kayhayen
2014-05-12 13:14:18  desmoulinmichel  create