Title: Standard library packages "__path__" are not lists, and not paths in standalone mode
Priority: bug
Status: resolved
Superseder:
Nosy List: alex_hu, kayhayen
Assigned To: kayhayen
Keywords: standalone

Created on 2015-03-25.13:13:16 by alex_hu, last changed by kayhayen.

Files: uploaded by alex_hu, 2015-03-28.07:58:58, text/plain
msg1904 (view) Author: kayhayen Date: 2016-04-18.18:25:09
This was released as stable and supposedly works fine.
msg1826 (view) Author: kayhayen Date: 2016-02-14.20:09:36
I have played with all the ideas, and finally resorted to implementing Nuitka's own bytecode module 
loader. It will be on the factory branch later today and will be part of the next release.

Nuitka's loader will make "__path__" of frozen packages a list with a path pointing inside 
the ".dist" folder, giving it a chance to not crash when _xmlplus is found and thus fixing the 

I was considering fixing this by forcing compilation (bad, as everybody would then have to compile 
this for every program each time), by prefixing/patching the source (bad, as it's pretty hard to 
not corrupt line numbers), or by patching the bytecode (it's very hard to prepend bytecode while 
preserving correctness). Ultimately, writing our own loader was the best choice.

This will also generally further the compatibility of Nuitka once we allow choosing which code 
to compile and which to include as bytecode. Packages will no longer be an issue there.
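
A minimal sketch of the idea of a bytecode loader that gives packages a list-valued "__path__" inside the ".dist" folder. This is not Nuitka's actual implementation. It assumes Python 3's importlib, and the names BytecodeFinder, demo_pkg and app.dist are hypothetical and used only for illustration.

```python
import importlib.abc
import importlib.util
import marshal
import os
import sys


class BytecodeFinder(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Hypothetical finder/loader for modules kept as marshalled code objects."""

    def __init__(self, bytecode, dist_dir):
        # bytecode maps module name -> (marshalled code object, is_package)
        self.bytecode = bytecode
        self.dist_dir = dist_dir

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in self.bytecode:
            return None  # let the other finders handle it
        is_package = self.bytecode[fullname][1]
        spec = importlib.util.spec_from_loader(
            fullname, self, is_package=is_package
        )
        if is_package:
            # A list holding a directory below the dist folder,
            # instead of the bare package name as a string.
            spec.submodule_search_locations = [
                os.path.join(self.dist_dir, *fullname.split("."))
            ]
        return spec

    def exec_module(self, module):
        code = marshal.loads(self.bytecode[module.__name__][0])
        exec(code, module.__dict__)


# Illustration only: one tiny package stored as bytecode in memory.
code = compile("VALUE = 42\n", "<bytecode demo_pkg>", "exec")
finder = BytecodeFinder({"demo_pkg": (marshal.dumps(code), True)}, "app.dist")
sys.meta_path.insert(0, finder)

demo_pkg = importlib.import_module("demo_pkg")
print(type(demo_pkg.__path__).__name__, demo_pkg.__path__)
```

Because "__path__" is a real list here, code that calls list methods on it, such as ".extend(...)", no longer fails.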

Expect the fix in the next pre-release. Sorry this took so long.

Thanks for the report, this made me aware of the issue of __path__ not being a list in the 
first place.

msg1155 (view) Author: kayhayen Date: 2015-04-12.09:05:18
Hello Alex,

I understand, it seems that this is really needed. Now I need to think about how I want to approach this. Currently, I 
think, the frozen modules are being loaded by the generic freezer code of CPython.

That is good, and even necessary for modules that are loaded "early". You cannot even initialize libpython without e.g. the 
encodings package.

I could split these into a part that is still handled that way, and a part that is loaded by the meta path based 
loader, which currently only loads extensions and compiled modules. That is one way of doing it. Then I could just 
treat __path__ in a more sane way.

The other way is to compile from source to bytecode again, and thus not use the existing ".pyc" files, and then 
patch that source to set "__path__ = [""]" or something like that. For "" we already 
do something like that, to ensure a sane "__file__" for it to work with.
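
A rough sketch of the "compile from source and patch it" option. It is not what Nuitka does. It assumes Python 3's ast module and injects the assignment at the AST level, so the original statements keep their line numbers. The function name compile_with_path and the example paths are hypothetical.

```python
import ast


def compile_with_path(source, filename, package_dir):
    """Compile package source with an injected `__path__ = [package_dir]`."""
    tree = ast.parse(source, filename)
    assignment = ast.parse("__path__ = [%r]" % package_dir).body[0]

    # Keep a leading docstring and any `from __future__` imports first,
    # otherwise the docstring is lost or compilation fails.
    index = 0
    if ast.get_docstring(tree, clean=False) is not None:
        index = 1
    while (
        index < len(tree.body)
        and isinstance(tree.body[index], ast.ImportFrom)
        and tree.body[index].module == "__future__"
    ):
        index += 1

    # Existing nodes carry their own line numbers, so tracebacks from the
    # original code still point at the right lines.
    tree.body.insert(index, assignment)
    return compile(tree, filename, "exec")


source = (
    '"""Doc."""\n'
    "VALUE = 1\n"
    "def where():\n"
    "    import sys\n"
    "    return sys._getframe().f_lineno\n"
)
namespace = {"__name__": "demo"}
exec(compile_with_path(source, "demo/__init__.py", "app.dist/demo"), namespace)
print(namespace["__path__"], namespace["where"]())
```

Patching the source text instead, by prepending a line, would shift every line number by one. That is the corruption risk mentioned elsewhere in this thread.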

Not sure yet what the best way to go about that is. Need to think some more.

With --nofreeze-stdlib, compilation will take much longer, on Windows specifically. It's a lot of code to be 
compiled. On Linux, it's a couple of minutes, maybe 2-3; on Windows I think it can easily expand to 30-40 
minutes. But it will not have the issue.
msg1154 (view) Author: alex_hu Date: 2015-04-12.07:00:48
Thanks for your response.
My project uses PyXML and imports the xml library, 
so the project runs the following script. This script is in 
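try:
    import _xmlplus
except ImportError:
    pass
else:
    try:
        v = _xmlplus.version_info
    except AttributeError:
        # _xmlplus is too old; ignore it
        pass
    else:
        if v >= _MINIMUM_XMLPLUS_VERSION:
            import sys
            _xmlplus.__path__.extend(__path__)
            sys.modules[__name__] = _xmlplus
        else:
            del v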

When PyXML is installed, _xmlplus is installed too, so this script will be 
There is a statement in the script:
When this statement is reached, it causes the error. 
I don't know how to avoid this error. If I use --nofreeze-stdlib, what limitations 
will I run into? I will try it later.
msg1127 (view) Author: kayhayen Date: 2015-04-06.17:36:52
So this is from CPython code:

    if (ispackage) {
        /* Set __path__ to the package name */
        PyObject *d, *s;
        int err;
        m = PyImport_AddModule(name);
        if (m == NULL)
            goto err_return;
        d = PyModule_GetDict(m);
        s = PyString_InternFromString(name);
        if (s == NULL)
            goto err_return;
        err = PyDict_SetItemString(d, "__path__", s);
        if (err != 0)
            goto err_return;

This produces the "str" value that we see for __path__ in case
it's a package.

For frozen packages, it sets "__path__" to a string value, which
is probably very bad for compatibility. It also means that already
nothing in Nuitka standalone would successfully access any
files through it, as it's always useless for that purpose.
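
To illustrate why a string-valued "__path__" breaks callers, here is a small simulation. It is a sketch only: the module objects are built by hand with types.ModuleType, and the helpers frozen_style_path, make_package and extend_search_path, as well as the example directories, are hypothetical names for illustration.

```python
import types


def frozen_style_path(name):
    """Mirror the quoted C code: __path__ becomes the bare package name."""
    return name


def make_package(name, path):
    """Build a stand-in package object with the given __path__."""
    module = types.ModuleType(name)
    module.__path__ = path
    return module


def extend_search_path(package, extra_dirs):
    """What code does when it treats __path__ as the list it normally is."""
    package.__path__.extend(extra_dirs)
    return package.__path__


frozen = make_package("_xmlplus", frozen_style_path("_xmlplus"))
regular = make_package("_xmlplus", ["site-packages/_xmlplus"])

try:
    extend_search_path(frozen, ["lib/xml"])
except AttributeError as exc:
    error = str(exc)  # same kind of failure as in the original report
else:
    error = None

merged = extend_search_path(regular, ["lib/xml"])
print(error)
print(merged)
```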

And when attempting to smuggle in code that sets "__path__", I noticed
that at least the "encodings" package reacted allergically to it. Then loading
codecs like "hex" would suddenly fail for no good reason. Not sure what
else could be affected by this.

So, this would mean adding byte code to every module (we mostly load the .pyc
files for Python2), which is not trivial and has the risk of
breaking things if the standard library is programmed to expect it to be
like this in other places too.

So, I am curious how you found this; maybe we can avoid the issue in a
different form.

msg1125 (view) Author: kayhayen Date: 2015-04-06.16:35:45
Using --nofreeze-stdlib will avoid the freezer, but especially on Windows, that 
is going to be really, really slow. :/
msg1124 (view) Author: kayhayen Date: 2015-04-06.16:30:32
I confirm, this happens to me on Linux as well, but only with the standard library. It's a 
bug in how frozen packages have their "__path__" set.

Using their "__path__" should mostly be a source of bugs anyway, as data files from the 
standard library will not be included (I know of none though). It should be made run time 
relative too.
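
A small sketch of what "run time relative" could mean for data file lookups. It rests on assumptions: the helper names, the file name defaults.cfg and the directory /opt/app.dist are hypothetical, and real code would derive the program directory at run time.

```python
import os


def data_file_path(package_path, filename):
    """Locate a data file the way code using `__path__[0]` would."""
    return os.path.join(package_path[0], filename)


def runtime_relative_path(program_dir, package_name):
    """Build a list-valued __path__ below the directory of the program."""
    return [os.path.join(program_dir, *package_name.split("."))]


# With a string __path__, indexing silently yields the first character.
broken = data_file_path("json", "defaults.cfg")

# With a list pointing below the distribution folder, the lookup is sane.
fixed = data_file_path(
    runtime_relative_path("/opt/app.dist", "json"), "defaults.cfg"
)

print(broken)
print(fixed)
```

The first lookup does not raise an error. It quietly produces a path starting with "j", which is why a string "__path__" is useless for finding files.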

May I ask how you noticed this problem? Just out of curiosity.
msg1096 (view) Author: alex_hu Date: 2015-03-28.07:58:58
I have updated Nuitka to 0.5.11 and the result is the same. Then I tried the 
development version via Git, version 0.5.12pre3, and the result is still not right.

I can use a simple project to reproduce the issue:

import json

The following is what I did and the result I got:

When I run it as follows, the result is OK.

Then I compiled with Nuitka:
E:\Temp>nuitka --exe --recurse-all --standalone

Then run the test.exe:
E:\Temp>cd test.dist


The result is 'json', which is not right.

My OS is Windows 7 64-bit, Python is 32-bit, and the Python version is v2.7.9.
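
The attached test file is not shown here, so the following is only a guess at an equivalent check, written for Python 3. It reports the type and value of a package's "__path__". The helper name describe_path is hypothetical.

```python
import importlib


def describe_path(package_name):
    """Return the type name and value of a module's __path__ (None if absent)."""
    module = importlib.import_module(package_name)
    path = getattr(module, "__path__", None)
    return type(path).__name__, path


kind, value = describe_path("json")
print(kind, value)
```

Run uncompiled, this should report a list with an absolute directory. The bug in this report corresponds to it reporting a str holding just the package name.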
msg1077 (view) Author: kayhayen Date: 2015-03-26.16:08:28
I have been making a lot of __path__ related changes. I believe it should always be 
a list, though I am not sure. Can you test with the "factory" git branch from the 
official repo?

If not, there is soon going to be a pre-release with these changes. 

Since this is with the standard library, can you provide the "", so I can 
reproduce it myself?
msg1063 (view) Author: alex_hu Date: 2015-03-25.13:49:15
When I use Nuitka to compile my Python project in standalone mode, the command 
is like this: nuitka --exe --recurse-all --standalone

The compilation completes, but when I run the project, it throws an error:
AttributeError: 'str' object has no attribute 'extend'

According to the traceback, the error is in the xml module:

I found the error in the following statement:

and _xmlplus is a module which is imported in this file.

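try:
    import _xmlplus
except ImportError:
    pass
else:
    try:
        v = _xmlplus.version_info
    except AttributeError:
        # _xmlplus is too old; ignore it
        pass
    else:
        if v >= _MINIMUM_XMLPLUS_VERSION:
            import sys
            _xmlplus.__path__.extend(__path__)
            sys.modules[__name__] = _xmlplus
        else:
            del v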

I tried it without compiling with Nuitka, compiled with Nuitka but without standalone, 
and with standalone.

1. Without compiling with Nuitka
It works well, and '_xmlplus.__path__' returns the correct absolute path, like:
['C:\\Python27\\lib\\site-packages\\_xmlplus'], and the type is list.

2. Compiled with Nuitka but without standalone
The result is the same as 1.

3. With standalone
It does not return the correct path, just the module name, not an absolute path, 
like: '_xmlplus', and the type is str.

What should Nuitka do when a module's __path__ is accessed as in this case?
Please help check whether this is a bug.
Date User Action Args
2016-04-18 18:25:09 kayhayen set status: testing -> resolved
messages: + msg1904
2016-02-16 05:59:36 kayhayen set status: chatting -> testing
2016-02-14 20:09:37 kayhayen set messages: + msg1826
2015-04-12 09:05:18 kayhayen set messages: + msg1155
2015-04-12 07:00:48 alex_hu set messages: + msg1154
2015-04-06 17:37:53 kayhayen set title: The module's __path__ not show correct if I use standalone mode -> Standard library packages "__path__" are not lists, and not paths in standalone mode
2015-04-06 17:37:10 kayhayen set keyword: + standalone
2015-04-06 17:36:52 kayhayen set messages: + msg1127
2015-04-06 16:35:45 kayhayen set messages: + msg1125
2015-04-06 16:30:32 kayhayen set messages: + msg1124
2015-03-28 07:58:58 alex_hu set files: +
messages: + msg1096
2015-03-26 16:08:28 kayhayen set messages: + msg1077
2015-03-25 13:49:15 alex_hu set status: unread -> chatting
assignedto: alex_hu -> kayhayen
messages: + msg1063
2015-03-25 13:24:13 alex_hu set assignedto: kayhayen -> alex_hu
nosy: + alex_hu
2015-03-25 13:23:33 alex_hu set status: chatting -> unread
2015-03-25 13:13:16 alex_hu create