Ride the carousel of folly with a Python migration

I wanted to migrate my shelljob module from Python 2 to 3. I use this as part of the Leaf unit test setup. What I thought would be relatively straightforward has become quite a problem. I had trouble with the code itself. I had trouble uploading to PyPI. I had trouble generating documentation. I’m starting to regret even doing this.

The code migration

I presumed that I’d just open a “Migration from Python 2 to 3 Guide” and follow the instructions. Alas, there is no such guide. There appears to be no central resource whatsoever on the porting of code from version 2 to version 3. The tool 2to3 also seemed not to do anything to my code, so I was left on my own. Thankfully I have test cases.

I decided not to support both versions of Python at the same time with my module. The resources I found for that were overly complex for such a simple module. I’d just leave the old Python 2 code as-is and port a new version to 3.

The obvious syntactic changes are quite easy to find. I mainly had some print and except statements to change. Renaming Queue to queue was also easy. A few uses of unicode needed replacing. I changed one open call to use wb mode and used .encode() to take care of binary output.
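As a rough sketch of what those changes look like in Python 3 (the helper names here are made up for illustration and are not taken from shelljob):

```python
import os
import queue  # Python 2 spelled this module "Queue"
import tempfile


def parse_count(text):
    """Hypothetical helper: return text as an int, or None if it is not a number."""
    try:
        return int(text)
    except ValueError as error:  # Python 2 also accepted "except ValueError, error"
        print("not a number:", error)  # print is a function in Python 3
        return None


def write_binary(path, text):
    """Hypothetical helper: write text to a file opened in binary mode."""
    with open(path, "wb") as handle:
        handle.write(text.encode("utf-8"))  # "wb" mode only accepts bytes


results = queue.Queue()
results.put(parse_count("12"))
results.put(parse_count("twelve"))

path = os.path.join(tempfile.mkdtemp(), "out.bin")
write_binary(path, "done\n")
```

The encoding is assumed to be UTF-8 here; the right choice depends on what consumes the file.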

I had a WrapException type before, and I couldn’t figure out why it no longer worked (it involved trickery). I just replaced it with Python 3’s raise ... from sourceException.
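For reference, exception chaining with raise ... from looks roughly like this. The SettingsError type and read_settings function are hypothetical stand-ins, not the original WrapException code:

```python
class SettingsError(Exception):
    """Hypothetical wrapper exception, used only for this illustration."""


def read_settings(path):
    try:
        with open(path) as handle:
            return handle.read()
    except OSError as source:
        # "from source" stores the original error in __cause__,
        # so the traceback shows both exceptions.
        raise SettingsError("could not read " + path) from source


try:
    read_settings("/no/such/file")
except SettingsError as error:
    cause = error.__cause__  # the original OSError
```

This needs no custom wrapping trickery since the interpreter keeps the link between the two exceptions itself.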

The only somewhat problematic code was this:

for line in iter( handle.stdout.readline, '' ):
    self.output.put( ( handle, line ) )

This no longer terminated and emitted lots of odd values (one number repeatedly). It turns out that readline on a binary stream now returns an empty b'' at the end of the stream, as opposed to a ''. Since b'' never compares equal to '', the sentinel passed to iter is never matched and the loop never ends. This could have been avoided had this API sensibly returned None instead of an empty string. (I know the error seems simple, but it wasn’t simple to locate that this was the code responsible for my problem.)
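A minimal sketch of the corrected loop follows. It uses an in-memory io.BytesIO as a stand-in for handle.stdout, on the assumption that the pipe is read in binary mode, and a standalone drain function in place of the original method:

```python
import io
import queue


def drain(stream, output):
    """Put every line of a binary stream on the queue, stopping at end of stream."""
    # The sentinel must be b"" because readline() returns bytes here.
    for line in iter(stream.readline, b""):
        output.put((stream, line))


output = queue.Queue()
drain(io.BytesIO(b"one\ntwo\n"), output)
```

If the stream were opened in text mode instead, readline would return str and the original '' sentinel would still be the right one.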

It didn’t take me too long to migrate my module, though it is rather small. Syntactic changes are easy, but the subtle differences in dealing with streams and binary/text data would be a major pain in a large project.

PyPI

Once done I wanted to upload a new version to PyPI. One would expect that if supporting both Python 2 and 3 were important to the language, there would be some easy way to upload modules for both versions. Alas, there is not. The best I could find is a StackOverflow post.

The solution has setup.py detect the version of Python and use an appropriate set of files. This works fine for my simple module. It doesn’t really allow for proper versioning of the old code, but luckily I don’t intend on supporting that anyway.
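The idea can be sketched as below. This is only an illustration of the approach, not the actual setup.py from the StackOverflow post; the py2 and py3 directory names are assumptions:

```python
import sys


def select_source_dir(version_info=sys.version_info):
    """Pick the source tree for the running interpreter (directory names are hypothetical)."""
    return "py2" if version_info[0] < 3 else "py3"


# In setup.py the result would be handed to setuptools, roughly:
#
#   from setuptools import setup
#   setup(
#       name="shelljob",
#       packages=["shelljob"],
#       package_dir={"": select_source_dir()},
#   )
```

Both source trees ship in the same source distribution, and the choice is made when the package is installed.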

This setup ended up with a problem: pip install -e . no longer installed the module correctly. Luckily I wasn’t the first poor soul with this problem.

I also realized there is nothing in the PyPI package description that actually indicates what version of Python is being targeted. This is probably why I’m forced to use pip2 and pip3 when installing packages. Effectively Python 2 and 3 are entirely different environments, yet they regrettably share a version-less PyPI as the module source.

API Docs

The simple-to-use epydoc, which I previously used, doesn’t like Python 3 source code, so I needed to find an alternative.

I tried sphinx. Good, they have a quickstart guide… but it produced no actual output. Running sphinx-apidoc produced some stuff, but generating the docs just produced an error. I ended up with a directory of configuration, skeleton files, and some API junk. This turned me off this solution quite fast. Products need to have really simple entry points for new users.

I then tried pydoc. Its interactive mode seemed okay, but I had problems convincing it to produce output files. It just kept saying no Python documentation found. Using a single module name worked, pydoc3 shelljob, but it created a file without the individual class docs. Using a directory form, pydoc3 shelljob/, just caused more errors to be emitted and no files generated.

I also tried pdoc. It was simple and ran without issue. It unfortunately didn’t support the standard documentation markup like @param.
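For context, the @param markup in question is the field style that epydoc reads from docstrings. A small example, using a hypothetical function that is not part of shelljob:

```python
def repeat(text, count):
    """
    Repeat a string a number of times.

    @param text: the string to repeat
    @param count: how many copies to produce
    @return: the repeated string
    """
    return text * count
```

A tool that doesn’t understand these fields will typically show them as plain text rather than as a formatted parameter list.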

My frustration calmed enough to try sphinx again. Running just sphinx-autoapi gave me something. I wasn’t impressed, but it worked and I didn’t feel like spending more time on it.

Moving on

After I finished I just had a few things in the actual Leaf code base to change. The change in encoding of streams would definitely be horrible to migrate on a larger project.

Based on my experience I’m not surprised that many things are still done in Python 2 only. A migration guide would definitely be helpful. This should include information on how to publish dual version code to PyPI. That would naturally include some suggestions on how to write documentation, including what tools are helpful. These last two would be helpful even for projects starting in Python 3.