Thursday, February 4, 2016

byteplay ok on linux

Oh my but this was a strenuous afternoon. I believe I shall now drone on about my adventures, just because the story ends well.

First, I installed Ubuntu 14.04 LTS on my 64-bit dev virtual machine. I kind of hated to do this because the install ISO does not offer what to a Mac user is the sensible choice of reinstalling the OS without changing the user files. Nope. Either it will install "beside" the existing system, meaning in a different disk partition, or it installs "over" the existing system, which wipes all your user files and settings. So I did the latter, knowing full well I would have to reinstall all my dev tools and dependencies.

In a month or two I will need to install Py/Qt5.6 and some Python modules, so I can upgrade PPQT and Cobro. But for the time being, I only needed to get a working Wing IDE and Python 3.5. The Wing IDE is a simple install, then activate with my license key.

Python.org does not offer an installable package for Linux. There are Ubuntu/Debian packages for Python 2.7 and for Python 3.4, but I needed 3.5. The only way to get that is by downloading the source package and making it. Well, I've done that often enough. Download, unzip, make, make test, sudo make install. All seems to go well until I try to start Wing, and it can't start Python. "Missing module _struct".

DuckDuck that phrase and you will find that hundreds of people have encountered it. What does it mean? It can be properly translated as, "you suck at installing software". Something didn't go well in the make-install step. More missing things show up when I find that neither pip nor easy_install was installed. When I try to download and run get-pip.py, it fails with an error because the SSL lib wasn't installed, and neither was zlib.

Scrolling back in the make output I find a list of modules that it couldn't locate. Did it put the list at the end where a person would notice it? Of course not! It buried it between a couple of hundred lines of output before and after. Did it tell you what to do? Of course not! Well, it did; it helpfully advised that you "look for the name of the missing module in setup.py".

Which I did, and saw where it built a list of libraries to search for things. Then I got a terminal window and used the find command to find the things it was not finding. Oh, there they are, in a special 64-bit library. Add the path to that library to the code of setup.py and rerun the make install. OK then! Now I've got pip, and Wing is working, whee.

So now I can attempt what I set out to do a couple hours earlier, pip install byteplay3. Which of course fails with an obscure message, "Could not find a version that satisfies the requirement". What requirement? Back to DuckDuckGo. Oh yeah, plenty of people were having this problem a couple years ago, when pip began to require version numbers that complied with PEP 440. I read PEP 440. No, my version number fits the pattern. But in one of these postings I see a remark that, oh, also, pip does not support modules that are not hosted directly at PyPi. A module hosted at github, for example. Oh. Which I am doing.

So I find out about setup.py's build and sdist and upload commands. And find out the hard way that you can't do

python setup.py sdist
python setup.py upload

oh no no no! You have to do

python setup.py sdist upload

all one line. Why? Apparently the upload command only acts on the distribution files built during the same invocation; run by itself, it has nothing to upload.

Anyway, did that, and then on both my Linux system and my Mac OS system I could do pip install byteplay3; it installed, I could start Python and run the example from the readme, and it worked.

Tomorrow, Windows. I am so looking forward to that.

Wednesday, February 3, 2016

On the treadmill again

So part of supporting byteplay3, which I have optimistically uploaded to PyPi, is to check that it installs on Linux and on Windows. So I fired up Parallels and opened my 64-bit Ubuntu dev machine.

Now, I do this not very often, so it seems inevitable that every time I do it, I have in the interim updated Parallels itself, and every time Parallels updates, it wants to update the Parallels Tools component it inserts into every virtual machine.

So of course as soon as I open the Ubuntu dev VM, it starts installing a new Parallels Tools. Which for some reason takes 5 minutes to crawl its progress bar across, and then wants a reboot.

Meanwhile I thought, hey, let's make sure Ubuntu is up to snuff also. I bring up the Synaptic package manager. Current Ubuntu has a much simpler "Software Updater" program; and I know that the hard-core Linux user doesn't want any of that GUI shit, nothing but sudo apt-get on the command line will do. But me, I remember the SGI package manager for IRIX (though I can't remember its name), and Synaptic is a dead ringer for it. So I use Synaptic.

I tell it to reload its info; it says it is reloading, but then reports that it couldn't find a bunch of repositories. It marks a bunch of upgrades, but when I tell it to apply them, it grinds a while and says it couldn't download the packages. 404s everywhere.

imagine an hour or so of increasingly irritable investigation here...

Long story short, at some time in the near past, I unwisely and unwittingly approved an upgrade from Ubuntu 14.04 LTS, to Ubuntu 14.10. Probably seemed like a reasonable thing to do at the time, you know? Minor point-release upgrade? But a bad, bad move on my part!

What was making Synaptic (and the Software Updater) fall flat was that their apt sources-list file specified the "utopic" release in the repositories. Because, you see, I had upgraded to 14.10 Utopic Unicorn. Unfortunately, 14.10 Utopic was end-of-lifed back last July and all the "utopic" repositories were pulled from Ubuntu and Canonical. So that's why the 404s: the repositories are all gone.

Wait, what? I thought that Ubuntu 14 was a long-term support version!

No no no, you foolish person. Ubuntu 14.04 Trusty Tahr, that was the LTS release.

When I stupidly approved an upgrade to 14.10, I moved away from the LTS system and once more stepped onto the every-six-months upgrade treadmill. I cannot apply maintenance to 14.10; I must upgrade to 15.04 Vivid Vervet, and very soon again to 15.10 Wily Werewolf, as per this diagram.

Alternatively I can find a DVD-ROM image of 14.04 and install it over my system to force a downgrade, which I may just do.

Meanwhile, fortunately, the 64-bit test system (without Python or Qt) is still on 14.04, as are, I hope, the two 32-bit systems.

Meanwhile, less fortunately, the Windows 7 dev system seems to have forgotten how to run Pip-Win. It looks as if it is starting Python 3 but with the Python 2.7 lib. So some stupid error in the PATH or other environment variable. So I have that to look forward to, tomorrow.

Meanwhile again, I applied the 15.04 upgrade and guess what? The dev system no longer remembers that it is a 1440-px-wide window. Lost all the window sizes except 800. Another config file to remember the name of and edit.

Tuesday, February 2, 2016

Function signatures; bytecode lacks "computed go-to"

Well, sure, there was a bug. A good thing I blogged about it, or who knows how long it would have gone untested and unfound?

The Code class represents a code object, but in a form that can be manipulated. It has two crucial methods. from_code() accepts a Python code object and captures all its contents, returning a new instance of the Code class. I had properly updated that method to notice the Python 3 features of varkwargs and kwonlyargcount.

I had not completely dealt with these in the other method, to_code(). Calling to_code() of a Code object returns a code object that is supposedly equivalent. But I had not included the kwonlyargcount in the calculation of the code.co_argcount. So it was off by 1, causing a TypeError exception, wrong number of arguments, when you called the code.
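The relationship between the two counts is easy to check on a live code object. A small illustrative snippet (the function name is made up, and this is ordinary Python, not byteplay code):

```python
def sample(a, b, *args, z=True):
    return a, b, args, z

code = sample.__code__
# co_argcount counts only the ordinary positional parameters (a, b);
# keyword-only parameters are counted separately in co_kwonlyargcount.
print(code.co_argcount)        # 2
print(code.co_kwonlyargcount)  # 1
```

Conflating the two counts when rebuilding a code object produces exactly the kind of off-by-one TypeError described above.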

Testing for that also revealed a bug in my unit-test scaffold code. But it's all good now.

Tiny Basic and the computed go-to

In the first post in this series, I mentioned that one use of byteplay would be to implement domain-specific languages using Python bytecode as the implementation language. And I suggested I might try to implement a Tiny Basic compiler in this manner. Turns out? Not so much.

Historic background: Tiny BASIC was the term for several minimal BASIC interpreters for early microcomputers. The first was written by Tom Pittman of Itty Bitty Computers. The more influential version—because it was for the 8080 whereas Pittman's was for the 1802—was by Li-Chen Wang. I was aware of Tiny BASIC although I never used it. (I programmed my Z80-based CP/M system in assembler, thank you.)

Anyway, Tiny BASIC is such a minimal language that its interpreter, including an adequate editor, can be implemented in a few kilobytes. On a walk yesterday I thought about it and at first got rather excited about the possibility. I quickly arrived at a program structure (a dict keyed by the line number with the text of the line as value) and visualized how I could use the built-in compile() function to reduce expressions to bytecode, and glue those bytecode bits together with more bytecode and thus produce a whole Python function from a BASIC program.
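The expression-compiling half of that plan does work, for what it's worth. The built-in compile() will happily reduce a BASIC-style expression string to a code object, which can later be evaluated against a dict standing in for the BASIC variables (a hypothetical sketch, not anything from byteplay):

```python
# Compile a BASIC-style expression once, to a reusable code object.
expr_code = compile("A + B * 2", "<basic>", "eval")

# Later, evaluate it against a dict acting as the variable store.
variables = {"A": 1, "B": 3}
print(eval(expr_code, variables))  # 7
```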

Then, sitting in a coffee shop, I used my phone to look up the syntax of the language...

...and is it not a fabulous age we are living in, where one can be sitting over a cappuccino and have the passing notion, "I wonder what was the syntax of a programming language last used over thirty years ago?" and be reading the answer in less time than it takes to describe it? Seriously, people, why are we not all happy as kings?

So, here's the manual. Less than five minutes of skimming on the little screen of the phone revealed a major, major problem.

The problem is that Python bytecode has no indirect "jump" instruction of the kind found in an assembly language. It has several jump opcodes, for example POP_JUMP_IF_TRUE and JUMP_ABSOLUTE, but the argument to each of these is a fixed integer offset in the bytecode string. The amount to jump forward or backward, or the absolute offset to jump to, is hard-coded in the instruction. There is no way in bytecode to say, jump to the offset encoded in the top-of-stack item.
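The dis module makes this visible: every jump instruction in a compiled loop carries a literal target, fixed at compile time (the exact opcode names vary from one Python version to the next, so this just filters for anything jump-like):

```python
import dis

def countdown(n):
    total = 0
    while n > 0:   # compiles to a conditional jump with a fixed target
        total += n
        n -= 1
    return total

for ins in dis.get_instructions(countdown):
    if 'JUMP' in ins.opname:
        # ins.arg is a hard-coded number; no opcode takes its
        # jump target from the value on top of the stack.
        print(ins.opname, ins.arg)
```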

Without such a "computed go-to" it becomes much more difficult to implement Tiny BASIC, because Tiny BASIC has both a computed GOTO and a GOSUB. Of the GOTO, Pittman's manual explicitly says, "the next statement to be executed after a GOTO has the line number derived by the evaluation of the expression in the GOTO statement. Note that this permits you to compute the line number of the next statement on the basis of program parameters during program execution" (my emphasis). The GOSUB presents the same problem in two ways. First, it takes an expression, so the actual target is determined at execution time. Second, it stores the return location on a stack for use by the RETURN statement. RETURN is effectively a GOTO whose destination is fetched from the call stack.

None of these things can be implemented using a bytecode jump. So that pretty well ends any thought of compiling a Tiny BASIC source file into a single Python function.

I did think of a complicated mechanism, basically each line of the BASIC program would compile into a separate function that could operate on globals (the BASIC variables) and must return the number of the next line to be executed, with None meaning, "next sequential". But it would be kind of ugly and clumsy and not a good demo of byteplay3.
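That clumsy scheme is at least easy to sketch in ordinary Python (everything here is hypothetical, not byteplay output): each line-function takes the shared variable dict and returns the next line number to run, or None for "next sequential".

```python
def run(program):
    """Interpret a dict of {line_number: line_function}."""
    variables = {}
    lines = sorted(program)
    i = 0
    while i < len(lines):
        target = program[lines[i]](variables)
        if target is None:
            i += 1                    # fall through to the next line
        else:
            i = lines.index(target)   # the "computed GOTO"
    return variables

# 10 LET X = 0   20 LET X = X + 1   30 IF X < 5 THEN GOTO 20
program = {
    10: lambda v: v.__setitem__('X', 0),
    20: lambda v: v.__setitem__('X', v['X'] + 1),
    30: lambda v: 20 if v['X'] < 5 else None,
}
print(run(program))  # {'X': 5}
```

Workable, but as said above, a dispatch loop over Python functions is not much of a showcase for bytecode manipulation.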

Monday, February 1, 2016

Byteplay3 uploaded to pypi

The PyPi registration process turned out to be remarkably easy. I had to register as a user, something I'd never done despite hundreds of uses of pypi over the years. Then it was just python setup.py register and that was it.

It took a little while to show up in a search at PyPi, but here it is now. I do not understand why the descriptive paragraphs (which the PyPi "edit package" page clearly says can use reStructuredText markup) are not formatted right. The rST markup for the example and the headings is just being ignored. Meh. Don't care.

What I did to Noam's code

Heh heh quite a lot, really. For one thing, I commented the bejeezus out of it. That was how I went about understanding it (and it took some understanding, lemme tell ya). I read it (and read it and read it) and then I commented what it was doing. Initially I peppered it with comments like #TODO what the bleep is this for? and the like, then gradually replaced all those with meaningful comments.

Prior to and during this work, I was watching Philip Guo's lectures on Python internals. Although these are based on Python 2.7, he leads you through the modules of CPython that are involved with compiling and executing code. So part of the experience of making byteplay work in Python 3 was discovering what had changed in CPython's internals from 2.7 to 3.4. The changes are mostly cosmetic; for instance, the names of attributes of the function object are different. A few bytecodes have been added and deleted. But the concepts are pretty consistent.

Function Signatures

Some significant changes involve the code object and its handling of the function signature. The *args and **kwargs features themselves date back to Python 2, but Python 3 added keyword-only arguments (PEP 3102), and the Python 3 code object has properties to deal with them that are not in Python 2.

If the signature of a function has some arguments, then *args, then another argument:

def foo( a, b=False, *args, z=True ):

then the argument z is a keyword-only argument. It cannot be entered as a positional value; it must be specified with z=:

res = foo( 1, True, 2,3,4, z='z' )

The code object attribute co_kwonlyargcount is the count of such arguments. If the function has such arguments, it is nonzero. This attribute wasn't present in Python 2, and I had to add a matching attribute to the Code object that is the centerpiece of byteplay.
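A quick check shows both the attribute and the call behavior, using the example function from above (ordinary Python, nothing byteplay-specific):

```python
def foo(a, b=False, *args, z=True):
    return (a, b, args, z)

# One keyword-only argument, z:
print(foo.__code__.co_kwonlyargcount)  # 1

# A value passed positionally cannot reach z; it lands in *args:
print(foo(1, True, 2, 3, 4, 'z'))   # (1, True, (2, 3, 4, 'z'), True)

# Only z= reaches it:
print(foo(1, True, 2, 3, 4, z='z')) # (1, True, (2, 3, 4), 'z')
```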

Byteplay2 did know about the varargs attribute of a code object, a flag bit to say if it has a *args in its signature. (So, that's from Python 2.) However, Python 3 has added a varkwargs flag. That also I had to incorporate into the Code class.
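At the code-object level these show up as bits in co_flags, for which the inspect module provides named constants (again just a quick check, not byteplay code):

```python
import inspect

def f(*args, **kwargs):
    pass

flags = f.__code__.co_flags
print(bool(flags & inspect.CO_VARARGS))      # True -- has *args
print(bool(flags & inspect.CO_VARKEYWORDS))  # True -- has **kwargs
```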

All of these affect the interpretation of other parts of the code object. One crucial thing in a code object is a tuple, varnames, composed of the names (strings) of all local variables. The sequence of names in this tuple is:

  • Names of normal arguments, ones that precede a *args or **kwargs item, in their declared order. From the example above, a and b.
  • Names of keyword-only arguments, if any. In the example, z.
  • Name of a *args argument, if any (it doesn't have to be args).
  • Name of a **kwargs argument, if any.
  • Names of other local variables.

The varnames tuple is rather important because bytecodes that reference a local, such as STORE_FAST, have as their argument an index to this tuple. But the composition of the tuple is not drop-dead simple. The Code object method from_code() takes one apart and stores the pieces; and the to_code() method assembles the pieces to recreate a code object. And that had to change a bit for Python 3.
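That ordering is easy to verify on a live code object (an illustrative function, not anything from byteplay):

```python
def foo(a, b=False, *args, z=True, **kwargs):
    x = 1          # an ordinary local, sorted last
    return x

# Positional args, then keyword-only args, then *args, then **kwargs,
# then the other locals:
print(foo.__code__.co_varnames)
# ('a', 'b', 'z', 'args', 'kwargs', 'x')
```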

And now I've reminded myself, that my unit test suite lacks a test of function signatures with *args and **kwargs. I think I shall go and do one now...