Saturday, November 29, 2014

Early Morning Obsessions

So yesterday a user of PPQT V1 found a real bug, the first actual "this is a coding error that ought to be fixed" bug since I stopped development many months ago. It's something in the ASCII reflow code. Under a particular setting of parameters, reflowing a poem produces a stack trace,

Traceback (most recent call last):
  File "/Users/original/Desktop/scratch/build/ppqt/out00-PYZ.pyz/pqFlow", line 374, in reflowDocument
  File "/Users/original/Desktop/scratch/build/ppqt/out00-PYZ.pyz/pqFlow", line 591, in theRealReflow
  File "/Users/original/Desktop/scratch/build/ppqt/out00-PYZ.pyz/pqFlow", line 1350, in optimalWrap
IndexError: list index out of range

So it is something in the rather complex Knuth-Plass optimal rewrap logic. I haven't looked at the code yet to see what the problem really is, nor have I made up my mind what to do about fixing it. There appears to be a not-too-awkward work-around so maybe I'll do nothing. I really do not want to have to rebuild the distribution bundles for V1. It would probably be doable. I just don't want to do it.

That's one of the thoughts I'm having, here at 6am in Honolulu, where we are visiting for the Thanksgiving weekend, instead of sleeping for another hour. Lying in bed obsessing about PPQT instead of sleeping.

But there's another and more serious thought, and that is about Qt and Edit Blocks, sometimes called edit macros. If you want to make a series of edit actions undoable with a single Undo, you create an edit block:

    work_tc = QTextCursor(my_edit_document)
    work_tc.beginEditBlock()
    # change the document in many ways via
    # work_tc, often in a loop
    work_tc.endEditBlock()

It works quite well, as long as the endEditBlock() call is executed. Well, why would it not? It would not if, somewhere between the Begin and the End, your program raises an uncaught exception such as "list index out of range", one that is not inside a try-except block.

That's what is happening above. Reflowing the text is done inside an edit block, and because of the exception, the block is never ended.

Normal Python programs, when they cause a stack trace like this, simply terminate. But not a Qt app! The top function in the stack trace, reflowDocument(), was called by the QApplication as a result of processing an event, in fact the event of clicking on the "Reflow Document" button. The QApplication doesn't care that this subroutine ended with an exception. It ended, that's all. The QApplication keeps running, processing other events, calling other methods of the various objects to respond to button clicks and menu choices and edit keystrokes.

I really don't know what happens to the open Undo block. I presume the variables created by reflowDocument() go out of scope. What happens when a QTextCursor with an open Undo Block goes out of scope and is garbage-collected? The user reports that the operation cannot be undone.

In fact the whole document is in an ambiguous state and probably should not be saved. Maybe it is alright but maybe there is some garbage in it, or some text is missing (because the reflow logic deleted it and had not put it back when the error occurred). But if the user calls Quit, there will be a prompt to save the modified document.

The very important lesson here is: An Undo Block should never be opened unless it can be guaranteed to close. The situation is just like modifying a file: the logic should always be

    work_tc = QTextCursor(my_edit_document)
    work_tc.beginEditBlock()
    try:
        # change the document in many ways via
        # work_tc, often in a loop
        pass  # placeholder so the sketch parses
    finally:
        work_tc.endEditBlock()

Thus the Undo Block will always be closed no matter what goes wrong.
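The same guarantee can be packaged once as a context manager, so that no call site can forget the try/finally. This is only a sketch: edit_block and FakeCursor are hypothetical names, and FakeCursor is a stand-in that merely counts open blocks so the example runs without Qt. With PyQt, a real QTextCursor would be passed instead.

```python
from contextlib import contextmanager


@contextmanager
def edit_block(cursor):
    """Open an edit block on cursor and guarantee that it is closed."""
    cursor.beginEditBlock()
    try:
        yield cursor
    finally:
        # Runs whether the body finished normally or raised.
        cursor.endEditBlock()


class FakeCursor:
    """Hypothetical stand-in for QTextCursor; only counts open blocks."""

    def __init__(self):
        self.depth = 0

    def beginEditBlock(self):
        self.depth += 1

    def endEditBlock(self):
        self.depth -= 1


def demo():
    """Raise inside an edit block and report how many blocks stay open."""
    cursor = FakeCursor()
    try:
        with edit_block(cursor):
            [][1]  # simulate the IndexError from the reflow code
    except IndexError:
        pass
    return cursor.depth
```

Here demo() returns 0: the block was closed even though the body raised. Closing the block does not repair a half-edited document, but it should leave the partial change as a single undoable step.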

I did not do this at any of the several points where I have Undo Blocks in V1. And I realize (here in the dark, not sleeping on a Saturday morning in Hawaii) that I just wrote the first use of an Undo Block in V2, in the footnote code, and I did not do it there. So that's bad. I need to fix that.

But what shall I do about this V1 problem? I will probably have to try to fix it and make new distribution bundles. You cannot imagine my reluctance to do so.

Thursday, November 27, 2014

Thanksgiving note

I'm thankful for having the leisure to futz around with programming to my heart's content.

Just as a side note, although Qt 5.4 isn't out, the documentation for it is up at the shiny new Qt website, Qt.io, and looks very nice.

Oh, also, the footnotes panel is almost done. The data model is done and tested; the view/controller is completely coded and I'm confident a day of testing it will finish it. Tuesday next, hopefully.

When that's done, and if Py/Qt5.4 is still not out, I will do a long-neglected piece, the promised Preferences dialog, or at least a first draft of it.

Monday, November 17, 2014

Perhaps I Underestimated...

So the other day I happily wrote, "code the footnotes module. That will go fast; most of the code can be lifted out of version 1 and needs only a wash and brush-up to use the V2 APIs."

So, not quite. The version 1 module is 1200 lines of fairly complex code which I haven't looked at in a year. I want to break this up into a "model" module and a "view" module. So I have to figure out which bits to copy into the model and which into the view. That's pretty clear, thanks to my generally readable and structured coding style, but still takes thought. So does defining an API between them, one that will be logically clean and maintainable, but also will not add needless overhead to slow operations down in a large book.

There will be big chunks of code that can be copy/pasted, but even those lines will need individual editing.

What does carry over is the general logical structure and methods. For example the code to find footnotes took some time to work out originally, but I can reuse the logic flow. It goes like this, approximately:

Scan all the lines in the document looking for footnote[*] "anchors"[B] like those.
    Save them as a list of QTextCursor objects that select just the Key strings ("*" and "B")
Scan all the lines in the document looking for "^\[Footnote (Key):..." 
    For each, find the end of the note, defined as the next line that ends with "]"
    Save them as a list of QTextCursor objects that select the line(s) of each Note
Merge the two lists matching the Key of each Anchor to the first matching Note after it
    Remove the cursors for matched notes from their lists and
        add the (anchor, note) pair of cursors to the database as an item
    Insert the remaining unmatched Anchors in the database as (anchor, None)
    Insert the remaining unmatched Notes in the database as (None, Note)
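The flow above can be sketched in plain Python. This is not the V1 code: it works on a list of text lines and records line numbers where the real module keeps QTextCursor objects, and the anchor pattern (a "*", a single letter, or digits in brackets) is an assumption made for illustration. All function names are hypothetical.

```python
import re

# Assumed syntax: an anchor is [*], [B] or [12]; a note starts "[Footnote key:".
ANCHOR_RE = re.compile(r"\[(\*|[A-Za-z]|\d+)\]")
NOTE_RE = re.compile(r"^\[Footnote (\*|[A-Za-z]|\d+):")


def find_anchors(lines):
    """Return (key, line_number) for every anchor, in document order."""
    return [(m.group(1), n)
            for n, line in enumerate(lines)
            for m in ANCHOR_RE.finditer(line)]


def find_notes(lines):
    """Return (key, first_line, last_line) for every note."""
    notes = []
    n = 0
    while n < len(lines):
        m = NOTE_RE.match(lines[n])
        if m:
            end = n
            # A note runs to the first line ending with "]" (maybe its own).
            while end < len(lines) - 1 and not lines[end].rstrip().endswith("]"):
                end += 1
            notes.append((m.group(1), n, end))
            n = end + 1
        else:
            n += 1
    return notes


def match_footnotes(lines):
    """Pair each anchor with the first unmatched same-key note after it."""
    notes = find_notes(lines)
    items = []
    for anchor in find_anchors(lines):
        key, line_no = anchor
        hit = next((note for note in notes
                    if note[0] == key and note[1] > line_no), None)
        if hit is not None:
            notes.remove(hit)
        items.append((anchor, hit))  # (anchor, None) when unmatched
    items.extend((None, note) for note in notes)  # leftover notes
    return items
```

Each item is an (anchor, note) pair with None standing in for the missing half, mirroring the three kinds of database entry listed above.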

Everything the view displays to the user in a 6-column table can be derived from the two QTextCursors, and QTextDocument keeps the QTextCursors updated as the user edits.

What takes time is that this code needs to be brought into a class definition, because each Book has its own Footnote database object. (Remember: allowing multiple books open changes everything.) So all of what are global variables in V1 become self.variables in V2. And I generally change camelCase names (other than Qt interface names) to under_score_names. So everything gets edited and moved.

So not really a "wash and brush-up" but more like a "strip it to the studs and put in all new wallboard and floor tile" remodel. 'twill take a few days. Satisfying work, though.

Friday, November 14, 2014

New Metadata Done; Looking Ahead

On branch new_meta, changed a number of modules to use the new JSON-based metadata system, and to store that metadata in a file suffixed .ppqt instead of .meta. Also added signals to the worddata and pagedata modules and slots to the corresponding view modules so that when the metadata is read in, the visible tables based on it update automatically. In the course of this I had to rewrite the test drivers for all the affected modules, and in that process made a number of improvements in how they were coded. Tested it all, and it seems to be working very well.

git checkout master; git merge --no-ff new_meta; git push origin master and done.

A couple of minor tweaks to do to things I noticed while going through the code; and I want to spend several hours tidying up the Tests folder and making sure that py.test runs things correctly. I would like to bring the Sikuli-based UI tests under the py.test umbrella but am not quite sure how to do that.

Here are the things to do after that.

  1. When Qt5.4 and the matching PyQt are available (which should have happened already but hasn't), install those on the new iMac and move development to there. There's no real "moving" involved other than my ass from one chair to another, as all the affected files are in Dropbox anyway.
  2. Then, bring CoBro up to the Qt5.4 level and replace the execrable WebKit browser with the new WebEngine one.
  3. Then, use CoBro as a test-bed for learning how to use pyqtdeploy to bundle an app. I am eager to find out if this is truly a way to make a self-contained executable on all 3 platforms, in place of pyinstaller.
  4. Presuming that works (and that the new web engine fixes the frequent crashes induced by webkit), release CoBro on all three platforms.
  5. Then, or right now to pass the time waiting for Qt5.4, code the footnotes module. That will go fast; most of the code can be lifted out of version 1 and needs only a wash and brush-up to use the V2 APIs.

At that point — which might be reached in calendar 2014, certainly in early 2015 — PPQT2 will be at what could well be called an alpha state, that is, with adequate function that an experienced user could post-process a book with it. That user would have to run from source, however, until the pyqtdeploy work is complete.

The work to be done after that includes:

  • Writing the translation interface module, which includes figuring out how to dynamically load translator modules.
  • Writing the plain-ascii example translator
  • Writing the HTML example translator
  • Bringing the "keyboard palettes" of V1 forward to V2 and making them load dynamically (using the same scheme as the translators?)
  • Finally going back into the UI and making panels drag-out-able, applying the drag-drop research with which I began this series of posts many months ago.
  • Writing the Help file and adding the Help panel
  • Rewriting the "suggested workflow" document to reflect all the changes; for this I will want to actually post-process a book myself to make sure I know the best way to use the app.
  • Making some screencasts to explain PPQT and show its features. The V1 screencast I made impressed a few people much more than any amount of words.

I would love to have this all done by mid-2015 but suspect it might drag on a bit longer.

Monday, November 3, 2014

JSON Metadata: sorted dicts and sordid ones

Continuing on git branch new_meta. Finding each module that calls the metadata manager and recoding it to save and load in the new JSON format. This usually results in a vast simplification. Previously, the "writer" method received a stream handle and was responsible for creating and writing formatted lines of text to encode its type of metadata; and the "reader" method got a stream handle and had to read the lines of formatted text and decode them. Under the new regime, the writer returns a single Python value (typically a list or dict), and the reader gets that single value as an argument. No more formatting data as lines and streaming them with << or >> operators. Just a blob of data out, a blob of data in.
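The shape of the new scheme, a blob out and a blob in, might look something like the sketch below. It is not the real metadata manager: MetaManager, register, dump, load, FontSize and the "EDITSIZE" section name are all hypothetical, chosen only to show a writer that returns one Python value and a reader that receives one.

```python
import json


class MetaManager:
    """Hypothetical manager: one JSON object, one member per section."""

    def __init__(self):
        self._handlers = {}

    def register(self, section, reader, writer):
        self._handlers[section] = (reader, writer)

    def dump(self):
        # Each writer returns a single JSON-serializable value.
        return json.dumps({section: writer()
                           for section, (reader, writer)
                           in self._handlers.items()})

    def load(self, text):
        # Each reader receives the single value stored for its section.
        for section, value in json.loads(text).items():
            if section in self._handlers:
                reader, _ = self._handlers[section]
                reader(value)


class FontSize:
    """Hypothetical owner of one piece of metadata."""

    def __init__(self, size=12):
        self.size = size

    def writer(self):
        return self.size

    def reader(self, value):
        # The user may have edited the file, so check before accepting.
        if isinstance(value, int) and not isinstance(value, bool) \
                and 6 <= value <= 72:
            self.size = value
```

A module would call register("EDITSIZE", font.reader, font.writer) once, after which dump() and load() need to know nothing about how that module formats its data.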

For each module there's a modname_test module that exercises it. These unit-test drivers used the metadata system heavily. They formatted metadata streams and pushed them in via the metadata manager, and then used the manager to suck the metadata back and check it. Or pushed in invalid metadata and checked the contents of the log for proper error messages. It was a handy way to exercise every branch.

Naturally, when the metadata readers and writers of a module change, so also must the test code that prepares metadata and reads it back. So far there are about three times as many lines of code to alter in the test drivers as in the driven code. (Picture a frowny-face icon here.)

All went smoothly modifying and testing the four types of metadata handled by book.py (edit font size, edit cursor position, default dictionary tag, and user bookmark positions 1-9). Each of the reader/writer pairs became simpler, as expected.

Next up in alphabetic sequence is chardata.py. This is the module that maintains the census of characters in the document. Originally it did it using a sorteddict from the blist package, but recently I discovered the sortedcontainers package which is as fast as blist, and pure Python.

Either way, the character census is in a SortedDict object with single Unicode characters as keys and integer counts as values. So obviously, the metadata writer function could consist of just "return self.census"; that is, return the value of the dict of character counts. The reader would receive that dict as a single value. The reader has to be a bit more careful, because the user might have edited the metadata, so it does basic sanity checks: are the keys single characters, are the counts greater than 0, etc.

But this pretty scheme didn't work out well for the test driver. The test driver loaded the document with the contents of "ABBCCC" and then called the metadata manager to get the character census. Immediate error: "SortedDict cannot be serialized by JSON". Oh. Right. OK, change the writer to return dict(self.census). Convert the SortedDict to an ordinary dict. This worked in the sense that it could be serialized to JSON, but when the test driver pulled the metadata and compared it, it failed with:

expected: {"CHARCENSUS":{"A":1,"B":2,"C":3}}
received: {"CHARCENSUS":{"B":2,"C":3,"A":1}}

Oops. Obviously what's happening is that when json.dumps() serializes a dict, it writes the items in the order returned by dict.items(), which is the order of the key hash table. That isn't predictable. Time to stop and think.

Ok, I can leave it this way, and write the test driver to basically do a set-wise comparison on two dicts, ensuring that the received dict has all, but only, the keys and values of the expected dict. Not fun. Also, if I leave it as-is, it pretty well screws the possibility of the user editing this part of the metadata file. How would you find the entry for "X" in a random-sequenced list of 150 or more characters? And think ahead to worddata, which has almost the same structure: if its 5000-10000 metadata values aren't in sorted order, what a sordid mess.
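For the comparison half of the problem alone, two standard-library facts are worth noting, shown here as a small sketch with hypothetical helper names. Equality between Python dicts ignores key order, so a test can compare parsed values rather than strings; and json.dumps() accepts sort_keys=True, which writes object keys in sorted order.

```python
import json

expected = {"CHARCENSUS": {"A": 1, "B": 2, "C": 3}}
received_text = '{"CHARCENSUS":{"B":2,"C":3,"A":1}}'


def same_metadata(expected_value, json_text):
    """Compare parsed values; dict equality does not depend on key order."""
    return json.loads(json_text) == expected_value


def canonical(value):
    """Serialize with sorted keys so equal values give equal strings."""
    return json.dumps(value, sort_keys=True, separators=(",", ":"))
```

With these, same_metadata(expected, received_text) is true, and canonical() turns the received value back into the expected string form.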

So better to change the metadata format to something that can be sequenced. I rewrote the metadata writer as:

    return [ [key,count] for (key, count) in self.census.items() ]

The items() method of a SortedDict returns them in sorted order by key. JSON serializes list items in the order given, so they are in the file in sequence. It was no more code in the reader, because the reader already had code to examine each (key, count) item for validity.
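A round trip through that list format can be sketched as follows. To keep the example free of third-party packages it sorts a plain dict where the real code relies on the SortedDict's own ordering, and the function names and the exact validity rules are assumptions based on the checks described above.

```python
import json


def census_writer(census):
    """Return the census as a list of [key, count] pairs in key order."""
    return [[key, count] for key, count in sorted(census.items())]


def census_reader(value):
    """Rebuild a census dict, silently skipping any invalid item."""
    census = {}
    if not isinstance(value, list):
        return census
    for item in value:
        if (isinstance(item, list) and len(item) == 2
                and isinstance(item[0], str) and len(item[0]) == 1
                and isinstance(item[1], int)
                and not isinstance(item[1], bool)
                and item[1] > 0):
            census[item[0]] = item[1]
    return census


def round_trip(census):
    """Serialize to JSON text and read it back, as the manager would."""
    return census_reader(json.loads(json.dumps(census_writer(census))))
```

A real reader would presumably log the rejected items instead of dropping them silently, since the tests check the log for error messages.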

Saturday, November 1, 2014

Parenthetically, the new iMac

I have done little coding this week, partly owing to taking more than a full day to complete the installation of a new computer:

This 27-inch "retina" iMac is sitting on a wall-mounted desk unit that has been the "office" section of the family bedroom since the 1980s. I got to wondering what other computers it has formerly supported. Here's the list as best as I can put it together:

  • An S-100 bus CP/M system with a home-assembled Heath/Zenith Z-19 monitor
  • A Zenith Z-89 CP/M system
  • A Mac SE/30
  • A Macintosh IIci with Radius Pivot monitor
  • A Power Macintosh (Blue and White)—can't remember what monitor that used
  • A Mac Pro with an Apple Cinema Display

I've owned other machines, such as a series of PCs while I was writing books about them, several different Mac portables, at one point even an Apple II with a Z80 CP/M card in it. But the ones listed above were the ones that sat on this office desk and got serious use for multiple years each.

The Mac Pro, bought within a month of its announcement in 2006, served the longest of any, more than eight years. It came with OS X 10.4 "Tiger" installed, shortly upgraded to 10.5 "Leopard", then to 10.6 "Snow Leopard".

Snow Leopard was a splendid OS, and I used it for nearly three years before I reluctantly "upgraded" (an upgrade it was not) to 10.7 "Lion". That was the end of the line, because this early-model machine had a 32-bit BIOS and was cut off from the genuine upgrade to 10.8 "Mountain Lion".

I kept the machine for two more years as it fell farther and farther off the software state of the art, simply because Apple didn't offer an adequate replacement. The new "canister" Mac Pro didn't interest me because I was tired of piecing systems together out of components. I didn't want to have to figure out what kind of external disk to buy for it, and anyway the current Apple monitors were clearly lagging technologically. Sooner or later, I was sure, Apple would have to produce a nice, tidy all-in-one iMac with a big screen with "retina" pixel density. When they fiiiiiinally announced one, I jumped—just as I had leapt onto the Mac Pro in 2006 to replace the aging Blue and White. (I hope I don't find out in five years that I should have waited six months for a crucial upgrade, just as I would have been better off waiting six months for a Mac Pro with a 64-bit BIOS!)

"Installing" a new computer is more about housekeeping than computers. I pulled out everything from this corner of the room, disturbing dust-bunnies that had been accumulating since the Mac Pro was new. An armload of old, incompatible software CDs and manuals went in the trash. It took several hours to do the general housecleaning of the area and make it all fresh and neat.

So far, the iMac looks like a keeper. The "magic" mouse is slick: no cord, and I can scroll by gently caressing its back with my middle finger. I also got the bluetooth track-pad visible in the picture, and I alternate between that and the mouse. Each is comfortable. The display is excellent, but there's a tiny drawback to such a large one. The Mac OS menu bar is always in the upper left of the screen. When an app's main window is in the center or to the right, and I want to click on the File or Edit menu, it's a loooong way off to the side. I feel like a tennis spectator swiveling my head from left to right. (Finally, a reason to put the menu bar on the app window instead of the screen-top.)

The silver box at the lower left is a NewerTech "mini-stack" with a 2TB hard drive and a Blu-Ray burner. Before retiring the Mac Pro I used Carbon Copy Cloner to duplicate its two drives onto this drive, so every file and app I had accumulated before is still accessible (some of those files date to the 1980s...). Actually the Apple Migration Assistant has become really slick. I just had to give it the password to the household Time Capsule and it simply took over the Time Capsule backup of the Mac Pro and used it to get almost everything I wanted.

I spent some hours installing the latest Python 2 using "brew", to supersede the Apple one, and Python 3.4 from the Mac distribution at Python.org. I installed Wing IDE and spent a little time selecting larger fonts so I could read my code while sitting a couple of feet from the screen. I haven't installed Py/Qt yet; I want to wait for Qt5.4 and the matching PyQt. In the meantime I will get back to developing on my laptop. But by the end of the month I expect I'll be doing most of my development work on the iMac, sitting up at the same desktop where I did PPQT version one and several books.