I committed all the translator material to GitHub. Except (OMG!) it appears I forgot to add the two new modules, translators.py and xlate_utils.py. OK, done. So that is all wrapped up and documented very well, if I do say so (and as a retired tech writer, I think I know when an API is properly documented). I'm very pleased with how I used a formal syntax to define and verify the document structure. The code of translators.py is clean and well-organized. The mechanism for defining and displaying an "Options Dialog" is simple and (I hope) understandable.
There are parts of the xlate_utils.py module that I'm not quite so proud of. The tokenize() function is going to be fairly heavily used, and I confess there are parts of it that are rather ad hoc, not to say downright kludgy. And not superbly well tested, yet. However, in the coming days I'll be writing an HTML translator that will test it.
That done, on Monday and Tuesday I turned to my list of issues and resolved most of them. One that came out better than I originally expected was this: PPQT2 expects input files to be encoded UTF-8. The user can get a Latin-1 file correctly opened by renaming it to a suffix of .ltn, but if that is not done, the file will be input through the UTF-8 codec. Some special characters will not decode right and will be replaced with Unicode \ufffd, the "replacement character". And if this isn't noticed right away, and the file is saved, there's permanent loss of data.
So I very much wanted to catch this error early and warn the user. But how? I researched the methods of QTextStream, QFile, and QTextCodec. I know that while QTextStream is executing a readAll() call, it must use a QTextCodec.toUnicode() function. That function is capable of returning a count of invalid characters, but QTextStream doesn't seem to offer any way to retrieve it.
It looked as if the only ways to find out whether the file decoded properly were either, one, to read it with Python, where a strict UTF-8 read raises an exception on the first bad byte; or two, to use QTextStream.readAll() to read it into a string and search the string for replacement characters. Either method would require me to change the API between the main window and the Book, or else to read the possibly-large document file twice.
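For the record, here is roughly what those two rejected approaches look like in plain Python. This is only a sketch, not code from PPQT2: the function names are made up for illustration, and the second function uses Python's own "replace" error handler as a stand-in for what the Qt codec does.

```python
REPLACEMENT = "\ufffd"  # Unicode REPLACEMENT CHARACTER

def decodes_strictly(raw: bytes) -> bool:
    """Approach one: a strict decode raises on the first invalid byte."""
    try:
        raw.decode("utf-8")  # errors="strict" is the default
    except UnicodeDecodeError:
        return False
    return True

def first_replacement(raw: bytes) -> int:
    """Approach two: decode leniently, then search the resulting string.

    Returns the index of the first replacement character, or -1 if none.
    """
    text = raw.decode("utf-8", errors="replace")
    return text.find(REPLACEMENT)

# A Latin-1 file wrongly treated as UTF-8 trips both checks.
latin1_bytes = "café".encode("latin-1")
print(decodes_strictly(latin1_bytes), first_replacement(latin1_bytes))
```

Both work, but both need the raw bytes or the full decoded string in hand, which is exactly what forces the API change or the second read.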
Then it dawned on me that QPlainTextEdit has a perfectly good find() method. All I had to do was have the Book, just after it has loaded the editor from the file, call the editor's find() to look for a replacement character. One hopes the search fails. But if it does not, I can notify the user with a warning message, including the character position of the first replacement character. I made the warning message detailed and also included a pointer to the Help topic.
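The idea can be sketched like this. It is not the actual PPQT2 code: the function name and the message wording are invented, and instead of the editor's own find() (which searches from the current cursor position) this variant calls find() on the editor's QTextDocument, which searches from the start and hands back a cursor. The editor argument is assumed to be a QPlainTextEdit that has already been loaded.

```python
REPLACEMENT = "\ufffd"  # Unicode REPLACEMENT CHARACTER

def replacement_warning(editor):
    """Return warning text if the loaded document contains U+FFFD, else None.

    `editor` is assumed to be a loaded QPlainTextEdit; only document(),
    QTextDocument.find(), QTextCursor.isNull() and selectionStart() are used.
    """
    cursor = editor.document().find(REPLACEMENT)
    if cursor.isNull():
        # The search failed, which is the good outcome.
        return None
    # Hypothetical wording; the real message is more detailed.
    return (
        "This file contains Unicode replacement characters, the first at "
        "position {}. It may not be UTF-8; see the Help topic on file "
        "encodings before saving.".format(cursor.selectionStart())
    )
```

The caller would show the returned text in a warning dialog when it is not None. Nothing here touches the file again, so the main window and the Book keep their existing API.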
Another long-standing issue, more of a major loose end to clean up, was logging. There are lots and lots of log messages being issued all over the program. But the logging output was going nowhere. I had initially thought that I would add argument parsing, and use it to support --log-path= and --log-level= parameters. But that's dumb; I'm packaging the Windows and Mac OS versions as clickable apps, with no command-line input. And the Linux version doesn't have to be launched from a command line. So I did some study and reading and chose writable locations for log files based on the platform: /var/tmp for Linux, ~/Library/Logs in Mac OS, and \Windows\Temp in Windows. Oops, I just realized I committed the code for that, but I should also update the Help file to document it. Or maybe put it in the README for each version?
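In outline, the platform choice looks something like the following. Treat it as a sketch under assumptions rather than the committed code: the function names, the log file name ppqt2.log, and the message format are all placeholders, and the three directories are simply the ones named above.

```python
import logging
import os
import sys

def default_log_dir(platform=None):
    """Pick a writable log directory for the given sys.platform value."""
    platform = sys.platform if platform is None else platform
    if platform.startswith("win"):
        return r"\Windows\Temp"
    if platform == "darwin":
        return os.path.expanduser("~/Library/Logs")
    return "/var/tmp"  # Linux and anything else Unix-like

def configure_logging(log_dir, level=logging.INFO, filename="ppqt2.log"):
    """Attach a file handler to the root logger and return the handler.

    The file name and format string are hypothetical placeholders.
    """
    handler = logging.FileHandler(
        os.path.join(log_dir, filename), encoding="utf-8"
    )
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    root = logging.getLogger()
    root.addHandler(handler)
    root.setLevel(level)
    return handler
```

At startup the app would call configure_logging(default_log_dir()) once, and every existing logging call in the program then has somewhere to go, with no command-line arguments involved.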
Anyway: tomorrow is Museum day. Thursday I need to spend most of my free time studying the docs for a new volunteer gig. I have an online training session for that at 4pm that day and I want to be prepared. But Friday I will start coding an HTML Translator. Should have that done early next week. By the end of next week I should have PPQT2 packaged up ready to announce. Can't wait.