Thursday, July 23, 2015

Audio discoveries and problems

So I thought I would package Sidetone using PyInstaller, the latest version of which works so well with PPQT2. But strange things happen in the bundled version. The call to QAudioDeviceInfo.availableDevices(), which works perfectly running from source, returns an empty list to the bundled app. So both comboboxes are empty. Very appropriately, the empty comboboxes never generate a currentIndexChanged signal, so the app never does anything (also very appropriate).

I added code that, when the available list comes back empty, would get a one-item list of the device returned by QAudioDeviceInfo.defaultInputDevice() or QAudioDeviceInfo.defaultOutputDevice(). Because, the Qt docs assure me, "All platform and audio plugin implementations provide a default audio device to use." Which they do, but the devices being returned to the bundled app are invalid devices. They are QAudioDeviceInfo objects, but they return True from the isNull() method, and when the app tries to start an input or output built on one of them, it generates a stderr message about trying to use a null device.
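The fallback described above could be sketched roughly like this. It is an illustration only, not the actual Sidetone code: usable_devices is a hypothetical helper name, and FakeInfo is a stand-in for QAudioDeviceInfo so the sketch runs without audio hardware. The one real API it relies on is the isNull() method of QAudioDeviceInfo, which it checks so that a null default device never reaches the combobox.

```python
def usable_devices(available, default):
    """Return the device-info objects worth offering in a combobox.

    available -- e.g. QAudioDeviceInfo.availableDevices(QAudio.AudioInput)
    default   -- e.g. QAudioDeviceInfo.defaultInputDevice()
    """
    if available:
        return list(available)
    # Empty list: fall back to the default device, but only if it is real.
    if default is not None and not default.isNull():
        return [default]
    return []


class FakeInfo:
    """Hypothetical stand-in exposing the two QAudioDeviceInfo methods used."""
    def __init__(self, name, null=False):
        self._name = name
        self._null = null

    def deviceName(self):
        return self._name

    def isNull(self):
        return self._null


# A valid default device is offered; a null one is rejected.
print([d.deviceName() for d in usable_devices([], FakeInfo('Built-in Microphone'))])
print(usable_devices([], FakeInfo('', null=True)))
```

With a check like this the bundled app would at least show empty comboboxes rather than try to start a null device, although it would not explain why the bundle sees no devices in the first place.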

The code continues to run from source, but with this glitch. On my laptop, which is running the same levels of Mac OS, Python, Qt, and PyQt, when I unplugged the USB headset, the app automatically switched to the built-in mic and speaker, and began an entertaining feedback warble. So I thought, OK, there must be some signal, some indication that a USB audio device has gone away. What is it?

But back on the desktop system, where I am doing the coding, things are different. There, when I pull the USB plug out, the app purrs on as if nothing had happened. I added code to intercept the stateChanged signal from the active devices and print the state. It goes from 3 (idle) to 0 (active) and stays there happily after the plug is pulled. And the system doesn't switch to the built-in devices. It is possible to select the built-in devices in Sidetone, and produce quite remarkable feedback effects, but it doesn't happen automatically on the desktop system.
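For reading those numbers, here is a small sketch of a state-printing slot. The names STATE_NAMES, describe_state and on_state_changed are made up for the example; the numeric values are the QAudio.State values in Qt 5 as I understand them, which agree with the 3 (idle) and 0 (active) seen above.

```python
# QAudio.State values in Qt 5 (assumed from the Qt documentation).
STATE_NAMES = {
    0: 'ActiveState',
    1: 'SuspendedState',
    2: 'StoppedState',
    3: 'IdleState',
    4: 'InterruptedState',  # only present in later Qt 5 releases
}


def describe_state(state):
    """Return a readable label such as '3 (IdleState)' for a state value."""
    number = int(state)
    return '{} ({})'.format(number, STATE_NAMES.get(number, 'unknown'))


def on_state_changed(state):
    """Slot for the stateChanged signal of a QAudioInput or QAudioOutput."""
    print('audio state is now', describe_state(state))


# In the app this would be wired up with something like
#     self.input_device.stateChanged.connect(on_state_changed)
# Here the slot is simply called with the two values observed above.
on_state_changed(3)
on_state_changed(0)
```

On the desktop system a slot like this would print the idle-to-active transition once and then nothing more, which is exactly the puzzle: no stopped or idle state arrives when the plug is pulled.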

I thought, OK, I'm plugging the headset into a USB hub. What if I put it directly into the back of the iMac? Something did change: the sound developed that "picket-fence" rattle indicating buffer under-run. I had to put the buffer size back up to 512 to eliminate it. Just speculating: does the built-in USB port deliver data faster than when the headset is on a hub that is in turn connected to the built-in port? Don't care.

What did not change when I plugged into the built-in hub was the behavior when I pulled the plug out. No state change.

So I'm kind of baffled on two points. One, how to know when the user yanks the plug on the device being used; and two, what is different about a bundled app compared with one running from source.

Neither is really important to my intended use (personal and casual). I'm cool running from source and I don't expect to be yanking the plug out in normal use. But I'd like to know. If you have any idea, please jump in with a comment.

Tuesday, July 21, 2015

Sidetone, first draft working

It turns out that yes, you can take input from a mic and put it into headphones, using Qt. The minimal process is this.

Acquire a list of available audio devices for input or output. For example,

self.input_info_list = QAudioDeviceInfo.availableDevices( QAudio.AudioInput )

The list items are QAudioDeviceInfo objects.

Populate a combobox (popup menu thing) with the names of the available devices, for example,

self.cb_inputs.addItems(
            [ audio_info.deviceName() for audio_info in self.input_info_list ]
            )

Present the two comboboxes, one for input devices and one for output, and await a selection on either. Now it gets complicated, because on the currentIndexChanged signal from either combobox, you maybe have not created any devices, or you've created one but not the other, blah blah. Anyway, say you are creating an input device. You get new_index, an index into that list of device info objects.

        audio_info = self.input_info_list[ new_index ]
        # Create a new QAudioInput based on that.
        preferred_format = audio_info.preferredFormat()
        self.input_device = QAudioInput( audio_info, preferred_format )
        self.input_device.setVolume( 1.0 )
        self.input_device.setBufferSize( 384 )

Now you have an input device. That last step, setting the buffer size, is important, as will be discussed in a minute.

Now the user selects an output device from that list.

        audio_info = self.otput_info_list[ new_index ]
        preferred_format = audio_info.preferredFormat()
        self.otput_device = QAudioOutput( audio_info, preferred_format )
        self.otput_device.setVolume( self.volume.value() / 100 )
        if self.input_device :
            self.input_device.start( self.otput_device.start() )

The very last line is what connects the input device to the output. The value of self.otput_device.start() is the QIODevice that the output device uses. Calling the input device's start() method and passing a QIODevice tells it, this is your target, the sink for your input data. The input device starts putting data into the QIODevice, and the output device takes it out and reproduces it.

In principle the exact reverse should also work, i.e. self.otput_device.start( self.input_device.start() ), but for some reason that leads to strange audio artifacts.

Anyway, the first time I ran this, without setting the buffer size, the sidetone was there but (as with the XCode demo program I wrote about) there was an echo, as if I were talking into a rather large barrel. It turns out the default buffer size is 4096 bytes. I changed the buffer size to 2048 and the echo became less. Then to 1024. Then to 512, with an improvement each time. At a buffer of 256, the audio stream developed a flutter or rapid "picket-fence" noise. Setting it back to 384 removed the noise. The sidetone still has a detectable echo, a ring, but it is tolerable.
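A rough way to see why the buffer size matters is to convert bytes into milliseconds of audio. The sketch below assumes the format the Plantronics headset reports (2 channels, 44100 samples per second, 16-bit samples); buffer_ms is a hypothetical helper, not part of Sidetone, and it counts only one buffer's worth of delay, not whatever further buffering Qt, the OS and the USB hardware add.

```python
def buffer_ms(buffer_bytes, sample_rate=44100, channels=2, sample_bits=16):
    """Milliseconds of audio held by a buffer of the given size in bytes."""
    bytes_per_second = sample_rate * channels * (sample_bits // 8)
    return 1000.0 * buffer_bytes / bytes_per_second


# The buffer sizes tried above, largest first.
for size in (4096, 2048, 1024, 512, 384, 256):
    print('{:5d} bytes holds about {:5.2f} ms of audio'.format(size, buffer_ms(size)))
```

By this arithmetic the default 4096-byte buffer holds roughly 23 ms of sound and the 384-byte buffer roughly 2 ms, so each halving removes a noticeable slice of delay until the buffer gets too small to keep the stream fed.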

Next I have to try it out on my laptop, which is where it will be used, simultaneous with the 3CX VOIP app that I am required to use. If Sidetone can co-exist with 3CX I'll be a happy camper.

If you'd like to play with Sidetone, it is right here on github.

Sidetone project 1

With PPQT pretty much out of my hair (until my OCD drives me back to do another translator or such) I turned to another project, a small one I call "Sidetone".

Sidetone is the sound of one's own voice in a telephone handset. I say handset because I learned the term when working for PacBell in the 60s. Picture a telephone handset:

You listen at the one end; you speak into the other; and a small amount of the sound of your own voice is repeated in the listening end. It makes the phone sound "live". Before a connection is made, or after the connection drops, there's no sidetone and your voice sounds flat, muffled, dead.

I'm starting to spend shifts taking calls on the Recovering From Religion hotline, for which I wear a Plantronics headset. And it offers no sidetone. It's annoying. Of course I can hear myself, my voice is transmitted through the air and through my skull bones. And I know the mic is working because I can go to the Sound Preference pane and see the VU indicator bouncing as I speak. But I sound muffled to myself; no surprise, since I have padded things over my ears. I want sidetone! Checking around the web, I find some indications that in Windows, it is possible to get the audio driver to provide sidetone. It's an obvious feature for the system's audio to offer. Given you have a single USB or Bluetooth device that has both an input side and an output side, how hard would it be to direct an attenuated copy of the input signal back to the output? But this is not a feature offered by the Mac OS Sound Preference panel.

So I started picturing a simple little Mac app that would (somehow) take audio in and dribble a little sidetone back out. I thought maybe I could write a real "grown-up" app using XCode. I knew that Mac OS had something called Core Audio; and I know XCode has lots of example programs. So I tried to find some example that I could maybe manipulate to do what I wanted.

And I did, it's CAPlayThrough. But Oh. My. Word. what a monster. CAPlayThrough.cpp alone is over 800 lines (excluding the lengthy don't-sue-us prolog) and there are four other .cpp files and a bunch of .h files as well.

And the capper? It doesn't work very well! I had XCode build it and run it, and told it to take input from the mic and write output to the earphones. Speak into the mic and I sound as if I were in a good-sized barrel or a small cave; there is a latency of at least 0.1 second. That's not good for sidetone; sidetone has to be near-zero latency.

Then I poked around the Python docs and pypi for a while. Audio support is not one of the "batteries included" in Python. There are several libraries to interface to PortAudio, an open-source package. But it wasn't immediately clear how well PortAudio was integrated into Mac Core Audio. And the PyAudio interface package, like some other audio modules I looked at, depends on numpy. Which tells me they are storing and retrieving audio samples as numpy arrays. Probably I'm being unfair but that smells of latencies to me.

Then it occurred to me to look at good old (Py)Qt. What does Qt have to offer in the audio arena?

Quite a lot, it turns out. I'm not sure at this point if it is going to be possible to do what I want, but the facilities are certainly simple. The following program, displayed in full, lists all the available audio devices and their characteristics.

from PyQt5.QtMultimedia import QAudioDeviceInfo,QAudio
from PyQt5.QtWidgets import QApplication

app = QApplication([])

def show_info( mode, dev_list ):

    print('Found {} {} devices'.format( len(dev_list), mode) )

    for audio_info in dev_list :
        print('\n\nname:', audio_info.deviceName())
        preferred = audio_info.preferredFormat()
        print('\tpreferred channel count:', preferred.channelCount())
        print('\tpreferred sample rate:', preferred.sampleRate() )
        print('\tpreferred sample size:', preferred.sampleSize() )
        #print('\tsupported sample rates')
        #for rate in audio_info.supportedSampleRates() :
            #print( '\t\t{}'.format(rate) )
        #print('\tsupported sample sizes')
        #for size in audio_info.supportedSampleSizes() :
            #print( '\t\t{}'.format(size) )

dev_list = QAudioDeviceInfo.availableDevices( QAudio.AudioInput )
show_info( 'input', dev_list )

dev_list = QAudioDeviceInfo.availableDevices( QAudio.AudioOutput )
show_info( 'output', dev_list )

That took about 15 minutes to put together, most spent in the Assistant finding the names of the classes. Here's some sample output.

Found 2 input devices

name: Built-in Microphone
 preferred channel count: 2
 preferred sample rate: 44100
 preferred sample size: 24

name: Plantronics .Audio 648 USB
 preferred channel count: 2
 preferred sample rate: 44100
 preferred sample size: 16

Found 2 output devices

name: Built-in Output
 preferred channel count: 2
 preferred sample rate: 44100
 preferred sample size: 24

name: Plantronics .Audio 648 USB
 preferred channel count: 2
 preferred sample rate: 44100
 preferred sample size: 16

So this looks very encouraging. I'm going to try to cobble together some actual audio input and output next.

Friday, July 17, 2015

Patting myself on the back with both hands

This week I created the ASCII translator. It came to just over 800 lines of code (with lots of block comments and blank lines, so probably under 500 lines of actual Python). I wrote most of it on Monday and finished the first draft on Tuesday. Wednesday was museum day, but yesterday I sat down to test it. There were perhaps ten minor errors of the sort where I spelled a variable name two different ways, or forgot to initialize some global variable at the right time. There were maybe three places where I had to say, "hmmm, that can't work that way, need to recode". But by supper time all the features had been successfully tested except for table formatting. Mind you, this includes the Knuth-Plass justification code, ported over from the V1 module but lightly recoded.

This morning I began testing table formatting. That's the process of reading something like this,

/T r6 l50 r8
I | CHAPTER ONE: THE MIRROR CRACKS | 3 |
II | CHAPTER TWO: THE GLAZIER IS CALLED AND COMES TWO HOURS LATE | 24 |
III| CHAPTER THREE: DIM REFLECTIONS ARE CAST | 33 |
T/

and producing output like this,

     I | CHAPTER ONE: THE MIRROR CRACKS                     |       3 |
    II | CHAPTER TWO: THE GLAZIER IS CALLED AND COMES TWO   |      24 |
       | HOURS LATE                                         |         |
   III | CHAPTER THREE: DIM REFLECTIONS ARE CAST            |      33 |

Note that the long table cell was "reflowed" to fit the specified width of the column. Knuth-Plass reflowed, of course.
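To make the idea concrete, here is a minimal sketch of that kind of table formatting. It is not the translator's code: parse_spec and format_row are hypothetical names, it wraps cells with the greedy algorithm in the standard textwrap module rather than the optimal-fit reflow the real translator uses, it only understands the r and l alignment letters shown above, and its spacing is not guaranteed to match the real output character for character.

```python
import re
import textwrap


def parse_spec(spec):
    """Turn a column spec like 'r6 l50 r8' into [('r', 6), ('l', 50), ('r', 8)]."""
    return [(m.group(1), int(m.group(2)))
            for m in re.finditer(r'([lr])(\d+)', spec)]


def format_row(row, columns):
    """Return the output lines for one 'a | b | c |' source row."""
    cells = [cell.strip() for cell in row.strip().rstrip('|').split('|')]
    # Wrap every cell to its column width; an empty cell still needs one line.
    wrapped = [textwrap.wrap(text, width) or ['']
               for text, (_, width) in zip(cells, columns)]
    height = max(len(lines) for lines in wrapped)
    output = []
    for i in range(height):
        parts = []
        for lines, (align, width) in zip(wrapped, columns):
            piece = lines[i] if i < len(lines) else ''   # top-aligned cells
            parts.append(piece.rjust(width) if align == 'r' else piece.ljust(width))
        output.append(' | '.join(parts) + ' |')
    return output


columns = parse_spec('r6 l50 r8')
rows = [
    'I | CHAPTER ONE: THE MIRROR CRACKS | 3 |',
    'II | CHAPTER TWO: THE GLAZIER IS CALLED AND COMES TWO HOURS LATE | 24 |',
    'III| CHAPTER THREE: DIM REFLECTIONS ARE CAST | 33 |',
]
for row in rows:
    for line in format_row(row, columns):
        print(line)
```

Every cell here is top-aligned, so the page number of a wrapped row lands on the row's first line, as in the output shown above.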

Well, it all went together very quickly. It actually mostly worked out of the gate. I spent more time getting it to issue appropriate error messages than I did making it format correctly.

So that's done. (OK, I might go back in and implement "Bottom" alignment for table cells, which could be used in the example above.) But a major piece of code by anyone's standards, I think, written and tested in four days. I told the wife, "You know, I rock." She said, "Sure you do."

What with that and having knocked off all the easily-fixable issues from the github list, it is time to make a new release. We're looking at a busy weekend, so it will probably be Monday when I do that.

There remain two pieces of PPQT2 that I might work on. One is a Ppgen Translator, although I think I would first try to persuade RFrank or one of his minions to do it. The other is to go back and actually implement the drag-out windows using the research with which I started this blog not quite 2 years ago. I don't feel a lot of urgency about either. I really want to move on to some other projects.

Edit: After I wrote the above, I thought about how to actually implement bottom cell alignment. And there really wasn't much to it. I opened up the file and had it working in about 15 minutes.

/T r6 l40 rB8
I | CHAPTER ONE: THE MIRROR CRACKS | 3 |
II | CHAPTER TWO: THE GLAZIER IS CALLED AND COMES TWO HOURS LATE | 24 |
III| CHAPTER THREE: DIM REFLECTIONS ARE CAST | 33 |
T/

Note the "B" in the spec for the third column.

     I|CHAPTER ONE: THE MIRROR CRACKS          |       3|
    II|CHAPTER TWO: THE GLAZIER IS CALLED AND  |        |
      |COMES TWO HOURS LATE                    |      24|
   III|CHAPTER THREE: DIM REFLECTIONS ARE CAST |      33|

I really do rock, you know.

Thursday, July 9, 2015

So back to work...

I released the mostly-baked PPQT2 to a reception that was friendly although very muted. In particular, nobody indicated any interest whatever in writing a Translator.

Then I spent a couple of days completing the basic work on a large and complex post-processing book. That is, I did all the steps that in my own "Suggested Workflow" document should precede translating the book to some other markup like HTML.

In the course of that, I found some issues with the sequence of events in the Suggested Workflow, so revised that. I also found a few minor usability issues with the app and added them to the Issues list on Github.

Then it was time to try translating a real, and large, document to HTML, complete with many Illustrations, a few Footnotes, and many Block Quotes and Unsigned Lists and a few Tables. So, all the stuff that a Translator should recognize.

The first step of Translating is parsing the document, and this threw up many errors. Some were legitimate; others should not happen, but do happen because the automated document syntax parser needs to be tweaked. Several more Issues went onto the stack. After I either fixed or circumvented those, the HTML translator actually got called, and it revealed two problems.

The first was a puzzling crash while processing a footnote. It turned out that I had mis-coded a Footnote in the document. This error was not being caught by the document structure parse, with the result that bad data was being passed to the translator. I had to tighten up a regex in the parser so it would not recognize an ill-formed footnote. It would just become a line of text.

The next problem was that the alt= and title= properties of most of the images were broken. The cause turned out to be obvious. Whatever text follows the [Illustration: markup, presumably the first line of the caption, is passed to the Translator along with the Open Figure event code. The point was to let the HTML translator use that first caption text as the alt=/title= string.

Unfortunately for most of the figures in this book, the opening of the caption looks like

[Illustration: <id="Fig_563"><sc>Fig</sc>. 563. Some bumble rumble thing

The <id="Fig_563"> is my optional markup for a link target; it is taken from the Ppgen markup. However, its presence results in building an img statement with:

<img src="images/f563.png" alt="<id="Fig_563"><sc>Fig</sc>. 563. some..."

I fixed this by adding a new utility function to the xlate_utils module: flatten_line(text) returns a text string with everything stripped out except words and spaces. Then I had the HTML translator pass its Open Figure preview text through that, so that it would write HTML like this:

<img src="images/f563.png" alt="Fig 563 some bumble rumble thing"

Yes, flatten_line() even strips periods and other punctuation out. That's intentional, because after all, quote characters are punctuation too.
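For the curious, a function with that behavior can be sketched in a few lines. This is a guess at one way to do it with two regular expressions, not the code that is actually in xlate_utils; only the name flatten_line and the described behavior come from the text above.

```python
import re


def flatten_line(text):
    """Reduce a line of marked-up text to plain words separated by single spaces."""
    text = re.sub(r'<[^>]*>', ' ', text)   # drop markup such as <id="..."> and <sc>
    text = re.sub(r'[^\w\s]', ' ', text)   # drop punctuation, quote characters included
    return ' '.join(text.split())          # collapse the leftover runs of spaces


caption = '<id="Fig_563"><sc>Fig</sc>. 563. Some bumble rumble thing'
print(flatten_line(caption))
```

Because no quote character can survive the second substitution, the result is always safe to drop between the double quotes of an alt= or title= property.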

With these changes, the HTML translator is working quite nicely. It certainly produces an HTML book that is ready to be edited in html, have its CSS tweaked and so forth.

What next? Several things. First of all, there are 16 open Issues on the github site. Over the next few days I plan to fix at least ten of them. Then, I am going to write the ASCII translator I promised myself. When I have that, I will be able to complete post-processing of the book I'm working on. When that Translator works, I will put up an updated release of PPQT2 and make another plea for Translators, specifically for Ppgen and Fpgen ones. They are needed, and I am really not the right person to write one, as I lack the kind of deep knowledge of those markups that would make it easy. If necessary, I might write a very rudimentary one of each so I can tell the maintainers of those markups, there, now finish it please.

After that—which will happen by 30 July—I will dust off my hands and walk away from PPQT, returning only to fix serious bugs.

Thursday, July 2, 2015

Announced to a ringing silence

So I posted about PPQT2 in both the DP forum and the PGDP-Canada post-proofing forum. Almost no reaction, although I know a few people downloaded it because one of them had a problem (probably an incomplete download) and two others jumped in to say it worked for them.

For the time being I'm working at the book that I am actively post-proofing. In another week or so I will get to where I'd like to translate it to ASCII, at which time, if nobody else has written one, I'll write an ASCII Translator.

I have strong hopes, however, that the chaps who maintain Ppgen for DP and Fpgen for DP-Canada will step up and do Translators for their respective markups.