macOS Catalina made accessibility worse for users who type with their voice. This is a multi-faceted issue, and I will walk you through my daily hell.

(if you want to know why this matters to me <–)

I’m still 3 music-based articles behind, so I spent portions of this weekend putting together this article while I revisit Prolog. Not much I can do when the software I’m reviewing is broken or suddenly has a major update looming.

I really should write more how-tos I suppose. Anyway, on to the rant.

Some History

The best voice to text option on macOS was Nuance Dragon for Mac. In 2018 they canned it.

Dragon for Mac barely worked, but it did work. Between macOS updates, framework changes and the spottiness of Dragon’s authorization server, the software became unusable by early 2019.

Approximately 1 year later, macOS Catalina was released with the new Voice Control feature, which promised…the exact same features as Dragon for Mac.

The connection isn’t difficult to make, given that “Voice Control” is powered by the Siri engine which itself is powered by technology from Nuance Communications (the maker of Dragon).

This doesn’t sound like a sad story on the surface, but you’ve yet to be exposed to Apple’s ineptitude.

Dictation

Macs have had the dictation feature since OS X Mountain Lion. It’s a server-side interpretation of speech which appears to simulate typing in some manner on the client.

Dictation works on nearly every text input widget. If it accepts text, then you can utilize the dictation feature.

BUT IT IS TERRIBLE

My speech is relatively accent-neutral. I am able to use Dragon with minimal effort, and friends are amazed when Siri understands every single word I say.

macOS Dictation has no clue. Around 50% of the output looks like some transliterated mess, as if I were speaking an alien language. Over time I’ve learned what it can and can’t do well, but that’s insufficient for daily work.

It WAS better

Dictation of old

Dictation at least used to have options. The option of note was the ability to activate it by voice input. Obviously you don’t want random speech interfering with your computer’s operation, but you also don’t want that pop-up microphone on your screen at all times (you’ll see why this matters later).

After months (since mid 2019) of messing with the dictation feature, I did learn which words I could accurately get across. I would switch between dictation and typing to reduce the load on my hands.

That is gone in Catalina.

It’s replaced by…

Voice Control

One of Catalina’s headline features: Voice Control.

It has the basic features that you could want from Speech-to-Text:

  • Commands - A variety of commands to control your computer and edit text, plus the ability to attach scripted commands to recognized vocalizations.
  • Mouse Control - Select items by number, or use a grid overlay to click portions of the screen.
  • Dictionary - Add new words to the recognition database, so I can say “Cubase” and not get “QBase”.
  • Accurate - Voice Control types what I say with few errors.

Four things that change the world for people like me.

It works fantastic…

When it works.

Raw Input

When will it work? (Video)

Based on my snooping around and testing, it appears that Voice Control only supports NSControl-based inputs.

Talking to Safari works. Talking to TextEdit.app works. Talking to Pages.app works.

What doesn’t work?

How do I know this? Because that’s a list of the software I use nearly every day.

I can’t use Catalina’s Speech-to-Text with any of these applications!

C’mon Apple!

C'mon Apple, you're supposed to be better than this

I use System Preferences->Accessibility->Display->Increase Contrast on my computer.

Look what it does to the ALWAYS ON TOP Voice Control icon (the black thing center right). It’s a mess!

We can peer further into the depths of Apple’s development by looking at the Dictation icon while using the same settings:

Dictation Overlay

Oh.

OH. That looks fine.

Fine, that is, if you ignore the fact that the body of the microphone is supposed to be an input level meter. The level indicator works in neither overlay if you have “Increase Contrast” turned on.

I sometimes wonder if anyone actually uses this stuff.

Bug Fixes

Catalina 10.15.2 allowed me to use Voice Control with Emacs. It was an odd exception.

Luckily 10.15.3 “fixed” that issue. Now I can’t use it with any text editor that I use. Glad they squashed that terrible bug.

Thanks Apple.

Alternatives

There are no functional alternatives. The alternative products that do work rely on Dragon for Mac. The rest require extensive configuration (or developing your own application from a framework) and still end up as a mediocre solution.

I’d love to be wrong. If you use something or find something (and try it) that works, please let me know!

Conclusion

I’m stuck on a platform I generally like, but with limited ability to utilize my computer. I work in cycles of 15 minutes on, 5-10 minutes off. I’ve become adept at utilizing spelling-correction to fix frequent errors.

If I switched to Windows I could use Dragon Professional, which is more advanced than anything that has ever been available on macOS.

There is the option of writing in a VM, but that’s quite a complex workaround when you consider synchronizing software and file systems and handling build environments.

Right now… it just sucks. The work I could get done in 8 hours 3 years ago takes me 12-16 hours now, while still being only 8 billable hours.
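
To put rough numbers on that, here is a back-of-the-envelope sketch in Python. The function name and the notion of an "effective rate" are my own illustration, not anything official; the only inputs are the 8, 12 and 16 hour figures above.

```python
def effective_rate_fraction(billable_hours: float, actual_hours: float) -> float:
    """Fraction of the nominal hourly rate earned per hour actually worked."""
    if actual_hours <= 0:
        raise ValueError("actual_hours must be positive")
    return billable_hours / actual_hours


BILLABLE = 8  # hours I can bill for the work

for actual in (12, 16):
    fraction = effective_rate_fraction(BILLABLE, actual)
    unpaid = actual - BILLABLE
    print(f"{actual}h worked: {fraction:.0%} of my rate, {unpaid}h unpaid")

# prints:
# 12h worked: 67% of my rate, 4h unpaid
# 16h worked: 50% of my rate, 8h unpaid
```

In other words, every billable day now costs me somewhere between 4 and 8 unpaid hours.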

Meta

This post took 9 hours to research, write and edit.