I am creating a native OS X application, and I was surprised at how difficult it is to find documentation on text-to-speech with native APIs. What would be the easiest way of having my application speak (using Alex's voice for example)?
Thanks!
What you call “text-to-speech” is also commonly abbreviated as TTS and alternatively called “speech synthesis”.
The Cocoa class NSSpeechSynthesizer is the API to use. The canonical sample code is CocoaSpeechSynthesisExample.
There is also a guide to “Speech Programming Topics” and a “Speech Synthesis Programming Guide” available.
Finally, there are lower level APIs available if you need access to stuff that is abstracted away for you by NSSpeechSynthesizer.
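To make the NSSpeechSynthesizer route concrete, here is a minimal Swift sketch (the same calls exist in Objective-C). It assumes the Alex voice is installed and that its identifier is "com.apple.speech.synthesis.voice.Alex"; the spoken sentence is just a placeholder. As far as I know, recent SDKs mark NSSpeechSynthesizer as deprecated in favor of AVSpeechSynthesizer, so treat this as a sketch for the API discussed here rather than the only way to do it.

```swift
import AppKit

// Assumption: Alex is installed and uses this identifier.
let alex = NSSpeechSynthesizer.VoiceName(rawValue: "com.apple.speech.synthesis.voice.Alex")

// init(voice:) is failable; passing nil falls back to the default system voice.
guard let synthesizer = NSSpeechSynthesizer(voice: alex) ?? NSSpeechSynthesizer(voice: nil) else {
    fatalError("No speech synthesizer available")
}

// startSpeaking(_:) returns immediately; speech continues asynchronously.
let started = synthesizer.startSpeaking("Hello from a native OS X application.")
print(started ? "Speaking..." : "Could not start speaking")

// Only needed in a command-line script: keep the process alive until speech ends.
// Inside a normal Cocoa app the main run loop already does this.
while synthesizer.isSpeaking {
    RunLoop.current.run(until: Date().addingTimeInterval(0.1))
}
```

In a real application you would keep the synthesizer in a property so it is not deallocated mid-sentence, and set a delegate if you need to know when speech finishes.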
Please look at this NSSpeechRecognizer example.
It's a built-in speech library for OS X. Note that NSSpeechRecognizer handles speech recognition; the text-to-speech counterpart is NSSpeechSynthesizer.
Related
I'm primarily a Windows developer but I need to port an app to the Mac platform.
The app needs access to any scanners plugged into the computer. On Windows, I've made use of the WIA library. I was wondering if the Mac platform has an equivalent and also if it has bindings for the new Swift language?
Any help would be appreciated. Thanks.
You can access scanners and other devices using ImageKit + ImageCaptureCore. Some of Apple's documentation is very sparse, so you shouldn't be afraid to learn things from the header files (e.g. ICScannerDevice.h). It might help to look at some of the code I wrote for this application, particularly this file.
There is no such thing as "bindings" for Swift, since Swift currently uses the Objective-C runtime and interacts directly with Objective-C. This guide will help.
In OS X Mavericks, speech dictation is now included, and is very useful. I am trying to use the dictation capability to create my own digital life assistant, but I can't find how to use the recognition functionality to get the speech in an application rather than a text box.
I have looked into NSSpeechRecognizer, but that seems to be geared toward programming speakable commands with a pre-defined grammar rather than dictation. It doesn't matter what programming language I use, but Python or Java would be nice...
Thanks for your help!
You can use SFSpeechRecognizer (mirror) (requires macOS 10.15+): this is made for speech recognition.
Perform speech recognition on live or prerecorded audio, receive transcriptions, alternative interpretations, and confidence levels of the results.
By contrast, as you noted in the question, NSSpeechRecognizer (mirror) provides a “command and control” style of voice recognition: the command phrases must be defined prior to listening, unlike a dictation system, where the recognized text is unconstrained.
From https://developer.apple.com/videos/play/wwdc2019/256/ (mirror):
Another way is to use Mac Dictation directly, but as far as I know the only way to do that is to redirect audio feeds, which isn't very neat, e.g. see http://www.showcasemarketing.com/ideablog/transcribe-mp3-audio-to-text-mac-os/ (mirror).
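For the SFSpeechRecognizer route, a minimal Swift sketch of transcribing a prerecorded file might look like the following. The class name FileTranscriber, the error type, and the default "en-US" locale are my own placeholders, not part of Apple's API. It also assumes the app declares NSSpeechRecognitionUsageDescription in its Info.plist, which the authorization prompt needs.

```swift
import Foundation
import Speech

// Hypothetical error type for this sketch.
enum TranscriberError: Error { case unavailable }

// Hypothetical helper: transcribes one audio file with unconstrained (dictation-style) recognition.
final class FileTranscriber {
    private let recognizer: SFSpeechRecognizer?
    private var task: SFSpeechRecognitionTask?

    init(localeIdentifier: String = "en-US") {
        // The initializer is failable: it returns nil for unsupported locales.
        recognizer = SFSpeechRecognizer(locale: Locale(identifier: localeIdentifier))
    }

    func transcribe(_ url: URL, completion: @escaping (Result<String, Error>) -> Void) {
        // The user must grant permission before any recognition can run.
        SFSpeechRecognizer.requestAuthorization { [weak self] status in
            guard let self = self, status == .authorized,
                  let recognizer = self.recognizer, recognizer.isAvailable else {
                completion(.failure(TranscriberError.unavailable))
                return
            }
            let request = SFSpeechURLRecognitionRequest(url: url)
            // Keep a reference to the task so it can be cancelled later if needed.
            self.task = recognizer.recognitionTask(with: request) { result, error in
                if let error = error {
                    completion(.failure(error))
                } else if let result = result, result.isFinal {
                    completion(.success(result.bestTranscription.formattedString))
                }
            }
        }
    }
}
```

For live microphone input, the same framework offers SFSpeechAudioBufferRecognitionRequest, which is typically fed from an AVAudioEngine tap; the callback shape stays the same.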
I'm writing code specifically for Mountain Lion, so I want to avoid using deprecated APIs.
I use FSFileOperationCreate to receive information about copy progress (kFSOperationBytesCompleteKey, kFSOperationThroughputKey, kFSOperationTotalBytesKey), but the documentation says:
Creates an object that represents an asynchronous file operation.
(Deprecated in OS X v10.8. At the Foundation layer, use
copyItemAtURL:toURL:error: instead. At the POSIX/BSD layer, use
copyfile(3) OS X Developer Tools Manual Page instead.)
With copyItemAtURL:toURL:error: and NSFileManagerDelegate, it seems impossible to obtain the same information.
How can I get the same behaviour in 10.8 without rewriting the copy code myself?
Does Apple know that it is now simply awful to do the same thing?
Might not be the answer you wish to hear[1], but wrap copyfile(3) in your own Obj-C wrapper. You should be able to calculate all you need using the callbacks, in particular the progress one. HTH.
[1] Quite a few APIs in this general area have been deprecated, and while some new APIs have been introduced they seem incomplete. Reasonable guess might be more is coming in 10.9...
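A rough sketch of the copyfile(3) progress callback, written here in Swift rather than as the Obj-C wrapper suggested above (the C calls are the same either way). The function name copyWithProgress is my own, and the casts reflect how I understand Swift imports copyfile.h, so check them against your SDK.

```swift
import Foundation

// Must be a C-convention closure, so it cannot capture local variables.
let progressCallback: copyfile_callback_t = { what, stage, state, _, _, _ in
    // Called repeatedly while file data is being copied.
    if what == COPYFILE_COPY_DATA && stage == COPYFILE_PROGRESS {
        var bytesCopied: off_t = 0
        copyfile_state_get(state, UInt32(COPYFILE_STATE_COPIED), &bytesCopied)
        print("copied \(bytesCopied) bytes so far")
    }
    return COPYFILE_CONTINUE   // returning COPYFILE_QUIT would abort the copy
}

// Hypothetical wrapper: copies one file and reports progress through the callback above.
func copyWithProgress(from source: String, to destination: String) -> Bool {
    let state = copyfile_state_alloc()
    defer { copyfile_state_free(state) }

    // Register the status callback on the copy state.
    copyfile_state_set(state, UInt32(COPYFILE_STATE_STATUS_CB),
                       unsafeBitCast(progressCallback, to: UnsafeRawPointer.self))

    // COPYFILE_ALL copies data plus metadata; copyfile returns 0 on success.
    return copyfile(source, destination, state, copyfile_flags_t(COPYFILE_ALL)) == 0
}
```

The total size (the old kFSOperationTotalBytesKey) can be read from the source file's attributes before starting, and throughput is just bytes copied divided by elapsed time. To get your own state into the callback, COPYFILE_STATE_STATUS_CTX lets you attach a context pointer that arrives as the callback's last argument.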
It would be useful to see how things work, but I'm not sure about the legality of it.
Most Mac apps are written using Cocoa in Objective-C; which, while it is a compiled language, means that there is a fair bit of information left over that could be used by a decompiler.
I'm not sure if there are a lot of decompilers out there that leverage this information; at least I haven't heard of any.
However, there is also another option: F-Script.
F-Script can be used to attach to an executable and explore its interfaces. While not as good as source, it can give you a pretty clear idea of how the executable is built and how it operates.
As for the legality issue:
IANAL, but as far as I know, reverse-engineering for the purposes of compatibility is legal in many jurisdictions, and I can't imagine that decompiling an executable to look at its code is illegal, unless the specific EULA specifically prohibits it.
Edit: WRT Steam specifically, it is probably NOT written in Cocoa, but C# with some manner of .NET compatibility layer; and it's probably not a good place to start if you want to learn how to make applications for Mac OS X.
By far, the best Mac OS X disassembler I've used is Hopper available here:
http://www.hopperapp.com/
It will also convert the assembly to C pseudo code as best it can. It will generate code flow diagrams with blue lines (true blue, love it) for true and red for false paths.
It's the Mac OS reverse engineering tool. There are even YouTube videos that will show you how to use it.
If it's an open-source app, yes. Otherwise it's possible through decompilation but the output will be a real pain in the ass to look at. If you just want the protocols and the interfaces of categories and classes, have a look at class-dump.
I'm not aware of a nib decompiler.
Whether decompilation is legal: ask a lawyer. This may (and probably does) differ per jurisdiction.
Is it possible to view the source of a mac app?
Realistically, no. Sure, you might be able to use a decompiler to get a peek, but the kind of output you'll get won't be easy to read. If you're asking this question, this route probably isn't going to be helpful to you.
Specifically interested in GUI and how the steam app for mac works
It's a good bet that it works about the same way that most other applications work. It might use custom controls to look different from a typical application that mostly uses the standard Cocoa controls. But underneath, just about any GUI application written for Mac OS X will use the run loops, responder chain, and view hierarchy that Cocoa provides. The main exceptions would be applications that are built mostly using an alternate framework like OpenGL or WebKit.
Figure out what, specifically, the Steam application does that you'd like to do. Take a look at the tools that Cocoa provides to see if you can figure it out yourself; if not, ask about it here.
I'm trying to port an application I've written in Qt from the Windows platform to the Mac OS X platform.
The application is relatively simple:
It queries the user for a document (either an MS Word or an OOo Writer document). It then launches that document inside the respective application, and then replaces various text elements with other data (think mail merge).
It starts up the application and does the text replacement using QAxObject, which is a wrapper for COM.
Now I want to port this to Mac OS X. I've installed Qt Creator on the Mac etc., but obviously COM is a Windows technology that is not available on Mac OS X.
So I've been looking around for techniques on Mac OS X that are similar to COM.
For now, I'm especially interested in using the OOo API http://api.openoffice.org/.
I'd like some pointers which techniques I should be looking at. I'm also willing to accept that this just plainly is not possible at all.
Thanks in advance.
A bit of information about COM on OS X is available in this 2004 article from O'Reilly's MacDevCenter.
From the description of your problem, however, you're looking for something that works with Apple Events. Apple has developed an entire language for working with Apple Events, so most people equate them with the language -- AppleScript. A good place to start with scripting GUI applications is AppleScript, or Apple Events more generally.
Each directly scriptable app has a "dictionary" of "verbs" and "nouns" you can manipulate. Nouns have properties, and the name of a property is often either a string or the name of another noun (or a plural of a noun, which implies a collection - an array).
If the app doesn't have a dictionary (i.e. it isn't scriptable) or doesn't provide what you need via the dictionary, it is possible to send generic UI scripting commands to an "application" called "System Events".
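As a small illustration of sending such a command from native code, here is a Swift sketch that runs a one-line AppleScript against System Events through NSAppleScript. The script itself is only an example query (the name of the frontmost process) and is not specific to Word or OpenOffice; on recent systems the user may also have to approve the automation request.

```swift
import Foundation

// Example script: ask System Events which application process is frontmost.
let source = """
tell application "System Events"
    get name of first application process whose frontmost is true
end tell
"""

var errorInfo: NSDictionary?
if let script = NSAppleScript(source: source) {
    // Compiles and runs the script; errorInfo is filled in on failure.
    let result = script.executeAndReturnError(&errorInfo)
    if let errorInfo = errorInfo {
        print("AppleScript error: \(errorInfo)")
    } else {
        print(result.stringValue ?? "(no string result)")
    }
} else {
    print("Could not create the script object")
}
```

From a Qt application, my assumption is that you would either call the same class from an Objective-C++ (.mm) file or run the script through the osascript command-line tool with QProcess.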
Hmmm -- not a lot of experience in the OOo arena, but have you considered using UNO, the component model that is part of OpenOffice?
Some documentation can be found in the Developer's Guide here.