TALKING APP KIT

iOS Voice Operation

Create a Talking App Effortlessly.

Download Now

STEP 1: Watch video above.

More advanced than Siri and Alexa. Interactivity – speak over it anytime. Own It Forever, Offline.

This package includes the Siri-like interface: smooth and fluid, with the feeling of talking to a person.

Respond quickly to voice and speak responses, as shown in the video above. The advanced system listens to the user even while it is speaking, letting the user talk over it and change the conversation midstream. Implement simple, short commands to direct the conversation, offline and with unlimited use.

What code is required in the video above?

let skippableActions: Set<Action> = [.performingHumor, .askProfileQuestion]

self.recognize(interpretation([ RequestMoveOn ], preCondition: { spokenWords, finalInterpretation in
    // Determine if action is necessary before performing expensive interpretation.
    return skippableActions.contains(Overseer.currentAction)
})) { resume in
    switch Overseer.currentAction {
    case .performingHumor:
        self.hideAllViews()
        self.conscious.iWasFunny(humorLevel: .low)
        self.speak([.sureLetsSkipIt], finished: {
            resume(.continueExistingTrain)
        })
    default:
        resume(.continueExistingTrain)
    }
}

At 26 seconds in the video, the user says “Mmm, good question” while the app is still speaking. We recognize the comment even over our own audio coming from the speaker, and respond. Here we look for phrasing indicating a compliment, speak a thank-you, then signal back to continue the existing train. In this case, we return to the middle of the question we were asking. To limit expensive interpretation, this is only listened for while asking a question.

self.recognize(interpretation([ ThatsAGreatQuestion ], not: [ ThatsNotAGreatQuestion ]), { resume in
    self.speak([.thankYou]) {
        resume(.continueExistingTrain)
    }
}, during: [.askProfileQuestion])

Recognize a request to ask a question at any time.

self.recognize(interpretation([ CanIAskAQuestion ])) { resume in
    guard !Overseer.actionOccuring(.handlingUserQuestion) else {
        self.handleNewQuestion()
        return
    }
    self.enterQuestionMode()
}

Speak an answer to their question and automatically continue listening.

self.recognize(interpretation([ HowLongDoesThisTake ], preCondition: { spokenWords, finalInterpretation in
    // No interpretation will be performed if the pre-condition is not met.
    return Overseer.inQuestionAskingMode
})) { resume in
    self.speak([.thisIsUnlimited, .isThereAnythingElse])
}

Recognize multiple forms of ‘nevermind’ and ‘finished’ interpretations; then, depending on the action, skip to the next part of the flow or stop speaking the current dialog. Use pre-conditions to limit expensive interpretations.

let iGetItActions: Set<Action> = [.handlingUserQuestion, .displayingPrivacyPolicy]

self.recognize(interpretation([ IGetIt, Nevermind, ImFinished ], preCondition: { spokenWords, finalInterpretation in
    // Only look for this interpretation when speaking or performing a skippable action.
    return self.speechService.speaking || Overseer.actionInProgress(iGetItActions)
})) { resume in
    switch Overseer.currentAction {
    case .handlingUserQuestion:
        self.endUserQuestionAsking {
            resume(.restartLastTrain)
        }
    case .displayingPrivacyPolicy:
        self.moveOnFromPrivacy {
            resume(.continueExistingTrain)
        }
    default:
        resume(.endCurrentDialog)
    }
}

Set up a response that is listened for only while speaking certain dialogs. The user’s ‘no’ is recognized both while we are speaking and after we have finished.

self.recognize(interpretation([ No ]), { resume in
    self.endUserQuestionAsking {
        resume(.restartLastTrain)
    }
}, whileSpeaking: [.isThereAnythingElse])

self.recognize(interpretation([ RequestMoveOn ], preCondition: { spokenWords, finalInterpretation in
    return Overseer.currentAction == .askProfileQuestion
})) { resume in
    self.hideAllViews()
    self.speak([.sure])
    self.questions.moveToNext()
}

How is Speech Understood?

A straightforward markup language is used to recognize intentions. Below is the code used for recognizing the user’s request to ask a question at 13 seconds in the video.

static let CanIAskAQuestion = Enobus(withString: "[have/got/I’ve1 question2] [ask something/question] [question1 you2] [quick1 question2] [need help/assistance] 2[hey1 aimm/aim/aimee/aimey/aime2] [see/show/display1 menu/options2] 3[I1 am/have2 question(s)3]")

[have/got/I’ve1 question2]

Number to the right indicates order: ‘have’ must come before ‘question’.

2[hey1 aimm/aim/aimey/aime2]

Number to the left indicates total number of phrases required to be uttered.

3[I1 am/have2 question(s)3]

Slashes allow alternate words. Parentheses create variations such as ‘question’ and ‘questions’.

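Combining these rules, here is a sketch of a new interpretation, assuming the `Enobus(withString:)` initializer shown above; the intent name `CancelMyBooking` and its phrasing groups are hypothetical examples:

```swift
// Hypothetical interpretation sketch (assumes the Enobus markup described above).
// Order numbers, a left-hand required count, slashes, and parentheses combined:
static let CancelMyBooking = Enobus(withString: "[cancel1 booking(s)/reservation(s)2] 2[want/need1 cancel2] [call1 it2 off3]")
```

In the first group, ‘cancel’ must come before either noun; slashes offer alternate words; parentheses cover plurals; and the leading 2 requires both phrases of that group to be uttered.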
Flights or Hotels Example

Our technology can be used powerfully in many industries, for example booking flights and hotels. Recognizing the intention of “Can I book early?” with no word order specified captures various asking styles, shown below.

static let CanIBookEarly = Enobus(withString: "3[like/can/flight/want book early/soon/right-away/now]")

“Can I book early?”

“Can I book the flight early?”

“I want to book now.”

“I’d like to book soon.”

Multiple groups can be used to capture a variety of phrasing for asking for a single hotel room.

static let SingleRoom = Enobus(withString: "[one/single room(s)] [only/just me]")

“I want a single room.”

“A room just for me, please.”

“A single room, please.”

“Me only.”

You’ll create phrasing groups anticipating how your users will ask. In this example, we are looking for indication of whether they would like a view with their hotel room. Syntax checking guides you.

static let ViewIsImportant = Enobus(withString: "[room(s) view] [view important/crucial/necessary] [great/outstanding/stellar view]")

“I need a room with a view.”

“Crucial is the view.”

“The view is important.”

“It must have a stellar view.”

Conversation Flow

Now, string together your interpretations to create your conversation. In this hotels example, your user speaks one of the following phrases…

“I want a single room, view is important. And I’d like to book right away.”

“I want a single room, book now, and the view should be great.”

You’ll listen for all three intentions at the same time, altering data as each comes in. Then, when the user is finished speaking, you’ll direct to the next part of the conversation.

self.listen(for: [ CanIBookEarly, SingleRoom, ViewIsImportant ]) { heard, final in

    if heard == CanIBookEarly {
        user.bookNow = true
    } else if heard == SingleRoom {
        room.size = .single
    } else if heard == ViewIsImportant {
        room.view = .highPriority
    }

    if final {
        self.gatherRoomDetails(room)
    }
}

Then you’ll use a convenience method to speak and listen for responses about hotel bed preferences. We’ll listen while we are speaking and after, and we’ll send you any unusual responses.

“I’ll take a king size bed.”

“Let’s go with a single bed.”

func gatherRoomDetails(_ room: Room) {

    self.speak(.whatKindOfBedWouldYouLike, listenFor: [ SingleBed, DoubleBed, QueenBed, KingBed ], { choice in
        // Handle selection
    }, unknownResponse: { type in
        // Handle unusual responses
    })
}

Efficient, Unlimited, and Automatic iOS Speech Recognition

With thousands of built-in interpretations, a provided voice, and advanced customization available, you’ll quickly create a heightened, beautiful experience that is as effortless for your users as speaking.

Our mission is to spread better interfaces to everyone. Retrofit any iPhone or iPad app. Start Now.

Contact Us