Speech Recognition & Speaker Verification
Speech Recognition
In many hands-busy and eyes-busy conditions, voice command using speech recognition provides the best means of human-machine interface. SR is a language-independent speech recognition technology: it can be used for any language. SR applies to 4-kHz telephone-bandwidth speech input (with 8-kHz sample rate) for low-cost implementation. For applications that require higher recognition accuracy, SR can be easily extended to wider-bandwidth speech input (such as 7 kHz or higher) for further enhanced recognition performance.
FEATURES
- Applicable to all languages.
- Accurate recognition of short to long utterances: sentences, phrases, digits, English alphabet (A-Z).
- Minimal training for new speakers: TWO utterances for each speech template (more training is better but two are sufficient).
- Flexible vocabulary size depending on applications.
- Input format: 16 bit/sample linear PCM, 8-kHz sample rate.
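
As an illustration of the input format listed above, the following minimal Python sketch checks whether a WAV file holds 16-bit linear PCM at an 8-kHz sample rate before it is passed to a recognizer. The function names (matches_input_format, make_silence) are hypothetical and are not part of any SR interface; the sketch uses only the Python standard library and does not check the channel count, since the feature list does not state one.

```python
import io
import wave

SAMPLE_RATE_HZ = 8000    # 8-kHz sample rate, from the feature list
SAMPLE_WIDTH_BYTES = 2   # 16 bit/sample


def matches_input_format(source):
    """Return True if `source` (a path or binary file object) is a WAV
    stream of 16-bit linear PCM sampled at 8 kHz."""
    try:
        with wave.open(source, "rb") as wav:
            return (
                wav.getcomptype() == "NONE"  # uncompressed (linear) PCM
                and wav.getsampwidth() == SAMPLE_WIDTH_BYTES
                and wav.getframerate() == SAMPLE_RATE_HZ
            )
    except wave.Error:
        # Not a WAV file the standard library can parse (e.g. compressed audio).
        return False


def make_silence(rate_hz, width_bytes, seconds=1):
    """Build an in-memory mono WAV of silence, used here only for demonstration."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(width_bytes)
        wav.setframerate(rate_hz)
        wav.writeframes(b"\x00" * (width_bytes * rate_hz * seconds))
    buf.seek(0)
    return buf


if __name__ == "__main__":
    print(matches_input_format(make_silence(8000, 2)))   # matching format
    print(matches_input_format(make_silence(16000, 2)))  # wrong sample rate
```

A check like this is useful at the front of a pipeline because audio recorded at another sample rate or bit depth would need to be converted first.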
PERFORMANCE
- Recognition Accuracy: tested under quiet conditions - digits (99.7%), phrases (99%, size of 20), A-Z (90%).
- Field Test: tested under driving conditions on highways and local roads - at 35 mph: 98% accuracy for spoken phrases; at 60 mph: 95% accuracy for spoken phrases.
Speaker Verification
With the explosive proliferation of mobile devices such as notebook PCs, cellphones, and PDAs, many of which store important and confidential personal or corporate information, keeping unauthorized people from accessing these devices has become a serious issue.
Speaker verification (SV), a biometric secure-access solution, has proven to be an effective alternative to more expensive technologies such as fingerprint reading. In particular, the CYBIT SV solution provides double protection against unauthorized access by combining personal speech characteristics with a voice password to verify a speaker's identity. The spoken password is language-independent (i.e., it can be in any language) and can include digits, letters, words, or a combination of them.

SV offers a high-security but low-cost solution for many access control applications. With its small hardware resource requirement, SV is especially useful for mobile devices. SV as described below applies to 4-kHz telephone-bandwidth speech input (with 8-kHz sample rate) for low-cost implementation. For higher-security applications, SV can be easily extended to wider-band speech input (such as 7 kHz or higher) for further enhanced verification performance.
FEATURES
- Training: A new user needs to speak the voice password only TWICE (more training is better, but twice is sufficient).
- Voice password: Can use digits, letters, words, or a combination of them. Suggested length: about 4 syllables.
PERFORMANCE
- Test condition: 35 speakers; voice passwords spoken at 10-dB SNR (signal-to-noise ratio).
- Test result: 95% acceptance rate (when both speaker and password are correct) and 95% rejection rate (when either speaker or password is incorrect).
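
For readers unfamiliar with the test condition, a 10-dB SNR means that the speech power is ten times the noise power. The short Python sketch below computes SNR in decibels from two sample sequences using the standard definition (ten times the base-10 logarithm of the power ratio). The function names are hypothetical, and the sample values are illustrative only, not test data from this product.

```python
import math


def mean_power(samples):
    """Average power of a sample sequence (mean of squared amplitudes)."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(signal_samples, noise_samples):
    """Signal-to-noise ratio in decibels: 10 * log10(P_signal / P_noise)."""
    return 10.0 * math.log10(mean_power(signal_samples) / mean_power(noise_samples))


if __name__ == "__main__":
    # Illustrative values: amplitude ratio of 10 gives a power ratio of 100.
    speech = [10.0, -10.0, 10.0, -10.0]
    noise = [1.0, -1.0, 1.0, -1.0]
    print(round(snr_db(speech, noise), 1))
```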
Note that SV can be easily tuned (by changing certain parameters) for a higher acceptance rate or a higher rejection rate, depending on the application scenario. For example, to minimize unauthorized access, a higher rejection rate is desired (e.g., 90% acceptance rate and 98% rejection rate). On the other hand, to reduce the inconvenience of an authentic user being rejected, a higher acceptance rate is desired (e.g., 98% acceptance rate and 90% rejection rate). Raising the acceptance rate lowers the rejection rate, and vice versa, so the best compromise has to be determined for each specific application.
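
The trade-off described above can be sketched in Python. This is a minimal illustration that assumes the tunable parameter is a single decision threshold applied to a similarity score; the actual SV parameters are not specified here and may differ. The function name and all score values are hypothetical and are not measurements from this product.

```python
def rates_at_threshold(genuine_scores, impostor_scores, threshold):
    """Return (acceptance_rate, rejection_rate) for a given threshold.

    acceptance_rate: fraction of genuine attempts (correct speaker and
                     correct password) scoring at or above the threshold.
    rejection_rate:  fraction of impostor attempts (wrong speaker or wrong
                     password) scoring below the threshold.
    """
    accepted = sum(1 for s in genuine_scores if s >= threshold)
    rejected = sum(1 for s in impostor_scores if s < threshold)
    return accepted / len(genuine_scores), rejected / len(impostor_scores)


if __name__ == "__main__":
    # Hypothetical similarity scores, for illustration only.
    genuine = [0.9, 0.8, 0.7, 0.6]
    impostor = [0.65, 0.4, 0.3, 0.2]

    # A lower threshold favors convenience; a higher one favors security.
    for threshold in (0.5, 0.75):
        acceptance, rejection = rates_at_threshold(genuine, impostor, threshold)
        print(f"threshold={threshold}: acceptance={acceptance:.2f}, rejection={rejection:.2f}")
```

Under this assumption, sweeping the threshold over recorded trial scores shows each achievable pair of acceptance and rejection rates, from which the compromise for a given application can be picked.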
