About me
I am a micro-entrepreneur: I run a one-man software company, and I love what I do.
I specialize in creating applications for Windows and data processing systems for Unix/Linux.
I operate in several areas, mainly:
I like it when all stages of the software development process are complete.
I love to build systems that scale.
I try to create things as simple and easy to use as possible.
Piotr Chlebek
SharkTime Software
Publications
Sep 12, 2020
Comparing Speech Recognition Services for HCI Applications in Behavioral Health
Presented at the 5th International Workshop on Mental Health and Well-Being: Sensing and Intervention.
Website: UbiComp 2020 workshop.
Behavioral health conditions such as depression and anxiety are a global concern, and there is growing interest in employing speech technology to screen and monitor patients remotely. Language modeling approaches require automatic speech recognition (ASR) and multiple privacy-compliant ASR services are commercially available. We use a corpus of over 60 hours of speech from a behavioral health task, and compare ASR performance for four commercial vendors.
We expected similar performance, but found large differences between the top and next-best performer, for both mobile (48% relative WER increase) and laptop (67% relative WER increase) data. Results suggest the importance of benchmarking ASR systems in this domain. Additionally, we find that WER is not systematically related to depression itself. Performance is, however, affected by diverse audio quality from users’ personal devices, and possibly by the overall style of speech in this domain.
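For readers unfamiliar with the metric, the following is a minimal sketch of how word error rate (WER) and a relative WER increase are commonly computed. It is not the evaluation code used in the paper; the function names and the sample numbers in the usage note are illustrative assumptions.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_increase(best_wer: float, other_wer: float) -> float:
    """Relative increase of other_wer over best_wer, as a fraction."""
    if best_wer <= 0:
        raise ValueError("best_wer must be positive")
    return (other_wer - best_wer) / best_wer
```

As a usage example with made-up values (not taken from the paper), a best WER of 0.10 and a next-best WER of 0.148 give `relative_wer_increase(0.10, 0.148)` of about 0.48, that is, a 48% relative increase.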
Patents
Issued Dec 22, 2015
AUTOMATIC TUNING OF SPEECH RECOGNITION PARAMETERS
System and techniques for automatic tuning of speech recognition parameters are described herein. A clean audio segment and a dirty audio segment may be obtained. In an iterative fashion, optimized preprocessing parameters may be obtained by, at an iteration, selecting a set of parameters, preprocessing the clean audio segment with the set of parameters to produce a first result, preprocessing the dirty audio segment with the set of parameters to produce a second result, and scoring a portion of the first result with a corresponding portion of the second result using clean-diff. When an optimization threshold is reached, the iterative process exits and provides the set of parameters from the last iteration.
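The loop described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented method: the preprocessing step (a simple noise gate), the candidate list, and the scoring function (mean absolute difference, standing in for "clean-diff", which the abstract does not define) are all hypothetical.

```python
from typing import Iterable, List, Optional, Sequence, Tuple


def noise_gate(samples: Sequence[float], gate: float) -> List[float]:
    """Hypothetical preprocessing: zero every sample quieter than `gate`."""
    return [0.0 if abs(s) < gate else s for s in samples]


def mean_abs_diff(first: Sequence[float], second: Sequence[float]) -> float:
    """Hypothetical stand-in for the clean-diff score (lower is better)."""
    return sum(abs(x - y) for x, y in zip(first, second)) / len(first)


def tune_gate(clean: Sequence[float],
              dirty: Sequence[float],
              candidates: Iterable[float],
              threshold: float) -> Optional[Tuple[float, float]]:
    """Try candidate parameters until the score reaches the threshold.

    Returns (gate, score) from the last iteration, or None if no
    candidate reaches the threshold.
    """
    for gate in candidates:                  # select a set of parameters
        first = noise_gate(clean, gate)      # preprocess the clean segment
        second = noise_gate(dirty, gate)     # preprocess the dirty segment
        score = mean_abs_diff(first, second)
        if score <= threshold:               # optimization threshold reached
            return gate, score
    return None
```

A real system would likely search a multi-dimensional parameter space with a smarter selection strategy than a fixed candidate list; the fixed list here only keeps the sketch short.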
Issued Jun 26, 2015
PHASE RESPONSE MISMATCH CORRECTION FOR MULTIPLE MICROPHONES
For a multiple microphone system, a phase response mismatch may be corrected. One embodiment includes: receiving audio from a first microphone and from a second microphone, the microphones being coupled to a single device for combining the received audio; recording the received audio from the first microphone and the second microphone before combining the received audio; detecting a phase response mismatch in the recording at the device between the audio received at the second microphone and the audio received at the first microphone; if a phase response mismatch is detected, estimating a phase delay between the second microphone and the first microphone; and storing the estimated phase delay for use in correcting the phase delay in received audio before combining the received audio.
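To make the idea of estimating, storing, and then applying a delay more concrete, here is a simplified sketch. It swaps in a different, much simpler technique than the patent describes: a whole-sample time delay found by brute-force cross-correlation, rather than a phase response estimate. The function names and the `max_lag` parameter are assumptions made for illustration.

```python
from typing import List, Sequence


def estimate_delay(first: Sequence[float],
                   second: Sequence[float],
                   max_lag: int) -> int:
    """Return the lag (in samples) by which `second` trails `first`.

    The lag with the highest cross-correlation within
    [-max_lag, max_lag] is chosen.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, x in enumerate(first):
            m = n + lag
            if 0 <= m < len(second):
                score += x * second[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def correct_delay(second: Sequence[float], lag: int) -> List[float]:
    """Shift `second` by `lag` samples to undo the delay, padding with zeros."""
    n = len(second)
    lag = max(-n, min(n, lag))  # keep the output the same length as the input
    if lag > 0:
        return list(second[lag:]) + [0.0] * lag
    if lag < 0:
        return [0.0] * (-lag) + list(second[:lag])
    return list(second)
```

In this sketch the estimated lag would be computed once from a recording of both microphones, stored, and then passed to `correct_delay` on later audio before the two channels are combined.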
Issued Jul 4, 2014
REPLAY ATTACK DETECTION IN AUTOMATIC SPEAKER VERIFICATION SYSTEMS
Techniques related to detecting replay attacks on automatic speaker verification systems are discussed. Such techniques may include receiving an utterance from a user or a device playing back the utterance, determining features associated with the utterance, and classifying the utterance in a replay utterance class or an original utterance class based on a statistical classification or a margin classification of the utterance using the features.