Vocal biomarkers are objective measurements made from speech signals that convey health or wellness information relevant to the speaker. Sonde has developed mental fitness vocal biomarkers that are derived from short segments of “free speech”: open-ended verbal responses that users give when prompted with open-ended questions. Research has identified a number of vocal biomarkers that indicate mental fitness status and that individuals can use to track how their mental fitness may change over time. All of the vocal biomarkers measured here have been shown to trend lower for individuals with reduced mental fitness and higher for individuals with higher mental fitness. Increasing or decreasing scores should therefore be interpreted as increasing or decreasing mental fitness.
Mental fitness is a category of health and wellness that encompasses both psychiatric aspects, including depressive symptoms (e.g., pleasure or enjoyment, feelings of hope or lack thereof, altered behavior patterns), and (mental) fatigue. Because these vocal biomarkers are derived from voice analysis, conditions that affect the voice for other reasons (e.g., airway infections or inflammation, speech disorders, intoxication) can also affect these measurements. The vocal biomarker scores do not provide a prediction about depression or other conditions, and describing these scores as falling in “normal” or “abnormal” ranges is likely to confuse end users. Population reference ranges for typical values of the vocal biomarker features are provided below and should help users understand whether they are relatively low or high on these measures. Comparison of these reference ranges across Sonde datasets spanning multiple countries and languages has indicated that the ranges are globally applicable.
Individuals with reduced mental fitness may not have low scores for all of these vocal biomarkers; likewise, healthy individuals may not have high values for all measures. The scores provide the most value and the strongest health insights when they are used repeatedly as a health tracker, so that users can observe their baseline and trends over time, in particular consistent increases or decreases in one or more vocal biomarker features.
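As a rough illustration of the tracking idea above, the sketch below computes a personal baseline and checks for a consistent run of increases or decreases in one feature. The function names, the use of the median as a baseline, and the three-step run length are assumptions made for this example; they are not part of the Sonde API or a documented Sonde method.

```python
from statistics import median


def baseline(scores):
    """Median of a user's past scores, used here as a simple personal baseline.

    Assumption: the median is only one reasonable choice of baseline.
    """
    return median(scores)


def consistent_trend(scores, min_run=3):
    """Classify the most recent `min_run` score-to-score changes.

    Returns "increasing" if every one of the last `min_run` changes is positive,
    "decreasing" if every one is negative, and "none" otherwise (including when
    there are too few scores to judge).
    """
    if len(scores) < min_run + 1:
        return "none"
    recent = scores[-(min_run + 1):]
    diffs = [later - earlier for earlier, later in zip(recent, recent[1:])]
    if all(d > 0 for d in diffs):
        return "increasing"
    if all(d < 0 for d in diffs):
        return "decreasing"
    return "none"


# Hypothetical history of one feature score, oldest first.
history = [72, 74, 71, 69, 66, 64]
print(baseline(history))          # 70.0
print(consistent_trend(history))  # decreasing
```

A stricter or looser definition of "consistent" (for example a longer run, or a smoothed average) may suit a given app better; the point is only that trends are judged against the user's own history rather than a single reading.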
The following tables describe the Mental Fitness vocal biomarker voice features. Sonde APIs return the scores for these features. For each feature, both a recommended end-user-friendly brief description and an expanded description are provided. The Sonde Mental Fitness Tracker App showcases a recommended way to share feature descriptions and score interpretations with mobile app end users.
MF Acoustic Scores - Version 2
The following scores are available through the API (see the Use Case document).
Acoustic score definitions and range interpretation (this section for app developer reference)
Note: if a score falls outside the specified min-max range (see the measure / display range column in the table), it is clipped to the minimum or maximum value.
Each acoustic and aggregate vocal biomarker listed below is correlated to mental fitness: higher values indicate higher mental fitness, lower values indicate lower mental fitness.
MF Acoustic Vocal biomarker scores available through API | Unit For display in app | Reference range Approximate population distribution (higher or lower values can occur) | Measure / display range API backend will clip values outside this range | Remarks |
---|---|---|---|---|
Smoothness | % | 60 - 100 | 40 - 100 | Mathematically can extend below 40%, but values <60% are uncommon |
Control | % | 70 - 100 | 55 - 100 | Mathematically can extend below 55%, but values <70% are uncommon |
Liveliness | octaves | 0.10 - 0.40 | 0 - 0.45 | Has a broad “normal” range; no hard upper limit. In regular conversational speech >0.40 octaves is uncommon |
Energy range | dB | 15 - 30 | 0 - 36 | No hard upper limit; in regular conversational speech >36 dB is uncommon |
Clarity | kHz² | 0.20 - 0.30 | 0 - 0.36 | Range is defined by anatomy of vocal tract |
Crispness | ms | 150 - 500 | 0 - 600 | Mathematically can extend beyond 600 but it is uncommon |
MF Acoustic Vocal Biomarker Mental Fitness Score through API | Unit For display in app | Reference range Approximate population distribution (higher or lower values can occur) | Measure / display range API backend will clip values outside this range | Remarks |
---|---|---|---|---|
Mental Fitness Score | % | 60 - 85 | 0 - 100 | 0-54: “Pay attention” |
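The clipping behaviour described in the note above can be mirrored on the client side, for example when validating values before display. The sketch below copies the measure / display ranges from the Version 2 tables; the dictionary keys and the `clip_score` function are hypothetical names chosen for this example and may not match the field names the API actually returns.

```python
# Measure / display ranges copied from the Version 2 tables above.
# Keys are hypothetical identifiers, not confirmed API field names.
DISPLAY_RANGE_V2 = {
    "smoothness": (40.0, 100.0),           # %
    "control": (55.0, 100.0),              # %
    "liveliness": (0.0, 0.45),             # octaves
    "energy_range": (0.0, 36.0),           # dB
    "clarity": (0.0, 0.36),                # kHz^2
    "crispness": (0.0, 600.0),             # ms
    "mental_fitness_score": (0.0, 100.0),  # %
}


def clip_score(name, value):
    """Clamp `value` to the display range of the named score."""
    low, high = DISPLAY_RANGE_V2[name]
    return max(low, min(high, value))


print(clip_score("smoothness", 35.0))  # 40.0
print(clip_score("liveliness", 0.50))  # 0.45
print(clip_score("crispness", 300.0))  # 300.0
```

Since the API backend already clips values, a client would normally receive in-range scores; a local check like this is only a defensive measure.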
MF Acoustic score interpretation (recommended descriptions for user explanation in-app)
Vocal biomarker | Brief Description Directly below score display | Expanded Description Below score history / trend chart |
---|---|---|
Smoothness | Reduced mental health can negatively impact our ability to control our vocal pitch. When pitch control decreases, the smoothness of our voice decreases. | This measures small changes in voice pitch that happen as we speak. Pitch is a measure of how low or high your voice’s tone is. We control pitch deliberately to emphasize words and convey emotion, but small and rapid irregularities in pitch also occur. Less control over vocal production can reduce smoothness, which may be perceived as a hoarse or rough voice quality. Typically, smoothness will fall between 60% and 100%. Speaking requires rapid and precise control over our vocal muscles. Changes in muscle tone, fatigue, inflammation, as well as mental health can affect this degree of control. Changes in mental health can be reflected in changing smoothness. |
Control | Reduced mental health can negatively impact our ability to have precise control over vocal muscles. When control over the vocal muscles decreases, our control score decreases. | This measures small pressure changes in the sound waves that our vocal folds produce while speaking. Reduced control over voice production can increase these pressure variations, which may be perceived as a breathy voice quality. Typically, control will fall between 70% and 100%. Speaking requires rapid and precise control over our vocal muscles. Changes in muscle tone, fatigue, inflammation, as well as mental health can affect this degree of control. Changes in mental health can be reflected in changing levels of vocal control. |
Liveliness | Depressed emotions or reduced mental health can affect how much vocal variety we use. Less variety or liveliness in our voice results in a more monotone and less engaging voice. | This measures the amount of deliberate change in voice pitch that happens as we speak. Pitch is a measure of how low or high the voice’s tone is. Changes in pitch are used to emphasize words and convey emotion. A higher liveliness score indicates more variety in pitch. Typically, this measure will fall between 0.10 and 0.40 octaves (an octave is a doubling of pitch). When we speak, we use vocal variety to engage the listener and express emotion. When these emotions or mental health are reduced, liveliness can decrease, making the voice sound more monotone and less lively. |
Energy range | Reduced mental health or fatigue can cause our vocal energy range to decrease. We normally speak with a varying intensity for emphasis, leading to higher energy range. | This measures how much vocal energy changes while speaking. Energy range is influenced by how much effort we put into making speech sounds, so that loud voices have more energy than soft voices. Sound energy is measured in decibels (dB), and energy range measures how much the energy changes while speaking. Typically, this measure will fall between 15 dB and 30 dB. When we speak, we change the intensity of our voice for emphasis and to engage the listener. When we are disengaged, tired, or experience reduced mental health, this can reduce energy range. |
Clarity | Reduced mental health can result in reduced movements of the tongue, jaw and lips. Reductions in these movements can lead to lower vocal clarity. | This measures the extent to which various vowels are produced differently during speaking. Different vowels are created by changing the position of the tongue, jaw and lips. Greater changes in position create more distinct vowels, which increases clarity. This may be perceived as more clearly intelligible speech. Typically, this measure will fall between 0.20 kHz² and 0.30 kHz². Speaking with clarity requires rapid and pronounced changes in position of the tongue, jaw, and lips. Reductions in the movements necessary to create these changes can lead to lower vocal clarity. Reduced vocal clarity has been associated with reduced mental health. |
Crispness | When we speak, each sound we produce has a typical duration. Reduced mental health or fatigue can lead to shorter sound durations and less crispness. | This measures how long, on average, we hold vowel sounds while speaking. Holding a sound steady over time requires coordination and effort of the vocal tract muscles, which can be affected by mental health or fatigue. Measuring crispness makes changes in speech sound durations visible. Crispness is measured in milliseconds (ms) and will typically fall between 150 and 500 ms. When we speak, each sound we produce has a typical duration over time. Fatigue or reduced mental health can lead to shorter sound durations and less crispness. Through measuring changes in vowel durations, we gain insight into mental fitness. |
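To make the units in the descriptions above more concrete, the sketch below converts a liveliness value in octaves into a pitch ratio (using the definition of an octave as a doubling of pitch) and an energy range in dB into a power ratio (using the standard decibel definition). The function names are invented for this example, and the conversions are general acoustics arithmetic, not a description of how Sonde computes these features.

```python
def octaves_to_pitch_ratio(octaves):
    """Ratio between the highest and lowest pitch spanning `octaves`.

    One octave is a doubling of pitch, so n octaves is a factor of 2**n.
    """
    return 2.0 ** octaves


def db_to_power_ratio(db):
    """Power (energy) ratio corresponding to a level difference of `db` decibels."""
    return 10.0 ** (db / 10.0)


# Upper end of the liveliness reference range: 0.40 octaves.
print(round(octaves_to_pitch_ratio(0.40), 2))  # 1.32

# Upper end of the energy range reference range: 30 dB.
print(round(db_to_power_ratio(30.0)))          # 1000
```

Read this way, a liveliness of 0.40 octaves means pitch varies by a factor of roughly 1.32, and an energy range of 30 dB means the loudest speech carries about 1000 times the power of the softest.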
MF Aggregate score interpretation (recommended descriptions for user explanation in-app)
Vocal biomarker Aggregate score | Brief Description Directly below score display | Expanded Description Below score history / trend chart |
---|---|---|
Score (Aggregate) | Your mental fitness score is based on measurements made on your voice. We use specific measurements that have been shown through research to relate to mental fitness. You can use changes in your score to reflect on how your mental fitness may be changing over time. The score ranges help you understand how your score relates to a larger population of users. Keep in mind that these scores are not medical assessments and should be used as just one part of how you understand your health and wellness. | n/a |
MF Acoustic Scores - Version 3
The following scores are available through the API (see the Use Case document).
Acoustic score definitions and range interpretation (this section for app developer reference)
Note: if a score falls outside the specified min-max range (see the measure / display range column in the table), it is clipped to the minimum or maximum value.
Each acoustic and aggregate vocal biomarker listed below is correlated to mental fitness: higher values indicate higher mental fitness, lower values indicate lower mental fitness.
MF Acoustic Vocal biomarker scores available through API | Unit For display in app | Reference range Approximate population distribution (higher or lower values can occur) | Measure / display range API backend will clip values outside this range | Remarks |
---|---|---|---|---|
Smoothness | % | 60 - 100 | 40 - 100 | Mathematically can extend below 40%, but values <60% are uncommon |
Control | % | 70 - 100 | 55 - 100 | Mathematically can extend below 55%, but values <70% are uncommon |
Liveliness | octaves | 0.10 - 0.40 | 0 - 0.45 | Has a broad “normal” range; no hard upper limit. In regular conversational speech >0.40 octaves is uncommon |
Energy range | dB | 15 - 30 | 0 - 36 | No hard upper limit; in regular conversational speech >36 dB is uncommon |
Clarity | kHz² | 0.20 - 0.30 | 0 - 0.36 | Range is defined by anatomy of vocal tract |
Crispness | ms | 150 - 500 | 0 - 600 | Mathematically can extend beyond 600 but it is uncommon |
Speech rate | words/min | 30-180 | 0-180 | Values outside the reference range are possible but uncommon |
Pause duration | sec | 0.25 - 1.25 | 0 - 1.5 | Values outside the reference range are possible but uncommon |
MF Acoustic Vocal Biomarker Mental Fitness Score through API | Unit For display in app | Reference range Approximate population distribution (higher or lower values can occur) | Measure / display range API backend will clip values outside this range | Remarks |
---|---|---|---|---|
Mental Fitness Score | % | 60 - 85 | 0 - 100 | 0-54: “Pay attention” |
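One way an app might use the Version 3 reference ranges is to report whether a score sits below, within, or above the approximate population range, and to flag the one Mental Fitness Score band named in the Remarks column. In the sketch below the ranges are copied from the tables above; the dictionary keys and both function names are hypothetical and are not taken from the Sonde API.

```python
# Reference ranges copied from the Version 3 tables above.
# Keys are hypothetical identifiers, not confirmed API field names.
REFERENCE_RANGE_V3 = {
    "smoothness": (60.0, 100.0),          # %
    "control": (70.0, 100.0),             # %
    "liveliness": (0.10, 0.40),           # octaves
    "energy_range": (15.0, 30.0),         # dB
    "clarity": (0.20, 0.30),              # kHz^2
    "crispness": (150.0, 500.0),          # ms
    "speech_rate": (30.0, 180.0),         # words/min
    "pause_duration": (0.25, 1.25),       # sec
    "mental_fitness_score": (60.0, 85.0), # %
}


def position_in_reference_range(name, value):
    """Return "below", "within", or "above" relative to the reference range."""
    low, high = REFERENCE_RANGE_V3[name]
    if value < low:
        return "below"
    if value > high:
        return "above"
    return "within"


def needs_attention(mental_fitness_score):
    """True when the score falls in the 0-54 band labelled "Pay attention"."""
    return 0 <= mental_fitness_score <= 54


print(position_in_reference_range("speech_rate", 120.0))   # within
print(position_in_reference_range("pause_duration", 1.4))  # above
print(needs_attention(50))                                 # True
```

The helper deliberately returns a neutral position rather than a "normal" or "abnormal" label, in keeping with the guidance that such labels are likely to confuse end users.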
MF Acoustic score interpretation (recommended descriptions for user explanation in-app)
Vocal biomarker | Brief Description Directly below score display | Expanded Description Below score history / trend chart |
---|---|---|
Smoothness | Reduced mental health can negatively impact our ability to control our vocal pitch. When pitch control decreases, the smoothness of our voice decreases. | This measures small changes in voice pitch that happen as we speak. Pitch is a measure of how low or high your voice’s tone is. We control pitch deliberately to emphasize words and convey emotion, but small and rapid irregularities in pitch also occur. Less control over vocal production can reduce smoothness, which may be perceived as a hoarse or rough voice quality. Typically, smoothness will fall between 60% and 100%. Speaking requires rapid and precise control over our vocal muscles. Changes in muscle tone, fatigue, inflammation, as well as mental health can affect this degree of control. Changes in mental health can be reflected in changing smoothness. |
Control | Reduced mental health can negatively impact our ability to have precise control over vocal muscles. When control over the vocal muscles decreases, our control score decreases. | This measures small pressure changes in the sound waves that our vocal folds produce while speaking. Reduced control over voice production can increase these pressure variations, which may be perceived as a breathy voice quality. Typically, control will fall between 70% and 100%. Speaking requires rapid and precise control over our vocal muscles. Changes in muscle tone, fatigue, inflammation, as well as mental health can affect this degree of control. Changes in mental health can be reflected in changing levels of vocal control. |
Liveliness | Depressed emotions or reduced mental health can affect how much vocal variety we use. Less variety or liveliness in our voice results in a more monotone and less engaging voice. | This measures the amount of deliberate change in voice pitch that happens as we speak. Pitch is a measure of how low or high the voice’s tone is. Changes in pitch are used to emphasize words and convey emotion. A higher liveliness score indicates more variety in pitch. Typically, this measure will fall between 0.10 and 0.40 octaves (an octave is a doubling of pitch). When we speak, we use vocal variety to engage the listener and express emotion. When these emotions or mental health are reduced, liveliness can decrease, making the voice sound more monotone and less lively. |
Energy range | Reduced mental health or fatigue can cause our vocal energy range to decrease. We normally speak with a varying intensity for emphasis, leading to higher energy range. | This measures how much vocal energy changes while speaking. Energy range is influenced by how much effort we put into making speech sounds, so that loud voices have more energy than soft voices. Sound energy is measured in decibels (dB), and energy range measures how much the energy changes while speaking. Typically, this measure will fall between 15 dB and 30 dB. When we speak, we change the intensity of our voice for emphasis and to engage the listener. When we are disengaged, tired, or experience reduced mental health, this can reduce energy range. |
Clarity | Reduced mental health can result in reduced movements of the tongue, jaw and lips. Reductions in these movements can lead to lower vocal clarity. | This measures the extent to which various vowels are produced differently during speaking. Different vowels are created by changing the position of the tongue, jaw and lips. Greater changes in position create more distinct vowels, which increases clarity. This may be perceived as more clearly intelligible speech. Typically, this measure will fall between 0.20 kHz² and 0.30 kHz². Speaking with clarity requires rapid and pronounced changes in position of the tongue, jaw, and lips. Reductions in the movements necessary to create these changes can lead to lower vocal clarity. Reduced vocal clarity has been associated with reduced mental health. |
Crispness | When we speak, each sound we produce has a typical duration. Reduced mental health or fatigue can lead to shorter sound durations and less crispness. | This measures how long, on average, we hold vowel sounds while speaking. Holding a sound steady over time requires coordination and effort of the vocal tract muscles, which can be affected by mental health or fatigue. Measuring crispness makes changes in speech sound durations visible. Crispness is measured in milliseconds (ms) and will typically fall between 150 and 500 ms. When we speak, each sound we produce has a typical duration over time. Fatigue or reduced mental health can lead to shorter sound durations and less crispness. Through measuring changes in vowel durations, we gain insight into mental fitness. |
Speech rate | Reduced mental health can lead to slower speech. Slower speech can be measured as a lower speech rate. | This measures how fast we speak, in terms of the number of words per minute of speaking. If we are not feeling at our best, our bodies and minds can slow down, including how we speak. Typically, speech rate will fall between 60 and 150 words per minute. When most people speak at their usual pace, their speech rate falls within a certain range. Changes in alertness or fatigue, certain medications, as well as mental health can affect how fast we speak. Changes in mental health can be reflected in changing speech rate. |
Pause duration | If we are not feeling well, we tend to have longer silence gaps when we speak. More silence in our speech can be measured as an increase in pause duration. | This measures how long, on average, the silence gaps are when we speak. Longer silent gaps can be an indication that we are struggling more to put our thoughts into words and sentences, which can be caused by reduced mental health or fatigue. Pause duration is measured in seconds (sec) and will typically fall between 0.3 and 1.0 seconds. When we speak, we pause occasionally to catch our breath and formulate our thoughts. Changes in alertness or fatigue, certain medications, as well as mental health can affect how long these pauses are. Changes in mental health can be reflected in changing pause duration. |
MF Aggregate score interpretation (recommended descriptions for user explanation in-app)
Vocal biomarker Aggregate score | Brief Description Directly below score display | Expanded Description Below score history / trend chart |
---|---|---|
Score (Aggregate) | Your mental fitness score is based on measurements made on your voice. We use specific measurements that have been shown through research to relate to mental fitness. You can use changes in your score to reflect on how your mental fitness may be changing over time. The score ranges help you understand how your score relates to a larger population of users. Keep in mind that these scores are not medical assessments and should be used as just one part of how you understand your health and wellness. | n/a |
Standard unit links for reference: