Abstract: Two competing views of
regulating glottal airflow for maximum vocal
output are investigated theoretically. The
maximum power transfer theorem is used as a
guide. A wide epilarynx tube (laryngeal
vestibule) matches well with low glottal
resistance (believed to correspond to the
"yawn-sigh" approach in
voice therapy), whereas a narrow epilarynx tube
matches well with a higher glottal resistance
(believed to correspond to the "twang-belt"
approach). A simulation model is used to
calculate mean flows, peak flows, and oral
radiated pressure for an impedance ratio between
the vocal tract (the load) and the glottis (the
source). Results show that when the impedance
ratio approaches 1.0, maximum power is
transferred and radiated from the mouth. A full
update of the equations used for simulating
driving pressures, glottal flow, and vocal tract
input pressures is provided as a programming
guide for those interested in model
development.
INTRODUCTION
Speech-language pathologists and singing
teachers have generated two competing views (and
accompanying behavioral strategies) about the
management of airflow in phonation. On the one
hand, there is the strategy of using a "sigh" to
release air with the voice (Linklater, 1976;
Colton and Casper, 1996; Brown, 1996), or using
a flow phonation mode (Sundberg, 1987). This flow
mode strategy helps to obtain maximum
peak-to-peak glottal airflow.
On the other hand, there is the opposite
strategy of increasing the adduction of the
vocal folds, as in belt (Sullivan, 1985;
Bestebreurtje and Schutte, 2000) and some
country-western singing (Sundberg et al., 1999)
to decrease both the average glottal flow and
the peak flow for (perhaps) greater glottal
efficiency. Even in some classical singing
approaches, airflow reduction is sometimes
encouraged by the mental image of "drinking in
the air" rather than blowing out the air.
In this paper, a few data sets will be
presented that simulate a "tight adduction" case
and a "loose adduction" case with a computer
model of phonation. One objective of the study
is to show that both techniques can lead to an
optimum acoustic output at the mouth, but the
vocal tract configuration has to be matched to
the glottal configuration. Tight adduction of
the vocal folds requires a narrower supraglottal
airway, whereas looser adduction requires a
wider airway to maximize the output power. An
underlying guiding principle is the maximum
power transfer theorem in electric circuits and
transmission systems, which states that the
internal impedance of the source should match
the impedance of the load for maximum power
transfer.
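
The circuit-theory statement above can be checked numerically. The following is a minimal sketch, assuming a purely resistive, time-invariant source and load; the glottal source in the simulation model is neither, so this illustrates only the theorem and not the model itself. The function names and the unit source values are hypothetical choices for illustration.

```python
def load_power(source_pressure, r_source, r_load):
    """Power dissipated in a resistive load driven through a source resistance.

    Assumption: lumped, linear, purely resistive elements (the circuit analogy only).
    """
    flow = source_pressure / (r_source + r_load)
    return flow ** 2 * r_load


def best_ratio(r_source=1.0, source_pressure=1.0, steps=400):
    """Sweep the load/source ratio from 0.05 to 20 and return (ratio, power) at the peak."""
    ratios = [k / 20.0 for k in range(1, steps + 1)]
    powers = [load_power(source_pressure, r_source, ratio * r_source) for ratio in ratios]
    best = max(range(len(ratios)), key=powers.__getitem__)
    return ratios[best], powers[best]


if __name__ == "__main__":
    ratio, power = best_ratio()
    print(f"peak power {power:.4f} at load/source ratio {ratio:.2f}")
```

With these assumptions the sweep peaks where the load resistance equals the source resistance, and the delivered power falls off on either side of that ratio.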
A second objective of the study is to update
the aerodynamic driving force equations for a
low-dimensional model of vocal fold vibration in
detail. Some changes have occurred since
publication of the three-mass body-cover model
(Story and Titze, 1995), particularly with
regard to flow separation from the glottal wall
and collision forces. In order to continue
explorations with this low-dimensional body-cover
model, it is important to provide the
aerodynamic detail as a programming guide. This
dual objective makes this paper somewhat of a
nontraditional mixture between model development
and a clinical application. But this mixture is
justified by the fact that there is an
unfortunate history of "modeling for modeling's
sake," by this and other authors, with
insufficient benefit to practitioners in voice
and speech. This paper is an attempt to steer
toward application while also maintaining a
theoretical forward thrust.
[...]
CONCLUSIONS
Mean glottal airflow (or, alternatively,
glottal resistance) has been a target for
optimizing vocal output power in voice therapy
and singing training. The current investigation
suggests that the optimization process should
involve both the vocal tract and the vocal
folds. It appears that an impedance matching
between the two might take place. In general, a
wide epilarynx tube (from the ventricle to the
laryngeal vestibule) requires a low glottal
resistance for maximum power transfer.
Conversely, a narrow epilarynx tube requires a
high glottal resistance (more adduction) for
maximum power transfer. What Sundberg (1987) has
called the "flow mode" appears to be a condition
where the vocal tract impedance is considerably
smaller than the glottal impedance, making the
glottis a flow source acoustically, as for
steady-flow (aerodynamic) conditions.
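
The flow-source behavior described above can be illustrated with a lumped sketch. It assumes constant, purely resistive impedances in series, which is a simplification of the time-varying glottis, and the function names are hypothetical. The sketch measures how much the flow drops when the tract impedance doubles: a source whose flow barely changes is acting as a flow source.

```python
def flow(source_pressure, z_glottis, z_tract):
    """Flow through a source impedance and a load impedance in series (resistive assumption)."""
    return source_pressure / (z_glottis + z_tract)


def relative_flow_drop(z_glottis, z_tract):
    """Fractional drop in flow when the tract impedance is doubled."""
    before = flow(1.0, z_glottis, z_tract)
    after = flow(1.0, z_glottis, 2.0 * z_tract)
    return (before - after) / before


if __name__ == "__main__":
    # Tract impedance far below glottal impedance: flow is nearly load-independent.
    print(f"ratio 0.01: flow drops by {relative_flow_drop(1.0, 0.01):.2%}")
    # Matched impedances: flow depends strongly on the load.
    print(f"ratio 1.00: flow drops by {relative_flow_drop(1.0, 1.0):.2%}")
```

Under these assumptions, a tract impedance of one hundredth of the glottal impedance changes the flow by less than one percent when doubled, whereas at a ratio of 1.0 the same doubling removes a third of the flow.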
Vocologists (those who habilitate voices)
have some choices in guiding a speaker or
singer. If the desired (or acquired) voice
quality is to be bright and "twangy," as in some
forms of belting, gospel singing, or some
regional dialects, the vocal tract can be more
narrow in the epilaryngeal and pharyngeal region
(Estill et al., 1996; Story et al., 2001). For
such a vocal tract configuration, a
well-adducted pair of vocal folds, with
relatively high glottal resistance, would be a
good match. Because of this higher glottal
resistance, lung pressures would likely also be
on the high side. Conversely, if the desired (or
acquired) voice quality is to be "yawny,"
as in crooning, sobbing, or a mellow speech
dialect (Estill et al., 1996), the epilaryngeal
and pharyngeal vocal tract can be wider. For
this configuration, a lesser degree of
adduction, with lower glottal resistance and
probably lower lung pressure, is a good
match.
It is already known that "yawn-sigh"
is a good combination for voice therapy. Sigh
involves a glottal posture with low glottal
impedance that matches a "yawny" vocal
tract. Less is known about the "twang-belt"
combination in voice training and therapy. Here
the voice is sometimes initiated with a creaky
production, a tighter state of vocal fold
adduction. This is a match for twang, a tighter
vocal tract configuration. Some vocologists shy
away from a twang-belt approach to voice therapy
because they fear hyperfunction and excessive
vocal fold collision. But since mean glottal
flow is smaller, and hence presumably also the
amplitude of vibration of the vocal folds, it is
not clear that one or the other of these
techniques is necessarily more healthy. For the
time being, one must keep an open mind about
high-pressure, low-flow production as a viable
alternative to low-pressure, high-flow
production. The choice depends to a large degree
on the natural state of the vocal tract and the
voice quality to be achieved with it.
As a future investigation, it would be
worthwhile to examine if the subglottal
(tracheal) impedance could assume a compliant
characteristic to provide a true impedance match
as a complex conjugate to the supraglottal
impedance. It would also be instructive to test
maximum power transfer for conditions where the
fundamental frequency is at or above the first
formant frequency. Research is presently ongoing
in this area.
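
For reference, the complex-conjugate form of the matching condition mentioned above can be sketched as follows. This is the general circuit-theory result for sinusoidal steady state, not a result of the present simulations. The impedance values are arbitrary illustrative numbers, with a positive reactance standing in for an inertive impedance and a negative reactance for a compliant one.

```python
def load_power(source_amplitude, z_source, z_load):
    """Average power delivered to a complex load in sinusoidal steady state.

    Assumption: linear lumped impedances and a peak (not RMS) source amplitude.
    """
    current = source_amplitude / (z_source + z_load)
    return 0.5 * abs(current) ** 2 * z_load.real


# Hypothetical source impedance: resistance 1.0, inertive reactance +2.0.
z_source = complex(1.0, 2.0)

# Conjugate load: same resistance, compliant reactance -2.0, so the reactances cancel.
matched = load_power(1.0, z_source, z_source.conjugate())

# Two mismatched loads for comparison.
same_reactance = load_power(1.0, z_source, z_source)
magnitude_only = load_power(1.0, z_source, complex(abs(z_source), 0.0))

if __name__ == "__main__":
    print(matched, same_reactance, magnitude_only)
```

In this sketch the conjugate load receives more power than either a load with the same reactance sign or a purely resistive load of equal magnitude, which is why a compliant characteristic would be needed to complete the match against an inertive impedance.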
Source and filter adjustments affecting the
perception of the vocal qualities twang and yawn
Titze IR, Bergan CC, Hunter EJ, Story B
Department of Speech Pathology and Audiology,
National Center for Voice and Speech,
The University of Iowa, Iowa City 52242, USA.
Logoped Phoniatr Vocol. 2003; 28(4): 147-155
Two vocal qualities, twang and yawn, were
synthesized and rated perceptually. The stimuli
consisted of synthesized vocal productions of a
sentence-length utterance 'ya ya ya ya ya,'
which had speech-like intonation. In a continuum
transformation from normal to twang, the area in
the pharynx was gradually decreased, along with
vocal tract shortening and a decreased open
quotient in the glottal airflow. In a continuum
transformation toward yawn, the area in the
pharynx was gradually increased, along with
vocal tract lengthening and an increased open
quotient. The normal (untransformed) vocal tract
area was pre-determined by earlier studies
involving MRI scans of a human subject's vocal
tract. Listeners were asked to rate (on a scale
from 1 to 10) the 'amount of twang' in one
listening session and the 'amount of yawn' in
another listening session. Overall, the
perception of twang increased directly with
pharyngeal area narrowing, vocal tract
shortening, and decreased open quotient. The
perception of yawn increased with pharyngeal
area widening, vocal tract lengthening, and
increased open quotient. Adjustments of one
parameter alone yielded less significant
perceptual changes than the above combinations,
with open quotient showing the greatest effect
in isolation. Listeners demonstrated variable
perceptions in both continua with poor
inter-subject, intra-subject, and inter-group
reliability.