May 2011

Research Report
Prosodic Theory of Standard Japanese for Speech Representation

Towards the Revision of The Japanese Language Pronunciation and Accent Dictionary

Mitsuru Sugihara

In recent years, studies of prosody, the accent and intonation of speech, have been enriched by rapid technological advances in sound analysis tools and software. There are, however, different schools of thought on the basic assumptions from which interpretations are derived. While examining how these schools differ, we explain our basic editorial policy for the revision of The Japanese Language Pronunciation and Accent Dictionary. We also present an overview of recent developments in prosodic studies from the perspective of how they reflect those basic assumptions.

In the traditional study of the Japanese language, the mainstream approach to Japanese accent patterns is to mark relative pitch differences between moras and analyze how they are arranged. By contrast, there is an accent model that focuses on the rise and fall of pitch rather than on pitch differences between moras. This approach sets great store by the gradual fall of pitch in the accent of Standard Japanese. Another accent model goes further along these lines and concentrates exclusively on the fall of pitch. It treats the rise of pitch as part of intonation, which varies with meaning, and it has enjoyed renewed interest in recent years as a model well suited to practical speech education and to studies of the relationship between meaning and intonation. Reflecting these recent scholarly developments, the revised edition of The Japanese Language Pronunciation and Accent Dictionary will be compiled under revised editorial policies. For accent notation, pitch changes will be given. As for the position at which pitch rises, the dictionary will abandon the traditional “rule” that the second mora is high in every word except those whose first mora is high. Instead, it will mark the first mora as high in words whose second mora is the kana “n” or a long vowel, which is more in line with the way Standard Japanese is actually spoken.

Applying this model to speech representation leads to the following basic principle: “If the word carries meaning, rise. If not, do not rise.” “Intonation representing the correct meaning” means that a cluster of sounds (a tonal phrase) starting with a rising pitch corresponds to a cluster of meaning (a phrase). How the pitch rises is an important aspect of speech representation, both in how it relates to the traditional concept of “prominence” and in how varying nuances can be conveyed through a rapid or a slow rise. The combination of tonal phrases of different sizes can also be interpreted as corresponding to phrases of different sizes, and hence to clusters of meaning. This matters when considering appropriate style and speech representation in broadcasting, where one must deal with longer and more convoluted clusters of meaning than are usually encountered in everyday speech.

The NHK Monthly Report on Broadcast Research