Foreign words can often be intimidating since we English speakers have no frame of reference when it comes to pronunciation. This guide is intended to alleviate the intimidation when it comes to Japanese words. After reading this guide, you will no longer feel self-conscious when ordering at a Japanese restaurant, pronouncing a Japanese person’s name, or talking about your favorite anime.
Japanese pronunciations are actually a lot easier to deal with than English ones. Japanese sounds are far fewer and more pure than in English. English has some 8000+ sounds, in part because of all the glides from one sound to another. Japanese, on the other hand, is much more strict about its sounds. The total number of discrete sounds in Japanese is… 110. Yep, just 110. Have you ever heard a Japanese song that had some English sprinkled in it and noticed that the Japanese part sounded nice but the English part sounded like it was sung by a deaf person? Well, now you know why. That’s the sound of English getting reduced to 110 discrete sounds. This is both good and bad news for us. The good news is it’s a tiny, tiny number of sounds to get right. The bad news is that your natural language instincts are going to want to throw in all sorts of extra sounds that just don’t belong.
Before we delve into this guide, let me give you a quick overview of the Japanese writing system. Japan has two “alphabets”, which are more accurately called syllabaries since each character stands for a complete sound. This differs from the English alphabet where each letter makes up only part of a sound (try pronouncing “k” by itself). The first syllabary is called hiragana and is used for all Japanese-native words. The second one is called katakana and is used when writing loan words from other countries. Both syllabaries are character-for-character mirrors of each other, and many of the characters even look similar between the two syllabaries.
I sort of lied above when I said there were only two “alphabets” in Japanese. Japan also has Kanji, which are symbolic representations of words or ideas that the Japanese borrowed from the Chinese. Unlike hiragana and katakana, kanji do not represent discrete phonetic sounds. Most kanji have at least two pronunciations associated with them, on-yomi (the original Chinese reading, which has been forced into Japanese phonics) and kun-yomi (the Japanese reading). Kanji are the toughest part of learning Japanese for me because for every kanji you have to memorize not only the meanings but two or more pronunciations. And then it gets worse since most kanji are not strictly words by themselves, but general ideas. Many Japanese words consist of multiple kanji or, more frequently, a kanji with some hiragana tacked to the end. But let’s not get too bogged down with worrying about kanji right now. Just know that even kanji-based words have a phonetic pronunciation that can be represented using hiragana. In fact, young kids in Japan start learning to read and write exclusively in hiragana and katakana. Only later do they tackle learning kanji.
The wonderful thing about Japanese is it is completely phonetic. Each character always has the same sound no matter what other characters it happens to be next to. This enables us to transliterate hiragana/katakana into our (Roman) alphabet quite easily. The Japanese call this romaji. Here is a hiragana chart with romanizations:
You’ll notice that most of the characters consist of a consonant + vowel sound, for example KA, KI, KU, KE, KO. It’s important to remember that even though they are represented in our alphabet as multiple characters, in Japanese they are single characters, each with a set sound. Remembering this will help tremendously with your Japanese pronunciations because it will allow you to split any Japanese word into easily pronounced bits.
OK, now let’s get to the good stuff, the pronunciations. Let’s start with the vowels:
A – as in father. When the doctor asks you to say “Ahh” when looking into your mouth, that’s the sound we are looking for here.
I – as in machine. Yes, it’s a long “e” sound like the words “speed” or “read”.
U – as in “Jupiter”. It’s a double “o” kind of sound like “poo” or “you”.
E – as in “pen”. Enough said.
O – as in “hope”.
So our vowels sound something like ah, ee, oo, eh, oe. That’s it, just five vowel sounds. Please drill them into your head, as they are the essence of the language. Notice that there aren’t different versions of each vowel like in English. “A” in Japanese is always pronounced like it is in “father” and never like “apple” or “cape”. The same is true for the other vowels.
One interesting point is a few additional “English” vowel sounds can be approximated by joining together two Japanese vowels. For instance, using the guide above, say the sound for “E” and then the sound for “I”. It should sound like eh-ee. When said fast, doesn’t it sound like a long “a” as in “rain”? Ever hear of Seiko watches? We pronounce it “say-ko”, and this is pretty much bang-on. The only subtle difference is that in Japanese it is actually three syllables and not two. Now, using the guide above, say the sound for “A” followed by the sound for “I”. It should sound like ah-ee. When said fast, doesn’t it sound like the long “i” sound as in “high”?
I just want to stress that this is a cheap parlor trick. No real magic happens like it does in English. When you put two vowels together in Japanese, a new sound IS NOT produced. All that is happening is that two discrete vowel sounds, when said right after each other, form a complex sound that approximates a different vowel sound. “sei” might sound like “say”, but it’s actually two separate syllables, “se” and “i”. I know what you’re thinking… So what that they are two syllables?!?! When I say them together they sound like one so who cares?!?! Well, it turns out that the Japanese care. In Japanese, every syllable is given equal time when pronouncing a word. So we pronounce Seiko as say-ko, but the Japanese pronounce it seh-ee-ko. So in the true Japanese pronunciation it takes longer to spit out the “say” part. It may sound like splitting hairs, but it is an important distinction if you want to hone your Japanese pronunciations.
OK, that horse has been thoroughly beaten. Let’s look at the other sounds.
The KA Series
KA, KI, KU, KE, KO – “K” sound plus appropriate vowel sound.
GA, GI, GU, GE, GO – The “K” series is able to take ten-ten marks (they look like quotes), which change the “K” sound to a hard “G” sound as in “garden”.
The SA Series
SA, SHI, SU, SE, SO – “S” plus vowel sound. Notice our first hiccup… there is no such sound as SI (sounds like “sea”) in Japanese. Instead we get SHI (sounds like “she”).
ZA, JI, ZU, ZE, ZO – The SA series can also take ten-ten marks, which change the S sound to a Z sound (except for our odd-ball friend SHI, who changes to JI).
The TA Series
TA, CHI, TSU, TE, TO – “T” sound plus vowel. A few more oddballs here. There is no TI or TU in Japanese, only CHI and TSU. You’ll get used to it. To pronounce TSU (like the word tsunami) you hint at the T sound then go for the SU. Say “eat soup” fast and you’ll get the gist.
DA, JI, ZU, DE, DO – The “T” series can play with the ten-ten marks too. There is no DI or DU in Japanese and we have our first WTF moment. JI and ZU, didn’t we have them already? Yes, and they are pronounced the same. Occasionally, some Japanese words use the JI and ZU from the TA series instead of the SA series but it’s nothing we need to worry about since they look the same in romaji and are pronounced the same too. If you’re using a Japanese IME (Input Method Editor), you can force it to make a JI from the TA series (CHI + ten-ten) by typing DI, and you can make a ZU from the TA series (TSU + ten-ten) by typing DU. If you have no idea what I just said, don’t worry about it – it has nothing to do with pronunciation.
The NA Series
NA, NI, NU, NE, NO – finally we get back to normal ground. Nothing odd here, and the NA series does not take ten-ten marks.
The HA Series
HA, HI, FU, HE, HO – OK, what the FUck is going on here? There is no HU sound in Japanese BUT the F in FU is a very windy sound. It’s really a hybrid between an H sound and an F sound. When we say FU (foo) in English we put our top teeth against our lower lip and break them apart as we say it. When we say HU (who) we keep our lips and teeth apart and use a lot of lung-work to get the sound out. To say the Japanese FU, we need to put our teeth and lips together like we want to, but then back them off slightly and say HU (who). The end result, as I said, sounds like a windy FU or a HU with some lip turbulence. Guys tend to make the sound more towards the “foo” sound, and ladies tend to make it more like the softer, windier “who” sound. Your best bet if you have trouble is to just say FU (foo) and be done with it.
BA, BI, BU, BE, BO – The HA series gets ten-ten which turns the H sound to a B sound.
PA, PI, PU, PE, PO – The HA series can also take a maru (circle) mark which changes the H sound to a P sound. This is the only series that can take the maru mark.
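If you like seeing this sort of thing in code, here is a minimal Python sketch of the ten-ten and maru marks. It leans on the fact that Unicode has combining versions of both marks (U+3099 and U+309A) and that NFC normalization glues them onto a base hiragana character. The names TEN_TEN, MARU, and add_mark are made up for this example; treat it as an illustration of the idea, not as how anyone actually types Japanese.

```python
import unicodedata

TEN_TEN = "\u3099"  # combining voiced sound mark (the "quotes")
MARU = "\u309A"     # combining semi-voiced sound mark (the little circle)

def add_mark(kana, mark):
    """Attach a ten-ten or maru mark to a single hiragana character."""
    # NFC normalization composes base + combining mark into one character.
    return unicodedata.normalize("NFC", kana + mark)

# KA -> GA, SHI -> JI, CHI -> JI (the TA-series one), HA -> BA
for base in "かしちは":
    print(base, "->", add_mark(base, TEN_TEN))

# HA -> PA: only the HA series takes the maru mark
print("は", "->", add_mark("は", MARU))
```

Note that し and ち come out as two different characters (じ and ぢ) even though both are romanized JI and pronounced the same.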
The MA Series
MA, MI, MU, ME, MO – Nothing crazy here, just an “M” sound plus a vowel.
The RA Series
RA, RI, RU, RE, RO – The “R” sound is a tricky one to explain and a little tricky to get correct even if you hear it. The “R” sound in Japanese doesn’t really sound like an English “R” in that it isn’t really formed in the throat. It is more sort of a trick of the tongue like creating a “D” or an “L”. In fact, the “R” sound usually comes out sounding more like a “D” or an “L” depending on where it is in the word. In the word “okaeri” (welcome home) the “RI” sounds almost like a soft “DI” sound or maybe even a light “TI” sound. The tongue only makes a quick, light touch on the roof of the mouth. Here is an example: Say the word “potter”. Now change the end of the word to “pottuh” like you have a Boston accent. This is almost identical to the Japanese word “para”. Most of the time, this is what an “R” should sound like. The only exception is when it begins a word, then it sounds a little more like an “L” sound as in the word “ringo” (apple) – which would sound like “lingo”.
The YA Series
YA, YU, YO – This is a short series. There is no YI nor YE in Japanese.
The Miscellaneous Crap Series
WA, WO, N – WA is pronounced as expected. WO is a grammatical particle and is usually pronounced the same as O. In Japanese sentences that have been transliterated into romaji, the WO is usually but not always written simply as O. N is a normal “n” sound and gets the honor of being the only Japanese syllable without a vowel. Sometimes this character sounds more like an “m”, such as in the Japanese word ganbatte, which means “do your best” or “hang in there”. This word sounds like gahm-bah-teh. Strictly speaking it is the “n” sound, but you’ll notice it’s hard to say it that way so in speech it comes out more like an “m”. So if you see an “m” hanging out by itself with no vowel behind it, don’t panic – it’s just an “n” that someone thought looked nicer as a romaji “m” since it’s pronounced more like one in that word. Just remember that if you are using a Japanese IME then you need to type it as an N and not an M.
Combinations – If you look at the hiragana chart, you’ll notice that there are all these funky combos! Essentially, anything in the “I” column (KI, GI, SHI, JI, CHI, NI, HI, BI, PI, MI, RI) can glom itself onto anything from the YA series (YA, YU, YO) to make several new sounds. If you’re interested in hiragana, notice that in the combos the YA, YU, or YO is written half-height. Also note that these are single syllables. For instance, KYA is pronounced “kyah” not “kee-yah”. This makes sense when you see it in romaji, but looking only at the hiragana you might get fooled since it looks like two characters. That’s why they made the YA, YU, and YO half-height I guess.
Double Vowels – Simply hold the vowel sound for twice as long. Sometimes in romaji this is written as a vowel with a line over it. As stated before, every syllable in Japanese gets equal time. Holding double vowels actually makes a big difference. For instance, “shujin” means husband and “shuujin” means prisoner. Imagine asking a woman how her prisoner is doing – she might not like it, even if it might be accurate.
OU – This is one of those rare exceptions, OU is pronounced just like OO. In other words, an O held for two beats.
Double Consonants – Simply indicates a pause (equal to one beat). So the word I used before, “ganbatte”, sounds like “gahm-bah-(PAUSE)-teh”.
When you have trouble with a word, break it into its phonetic parts. Say them slowly one at a time in a uniform amount of time for each syllable, and then repeat them quicker and quicker until you are at full speed. This is THE KEY to pronouncing anything in Japanese. Simply pronounce the vowels the right way and give each syllable its own time and Bob’s your uncle. Also remember that you don’t put an accent on any one syllable like you do in English – every syllable is pronounced with the same force and volume as its brothers and sisters. That’s all there is to it! If you think of Japanese as the language of an emotionless robot, it will help. Now, native speakers do let some inflection/intonation come through from time to time, like raising the intonation at the end of a word to make it a question, just like we do in English, but mainly it’s the language of robots.
Try this one:
Kaeru – Break it down: ka/e/ru. Say each one separately, in its own time. Kah-eh-roo. Now say it faster and faster until it’s one word. Congratulations, you can now tell your friends to go home in Japanese. Kaeru!
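The break-it-down step can be sketched in a few lines of Python. This is a rough sketch, assuming romaji spelled the way this guide spells it (shi, chi, tsu, fu, doubled consonants for pauses, no apostrophes or macrons). The function name split_morae and the pattern are my own invention for illustration, not part of any standard library.

```python
import re

# One "beat" is either the first half of a doubled consonant (a pause),
# an optional consonant (plus optional y) followed by a vowel, or a lone n.
BEAT = re.compile(
    r"(?P<pause>(?P<c>[bcdfghjkmprstwz])(?=(?P=c))|t(?=ch))"
    r"|(?P<cv>(?:ch|sh|ts|[kgsztdnhbpmryfjw])?y?[aiueo])"
    r"|(?P<n>n)"
)

def split_morae(word):
    """Split a romaji word into beats; a doubled consonant becomes '(pause)'."""
    word = word.lower()
    beats, pos = [], 0
    for match in BEAT.finditer(word):
        if match.start() != pos:  # something the pattern does not understand
            raise ValueError(f"cannot parse {word!r} at position {pos}")
        beats.append("(pause)" if match.group("pause") else match.group())
        pos = match.end()
    if pos != len(word):
        raise ValueError(f"cannot parse {word!r} at position {pos}")
    return beats

for word in ["kaeru", "irasshaimase", "benkyou", "ganbatte"]:
    print(word, "=", "/".join(split_morae(word)))
```

Each item in the returned list gets one beat, so the length of the list is a rough guide to how long the word takes to say.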
If you are wondering how fast to speak Japanese, well, that comes with listening experience. Watch a lot of anime if you want to get an ear for it. In my approximation, each Japanese syllable is pronounced in approximately ¾ the time of an English one. It’s definitely slightly quicker. This means that double vowels are about 1 ½ the time of an English syllable. Many English loan words use double vowels because they would sound too sharp and quick without them, but the end result is often drawn out slightly longer than the English equivalent. For instance, the word for burger in Japanese is “baagaa”, which apart from sounding funny also takes longer to say. Since this is a loan word, it would be written in katakana, not hiragana by the way.
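To put rough numbers on that, here is a tiny Python sketch. The ¾ figure is only my estimate from above, the beat list is typed in by hand, and the names MORA_TIME and english_syllable_times are made up for illustration.

```python
# One Japanese beat, measured in "English syllables" (a rough estimate, not a measurement).
MORA_TIME = 0.75

def english_syllable_times(beats):
    """Approximate how long a word takes, in units of one English syllable."""
    return len(beats) * MORA_TIME

baagaa = ["ba", "a", "ga", "a"]        # four beats
print(english_syllable_times(baagaa))  # 3.0, versus 2 syllables for "burger"
```

So under this estimate "baagaa" takes about one and a half times as long to say as "burger".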
OK, let’s try some more words:
Inu (dog) – i/nu pronounced ee-new
Neko (cat) – ne/ko pronounced neh-koe
Kuso (feces, shit) – ku/so pronounced koo-so
Great, let’s try some harder ones:
Irasshaimase (Formal welcome often used in stores by the door greeter girls) – i/ra/(pause)/sha/i/ma/se pronounced ee-rah-(pause)-shah-ee-mah-seh. Note that the sha/i ends up sounding like “shy”. This might help with the pronunciation.
Benkyou (studies) – be/n/kyo/u pronounced ben-kyoe with the “o” sound at the end lasting twice as long. Remember that ou is the same as oo, so you get the double vowel action.
OK, now a few trick ones:
Desu (is) – de/su pronounced dess. Eh? Not deh-sue?
Ohayou gozaimasu (formal good morning) – o/ha/yo/u go/za/i/ma/su pronounced oe-hah-yoe goe-zah-ee-mah-ss. Where’d the -sue go again?
Suki (liking, fondness) – su/ki pronounced ss-kee. Huh? Wait, what? Shouldn’t it be sue-key?
OK, now we need to sit down and have a little talk. You know how I said that all syllables of all words are always pronounced the same? Turns out I was lying. Just like in English, people get lazy. Sometimes in Japanese, vowels are whispered; in other words, they are barely audible or not said at all. This frequently happens to words that end in “SU” like “desu” and “gozaimasu” and sometimes in the middle of words like “suki”. A special thing to note about the word “suki” is that there is a short pause between the “S” sound and the “KI” sound. It’s as if the “U” is still there but not really said aloud. That’s why they call ’em whispered vowels I reckon.
Another whispered vowel that you will often hear (well, I guess I mean often not hear) is the “I” in “SHI”. Take the ever popular Japanese phrase shikata nai for example (this basically means “Oh hell, this is going to suck really, really bad but I’m the only man for the job so I’m just going to have to roll up my sleeves and do what has to be done no matter how shitty it is”, but it is often translated as “it can’t be helped”). Since this breaks down as shi/ka/ta/na/i, one would expect it to sound like shee-kah-tah-nah-ee when in fact it is pronounced sh_ka-tah-na-ee (or if you blur the A and I at the end, sh_ka-tah-nie). Again, there is usually the tiniest of pauses where the whispered vowel should be, but not always.
So, the ten penny question is, how do you know when to whisper a vowel and when not to? And the answer is, I don’t know. According to learnjapanesefree.com (I don’t know anything about the site, it just came up in my Google search), “The vowels ‘I’ and ‘U’ come out as a whisper whenever they fall between the consonant sounds ch, h, k, p, s, sh, t, and ts or whenever a word ends in this consonant-vowel combination.” If you look at the above examples, this explanation seems to fit the bill. Me personally, I just got an ear for it from listening to hours and hours of native speakers. It’s also worth noting that you will sometimes hear people voicing those vowels, but it often sounds over-enunciated and snooty. The butlers and voice-over guys in all animes always voice every vowel – almost laughably so. The average Joe, however, normally whispers those vowels, even when speaking formally.
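Here is what a literal reading of that quoted rule looks like as a Python sketch. It is only a candidate-finder: the pattern is my own translation of the quote into a regular expression, the function name mark_whispered is made up, and it assumes lowercase romaji spelled the way this guide spells it.

```python
import re

# An I or U that follows one of ch, h, k, p, s, sh, t, ts (checking the last
# letter of the consonant is enough) and is followed by another of those
# consonants, by whitespace, or by the end of the text.
WHISPERED = re.compile(r"(?<=[hkpst])[iu](?=[chkpst]|\s|$)")

def mark_whispered(romaji):
    """Replace every vowel the quoted rule would whisper with an underscore."""
    return WHISPERED.sub("_", romaji.lower())

for phrase in ["desu", "ohayou gozaimasu", "shikata nai", "suki"]:
    print(phrase, "->", mark_whispered(phrase))
```

Applied this literally, the rule also flags the final I of “suki”, which is normally said in full, so take the underscores as places where a whisper might happen and let your ear make the final call.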
OK, while I’m messing with you, let’s try a few more:
Honda (as in the car) – ho/n/da pronounced hone-dah. Yep, strictly speaking, we all say it wrong. It should be a long “O” sound.
Toukyou (Tokyo, Capital city of Japan) – to/u/kyo/u pronounced toe-kyoe with each “O” sound lasting twice as long. Remember that OU = OO so these are double “O” sounds. Also remember that double vowels, when written in romaji, often get changed to a single vowel with a line over it. So Toukyou becomes Tookyoo becomes Tōkyō. And then out of laziness or the lack of special characters it becomes Tokyo. In terms of pronunciation, we say it wrong. We say “toe-key-yo”, but the Japanese pronunciation is “toe-kyoe” with the “O” sounds drawn out. On a stupid side note, “Toukyou” literally means East Capital, which makes sense since the old capital was Kyoto, and Tokyo is East of it. Speaking of which, Kyoto is “Kyouto” in Japanese so it’s kyoe-toe with the first “O” sound doubled. It isn’t key-yo-toe.
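The Toukyou to Tōkyō to Tokyo chain can be sketched in Python like so. This is a minimal sketch under this guide's simplification that every OU is a long O (real Japanese has exceptions), it only looks at lowercase vowel pairs, and the names to_macron and drop_macrons are made up for the example.

```python
import re
import unicodedata

# Doubled vowels (and OU, which this guide treats as a doubled O) -> vowel with a line over it.
LONG_VOWELS = {"aa": "ā", "ii": "ī", "uu": "ū", "ee": "ē", "oo": "ō", "ou": "ō"}

def to_macron(romaji):
    """Toukyou -> Tōkyō: collapse each double vowel into one vowel with a macron."""
    return re.sub(r"aa|ii|uu|ee|oo|ou", lambda m: LONG_VOWELS[m.group()], romaji)

def drop_macrons(text):
    """Tōkyō -> Tokyo: the lazy version, with the lines over the vowels thrown away."""
    decomposed = unicodedata.normalize("NFD", text)  # splits ō into o + combining macron
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

for word in ["Toukyou", "Kyouto", "shuujin"]:
    with_lines = to_macron(word)
    print(word, "->", with_lines, "->", drop_macrons(with_lines))
```

Notice that the last step throws information away: once the lines are gone, “shuujin” (prisoner) and “shujin” (husband) are spelled the same.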
Whelp, you might not be a supreme overlord of Japanese pronunciation now, but at least you should get the gist of it. So now when you see a company name like “Yamaichi” you won’t freak out (like my boss did, incidentally) and call it yama-goochi. I mean, there isn’t even a “g” anywhere, right? You and I know that it is ya/ma/i/chi pronounced yah-mah-ee-chee. Simple as pie.