Romanization using the RTGS standard gives a fair approximation of the correct pronunciation of Thai names, but with some limitations: both vowel length and tone are ignored, and one should not read the romanized name as if it were an English name. The Bang Sue meme only works when the romanized name of that Bangkok district is read as if it were written in English. Transcribing the Thai names into the International Phonetic Alphabet (IPA) instead avoids all three problems; Bang Sue (บางซื่อ) then becomes "bāːŋ sɯ̂ː". The cases where two subdivisions seem to share the same name, like the two districts named Bang Sai in Ayutthaya province, also disappear with this transcription.
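To make the information loss concrete, here is a minimal Python sketch that removes exactly the two things RTGS ignores, tone marks and vowel length, from an IPA string. The function name `strip_tone_and_length` is my own invention for illustration, and the sketch assumes the tone is written as a combining diacritic and length as the IPA length mark (U+02D0), as in the Bang Sue example above.

```python
import unicodedata

LENGTH_MARK = "\u02d0"  # the IPA length mark "ː"


def strip_tone_and_length(ipa: str) -> str:
    """Drop tone diacritics and length marks from an IPA transcription.

    Assumes tones are combining marks (Unicode category Mn) and vowel
    length is written with U+02D0; other notations are not handled.
    """
    # Decompose so precomposed letters such as "ā" split into base + accent.
    decomposed = unicodedata.normalize("NFD", ipa)
    kept = [
        ch
        for ch in decomposed
        if unicodedata.category(ch) != "Mn" and ch != LENGTH_MARK
    ]
    return unicodedata.normalize("NFC", "".join(kept))


# The transcription of Bang Sue from the text, spelled with escapes so the
# exact code points are unambiguous: bāːŋ sɯ̂ː
BANG_SUE = "b\u0101\u02d0\u014b s\u026f\u0302\u02d0"

if __name__ == "__main__":
    print(strip_tone_and_length(BANG_SUE))
```

Two names that differ only in tone or vowel length become identical after this step, which is how two differently pronounced districts can end up with the same romanized spelling.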
In 2011 an anonymous user added IPA transcriptions for several districts to the English Wikipedia, though sadly not for all of them. While I can already do the RTGS romanization well myself, I only know the basics of IPA, and the tone rules of written Thai are still too confusing for me, so I can neither check those IPA transcriptions nor add new ones. But I was able to add the IPA to my XML structure and then run my bot to add those values to the items in Wikidata. Though it's just one small data field among many others, and it's not available for all districts, at least it now makes that information from Wikipedia directly machine-readable without the need to parse English text. If anyone can provide me with more IPA transcriptions, I'd be happy to make sure these will find their way into Wikidata and Wikipedia...
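As a rough illustration of that step, the following Python sketch reads IPA values from an XML file and turns them into statements a bot could write. Everything about the XML here is hypothetical: the element name `entity` and the attributes `wikidata` and `ipa` are made up for this example and are not my actual schema, and the item ids are placeholders. I am also assuming that P898 is the Wikidata property for IPA transcription. The actual upload is left out on purpose.

```python
import xml.etree.ElementTree as ET

IPA_PROPERTY = "P898"  # assumed: Wikidata property "IPA transcription"

# Hypothetical XML layout with placeholder item ids, for illustration only.
SAMPLE_XML = """
<entities>
  <entity english="Bang Sue" wikidata="Q_PLACEHOLDER_1" ipa="bāːŋ sɯ̂ː"/>
  <entity english="Bang Sai" wikidata="Q_PLACEHOLDER_2"/>
</entities>
"""


def collect_ipa_statements(xml_text):
    """Return (item id, property id, IPA string) for entities that have IPA."""
    root = ET.fromstring(xml_text)
    statements = []
    for entity in root.iter("entity"):
        qid = entity.get("wikidata")
        ipa = entity.get("ipa")
        # Skip entities without an IPA value or without a linked item.
        if qid and ipa:
            statements.append((qid, IPA_PROPERTY, ipa))
    return statements


if __name__ == "__main__":
    for statement in collect_ipa_statements(SAMPLE_XML):
        print(statement)
```

Entities without an IPA value are simply skipped, which matches the situation described above: the field is only filled where a transcription is available.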