Wednesday, February 1, 2017

Let the chaos begin

Don't worry, I won't talk about world politics - I am referring to a chaos starting on Wikipedia about the geographic items in Thailand. Just recently, the Cebuano-language Wikipedia, one of the almost completely bot-filled Wikipedias, got flooded with new articles on geographic locations in Thailand, translated from the items on the geonames website, which in turn imported a lot of its items from the GEOnet Names Server (GNS). The big problem is that both databases contain a lot of bogus entries, especially when it comes to the lower levels of the administrative subdivisions. Though I bet no human will ever read these articles - why should someone from the Philippines speaking only this local language ever look for a Thai village - the real trouble with these articles is that they now need to be linked to the real world with Wikidata.

To give just a few examples of the mess
  • The principal town of Phunphin district in Surat Thani province is named Tha Kham, but on geonames it was wrongly named Phunphin. And to make it worse, different spellings including the term "Amphoe" were listed as alternative names, so now the bot-created page mixes up things about the district and the town.
  • Geonames has an entry named "Ban Talat Yai" as a populated place in Phuket. However, there is no "Ban" with that name there, only a subdistrict. So the bot-created article is bogus, and linking it to the subdistrict in Wikidata would be wrong. Nearby Kathu is even worse - there's a municipality with that name (which would fit the category populated place, I suppose) as well as a subdistrict, but the geonames entry had "Tambon Kathu" as an alternative name, mixing both items.
  • Geonames has two entries for the Ta Phraya district, Sa Kaeo province (1949382 and 1605741), and sadly I am not able to delete the second one there due to insufficient user rights. And of course the bot has now created two articles about the same item - one and two.
It is just lucky I already added all the geonames IDs of the districts and provinces to Wikidata, so at least those bot-created articles can be matched automatically.
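As a rough illustration of that automatic matching, here is a minimal Python sketch. It assumes a lookup table from geonames ID to Wikidata item has already been built; the item identifier and the article titles below are placeholders I made up for the example, only the two geonames IDs are the real ones from the Ta Phraya case.

```python
from collections import defaultdict

# Hypothetical lookup table: geonames ID -> Wikidata item.
# "Q_TA_PHRAYA" is a placeholder, not a real Wikidata identifier.
GEONAMES_TO_ITEM = {
    1949382: "Q_TA_PHRAYA",
    1605741: "Q_TA_PHRAYA",  # second geonames entry for the same district
}


def match_articles(articles):
    """Group bot-created articles by the item their geonames ID maps to."""
    matched = defaultdict(list)
    unmatched = []
    for title, geonames_id in articles:
        item = GEONAMES_TO_ITEM.get(geonames_id)
        if item is None:
            unmatched.append(title)  # needs manual checking
        else:
            matched[item].append(title)
    return dict(matched), unmatched


# Placeholder article titles; the ID 0 stands for an entry without a match.
articles = [("article one", 1949382), ("article two", 1605741), ("article three", 0)]
matched, unmatched = match_articles(articles)

# Items that received more than one article point to duplicate geonames entries.
duplicates = {item: titles for item, titles in matched.items() if len(titles) > 1}
```

Anything that ends up in `unmatched` is exactly the part that still needs manual work, and `duplicates` flags cases like the two Ta Phraya articles.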

And to make it worse, it was not only the populated places and subdivisions which were imported, but also all the hills, caves, lakes, rivers, etc. Cleaning up the mess regarding the towns alone would already keep me busy for weeks at least... The only small positive side of having articles on each entity in one Wikipedia is that it will make it impossible to wrongly merge Wikidata items which are related but not the same.

Tuesday, January 24, 2017

The provincial administration of Siam from 1892 to 1915

2005 Thai edition
Tej Bunnag's book "The provincial administration of Siam from 1892 to 1915" was the first academic source I checked when I started writing about the topic in Wikipedia articles. As it has long been out of print (published in 1977), used copies are only found at ludicrously high prices. Luckily, I scanned all pages back then, so I at least have my private e-book version of it. In 2005, a second edition was published, but only in Thai - and though I have that one, my Thai is still way too bad to read anything.

But now by coincidence I stumbled on the original 1969 doctoral thesis, which was the basis for the book. The University of Oxford, where Tej Bunnag studied and graduated, has made the original version an open access item, so it can now be downloaded and read easily by everyone. I am not sure whether I'll find the time to actually check what changes were made between the thesis and the book version. In fact, any changes made for the Thai edition would be more interesting.

Friday, January 20, 2017

Statistical yearbook 2016

The latest edition of the annual statistical yearbook was published in October 2016, though I only just noticed that it is available online. Sadly, the full PDF file cannot be downloaded, only a version for online reading (which even wants to use the obsolete Flash) is available - even worse, most of the pages are garbled as the conversion tool handled Thai characters very badly. At least the table which I usually import from the yearbooks can still be read.

But it's not just that the online version is of bad quality, even worse, the data itself has problems. According to Table 1.23 (Area and Administration Zone by Region and Province), the numbers of the administrative subdivisions by type as of May 27, 2016 are (in brackets the corresponding numbers from the 2015 yearbook)
  • 928 districts (Amphoe and Khet) [2015: 928]
  • 7425 subdistricts (Tambon and Khwaeng) [2015: 7425]
  • 57081 administrative villages (Muban) [2015: 55387]
  • 2452 municipalities (Thesaban, including Pattaya) [2015: 2442]
  • 5433 subdistrict administrative organizations [2015: 5334]
Comparing with the 2015 numbers shows no change in the central administrative units except the Muban, but oddly changes in the number of the local governments even though there have been municipal upgrades in the last year. While looking through the provinces, it seems that for several of them (but by far not all) the original number of TAO in 2002 was listed instead of the current number of TAO. The additional ten municipalities are due to a wrong number in Kamphaeng Phet, where the table lists 35 instead of 25. These wrong data however make the whole table totally worthless. Additionally, the number of administrative villages seems to have risen in a strange way by almost 1700, but only 55 Muban were created in 2016 and just 11 in 2015. Thus I can only suspect that here again old numbers were mixed with the current values, counting Muban which were previously excluded as being part of municipalities and thus losing their function. However, 810 of the new Muban are in fact a mistake in the 2015 edition, which lists 1097 Muban for Surin instead of 1907 as in 2014 and 2016. I posted a more detailed discussion of the varying Muban numbers earlier.
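The arithmetic behind this comparison is easy to redo. The following small Python sketch simply recomputes the differences from the figures quoted above; the variable names are my own, and it assumes the quoted totals are transcribed correctly from the two yearbooks.

```python
# Totals as quoted from Table 1.23 of the 2015 and 2016 yearbooks.
yearbook_2015 = {"districts": 928, "subdistricts": 7425, "muban": 55387,
                 "municipalities": 2442, "tao": 5334}
yearbook_2016 = {"districts": 928, "subdistricts": 7425, "muban": 57081,
                 "municipalities": 2452, "tao": 5433}

# Year-on-year change for every type of subdivision.
changes = {kind: yearbook_2016[kind] - yearbook_2015[kind] for kind in yearbook_2016}

# Known single-province errors mentioned in the text.
kamphaeng_phet_excess = 35 - 25   # municipalities listed vs. correct number
surin_shortfall = 1907 - 1097     # Muban in 2014/2016 vs. the 2015 misprint

# Muban increase once the Surin misprint in 2015 is corrected,
# and what is left after subtracting the 55 Muban created in 2016.
muban_change_corrected = changes["muban"] - surin_shortfall
muban_unexplained = muban_change_corrected - 55

for kind, delta in changes.items():
    print(f"{kind}: {delta:+d}")
print("Muban increase after Surin correction:", muban_change_corrected)
print("Muban increase still unexplained:", muban_unexplained)
```

The municipality difference matches the Kamphaeng Phet error exactly, while a large part of the Muban increase remains unexplained even after correcting the Surin value.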

Even though at least this table in the yearbook has become useless, I nevertheless translated it into my XML structure, adding new schema entries to indicate and correct the bogus values. But what really worries me is that if an amateur like me can already spot such big mistakes in this publication, what is the quality of the other tables then?

Monday, January 16, 2017

New issues of @amphoe

The Facebook page of the @Amphoe magazine has not been updated since September, and at about the same time the @amphoe page at the Department of Provincial Administration disappeared. Both made me assume that this magazine was quietly shut down after just nine issues. Thus it was kind of a surprise when I tried to look at the DOPA website again and it suddenly not just worked again, but already showed two new issues.

Issue 10 (5/2559) has an interview with the chief district officer of Phop Phra in Tak, Prasong La-on (ประสงค์ หล้าอ่อน), talking about the special problems of the rather remote district. The 200,000-baht-per-village project is the topic of another English section, as well as a short list of the travel highlights of Mukdahan province - hence the title photo of this issue showing the special rock formations in Phu Pha Thoep National Park.

Issue 11 (6/2559) is a special issue with the late King as its only topic. As far as I can see, the content hardly concerns the Amphoe administration at all, and there is no English content this time. Given the immense popularity of the King as well as the publisher being a government department, it's no wonder they chose to publish one special issue.

And to do my usual nitpicking - the URL advertised in the magazines returns a 404 error, though a different URL works. Also, the process to create the PDF files has apparently been changed: the files now only contain graphics and no text elements anymore, which has increased the file size by a factor of 8 and makes it completely impossible to copy any content to Google Translate. And sadly, issue 8 (3/2559) is still missing on the download page, and the later issues are wrongly numbered.

Friday, January 13, 2017

Census 1970 codebook

When I was first looking for the old census data, one of the online resources I found was the Open Data library of the World Bank, which includes some documents from the 1970 census. These do not include the actual census data - which I later got elsewhere - but the codebook is an interesting resource as well, as it contains a list of the provinces with the number of districts, subdistricts and administrative villages, and even more useful a list of all the subdistricts with their village numbers - the total numbers for 1970 were 580 districts (including the minor districts), 5126 subdistricts and 45504 villages.

I am slowly working through this document to extract all these village numbers and compile them in my XML structure, which has already turned out to be a good cross-check of the data I had compiled before, as I found a few cases where I had missed or wrongly added the creation of a subdistrict. The document however has several drawbacks. Some pages are badly printed and even have hand-written corrections, making the numbers sometimes difficult or impossible to read. The Thai names seem to have mistakes sometimes as well, though some might have been spelling changes which were not announced in the Royal Gazette. Also, some pages are missing, so it cannot be turned into a complete 1970 subdistrict list.

Yet, so far the biggest problem showed up with the entry for Mueang Pathum Thani district. All 14 present-day subdistricts can be found in the list except Ban Chang (ตำบลบ้านฉาง) - and instead the list shows a Ban Nao subdistrict (ตำบลบ้านนาว) with seven Muban. As there are zero Google hits for such a subdistrict name, and the name of Ban Chang seems never to have changed, it might have been a mistake in the Thai name, changing two characters. But Ban Chang has just four villages while Ban Nao had seven, and there was no change in the boundaries of Ban Chang either which could explain how the village number decreased. I can only suspect this is a real mistake in the codebook.
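That the two Thai names differ in exactly two characters can be checked with a few lines of Python. This is just a minimal sketch comparing the two strings code point by code point; the function name is my own, and it assumes both names have the same length, which holds for these two.

```python
def differing_positions(first, second):
    """Return (index, char_in_first, char_in_second) for every mismatch."""
    if len(first) != len(second):
        raise ValueError("names must have the same number of characters")
    return [(i, a, b) for i, (a, b) in enumerate(zip(first, second)) if a != b]


# Ban Chang (present-day name) versus Ban Nao (name in the codebook).
diffs = differing_positions("บ้านฉาง", "บ้านนาว")

for index, current, codebook in diffs:
    print(f"position {index}: {current} -> {codebook}")
```

A simple position-wise comparison like this only works for substitutions. For names where characters were dropped or inserted, a proper edit distance would be needed.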

Issues of the Local Directory (ทำเนียบท้องที่) from about that time would help to clear up this issue as well as help to fill the pages missing in that file, but none of these are available online, and those few libraries I could visit so far in Thailand didn't have any such books.

Wednesday, January 11, 2017

New provincial license plates

On December 27th four announcements regarding the colorful provincial license plates were published in the Royal Gazette. Samut Songkhram is about the last province which now has such a graphic defined (only Yala and Mae Hong Son have none yet), the three other announcements only added a new design for the respective province.
  • Phetchabun: a completely new design added, showing the highlights of Khao Kho district - the hill with Wat Phra That Pha Son Kaeo (วัดพระธาตุผาซ่อนแก้ว) with a misty background, and inserted at the right the Khao Kho Memorial (อนุสรณ์สถานผู้เสียสละเขาค้อ) commemorating the victims of the communist insurgency 1965-1982. The 2013 design still looked very different. [Gazette]
  • Surin: Comparing with the plate announced in 2012, only the elephants are present in both designs. In the upper left corner are the flowers of the provincial symbol flower Fagraea fragrans. The upper right shows the Tha Sawang silk. However, I don't know which pier and body of water is depicted. [Gazette]
  • Roi Et: The new plate has the same elements as the plate announced in 2015, the only difference is the purple background instead of a green background in 2015. [Gazette]
  • Samut Songkhram: The license plate shows the same elements as the provincial seal - a drum (Klong) on the Mae Klong river with coconut trees on both sides. [Gazette]
My album of provincial license plates is still incomplete and badly sorted... I have also prepared a spreadsheet listing the years by which plates were announced for each province.

Tuesday, January 10, 2017

New non-hunting areas

The last announcements of 2016 I had processed from the Royal Gazette were about protected areas, and the first of 2017 are again of the same category - another six new non-hunting areas (เขตห้ามล่าสัตว์ป่า) were created by the publication on January 5.
  • Khao Phanom Thong (เขาพนมทอง), Phitsanulok, covering 14125 rai of the Lum Nam Wang Thong Fang Sai national forest. [Gazette]
  • Mae Lao-Mae Kok (แม่ลาว-แม่กก), Chiang Rai, covering 8025 rai of the Mae Lao Fang Sai and Mae Kok Fang Khwa national forest. [Gazette]
  • Nong Leng Sai (หนองเล็งทราย), Phayao, covering 8025 rai around the same-named lake. [Gazette]
  • Huai Sak-Mae Kok (ห้วยสัก-แม่กก), Chiang Rai, covering 4003 rai of the Huai Sak and Mae Kok Fang Khwa national forest. [Gazette]
  • Sob Kok (สบกก), Chiang Rai, covering 5550 rai of the Sob Kok Fang Khwa national forest. [Gazette]
  • Mae Pun Noi-Mae Pun Luang-Huai Pong Men (แม่ปูนน้อย-แม่ปูนหลวง-ห้วยโป่งเหม็น), Chiang Rai, covering 5550 rai of the Mae Pun Noi, Mae Pun Luang and Huai Pong Men national forest. [Gazette]
My main problem - I have no idea what the official names of these non-hunting areas are. The Royal Gazette announcements don't state a name for the protected area, they only name the national forests or the area which is affected. I haven't been able to find any updated list of non-hunting areas on the website of the National Park, Wildlife and Plant Conservation Department either. The latest I have is a spreadsheet which lists all the protected areas, which however dates from 2013. While it included a few non-hunting areas pending their official creation, none of these six was among them. Therefore, I haven't yet added items in Wikidata for them, as I don't want to make the guessed names listed above any more public.