Wednesday, January 22, 2014

TambonBot on Wikidata

It took quite some time, but a few days ago my automatic Wikidata editing bot was approved, and it has already made 12,000 edits on the 2428 administrative entities which have a corresponding Wikidata item. So far I have done only the trivial things, which don't necessarily include the addition of sub-statements or sources to statements. The activities done so far were:
  • Normalize the item names not to include the type, e.g. Bueng Kan Province became Bueng Kan, both for English and for German.
  • For Thai however, the name always includes the type, thus Bueng Kan Province is labeled จังหวัดบึงกาฬ.
  • Give a description with the full hierarchy to avoid any potential ambiguities, e.g. "district in Bueng Kan province, Thailand". For German I haven't implemented it yet as the grammar makes it a little more complicated; for English and Thai it was simple string concatenation.
  • Every item now has the link to the country Thailand
  • Every item now is linked to the one in which it is located, except for the TAO and Thesaban - I am not sure if I should link the province, the district or every (partially) covered Tambon.
  • The type of the entity is also linked, for some reason twice, once as "instance of" and once as "type of administrative unit".
  • Those entities which have a corresponding boundary item on OpenStreetMap are now linked as well.
As both the parent unit and the type could have changed in the past, adding the historical values with the corresponding start and end dates is still an open programming task.
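The label and description steps from the list above can be sketched roughly as follows. This is only an illustration in Python, not the bot's actual code; the function names, the list of type suffixes and the shape of the `parents` argument are all assumptions, chosen to reproduce the examples given in the list.

```python
# Hypothetical helpers; names and the suffix list are assumptions for illustration.
TYPE_SUFFIXES_EN = (" Province", " District", " Subdistrict")


def normalize_label(name):
    """Strip a trailing type word, e.g. 'Bueng Kan Province' -> 'Bueng Kan'."""
    for suffix in TYPE_SUFFIXES_EN:
        if name.endswith(suffix):
            return name[: -len(suffix)]
    return name


def describe(entity_type, parents):
    """Build an English description naming the full hierarchy.

    parents is a list of (name, type) pairs, closest parent first.
    """
    parts = ["{} {}".format(name, ptype) for name, ptype in parents]
    parts.append("Thailand")
    return "{} in {}".format(entity_type, ", ".join(parts))


print(normalize_label("Bueng Kan Province"))
print(describe("district", [("Bueng Kan", "province")]))
```

A German variant would need more than concatenation, since the wording depends on the grammatical case and gender of the unit type.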

A property to hold the geocode of the entity has been created by now as well - when I saw that it received the property id 1067 I realized I should have waited a bit longer to catch the number 1099, as this code is related to the TIS 1099 standard. The code to add these identifiers is nearly finished; it still needs some polishing to add the references to the corresponding source of the code - TIS 1099:2535, TIS 1099:2548 or the full code list from DOPA. While waiting for this property to be created, I finally wrote an article on Wikipedia about this Thai standard - copyediting or translation is welcome...

Also almost completed is the code to fill the list of subdivisions, in this case clearly leaving out the TAO and municipalities, as these are not really subordinate to any of the central administrative units. There are several other edits which are most easily done by a bot; I am collecting my ideas on the bot userpage. The item on Bueng Kan is kind of my test item; it now has the largest number of statements of all the Thai subdivisions, and already takes quite long to load in a web browser.

I am still learning more about what Wikidata can do - discovering more properties which can be applied to the administrative units I work with, as well as discovering what it will be able to do in the future once development progresses - e.g. the data value for the population number is not yet available. But I also had my first negative experience there: As Phuket is the province with the smallest number of subdivisions, I added all those items which have no Wikipedia article yet as more-or-less blank items to be filled by my bot later (except the PAO, which is a special case). As the idea behind Wikidata is to be more than just a repository of data for Wikipedia articles, these items were perfectly fine to add. But since there was neither a Wikipedia link nor a link from any other item, one admin thought them to be unused orphans and deleted them, without notifying me or asking whether they were correct or not. So I had to do the same work twice; the only positive thing is that now all of these items contain the list of neighboring items, to make sure none looks like an unused orphan anymore.

Friday, December 27, 2013

Municipal area transfer in Rayong

On Monday, the change of the boundaries of two adjoining municipalities in Rayong province was announced in the Royal Gazette. Choeng Noen (เทศบาลตำบลเชิงเนิน), located north and east of Rayong city [Gazette], and Noen Phra (เทศบาลตำบลเนินพระ), located west of Rayong city [Gazette], are connected by a narrow area running parallel to and north of Sukhumvit road. I wasn't able to find a map of the original municipality boundaries, but it is possible to derive them from the Tambon. Both municipalities were originally TAOs covering the same-named Tambon, except those areas which are covered by Rayong city, and in 1998 the boundaries of the Tambon in Mueang Rayong district were officially defined [Gazette]. Though the boundaries are only defined in written text and not as a map, by comparing the coordinates it seems that in fact the two Tambon have no adjoining area, being separated by the city of Rayong. Thus, with the announcements now, Choeng Noen gained some area west of Rayong city, with a very narrow connection to the newly gained area running north of Rayong city.

The interesting question is whether this change for the municipal boundaries will also cause an adjustment of the subdistrict boundaries, to keep the two municipalities within their original Tambon.

I have tried to turn the main part of the two maps within the announcements into a single Google map; the pink boundary is the area which I believe was added to Choeng Noen.


Friday, December 20, 2013

Busy times

To my shame I have neglected the blog in the last month, mostly because I was very busy working through issues which could not directly be made into a blog post.

One big task has reached another milestone - I finally finished adding the election dates in 2004, which I could deduce from a PDF by the Election Commission listing the term ends of 2008. I now have a little over 19,000 council elections in the XML, and a somewhat smaller number of mayor elections, since I only added them if I have the name of the mayor or the term differs from the council term. But sadly the elections aren't complete yet, as this year had another 3,500 local elections. Though there is no concise collective result, at least a good part could be found on the various pages of the province branches of the Election Commission. I am still working on adding those, so more on that in a separate blog posting.

The other task started with my "discovery" of Wikidata. I have now worked through all the Wikipedia articles on the administrative subdivisions and added the ID at Wikidata to all the corresponding entries in my XML. This included several merges of Wikidata entries, where two language editions of Wikipedia had an article but those were not joined yet. Also, some had to be split, due to the good old confusion between district or subdistrict and the local governments (cities, towns). Now there are 2362 Wikidata items, almost all of which correspond to at least one Wikipedia article. The next step is now the programming of a bot to fill these items with statements - I have started and succeeded with a few test edits, but the library to access Wikidata with C# needs a lot of rework before it becomes really usable - so more work to do. At least I get to do some programming, not just XML editing... A detailed analysis of the Wikipedia articles I found during this inventory will be posted later, and of course I will keep you updated once the TambonBot starts operating.
I wish all my readers a merry Christmas and a happy new year, and hope I can return to more regular posting after the holidays.

Wednesday, November 13, 2013

New license plate graphics

A week ago, five new provincial license plate graphics were announced in the Royal Gazette.
  • Trat [Gazette]
    The graphic refers to the Franco-Thai war (1940-41), more specifically the battle of Ko Chang on January 17, 1941. Three Thai ships were sunk during the battle, thus the ships to the right in the graphic probably depict the HTMS Chonburi, HTMS Songkhla and HTMS Thonburi.
    This seems to be the first design for a license plate for Trat, as no previous designs were announced.
  • Phetchabun [Gazette]
    To the left of the design are the fruits of the tamarind tree, an important local product of Phetchabun province. The tamarind is also the provincial symbol tree for Phetchabun. The star above and next to the tamarind fruits refers to the name of the province - Phetcha (เพชร) means diamond. The temple on the hill to the right is Wat Phra Son Kaeo (วัดพระธาตุผาแก้ว), one of the most spectacular Buddhist temples in the province.
    While the tamarind was present in the 2005 and 2009 designs as well, the temple was newly added, and the background was made much more colorful than before.
  • Amnat Charoen [Gazette]
    To the right are rice ears, as (sticky) rice is the major agricultural product of the province. The flowers to the right are from the Butea monosperma tree, the symbol flower of the province. The background shows Khit cloth, a traditional woven fabric still produced in the traditional way in some areas of the Northeast, including Amnat Charoen.
    This seems to be the first design for a license plate for Amnat Charoen, as no previous designs were announced.
  • Chiang Mai [Gazette]
    As the 10-year loan of the two pandas to Chiang Mai Zoo has ended, the new license plate graphic - unlike the previous one from 2011 - no longer shows any panda. Again, Butea monosperma flowers are depicted, as this is also the provincial flower for Chiang Mai (and also Udon Thani and Lamphun). On the hill in the background the temple Wat Prathat Doi Saket is shown, one of the most famous temples in Thailand. To the left is a traditional Lanna house, with two Thais celebrating Songkran by splashing water. At the river another two persons celebrate Loi Krathong by floating the little candle boats. I am just not sure which temple is shown in the middle of the graphic.
  • Maha Sarakham [Gazette]
    Again, Khit cloth is shown, as it is also a local product for Maha Sarakham. The flowers on top of the cloth are from the provincial flower, the White Frangipani (Plumeria alba). The river is the Chi river. The temple silhouette in the background probably is the pagoda of Wat Phra That Na Dun, located in Na Dun district.
    The 2006 design showed only the cloth and the flower.
As usual, I have added these graphics to the web album, where one can find all the recent graphics and many of the older ones as well - if time allows I will slowly complete it with all the old ones. If a specific one is missing, please let me know. After these announcements, there are still five provinces which seem to have no such graphic yet; I don't know whether the announcements were not published in the Royal Gazette, or whether there really are no such designs yet for Nong Bua Lamphu, Mae Hong Son, Samut Songkhram, Ranong and Yala.

Monday, November 4, 2013

Local governments renamed

Today, four local government name changes were announced in the Royal Gazette.
  • Nong Saeng subdistrict municipality (เทศบาลตำบลหนองแสง), Wapi Pathum district, Maha Sarakham province renamed to Wapi Pathum (เทศบาลตำบลวาปีปทุม) to match with the name of the district. It also gives the TAO Nong Saeng, which shares the area of the subdistrict Nong Saeng with the municipality, the ability to be upgraded to a municipality without changing name. [Gazette]
  • TAO Huai Thap Mon (องค์การบริหารส่วนตำบลห้วยทับมอญ), Khao Chamao district, Rayong province, renamed to Khao Chamao (องค์การบริหารส่วนตำบลเขาชะเมา), as it is the local government unit which contains the district office.
  • TAO Bang Rachan (องค์การบริหารส่วนตำบลบางระจัน), Khai Bang Rachan district, Singburi province, renamed to Khai Bang Rachan (องค์การบริหารส่วนตำบลค่ายบางระจัน), as it is the local government unit which contains the district office.
  • TAO Mae Ai (องค์การบริหารส่วนตำบลแม่อาย), Mae Ai district, Chiang Mai province, renamed to Doi Lang (องค์การบริหารส่วนตำบลดอยลาง), to avoid confusion with the subdistrict municipality Mae Ai (เทศบาลตำบลแม่อาย), especially once the TAO gets upgraded to a municipality.
All name changes took effect on October 18, the announcements were all signed on September 19 by the deputy minister of interior Pracha Prasobdee (ประชา ประสพดี). And all were discussed by the Board to consider draft laws in their meeting on August 21.

Thursday, October 17, 2013

Streetview coverage of the Northeast started

While the systematic Google Streetview coverage is still limited to the provinces around Bangkok and Chiang Mai, and Phuket province as the only province in the South, already with the first batch of Streetview imagery two main roads were included - Phetkasem from Bangkok to Phuket and Phahon Yothin from Bangkok to the North. A similar single roundtrip by a Streetview car now started the coverage in the Northeast.

Roundabout in Buriram in front of the Province Hall
The Isan roundtrip went through the provincial capitals of Sisaket, Surin, Buriram, Nakhon Ratchasima, Chaiyaphum, Khon Kaen, Maha Sarakham, Roi Et, Yasothon, Amnat Charoen and Ubon Ratchathani, and another two roads along the Thai-Cambodian border, including up to the parking lot of Preah Vihear. Also the main road along the border in Sa Kaeo and Trat province was added, including the border crossings in Aranyaprathet and Hat Lek.

So quite a lot of new opportunities for armchair traveling in areas less frequented by tourists, however for my task to collect the locations of the administrative offices the full coverage is much more useful, as not all the offices are located directly at the main road.

Monday, October 7, 2013

Wikidata

Many of the Wikipedia articles contain language-independent factlets, which have to be kept up-to-date in each language edition manually. If these factlets are stored at one central place - quite similar to the graphics used in Wikipedia articles which usually come from one separate Commons-Wiki - then these facts will be easily updated without speaking any of the languages of the Wikipedia articles in which they are used. Additionally, factlets organized in a more strict way than the human-readable text of the articles make other automatic processing possible. The Wikidata project does exactly this, more and more of the factlets in the Wikipedia articles can be imported automatically from the Wikidata Wiki.

The administrative subdivisions are one of the prime examples where this concept can be used: facts like the area, the population data, the country or province to which they belong, or the list of subdivisions already make up a good deal of a decent Wikipedia stub article. Especially the data displayed in the so-called infoboxes can mostly already be included from Wikidata. One major thing which is not yet possible in Wikidata is lists, and not all of the data I am collecting has corresponding data categories yet (called properties in Wikidata). It is also quite tedious work to add more than one factlet at a time manually, so to really get the Thai subdivisions well covered there I would have to learn how to use a bot for automatic editing.

One thing which is already imported completely on Wikidata is the language links to the Wikipedia articles in various languages. In fact, every article which is available in more than one Wikipedia now has a corresponding page in Wikidata. Thus I have now added one more data item in my XML files, which can link every subdivision to the corresponding Wikidata page, and I am now slowly adding all the provinces and districts. And since OpenStreetMap and Wikimapia are other similar wiki websites which also have specific IDs for geographical entities (though one has to be careful to separate the office and the full entity), these are defined in the XSD as well. As an example, the province of Surat Thani now has these two links within the XML, which easily translate to URLs on Wikidata and OpenStreetMap.
<entity type="Changwat" name="สุราษฎร์ธานี" english="Surat Thani" geocode="84">
  <wiki wikidata="Q240463" openstreetmap="1908825" /> 
</entity>
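
How these IDs translate to URLs could look roughly like the following Python sketch. It is only an illustration: the function name `wiki_urls` is made up, the snippet is parsed without any XML namespace, and it assumes that the `openstreetmap` attribute holds the ID of an OpenStreetMap relation (as boundaries usually are), which the XML itself does not state.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<entity type="Changwat" name="สุราษฎร์ธานี" english="Surat Thani" geocode="84">
  <wiki wikidata="Q240463" openstreetmap="1908825" />
</entity>"""


def wiki_urls(entity_xml):
    """Return a dict of URLs built from the <wiki> child of one <entity>."""
    root = ET.fromstring(entity_xml)
    wiki = root.find("wiki")
    urls = {}
    if wiki is None:
        return urls
    wikidata_id = wiki.get("wikidata")
    if wikidata_id:
        urls["wikidata"] = "https://www.wikidata.org/wiki/" + wikidata_id
    osm_id = wiki.get("openstreetmap")
    if osm_id:
        # Assumption: the ID refers to an OSM relation, not a node or way.
        urls["openstreetmap"] = "https://www.openstreetmap.org/relation/" + osm_id
    return urls


for site, url in wiki_urls(SAMPLE).items():
    print(site, url)
```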