My coffee mug

Hello world, and welcome to my corner of the web! This is where I write words about what I'm working on, and post photographs of things I've seen.

I'm a Software Engineer at the Wikimedia Foundation, and so of course my personal website is a wiki (running on MediaWiki). In my spare time I volunteer with WikiClubWest to work on Wikimedia projects, mostly around my family's genealogy and local Western Australian history (especially to do with Fremantle). I try to keep up with issues on all the things I maintain.

I also try to find time to work in my workshop on various woodworking projects. Recently, that's been focused on getting my new workshop's doors built and installed.

Travel features in my life, not because I really hugely want to go elsewhere but because I just do — and also because then I can do some more interesting mapping on OpenStreetMap.

I'm currently reading People of the Book (Geraldine Brooks), and Room at the Top (John Braine, 1957), and The Fortnight in September (R. C. Sherriff, 1931), and The Heart of the Matter (Graham Greene, 1948).

To contact me, you can email me or find me on the Freenode IRC network (as 'samwilson'). If you want to leave a comment on this site (by creating an account), you need to know the secret code Tuart (it's not very secret, but seems to be confusing enough for most spammers).

In other parts of the web, I'm active or at least have an account on: Flickr (from where I hope to migrate all my photos one day, into this wiki); LibraryThing; YouTube.


This is a good video about defining structured data in MediaWiki just using templates with Cargo: https://www.youtube.com/watch?v=f6wKXPVuBTU


Totally enjoying the videos from EMWCon Spring 2018, e.g. Toward a MediaWiki Roadmap.

Wikisource books for binding


I have been experimenting with turning Wikisource works into LaTeX-formatted bindable PDFs. My initial idea was to produce quarto or octavo layout sheets (i.e. 8 or 16 book pages to a sheet of paper that's printed on both sides and has the pages laid out in such a way that when the sheet is folded the pages are in the correct order) but now I'm thinking of just using a print-on-demand service (hopefully PediaPress, because they seem pretty brilliant).

Basically, my tool downloads all of a work's pages and subpages (in the main namespace only; it doesn't care about the method of construction of the work) and saves the HTML for these, in order, to a html/ directory. Then (here's the crux of the thing) it uses Pandoc to create a set of matching TeX files in an adjacent latex/ directory.
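As a rough sketch of that conversion step (this is not the tool's actual code; the function names and the .html/.tex file extensions are assumptions made for illustration), the html/ to latex/ mirroring could look something like this in Python, shelling out to Pandoc:

```python
import subprocess
from pathlib import Path


def pandoc_command(html_file: Path, html_dir: Path, latex_dir: Path) -> list[str]:
    """Build the Pandoc command that converts one saved HTML page to TeX."""
    # Mirror the file's position under html/ into latex/, swapping the extension.
    tex_file = latex_dir / html_file.relative_to(html_dir).with_suffix(".tex")
    return ["pandoc", "--from=html", "--to=latex", f"--output={tex_file}", str(html_file)]


def convert_all(html_dir: Path, latex_dir: Path) -> None:
    """Convert every HTML file under html_dir, visiting them in sorted order."""
    for html_file in sorted(html_dir.rglob("*.html")):
        command = pandoc_command(html_file, html_dir, latex_dir)
        # Make sure the matching output directory exists before Pandoc writes to it.
        (latex_dir / html_file.relative_to(html_dir)).parent.mkdir(parents=True, exist_ok=True)
        # Requires the pandoc executable to be installed and on the PATH.
        subprocess.run(command, check=True)
```

Calling convert_all(Path("html"), Path("latex")) would then produce one TeX file per saved page. Keeping the command-building separate from the running makes it easy to check what would be executed without having Pandoc to hand.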

So far, so obvious. But the trouble with this approach of wanting to create a separate source format for a work is that there are changes that one wants to make to the work (either formatting or structural) that can't be made upstream on Wikisource — but we also want to be able to bring down updates at any time from Wikisource. That is to say, this is creating a fork of the work in a different format, but it's a fork that needs to be able to be kept up to date.

My current solution to this is to save the HTML and LaTeX files in a Git repository (one per work, e.g.) and have two branches: one containing the raw un-edited HTML and LaTeX, on which the download operation can be re-run at any time; and the other being based off this, being a place to make any edits required, and which can have the first merged into it whenever that's updated. This will sometimes result in merge conflicts, but for the most part (because the upstream changes are generally small typo fixes and the like) will happen without error.
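To make the two-branch idea a bit more concrete, here is a minimal sketch of the update step. It assumes the raw branch is already checked out and the download has just been re-run; the branch names 'upstream' and 'edited' and the commit message are made up for the example, and none of this is the tool's real code.

```python
import subprocess


def update_commands(raw_branch: str = "upstream", edit_branch: str = "edited") -> list[list[str]]:
    """Git commands to run once the download has been re-run on raw_branch."""
    return [
        # Record the freshly downloaded HTML and regenerated LaTeX on the raw branch.
        ["git", "add", "html", "latex"],
        ["git", "commit", "--message", "Update from Wikisource"],
        # Switch to the branch holding local edits and bring the update across.
        ["git", "checkout", edit_branch],
        ["git", "merge", raw_branch],
    ]


def run_update(repo_dir: str) -> None:
    """Run the commands in order, stopping at the first one that fails."""
    for command in update_commands():
        # check=True raises on failure, which includes 'nothing to commit'
        # and a merge that stops with conflicts.
        subprocess.run(command, cwd=repo_dir, check=True)
```

When the merge does stop with conflicts, they have to be resolved by hand on the edited branch, which is the "sometimes" case mentioned above.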

Now I just want to automate all this a little bit more, so a new project can be created (with GitHub repo and all) with a single (albeit slow!) command.

The output ends up something like Commons:File:The Nether World by George Gissing.pdf.

2018 Firefox survey


I just filled in the 2nd Annual Firefox Census, which is just a survey about weird stuff and Firefox and Mozilla and stuff (literally: like "To what Hogwarts house do you belong?").

I don't know why I feel like it's sort of okay to share data with a company like Mozilla, but I guess I do. Although, come to think of it I'll also sign up to loyalty programmes at grog shops too... I guess I don't really care about my personal information that much!

It's more the bloody algorithms of YouTube and Twitter that annoy me. Give me chronologies and folksonomies any day! (Sort of. There is, of course, more to it, but that's always the case.)

Template frontmatter

Golden Gate Club, San Francisco

A few years ago the static-site blogging tool Jekyll popularised the idea of text files containing 'front matter', which is usually YAML-formatted metadata put between some delimiters at the top of a file. This works pretty well in MediaWiki as well, with a slightly different format (i.e. templates).

YAML:

type: book
author: John Smith
publication_date: 1923

Wikitext:

{{book
| author = John Smith
| publication_date = 1923
}}

HTML:

<div itemscope itemtype="http://schema.org/Book">
  A book by <span itemprop="author">John Smith</span>,
  published in <span itemprop="date">1923</span>.
</div>

I find it useful to think of wiki pages as representing instances of some sort of 'entity', and the template at the top is what defines this.
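The correspondence between the YAML and wikitext forms above is mechanical enough to sketch in a few lines of Python. This is only an illustration: treating the 'type' key as the template name is an assumption, and to_template is a hypothetical helper, not part of MediaWiki or Cargo.

```python
def to_template(front_matter: dict[str, str]) -> str:
    """Render YAML-style front matter as a wikitext template call.

    Assumption: the 'type' key names the template, and every other key
    becomes a named parameter. Values containing '|' or '}}' are not escaped.
    """
    fields = dict(front_matter)  # copy, so the caller's dict is left alone
    name = fields.pop("type")
    lines = ["{{" + name]
    lines += [f"| {key} = {value}" for key, value in fields.items()]
    lines.append("}}")
    return "\n".join(lines)


print(to_template({"type": "book", "author": "John Smith", "publication_date": "1923"}))
```

For the book example this prints the template call with one parameter per line, starting with {{book and ending with }}.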

With the addition of Cargo, all this metadata becomes queryable from elsewhere in the wiki.

Waiting for a flight

Perth airport

Sunday morning, Perth airport. In a few hours (by the clock) I'll be in San Francisco. It'll take me a bit longer than that to get there, but that's okay—it's a nice day. The main question is whether one should add spaces next to an em dash. Or whether it's easy to use altogether too many em dashes! Tricky questions. Plenty of time to figure them out. And to find the asciicircum key (why don't they call it what we all call it?).

In the UK, "phonebox numbers reached their peak in 1992, when there were 92,000 of them"[1] and now they're getting rid of lots and will be left with only twenty thousand or so. I guess we all have other phones now (even if some of us don't use them ever and are terrible at answering when phoned).

Someone said this morning that "MediaWiki seems to be much more popular outside the foundation then it is inside it."[2] I hope that's true! I mean, it's not good that it's not popular inside the Foundation (we use Google Docs sometimes for things! shock horror), but I like to think that it is popular outside. I like MediaWiki.

The coffee here really isn't as good as elsewhere. Yesterday, I had a terrific long black at the place on the corner of King and Wellington streets. Terrible seating, but lovely coffee, and a good window to peer out of. Important things. From here, a metre away, I can see the black polyester-clad bums of two pilots, dragging their nice little square rolly suitcases.

Anyway, I'll stop blathering and go find a better spot to while away the time.

  1. Prangle, Brian (2018) The streets they are changing
  2. Bawolff (20 January 2018) T183313 Wikimedia Developer Summit 2018 Topic: Evolving the MediaWiki Architecture

Centenary Building


Archaeological excavations (by a company called ARCHAE-AUS) around the Freo town hall have been carrying on this week, with evidence of a farrier's and blacksmith's found at the corner of Newman Court (was Street) and William Street, where the Centenary Building stood for thirty years or so from 1929.

The sign in the middle photo reads:

Exposing 1890–1913 blacksmiths', farriers', coachbuilders', & wheelwrights'.

Mr Tyler is in the photograph taken in 1896 in the front of his workshop (photo taken from other side of William St).

I'm not sure what happened in 1913.

Around the corner they've started work to uncover the foundations of the first St John's church. This one they know more about, as it was excavated in the 1980s (although, the word on the street is that not enough photos were taken, nor other detailed records made). It'll be exciting to see this, after knowing about the church's outline in the paving stones my whole life.

What's the future of hosting MediaWiki?


What is to be the future of running one's own MediaWiki? Shall there be a dozen different services required (database, cache, search, parser, ...) all running with different technologies and different systems of upgrading and support? Or will we head back to the "old days" (in which things like WordPress still exist) where it's basically just a single PHP application, perhaps now with its own dependency manager (i.e. Composer), and nothing much else? Are people with shared hosting accounts going to still be able to get it running? Will they be able to get it running more easily than they can today? (Certainly, they're not often currently getting it running with VisualEditor, for example.)

I'd like to think that MediaWiki will become easier to install. Maybe that means going in the direction of Discourse, and only supporting deployment via Docker, in order to hide the complexities of all the required services. But that's got a whole lot of confusions of its own, that I think are perhaps too much. Is the future of self-hosting really going to be VPSs, or even "serverlessness"? I guess it could be. The security conundrums with shared hosts are bad, certainly... but perhaps not as bad as poorly-managed whole servers? At least Dreamhost and their ilk monitor for suspicious-looking stuff; Digital Ocean couldn't care less until you're such a spam farm that you're interfering with other things.

Imagine if MediaWiki (with all the good bits as well) were so easy to install that people could turn to it for any collaborative editing website! I guess I'm probably just showing my age though, and am harking back to 2002 when it seemed desirable that people would control their own bit of the web. Still, I do think MediaWiki does multiple-people-editing-multiple-pages-quickly rather well, and is still easier to use (once installed) than some combination of "Markdown files on GitHub and photos loaded from Instagram and embeds from Twitter" or "put it all on WordPress.com" (or God forbid "we don't need anything now we've got Slack").

MediaWiki, through its structure and editing philosophy, really does encapsulate something great about the open web: we've got pages, they contain whatever you want, there are links between them, all changes are tracked, and beyond that lies an infinite field of human creativity and ingenuity. No algorithms to coerce your behaviour, nothing hidden and nothing prohibited. It still makes sense, and I think there's still a future for this sort of thing.

Open Source Hack Afternoon

Glendalough Railway Station

I went along today to my first open source hack afternoon, a regular language/platform agnostic hack group that's now meeting at the Artifactory.

It was a hot day today, with dark orange skies from the fires up near Mundaring, and when I got to the Artifactory there was a bit of a delay in getting inside and so we sheltered in half a metre of shade against a hot wall for a little while.

We had a pretty good room with a portable air conditioner that made it just about a bearable temperature (and provided white noise, in case that's useful). Stephen brought a projector, so we could share things more easily.

I'm looking forward to next month—and maybe more people will come! Maybe it'll be nicer weather.


Loops and deadlines


It's a loopy sort of day. I mean there are loops in it. Which is pretty common, but sometimes gets annoying. It's like having a flowchart in your head, but someone forgot to put an end state in it and instead it links back on itself. It only does this after some sufficiently large number of steps though, so it's sort of hard to see that it's a loop at first, because you're expecting it not to be and aren't keeping count properly.

There is no deadline so every second is one: on anxiety, perfectionism, and Wikimedia projects by Léna:

There is no deadline on Wikimedia projects, and we have so, so much to do. Maybe I should grab my camera, go outside, and take decent pictures of my city. Or I should work with my local LGBT community and do an edit-a-thon. Or work at a national level, contact centers and convince them to share some of their archives. Or I should work on my tools backlog. Gosh, I haven’t seen the admin discussions on French Wikipedia for months, I should give a helpful hand. And I have so many books, I just should take some time and write articles. Or I should look at the latest mass-upload on Commons and put pictures on Wikipedia. Or I should enrich the French Wiktionnary about Gaumais, the language of my ancestors, to my knowledge I am the only one on Wikimedia projects who knows about Gaumais. What is the most important ? Where do I have the most value ? Oh, I guess I should work on that friendly space policy at Wikimedia France, I have lot of experience on what works and what doesn’t on other spaces, I would be so useful there.

Not jumping for the shiny new things

It is very hard to not want to change things when a better idea comes along. The problem mostly seems to be that "better" is so easy to feel when you first see a new thing, and it's so hard to remember that the current way of doing things was once that same "better".

Bit by bit I'm learning to be more conservative in jumping to the next cool new technology, but I still feel the pull. A simple tweet about a new way of working with Markdown; a Hacker News thread about why hosting your own websites is dead; or the feeling of giving in to the corporate world of the web (Twitter & Flickr, mainly) will bring some cool relaxation. But really, ignoring the impulse (unless it comes again and again) is not that hard, and makes for an easier ride in the end.

Also finished listening to How to Stop Time (eps 8–10).

Beginning 2018


It's the fourth day of 2018, and nothing seems much different from 2017 yet. I'm still too slow at cutting bits of wood to the right size, and too indecisive about how much of my inaccuracies I want to leave showing in a finished piece, and so my wine glass shelves have progressed slowly. On the code front, I'm still trying to build a working system to extract my and my friends' photos from Flickr — the delays mostly around the best ways to keep them in MediaWiki (i.e. templates, using Cargo or not, etc.). One good thing that's happened is that GlobalPreferences' required changes to MediaWiki core have been merged, and so work can begin in earnest on the extension itself. This isn't yet that exciting, as no new functionality is being introduced, but we're one step closer to local overrides (and maybe one day even deploying the blasted thing).

My experiment in wiki-based genealogy research (over on ArchivesWiki) is progressing, and some issues around generating GraphViz graphs of human relationships are being resolved. One thing that looks like it could be useful is an idea to introduce a {{SHORTDESC:Lorem ipsum}} magic word. I reckon this could be the thing that is displayed in parentheses after a person's name in family trees (it'd beat the current system of attempting to glean a useful year-only date range out of non-standardized and variable-granularity date formats).
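To show why gleaning a year-only range is fiddly, here is a minimal Python sketch of that kind of guesswork. It is not the code actually running on ArchivesWiki; the function names and the "first four-digit number wins" rule are assumptions for illustration.

```python
import re

# A standalone run of exactly four digits is assumed to be a year.
YEAR = re.compile(r"\b(\d{4})\b")


def year_of(date_text: str) -> str:
    """Return the first four-digit year found in a free-form date, or '' if none."""
    match = YEAR.search(date_text)
    return match.group(1) if match else ""


def year_range(birth: str, death: str) -> str:
    """Format a '(1878–1953)'-style range; either side may be unknown."""
    start, end = year_of(birth), year_of(death)
    if not (start or end):
        return ""
    return f"({start}–{end})"
```

This copes with inputs like "3 March 1878" or "1953-07-12", but it knows nothing about approximate dates or ranges within a single field, which is the sort of thing an explicit short description would sidestep.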

1895 Stornoway chart


Approaches to Stornoway titleblock.png

Yesterday I found a 1:18300 scale nautical chart of the Approaches to Stornoway in an antique shop in North Fremantle. It's not large, but looks nice, and reminds me both of how nice printed maps can be (on good, thick, paper) and also of Stornoway (where I spent a few days last year). I'm making a wooden clamping bar to hang it with.

Approaches to Stornoway.png Approaches to Stornoway harbour detail.png

Then I made a bolted-stick contraption, and hung it on the wall:

2018-01-06 Stornoway chart on wall.jpg

Monday MediaWiki


Monday morning, hot and humid, and the rain's been falling all night (nearly 5 mm!). It's one of those lovely days when you can look out to the ocean and stand on the limestone and feel this place.

I'm reading through the position statements that have been accepted for the Wikimedia Developer Summit in January. It's great to read other people's ideas in this form. I think there's not really enough of that, in MediaWiki development: it's hard to get an idea of other people's 'big picture' thoughts of what the future should hold.

PhpFlickr 4.1.0

I've just tagged version 4.1.0 of my new fork of the PhpFlickr package. It introduces OAuth support, and hopefully improves the documentation of the user authentication process. This release deprecates some old behaviour, but I hope it doesn’t break anything. Bug reports are welcome!

There are some parts that are still not converted to the new request flow, but I’ll get to them next.

CFB Folder 1 done


The first folder of the C.F. Barker Archives’ material is done: finished scanning and initial entry into ArchivesWiki. This is my attempt to use MediaWiki as a digital archive platform for physical records (and digitally-created ones, although they don’t feature as much in the physical folders). It’s reasonably satisfactory so far, although there’s lots that’s a bit frustrating. I’m attempting to document what I’m doing (in a Wikibook), and there’s more to figure out.

There are a few key parts to it; two stand out as a bit weird. Firstly, the structure of access control is that completely separate wikis are created for each group of access required. This can make it tricky linking things together, but makes for much clearer separation of privacy, and almost removes the possibility of things being inadvertently made public when they shouldn’t be. The second is that the File namespace is not used at all for file descriptions. Files are considered more like ‘attachments’ and their metadata is contained on main-namespace pages, where the files are displayed. This means that files are _not_ considered to be archival items (except of course when they are; i.e. digitally-created ones!), but just representations of them, and for example multiple file types or differently cropped photos can all appear on a single item’s record. The basic idea is to have a single page that encapsulates the entire item (it doesn’t matter if the item is just a single photograph, and the system also works when the ‘item’ is an aggregate item of, for example, a whole box of photos being accessioned into ArchivesWiki).

Tabulate updated to not require REST API plugin

I’ve removed Tabulate’s dependency on the REST API plugin, because that’s now been moved in to core WordPress. (Actually, that happened rather a while ago, but I’m slack and haven’t been paying enough attention to Tabulate this year; other things going on!)

I hope to get back to adding file-field support to Tabulate sometime soon. That’d be a useful addition for me. Also, the whole situation with Reports needs sorting out: better documentation, easier to use, support for embedding in posts and as sidebar widgets, that sort of thing.


The MediaWiki Display Title extension is pretty cool. It uses a page’s display title in all links to that page. That might not sound like much, but it’s really useful to only have to change the title in one place, and have it show correctly all over the wiki. (This is much the same as DokuWiki with the useheading configuration variable set to 1.)

This is the sort of extension that I really like: it does a small thing, but does it well, and it makes sense as an addition to the core software. It’s not trying to do something completely different and just sit on top of or inside MediaWiki. It’s also not something that everyone would want, and so does belong as an extension and not an addition to core (even though the display title feature is part of core).

The other thing the Display Title extension provides is a parser function for retrieving the display title of any page: {{#getdisplaytitle:A page name}}, so you can use the display title without creating a link.

Updating PhpFlickr

I sort of really like bringing old code up to date, even though it can be such a pain. Got to have the time for it though.

Recently, I’ve been trying to modernize phpFlickr, so I can set up better Flickr-to-MediaWiki importing (and Flickr-to-Piwigo perhaps).

Cross-referencing in MediaWiki Genealogy


I want to add a new tag to the Genealogy extension: {{#genealogy:ref|Page Name}}

This would display a person's page name and years of birth and death, in exactly the same way that the 'partners' and 'children' lists currently do (and in fact, those would be changed to use the new 'ref' system). So for example, {{#genealogy:ref|John Smith}} would result in John William Smith (1878–1953) with the first part just being the full page name (after redirects have been followed) and the second part in parentheses being the 'coverage' or 'short description' value of that page (the details of which are in phabricator:T175667).
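The lookup described above can be sketched in a language-neutral way. The Python below is a hypothetical model, not the Genealogy extension's code (which is PHP): the Page class, its fields, and render_ref are all invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Page:
    title: str
    redirect_to: str = ""   # title of the redirect target, if this page is a redirect
    description: str = ""   # the 'coverage' or 'short description' value, if any


def render_ref(pages: dict[str, Page], name: str) -> str:
    """Return 'Full Page Name (description)' for the page that name resolves to."""
    page = pages[name]
    seen = {page.title}
    # Follow redirects until a real page is reached, guarding against redirect loops.
    while page.redirect_to and page.redirect_to not in seen:
        page = pages[page.redirect_to]
        seen.add(page.title)
    if page.description:
        return f"{page.title} ({page.description})"
    return page.title
```

With a "John Smith" page that redirects to "John William Smith", whose description is "1878–1953", render_ref gives the "John William Smith (1878–1953)" form from the example. A page with no description falls back to just its title.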

The idea is that whenever one mentions a person anywhere on the wiki, it'd be nice to sometimes be able to display that person's full name and details without having to look them up and duplicate them on the current page. It's something I've used for years in a custom \bioref{} command in my LaTeX family histories.

I'm not sure that this is actually a genealogy problem though. It seems like the sort of thing that would be useful in lots of different contexts, and perhaps already does exist in some extension or other. I'll do some more investigating...

