In my last blog post, I wrote about my chromaticwhale twitterbot, which is merrily posting fictitious Northern Rail updates. The original bot made use of cheapbotsdonequick which, as the name suggests, made setting up a twitter bot cheap (i.e. free) and quick. I was keen to add some flexibility to the system, so wanted something a little more sophisticated.
I’m a cheapskate though, so I still wanted something free. I could probably have quietly set up a cron job on a work machine, but I’m not sure that would have been looked on favourably as it’s perhaps not exactly pushing back the bounds of human knowledge. A quick cast around suggested that heroku.com would provide what I needed. Heroku is a cloud platform for applications. They provide paid-for infrastructure for scalable apps, but if you’re just interested in small-scale dicking about experimentation, there’s an entry plan that gives an allowance of free “dyno” hours a month.
Set up involved creating an account and then a heroku application. This sets up a git repository that you can push code to, which will then run on the heroku infrastructure. This works nicely as I was already using git to manage the codebase. There’s also extra stuff in there that tells heroku about the requirements of your application. When you push to the remote repository, some magic happens at the heroku end and the application is built and set up.
For my purposes, I just wanted a one-off job run periodically, so I set up a scheduler that kicks off every hour. In order to give the impression of reality, the job doesn’t tweet every time it’s run, but on average once in every N times. This was one of the flexibilities I was keen to incorporate. Again, code on github shows how all this works.
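The "on average once in every N times" behaviour can be sketched in a few lines of python. This is an illustration only, not the bot's actual code (that's on github): the function name `should_tweet` and the parameter `n` are made up here, and I'm assuming a simple random draw on each scheduled run.

```python
import random


def should_tweet(n, rng=random):
    """Return True with probability 1/n, i.e. on average once in every n runs."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # rng.random() is uniform on [0, 1), so this is True about 1 run in n.
    return rng.random() < 1.0 / n


if __name__ == "__main__":
    # Simulate a month of hourly scheduler runs with a (hypothetical) n of 6.
    rng = random.Random()
    runs = 24 * 31
    tweets = sum(should_tweet(6, rng) for _ in range(runs))
    print(f"{tweets} tweets from {runs} hourly runs")
```

Because each run makes its own independent draw, the gaps between tweets vary, which is what gives the impression of reality.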
Over December, the bot has used just over 8 hours of time, which is well within the 1,000 free hours allocated.
Disclaimer: I have no connection with Heroku other than being a (happy) user. I’m sure other similar infrastructures are available if you really want to spend some time looking for them. I didn’t.
A friend of mine, Di Maynard, who works in computational linguistics and NLP, alerted me to cheapbotsdonequick last week, a service that makes it really easy to set up a twitter-bot. It hooks up to a twitter account and will tweet generated messages at regular intervals. The message content is generated via a system called tracery, using a grammar to specify rules for string generation. There are a number of bots around that use this service including some that generate SVG images — @softlandscapes is my favourite. I thought this looked like an interesting and fun idea to explore.
I’d done some earlier raspberry pi-based experiments hooking up to real-time rail information, so I decided to stick with the train theme and develop a bot tweeting “status updates” for Northern Rail. These wouldn’t quite be real updates though.
A tracery grammar contains simple rules that are expanded to produce a final result. Each rule can have a number of different alternatives, which are chosen at random. See the tracery tutorial for more information. For my grammar, I produced a number of templates for simple issues, e.g.
high volumes of X reported at Y
plus some consequences such as re-routing or disruption to catering services. The grammar allows us to put together templates plus rules about capitalisation or plurals etc.
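To give a flavour of how rule expansion works, here is a minimal sketch in python. It is not tracery itself and not my actual grammar: it only mimics the `#symbol#` substitution style, ignores modifiers for capitalisation and plurals, and the rule names (`origin`, `issue`, `consequence`, `cause`, `station`) are invented for the example.

```python
import random
import re

# A toy grammar: each symbol maps to a list of alternatives.
RULES = {
    "origin": ["#issue#. #consequence#."],
    "issue": [
        "High volumes of #cause# reported at #station#",
        "#station# closed due to #cause#",
    ],
    "consequence": [
        "Replacement bus service from #station#",
        "Catering services are disrupted",
    ],
    "cause": ["Oriental cockroaches", "Orkney voles", "Cretan frogs"],
    "station": ["Wressle", "Lostock Gralam"],
}

SYMBOL = re.compile(r"#(\w+)#")


def expand(text, rules, rng):
    """Replace every #symbol# in text with a randomly chosen alternative,
    expanding recursively until only terminals are left."""
    def substitute(match):
        alternatives = rules[match.group(1)]
        return expand(rng.choice(alternatives), rules, rng)

    return SYMBOL.sub(substitute, text)


if __name__ == "__main__":
    print(expand("#origin#", RULES, random.Random()))
```

Each call picks one alternative per symbol at random, so repeated calls give different reports from the same small set of rules.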
For the terminals of the grammar — the things that appear as X or Y, I pulled lists from an external, third party data source: dbpedia. For those who aren’t aware of dbpedia, it’s a translation of (some of) the data in Wikipedia into a nicely structured form (RDF), which is then made available via a query endpoint. In this case, I used dbpedia’s SPARQL endpoint to query for words to use as terminals in the grammar. There are other open data sources I could have used, but this was one I was familiar with.
This allowed me to get hold of the stations managed by Northern Rail, plus some “causes” of disruption, which I chose to be European Rodents, Amphibians, common household pests and weather hazards. The final grammar was produced programmatically (using python).
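Pulling a list of names from the endpoint looks roughly like the sketch below. Treat it as an assumption-laden illustration: the query is hypothetical (I'm guessing at `dbo:operatedBy` and `dbr:Northern_Rail`, and the real property and resource names may differ), and the `format` request parameter is a convention of the server software rather than part of the SPARQL standard. The shape of the JSON results (`results` → `bindings` → variable → `value`) is the standard one.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://dbpedia.org/sparql"

# Hypothetical query: the property and resource names are guesses.
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX dbr: <http://dbpedia.org/resource/>
SELECT DISTINCT ?label WHERE {
  ?station dbo:operatedBy dbr:Northern_Rail ;
           rdfs:label ?label .
  FILTER (lang(?label) = "en")
}
"""


def build_url(query, endpoint=ENDPOINT):
    """Build a GET URL asking the endpoint for JSON results."""
    params = urllib.parse.urlencode(
        {"query": query, "format": "application/sparql-results+json"}
    )
    return f"{endpoint}?{params}"


def labels_from_results(results, var="label"):
    """Pull the plain string values for one variable out of SPARQL JSON results."""
    return [row[var]["value"] for row in results["results"]["bindings"] if var in row]


def fetch_labels(query):
    """Run the query against the endpoint (needs network access)."""
    with urllib.request.urlopen(build_url(query)) as response:
        return labels_from_results(json.load(response))
```

The list that comes back from `fetch_labels(QUERY)` can then be dropped straight into the grammar as the alternatives for a terminal.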
The grammar then produces a series of reports, for example:
Wressle closed due to Oriental cockroaches. Replacement bus service from Lostock Gralam.
So, is there anything to this other than some amusement value? Well, not really, but there are perhaps a couple of points of interest. First off, it’s an illustration of the way in which we can make use of third party, open information sources. This is nice because:
- I don’t need to think about lists of European rodents and amphibians or stations served by Northern Rail.
- The actual content of the lists was unseen by me, so the combinations thrown up are unexpected and keep me amused.
- I can substitute in a different collection of stations or hazards and extend when I get bored of hearing about Cretan frogs and Orkney voles.
- The data sources use standardised vocabulary for the metadata (names etc.) so it’s easy to pull out names of things (potentially in other languages).
I teach an Undergraduate unit on Fundamentals of Computation that focuses largely on defining languages through the use of automata, regular expressions and grammars. The grammars here are (more or less) context free grammars, so this gives an amusing example of what we can do with such a construct.
I am now awaiting the first irate email from a traveller who “didn’t go for the train because you said the station was closed due to an infestation of Orkney Voles”.
We had the annual School of Computer Science “Staff vs Students” coding competition this week. This is a competition run along the lines of the ACM ICPC with teams of two trying to solve algorithmic problems. The problems are small in the sense that they are specified in a paragraph or two, but are far from trivial! Each team of two had three hours to crack as many of the six problems as they could.
The competition is managed using the DOMjudge system. Solutions are uploaded via a web interface, where the system will then check for correctness. The system provides little feedback, with responses being along the lines of CORRECT, WRONG-ANSWER or reporting an issue with time or memory (solutions must run with limited resources). Teams can submit multiple attempts, with incorrect solutions attracting a time penalty, and the first team to solve a problem getting a bonus. This year we had 23 student teams along with 6 teams containing at least one staff member (staff can also pair with PhD students).
I teamed up with Valentino, one of our PhD students, and I’m pleased to say that, while we didn’t win, nor did we disgrace ourselves, managing to solve two problems in the time allotted, with a respectable mid-table finish. I committed the schoolboy error of diving into Problem A without doing sufficient triage and spent a huge amount of time on it. Problem C was much easier! Lessons learnt. Or perhaps not, as I did exactly the same thing last year….
Anyway, results aside, it was a lot of fun and one of the events that contributes to the friendly social atmosphere that I think we have within the School. We adjourned to the pub later on where Head of School Jim Miles stood a round of drinks and I was entertained by some pretty impressive card tricks by one of my first year tutees.
Kudos to the organising committee of Karol Jurasiński, Ion Diaconu, Ettore Torti, Tudor Morar and Gavin Brown, who also do a great job in managing the School’s Coding Dojo and co-ordinating participation in external competitions. The final rankings were clear, with the top team solving four problems, and second and third three. There was then clear daylight between those and the rest of the bunch, with all the other teams solving two or fewer.
We’re underwater, kneeling on the sand below Koona Jetty. It’s our last night dive on a trip around Ari Atoll in the Maldives, and we’ve been promised “something special”. Special it most certainly was.
When we started diving, a friend gave us a copy of David Doubilet’s “Water Light Time”. It’s a beautiful collection of underwater photographs, including a set showing manta rays feeding on plankton. Those images had stuck with me, and the manta was firmly on the must-see list. Despite having had years of great diving, though, they’d eluded us. Not for want of trying, either. We’d been to places where they were “quite likely”, including a dive in Hawaii with Keller Laros, the Manta Man, at the very spot where Doubilet had shot his pictures. Nowt. How was it possible not to see one and a half metric tonnes of fish?
For a couple of days leading up to Koona, the crew had been alluding to some kind of treat, but without letting on exactly what. At last, they revealed that we’d be taking a night dive under the jetty by the hotel on Koona. The jetty is floodlit, the lights attract plankton, and, well, I’m sure you can guess the rest.
We jump in, and swim through the dark to the shore. The light from the jetty is bright enough that we don’t need our torches, and we’ve been told to kneel on the bottom, remain still and wait. So we wait. And wait. Five, then ten minutes go past, and there’s nothing. Surely it can’t happen again? Wildlife encounters are a matter of luck — they’re wild animals after all, and there are never any guarantees. But how unlucky can we be?
And then that longed-for diamond shape appears, and a manta passes over our heads. Then another, and another, swooping through the water above us. They take turns to barrel-roll through the clouds of plankton, turning tight somersaults with their paddle-like lobes funnelling the water through their mouths. They are huge — up to four metres across — but move through the water with an effortless grace, their wings barely moving as they glide past, turn and come around again. Each animal has distinct markings on its skin, and we have at least half a dozen dancing around us. The show goes on for nearly forty minutes, then, just as suddenly as it started, it’s over and they head back to the open ocean. As we swim back to the boat, a ray takes a last turn around, ducking underneath us and giving a last chance to enjoy these beautiful creatures.
We’d waited over thirteen years for this, and I’d wait another thirty if I could do it again. And if you’ve ever wondered whether it’s possible to shed a tear in a dive mask, it is.
Editorial Note: This is a piece that I wrote as an entry for the Guardian’s Travel Writing Competition in 2013 — 500 words on “an encounter”. It didn’t win, but I didn’t want it to go to waste! I also wrote on swimming with sharks.
It’s an hour after sunset and we’re standing on the back of a boat, staring down into the black waters of the Indian Ocean, wondering what lurks beneath the surface. Except that we know what’s lurking beneath the surface. Because this is Maya Thila in the Maldives, and what we’re going to find down there are sharks. Lots of sharks. Hunting.
Scuba diving takes you into an alien world, with easy movement in three dimensions, communication restricted to hand signals and flora and fauna quite unlike anything you’ll encounter on the surface. On night dives this becomes even more so, as that’s when all the really weird stuff comes out. Worms, slugs, crustaceans, feather stars, anemones. Tonight though, we’re here to see the resident population of white tips out looking for their dinner. On earlier dives, we’ve seen plenty of sharks. During the day, they tend to be fairly sedentary, snoozing in the sand, or cruising slowly past the reef. At night, it’s all change, and even with these small reef sharks (classified in our fish book as “usually docile”), you can see just why they’re apex predators. As we circle the reef, there are sharks everywhere, flashing out of the gloom and through our torchlight, darting in and out of caves in search of their prey.
It’s a wonderful opportunity to see “Nature red in tooth and claw” close up. Where else could one be within touching distance of an animal that sits at the top of the food chain (other than humans of course) and watch as they demonstrate their rightful place at the head of that chain?
And contrary to all those years of bad press, they’re really not interested in us. Not that the adrenalin isn’t flowing. It’s like being immersed in an episode of the Blue Planet, and at times there’s almost too much to take in. Not only are there hunting sharks, but moray eels, lionfish and snapper are joining in the fray, making the most of the light from our torches to track and target.
After what seems like ten minutes, but is closer to an hour, the dive is done and it’s time to make our way up the mooring line. We break the surface and Jacques Cousteau’s Silent World is replaced by a hubbub of excited voices as buddy pairs dry off, sip hot tea and swap tales of the deep.
Editorial Note: This is a piece that I wrote as an entry for the Guardian’s Travel Writing Competition in 2013 — 500 words on “wildlife”. It didn’t win, but I didn’t want it to go to waste! I also wrote about an encounter with mantas.
The likelihood of me getting to present the Oscars is rather low, but I did get to say those famous words during the “awards ceremony” for the Semantic Web Challenge last month at the International Semantic Web Conference in Sydney.
The Challenge is a yearly event, sponsored by Elsevier, that invites researchers and developers to showcase applications and systems that are being built with emerging semantic technologies. Now in its 11th year, the Challenge doesn’t define a specific task, data set or application domain, but instead sets out a number of criteria that systems should meet.
Candidates were invited to demonstrate their systems during the posters and demos session on the first evening of the conference. A panel of judges then selected a set of finalists who gave short presentations during two dedicated conference sessions. The winners were then chosen following a lively debate between the judges. And so, without further ado, to the golden envelope…….
The winners of the Open Track in 2013 were Yves Raimond and Tristan Ferne for their system The BBC World Service Archive Prototype. Yves featured throughout ISWC2013, giving an excellent keynote to the COLD workshop and also presenting a paper featuring related work in the Semantic Web In Use Track. The winning system combined a number of technologies including text extraction and audio analysis in order to tag archive broadcasts from the World Service. Crowdsourcing (with over 2,000 users) is then used to clean and validate the resulting tags. Visualisations based on tags extracted from live news feeds allow journalists to quickly locate relevant content.
Second place in the Open Track went to Zachary Elkins, Tom Ginsburg, James Melton, Robert Shaffer, Juan F. Sequeda and Daniel Miranker for Constitute: The World’s Constitutions to Read, Search and Compare. Constitute provides access to the text of over 700 constitutions from countries across the world. As Juan Sequeda told us in his excellent presentation during the session, although this may seem like a niche application, each year on average 30 constitutions are amended and 5 are replaced. Drafting constitutions requires significant effort, and providing systematic access to existing examples will be of great benefit. One of the particularly appealing aspects of Constitute was that it demonstrated societal impact: this is an application that could potentially change lives. An interesting technical aspect was that while building the ontology that drives the system, a domain expert made use of the pre-existing FAO Geopolitical Ontology (without being explicitly guided to do so). Thus we see an example of interlinking between, and reuse of, terminological resources, which is one of the promises of the Semantic Web.
Joint third prizes went to B-hist: Entity-Centric Search over Personal Web Browsing History and STAR-CITY: Semantic Traffic Analytics and Reasoning for CITY. The latter was a system developed by IBM’s Smarter Cities Technology Centre in Dublin and highlighted the fact that the Challenge attracts entries from both academic and industrial research centres. A Big Data prize was awarded to Fostering Serendipity through Big Linked Data, a system that integrates the Linked Cancer Genome Atlas dataset with PubMed literature.
All the winning entries will have the opportunity to submit papers to a Special Issue of the Journal of Web Semantics.
This was my first year co-chairing the challenge (with Andreas Harth of KIT) and I was impressed by both the quality and variety of the submissions. The well attended presentation sessions also show a keen interest in the challenge from the community. I’ll be looking forward to seeing the submissions for ISWC2014 in Trentino!
ISWC in Sydney was also memorable due to the Semantic Web Jam Session featuring live RDF triple generation (that man Yves again), but that’s a whole other story……
I’m sure that almost anyone who reads this blog will be aware of the Raspberry Pi, the credit-card sized ARM GNU/Linux box that aims to get kids interested in coding. I’m one of those middle aged geeks to whom the Pi has a particular appeal, but I’d still like to share my early experiences.
I’m an academic in a Computer Science Department and have been writing code for over thirty years — I’m of the generation who cut their coding teeth on the BBC micro in the ’80s (the comparison between the Pi and the BBC as a vehicle for enthusing the next generation resonates). So for me, the fact that this is a Linux box I can write code for isn’t that exciting. What has been fun is the opportunity and ease of connecting up low level peripherals. That’s flashing lights and buttons to you and me.
Despite my background and career, I’ve never really dabbled in low level electronics, and my soldering just about stretches to the odd bit of guitar maintenance, or even construction. And sure, I could do low level stuff with my MacBook with the appropriate connections and some kind of USB magic (couldn’t I?), but the instant appeal of those little GPIO pins sticking out of the board is strong. Plus the fact that if, or more likely when, I fry the board with my incompetent electronic skillz, it’ll cost me not much more than a pizza and a bottle of wine in a restaurant.
Luckily for me, some of my colleagues have developed the Pi-face, an interface that plugs on to the Pi and provides easy access to a number of inputs and outputs. It even has four switches and eight LEDs built in. Along with the supporting python libraries, it was a breeze to get going and I had flashing lights in no time. Woo-hoo! The Pi-face was nice as it allowed me to do a little bit of playing around without worrying too much about Pi-fry. After all, if I can choose to spend the money on pizza or pi then mine’s a Fiorentina and a glass of nice red please.
From there on it’s been a slippery slope. I got myself a breadboard and an assortment of LEDs. More flashing lights! I discovered a wealth of ebay shops that will sell all manner of components at cheap-as-chips prices. I’ve been spending increasing amounts of time in the garage surrounded by bits of wire and blobs of solder. Of course I have more disposable income than your average 10 year old, but when you can pick up an LCD screen for a couple of quid we’re still very much in pocket-money territory. Hooking up the LCD was a blast and meant I could actually begin to build useful projects. First of these was the piPlayer, a streaming radio. My next project (train times monitoring — coming soon) needed more than 8 outputs*, so once I was confident with the Pi-face, I started experimenting with direct use of the GPIO pins, using the Adafruit cobbler to break the pins out. “Break the pins out” — see, I’m even using the language now! And my soldering’s getting better.
There have been some other interesting learning experiences. When I wanted to use a π character in my piPlayer display I found myself downloading the HD44780 datasheet (my reaction two months ago: datasheet, what’s a datasheet?) to find the appropriate hex character to send. It also took me a fair while to realise that the PiFace outputs are pulled low when set to 1. So when I first hooked up my LCD after cannibalising some instructions, I was faced with what appeared to be a screen of Korean characters and obscure punctuation, reminiscent of a bout of swearing from an Asterix character. When I finally realised the problem, flipped the bits in my python code and saw the words Hello Sean appear in blue and white letters, I punched the air like a little kid. And that’s the whole point of the Pi.
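For the curious, the bit flipping amounts to something like the sketch below. It's a simplified illustration, not my piPlayer code: I'm assuming each character goes out as a single 8-bit value on eight output lines, and the function names are invented here.

```python
def invert_byte(value):
    """Flip all eight bits of a byte, so every 1 becomes 0 and vice versa."""
    return value ^ 0xFF


def encode_for_inverted_outputs(text):
    """Pre-invert each ASCII character code so that, once the outputs
    invert it again, the display receives the code that was intended."""
    return [invert_byte(code) for code in text.encode("ascii")]


if __name__ == "__main__":
    # Without the fix, 'H' (0x48) would reach the display as 0xb7.
    print(hex(invert_byte(ord("H"))))
    print([hex(code) for code in encode_for_inverted_outputs("Hello Sean")])
```

Every unflipped character code ends up with its top bit set, which puts it well outside the ASCII range and would explain why nothing on the screen looked remotely like English.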
*Although I understand that the Pi-face v2 will allow the use of the input pins as outputs, giving more than eight.