Humbly Report: Sean Bechhofer

Semantics 'n' stuff

Something Special


Manta!

We’re underwater, kneeling on the sand below Koona Jetty. It’s our last night dive on a trip around Ari Atoll in the Maldives, and we’ve been promised “something special”. Special it most certainly was.

When we started diving, a friend gave us a copy of David Doubilet’s “Water Light Time”. It’s a beautiful collection of underwater photographs, including a set showing manta rays feeding on plankton. Those images had stuck with me, and the manta was firmly on the must-see list. Despite having had years of great diving, though, they’d eluded us. Not for want of trying, either. We’d been to places where they were “quite likely”, including a dive in Hawaii with Keller Laros — the Manta Man — at the very spot where Doubilet had shot his pictures. Nowt. How was it possible not to see one and a half metric tonnes of fish?

For a couple of days leading up to Koona, the crew had been alluding to some kind of treat, but without letting on exactly what. At last, they revealed that we’d be taking a night dive under the jetty by the hotel on Koona. The jetty is floodlit, the lights attract plankton, and, well, I’m sure you can guess the rest.

We jump in, and swim through the dark to the shore. The light from the jetty is bright enough that we don’t need our torches, and we’ve been told to kneel on the bottom, remain still and wait. So we wait. And wait. Five, then ten minutes go past, and there’s nothing. Surely it can’t happen again? Wildlife encounters are a matter of luck — they’re wild animals after all, and there are never any guarantees. But how unlucky can we be?

And then that longed-for diamond shape appears, and a manta passes over our heads. Then another, and another, swooping through the water above us. They take turns to barrel-roll through the clouds of plankton, turning tight somersaults with their paddle-like lobes funnelling the water through their mouths. They are huge — up to four metres across — but move through the water with an effortless grace, their wings barely moving as they glide past, turn and come around again. Each animal has distinct markings on its skin, and we have at least half a dozen dancing around us. The show goes on for nearly forty minutes, then, just as suddenly as it started, it’s over and they head back to the open ocean. As we swim back to the boat, a ray takes a last turn around, ducking underneath us and giving a last chance to enjoy these beautiful creatures.

We’d waited over thirteen years for this, and I’d wait another thirty if I could do it again. And if you’ve ever wondered whether it’s possible to shed a tear in a dive mask, it is.

Editorial Note: This is a piece that I wrote as an entry for the Guardian’s Travel Writing Competition in 2013 — 500 words on “an encounter”. It didn’t win, but I didn’t want it to go to waste! I also wrote on swimming with sharks.

Written by Sean Bechhofer

January 14, 2014 at 5:23 pm

Posted in diving

Swimming with Sharks


Maya Thila Dives

It’s an hour after sunset and we’re standing on the back of a boat, staring down into the black waters of the Indian Ocean, wondering what lurks beneath the surface. Except that we know what’s lurking beneath the surface. Because this is Maya Thila in the Maldives, and what we’re going to find down there are sharks. Lots of sharks. Hunting.

Scuba diving takes you into an alien world, with easy movement in three dimensions, communication restricted to hand signals and flora and fauna quite unlike anything you’ll encounter on the surface. On night dives this becomes even more so, as that’s when all the really weird stuff comes out. Worms, slugs, crustaceans, feather stars, anemones. Tonight though, we’re here to see the resident population of whitetips out looking for their dinner. On earlier dives, we’ve seen plenty of sharks. During the day, they tend to be fairly sedentary, snoozing in the sand, or cruising slowly past the reef. At night, it’s all change, and even with these small reef sharks (classified in our fish book as “usually docile”), you can see just why they’re apex predators. As we circle the reef, there are sharks everywhere, flashing out of the gloom and through our torchlight, darting in and out of caves in search of their prey.

It’s a wonderful opportunity to see “Nature red in tooth and claw” close up. Where else could one be within touching distance of an animal that sits at the top of the food chain (other than humans, of course) and watch as it demonstrates its rightful place at the head of that chain?

And contrary to all those years of bad press, they’re really not interested in us. Not that the adrenalin isn’t flowing. It’s like being immersed in an episode of the Blue Planet, and at times there’s almost too much to take in. Not only are there hunting sharks, but moray eels, lionfish and snapper are joining in the fray, making the most of the light from our torches to track and target.

After what seems like ten minutes, but is closer to an hour, the dive is done and it’s time to make our way up the mooring line. We break the surface and Jacques Cousteau’s Silent World is replaced by a hubbub of excited voices as buddy pairs dry off, sip hot tea and swap tales of the deep.

Editorial Note: This is a piece that I wrote as an entry for the Guardian’s Travel Writing Competition in 2013 — 500 words on “wildlife”. It didn’t win, but I didn’t want it to go to waste! I also wrote about an encounter with mantas.

Written by Sean Bechhofer

January 14, 2014 at 5:18 pm

Posted in diving

And the Winner is…


A Big Cheque

The likelihood of me getting to present the Oscars is rather low, but I did get to say those famous words during the “awards ceremony” for the Semantic Web Challenge last month at the International Semantic Web Conference in Sydney.

The Challenge is a yearly event, sponsored by Elsevier, that invites researchers and developers to showcase applications and systems built with emerging semantic technologies. Now in its 11th year, the Challenge doesn’t define a specific task, data set or application domain, but instead sets out a number of criteria that systems should meet.

Candidates were invited to demonstrate their systems during the posters and demos session on the first evening of the conference. A panel of judges then selected a set of finalists who gave short presentations during two dedicated conference sessions. The winners were then chosen following a lively debate between the judges. And so, without further ado, to the golden envelope…….

The winners of the Open Track in 2013 were Yves Raimond and Tristan Ferne for their system The BBC World Service Archive Prototype. Yves featured throughout ISWC2013, giving an excellent keynote to the COLD workshop and also presenting a paper featuring related work in the Semantic Web In Use Track. The winning system combined a number of technologies including text extraction and audio analysis in order to tag archive broadcasts from the World Service. Crowdsourcing (with over 2,000 users) is then used to clean and validate the resulting tags. Visualisations based on tags extracted from live news feeds allow journalists to quickly locate relevant content.

Second place in the Open Track went to Zachary Elkins, Tom Ginsburg, James Melton, Robert Shaffer, Juan F. Sequeda and Daniel Miranker for Constitute: The World’s Constitutions to Read, Search and Compare. Constitute provides access to the text of over 700 constitutions from countries across the world. As Juan Sequeda told us in his excellent presentation during the session, although this may seem like a niche application, each year on average 30 constitutions are amended and 5 are replaced. Drafting constitutions requires significant effort, and providing systematic access to existing examples will be of great benefit. One of the particularly appealing aspects of Constitute was that it demonstrated societal impact — this is an application that could potentially change lives. An interesting technical aspect was that while building the ontology that drives the system, a domain expert made use of the pre-existing FAO Geopolitical Ontology (without being explicitly guided to do so). Thus we see an example of interlinking between, and reuse of, terminological resources, which is one of the promises of the Semantic Web.

Joint third prizes went to B-hist: Entity-Centric Search over Personal Web Browsing History and STAR-CITY: Semantic Traffic Analytics and Reasoning for CITY. The latter was a system developed by IBM’s Smarter Cities Technology Centre in Dublin and highlighted the fact that the Challenge attracts entries from both academic and industrial research centres. A Big Data prize was awarded to Fostering Serendipity through Big Linked Data, a system that integrates the Linked Cancer Genome Atlas dataset with PubMed literature.

All the winning entries will have the opportunity to submit papers to a Special Issue of the Journal of Web Semantics.

This was my first year co-chairing the challenge (with Andreas Harth of KIT) and I was impressed by both the quality and variety of the submissions. The well-attended presentation sessions also showed a keen interest in the challenge from the community. I’ll be looking forward to seeing the submissions for ISWC2014 in Trentino!

ISWC in Sydney was also memorable due to the Semantic Web Jam Session featuring live RDF triple generation (that man Yves again), but that’s a whole other story……

Written by Sean Bechhofer

November 14, 2013 at 5:28 pm

Posted in conference


Life of Pi


The piPlayer

I’m sure that almost anyone who reads this blog will be aware of the Raspberry Pi, the credit-card-sized ARM GNU/Linux box that aims to get kids interested in coding. I’m one of those middle-aged geeks to whom the Pi has a particular appeal, but I’d still like to share my early experiences.

I’m an academic in a Computer Science Department and have been writing code for over thirty years — I’m of the generation who cut their coding teeth on the BBC Micro in the ’80s (the comparison between the Pi and the BBC as a vehicle for enthusing the next generation resonates). So for me, the fact that this is a Linux box I can write code for isn’t that exciting. What has been fun is the opportunity and ease of connecting up low-level peripherals. That’s flashing lights and buttons to you and me.

Despite my background and career, I’ve never really dabbled in low-level electronics, and my soldering just about stretches to the odd bit of guitar maintenance, or even construction. And sure, I could do low-level stuff with my MacBook with the appropriate connections and some kind of USB magic (couldn’t I?), but the instant appeal of those little GPIO pins sticking out of the board is strong. Plus the fact that if, or more likely when, I fry the board with my incompetent electronic skillz, it’ll cost me not much more than a pizza and a bottle of wine in a restaurant.

Luckily for me, some of my colleagues have developed the Pi-face, an interface that plugs on to the Pi and provides easy access to a number of inputs and outputs. It even has four switches and eight LEDs built in. Along with the supporting Python libraries, it was a breeze to get going and I had flashing lights in no time. Woo-hoo! The Pi-face was nice as it allowed me to do a little bit of playing around without worrying too much about Pi-fry. After all, if I can choose to spend the money on pizza or pi then mine’s a Fiorentina and a glass of nice red please.
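For the curious, this is roughly what “flashing lights in no time” looks like in code. It’s a minimal sketch assuming the pifacedigitalio Python library; the library and its API have been renamed and reworked over the years, so treat the details as illustrative rather than gospel:

```python
# Chase the Pi-face's eight on-board LEDs, then read its four switches.
# Sketch only: assumes the pifacedigitalio library is installed.
import time
import pifacedigitalio

pfd = pifacedigitalio.PiFaceDigital()

# Light each LED in turn, a few times round.
for _ in range(5):
    for led in pfd.leds:
        led.turn_on()
        time.sleep(0.1)
        led.turn_off()

# Read the four on-board switches (1 = pressed).
print([switch.value for switch in pfd.switches])
```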

From there on it’s been a slippery slope. I got myself a breadboard and an assortment of LEDs. More flashing lights! I discovered a wealth of eBay shops that will sell all manner of components at cheap-as-chips prices. I’ve been spending increasing amounts of time in the garage surrounded by bits of wire and blobs of solder. Of course I have more disposable income than your average 10-year-old, but when you can pick up an LCD screen for a couple of quid we’re still very much in pocket-money territory. Hooking up the LCD was a blast and meant I could actually begin to build useful projects. First of these was the piPlayer, a streaming radio. My next project (train times monitoring — coming soon) needed more than eight outputs*, so once I was confident with the Pi-face, I started experimenting with direct use of the GPIO pins, using the Adafruit cobbler to break the pins out. “Break the pins out” — see, I’m even using the language now! And my soldering’s getting better.
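Driving a single LED on the breadboard directly from a GPIO pin is barely more code. This sketch uses the RPi.GPIO library and picks BCM pin 18 purely as an example; use whichever pin you’ve actually broken out:

```python
# Blink an LED wired (through a resistor) to a GPIO pin on the breadboard.
import time
import RPi.GPIO as GPIO

LED_PIN = 18                     # example pin; pick the one you've wired up

GPIO.setmode(GPIO.BCM)           # use Broadcom pin numbering
GPIO.setup(LED_PIN, GPIO.OUT)

try:
    for _ in range(20):
        GPIO.output(LED_PIN, GPIO.HIGH)
        time.sleep(0.5)
        GPIO.output(LED_PIN, GPIO.LOW)
        time.sleep(0.5)
finally:
    GPIO.cleanup()               # release the pins whatever happens
```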

There have been some other interesting learning experiences. When I wanted to use a π character in my piPlayer display I found myself downloading the HD44780 datasheet (my reaction two months ago: datasheet, what’s a datasheet?) to find the appropriate hex character to send. It also took me a fair while to realise that the Pi-face outputs are pulled low when set to 1. So when I first hooked up my LCD after cannibalising some instructions, I was faced with what appeared to be a screen of Korean characters and obscure punctuation, reminiscent of a bout of swearing from an Asterix character. When I finally realised the problem, flipped the bits in my Python code and saw the words “Hello Sean” appear in blue and white letters, I punched the air like a little kid. And that’s the whole point of the Pi.
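For anyone heading down the same path, here’s the gist of the fix. The byte inversion is the important bit; the 0xF7 code for π comes from the common A00 character ROM, so check the datasheet for your own module, and pushing the bytes out to the display is left to whatever LCD driver routine you’re using:

```python
# Flip the bits so that a logical 1 actually drives an (active-low)
# Pi-face output high, and look up the pi character for the display.
def to_piface_bits(byte):
    """Invert a byte for the active-low outputs."""
    return (~byte) & 0xFF

# Pi sits at 0xF7 on the common HD44780 A00 character ROM --
# other ROMs differ, so check your module's datasheet.
PI_CHAR = 0xF7

message = [ord(c) for c in "piPlayer "] + [PI_CHAR]
print([hex(to_piface_bits(code)) for code in message])
```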

*Although I understand that the Pi-face v2 will allow the use of the input pins as outputs, giving more than eight.

Written by Sean Bechhofer

November 8, 2012 at 5:23 pm

Posted in raspberry pi


All the World’s a Stage


Jason Groth Wigs Out

Anyone who knows me is probably aware of the fact that I’m a keen amateur* musician. So I was very pleased to be able to work on a musical dataset while spending some sabbatical time at OeRC with Dave De Roure. The project has been focused around the Internet Archive‘s Live Music Archive. The Internet Archive is a “non-profit organisation building a library of internet sites and other cultural artifacts in digital form”. They’re the folks responsible for the Wayback Machine, the service that lets you see historical states of web sites.

The Live Music Archive is a community-contributed collection of live recordings with over 100,000 performances by nearly 4,000 artists. These aren’t just crappy bootlegs by someone with a tape deck and a mic down their sleeve either — many are taken from direct feeds off the desk or have been recorded with state-of-the-art equipment. It’s all legal too, as the material in the collection has been sanctioned by the artists. I first came across the archive several years ago — it contains recordings by a number of my current favourites including Mogwai, Calexico and Andrew Bird.

Our task was to take the collection metadata and republish it as Linked Data. This involves a couple of stages. The first is to simply massage the data into an RDF-based form. The second is to provide links to existing resources in other data sources. There are two “obvious” sources to target here: MusicBrainz, which provides information about music artists, and GeoNames, which provides information about geographical locations. Using some simple techniques, we’ve identified mappings between the entities in our collection and external resources, placing the dataset firmly into the Linked Data Cloud. The exercise also raised some interesting questions about how we expose the fact that there is an underlying dataset (the source data from the archive) along with some additional interpretations on that data (the mappings to other sources). There are certainly going to be glitches in the alignment process — with a corpus of this size, automated alignment is the only viable solution — so it’s important that data consumers are aware of what they’re getting. This also relates to other strands of work about preserving scientific processes and new models of publication that we’re pursuing in projects like Wf4Ever. I’ll try and return to some of these questions in a later post.
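To make the two stages concrete, here’s a toy sketch in Python using rdflib. The vocabulary is entirely made up for illustration (the real dataset has its own etree terms and draws on existing music vocabularies), and the MusicBrainz identifier is a placeholder for whatever the alignment step actually finds:

```python
# Stage 1: massage a performance record into RDF.
# Stage 2: assert a link from the artist to an external resource.
# Illustrative only: the namespaces and properties are placeholders.
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import OWL, RDF, RDFS

EX = Namespace("http://example.org/etree/")

g = Graph()

performance = EX["performance/mogwai-2009-05-01"]
artist = EX["artist/mogwai"]

g.add((performance, RDF.type, EX.Performance))
g.add((performance, EX.performer, artist))
g.add((performance, EX.date, Literal("2009-05-01")))
g.add((artist, RDFS.label, Literal("Mogwai")))

# The mapping produced by the (automated) alignment step.
g.add((artist, OWL.sameAs,
       URIRef("http://musicbrainz.org/artist/PLACEHOLDER-MBID")))

print(g.serialize(format="turtle"))
```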

So what? Why is this interesting? For a start, it’s a fun corpus to play with, and one shouldn’t underestimate the importance of having fun at work! On a more serious note, the corpus provides a useful resource for computational musicology as exemplified by activities such as MIREX. Not only is there metadata about a large number of live performances with links to related resources, but there are links to the underlying audio files from those performances, often in high-quality audio formats. So there is an opportunity here to combine analysis of both the metadata and audio. Thus we can potentially compare live performances by individual artists across different geographical locations. This could be in terms of metadata — which artists have played in which locations (see the network below) and does artist X play the same setlist every night? Such a query could also potentially be answered by similar resources such as http://www.setlist.fm. The presence of the audio, however, also offers the possibility of combining metadata queries with computational analysis of the performance audio data — does artist X play the same songs at the same tempo every night, and does that change with geographical location? Of course this corpus is made up of a particular collection of events, so we must be circumspect in deriving any kind of general conclusions about live performances or artist behaviour.

Who Played Where?

The dataset is accessible from http://etree.linkedmusic.org. There is a SPARQL endpoint along with browsable pages delivering HTML/RDF representations via content negotiation. Let us know if you find the data useful, interesting, or if you have any ideas for improvement. There is also a short paper [1] describing the dataset submitted to the Semantic Web Journal. The SWJ has an open review process, so feel free to comment!
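If you want to poke at the endpoint programmatically, something along these lines should work. The query is a sketch of a “who played where” question: the endpoint path and the property names are placeholders, so consult the published vocabulary and endpoint details rather than trusting these:

```python
# Sketch of querying the dataset's SPARQL endpoint from Python.
# The endpoint path and property names below are assumptions.
from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://etree.linkedmusic.org/sparql")
sparql.setReturnFormat(JSON)
sparql.setQuery("""
    PREFIX ex: <http://example.org/etree/>
    SELECT ?artist ?venue (COUNT(?performance) AS ?gigs)
    WHERE {
      ?performance ex:performer ?artist ;
                   ex:venue     ?venue .
    }
    GROUP BY ?artist ?venue
    ORDER BY DESC(?gigs)
    LIMIT 10
""")

for row in sparql.query().convert()["results"]["bindings"]:
    print(row["artist"]["value"], row["venue"]["value"], row["gigs"]["value"])
```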

REFERENCES

  1. Sean Bechhofer, David De Roure and Kevin Page. Hello Cleveland! Linked Data Publication of Live Music Archives. Submitted to the Semantic Web Journal Special Call for Linked Dataset Descriptions.

*Amateur in a positive way in that I do it for the love of it and it’s not how I pay the bills.

Written by Sean Bechhofer

May 23, 2012 at 1:23 pm

Posted in linked data, music, rdf


What’s the Story, Morning Glory?


What's the Story, Morning Glory?

I went to the sameAs meeting in London this week, where the theme of the meeting was storytelling. It’s the first time I’ve been to a sameAs meetup (I’m in Oxford at OeRC for the next few weeks and it’s a bit easier to get through to London from here than from Manchester) and it was an interesting evening.

In one of the evening’s three talks, science writer and blogger Ed Yong told a tale that started 150 million years ago with a mayfly in some mud, and ended up with a scientist wandering around lost in a swamp [1]. The (ultimately successful) search resulted in a publication [2], but one of Yong’s points was that the (potentially interesting) back story about the search leading to the discovery of the fossil wasn’t related in the paper. Should it have been?

To answer that question, we’d have to think “is it important to the science that’s being presented in the paper?”, or perhaps more concretely “will including this make it more likely that the paper will be accepted for publication?”. For a majority of publication outlets, the answer is probably no. But it certainly belongs somewhere — if nothing else, it provides a human side to the work that would help in public engagement or dissemination. Yong suggested that perhaps such information should be included in supplementary material. Many scientists are also now bloggers, so an obvious option is that we tell these additional stories through our blogs.

A question asked after the talk was whether narrative was really crucial to scientific papers. In my opinion (and based on my admittedly narrow experience of writing Computer Science papers) it certainly is — having a clear story to tell is vital if we are to write good, readable scientific papers. That doesn’t necessarily mean to say that we include all of the contextual detail (for example, stumbling lost around a swamp), but we do need a story to guide the reader.

As highlighted in some of the discussion after Yong’s talk, the way in which the story is told in a paper often doesn’t represent the true nature of the investigation. We may have gone down blind alleys, backtracked, repeated or redesigned experiments along the way. So the final paper presentation often isn’t a chronologically accurate description of the process. The story can get chopped up and reconstituted with a post hoc presentation of the timeline. That retelling of the story may end up losing some key information for those wishing to understand the process that the authors went through.

Work that we’re currently pursuing in the Wf4Ever project is addressing (some of) these issues. The project is investigating the use of Research Objects [3] to aggregate and bundle together the resources that are used in a scientific investigation. In particular, we’re focusing on two domains (genomics and astronomy) that make use of scientific workflows to code up and execute the analyses that take place in an investigation. The hope is that by bundling together the context (in terms of the method/workflow, data sets, parameters, provenance information about data, workflow traces etc.), a researcher has a better chance of understanding what took place and in turn building on those results, supporting reproducible science [4]. Other related work aims to define executable papers (e.g. Elsevier’s Executable Paper Grand Challenge [5]) that allow validation of code and data. The FORCE11 group [6] also see the notion of Research Object as replacing or superseding traditional paper publication.
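As a rough illustration of the aggregation idea (and nothing more than that), a Research Object can be thought of as one resource that points at everything used in an investigation. The sketch below uses the OAI-ORE aggregation vocabulary and invented file names; the actual Research Object model defines its own terms on top of this kind of structure:

```python
# Illustrative sketch: a Research Object as an aggregation of the
# workflow, data and provenance from an investigation.
# The file names are invented; only the ORE vocabulary is real.
from rdflib import Graph, Namespace, URIRef

ORE = Namespace("http://www.openarchives.org/ore/terms/")
EX = Namespace("http://example.org/ro/")

g = Graph()
ro = EX["investigation-42"]

for part in ["workflow.t2flow",
             "inputs/genome.fasta",
             "results/alignment.csv",
             "provenance/run-trace.ttl"]:
    g.add((ro, ORE.aggregates, EX[part]))

print(g.serialize(format="turtle"))
```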

Of course, even an enhanced publication still needs a good narrative and a story. Perhaps though our publications of tomorrow will include not just the text and arguments, but also the data, methods, and GPS tracks of a researcher lost in the woods….

REFERENCES

  1. Treasure hunt ends with a stunning fossil of a flying insect. Ed Yong, Not Exactly Rocket Science. http://blogs.discovermagazine.com/notrocketscience/2011/04/04/treasure-hunt-ends-with-a-stunning-fossil-of-a-flying-insect/
  2. Late Carboniferous paleoichnology reveals the oldest full-body impression of a flying insect. R. J. Knecht et al. PNAS 108(16), pp. 6515–6519. http://dx.doi.org/10.1073/pnas.1015948108
  3. Linked Data is Not Enough for Scientists. S. Bechhofer et al. Future Generation Computer Systems, 2011. http://dx.doi.org/10.1016/j.future.2011.08.004
  4. Accessible Reproducible Research. J. Mesirov. Science 327(5964), pp. 415–416. http://dx.doi.org/10.1126/science.1179653
  5. Executable Papers Grand Challenge. http://www.executablepapers.com/
  6. Improving Future Research Communication and e-Scholarship. Phil Bourne, Tim Clark et al. FORCE11 Manifesto.

Written by Sean Bechhofer

February 22, 2012 at 5:36 pm

Posted in projects, research objects


Sparky’s Magic Piano


Not actually the Magic Piano.....

In the week before Christmas, I attended the Digital Music Research Network meeting at Queen Mary, University of London. Digital Music research is not an area I’m currently involved with, but I went to the meeting at the suggestion of Dave De Roure. I’ll be spending some sabbatical time with Dave in Oxford this year and one of the things we’re going to be looking at is whether we can apply the technologies and approaches being developed in other projects (in particular the Research Objects of Wf4Ever) to tasks like Music Information Retrieval. I’m also excited about this as it fits with some of my extra-curricular interests in music. The mix of the technical and artistic (in terms of both content and people) reminded me of Hypertext conferences that I went to back in ’99 and ’00.

Although some of the talks were a long way from my expertise, I found a few of particular interest. The opening keynote from Elaine Chew discussed some of the issues involved in conducting research — for example ensuring that work leads to publication (and publications that “count”), credit is given for researchers involved in the work, and that work is sustainable. This was illustrated with some fascinating video footage of experiments with a piano duo, investigating how the introduction of delay affects the interaction and interplay between performers.

Gyorgy Fazekas presented the Studio Ontology — a model that builds on earlier work on the Music Ontology by Yves Raimond. At first sight, the ontology seems fairly lightweight (a largely asserted taxonomy), but given my own interests in Semantic Web technologies, this is clearly an area for further investigation.

The jewel in the crown, however, was Andrew McPherson‘s work on Electronic augmentation of the acoustic grand piano. The magnetic resonator piano uses electromagnets to induce string vibrations. For those of you familiar with the EBow, used by guitarists including Bill Nelson and Robert Fripp, it’s like a piano with 88 EBows bolted on to it. A keyboard sensor (I believe using a Moog Piano Bar) captures data from the keys and drives the system. The whole thing requires no alteration to the instrument, and can be set up in a few hours. It’s an electronic instrument, but all the sound is produced using the physical soundboard and strings of the instrument itself (i.e. no amplifier/speakers).

The overall effect is a little like an organ, with infinite sustain of notes, but many more subtle effects can be obtained including string “bending” and the introduction of additional harmonic tones. Andrew gave a demonstration of the instrument over lunch. One regret I have is that performance anxiety kicked in here (I’m a fairly rudimentary pianist) and I didn’t rush forward to have a go when he offered it to the floor! And I hadn’t brought a camera. Videos on Andrew’s site show the instrument in action.

One aspect here is the use of various gestures. Electronic keyboards have facilities like aftertouch, allowing the player to add additional pressure to the keys to control the additional tones/effects. This is possible here, with other gestures such as sliding the fingers along or up and down the keys being used to “play” the instrument. In the talk, Andrew described some additional work he was doing on providing enhanced keyboard controllers to support these additional gestures. The piano keyboard is a ubiquitous controller/interface to a musical instrument — it will be interesting to see how these additional gestures and controls fit in with players’ established practices, and which gestures are “right” for which effects.

Of course, the obvious question that we then all asked was what other instruments one could apply this approach to. Answers on a postcard……

Written by Sean Bechhofer

January 6, 2012 at 1:12 pm

Posted in music, workshop

