Humbly Report: Sean Bechhofer

Semantics 'n' stuff

Archive for the ‘projects’ Category

What’s the Story, Morning Glory?



I went to the sameAs meeting in London this week, where the theme of the meeting was storytelling. It’s the first time I’ve been to a sameAs meetup (I’m in Oxford at OeRC for the next few weeks and it’s a bit easier to get through to London from here than from Manchester) and it was an interesting evening.

As one of three talks, science writer and blogger Ed Yong told a tale that started 150 million years ago with a mayfly in some mud, and ended up with a scientist wandering around lost in a swamp [1]. The (ultimately successful) search resulted in a publication [2], but one of Yong’s points was that the (potentially interesting) back story about the search leading to the discovery of the fossil wasn’t related in the paper. Should it have been?

To answer that question we’d have to think “is it important to the science that’s being presented in the paper”, or perhaps more concretely “will including this make it more likely that the paper will be accepted for publication”. For a majority of publication outlets, the answer to that is probably a no. But it certainly belongs somewhere — if nothing else, it provides a human side to the work that would help in public engagement or dissemination. Yong suggested that perhaps such information should be included in supplementary material. Many scientists are also now bloggers, so an obvious option is that we tell these additional stories through our blogs.

A question asked after the talk was whether narrative was really crucial to scientific papers. In my opinion (and based on my admittedly narrow experience of writing Computer Science papers) it certainly is — having a clear story to tell is vital if we are to write good, readable scientific papers. That doesn’t necessarily mean to say that we include all of the contextual detail (for example, stumbling lost around a swamp), but we do need a story to guide the reader.

As highlighted in some of the discussion after Yong’s talk, the way in which the story is told in a paper often doesn’t represent the true nature of the investigation. We may have gone down blind alleys, backtracked, repeated or redesigned experiments along the way. So the final paper presentation often isn’t a chronologically accurate description of the process. The story can get chopped up and reconstituted with a post hoc presentation of the timeline. That retelling of the story may end up losing some key information for those wishing to understand the process that the authors went through.

Work that we’re currently pursuing in the Wf4Ever project is addressing (some of) these issues. The project is investigating the use of Research Objects [3] to aggregate and bundle together the resources that are used in a scientific investigation. In particular, we’re focusing on two domains (genomics and astronomy) that make use of scientific workflows to code up and execute analyses that are taking place in an investigation. The hope is that by bundling together the context (in terms of the method/workflow, data sets, parameters, provenance information about data, workflow traces etc), a researcher has a better chance of understanding what took place and in turn building on those results, supporting reproducible science [4]. Other related work aims to define executable papers (e.g. Elsevier’s Executable Paper Grand Challenge [5]) that allow validation of code and data. The FORCE11 group [6] also see the notion of Research Object as replacing or superseding traditional paper publication.
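To make the bundling idea a little more concrete, here is a minimal sketch in Python of what aggregating an investigation's resources might look like. This is not the Wf4Ever model or any real Research Object format: the class names, the role labels, the file paths and the manifest layout are all made up for illustration.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Resource:
    """One aggregated item. The 'role' labels are hypothetical."""
    uri: str
    role: str  # e.g. "workflow", "dataset", "parameters", "provenance", "trace"


@dataclass
class ResearchObject:
    """A toy bundle of the resources used in an investigation."""
    title: str
    resources: list = field(default_factory=list)

    def aggregate(self, uri, role):
        # Record a resource as part of the bundle.
        self.resources.append(Resource(uri, role))
        return self

    def missing_roles(self, required=("workflow", "dataset", "provenance")):
        # Report which kinds of context have not been bundled yet.
        present = {r.role for r in self.resources}
        return [role for role in required if role not in present]

    def to_manifest(self):
        # Serialise the bundle as JSON (an assumed layout, not a standard).
        return json.dumps(
            {
                "title": self.title,
                "aggregates": [
                    {"uri": r.uri, "role": r.role} for r in self.resources
                ],
            },
            indent=2,
        )


# Hypothetical paths, purely for illustration.
ro = ResearchObject("Example analysis")
ro.aggregate("workflows/analysis.wf", "workflow")
ro.aggregate("data/input.csv", "dataset")
print(ro.missing_roles())
print(ro.to_manifest())
```

The `missing_roles` check is there to show why the bundle matters: a reader can see at a glance which parts of the context (here, the provenance) never made it into the package.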

Of course, even an enhanced publication still needs a good narrative and a story. Perhaps though our publications of tomorrow will include not just the text and arguments, but also the data, methods, and GPS tracks of a researcher lost in the woods….

REFERENCES

  1. Treasure hunt ends with a stunning fossil of a flying insect. Ed Yong, Not Exactly Rocket Science http://blogs.discovermagazine.com/notrocketscience/2011/04/04/treasure-hunt-ends-with-a-stunning-fossil-of-a-flying-insect/.
  2. Late Carboniferous paleoichnology reveals the oldest full-body impression of a flying insect. R.J. Knecht et al. PNAS 108(16) pp.6515–6519. http://dx.doi.org/10.1073/pnas.1015948108
  3. Linked Data is Not Enough for Scientists. S. Bechhofer et al. Future Generation Computer Systems, 2011. http://dx.doi.org/10.1016/j.future.2011.08.004
  4. Accessible Reproducible Research. J. Mesirov. Science 327 (5964) pp.415–416. http://dx.doi.org/10.1126/science.1179653
  5. Executable Papers Grand Challenge http://www.executablepapers.com/.
  6. Improving Future Research Communication and e-Scholarship. Phil Bourne, Tim Clark et al. FORCE11 Manifesto.

Written by Sean Bechhofer

February 22, 2012 at 5:36 pm

Posted in projects, research objects


Gone Fishin’



A Fish

The FISH.link project website is now online. FISH.link is a collaboration between the University of Manchester School of Computer Science, the Freshwater Biological Association, King’s College London Centre for e-Research and Queen Mary, University of London River Communities Group, funded under JISC’s Managing Research Data Programme. The project overview is as follows:

Motivated by the large quantity of diverse data in the freshwater biology community, FISH.Link will provide a demonstrator of the benefits of publishing data by illustrating how data can be combined, repurposed and reused with attribution and provenance information to promote data sharing. The project intends to support the sharing and integration of research data through the application of lightweight vocabularies and vocabulary mapping, facilitating integration of data sets, and moving towards the Web of Data that forms the current Linked Open Data vision.

A case study that addresses a real scientific question will be used to provide motivation, requirements and support evaluation.

FISH.link will produce tools that allow freshwater biologists to publish data into the Linked Data Cloud. These tools will be integrated into the FISHnet platform that supports the data life cycle in freshwater science and will use SKOS for the representation of vocabularies.

The project will run for 12 months until July 2011.
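As a rough illustration of the vocabulary mapping mentioned in the overview, the sketch below links column names from two data sets to shared concepts using the SKOS `exactMatch` property, then groups the columns by concept. Only the SKOS property URI is real; the `example.org` data sets, column names and concepts are invented, and this is not the FISH.link tooling.

```python
from collections import defaultdict

# The real SKOS mapping property; everything under example.org is hypothetical.
SKOS_EXACT_MATCH = "http://www.w3.org/2004/02/skos/core#exactMatch"

# (local term, mapping property, shared concept) triples.
mappings = [
    ("http://example.org/datasetA/temp_c", SKOS_EXACT_MATCH,
     "http://example.org/vocab/waterTemperature"),
    ("http://example.org/datasetB/water_temp", SKOS_EXACT_MATCH,
     "http://example.org/vocab/waterTemperature"),
    ("http://example.org/datasetA/ph", SKOS_EXACT_MATCH,
     "http://example.org/vocab/pH"),
]


def group_by_concept(triples):
    """Group local terms that are mapped to the same shared concept."""
    groups = defaultdict(list)
    for local_term, predicate, concept in triples:
        if predicate == SKOS_EXACT_MATCH:
            groups[concept].append(local_term)
    return dict(groups)


integrated = group_by_concept(mappings)
for concept, local_terms in integrated.items():
    print(concept, "<-", local_terms)
```

Once two differently named columns point at the same concept, they can be treated as the same measurement when the data sets are combined, which is the kind of lightweight integration the overview describes.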

Written by Sean Bechhofer

August 24, 2010 at 10:04 am

Posted in linked data, projects, skos
