What’s the Story, Morning Glory?

I went to the sameAs meeting in London this week, where the theme was storytelling. It’s the first time I’ve been to a sameAs meetup (I’m in Oxford at OeRC for the next few weeks and it’s a bit easier to get through to London from here than from Manchester) and it was an interesting evening.

In one of three talks, science writer and blogger Ed Yong told a tale that started 150 million years ago with a mayfly in some mud, and ended up with a scientist wandering around lost in a swamp [1]. The (ultimately successful) search resulted in a publication [2], but one of Yong’s points was that the (potentially interesting) back story about the search leading to the discovery of the fossil wasn’t related in the paper. Should it have been?

To answer that question we’d have to ask “is it important to the science that’s being presented in the paper?”, or perhaps more concretely “will including this make it more likely that the paper will be accepted for publication?”. For a majority of publication outlets, the answer is probably no. But it certainly belongs somewhere: if nothing else, it provides a human side to the work that would help in public engagement or dissemination. Yong suggested that perhaps such information should be included in supplementary material. Many scientists are also now bloggers, so an obvious option is that we tell these additional stories through our blogs.

A question asked after the talk was whether narrative was really crucial to scientific papers. In my opinion (and based on my admittedly narrow experience of writing Computer Science papers) it certainly is: having a clear story to tell is vital if we are to write good, readable scientific papers. That doesn’t necessarily mean that we include all of the contextual detail (for example, stumbling lost around a swamp), but we do need a story to guide the reader.

As highlighted in some of the discussion after Yong’s talk, the way in which the story is told in a paper often doesn’t represent the true nature of the investigation. We may have gone down blind alleys, backtracked, repeated or redesigned experiments along the way. So the final paper presentation often isn’t a chronologically accurate description of the process. The story can get chopped up and reconstituted with a post hoc presentation of the timeline. That retelling of the story may end up losing some key information for those wishing to understand the process that the authors went through.

Work that we’re currently pursuing in the Wf4Ever project is addressing (some of) these issues. The project is investigating the use of Research Objects [3] to aggregate and bundle together the resources that are used in a scientific investigation. In particular, we’re focusing on two domains (genomics and astronomy) that make use of scientific workflows to code up and execute the analyses that take place in an investigation. The hope is that by bundling together the context (in terms of the method/workflow, data sets, parameters, provenance information about data, workflow traces etc.), a researcher has a better chance of understanding what took place and in turn building on those results, supporting reproducible science [4]. Other related work aims to define executable papers (e.g. Elsevier’s Executable Paper Grand Challenge [5]) that allow validation of code and data. The FORCE11 group [6] also see the notion of Research Object as replacing or superseding traditional paper publication.

Of course, even an enhanced publication still needs a good narrative and a story. Perhaps, though, our publications of tomorrow will include not just the text and arguments, but also the data, methods, and GPS tracks of a researcher lost in the woods….


