The programming we choose to do in our team is like the neck in the hourglass representing life science research and technology:
- There are many grains of sand above us. Those represent all the software tools developed in life science research.
- There is a large void below us. This represents the need for widely applicable tools in the life sciences.
- At the bottom there are also grains of sand. They are well settled. These represent the current technology: commercially available tools and well-serviced open source packages.
How does the sand get from the top to the bottom? Via the neck of development. It is narrow; only a few academic tools make it to the bottom. The flow through the neck is powered by:
- push: a few academic groups that have the capability and capacity to make their tools available
- pull: a few companies that look far ahead and are able to see and use the potential of an academic tool
In our hourglass of life science tools, new sand is being added at the top all the time. And most of it overflows the beaker after a while. Some tools never deliver what the author thought they would do. Some are made to solve a single problem and rightfully abandoned when that is done. But many tools are published and left as orphans. Only a selection of tools that promise to be useful for a larger audience ever make it to the neck.
In practice, the neck is too narrow. There are many more valuable tools than are taken up. A team like ours can help to widen the neck by making existing research tools suitable for wider use, as a service to life scientists with a clear need (we call it professionalization). But it is sometimes hard to convince funding parties to pay for this. It is also hard to convince researchers to work on making their software better: professionalization does not generate new high-impact papers. We work on convincing the funding parties that it is better to professionalize existing successes than to reinvent them using research money. And we work on convincing the scientists that professionalization of their output will lead to higher citation scores on their existing publications.
Science wants novelty. And the current Dutch funding climate is directed towards applied science, towards innovation in society. Look at the picture, and you can see that these are hard to combine. Innovation starts where novelty ends. The only way to combine the two is to include development.
Photo by graymalkn on flickr
Some of the computing services at universities are becoming paid services. The first reaction in research groups is often to fight, because the realistic costs of operating the existing infrastructure are high. If fighting does not work, there is a flight towards running decentralized infrastructure. This can look cheaper, but maintenance and incident control are rarely accounted for.
We will need good documentation to convince people of the true costs of the alternatives. It is such a waste if the scarce time of good bioinformatics experts is spent on inefficient server management.
Computer infrastructure used in universities is not part of a market, let alone of a "transparent market" in which everyone has a clear view of what alternatives exist and what their relative merits and costs are.
Nobody in a university research group finds it strange to pay for pens and paper.
Nobody in a research group finds it strange to pay for state-of-the-art lab equipment.
But very often computer services have been offered for free. Like water and electricity, they have been absorbed into the general costs of running the university.
This situation is unsustainable in a world in which life-science research is increasingly driven by big data. It is also unsustainable in a world where large storage and computing infrastructure suitable for routine jobs can be rented commercially.
The sustainable way to the future is to properly budget for data handling and storage. Budgeting for computing needs means people are required to balance cost and value, as with every other aspect of a research project.
Chemistry as a noun has two completely distinct meanings in everyday life:
- A good social relationship:
"It was visible that there was chemistry between those two people"
- Something related to a compound that is supposedly bad for people or the environment. "Chemical" is often used as a synonym for poisonous:
"A chemical leaked from the container into the sea, endangering the fish"
How come these two meanings of the same word have such extremely different connotations? After all, the scientific word chemistry covers any kind of reaction between compounds and has no positive or negative meaning in itself. Water is a chemical. Life is chemistry.
As a chemist, I wish I could change the negative connotation of molecular chemistry in the news. But if I really cannot succeed, maybe I can influence the social meaning of chemistry to make things consistent:
"There was chemistry between those two! When they first met, she tried to poison him. As soon as he recovered he exploded in anger."
Somehow I feel this would not be as satisfying.
[image credit: Nic McPhee on flickr]
Our team is working with a number of task forces in bioinformatics. Each of those task forces was started to collaborate on the development of a platform for their sub-field: a set of software tools that work together to solve the problems that everyone in the field needs to solve. Developing the platform does not require any new bioinformatics developments: the purpose is to put existing tools together.
The advantage of having these platforms available is obvious:
- to a biologist the advantage consists of having all the de facto standard tools available at the press of a button.
- to a specialist bioinformatics researcher working on a new tool the advantage is that they do not have to deal with the intricacies of all the other tools, and can plug their new tool into the platform using well-described protocols.
To get to the development of such a platform there is a bootstrapping problem. The situation is like a table with biologists sitting on one side and bioinformaticians on the other. Above the table hangs a thick (volcanic?) fog. The layout of the platform is drawn in diagrams on the table: all the tools making up the common workflow, with all their relations. On the side of the bioinformaticians, the diagram shows the concrete tools. Through the fog, they can vaguely see the workflow on the other side of the table. For the biologists, the situation looks completely different: they have a clear view of the concrete workflow they need, but the tools are vague entities that are only visible through the thick fog.
Without good support from a project leader who can listen to people on both sides of the table, the bioinformaticians will try to solve the very concrete problems they encounter in their very concrete individual tools. A little optimization here, a better data storage facility there. None of this is visible to the biologists.
This is why we put project leaders from our engineering team into each of the task forces. They will direct the focus of the bioinformaticians towards changes that the biologists can see: work on common data formats, and work on (common) user interfaces.
Getting things to work together will bootstrap the true collaborative advantages. It will blow away the fog. Suddenly the biologists will be able to see what is going on. They will be able to provide directed feedback. And the bioinformaticians will be able to see the workflow even from their side, and build upon it.
Image credit: Three views of three tables, by EJP Photo on Flickr.