David Willetts sets out the benefits of making publicly-funded research free to access.
I am very grateful for this opportunity to set out the Government’s approach to accessing and publishing research findings.
We are very fortunate to have such outstanding science and research capacity in this country. It is second in its range and volume only to the US. When it comes to the output generated from the funding that goes in, it is quite simply the most productive in the world. And no other country produces such a high proportion of work that is excellent. The recent review by Reed Elsevier, amongst others, provides the rigorous evidence behind these statements. With 1% of the world’s population and 4% of its researchers, we produce 6% of the world’s academic articles and 14% of those which are most highly cited. There are about 1.7 million academic articles published around the world each year, of which about 120,000 come from UK research. Thanks to the quality and success of our publishing industry, meanwhile, 400,000 of the world’s academic papers are published in the UK. If the rest of Britain performed like our research and publishing community, we would have rather fewer economic problems to tackle.
The Coalition is absolutely committed to sustaining this excellence. That is why we have protected cash spending for science and research. It is also why we have introduced higher fees and loans to be repaid by graduates, despite the intense controversy, to ensure our universities are well funded even as public spending is being cut back. Initiatives such as the life sciences strategy, the new Catapult centres, and the extra funding for high-performance computing are all aimed at strengthening our research capabilities in the face of intense international competition.
Our research base is one of our greatest economic assets. But it also enriches us in deeper ways. It enriches our cultural life to have such a range of intellectual activity here. We have an extraordinary window on the world: just about whatever happens anywhere, from a hostage taken in an obscure tribal area to a new scientific experiment, there is almost always someone in the UK who can help us understand it. This is an extraordinary privilege that we enjoy and which we should not take for granted. It contributes immeasurably to the quality of our national life.
The Coalition set out its strategy on research and innovation last December. We understand that academic research is not a sausage machine where you simply put public money in one end and count the patents and the peer-reviewed articles that come out the other end. Instead we think of it as an ecosystem with subtle and intricate interdependences. It has many elements. Crucial is a spirit of free - and occasionally eccentric - intellectual enquiry. That in turn is sustained by our autonomous universities and our learned and professional societies. The ecosystem includes great charitable foundations like the Wellcome Trust. It is enriched by historic collections of data and objects organised with extraordinary skill and care. You, our world-class publishers of academic research, are another key part of this ecosystem.
We recognise the value which publishers add. Peer review is a crucial part of the research process. It takes various forms in which academics generously give their time to scrutinise draft articles. But value can also be added by identifying the academics to conduct peer review, through the editorial function of signalling which research is of the highest worth, and by helping others to find it. I do not scan the cacophony of opinion on the web to try to work out what is happening in the world: my laptop is set to open at the BBC news page. I do not follow wild and wacky online rumours. Instead I trust - well, usually - our leading newspapers and magazines to sort through the news for me. This process of winnowing data and research and then ranking them is of great value. Somehow it has to be paid for. But the way we pay for it is changing.
When I was a student, you went to the library to take physical copies of journal articles down from the shelves. The best articles were in the best journals. The physical location of books meant they were already sorted and ranked. The canon was embodied in, and indeed protected by, the need for the physical organisation of stuff in libraries, and museums and galleries too. Nowadays, the canon is not so easily sustained by the act of organising material in physical space: this is a profound change which has wider effects on our culture and our politics, with which we are still grappling. You do not have a Dewey Decimal system on the web. Instead, as David Weinberger puts it, everything is miscellaneous. The more data and freely available comment, the more important it becomes to sort and shape that information.
As well as reading some of the articles set by my tutors, I also remember browsing through the pages of the leading journals to see which articles were well-thumbed. It helped me to spot the key ones I ought to be familiar with - a primitive version of crowd-sourcing. The web should make that kind of search behaviour far easier. It offers a democratised version of the traditional canon. We are still at the early stages of harnessing the full power of the web to help us find and rate material. It is so frustrating to try to track something on the web, face 1.5 million results, and then give up after the first few. (Ironically, Google’s own search engine began as a device for measuring and tracking citations in academic journals; one reason why the ranking of results is still basically done by number of hits.) But researchers, just like the rest of us, spend too much time searching for material which they know must be out there on the web but can’t quite track down. Sir Tim Berners-Lee argues powerfully for the importance of developing a far more flexible semantic web. That is one of the challenges for our new Open Data Institute.
We need to have far more research material freely available, and we need to be better at editing and sorting it. The challenge is to discharge both of these crucial functions better. This is the challenge facing your industry but you are not on your own. It is one that we share, since Government has responsibilities for intellectual property, copyright and, of course, funding much research. We can work on this in partnership. Let me now set out our approach.
Our starting point is very simple. The Coalition is committed to the principle of public access to publicly-funded research results. That is where both technology and contemporary culture are taking us. It is how we can maximise the value and impact generated by our excellent research base. As taxpayers put their money towards intellectual enquiry, they cannot be barred from then accessing it. They should not be kept outside with their noses pressed to the window - whilst, inside, the academic community produces research in an exclusive space. The Government believes that published research material which has been publicly financed should be publicly accessible - and that principle goes well beyond the academic community.
Perhaps I might speak from the experience of writing my own book, The Pinch, on fairness between the generations. It was very frustrating to track down an article and then find it hidden behind a paywall. That meant it was freely accessible to a professional in an academic institution, but not to me as an independent writer. That creates a barrier between the academic community and the rest of us, which is deeply unhealthy. I’ve heard of other anomalies: the lone researcher who enrols on an evening course he never attends in order to gain library access, since it’s cheaper than buying the articles he needs to read; the small business owner for whom one advantage of employing a sandwich-year student is that she can print off research papers through her university library. So there has to be a right to roam freely across the achievements of publicly-funded UK research.
The evidence underpinning our ambition for public access is compelling. For example, publicly funded and freely available information from the Human Genome Project led to greater take-up of knowledge and commercialisation than from earlier protected data. To date, in fact, every dollar of federal investment in the Human Genome Project has helped generate $141 for the US economy. Separately, a report this year from the US Committee for Economic Development has concluded that the US National Institutes of Health’s policy of open access after one year has accelerated scientific progress and the transition from basic research to commercialisation; generated more follow-on research and more citations; and reduced duplicate or dead-end lines of inquiry - so increasing the US government’s return on its investment in research. The researcher Philip Davis, meanwhile, has found that when publishers randomly made certain articles open access on journal websites, readership increased by up to 250% compared to protected articles.
I realise this move to open access presents a challenge and opportunity for your industry, as you have historically received funding by charging for access to a publication. Nevertheless that funding model is surely going to have to change even beyond the positive transition to open access and hybrid journals that’s already underway. To try to preserve the old model is the wrong battle to fight. Look at how the music industry lost out by trying to criminalise a generation of young people for file sharing. It was companies outside the music business such as Spotify and Apple, with iTunes, that worked out a viable business model for access to music over the web. None of us want to see that fate overtake the publishing industry.
Wider access is the way forward. I understand the publishing industry is currently considering offering free public access to scholarly journals at all UK public libraries. This is a very useful way of extending access: it would be good for our libraries too, and I welcome it.
Provided we all recognise that open access is on its way, we can then work together to ensure that the valuable functions you carry out continue to be properly funded - and that the publishing industry remains a significant contributor to the UK economy. I believe that academic publishing does add value, not least because peer review is at the heart of our system of determining and communicating high-quality research. We do not yet have any reliable version of peer review on the net. The controversy about the status and reliability of reviews on Trip Advisor is a reminder of how precious genuinely objective peer review is.
It would be deeply irresponsible to get rid of one business model and not put anything in its place. That is why I hosted a roundtable at BIS in March last year when all the key players discussed these issues. There was a genuine willingness to work together. As a result I commissioned Dame Janet Finch to chair an independent group of experts to investigate the issues and report back. We are grateful to the Publishers Association for playing a constructive role in her exercise, and we look forward to receiving her report in the next few weeks. No decisions will be taken until we have had the opportunity to consider it. But perhaps today I can share with you some provisional thoughts about where we are heading.
The crucial options are, as you know, called green and gold. Green means publishers are required to make research openly accessible within an agreed embargo period. This prompts a simple question: if an author’s manuscript is publicly available immediately, why should any library pay for a subscription to the version of record of any publisher’s journal? If you do not believe there is any added value in academic publishing you may view this with equanimity. But I believe that academic publishing does add value. So, in determining the embargo period, it’s necessary to strike a suitable balance between enabling revenue generation for publishers via subscriptions and providing public access to publicly funded information. In contrast, gold means that research funding includes the costs of immediate open publication, thereby allowing for full and immediate open access while still providing revenue to publishers.
There are versions of the green option - creating a single and very short standard embargo period for all publicly funded research - which may not work. There are, of course, differences between disciplines. In the arts and humanities, and to some extent the social sciences, the half-life of an article is longer than in, say, medicine. One of the key features of a science is that all previous knowledge is embodied in current theory, so older work may be honoured but does not have to be read. Literature, philosophy and historical writing are very different: it would be very peculiar to think you did not need to read Shakespeare or Kant or De Tocqueville. An article in the humanities or social sciences may be essential reading for much longer, so a green open access policy would have to take account of this difference. That is why our Research and Innovation Strategy referred to public access at “or around” the time of publication. This point was also addressed in the draft policy from RCUK - and we will reflect on it further in the light of Janet Finch’s findings.
Meanwhile, the Wellcome Trust has been rightly praised for its decision to move to gold open access. As the Research Councils have proposed in their own draft policy, Wellcome are treating the costs of publication as an inherent part of the research process. This is the key feature of the gold option and again recognises the need to pay for publication and editorial functions.
As we await Janet Finch’s advice, perhaps the most important question is how we get from here to there. We have to work out, with you, how to manage any transition, and it’s particularly tricky in an open international environment. The UK has many of the leading academic journals globally - of the world’s 23,000 peer-reviewed journals, 5,000 are published from the UK - but we do not have so many of the world’s academic libraries and research institutes. This means our journals are an important export industry, with perhaps 80% of their revenues coming from sales abroad. Let me put this crudely. At the moment, American and Chinese libraries have to pay for journals containing the results of our scientists’ research. In future we could be giving our research articles to the world for free via open access. But will we still have to pay for foreign journals and research carried out abroad? If so, would we have undermined not only a business model but an export industry too? And there could be collateral damage for our learned societies, a very important part of our research ecosystem, as many rely on journal subscriptions from abroad for part of their income. Those chauvinistic statistics with which I began this speech show that 94% of articles are published outside the UK, and we might still need to pay to access them while giving them ours. If so, there would be a clear shift in the balance of funding of research between countries. That is why, together with representatives of the academic community, I will be encouraging international action. Indeed, this is an area where if we can work together on an agreed approach we can then take a lead internationally and shape the debate.
I am pleased to report therefore that representatives of the European Commission will be coming to BIS very soon to discuss open access. We share common objectives with the Commission and want to ensure that a sustainable strategy is developed for Europe as a whole. I will also be discussing the whole issue with colleagues beyond the EU. Fortunately there is already a lively debate on these issues in the US, and we hope they will be implementing similar initiatives. The US Committee for Economic Development, for example, advocates building on open access initiatives taken by the National Institutes of Health, arguing that the costs involved are outweighed by the economic benefits derived from greater utilisation of research.
There is another group we need to think about who do not conventionally have a place at the table. Earlier, I referred to my experience as someone outside the academic system trying to access academic research. The gold option would have helped researchers like me without a conventional academic post. But academics sometimes forget about a second type of outsider - someone who is conducting academic research without an official academic post but who legitimately expects to publish it in an academic journal. In future they will meet a new obstacle if they need to pay to be accepted in an academic journal. This is quite a significant issue in academic disciplines such as local history, where we have a rich literature generated by many highly skilled amateurs. I know some publishers already respond flexibly to this issue, for example through fee waivers for lone scholars, and I am keen to know if this might offer a sustainable way forwards.
We also need to take account of the rapid technological change now underway. We are seeing an extraordinary surge in the sheer volume of data being released. It all goes back to Moore’s Law. It was originally a proposition about transistors - that their capacity would double every two years. But in its modern form, the law includes improvements in software too. In this form it says that computing capacity doubles every 18 months. And far from falling behind on this we appear, if anything, to be speeding up - with some estimates that capacity is now doubling roughly every 13 months. While processing speeds may have increased a thousand times, another estimate is that better algorithms have improved performance 43,000-fold. In his 2000 book The Age of Spiritual Machines: When Computers Exceed Human Intelligence, Ray Kurzweil shows the power of Moore’s Law by applying it to the well-known story about the origins of chess. When the creator of chess demonstrated it to the king of Persia, the king was so impressed by the new game that he invited the inventor to name his reward. The man is supposed to have said he had a simple request: just one grain of rice for the first square of the chessboard, two grains for the next square, four for the next, eight for the next and so on for all 64 squares, with each square having double the number of grains on the square before. The king readily agreed, and for the first part of the chessboard, it was fine. But by the time they had got to the second half of the board the squares were going to require more grains of rice than were available in his entire kingdom. This is the power of compounding. We are now experiencing this with IT.
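The arithmetic behind the chessboard story can be checked in a few lines. What follows is only an illustrative sketch in Python, not something taken from Kurzweil's book: the function names are invented for this example, and the ten-year horizon used to compare the three doubling periods mentioned above is an arbitrary assumption.

```python
def grains_on_square(n: int) -> int:
    """Grains on square n (counting from 1): one grain, doubled n - 1 times."""
    return 2 ** (n - 1)


def total_grains(squares: int) -> int:
    """Total over the first `squares` squares.

    1 + 2 + 4 + ... + 2**(squares - 1) is a geometric series
    whose sum is 2**squares - 1.
    """
    return 2 ** squares - 1


def growth_factor(months: float, doubling_period_months: float) -> float:
    """How many times capacity multiplies over `months` of steady doubling."""
    return 2 ** (months / doubling_period_months)


first_half = total_grains(32)               # squares 1 to 32
whole_board = total_grains(64)              # squares 1 to 64
second_half = whole_board - first_half      # squares 33 to 64

print(f"First half of the board:  {first_half:,} grains")
print(f"Second half of the board: {second_half:,} grains")

# Assumed horizon of ten years (120 months), chosen only for illustration.
for period in (24, 18, 13):
    factor = growth_factor(120, period)
    print(f"Doubling every {period} months: about {factor:,.0f}x in ten years")
```

The first half of the board comes to a little over four billion grains; the second half alone needs 2**32 times as many as the first. The same function shows why the difference between doubling every 24 months and every 13 months matters so much once it is compounded over a decade.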
The experts say that we are now reaching the second half of the chessboard with enormous amounts of data now being generated - as with gene sequencing for example. The transformation of our capacity to handle very large amounts of data is going to have revolutionary effects. Some of the conventional distinctions between micro and macro analysis will erode as macro analysis starts emerging from very large volumes of micro data. I am persuaded by the argument that we are going to see a new era of data-intensive scientific discovery. And its implications spread way beyond academia to business. Instead of physically building prototypes, we can model them on computers. This is one reason why we invested a further £150 million in high-performance computing. As they say, we must out-compute to compete. Our skills in software writing and computer simulation give us a real advantage.
Data mining is becoming an important part of scientific advance, with computer scientists working collaboratively with researchers and publishers to develop the necessary tools and technologies. With well over a million academic articles every year, researchers wanting to keep abreast of developments in their field are going to need analytic tools just to know where to start. There are proven benefits for humankind from text and data mining, such as the discovery of new treatments for Alzheimer’s. So we are considering how to advance UK capability in data mining in the light of the recommendations on intellectual property from Ian Hargreaves. Again we recognise that everyone adds some value here and we need to move forward in a way that shares the gains from text mining between everyone who contributes to it.
I know that publishers share an appreciation of the potential of these technologies, and that some of you are supporting developers in creating new applications. I would like to thank the publishers who have engaged with the Hargreaves process. The Government wants to see an environment which enables researchers to use datasets from a number of different publishers without undue costs or obstacles - and without undermining research publishing.
Finally let me update you on how other Government agencies are approaching the challenges ahead. They too will wait for Janet Finch’s report - there will be no final decisions until then. Recently, RCUK set out some ideas in draft and caused some excitement around the world. Actually I take some encouragement from this as it shows we can lead the debate and do have the capacity to shape international discussions. Like all of us in Government the Research Councils recognise that we need academic publishing as part of the research process. To enable greater public access to Research Council-funded research information and simplify networking between researchers and SMEs, the Councils are now investing £2 million in the development of a UK “Gateway to Research” portal. Set to open next year, the gateway will enable users to establish who has received funding and for what research. It will provide direct links to actual research outputs such as data sets and publications. They are already working to ensure information is presented in a readily reusable form, using common formats and open standards. I am delighted that Jimmy Wales of Wikipedia will be advising us on these common standards and helping to make sure that the new government-funded portal for accessing research really promotes collaboration.
HEFCE is also considering the issue. Peer review and assessment of impact are crucial to their allocation of research funding. The debate on open access will inform HEFCE’s planning for the research excellence process that succeeds the current one which concludes in 2014. Open access could be among the excellence criteria for qualifying articles in the future. We are developing a coherent strategy, with the decisions reached by the Research Councils and HEFCE meshing with the Hargreaves proposals.
I hope this has given you some sense of the approach we have taken in Government. Delivering transparency and access while protecting legitimate business models is an important challenge, and we are tackling it. When we came to Government we immediately set up Hargreaves. We have accepted his report and are acting on it. Then we had the roundtable on open access after which I set up the Finch group. We explained our overall approach in the Research and Innovation Strategy. Government, the Research Councils and HEFCE have all been working towards a common policy, and we have tried to maintain an open dialogue with the publishing industry throughout. There is a lot going on.
Although we are all aware of the tricky issues around moving to open access, we must not lose sight of the big prize here. Open access is not an end in itself; it is a means to an end. That end is improved popular involvement with the quite extraordinary output of our research community. I do not want to see science and humanities further removed from the so-called general reader. I was very struck by comments from Alice Bell of Imperial in the Times Higher that this could be a real opportunity to break down some of the barriers between academic professionals and the wider community. That is happening already with the surge of high quality popular science writing. I see it as well in the flourishing of science festivals and literary festivals. We welcome this. I hope that greater public access to research can be part of this very healthy trend in our culture. At the end of this we can hope to see science and research brought closer to the wider public.