Category Archives: #OpenKnowledge

From Big Data to Insight

The Institute’s Communications Assistant Hanna Heiskanen blogs about a recent event on Big Data.

The How We Prepare for a Future of Big Data? event held at the Finnish Ambassador’s Residence in London on 30 October gathered together a prestigious panel of big data experts as well as a knowledgeable and active audience. The event celebrated the recipient of this year’s Millennium Prize, Professor Stuart Parkin, whose innovations have played a large part in raising big data to the prominent position it has today.

The 1 million euro Millennium Prize has been awarded every other year since 2004 by The Technology Academy Finland. The Technology Academy Finland incorporates the Finnish Academy of Technology, the Swedish Academy of Engineering Sciences in Finland and the Industry Council, which represents leading Finnish companies. The Millennium Prize is funded by the Finnish state, and its recipients are innovators who have significantly improved people’s quality of life. Past recipients include Tim Berners-Lee, the inventor of the World Wide Web, and Linus Torvalds, creator of the Linux kernel.

Professor Parkin’s work centres on the field of spintronics and has led to technological discoveries that have dramatically increased the storage capacity of magnetic disk drives. This in turn has allowed for the evolution of large data centres, cloud services, and other applications that require the processing of large amounts of data. Despite an exponential increase in the appetite for Big Data, much of the discussion around it has focused on the technical aspects of storing and processing data. The issues around interpreting and taking advantage of Big Data remain large and significant grey areas, which was also reflected in the discussion.

According to Professor Parkin, we are approaching the end of a technological era in terms of how data is stored and processed, and financial investment in the science that will allow for the development of these facilities must be increased. New approaches to storing and processing big data are emerging and carry huge potential – examples include storing data in three dimensions rather than two, as well as Parkin’s research subject, spintronics, which could increase storage capacity a hundredfold. The potential problem of increased carbon emissions, the by-product of large computers and data centres, could be solved by building more energy-efficient computers, or by handling data locally, with people carrying computing power directly on them, as envisioned by the President and CEO of Technology Academy Finland, Dr Juha Ylä-Jääski.

The growing need to interpret and understand data before it can be applied came across strongly in the panel discussion. While technology is making more data available for use and is becoming better at interpreting it, Francine Bennett, CEO and cofounder of the Big Data specialists Mastodon C, pointed out that governments are only just starting to think about possible uses for the large amount of data they already hold. Much of the data remains unstructured and thus unusable. Indeed, Professor Parkin called for replacing the word ‘data’, which by itself might be useless, with ‘knowledge’. Edwina Dunn, Global CEO of the Social DNA business Starcount and developer of Tesco’s Clubcard, observed that much of the processing of Big Data is currently simply counting, echoing the demand for more insight to be extracted from it and applied to practical use.

Cross-science collaboration was mentioned as possibly offering great benefits in making sense of Big Data. The humanities in particular could be very helpful in creating context for raw data. Both StJohn Deakins, founder and CEO of citizenme, and Dr Ylä-Jääski argued that much data needs to be layered or combined across data sets to give a better picture of what it is about. Keeping the human element in mind is crucial in two respects: in understanding that, despite all the available data, people are not machines and their behaviour is therefore difficult to predict, and in making data accessible and understandable for people, for which visualising data might prove useful.

The gap between what can be done with Big Data and what should be done with it remains, in the words of Edwina Dunn, a form of art and is a particularly difficult issue from a legislative point of view. According to StJohn Deakins, the key both to more accurate and insightful data and to people’s willingness to share their data lies in creating a reciprocal relationship between the individual and the party holding the data. Dunn argued in the same vein that instead of collecting data behind people’s backs, businesses and organisations should aim to build a relationship of transparency and trust with them. Her rule of thumb for the use of Big Data is that it must always benefit the individual, too, not just the party holding it. If done right, such a two-way relationship could bring phenomenal leaps forward. Dunn praised the Finnish tax system, which combines data from tax officials, banks, and individuals, as an example of a successful data sharing relationship. Professor Parkin, on the other hand, brought up Facebook as an example of how people are willing to give up a very large amount of data about themselves as long as they feel they benefit from sharing it. He also suggested that people would feel more involved in society were they aware that revealing more data about themselves would be of general benefit.

The danger of breaches of privacy predictably emerged as one of the main threats associated with using Big Data. Edwina Dunn remarked that as the customer is the ultimate judge of a company’s actions, brands value the element of trust most highly. Misuse of data can at worst lead to the removal of the permission to use it. While people should be aware of the data they are giving away and take precautions, for example by encrypting it, the main responsibility for the security of data should still lie with the businesses and organisations holding and using it, StJohn Deakins said. This highlights the importance of building a relationship of trust, grounded in reality, between the individual and society. The misinterpretation of data was seen as another threat, potentially leading to wrong conclusions and to action taken on the basis of them.

All in all, however, the panel was optimistic about the future of Big Data. Professor Parkin encouraged taking more advantage of scientific open source platforms, which would help accelerate the pace of innovation. He also argued that many of the future benefits or developments of Big Data would be impossible to foresee, just as it would have been impossible 30 years ago to envision the technologies we use today. Parkin nevertheless predicted that education would take huge leaps forward, with more people gaining access to education online, and education becoming more individualised. Francine Bennett identified the utilities industry as a potential field for Big Data. She also recommended combining government data across silos more often than is done at present. Both Edwina Dunn and StJohn Deakins said that the advertising industry would benefit greatly from Big Data by gaining more insight into people, and therefore becoming more relevant to them. In fact, Deakins argued that in the future advertising as such might disappear completely and be replaced by providing information that is of benefit to the individual.

The concept of MyData, introduced at the first Open Finland seminar in September, could offer new tools for tackling some of the issues brought up in the discussion, in particular privacy and creating a mutually beneficial relationship of data sharing. It also aims to highlight the active role of citizens in taking advantage of the data that is being collected about them. The report on MyData, which was commissioned by the Finnish Ministry of Transport and Communications and produced by Open Knowledge Finland, is available online.

Wikimania 2014 Highlights

Sampo Viiri from the Finnish Institute blogs about Wikimania, the annual event of the Wikimedia movement, which was held in London this year.
Wikimania, the official annual event of the Wikimedia movement, took place in London on 8–10 August 2014. The event was an interesting mix of a conference and a festival, including over 200 speakers in 8 simultaneous spaces inside the Barbican Centre, with fringe events and hackathons running during the event and preceding it. Most participants seemed to be active Wikipedia editors, as Wikipedia is obviously the best-known Wikimedia product. The British have been active Wikipedians, the UK producing about 20% of all English language Wikipedia articles.
Wikipedia is now the world’s sixth most visited website and British people trust it more than BBC News. Despite the success of Wikipedia, the project faces a number of challenges. The majority of Wikipedians are tech-savvy western men, which may introduce a bias in the selection and style of articles, despite an ethos of neutrality. Wikimedia has so far “failed miserably” in fixing the gender imbalance, which requires new measures.
The English language encyclopedia also thrives compared to those in smaller languages. During the conference I met Finnish Wikimedia representatives and activists. It seems that, in line with the overall trend, Finnish Wikipedia is also struggling with declining editor numbers. Nevertheless, it was very interesting to hear about the development of different local Wiki projects, and for example about the slightly bizarre situation between Wikimedia and the Finnish fundraising law and its interpretation.
Image by: Sampo Viiri, CC BY-SA 4.0
I admit that I’m not (yet) an active Wikipedia editor, and was mainly interested in the numerous sessions on open data and open scholarship, especially from a cultural heritage sector perspective. Wikimania demonstrated that Wikimedia projects are very much connected with the open knowledge ethos, manifested for example by the strong presence of Open Knowledge (Foundation) and the Open Data Institute. There were lots of relevant sessions and a whole programme track about the GLAM (galleries, libraries, archives, museums) sector. The sessions demonstrated how open cultural heritage data can be used to improve Wiki projects and also how active Wikimedians can be a huge asset for these organisations.
The Wikimedian in Residence programme is particularly interesting. Participants in the programme dedicate time to working in-house at an organisation, usually financially compensated by the institution or by a Wikimedia chapter. Besides editing Wikipedia, they enable the host organisation to continue a productive relationship with the encyclopedia and its community after the residency. The Wikimedian in Residence model was first piloted by the GLAM initiative of Wikimedia, but has since been adopted by other types of organisations too. Wikimedia UK has recently released a review of the programme and a video documentary about the project.
Another noteworthy GLAM project was the German project Coding da Vinci, where the local Wikimedia chapter teamed up with Open Knowledge Foundation Deutschland, digiS – Service Center Digitization Berlin and Deutsche Digitale Bibliothek, organising a hackathon to make innovative use of open cultural data. The results included for example an iOS app that visualises the historical development of Berlin as an interactive map.
Wikimedian in Residence is quite similar to the Finnish Institute’s Mobius programme, a fellowship program for museum and archive professionals in the UK, Ireland and Finland. As open data and knowledge are some of the Institute’s focus areas, perhaps in the future there could be collaboration for open data fellowships too.
Some other interesting projects on show at Wikimania were Wikimedia’s own Wikidata and the Wikipedia Library, and the UK based projects ContentMine and Histropedia. It has to be noted that these are just some highlights from a variety of interesting initiatives.

Wikidata is a blooming project, a free centralised knowledge base for structured data that can be read and edited by humans and machines alike. While Wikimedia Commons is for storing images, sound, video and other media files, Wikidata is a similar service for machine-readable data. Wikidata could benefit all kinds of projects but in my opinion one important possible use would be to become a repository for academic researchers who need infrastructure for storing their digital research data.

In the United Kingdom, copyright law was altered in June 2014 so that it allows non-commercial researchers to legally copy material for text and data mining analysis. ContentMine is a new initiative that exploits this new possibility. It uses machine reading to liberate 100 million facts from the scientific literature and make them free for everyone in Wikidata. This can enable new and exciting research, technology developments in areas such as artificial intelligence, and opportunities for wealth creation.

Histropedia is a new UK based project that made its public debut at the conference. It is an interactive tool for displaying historical events and is described as “a combination of maps, timelines, and trends”. The site pulls data from Wikidata and Wikipedia, plotting events on a timeline which is navigated with a simple interface. (A demonstration is available on the Guardian website.)

Open publishing is an important topic for Wikimedia and the whole academic community. The publishing industry naturally wants to make money in its playground, and can be a bulwark against reform. Jonathan Gray from Open Knowledge argued that innovation in open publishing needs to come from the researchers, not publishers. Wikimedia has the Wikipedia Library project to fund access to paywalled databases for active Wikipedians. Researchers could also be more active in editing Wikipedia, as the American historian Stephen W. Campbell has argued.
Open data and open knowledge may sound like an ideal state of affairs, but the Wikimania conference was a reminder that these are controversial concepts that are constantly being fought over. Both Jimmy Wales, the co-founder of Wikipedia, and Lila Tretikov, the new executive director of the Wikimedia Foundation, made a strong argument against the new “right to be forgotten” EU legislation that gives people the power to remove irrelevant or outdated information about themselves. They argued that the legislation is immoral and would lead to censorship and the appearance of “memory holes” on the Internet. Quoting Wales: “History is a human right and one of the worst things that a person can do is attempt to use force to silence another.” This is certainly an interesting topic that provokes different opinions, as Jimmy Wales’s interview on BBC Newsnight perfectly demonstrates.
Participants gathered on the Barbican courtyard. Adam Novak CC BY-SA 4.0

Some Reflections from the Open Knowledge Festival 2014


Antti Halonen, the Finnish Institute’s head of society programme, blogs about the recent Open Knowledge Festival and Institute’s future plans.
The Finnish Institute in London were privileged to organise the inaugural Open Knowledge Festival together with Open Knowledge Foundation (recently re-branded as Open Knowledge) and Aalto University in Helsinki two years ago. It was therefore both important and extremely interesting to attend this year’s edition in Berlin and witness how both the festival concept and the international open knowledge community had evolved in two years.
And evolved they have. In only a couple of years the Open Knowledge network has expanded into 56 different countries, the Finnish chapter Open Knowledge Finland amongst them. In my mind the work of Open Knowledge has always been based on pragmatism, intellectual honesty and a striving for collective action, which is likely to be the key to this recent success: merely pointing at bad things and saying how bad they are is not a sustainable way of achieving any positive change.
The ethos of sharing was as prominent as ever throughout the festival programme. “The more you share ideas – the more others can build on them”, was the message given by Neelie Kroes, Vice-president of the European Commission, in her keynote speech. In this spirit, we will use this blog as a platform to share a couple of ideas we’ve been contemplating of late.

Firstly, the Finnish Institute is starting a project which looks into the visibility and impact of contemporary art in society in Finland, the UK and Ireland. More about this project will appear on this blog later. Presumably there would be plenty of possibilities for intertwining this project with the work we’ve done on open knowledge.

The Finnish Institute’s work on open knowledge dates back to 2011, when we compiled a report on the development of open data policies in the UK and subsequently started to promote the subject in the Finnish societal discussion. Recently our focus has gradually shifted towards the role of openness in cultural sector organisations, such as galleries, libraries, archives and museums. In the world of Open, they are collectively known by the acronym GLAM.

OpenGLAM offers intriguing opportunities for a cultural institute like the Finnish Institute. This is partly because our mission is to apply methods of social sciences and arts in order to identify emerging issues in contemporary societies and take thinking about social challenges and cultural practices in new, positive directions. We recognise the immense value of cultural data that may still lie behind barriers of accessibility and understandability, and we work to raise awareness of the importance of the public domain. In this regard OpenGLAM offers huge potential both for enriching the arts and culture sector and for addressing key societal questions, such as education (museum pedagogy) and quality of decision-making (access to archival material).

According to the festival session Maintaining a healthy and thriving public domain – exploring the notion of originality and copyright when digitising analogue works, there is an increasing need to encourage culture sector organisations to release those contents that should legally be in the public domain under actual public domain licenses. This does not always happen: many organisations apply restrictive licenses to contents that should be placed in the public domain, which causes both confusion and, to some extent, frustration for open knowledge practitioners.

One suggestion for how to encourage GLAMs was to create a rating system similar to the five-star open data model. However, it is worth asking whether a rating system would in fact discourage culture sector organisations from releasing their contents, as they would be afraid of getting bad results despite a genuine will to be open. Arguably it would be better for the organisation to have no rating at all than to have one star out of five.

Therefore, it seems that there is a demand for new methodologies for evaluating the value of the public domain for society at large and, most importantly, for GLAMs themselves. In this work we could potentially apply both the existing work of Open Knowledge and their OpenGLAM Benchmark Survey and, methodologically, our own upcoming research on the significance of modern art. Having said that, we would also like to know if there already is a widely accepted method of evaluating the qualitative and quantitative value of the public domain for culture sector organisations, or if such an evaluation is not considered necessary.

Secondly, after several discussions with the delightfully plentiful array of Finnish contacts at Open Knowledge Festival, it emerged that there would be a real demand for:

a) strengthening the international ties in the field of open data and open knowledge research

b) giving young Finnish open data researchers / practitioners an opportunity to work for a short while in the UK, which is recognised as one of the leading European countries in the field of open data.

There is an intriguing opportunity to look into possibilities of creating an open data fellowship programme that would possibly intertwine with our existing fellowship programme for museum and archives sector professionals. The Mobius fellowship offers Finnish, British and Irish museum and archives professionals an opportunity to spend a three-month period in an international partner organisation. For a young Finnish open data practitioner, for instance, it might be useful to be able to spend a couple of months in the UK and to work within the British open data community.

These are not finalised programme plans, but merely ideas for what the Finnish Institute might do in the future regarding open knowledge. In the name of openness, if you have any comments, suggestions or ideas, we’d be happy to receive them. Similarly, we’d be delighted should you wish to start a project of your own based on these ideas.

Digital Humanities and Future Archives: Upcoming Survey by the Finnish Institute

Sampo Viiri from the Finnish Institute blogs about the Institute’s upcoming survey on Digital Humanities.

Digital Humanities has been a buzzword of the humanities field in the last few years. By including digital methods, traditional humanistic research may ask genuinely new questions and transform the whole field of study. Digital Humanities (abbreviated DH) has even been labelled as a saviour of a field where tightening budgets and limited research funding leave scholars in desperate need to demonstrate their value to societies.[1] It may increase the humanities’ value for societies and also involve mass participation by the public in projects traditionally done by lone scholars. The hype around Digital Humanities has led to a certain new interest in the humanities but has also created some confusion, as everybody may want to label their research as Digital Humanities simply to get research grants.
So what really is Digital Humanities? Articles and even whole books have been written on the definition and history of Digital Humanities[2]. The simplest broad definition would be that Digital Humanities involves the use of digital tools in research, teaching, and scholarship in humanities disciplines. This is however not enough. Nowadays every humanities scholar uses computers in word processing, reading texts, finding references or communication. That still doesn’t qualify as Digital Humanities. “Digital humanities is what digital humanists do.”[3] This quote describes perfectly the confusion and frustration in defining digital humanities.
One common thing about various Digital Humanities projects is building things. Whereas traditional humanities scholarship usually outputs a text, in Digital Humanities the output can be a database or some other piece of digital infrastructure. If you need to build things to be a Digital Humanist, do you also need to code? That is another question which has raised different opinions.[4]
The ‘humanities’ part of the concept is also controversial. Digital Humanities seeks to cross and redefine the borderlines among the humanities, the social sciences, the arts, and the natural sciences. DH projects are usually collaborative and in many cases also cross-disciplinary. By inventing new forms of inquiry, DH may expand the scope and quality of research and reach new audiences for humanities studies.


Digital Humanities has a history of many decades, but the methods and objectives have changed a lot with the rapid technological progress of recent years. The internet itself is also constantly evolving, and as the speed of computing increases and prices decrease, new research possibilities open up. A decade ago Digital Humanities was still largely about digital texts; nowadays sound, images and video have been incorporated more fully into the field.


The main objectives of the Finnish Institute’s survey are to find out:
  • What kind of research has been done in the UK and Ireland on Digital Humanities and how is the discipline likely to evolve?
  • What is the status quo in Finland and what kind of practices should be brought in from the UK?
  • Focusing on the fields of history and the archives, how could the study of social sciences be supported via the methods of Digital Humanities?
  • In more precise terms, what could the future archives be like and how can disciplines such as Digital Humanities facilitate innovative archival practices?
The Digital Humanities field is constantly on the move and there is a steady flow of blog posts and tweets. The hype around Digital Humanities is so strong that it’s important to adopt a critical view of the research projects. We try to collect some information on the societal impact of Digital Humanities projects and on their costs and benefits. By reviewing some of the challenges we can improve the quality of future projects and avoid some easy pitfalls.
This survey is based both on existing literature and web discussions about digital humanities, as well as interviews with professionals associated with the field. The interviews contribute the most contemporary material, which is really useful in the field of DH where a book published a few years ago may feel terribly outdated.
Researchers, the government, the GLAM sector and the broad public view the field from different standpoints. We aim to find good projects for each of them. The survey tries to answer some existing questions as well as to pose new questions and inspire discussion.
Should you wish to take part in the survey or should you know an innovative DH project we ought to know about, please contact us @SampoViiri or

[1] A very optimistic view can be found for example in the introduction of the handbook Digital_Humanities (2012).
[2] Defining Digital Humanities. A Reader (2013). Terras, Melissa, Nyhan, Julianne & Vanhoutte, Edward (eds.) Farnham: Ashgate. Day of DH also asks this each year from new participants.
[3] Quote by Rafael Alvarado. Reprinted in Debates in the Digital Humanities (2012).
[4] Gold, Matthew (2012). The Digital Humanities Moment. In Debates in the Digital Humanities (2012).


Opening Cultural Data: Inspirations from Britain and Finland

Sampo Viiri blogs about the Finnish Open Cultural Data course’s study trip to London 15–16 May 2014.

The GLAM organisations (galleries, libraries, archives, museums) have in their collections enormous amounts of cultural data that have so far been accessible mainly in physical format. Nowadays the Internet makes spreading knowledge digitally much easier. The Open Cultural Data course, organised by Open Knowledge Finland, aims to provide tools and skills for the Finnish GLAM professionals on how to digitally open their organisations’ cultural data to broader audiences. The Finnish Institute organised workshops in London 15–16 May as part of the course that has been held in Finland this spring.
Thursday 15 May
The group gathered at Mozilla’s London office where Paula Le Dieu and William Duyck described the activities in which Mozilla UK participates. Mozilla is a non-profit organisation heavily based on volunteer participation. The ideas of open source and openness are at the core of all Mozilla activity. Mozilla’s main product, the Firefox browser, has championed open source principles and nowadays all the main browsers embrace certain openness standards. One of the reasons why the US based Mozilla decided to open their London location was that the principle of the open web was already firmly rooted in Britain. For example the BBC has a public role of making content available to everyone. The government, creatives and the GLAM sector were all enthusiastic about Mozilla’s ideas of openness, so the UK was a fitting base to build open source tools.
We discussed the main obstacles for GLAMs to “push their archives through the door”. Opening data requires new skills and also new attitudes, and sometimes there is a fear that opening data leads to lost profits even though the reality can be quite the opposite. Content is also only valuable when combined with expertise. This is why there needs to be good communication between the GLAM professionals and the technology infrastructure providers.
One of the initial reasons to kick off Mozilla was that by the end of the nineties, the basis of the Internet had changed. For the early geeks the web had been inherently interesting and exciting. They had gotten used to poking things and learned how to “make the web”. For the second wave of Internet users the web was just a thing to consume, which required a different kind of approach.
Mozilla’s Webmaker project aims to make the general public participants once again in “making the web” by providing easy-to-use tools. William Duyck presented Webmaker and various tools that Mozilla has developed for the public. This is highly relevant for the GLAM sector and cultural data as well. By embracing the general public and inviting them to participate, the GLAMs could open their data at lower cost than by doing everything themselves.
Mark Hedges and Stuart Dunn from King’s College London’s Department of Digital Humanities continued on the community participation theme. They have done research on different crowdsourcing projects in the field of humanities. According to them the GLAM sector’s engagement in digital methods has typically led to an enhanced experience in exhibitions and added something extra to the collections. However, recently crowdsourced methods have also contributed to creating genuinely new knowledge and interpretations of the material.
Mark and Stuart showed several interesting case examples where crowdsourcing has produced, for example, more detailed collection and object metadata, retroactively corrected OCR (optical character recognition) texts of archival material, the narratives of social media, and creative input from the visiting public itself. One of their findings has been that most high profile and successful crowdsourcing projects have been coordinated by the GLAM sector, not by universities. Galleries, archives and museums have always been public facing, meaning that their engagement in crowdsourcing would be quite natural.
Last but definitely not least, Adam Green from the Public Domain Review introduced the benefits of public domain for the GLAM sector. The Public Domain Review is an online journal and not-for-profit project dedicated to promoting and celebrating the public domain in all its variety. Adam showed some interesting material he had collected on his website and different online projects based on non-copyrighted material. One of the best aspects of public domain is the possibility for the public to recreate and remix the material in whichever way they choose. This can lead to a feeling of collectivity and participation. We discussed the different possibilities and problems with copyrighted works and pondered how public domain material could also be used to create profits. The group concluded the day by making some GIF animations from stereographic images.
Friday 16 May
Friday’s sessions were held at the British Library, where Nora McGregor, Anna Vernon and James W. Baker from the library’s Digital Research team enlightened us about their interesting work with digital collections. The Digital Research team works with already digitised material, inventing new ways to use the library’s vast collections. Perhaps their best-known project is the British Library’s online images on Flickr, where the library has published millions of pictures from their scanned books. Nora described the quite ambitious move when they decided to publish the pictures without knowing what might lurk in the depths of the collections. Sometimes you need a certain bold attitude to open cultural data. Anna Vernon introduced us to the different copyright aspects that need to be dealt with when publishing the collections online. The rules vary by country but the common EU legislation also opens possibilities for collaboration.
One important topic raised in the discussion is that because Finland is a small country and it doesn’t have big organisations like the British Library, we need collaboration in the GLAM sector to fully use the potential of digital collections and data. Hopefully this spring’s course will lead to new ways of collaboration in Finland and maybe also with other countries’ organisations.
Lastly James W. Baker held a workshop where the group invented new ways of using digital tools on imaginary collections. As James said, it is easy to get obsessed with data but what really matters is the research and how to use the data. This is where I believe the GLAM professionals step in. When the professional staff and also the researchers develop new digital tools and skills together, there are huge possibilities in humanities research. This is a topic that the Finnish Institute in London will also continue to look into this year by conducting a survey on new trends in digital humanities in the United Kingdom, Ireland and Finland.
Overall, the two days were full of excellent presentations and the atmosphere was very enthusiastic. Personally, my head was buzzing with new ideas and knowledge and my notes full of scribbled remarks and future questions. Always a good sign.
More information (mostly in Finnish):
Twitter hashtags: #kulttuuridata #datakoulu
