Marjorie Buchser
Ladies and gentlemen, thank you very much for joining tonight's webinar on artificial intelligence, big data and the current health crisis. So, this is very much a multidisciplinary topic and, as such, this event is hosted in collaboration with the Chatham House Digital Society Initiative, as well as the Chatham House Centre for Universal Health. My colleague, Claire Munoz Parry, who is the Assistant Head of the Centre, is going to co-moderate this event with me and also jump in in case my connection drops, which is always a possibility in the case of a webinar.
So, tonight, we're very fortunate to have an esteemed international panel and it's my privilege to welcome tonight David Aanensen, who is the Director of the Centre for Genomic Pathogen Surveillance at the University of Oxford; Marietje Schaake, who is the Policy Director of the Cyber Policy Center at Stanford University; and Stefaan Verhulst, who is the Co-Founder of New York University's Governance Lab.
So, as a result of COVID-19, a new debate has emerged on whether privacy-preserving regulations and other ethical standards should be set aside during the emergency to enable more efficient responses. In many parts of the world, the healthcare crisis has accelerated and fast-forwarded the deployment and adoption of surveillance technology to monitor infection rates, to manage the pandemic and also to enforce lockdown measures.
Now, while it's the nature of emergencies to fast-forward historical processes, it's also essential to ensure that the accelerated digitalisation we're witnessing today is not misused and is deployed for the benefit of all. And so, tonight our esteemed panel will discuss technology-driven solutions against COVID-19, but also look at the ethical and legal issues that arise when governments fast-forward technology solutions in the case of emergencies and pandemics.
We want to start this discussion, however, with an interactive element and a quick poll to understand what you think about this issue. The poll is based on the infographic that you should soon see on your screens, so, Myer, if you can put up the infographic. The poll basically asks you: what do you think are the ethical and legal issues regarding the deployment of digital technology? Do you think it's transparency? Do you think it's digital access? Do you think it's consent? We're going to keep the poll up for a few moments so you can vote on it; it's going to help inform the debate, and our panellists are going to take your perspective into account and do their best to address your worries. Unfortunately, panellists cannot vote, so that's your call alone. Just to recap for the panellists: I believe that consent is definitely one of the main concerns of our participants today, followed closely by transparency, accountability and the re-purposing of information and data. It's quite balanced, but I would say these are the four top questions that I invite you to consider in your remarks.
And, Stefaan, I'm going to start with you, because you really are at the intersection of innovation, governance and technology and, obviously, during the pandemic we've seen the real need to collaborate and share information, healthcare information especially. So, could you tell us more about the situation when it comes to data collaboratives and what initiatives and efforts you've seen in this space recently?
Stefaan Verhulst
Great, thanks so much for having me and thanks for the question. Indeed, COVID-19 is not only a human tragedy, it's also a moment of profound uncertainty and, as a result, anxiety. We are uncertain with regard to the real scale of the pandemic. We are uncertain with regard to the options moving forward, especially with regard to, for instance, opening up the economy. We are uncertain as to the impact of, for instance, remote education, and so on. So, we see a profound uncertainty because we are entering territory that we haven't experienced before, and that's where data and artificial intelligence come in. To a large extent, and that's also what the GovLab has been focusing on for the last few years, data is a tool to provide options that can limit uncertainty with regard to decision-making, both at the government level and at the individual level: knowing what to do under uncertain conditions should be data-driven.
Now, data can be used to inform the situation and we have seen a lot of that happening; there is no shortage of data visualisations at the moment. But, more importantly, data can also provide a diagnostic of what makes a difference: what's the relationship between X and Y, in order to understand the root causes of certain kinds of behaviour and certain kinds of events? And then, more importantly, it can also allow us to predict what might happen moving forward, and we have seen a few of those predictive models emerging as well, which are obviously very important when you have to make decisions with an uncertain future. And then, lastly, the fourth value proposition that we see with regard to the use of data, as it relates to COVID but also more generically to decision-making: data can, of course, be used for impact assessment, and I think what is really clear, from this pandemic and the way we respond to it, is that the response has to be iterative.
It's almost like an ongoing experimentation, on a daily basis, of what works and what does not work, and in order to assess what works in an iterative manner, we do need a data infrastructure. That's where, from my point of view, one of the tragedies emerges, which is also why we have issued a call for action. Because despite the fact that, for the last 15 years, we have heralded the arrival of big data and the fact that we are in a so-called data age, we somehow have not managed to connect the vast amount of data that is being collected and archived somewhere with the demand side, in order to become smarter about the pandemic and about the options moving forward.
And so, from that perspective, we can see that the fact that we have no real framework to reuse data in a responsible and ethical manner is, to a large extent, itself an ethical challenge: what are the ethics of not being able to reuse data? And, towards that end, we've done a lot of work trying to figure out the challenges for the reuse of data and for data collaboration between distributed data holders, which to a large extent involves both the public and private sector, and the demand side that can act upon the insight, if we could find a way to unlock those data assets. And so, that's where we have focused on what we call data collaboratives, because obviously this is not just about dumping data and opening data, which could be harmful for individuals' rights and even harmful with regard to how it is being used. This is really about a new partnership.
Now, it turns out that establishing those partnerships is not easy and that there is a whole range of transaction costs that we still have in 2020 and that we really need to address moving forward, for COVID-19, but also for future pandemics and other areas of decision-making. And so, we have identified seven areas where there are still a lot of transaction costs and, especially, barriers to establishing these new kinds of data collaboratives between the data holders and those who can either analyse the data and derive insights, or act upon the insights. So, I'm going to briefly list those seven barriers, where we believe we need to make rapid progress in order to provide for responsible and ethical data collaboration, which is exactly what we need in the current pandemic.
The first one, obviously, is a pretty straightforward one: we don't really have an adequate governance framework to understand the trade-offs, and the costs and benefits, of a particular kind of data collaborative. The big challenge with data is that it's mostly contextual, so you could, indeed, provide consent for one kind of use, but then when that use changes, the context changes, and you do need some kind of governance framework, or an institution, such as an ethics council, for instance, that can really understand how to move forward in the best possible way. We don't have that, from a societal point of view, and we need to make progress there. That also means we need to make progress with regard to data sharing agreements, because a lot of this is basically contractual and not governed by, for instance, a national, let alone global, policy framework.
The second challenge, very quickly, is that not only do we not have a well-defined governance framework, we quite often don't have a well-defined demand for data. One of the tragedies was that, following the emergence of COVID-19, we got asked a lot by data holders, whether telecom operators, social media operators, or retail companies, who wanted to engage with the demand side, but the demand side was not ready to engage with the supply side of data. And so, we need to invest in building capacity among policymakers on the questions that matter at the different stages of a pandemic. We really need to understand the priority questions, as opposed to what we see happening now, which is "give us the data and we will figure it out", which turns out to be ethically challenging and also not really effective moving forward. So, we need to invest in the demand side.
Thirdly, we also need to invest in organising the supply side. We get asked all the time how to access data, especially within corporations: not only what the governance challenges are, but who should we talk to? There is no profession at the moment, within the public and the private sector, that is dedicated to data stewardship. So, we don't have individuals with a mandate to engage with the public sector, for instance, around data, and, as a result, it's a massive unknown who you should actually engage with and who, by the way, has decision power with regard to opening up data assets. And so, we need Data Stewards, rapidly established as a profession. And then, fourthly, we need to connect them in some kind of network or association that can provide a code of conduct for those Data Stewards, because it's not only ethical frameworks we need; we also need ethical professions and professional codes that can inform the best way forward.
Fifthly, all of this, of course, needs to happen by engaging people. We actually, as a society, have not had a public conversation about the reuse of data. We've had a lot of conversation about data collection, and I feel a lot of the current discussion is still focused on data collection, for contact tracing and otherwise. But the reuse of data is really something that we, as a society, need to get right, and people need to be part of that discussion. Sixthly, we also need to innovate from a technological point of view, especially as it relates to, for instance, de-identification and new ways of providing anonymised and aggregated access.
And then lastly, and then I'm going to stop: all of this, of course, is not free and it's especially not cheap. So, as a society, if we think about stimulus, we also need to embed funds to allow a data infrastructure to emerge, and we need new kinds of funding mechanisms to do so. So, those are the seven things we believe are barriers, but also seven areas where, as a society, we need to make progress towards ethical data collaboration.
Marjorie Buchser
Thank you, Stefaan. So, that was a good overview of the structure of the challenge when it comes to data sharing. David, I'm going to turn to you now, because you are more on the deep end of technology-driven solutions. You look at genomics and, in the context of COVID, that really has been a gamechanger, also because China released the sequence of the virus quite quickly. So, could you tell us a little bit more about how your team is using big data and AI to create suggestions and recommendations for interventions at the national or international level?
David Aanensen
Great, thank you. So, firstly, thanks for the invitation. My name's David Aanensen. I work at the Centre for Genomic Pathogen Surveillance, which is based at the Big Data Institute in Oxford and also the Wellcome Genome Campus, and we have a focus on using genomic methods for pathogen surveillance. So, we really have two strands to our work. One is what we can use pathogen genomes for, so that we can understand pathogen dynamics, surveillance and analytics; and the other is integrated data and technology: rapid data sharing, accessible tools, platforms and technology to move towards interpreting this information as, for example, insight into drug and vaccine development and real-time analysis. Our real aim is real-time epidemiology.
So, I have to apologise for the next couple of slides; they're really back to basics. This is just a slide that indicates that, within our species, there are lots of different kinds of subtypes: different hair colour, different skin colour, different eye colour, different traits, depending on who you ask, but we're largely determined by our genomes, and this is the same for pathogens. So, one particular pathogen species is not a homogeneous entity. It exists as different subsets, or strains, or lineages, and one of the things we can use genomics to do is to try and identify which things are more related to each other and unrelated to other things. And if we combine that with lots of other information, we can interpret the dynamics of the pathogen that we're interested in, and this can be done at lots of different data scales. So, for example, in a hospital or healthcare setting, we might want to know who's infecting whom, or whether there's an outbreak of a food-borne pathogen, etc. And then, can we scale that up nationally, between institutions, or in the community? Internationally, to the European level, and I appreciate the irony of the UK still being within the European slide, but we'll move on from that. And then, ultimately, globally: can we understand this information globally and how things spread?
So, there's a huge amount of information that we need to bring together on that side, and with genomics, what we do is sample a bunch of pathogens at whichever level, run them through a genome sequencer, get the ACTGs out, and then represent the similarity between all of those samples using a classic, family-tree-based way of describing it. We can then map other information onto the things that are more similar to each other, to address questions at any of those levels, and this gives us an operational unit of surveillance, which is the similarity between any two genomes: from person A and person B, institute A and institute B, etc. And one classic example, in antimicrobial resistance, is that if you sequence a bacterial pathogen, you can potentially predict what it's resistant to from the presence or absence of genes.
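The operational unit David describes, pairwise similarity between genomes, plus a genotype-to-phenotype rule for resistance, can be sketched in a few lines. The sequences, sample names and resistance genes below are invented for illustration; real pipelines align whole genomes and use curated resistance databases, not a hard-coded rule table.

```python
# Minimal sketch: pairwise SNP distance between short, already-aligned genome
# fragments, plus a toy resistance prediction from gene presence/absence.
# All sample names, sequences and gene-to-drug rules here are hypothetical.

def snp_distance(a: str, b: str) -> int:
    """Count positions where two equal-length aligned sequences differ."""
    assert len(a) == len(b), "sequences must be aligned to the same length"
    return sum(1 for x, y in zip(a, b) if x != y)

samples = {
    "patient_A": "ACTGACTGAC",
    "patient_B": "ACTGACTTAC",   # one SNP away from patient_A
    "patient_C": "GCTGTCTTAC",   # more distant
}

# Pairwise distance matrix: the 'operational unit of surveillance'
names = sorted(samples)
dist = {(i, j): snp_distance(samples[i], samples[j]) for i in names for j in names}

# Toy genotype-to-phenotype rule: presence of a known resistance gene
# implies predicted resistance to the corresponding drug class.
resistance_genes = {"patient_A": {"blaKPC"}, "patient_B": set(), "patient_C": {"mecA"}}
RULES = {"blaKPC": "carbapenem", "mecA": "methicillin"}

predicted = {
    sample: sorted(RULES[g] for g in genes if g in RULES)
    for sample, genes in resistance_genes.items()
}

print(dist[("patient_A", "patient_B")])  # 1
print(predicted["patient_A"])            # ['carbapenem']
```

Samples with small distances cluster together on the family tree; overlaying the predicted phenotypes on those clusters is what lets you say "this lineage is carrying resistance and spreading".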
So, as I mentioned, there's a huge amount of information there on the left-hand side. Stefaan has touched on some of this: all of the information beyond what you're sampling from the host, such as movement patterns, etc. Bringing that information together with the genomics demands the development of new, innovative systems that can deliver and distil that information, using novel methods, to the person sat at the end, and that person could be local, in a hospital; it could be somebody from a national public health institute, for example Public Health England; at the European level, the ECDC; or, ultimately, the WHO. So, one of the things we focus on in our team is building some of those applications and implementing them.
So, these are a couple of screenshots from the software. This particular top-left example is some data from the recent Ebola outbreak, and these kinds of systems need to be dynamic. This is the geographic location of that outbreak, this is a phylogenetic tree of the similarity of the genomes, and at the bottom is the timeline, and then we can use this kind of data to roll back and investigate what happened during that outbreak or pandemic, to try and understand and mitigate future events. These are the kinds of things one can achieve with this kind of data. The ownership and interpretation of these tools also needs to be transferred to the locations and the data providers that can get value from the data. So, local use and ownership, regional and national utility: can you encourage the deposition of data into these kinds of platforms by creating value for the data provider? That's really important and, of course, as you build up larger datasets, this is where we need the new methodologies, AI and ML, for example, for predicting where things are going to occur, which of these lineages are going to move, and where they're going to emerge from.
So, one of the other things that's necessary is to actually work on technical support for transferring the ownership of these technologies. I find myself really privileged to work at places like the Sanger Institute, which has developed lots of this technology, but how do you package those things up so that you can transfer the ownership of those technologies and drive down the time to implementation, so that you can link up the data between sites and locations? This is a project funded by the UK Government to establish and support PIs in four countries, Colombia, Nigeria, India and the Philippines, who are embedded in national control programmes for antimicrobial resistance, to use sequencing to address the kinds of infection control issues they've had previously, but with better information. So, for example, in the Philippines, Celia Carlos, who heads up the Research Institute for Tropical Medicine, now talks about lineages when she describes resistance going up or down. So, there's a really strong necessity to deliver these packages: not just the tools and the data, but also the technology.
So, I've just got two or three slides on what has taken up much of my time over the last month and a half, which is a new initiative in the UK, the COVID-19 Genomics UK (COG-UK) Consortium, an attempt to use these kinds of technologies across the UK to understand the current COVID pandemic. The idea is local sampling sites; this is not centralised, it's a decentralised initiative: any site that has the capability to do this kind of genomics and wants to take part is enabled to pull that information together. So, this map on the bottom left shows the location of sampling sites across the country. The greens are the public health institutes of England, Northern Ireland, Scotland and Wales, and this is the Wellcome Sanger Institute here. The idea is to bring positive samples into each of these sites to genome sequence, to submit that information to open public archives, such as GISAID, which viral sequence sharing is moving towards, and then also to be able to understand the situation in the UK.
It's a UK Government-funded project. There is a huge number of partners involved. It's been a really rapid fire-up, and a very exciting and difficult thing to be involved in. There's more information on cogconsortium.uk, but over the last month and a half, over ten and a half thousand genomes have been sequenced, and this has tripled the global total. So, this is really giving us stronger insight into what's going on in the UK, and the groups that work on phylogenetics can break this down into subsets. So, again, these lineages: COVID is not one entity, it's a big family tree, and we can use open data systems to monitor where these lineages are.
For example, during lockdown, do the lineages become more geo-centralised? And then, if we remove lockdown, do we start to be able to pick up the movement of these kinds of things? Again, more information can be found at COG; it's an open data initiative, and there have been lots of issues around governance, etc., all touching on the things that we're describing now.
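The geo-centralisation question David raises can be sketched as a simple summary statistic over sequenced samples: for each lineage and time period, what share of observations sit in the single most common region? The records and lineage names below are invented; COG-UK's actual analyses use phylogenetically assigned lineages and far richer models.

```python
# Sketch of lineage monitoring: for each week, how geographically
# concentrated is each lineage? All records here are hypothetical.
from collections import Counter, defaultdict

# (week, region, lineage) observations, e.g. from sequenced positive samples
records = [
    ("w1", "London", "B.1"), ("w1", "Leeds", "B.1"), ("w1", "Cardiff", "B.1"),
    ("w2", "London", "B.1"), ("w2", "London", "B.1"), ("w2", "Leeds", "B.1"),
]

by_week = defaultdict(Counter)
for week, region, lineage in records:
    by_week[(week, lineage)][region] += 1

def concentration(counts: Counter) -> float:
    """Share of observations in the most common region (1.0 = fully localised)."""
    total = sum(counts.values())
    return max(counts.values()) / total

for (week, lineage), counts in sorted(by_week.items()):
    print(week, lineage, round(concentration(counts), 2))
# w1 B.1 0.33  (spread evenly across three regions)
# w2 B.1 0.67  (more geo-centralised, as one might expect under lockdown)
```

A rising concentration under lockdown, then a fall as restrictions lift, is the kind of signal the open data dashboards make visible.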
And just lastly, I'll show you the open data site, which you can have a look at if you go to the COG Consortium website. This is, again, a map of the lineages across the country and you can do things like move around and see where lineages currently are, their numbers, and where they're increasing, and, importantly, we can share this information within the Consortium and globally. So, I think I'll stop there and move on, thank you.
Marjorie Buchser
Thanks, David; those illustrations are very graphic and really capture our attention. Fantastic. We obviously are a policy institute, so we are very interested in hearing the policy perspective on this as well. Marietje, before going to Stanford, you advised governments on cyber policy and regulation, so we're curious to hear from you. Looking at the pandemic and how it has accelerated technology adoption, what do you think is the long-term effect of this crisis, and will it transform our relationship to technology? And, going back to the poll, to what extent is this transformation based on the population's consent, and is the public integrated into this change?
Marietje Schaake
Thank you very much for bringing us together and thanks to everyone for taking the time to look at another screen, as we do so much during the lockdowns and other measures. Indeed, when you look at what governments have been proposing and whether that is with consent, my mind jumps immediately to all the various proposals for the use of an app to track contacts as part of the solution to end lockdowns, and to deploy technology that way. And I think we have to be very, very careful about a kind of tech utopianism or tunnel vision on what technology can do: not to make it the solution, but to keep considering it as a possible solution, with the need for clear guardrails and legal conditions to preserve privacy, to ensure that these apps are secure, and that there are no unintended consequences that we will only find out about later.
So, what I've been seeing a lot in Europe, looking to Asia, for example, where these contact tracing apps, or other types of apps, for instance to check whether people are respecting lockdown and quarantine measures, have already been used, is a tendency, also by policymakers, to pick and choose their favourite positive impacts of the use of technology in the fight against the COVID-19 virus and pandemic. But there are also a lot of negative lessons to be learned. For example, that data can be re-identified or de-anonymised and that, in a country like South Korea, people are more afraid of being stigmatised for carrying the virus than of the virus itself, which is a very, very serious concern.
Another lesson learned is that a sense of false security can take hold when people think, "Well, as long as my phone doesn't beep, I'm safe, and if I've been in touch with an infected person, then I'll just have to wait for a message on my app, and until then, I can go to the park, hang out with friends, see family, and go about my so-called normal life." And I think these are all things to consider: even if there are technology-specific questions about privacy and security, there's also a societal context that has to be understood, and the promises and expectations have to come together. I've seen the announcement of the use of an app met both with high hopes, that it could mean we can go back to our normal lives, and with deep concerns about a surveillance state on steroids, and the question of who should be in charge of deploying these apps. Should it be the already very powerful tech companies or, as we've heard from the Dutch Government, do governments intend to start building apps themselves, which leads to all kinds of other questions and concerns, if we look at the success rate of IT projects done by governments in the past?
So, what I think we can see is, yes, obviously there is a huge dependence on the functioning of technology. People spend much more time online, children, people working, people consuming culture, and now, on top of that, there is a wish to use technology to get out of the lockdown and all the measures that are needed. I think it is absolutely vital to keep this debate evidence-based and not to fall into some kind of tunnel vision, with excessive hope in what technology can bring, without considering the critical questions, not only about privacy and security but society-wide, and to also have an eye for the longer-term effects of people knowing that they may be traced: questions about what it will do to our behaviour, and what it will do to power relations and accountability down the road.
So, those are my first thoughts about what I think is very urgent, when it comes to the use of technology and policies deployed by governments in Europe and elsewhere.
Marjorie Buchser
And I think one of the debates as well is not only about the deployment of these technologies, but whether there are any sunset clauses and whether they will normalise surveillance tools in the long term. Do you see that risk, that because these are emergency measures, they will become permanent because there are no rollback clauses, basically?
Marietje Schaake
Yeah, so I think we have to distinguish whether there is a very clear identification of the function that a technology might have, for example, an app to trace contacts, and that function should be very clearly restricted, so that there is no function creep either: for example, from the need to look at whether the virus is spreading, to companies possibly asking people to hand over their profile before boarding a plane or engaging in economic activity. And I think it will depend, country by country, whether existing laws are sufficient to provide those safeguards, or whether emergency packages will have to be adopted. Certainly, the question of who owns and governs the data, building also on what Stefaan said, and how the public interest is preserved, is crucial. And I think it is not without reason that a lot of scientists and experts in the fields of artificial intelligence and cybersecurity are raising alarm bells.
There are very few people, besides tech companies and some governments, who still keep high hopes in what this can do and, whatever technological solutions are chosen, we have to make sure that civil liberties are protected and that this crisis does not become a pretext for the erosion of rule of law principles. We've seen that in the past and, oftentimes, you know, the saying is, "Under pressure, everything becomes fluid." So, walls and barriers and concerns that people experienced before may actually fade into the background.
I've experienced it myself. I mean, I was very critical of some of the proposals that the Dutch Government made, implying that an app would be mandatory, that it was going to be rolled out within two weeks, and that it was going to be the key ingredient to ending the lockdown, and some of my friends, whom I know to be very critical, said, "Oh, you know, don't make it complicated, we need to get out of this lockdown." And so, it shows that a lot of people are really hopeful about what technology can bring, and I think there lies another pitfall for governments: if they overpromise, they can easily lose the trust of the population if they can't deliver on the high expectations of what people hope an app will do. But I've been surprised at the lack of critical thinking about these promises from people who may have been up in arms about privacy a year ago over other proposals. So, yes, I think the norms, expectations, fears and trade-offs are clearly shifting before our eyes.
Marjorie Buchser
And a last question, if I may, because you've also been involved with the Chatham House Democracy and Technology Commission. One of the discussions we've seen is the question of whether authoritarian countries will be better placed to deploy these technologies, because in democracies, obviously, consent and transparency have higher value. Do you agree with that statement, or do you see a greater divergence between these different regimes when it comes to using this type of technology?
Marietje Schaake
Well, I think we generally see that the way technology is governed and used is often an extension of a governance model anyway. So, an authoritarian regime will probably use it to amplify its control over people and its censorship, its power, its restricting of people’s freedoms, whereas, hopefully, and this is where the concern lies, free and open societies will make sure that those freedoms and rule of law principles are not eroded with the use of technology. And that’s, you know, that’s a race against time anyway, but it’s important not to lose sight of what is at stake here and what can happen, if massive amounts of people are pushed to use certain technologies.
In the Netherlands, quite remarkably, the intelligence services raised the alarm bell and said that there are deep concerns when this happens, and I can’t remember a moment where the intelligence services spoke out like this about a proposal by the government, when it came to technology. So, what I’m hoping is that democratic checks and balances will do their work and that people will remain critical and if that means that we cannot massively deploy technologies, with a lot of risk involved, then I don’t think we’re losing out. In fact, I think we are being true to the values that we hold dear, whether there’s a crisis or not.
Marjorie Buchser
Excellent, thank you, Marietje. And just a reminder that you still have the 'Chat' function: you can post a question either anonymously or with your name, as you'd like, and you can also vote for questions. There are already quite a few, so if you think one is particularly good, vote for it. Before we go to the Q&A, I want to ask my colleague, Claire, from the Centre for Universal Health, whether she has a question for our panellists.
Claire Munoz Parry
Thank you very much, and thank you also to the panellists for their excellent presentations and the discussion. My first question is regarding data use and AI development: is the current wave of data usage and new AI development going to trigger long-term innovation in healthcare?
Marjorie Buchser
David, maybe I will direct that question to you.
David Aanensen
Thank you. So, I think, inevitably, because of the pressure and urgency for things to be done in a shorter space of time, there are mechanisms coming into place that enable innovation to be driven slightly more quickly than perhaps it has been previously. And actually, I think it's quite important to think about the potential legacy, after this particular pandemic dies down, of what we're all currently involved in, certainly from the point of view of the kind of work that we do, with respect to the increasing use of technologies such as genomics. We've been circling this issue of how we drive down the time it takes to fire up new methods and bring data together, so that we can apply these kinds of new AI methodologies and learning to datasets. But there's always a slight slowness and lack of take-up, to some extent, because there's a sense that there are more important things to do, but actually, it's quite clear that these kinds of new methods can bring greater insight.
So, I feel that this situation has sped up something that lots of people have been thinking about how to do, and there’s an opportunity to do it now and to think about the legacy it will leave, should there be another event such as this. I mean, we have to remember that we’re in the midst of a pandemic that we need to sort out, it is a pretty nasty thing that’s going on at the moment, and I agree with Marietje’s points around the need for concern about dropping levels of data requirements, etc. But we have to address this issue somehow, and there are proven techniques and methodologies that will help us in navigating that landscape. At the same time, I absolutely think we have to be cognisant of the legacy, because the speed-up in the adoption of some of these technologies and tools that we’re going through at the moment is unprecedented. Things that would take an awful lot longer can now be done in a short amount of time, and we have to do this intelligently and carefully, so that some good silver linings come out of this quite terrible situation.
Marjorie Buchser
Great, David, I will keep you up, because our first Q&A question is from Julie, who asks, “How can your method work when different countries measure data in different ways, and there’s obviously a lack of trust in relation to statistics coming out of different countries?” And I think that in our pre-webinar conversation, we were also talking more generally about how we could increase data collaboration and information sharing between governments.
David Aanensen
So, I’m not sure I understand the question fully. If this is specifically about how the kind of genomics work I described could be managed when different countries measure data in different ways, I’m not sure I can make much of an impact on the statement around the collection and reporting of, for example, care home cases and deaths. I mean, these are issues at a specific government level and I wouldn’t want to comment. I agree that there are deficiencies in the way data are reported everywhere, but again, given the rapidity with which this has hit different countries, and the fact that different countries are at different parts of the trajectory, it’s very difficult to do something comprehensive. But with respect to the sharing of genomic data and the use of that information globally, this has actually pushed the agenda a little bit quicker.
So, I mean, genomic data is just a series of ACTGs and is actually fairly useless unless you have associated metadata with it. Previously, there’s been a real push for complete openness around genomic data, and there are international databases where, if you write an academic paper, for example, you have to have an ID from having deposited that data in order to publish. Now, that’s clearly not going to work if this moves from an academic exercise into public health, where the major output of public health labs is not writing academic papers, as it shouldn’t be; it should be about how we can use this information to actually make some difference. So, there’s a need for a better way to share the data. There are also previous examples, around Zika and Ebola, where countries openly shared data and those data were, for example, used to develop therapeutics that were then sold back to those countries.
I mean, these are very sensitive issues. Within COVID, however, there’s an initiative called GISAID, G-I-S-A-I-D, where all the viral sequences are going, and anybody that submits there needs to sign up, but also needs to agree that if they use any piece of data contributed by anybody else, they attribute that information. And actually, there was a conflict in the community about whether full openness or this kind of semi-openness is the right way to go, not just in academia and public health labs, but also up to the WHO level. And it seems like this kind of managed public access, as it’s being described, where you register and agree to attribute others’ contributions, has really taken off for COVID, because it’s just the place where everybody has deposited.
So, for example, all of those UK sequences being generated by all the labs at the moment are being deposited into that initiative, so that they can be used globally. It’s an open initiative straightaway. I mean, to be fair, the UK project is also depositing in a completely open-access way, but it has driven that agenda and it’s made people think a little bit quicker. So, that’s been a great example of how the agenda for international data sharing has been driven forward through the need to actually communicate, collaborate and understand on a broader level than our own small focus. And it’s taken some of this technology outside of pure academia into public health, which is really crucial if any interventions using this information are going to be effective, and also tested.
Marjorie Buchser
Thank you, David. Our next question is from Hans, who says, “All of us using mobile phones have, over the years, more or less willingly, handed over our personal data to big tech groups. Would more data be needed for the purpose of infection tracking than is already in the hands of those companies?” Stefaan, I’m going to put that question to you first, because I know you’re looking at this – you had a repository of the Finnish initiative – and also looking at how tech companies can share their data with governments, or how data collected by those companies could be used for healthcare purposes.
Stefaan Verhulst
Yes, thank you. Actually, yesterday in Science Advances, we published, together with a group of about 20 other scholars and data scientists, a paper about the use of mobile phone data in a pandemic: what are the possibilities, but also the limitations? Because obviously, and I agree with Marietje, we shouldn’t assume that this is a magic bullet, and we should definitely not develop some magical thinking that this will solve all the problems. But I do agree that we not only need to start thinking about what data we need to collect; we actually need far more transparency about what data we have already collected, and how we can start reusing it in a variety of ways to answer questions that might be similar. Which leads me to another observation, and that might not be directly the answer about mobile phone data: a lot of the conversation about data is focused on the supply side, i.e. what data is there, and can we do something with the data? From our point of view, what we actually should do first, and more importantly, is really start thinking about what are the questions that, as a society, we seek to answer, and then realise, or at least have some kind of engagement on, how we would act differently if we had an answer to that question.
And too often, we see people start with, give me access to mobile phone data and then we will answer a set of questions, without realising what are the real questions that matter to society. And too often, we see an insight being generated without any connection to the decision-making process. We see a lot of that happening at the moment: ah, I have access to data, let’s do something with it, and then we generate an insight and there is no-one listening, let alone ready to act, or to understand whether the question even matters to the policymakers, given the different stages of the pandemic.
And so, what we really need, as a society, is to come up with a set of priority questions – we have suggested 100 questions or fewer – that is transparent. A lot of the discussion about data privacy is really about transparency with regard to data use; I think transparency about the questions should actually be the first kind of standard. What are the questions? Why do those questions matter? How will we act upon the answers, if we have answers to those questions? Then we can start thinking about what data we need in order to really make progress on each question, and what data already exists that can answer it, without automatically assuming that we need a new data-collection mechanism. So, being question-led is really a principle that, from our point of view, would address a lot of those concerns.
Marjorie Buchser
Marietje, do you have a perspective from the policy side? Obviously, in the European Union a lot of countries have been looking at regulating big tech, and also at data sovereignty in Europe, so I wonder what your take is on how to manage the interaction with big tech?
Marietje Schaake
Well, firstly, I would say that just because a lot of big tech companies have already assembled humongous amounts of data on people doesn’t mean that we should continue down that path now. I would say it’s all the more reason to be very mindful of what the harms can be in the space of competition and in the space of a lack of transparency and accountability, which I saw ranking very highly in the poll, and I think that’s really important. But, specifically, I think Google Flu Trends is a really interesting case in point. It was an experiment that Google did a while ago, more than ten years ago, I think almost 15 years ago, where they anticipated that, because they could analyse searches for, for example, vitamin C, or tissues, or sick days, or cures for the flu, they might be able to predict outbreaks. And actually, there were huge concerns about accuracy, there were also concerns about privacy, and the project was stopped.
So, sometimes what we may think could be a useful solution isn’t, and I want to really echo what Stefaan said: before you start building something, you really have to know what you’re looking for. In other words, what problem can an app, or a technology, or a repurposing of data, or a new analysis be a solution for? You would be really going in the wrong direction, for the variety of reasons I mentioned, if you just open the floodgates and say, well, take all the data and come up with a solution as you see fit.
We’ve actually had a bit of an experiment with that here in the Netherlands over the past couple of weeks. An app was announced by the government, a tender was hastily called for, people had to submit their proposals over the Easter weekend, there was a rapid selection process, there was a hackathon, and where we are today is that none of the submitted proposals has made it through. The process is back to zero, and I think a lot of goodwill has been lost, both on the part of specific partners and on the part of the general population, who may have looked at this process with either hope or fear. If they looked with hope, they’re disappointed; if they looked with fear, their fears have been confirmed.
And this is really, I think, a moment to use the fact that we’re all using technology for longer hours and in greater volumes, generating more data and accessing more services, to also think critically about what it is we’re using and what kind of imprint it will leave on our societies. One of the key points to look at, too, is that when children are using technology so much to get access to their basic education, it creates all kinds of new questions about mental health and about the importance of physical versus non-physical contact between people.
And I’m sure that there is a lot to learn here, but we have to keep looking at everything that’s happening and unfolding with a critical eye and not just with the rush and the wish to get out of this situation.
I feel that too, you know; I really wish I could move more freely, that I could go back to California and see my students face-to-face without jeopardising anyone’s health. But a desire or a wish is not good counsel, and it’s not a strategy; we have to be more specific than that and also educate the public as we go.
Marjorie Buchser
Thanks, Marietje. I’m going to try to take two more questions from the Q&A. The next one is from Dina, and it’s quite specific: “How could we use big data or AI to anticipate and mitigate leaks, accidental or intentional, from biosafety level three or level four labs globally?” And the same question for what market? So, that’s quite specific to leaks and anticipation. I don’t know whether you, David, have any insight on that, or Stefaan, whether you’ve seen initiatives looking at this question. I’m going to ask David first and then move to Stefaan.
David Aanensen
Thank you. I’m not sure I’m really qualified to address that specific question. I mean, I don’t know the rates of leaks from labs; there’s a lot of speculation around how this arose, and I’m not sure AI or big data is particularly the way to address it. So, I’m not sure I can give a particularly useful answer to that, I apologise.
Marjorie Buchser
No worries, it’s quite technical. I don’t know, Stefaan, if you have…?
Stefaan Verhulst
Well, it’s linked to what I mentioned at the end as well: we need not just innovation in how to share data, but also innovation in access controls to certain kinds of data. I mean, as David said, and we are obviously also big advocates for open data, that doesn’t mean unconditional access, and it also doesn’t mean that there are no controls with regard to who has access to it. The big challenge then is, of course, how do we oversee access controls? I think we need more innovation in immutable audits, in really understanding, when someone has access to data, what then happens with it. Which, again, is kind of ironic: we actually have very little data about the use of data. If we had more transparency about how data is being used, about who accesses data for what purposes, we would probably also have greater trust in how it is being accessed. And so, leaks can be prevented through access controls, but that does not mean, of course, that in certain conditions the access controls shouldn’t actually be lower, and the question here is, again, who decides what the access controls should be, and for what purpose?
Marjorie Buchser
Thank you, Stefaan. I’m going to take two more questions. One is from Nicola, who is a good friend of Chatham House – hi Nicola – and asks, “Since governments have been the source of solutions during the crisis, how can we incentivise non-state actors, the private sector especially, to collaborate for the general interest?” I think we touched upon that a little bit, but we can add to it. And a question from Dina again, on cybersecurity and cyber espionage from China: “Has there been an increase in cyberattacks from China related to CCP virus research, and why?” I see that David has raised his hand, so I’m going to let you go first, and then I’m going to move to Marietje, because I’m sure she has comments on cyber espionage and cyberattacks.
David Aanensen
Thanks. Again, I won’t answer the second question, but I’ll address my experience with the first one, around how you incentivise non-state actors and the private sector. My experience so far is that we – myself and lots of the people involved in this initiative – get offers of help from members of the private sector on a daily basis. I know that, certainly within testing and providing capacity for sequencing, there have been an awful lot of incentivised private sector actors who have wanted to put their expertise into the game.
I guess one of the questions is, how do you work out what the incentive behind that offer of help is? Actually, we’ve almost had to stop responding to some of those letters, because it becomes quite clear that one of the reasons for offering help is the ability to say, “We’re doing something amazing with COVID and, therefore, that will increase our value.” For me, I find that kind of fascinating, because it’s not something that touches on my expertise, but we have people in this room who would have very strong ways of assessing and analysing it. I don’t think there’s necessarily been a lack of private sector help; it’s about getting the right private sector help. So, if you look at testing, there’s a huge issue globally around the need for reagents and testing kits, etc., so how do you incentivise in the correct way?
So, I think that steer has to come from identifying the right approaches to mitigate the event that’s happening and what the priority things are, and then going to the private sector parties who would be best placed to do that; different countries have tried different approaches. So, yeah, I’d be very interested in the other panellists’ thoughts on that specific issue. Thank you.
Marjorie Buchser
Well, we have two minutes left, so I think we’ll have to leave the last word, about cyber espionage, to Marietje.
Marietje Schaake
Yeah, I’ll try to connect the last questions and build on what David said. This notion of charm offensives by the private sector is really unprecedented; offering help with a clear agenda is not the same as actually serving the public interest. And so, I think one particular area that deserves a lot more attention is access to information for independent research into what the actual effect of disinformation is, for example. What we see is social media companies taking unprecedented steps to intervene in the information that is posted. They are doing the kinds of censorship that they never said they would, probably because they are also seeing unprecedented amounts of disinformation, bordering on hybrid conflict, where state actors – it could be China, it could be Russia, it could be others – are really deploying inauthentic behaviour to manipulate the debate, to lure people into malware attacks and things like that, right? We also see unprecedented attacks on hospitals, at a time when that is really the most cynical thing to expect. So, if the private sector really wants to serve the public interest, it has to open up data for independent research, for regulators to understand better, for lawmakers to get a better grasp of what the information ecosystem looks like, how information flows through it, and what bad actors of all kinds are doing.
So, I would like to leave it with that and to say that closing the accountability gap – attaching consequences when there is an attack from a state like China, or criminal activity, or the manipulation of people’s fear right now to sell them false cures or defective masks, or efforts to erode trust and our ability to withstand this virus – is something that should be met with much more consequence, and one of the only ways we can get there is if we know more about what’s actually going on. So, transparency and accountability are things the private sector can really, really help with, if it actually wants to serve the public interest.
Marjorie Buchser
Fantastic, Marietje, thank you very much. David, Stefaan, Claire, [inaudible – 57:00] who have been involved with the back end, thank you very much, that was an excellent panel. I can see that participants have stayed on with us for the entire discussion, so I think that’s a great indication of a good conversation. We could have stayed longer, but that was a great hour, thank you very much, and I’m going to clap because there’s nobody clapping out there. Thank you, guys, that was a great panel. I wish you all a good evening. Stay safe.
Stefaan Verhulst
Thank you. Bye, bye.
David Aanensen
Thanks to you. Thank you as well.