The Power of Big Data

Speakers: Matthew Parker, Anthony Sitas, & Giovanna Jaramillo Gutierrez
[Music Playing]
Matt: Hello and welcome to Invent Health, a podcast from technology and product development company TTP. I'm your host, Matt Parker.
Over the course of this season, we're going to be exploring the fascinating future of health technologies. Today, on this season's final episode, we ask, what will harnessing the power of big data allow for health tech?
So, we've reached the end of this season's journey into the future of health tech, and it's been quite a ride. From exploring how we can take humans out of the loop in diabetes monitoring, to the future of epilepsy treatment, sustainability, and cardiac health.
Every episode has offered up some fascinating technological solutions to a huge array of challenges. But I've also noticed a theme across the season, something that has come up in pretty much every episode; the transformative power of data in healthcare.
Data collected from wearables, facial recognition software, online forms (people are tracking their health like never before) is the thing which is driving the current AI revolution. But it seeps into every part of the healthcare ecosystem as well.
But a challenge that keeps getting brought up is how do we actually utilize this data effectively? How do we aggregate it across different fields? How do we move it from numbers to reality, from the cloud to the clinic?
So, this week to round up the series, I wanted to find out what a world where we do just this would look like, and what is the roadmap to getting there?
So, I sat down with a couple of people who've had experience working with these types of technologies. The first is my friend and colleague at TTP, Anthony Sitas. Hi Anthony, welcome to the show.
Anthony: Hey, Matt, thanks for having me, this is very, very exciting. The TTP Podcast has always been something that I've been tuned into so it's great to be contributing.
Matt: Anthony's been with us for the past year and a half, and across this time, has focused on both digital health strategy and design and product development partnerships.
His work here sits within the areas of psychology, neuroscience, clinical decision support systems and algorithmic explainability. All things which benefit hugely from large data sets to work from.
I started off by asking Anthony about the work he's doing at TTP in this context, and the healthcare world he would like to see if data were used truly effectively.
Throughout the course of this series of Invent Health, we've been hearing numerous examples now of how data and AI are set to transform some very specific aspects of healthcare.
Today, we wanted to take a step back and look at this more broadly and ask what does that mean for healthcare as a whole? And what does that mean looking forward into the future, maybe on a longer time horizon than we've previously looked at.
I wonder if maybe you could tell me a little bit about your work at TTP and how handling large data sets features in that.
Anthony: Alright. At TTP, lately one of our largest projects is actually focused on developing a remote patient monitoring system in the context of oncologic clinical trials. These systems are typically composed of multiple user touchpoints from the clinician to the patient and perhaps even other stakeholders.
So, for example, pharmacists or specialized carers. And although TTP is capable in this case of handling data, sometimes our work doesn't involve actually touching the data but rather just designing the system through which it's going to be funneled.
Matt: I was about to ask, and so I wonder if you could maybe talk a little bit about what some of these systems are allowing us to do, what does actually having a system that lets us collect large amounts of data from clinical study do that we weren't able to do before?
Anthony: That's a good question. I think quite simply put regardless of the industry, decisions are typically enhanced when more factors are evaluated, and the rate at which you can evaluate those factors is directly proportional to the rate at which you can arrive at an informed decision and generate outcomes.
So, in healthcare, up until now, decisions have been almost exclusively reactionary. But the problem is these diagnoses and protocols are based on relatively sparse, extremely siloed, and relatively generalized data.
But there are very few working clinical systems out there today that can truly claim to offer personalized medicine, especially at scale. Inevitably, I believe that this is a domain of healthcare that is prime for disruption.
Matt: Fantastic, I look forward to hearing a bit more about it as we get into some of the meat. So, what are some of the specific, I guess, bits of current treatment that might impact? Is this going to predict disease progression? Is it going to predict whether I'm going to contract a disease?
Anthony: So, I'll take you through a journey. So, big data can do the first thing which is predict the disease onset. That is far down the line because to get there, you need to have a very high level of data but also, a consistent stream of data which tends to specific biomarkers that can address those probabilities of a disease onset.
And in order to actually deliver that, there needs to be an infrastructural-wide shift on how healthcare is viewed, but also on how it is delivered. And quite fundamentally, actually today, you are seeing some of those models but not population-wide.
It's only happening in patients or populations that are already deemed high risk because economically, and socially, it's easily justifiable to innovate there.
The second one is predicting and modeling or even tracking the disease progression once you know that that disease’s onset has actually happened. The interesting thing about predicting and tracking disease progression is there's a lot of downstream clinical outcomes that come from that.
So, yes, predicting the disease onset is great, but really, if you are able to track the disease prognosis as you go with precision, you are able to modify the care pathways and also deploy some really innovative techniques like, perhaps, modeling how likely a drug or a specific therapy is to work in each individual patient depending on the specific variables that pertain to their lives.
Matt: And so, we're getting that progression from being able to predict that actually someone might be susceptible to contracting a disease or being diagnosed with a disease, through to being able to predict how that might progress through treatment.
But then ultimately, this idea of how specific medicines will work in that specific patient based on factors that you're able to measure, which is a very exciting development from the population-wide medicine of today.
Anthony: One of the most exciting doors that big data is going to be opening and something that I've been realizing on the market side with our client work is actually harnessing what's called Multimodal Biomarkers.
So, essentially, when you're collecting mass amounts of seemingly unrelated or perhaps even known related data streams about a patient, your predictive capacity is simply greatly enhanced.
Because now, we start to understand, okay, if I take a walk at this certain time of the day and I have these genetics and this diet, and that relates to the epidemiology of my socio status, what is the overall prediction for my risk of cancer in 35 years?
Matt: And I wonder what this ultimately leads to, what's this kind of utopian vision of how we're going to be using this data in the future to change how even maybe healthcare itself is delivered?
Anthony: As I'm sure you've heard me yelling from the rooftops at TTP, the idea that healthcare will become a passive and immersive experience or rather not an experience at all is something that I am fully invested in.
So, at TTP, we used to reference this as ambient diagnostics to describe that vision, but we now reference this vision as ambient healthcare. So, ambient healthcare is a vision of the future where a person no longer has to seek their wellbeing from a geographical location or a specific individual that tends to their health.
So, in short, as time goes on, we are beginning to entertain the possibility that hospitals are going to become more and more obsolete. And that is going to be a result of the fact that personalized and predictive care is going to become so advanced that it won't be about damage control, but it will be more just about tweaking fine things here and there as you go about your daily life, and in a way that is so seamless that it will be effortless for the person who is interacting with or consuming those changes.
There's a plethora of promising research actually emerging regarding the use of what's called non-invasive ambiently captured data. To understand this realm, I like to talk about autism as an example.
So, for those that don't know, autism spectrum disorder or ASD is typically characterized by things like social interaction difficulties or repetitive, obsessive behavior. All of that has lifelong considerations that have to be dealt with as you develop through that condition.
These characteristics, however, can be observed in children as young as six months. In the UK, just for context, there are currently half a million people who are affected. Interestingly, the economic strain from ASD sits at roughly 3 billion a year for children and roughly 25 billion for adults.
So, what does that discrepancy of economic costs tell us? It points to the fact that as people age with ASD, they diverge further from their typically developing counterparts in terms of things like independence and social functionality. And in response to that, the healthcare system then begins to react.
But at this point, the ability for their cognitive development to exhibit plasticity is very, very limited because they have matured. The idea of this preventative big data approach is asking the question, “What if we could intervene earlier?”
Recently, there's been a pretty seminal paper, actually I think it was published in Nature, where they showed that with ocular motor tracking (so this is eye tracking) at an early age, so toddlers or children as young as six months, you were able to reliably predict the onset or the presence of ASD just based on where their attention was.
Matt: That's fantastic.
Anthony: It's a really interesting point because there are some critical points that that solution actually solves. The first one is the fact that you are able to catch things very early. The second one is the fact that it is non-invasive and entirely ambient.
So, in a perfect world you can imagine something like a smart baby crib where it's constantly monitoring your baby and it can very reliably predict the onset of ASD.
Now, you ask, “Okay, that's great, assuming we can do that, what is the value of actually diagnosing ASD very early?”
As a follow-up study to that research that came out, they actually facilitated a double-blind investigation where they gave half of the children that they had diagnosed early, a behavioral therapy intervention and half of them, they let them progress as they normally would.
And what you see is that a significant proportion of them not only got better but actually, came under the clinical thresholds for ASD, I think it was after about a year, a year and a half to two years of that intervention.
And so, right there, you can point to a functional example of how, you know, if you can model or diagnose the onset of a certain condition very early, especially in neurological conditions, there are both social and economic benefits to be had at large.
[Music Playing]
Matt: I loved Anthony's example using ASD there. Too often when we think about data like this, it takes the person out of the conversation, and people's ailments become numbers. But imagine the joy and real-world personal changes that early intervention based on aggregated data could give to children with ASD. It's a utopian vision of a future where we are harnessing data properly.
However, we're not there yet. The smart baby crib is not being rolled out across the globe, and there is a long way to go before health records are synthesized properly. But why, what's holding us back?
Next, I wanted to speak to someone who's seen the power of big data in the field, in places beyond Europe and the West, to find out how effective this can all be in fast-paced real world scenarios. So, we got in touch with Giovanna Jaramillo Gutierrez.
Giovanna wears many hats. She's both an epidemiology researcher and a data scientist, marrying two industries which can feel poles apart. Beginning work as a molecular biologist, Giovanna worked with various international healthcare organizations, which brought her to the epidemiological research she does today.
She now works at the intersection of epidemiology and AI and is passionate about the development of ethical, human-centered, AI-based products and services that are fair, explainable and transparent. She does this for the WHO and Eticas Consulting amongst others.
She's done a lot of work in emergency health situations from Ebola to COVID, and has seen firsthand the impact that having access to specific personalized data on the ground can have in real time. These scenarios are exactly where we started off.
Thank you very much for joining us Giovanna, I'm excited to talk to you.
Giovanna: Pleasure. Thank you for having me.
Matt: And coming from the world of epidemiology, is that somewhere where data has always been a part of that area? Or is this something that's sort of changed more rapidly with greater availability of tools and being able to gather and analyse these larger data sets?
Giovanna: Epidemiology has always been, like, lots of statistics, to really understand how disease is spread in the population, who are the people at risk, or more severe, or forecasting how, for example, an outbreak will touch other areas in the region. So, the more information you get from different sources like hospital admissions, GP registries, all this is an ongoing collection.
And now, I think with the advance of technologies like digital technologies, it's really helping in real time. Because you have to have in mind that if you picture the hospital, you have the medical record, and the laboratory has its own information system.
So, the sample comes with a paper filled in with the patient information, and then there's the medical record with the symptoms, and then you have to gather all this information then electronically. And historically, in the healthcare sector, this is not an easy thing to do. It's very labour-intensive.
So, you have to merge all this information, and then with all the information, you get to a composite picture for the epidemiologist to have a sense of what's going on. For example, where we are in the outbreak or what course are we going to take based on the available data.
And this is just health data, per se, medical data, but also, we look at trends like seasonality, climate, temperature, like it encompasses all types of data of course.
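As an illustration of the merging step Giovanna describes, here is a minimal Python sketch of joining two siloed sources into one composite view per patient. The field names (`patient_id`, `symptoms`, `result`) and the sample records are hypothetical, chosen only for this example; real hospital and laboratory systems will use different schemas and will need careful identity matching.

```python
from collections import defaultdict


def merge_records(medical_records, lab_results):
    """Join two siloed sources on a shared (hypothetical) patient identifier."""
    merged = defaultdict(lambda: {"symptoms": [], "lab_results": []})
    for record in medical_records:
        merged[record["patient_id"]]["symptoms"].extend(record["symptoms"])
    for result in lab_results:
        merged[result["patient_id"]]["lab_results"].append(result["result"])
    return dict(merged)


# Made-up records standing in for a medical record system and a lab system.
medical_records = [
    {"patient_id": "P1", "symptoms": ["fever", "cough"]},
    {"patient_id": "P2", "symptoms": ["headache"]},
]
lab_results = [
    {"patient_id": "P1", "result": "positive"},
    {"patient_id": "P3", "result": "negative"},
]

composite = merge_records(medical_records, lab_results)
```

Patients who appear in only one source (here `P2` and `P3`) still get an entry, with the missing side left empty, which makes gaps between the systems visible rather than silently dropping them.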
Matt: That's really interesting, I was just about to ask what sort of work did you do in this space during COVID, and what did you learn from that experience?
Giovanna: It happened that I was involved in 2015 during the Ebola outbreak. I had colleagues from the organization within WHO that helps deploy field epidemiology. So, they developed a tool that helped with contact tracing.
Contact tracing is really like if someone has a disease, and you have to understand how fast this will be transmitted to other people that they were in contact with. So, you have to reach those people before they get sick, and you get information from these people. It's like a disease detective, they call it a field epidemiologist.
Now, they go, they ask questions, “Where did you go? Did you travel before?” So, basically, we needed something rather than just Excel and paper but that works in a low-income setting because we don't have servers, we don't always have internet.
And so, they developed this tool that is a software, it's called Go.Data. So, I was one of the few people that was lucky enough to help deploy this tool in Guinea when there was the Ebola outbreak back in 2015.
And what happened is that in 2020, suddenly all the countries wanted this tool, but it was still in development. And I mean, it worked because it was done really well. And so, I didn't know anything about deploying the software in countries. I mean, you just learn on the go.
And so, suddenly, they asked me questions because the people I work with are doctors, or nurses or health specialists. And they're like, “Oh, if I put it on my computer, does it work? Can my colleagues see it?” And I'm like, “No, no, we don't have a server, I have to go and install it in your computer.”
So, this is a really practical tool, nice and fancy, and you don't need internet, you can just download it on your computer. But even I mean, in our countries in Europe when you do medicine, you're not an expert in IT. I mean, this is something that requires quite a bit of involvement in capacity training. Like there's an emergency and you have to train people to use a tool they're not familiar with, they may be not comfortable.
So, basically, that was a very, very good experience because I realized, like what you were mentioning, all the hurdles and the point was we wanted to use this tool because then you input the data, it's analyzed, and you can share the information as soon as possible.
Because what you want during an emergency is real-time information as accurate as possible so that you take this information and you know, like, “Okay, what are the actions?” And this can be like, “Oh, okay, there are that many cases, so how much laboratory reagent do we need, how many ambulances,” so it's really pragmatic.
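The contact tracing idea described in this answer can be sketched as a walk over a graph of who was in contact with whom. This is only an illustrative Python sketch and not how Go.Data works internally; the `contacts` dictionary, the names in it, and the `max_depth` cut-off are all assumptions made for the example.

```python
from collections import deque


def trace_contacts(contacts, index_case, max_depth=2):
    """Breadth-first walk of a contact graph.

    Returns {person: number of contact steps from the index case},
    following chains of contact up to max_depth steps.
    """
    distance = {index_case: 0}
    queue = deque([index_case])
    while queue:
        person = queue.popleft()
        if distance[person] == max_depth:
            continue  # do not follow contacts beyond the cut-off
        for contact in contacts.get(person, []):
            if contact not in distance:
                distance[contact] = distance[person] + 1
                queue.append(contact)
    del distance[index_case]  # the index case is not their own contact
    return distance


# Hypothetical interview results: each person lists who they were in contact with.
contacts = {
    "case_A": ["B", "C"],
    "B": ["D"],
    "C": ["D", "E"],
    "D": ["F"],
}

to_follow_up = trace_contacts(contacts, "case_A")
```

With these made-up interviews, `B` and `C` are direct contacts, `D` and `E` are contacts of contacts, and `F` falls outside the two-step cut-off.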
Matt: And particularly in the context of an emerging outbreak like that, speed is essential there. It's about getting those insights as quickly as possible.
Giovanna: But of course, again, I said, there's so many hurdles. You can throw technology at people but if there is no trust and if they were not involved in the co-design, then it's very difficult to convince people to use it because they have their ways and it's not how they function.
Matt: So, we touched on a couple of the areas of the epidemiology there, where there's these big data sets and tools are helping.
I wonder maybe if we could think a little bit more broadly and explore other areas of healthcare where these large data sets might transform the way that healthcare is done, both being able to predict maybe progression of diseases in individuals or as well as predicting someone's likelihood of contracting a disease or even how medicines are going to be effective in certain people.
Giovanna: Absolutely. So, there's been a lot of involvement in electronic health records. And so, this is great because if those data are standardized then we can really have good historical trends for diseases and make predictions.
And if you have enough data that have good representation for example, not only of a small area but let's say at the regional level, at the country level then you can really forecast.
And in that aspect, it's incredible when you actually have these clean data sets, and some countries historically have been very good at it, and they've been doing it for a long time before the pandemic.
And now with the pandemic/COVID, people are more aware and there's more investment, but still there's limited accessibility of data in the healthcare sector. As I was mentioning, it's very siloed.
Even within one hospital, it's a matter also of culture, like how much investment the healthcare sector is willing to do in IT in the public sector. I'm talking about the public sector, obviously; when you go to private clinics and the private sector, you see a big difference in how much investment they do with IT.
And for example, I was a couple months in the Pacific working on COVID outbreak response. I mean, it was like I had to put software on a USB and take a boat and go to an island, like there's no internet, you know what I mean?
I think often, the technology companies they produce but they don't realize that some people in the world, they don't have access to this because their app is too big for their phone.
Matt: You’ve got to have the latest version for it to be installed and that’s not always the case.
Giovanna: Exactly, I wanted to install some software on a colleague's computer in the islands, and they had to erase everything that was on their computer so I could put this software on. So, I mean, these are realities, that's why I think accessibility is important, it's a very neglected topic.
[Music Playing]
Matt: This is such an important point. For all Anthony's visions of a world where we are tracking data from early childhood development onwards, the reality of implementing these on a global scale is a long way off.
Not only when we think about accessibility in more developing countries but even getting to a point where universal health records are available in Europe and the U.S. too.
But with these technological changes, there needs to be an even further merging of the worlds of healthcare and IT as well. Making things which are universally accessible to all working in healthcare is really important, and Giovanna's work sits at the heart of it.
So, what changes does the industry need to go through to action this? What are some of the steps that countries and their regulators are going to have to make to ensure we get there? I went back to Anthony to find out.
We've touched on regulation; do you currently see the regulators and the regulatory pathways that are available? Are they suitable for the new world that we've described, and we've outlined here? How do you think that's maybe going to have to change in future to enable some of these new technologies?
Anthony: Before I actually answer that, I think one thing that is quite overlooked in the context of data regulation for healthcare is, are we to assume that healthcare, the current model that we see today is going to be the model that we are always operating within.
So, let me provide an example. Imagine you're a patient and you're concerned with your health or perhaps you're perfectly healthy and you're still concerned with your health and you just want to check up on what's going on with you.
If you collect your own data through a wearable, through some natural language processing, through some ocular motor tracking, where does that data go? I would argue that that data should stay with you. So, much like how you saw a big disruption in the consumer space, so for example, Amazon.
Consumers are very well-educated today, they have buying decisions at their fingertips. They have resources to understand how to make those decisions. How is the healthcare industry going to change when people become more informed on themselves and more informed on what decisions are possible with regards to their health?
So, the reason why I bring that up is because from a regulatory standpoint, the primary things that are being addressed today are around anonymized data, patient redistribution, machine learning, decision support systems.
But from a more innovative forward-looking perspective, I would argue that the vision isn't so clear from a regulatory body standpoint as to how they're going to handle that if the model changes. Especially in the long-term, if the model becomes more preventative rather than reactive.
Matt: That's really interesting. So, the whole economics of the healthcare system start to get flipped on its head and that offers real change to some of the fundamentals as well as some of the requirements of what the regulator actually needs and does.
Anthony: Exactly, and so, when you ask me what are the barriers for reaching this utopian vision in the future, on one hand you have the actual system itself, the healthcare providers, the patients, them becoming empowered and invested in this data generation.
But ultimately, on the other side, you have the regulatory bodies as well. But which one's going to come first? Are the regulatory bodies going to say, “Okay, we're going to embrace this future of big data and we're going to flip the model on its head and understand what that means from a regulatory environment,” or is the healthcare system going to push it from their end which is probably unlikely, and the regulatory bodies react.
Matt: And they're trying to play catch up with an evolving field. Sort of coming back to home base at TTP, are there some technology developments that are still required to enable this vision as well?
We've talked about the regulatory perspective, we've talked about some of the economics of healthcare and how that might have to change and adapt. But in terms of the actual technology that's driving this, are there some real investments and developments that are needed here before we can enable some of this?
Anthony: The key bits of the puzzle are business model generation, and clinical system integration. So, it basically comes down to the fact that the tech is more or less there.
We understand how to harness data, we understand how to make predictive algorithms but in terms of accelerating that innovation or accelerating that real world deployment, if you don't have a business case and you don't have a way of seamlessly or reliably integrating it and scaling it into clinical systems, it's never going to happen.
Matt: How can we make sure that this is something that really has impact worldwide?
Anthony: That's a great point. There's two things that come to mind here. One is the classic bias conversation that you can go down talking about the data sets and the algorithmic bias.
When you're creating these systems for the patient and they're engaging with them, you have to be sure that your design has elements to it that allow that system to scale over multiple demographics, different cultures, different expectations, but not only stop there, it has to be able to adapt as they interact with it and learn.
So, as a patient becomes more knowledgeable in certain areas or as their behavior changes, the system has to be intelligent enough in the long-term to compensate for those differences. And that is, again, what we are trying to get at with the precision and personalized medicine approaches.
So, it goes beyond just a functional usable UI, user interface. It incorporates everything from prompts, understanding the types of language and the tone to use, but people are complicated and data is complicated too. But to unravel a human and do that at scale, that's going to be a big challenge for us as we start to deploy these precision medicine techniques.
[Music Playing]
Matt: Personalized medicine is certainly a way forward here, but Anthony mentioned various things within these systems that need to be addressed before we can get there. A key to this is removing bias.
This is something that Giovanna feels really passionate about, especially given her work in developing countries, which are so often left out of the conversation and are unrepresented in the large data sets being used to train and develop these models.
I went back to her to hear about bias and set out a roadmap to utilizing data on a huge scale.
Is there a risk that if we have stricter regulation and tools are more specific about how they can be used, we then maybe can't deploy those tools in a more global context?
Is there a risk that, by putting that kind of regulation in place, maybe people don't want to develop things for a global audience and things stay very locked to maybe the U.S. healthcare system or European healthcare systems, and it stops tools being deployed more broadly?
Giovanna: Basically, I think the use of digital platforms is more prevalent now in the healthcare sector. So, we'll get more and more data, but I'm hopeful, and I'm a strong believer in synthetic data, because of reasons of privacy.
So, I have a model population for, I don’t know, Germany, or I don’t know, India, and of course, there are regional differences and you have to take this into account. Wouldn’t it be best to have synthetic data based on those populations that you could use and then share, so that if I want to use my tool in another country, I can go and try it and see whether that would fit?
So, it's not done at the moment in the public sector that I know of, but this would be for me something that would make sense. Maybe we don't have enough data, but hey, we have technology, we can use this data to make new data, synthetic data and lots of it. Because this is also, I was mentioning the data points in the health data, it's not like shares and clicks on social media.
And I think it's an important thing that people don't realize, that there is data in healthcare, but you have to unlock it because at the moment, it's not merged, it's not so easy to access. I mean, at least in Europe, there is a strong movement to standardize the data, et cetera. So, at the end I'm hopeful.
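To make the synthetic data idea above concrete, here is a deliberately simple Python sketch that learns summary statistics from a small observed sample and draws new, artificial records from them. The fields (`age`, `region`) and the sample values are hypothetical. Sampling each field independently, as done here, loses any relationship between fields and gives no formal privacy guarantee; real synthetic data generators are considerably more sophisticated.

```python
import random
import statistics
from collections import Counter


def synthesize(records, n, seed=0):
    """Draw n artificial records from per-field summaries of the observed data."""
    rng = random.Random(seed)  # seeded so the output is reproducible

    # Numeric field: approximate with a normal distribution.
    ages = [r["age"] for r in records]
    mean_age, sd_age = statistics.mean(ages), statistics.stdev(ages)

    # Categorical field: sample in proportion to observed frequencies.
    region_counts = Counter(r["region"] for r in records)
    regions = list(region_counts)
    weights = [region_counts[name] for name in regions]

    synthetic = []
    for _ in range(n):
        age = max(0, round(rng.gauss(mean_age, sd_age)))  # clamp negative ages
        region = rng.choices(regions, weights=weights, k=1)[0]
        synthetic.append({"age": age, "region": region})
    return synthetic


# Made-up observed sample standing in for real patient records.
observed = [
    {"age": 34, "region": "north"},
    {"age": 51, "region": "north"},
    {"age": 47, "region": "south"},
    {"age": 29, "region": "north"},
    {"age": 63, "region": "south"},
]

synthetic = synthesize(observed, n=100, seed=42)
```

None of the generated records corresponds to a real person in `observed`, which is the property that makes this kind of data easier to share across countries, as long as the generator itself does not leak the originals.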
Matt: Absolutely. Are there any other easy wins like this that we should be doing now that would help either with developing tech or deploying it in a global context, any other easy wins like that that would make a real difference?
Giovanna: I think community health, because when we talk about hospital, like you're sick and you go to the hospital, so that's the data point you get, like, “Oh, this person's already sick.”
But I think with prevention, if we had sensors, if I may say in the community, whether it's people willing to share their data with the wearables or even having health-conscious ambassadors if you want, like if you go to the barber or something.
So, I feel for years, the technology is already here, it's not that we lack the technology, this is not the case. It's just adopting the technology in a way that respects privacy that is trustworthy, there's accountability. For me, this is the bigger issue, also the production of these technologies for health.
If there is no diversity in the engineering team, for example, or in those companies, which sadly there is not, this is a very, very important thing, because if you all have the same culture, how can you produce something that's for everyone?
And this is where I'm like, it's of course, the technology companies hire engineers but maybe they should hire philosophers. Maybe they should hire social scientists, maybe they should hire psychologists more.
Matt: Are you optimistic that the challenges with bias here are possible to be overcome, and this will become less and less of an issue over time as these tools continue to be developed and deployed?
Giovanna: Yeah, I think frameworks are not perfect, of course, but with these frameworks, like I was mentioning in the UK or in the U.S. or in Europe, I think often people are well-meaning, to be honest. Like the engineering teams that I worked with that develop AI tools, they really do it because they want to help people.
So, you have to give them a framework that they can work with, and so I am hopeful that it's just we have to do it in a way that is iterative. Because often when you have guidelines and policy documents, to me, because I'm a pragmatic person, it's super vague and I'm like, “Okay, that's nice, but how's the deployment in real life?” And I need the use cases.
So, I think it should be intuitive, but I think the thing that works best is when you do co-design. So, you work with your users, whether they're patients, whether they are health practitioners, and they tell you how to change your tool so that it fits their own needs. Otherwise, they're not going to use it. It can be very fancy everything, but they're not going to use it.
[Music Playing]
Matt: Absolutely, that's brilliant, thank you very much, Giovanna. That's been super interesting to talk to you, thank you so much for coming on.
Giovanna: Oh, thank you Matthew, it was lovely.
Matt: So, big data, is it the route forward to the kind of personalized healthcare that all of our guests over this season have been craving? I think it could be.
Once the technology and the infrastructure to use it become universal and issues related to bias and data privacy are overcome, there is so much opportunity to change the way we think about treating patients on a huge scale.
But the fascinating thing for me is that both of our guests today have said that the tech is already there. The next test is to see whether we, the people working in health tech can harness it. That's all for this week, and indeed, this season of Invent Health from TTP, thanks so much for listening.
And a big thanks to Anthony, Giovanna, and all of our guests across this season for joining us. Invent Health will return later in the year with a new season, make sure to keep a lookout for it on your podcast feed.
And in the meantime, do tune in to TTP's sister podcast, Invent: Life Sciences, which will be coming back with season two very soon.
If you enjoyed this episode and want to let us know, please do get in touch on LinkedIn, Twitter or Instagram, you can find us at TTP. And don't forget to subscribe and review Invent Health on your favorite podcast app because it really helps others to find our show. We'll see you next time.
