
How to get the most out of data warehousing and analytics – Transcript of interview with Steve Sweeney

Steve Sweeney, Program Director, Arteaus Therapeutics

With the volume of medical data doubling every 5 years, the challenge has now become how to derive actionable insights from massive amounts of information. Tools are now available to help doctors and researchers better manage and mine a vast array of medical information in their pursuit of realizing the promises of personalized medicine and its role in the fight against cancer.

In this interview, Steve Sweeney, Program Director of Arteaus Therapeutics, offers valuable advice to small and mid-sized drug developers and discusses the challenges and opportunities presented by the increase in cancer biomarker discoveries. He also talks about the role big data warehousing and analytics play in helping oncology drug developers make better decisions, not to mention significant discoveries that otherwise might have gone undetected.

With an extensive background in clinical operations, Steve also brings a wealth of experience around information technology solutions for pharmaceutical applications. Please share your thoughts and experiences or ask a question about this important subject.

Tell us about your experience with information technologies in healthcare.

My interest in technology started back in the early ’90s when I was a CRA at Quintiles. I volunteered for a number of their technology initiatives, beginning with some of their early CTMS projects, and have been working in the technology space ever since. I have worked on a variety of clinical trial management system implementations, EDC build-outs, and data warehousing initiatives, as well as some work in the biospecimen management and repository space.

It’s always been a passion of mine – something that I’ve had as my side job while working in clinical operations for the last 15 years.

What challenges and opportunities do you think the increase of cancer biomarker discoveries and focus on targeted therapeutics present to small and midsize oncology drug developers today?

What struck me is that the design and complexity of the data we must now manage is mind boggling. There’s a massive amount of clinical trial data, including genetic data, proteomic data, imaging data, and the list goes on and on. There is a huge amount of data at our fingertips, but we have a hard time making use of it all.

We do use the data fairly well in the context of the study itself, but outside of that it becomes harder for us to work with. Finding strategies to make use of that data, in both the short and the long term, is a key competitive advantage for a pharmaceutical company. We all know that pharmaceutical companies are strongly rooted in science, but they are also becoming information management companies. The question is whether they are making the best use of all the data they possess.

What role can data warehousing play in helping oncology drug developers deal with some of these challenges?

The ability to combine or query your clinical datasets with your biomarker and genetic datasets, or even pre-clinical data, is greatly enhanced if you use a data warehouse.  For example, at my previous company, Affinity Pharmaceuticals, we were developing a data warehouse for our molecular pathology group. With that thought in mind, imagine that you have a comprehensive clinical dataset and that all of your clinical data is being managed in a standardized way across all of your products and indications.

How much easier would it be to mine and analyze that data in one environment? Now consider integrating that with other internal warehouses or even external datasets and the possibilities become pretty amazing. The biggest problem we have today is that all of this information is created in silos. Combining all of that information becomes extraordinarily complex when it is housed in separate databases.

Therefore, aggregating and standardizing the information is extremely important in managing it effectively, and this is where a data warehouse is key.
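As a rough illustration of the point above, the sketch below loads a clinical table and a biomarker table into a single in-memory SQLite database and queries them together. The table names, column names, patient IDs, and values are all made up for illustration; they are not from any system described in the interview, and a real clinical data warehouse would involve far more in the way of standards and governance.

```python
import sqlite3


def build_demo_warehouse():
    """Create an in-memory database holding two formerly separate datasets.

    All tables and rows here are hypothetical illustration data.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE clinical (patient_id TEXT, study TEXT, best_response TEXT);
        CREATE TABLE biomarker (patient_id TEXT, gene TEXT, status TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO clinical VALUES (?, ?, ?)",
        [
            ("P001", "STUDY-A", "PR"),
            ("P002", "STUDY-A", "PD"),
            ("P003", "STUDY-B", "CR"),
        ],
    )
    conn.executemany(
        "INSERT INTO biomarker VALUES (?, ?, ?)",
        [
            ("P001", "KRAS", "mutant"),
            ("P002", "KRAS", "wild-type"),
            ("P003", "KRAS", "mutant"),
        ],
    )
    conn.commit()
    return conn


def response_by_mutation(conn, gene):
    """Count best responses per mutation status, across all studies."""
    return conn.execute(
        """
        SELECT b.status, c.best_response, COUNT(*)
        FROM clinical AS c
        JOIN biomarker AS b ON c.patient_id = b.patient_id
        WHERE b.gene = ?
        GROUP BY b.status, c.best_response
        ORDER BY b.status, c.best_response
        """,
        (gene,),
    ).fetchall()


if __name__ == "__main__":
    for row in response_by_mutation(build_demo_warehouse(), "KRAS"):
        print(row)
```

The join only works because both tables share a standardized patient identifier, which is the practical reason standardization comes before aggregation.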

What about data analytics? How can the technology available today guide drug development decisions?

It’s really all about speed. When you warehouse your data and you have visualization technology on top of it, you have the ability to review data almost in real time, depending on how you’re set up. Over the course of my career I’ve seen several ad hoc requests come from data review committees, institutional review boards, and so on. It’s amazing how quickly we are able to pull that information together and review it on the fly. Even if you only think you have a safety signal, your ability to mine and analyze that data in a warehouse environment is just that much quicker than with traditional methods.

What has technology allowed you to find that may have taken longer had you not applied visual analytics?

Analytical tools like Spotfire on top of a warehouse allow you to quickly build very compelling visualizations, without the need for slow and complicated programming steps in between. We’ve built a number of quick, ad hoc analyses and visualizations on top of a warehouse for things like mining a safety signal or responding to a data monitoring committee or ethics review board. It really does boil down to speed. We’ve been able to pull and analyze our data and get that data out to a variety of customers much faster than we were able to do in our old traditional set-ups.

How do you see visual analytics being used for drug development in the future?

We’ve used visualizations to review aggregated safety data, mutational status data, efficacy data and many other things.  One of the most exciting things for me has been the trend towards making the data management and medical monitoring functions a far more visual practice.  The human eye is remarkably adept at picking up trends and outliers. Visualizations can enable you to spot these trends in a matter of seconds compared to a typical tabulated output, for example.

Another advantage of the warehouse environment is that you can start looking at these trends and spot erroneous data very early on in the study.  Traditionally we would run our data review meetings toward database lock, looking for this erroneous data in an aggregated sense, which is harder to pick up on a patient-by-patient basis.

We’re now able to pick these things out very early on in the process, so I think you’ll see data management and medical monitoring functions become far more visual and efficient in the future due to the adoption of these technological advances.
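The kind of early check described above can also be approximated numerically. The sketch below is only a stand-in for what the eye does with a plot, not a description of any tool mentioned in the interview: it flags values that sit far from the median using a modified z-score based on the median absolute deviation. The function name, the 3.5 threshold, and the lab values are assumptions chosen for illustration.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return the indexes of values whose modified z-score exceeds threshold.

    The modified z-score uses the median and the median absolute deviation
    (MAD), so a single bad entry does not distort the baseline the way it
    would distort a mean and standard deviation.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half the values equal the median; nothing can be scored.
        return []
    return [
        i
        for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]


if __name__ == "__main__":
    # Hypothetical lab values; 41.0 looks like a misplaced decimal point.
    lab_values = [4.1, 4.3, 3.9, 4.0, 41.0, 4.2]
    print(flag_outliers(lab_values))
```

Run against data as it accumulates rather than at database lock, a check like this surfaces a likely entry error while the site can still correct it.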

At this year’s Outsourcing Clinical Trials New England event you mentioned that cost could be a major deterrent to clinical data warehousing.  What are the main factors that drive cost?

There is just not a lot of software available right now.  Many companies that want to get into this space are forced to build their own warehousing solution. The systems that are commercially available can be very expensive, and to be honest, do not provide out of the box functionality, so you still have to invest resources into customizing it to your needs.

That being said, there are a number of vendors that now offer data warehousing under the software-as-a-service model. I know that more and more vendors are coming online and offering this warehousing service so that you don’t need to have that infrastructure in house.

What advice would you give a small or midsized drug developer to make it more affordable?

If I was starting from scratch, I would obviously network with others that are in this space. It has not been widely adopted, so there are a lot of different opinions on how warehousing should be performed, which is a function of the newness I suppose.  I would recommend you assess all the vendors that offer data warehousing and then decide whether to build or buy, and obviously that’s going to be a function of your company’s strategic direction.

Don’t forget that you’ll need the staff to support all of the data mapping needs and visualization builds that are going to occur if you bring this technology in-house.  The advantage of building is that you’re in total control, but you’re going to have to make that upfront investment in the technology and the staff to make use of it.  For companies that are in the commercial space or maybe expect to enter the commercial space, I would recommend bringing the technology in house.  For virtual companies, I would expect they would outsource this function because they’re not going to want to make that big upfront investment.

When I say bring the technology in house, I don’t mean all the servers have to sit inside your building. You can push stuff out into the cloud.  That’s fine.  The point is that you should control the technology and the staff around it.  Because it’s a relatively new technology, to make the best use of it, you need highly trained people, who are very aware of your drug development programs and your data strategy.

In her eBook entitled Big Data in Healthcare, Hype and Hope, Dr. Bonnie Feldman from DrBonnie360 stresses that poor quality data hinders the ability to analyze and make decisions. What advice can you give a small or midsized oncology drug developer in order for them to optimize their clinical research efforts?

One of the beautiful things about warehousing your data is that it gets you to think about standardizing and optimizing your data collection right up front. The less standardized your data are, the dirtier they obviously are, and the more difficulty you’re going to have in warehousing them. So I agree, poor data quality hinders the ability to analyze and make decisions. That makes perfect sense.

The way to get around that is obviously to put more controls in place in terms of how you’re cleaning your data up front, but the fact that you need to aggregate your data in a warehouse certainly helps you in that thinking process. It really allows you to enforce some standards across the board that will reap benefits not only in terms of your warehousing, but also in terms of general efficiency. You will get much more familiar with how you collect data across your entire drug program.


