Unlocking the page – 20+ years of digital innovation at Elsevier

Posted on December 2nd, 2021 in AI & Data

Mark Sheehan is VP
Data Science on Elsevier’s Life Science team. His 20 years at the company map
closely with Elsevier’s digital journey over that time. And today,
pharmaceutical companies can follow a similar journey – albeit highly
accelerated – using Elsevier’s latest AI-driven R&D boosters. Mark has enjoyed
many valuable experiences along the way, from the joys of cracking open a newly
printed book, to enabling people to speedily crack new synthetic pathways at
scale. “Yes, innovation always involves new technology. But it’s equally about
human collaboration,” he says.

When Mark joined
Elsevier in 2002 as a project manager, the company was in the first phase of transitioning
away from being primarily a print publisher. With each technological transition
that followed – from the move from print to online, through to content
enrichment, and now to predictive modeling – Mark was closely involved with Elsevier’s
transformation into an innovator in information analytics. Today, he
leads the data science team for the Life Sciences, exploring how AI and other
technologies can streamline the research and development work of chemists and
biologists.

In many ways, Mark’s
career is not only a mirror of Elsevier’s evolution over the same time, but also
of the current evolution of many companies and organizations that are embracing
digital transformation to achieve their short- and long-term goals. We asked
Mark to look back at some of his pivotal professional moments that may resonate
with those on this journey.

FIRST WAVE: THE DIGITIZATION OF INFORMATION

So, you were there at the internet’s big bang …

Mark Sheehan: In some
ways I did get the full spectrum. When I started in publishing, immediately prior
to Elsevier, I was working in a very small journals publisher where they
copyedited, typeset and printed the journals themselves all in the same small
building in North London. And when I arrived at Elsevier it was a classic case
of right place, right time. Luckily, I proved to be a terrible copy
editor and proofreader – I just didn’t have the patience for it. But I did have
a natural affinity for computers and some rudimentary programming skills,
backed by a willingness to learn.

My first boss at
Elsevier was a great guy and very supportive of my growth. But we came from very
different worlds. He would love it when the first copies of a book came in from
the printer – he’d pick one up, sniff the fresh smell of ink on paper, and
vigorously shake it upside down to make sure the binding was good. Meanwhile, I
was this enthusiastic puppy evangelizing about the internet and jumping up and
down to take on each and every task relating to the shift online. He largely
left me and the other geeks to it, and put a lot of trust in us to lay the
foundations for the future.

By then ScienceDirect
was already up and running – a sign that the, um, shelf life of those printed
books was becoming limited.

ScienceDirect was actually an amazing strategy for the time. As a subscription service offering all Elsevier’s content in one place at one price, there are some comparisons with what Spotify would do with music years later. So, it was a bold shift for the whole company, first in journals then in books.

Doomsayers were predicting the death of the book.

Indeed. With the rise
of desktop publishing, the whole notion of a book as a container of information
was fundamentally shifting. We began thinking about what unit of information
customers ultimately wanted. How could we best organize these articles
and journals for them? Did they want the full book or just a chapter? Again,
it’s similar to music, when customers shifted from buying CDs to purchasing individual
tracks on iTunes. Forgive me, I do use a lot of music analogies, but both these
industries were fundamentally disrupted in parallel.

But as a legacy print publisher, Elsevier must have had some grumpy employees unwilling to embrace the shift to digitization. And Elsevier is a rather huge company.

Yes, it’s a common perception
that larger companies are slow to change. And there were challenges, let’s be
clear on that. Some people really cared about the craft of print, which is
great – and print does remain important in many markets. And yes, some colleagues
got upset sometimes that we had to standardize our print designs so they could
also work online, which I can also sympathize with. And indeed, to push through
change, you have to consider the human implications as much as the technical
ones. But most were quick to see the bigger story was not about paper but about
the transmission of information – and being able to unlock what was written on
the page for the largest scientific audience possible.

SECOND WAVE: LEVERAGING THE DATA

So, the average Elsevier employee proved to be less obsessed with books than with information?

Once they saw that
the internet wasn’t a threat and wasn’t taking anything away, but rather just changing
how we disseminated the information, the shift was clear for all of us. And as
individual consumers, we were all evolving with the times: getting iPods,
Kindles, laptops, et cetera. Everyone was aware of where the world was going,
because we were part of that world in the middle of this amazing shift in
society. And yes, some worried about whether their job would become obsolete. But
they soon recognized that their skills were still valuable and/or adaptable in
tandem with these changes.

But the true tipping point, the payback, only came later – when e-revenue eclipsed print revenue.

Yes, every year the
digital revenue grew and grew, and suddenly there was this tipping point when
we stopped focusing exclusively on when the book hit the warehouse for sale,
and also looked at when it appeared on ScienceDirect or was available on the Amazon store.
Basically, we were leveraging what we already had spent years building: a solid
digitized foundation. So, this second wave was much less bumpy; it was more
about providing more and more different types and variations of deliverables
for different consumers.

And as this reach
continued to extend, the way people consumed our content fundamentally changed
as well – it was no longer exclusively about browsing through the library and
reading many physical copies to find what you needed. It was becoming more
about how to optimize your search across many online sources and databases. So,
the challenge became more about streamlining the finding of the digital needle
in the exponentially expanding haystack.

THIRD WAVE: FROM MANUAL TO AUTOMATED

So, the next step was to leverage the digitized text even further: applying data science and AI to enrich, and extract from, this information in whole new ways to help with digital search and discovery.

My department first
came together at Elsevier five or six years ago to look into what we could do to
move the needle on data science in the Life Sciences. We discovered early on we
could do a lot of things to automate our traditional manual curating processes,
which previously involved very smart people reading all these articles,
literally page by page, and doing all sorts of clever annotations based on
following these dense indexing “rule books”. But we also discovered that no
matter how much technology you use, there’s a human limit to how much you can scale
this approach.  

So, to move forward, we began to wonder what would happen if we could teach the machine to read it all, do some initial enrichment that would update our customers quickly on new research, and also flag any interesting material to be read and indexed in detail by an expert. Luckily, Elsevier had some in-house tech pioneers who had already started building automated tools and paths that we could quickly extend for processing at scale – particularly for our chemistry database Reaxys and our biomedical literature database Embase.

Can you tell us more on how the humans-meet-machine-learning axis was applied in Reaxys?

Basically, we
realized two things. First, that a single “silver bullet” technology doesn’t
exist for our use cases. It certainly won’t give you the range of what a human
being can do, nor what your customers are asking for. But if you stack
different and complementary technologies together, they can work to cover
different elements and you can get a much better view – “more sides of the
elephant”, as it were. 

Second … When this
department first started, there was a team of about 20 PhD-level chemists and
biologists who, in some cases, had been carefully curating these enrichment
flows for 30-plus years. We then brought these ‘manual’ domain experts together
with our data scientists and analysts to work together to ‘train the machine’.
This led to amazing results – as seen with our current patent coverage, for
example.

In just that first
year of our team coming together, we were able to deliver new automated
capabilities for Reaxys that could enrich articles from 16,000 journal titles
per year, versus the 400-odd we processed previously.

NEXT-LEVEL MAGIC

What do you see as the next huge leap forward? 

Well, if I can
backtrack a moment … We’ve been talking about the digital journey from content
(such as the books and journals on ScienceDirect) to data (such as the facts and
concepts indexed from those books and journals for easier search and
discovery). But the power of machine learning can also be used for predictions
– to, in effect, teach the machine chemistry by feeding it massive volumes of
complex chemical reactions and facts. 

For example, machine
learning models can not only identify well-established
paths to create a certain compound as well as a trained chemist can, but they can
also suggest previously unknown paths to synthesize that compound – paths that
can be cheaper, faster, and more environmentally friendly.

Already, Entellect’s reactions workbench can be used to create such models from our Reaxys data and other high-quality data sources. And the Reaxys Predictive Retrosynthesis tool helps even very experienced chemists by suggesting new synthetic paths using a range of best-in-class proven predictive models. Meanwhile, we are continuing to work with a number of leaders in the field of predictive retrosynthesis, such as Professor Mark Waller, who published a very famous paper in Nature, ‘Planning chemical syntheses with deep neural networks and symbolic AI’, which provided some of the foundational work for Reaxys.

And this process of
predictive modeling can continuously improve as we add more enriched data, and
as our human experts validate the outputs of the machine learning models.
There’s still so much more opportunity in this space, and research continues to
move forward all the time.

What projects excite you most in terms of furthering this ‘continuous improvement’?

Well, without giving away too many company secrets, we have made some fantastic advances in recent years to mine the full text of chemistry-related journal articles. And since Elsevier acquired SciBite last year, we are working to add their powerful semantic technologies into our “data science toolkit” for further advances, particularly in the biomedical space. But in general, we will continue to expand our automation capabilities, while also moving deeper into the predictive chemistry space that I mentioned earlier.

We also have a very productive research collaboration with Professor Karin Verspoor and her doctoral team in Australia via our ChEMU (Cheminformatics Elsevier Melbourne University) collaboration, related to automating ways to extract information about chemical reactions in chemical patents. And as a result, we are making real headway into the many and varied challenges of training a machine to read tables and accurately extract information from them – which is very valuable for chemists and other researchers, as you can well imagine.

We are also doing a lot of work with our research partners led by Dr. Gordon Broderick at Rochester Institute of Technology looking at ‘in silico biology’, where you try to do as much early experimentation as possible in the computer prior to live testing – which could dramatically cut the time clinical trials take, as well as the risks. So, we are really moving forward in all sorts of interesting directions at the moment.

So, moving forward means partnering?

Absolutely. With all
these different directions and partnerships taking place in this predictive
space, we are part of a large research community. And certainly, the Life Sciences
require more collective engagement than many other sectors – in terms of
involving academia, the business world, policymakers and regulatory bodies.

And we are very much a part of this community – whether it’s partnering with more academic institutions, supporting researchers at all stages of their careers, expanding our intern program, or inspiring the younger generations with such initiatives as Amsterdam Data Science. Only together can we really build a healthier future.
