The Pragmatic Text Miner

How quickly do you read?  According to the results of an online speed-reading test by Staples, the office supplies company, the average senior executive reads 575 words per minute, while the average college professor clocks in at 675.  The rest of us manage only about 300 words per minute, roughly half as fast.

Why does this matter?  According to the U.K.-based JISC, the global research community generates over 1.5 million new scholarly articles each year.  Even a champion speed-reader who did nothing but read could never keep up.  That’s why life science companies increasingly rely on text and data mining software to identify and extract from this information mountain the research equivalents of diamonds, rubies, and sapphires.

To enable access to and sharing of the tremendous volume of published research available today, Copyright Clearance Center provides licensing and content workflow solutions for life science companies around the world.  We understand what a daunting challenge it can be to manage and mine this information mountain range.  Within the millions of words and illustrations in libraries and databases, the potential for insights and innovation is enormous, from drug discovery and clinical trial development to drug safety monitoring and competitive intelligence.

In a February 11, 2015 webinar, Lars Juhl Jensen, professor at the Novo Nordisk Foundation Center for Protein Research at the Panum Institute, Copenhagen, examined the challenges researchers face in building large collections of content to mine, and he offered suggestions for what bioinformatics professionals need to address these issues. As part of the free program (a full webinar recording is available), Prof. Jensen spoke with CCC’s Chris Kenneally.

Lars Juhl Jensen started his research career in Søren Brunak’s group at the Technical University of Denmark (DTU), where he received his Ph.D. in bioinformatics in 2002 for his work on non-homology-based protein function prediction. During this time, he also developed methods for visualization of microbial genomes, pattern recognition in promoter regions, and microarray analysis. From 2003 to 2008, he was at the European Molecular Biology Laboratory (EMBL), where he worked on literature mining, integration of large-scale experimental datasets, and analysis.
