Molecular pathway databases represent cellular processes in a structured and standardized way. These databases support the community-wide utilization of pathway information in biological research and the computational analysis of high-throughput biochemical data. Although pathway databases are critical in genomics research, the fast progress of biomedical sciences prevents databases from staying up-to-date. Moreover, the compartmentalization of cellular reactions into defined pathways reflects arbitrary choices that might not always be aligned with the needs of the researcher. Today, no tool exists that allow the easy creation of user-defined pathway representations.
Here we present Padhoc, a pipeline for pathway ad hoc reconstruction. Based on a set of user-provided keywords, Padhoc combines natural language processing, database knowledge extraction, orthology search and powerful graph algorithms to create navigable pathways tailored to the user’s needs. We validate Padhoc with a set of well-established Escherichia coli pathways and demonstrate usability to create not-yet-available pathways in model (human) and non-model (sweet orange) organisms.
Salvador Casaní-Galdón, Cecile Pereira, Ana Conesa, Padhoc: a computational pipeline for pathway reconstruction on the fly, Bioinformatics, Volume 36 (2):i795–i803, December 2020.