Show simple item record

dc.contributor.author: Oyetunde, T.
dc.contributor.author: Liu, D.
dc.contributor.author: Garcia-Martin, H.
dc.contributor.author: Tang, Y.J.
dc.date.accessioned: 2019-06-06T13:59:15Z
dc.date.available: 2019-06-06T13:59:15Z
dc.date.issued: 2019-01-01
dc.identifier.issn: 1932-6203
dc.identifier.uri: http://hdl.handle.net/20.500.11824/983
dc.description.abstract: Metabolic models can estimate intrinsic product yields for microbial factories, but such frameworks struggle to predict cell performance (including product titer or rate) under suboptimal metabolism and complex bioprocess conditions. Machine learning, which is complementary to metabolic modeling, requires large amounts of data. Building such a database for metabolic engineering designs requires significant manpower and is prone to human error and bias. We propose an approach that integrates data-driven methods with a genome-scale metabolic model to assess microbial bio-production (yield, titer, and rate). Using engineered E. coli as an example, we manually extracted and curated a data set comprising about 1200 experimentally realized cell factories from ~100 papers. We then augmented the key design features extracted from the literature (e.g., genetic modifications and bioprocess variables) with additional features derived from simulations of the genome-scale metabolic model iML1515, run with constraints that match the experimental data. Data augmentation and ensemble learning (e.g., support vector machines, gradient boosted trees, and neural networks in a stacked regressor model) are employed to alleviate the challenges of sparse, non-standardized, and incomplete data sets, while multiple correspondence analysis and principal component analysis are used to rank the factors that influence bio-production. The hybrid framework demonstrates reasonably high cross-validation accuracy in predicting E. coli factory performance metrics under presumed bioprocess and pathway conditions (Pearson correlation coefficients between 0.8 and 0.93 on new data not seen by the model). [en_US]
dc.format: application/pdf [en_US]
dc.language.iso: eng [en_US]
dc.rights: Reconocimiento-NoComercial-CompartirIgual 3.0 España [en_US]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-sa/3.0/es/ [en_US]
dc.title: Machine learning framework for assessment of microbial factory performance [en_US]
dc.type: info:eu-repo/semantics/article [en_US]
dc.relation.publisherversion: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0210558 [en_US]
dc.relation.projectID: ES/1PE/SEV-2017-0718 [en_US]
dc.rights.accessRights: info:eu-repo/semantics/openAccess [en_US]
dc.type.hasVersion: info:eu-repo/semantics/acceptedVersion [en_US]
dc.journal.title: Plos ONE [en_US]


Except where otherwise noted, this item's license is described as Reconocimiento-NoComercial-CompartirIgual 3.0 España