Full EC Number Predictor.

Motivation: Assigning a full EC number to a given biochemical transformation is challenging, yet it provides valuable insights for metabolic engineering and synthetic biology.
Results: In the present work (ECAssigner4), reaction diversity fingerprints (RDF) are newly proposed for assigning EC numbers to enzymatic reactions. Reaction structures are converted to RDF, to reaction similarity fingerprints (RSF) at different fingerprint lengths, and to reaction transformation fingerprints (RTF) at three levels. To validate the proposed method, 1627 balanced Rhea enzymatic reactions whose EC numbers are each represented by more than one reaction were selected as the training data set. We tested thirteen methods using RTF, RDF, RSF, or their combinations to predict the full EC number. Cross-validation shows that combining RSF at fingerprint length 10 with RDF at length 3 gives the best result: a precision of 71.21% for full EC number prediction. As far as we know, ECAssigner4 is the first web server for full EC number assignment using only reaction structures; it is backed by more than 100,000 manually curated RxnFinder biosynthesis reactions.
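To make the idea of fingerprint-based EC assignment concrete, the following minimal sketch assigns a query reaction the EC number of its most similar reference reaction. It is an illustration under stated assumptions, not the ECAssigner4 implementation: the abstract does not specify the similarity measure or the classifier, so Tanimoto similarity and a nearest-neighbour rule are assumed here, and the names `tanimoto` and `assign_ec`, the bit positions, and the pairing of fingerprints with EC labels are all hypothetical.

```python
from typing import FrozenSet, List, Optional, Tuple

# A fingerprint is modelled as the set of "on" bit positions (an assumption;
# real RDF/RSF/RTF encodings are not described in this abstract).
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity between two bit-set fingerprints."""
    if not a and not b:
        return 1.0  # two empty fingerprints are treated as identical
    return len(a & b) / len(a | b)


def assign_ec(
    query: Fingerprint,
    reference: List[Tuple[Fingerprint, str]],
) -> Tuple[Optional[str], float]:
    """Return the EC label of the most similar reference reaction and its score."""
    best_ec: Optional[str] = None
    best_score = -1.0
    for fingerprint, ec_number in reference:
        score = tanimoto(query, fingerprint)
        if score > best_score:
            best_ec, best_score = ec_number, score
    return best_ec, best_score


# Toy reference set: the fingerprints are invented for illustration only.
reference = [
    (frozenset({1, 4, 7, 9}), "1.1.1.1"),
    (frozenset({2, 3, 8}), "2.7.1.1"),
    (frozenset({1, 4, 6, 9}), "1.1.1.2"),
]

query = frozenset({1, 4, 7})
ec, score = assign_ec(query, reference)
print(ec, round(score, 2))
```

In this toy case the query shares three of four bits with the first reference reaction and is therefore labelled with that reaction's EC number. A combined method such as RSF plus RDF could, under the same assumptions, be modelled by concatenating or jointly scoring the two fingerprints.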