EcoCyc Project Overview

EcoCyc is a bioinformatics database describing the genome and the biochemical machinery of E. coli K-12 substr. MG1655. The long-term goal of the project is to describe the molecular catalog of the E. coli cell, as well as the functions of each of its molecular parts, to facilitate a system-level understanding of E. coli. EcoCyc is an electronic reference source for E. coli biologists, and for biologists who work with related microorganisms.

The value of EcoCyc comes from its rich, extensively curated data content and its broad set of bioinformatics tools.

"EcoCyc" is pronounced "eeko-sike". It sounds like "ecology" and like "encyclopedia".

EcoCyc Data Content

See also the list of data sources from which EcoCyc integrates data.

Genome. EcoCyc contains the complete genome sequence of E. coli, and describes the nucleotide position and function of every E. coli gene. A staff of three curators updates the annotation of the E. coli genome on an ongoing basis using a literature-based curation strategy. Mini-review summaries of E. coli gene products can be found in EcoCyc protein and RNA pages. Users can retrieve the nucleotide sequence of a gene, and the amino-acid sequence of a gene product.

Regulation. EcoCyc contains the most complete description of the regulatory network of any organism, including E. coli operons, promoters, transcription factors, transcription-factor binding sites, attenuators, and small-RNA regulators.

Membrane transporters. EcoCyc annotates E. coli transport proteins, and the associated transport reactions that they mediate.

Metabolism. EcoCyc describes all known metabolic pathways and signal-transduction pathways of E. coli. It describes each metabolic enzyme of E. coli, including its cofactors, activators, inhibitors, and subunit structure. See also the MetaCyc project.

Database links. EcoCyc is linked to other biological databases containing protein and nucleic acid sequence data, bibliographic data, protein structures, and descriptions of different E. coli strains.

Literature-Based Curation. Curation is the process of manually refining and updating a bioinformatics database. The EcoCyc project uses a literature-based curation approach in which database updates are based on evidence in the experimental literature. EcoCyc is largely up to date with respect to its curation activities. As of February 2024, EcoCyc has encoded information from more than 44,000 publications.

Curators collect gene, protein, pathway, and compound names and synonyms. They classify genes and gene products using Gene Ontology, and they classify pathways within the Pathway Tools pathway ontology. Protein complex components and the stoichiometry of these subunits are captured; cellular localization of polypeptides and protein complexes is entered, as are experimentally determined protein molecular weights; enzyme activities and any enzyme prosthetic groups, cofactors, activators, or inhibitors are captured. Operon structure and gene regulation information are encoded.

Curators author textual summaries with extensive citations for proteins, RNAs, pathways, and operons that capture phenotypes caused by mutation, depletion, or overproduction of each gene product; any genetic interactions known; protein domain architecture and structural studies; similarity to other proteins; or any functional complementation experiments that have been described. Summaries can also be used to note cases in which the published reports present contradictory results. In such cases, both viewpoints will be presented with proper attribution.

EcoCyc Bioinformatics Tools

The EcoCyc software tools support a number of search, visualization, and analysis operations.

Search and visualization. Scientists can use the EcoCyc web site to search for genes, pathways, metabolites, etc. The navigation capabilities of the software enable a user to move from a display of an enzyme to a display of a reaction that the enzyme catalyzes, or to the gene that encodes the enzyme. The EcoCyc genome browser visualizes the layout of genes within the E. coli chromosome, and the metabolic network browser provides a zoomable view of the full E. coli metabolic network.
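The navigation described above, from an enzyme to a reaction it catalyzes or to the gene that encodes it, can be pictured as following links between connected records. The following is a minimal sketch of that idea only. The class names, field names, and the `neighbors` helper are hypothetical and do not reflect the EcoCyc schema or any EcoCyc or Pathway Tools API. The example record uses the well-known E. coli gene pfkA and its product, 6-phosphofructokinase I.

```python
from dataclasses import dataclass, field


@dataclass
class Gene:
    name: str


@dataclass
class Reaction:
    name: str
    equation: str


@dataclass
class Enzyme:
    name: str
    encoded_by: Gene
    catalyzes: list[Reaction] = field(default_factory=list)


# Toy records for illustration (hypothetical structure, not EcoCyc's schema).
pfkA = Gene("pfkA")
pfk_reaction = Reaction(
    "6-phosphofructokinase",
    "ATP + D-fructose 6-phosphate -> ADP + D-fructose 1,6-bisphosphate",
)
pfk1 = Enzyme("6-phosphofructokinase I", encoded_by=pfkA, catalyzes=[pfk_reaction])


def neighbors(enzyme: Enzyme) -> dict[str, list[str]]:
    """Return the names of the records reachable in one step from an enzyme."""
    return {
        "gene": [enzyme.encoded_by.name],
        "reactions": [r.name for r in enzyme.catalyzes],
    }


print(neighbors(pfk1))
```

Each key in the returned dictionary corresponds to one kind of link a user could follow from the enzyme display. A real knowledge base holds many more record types and relationships than this toy model.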

Analysis of omics data. EcoCyc provides four tools for analysis of single-omics datasets. Two of these tools, the Cellular Omics Viewer and the Dashboard, can also be used to analyze multi-omics datasets.

Comparative genomics. We have computed orthologs between EcoCyc and the 20,000 other microbial genomes within BioCyc, including 502 other E. coli strains. The genome browser can be run in a comparative mode that aligns multiple chromosomes at orthologous genes. Metabolic network comparison tools are also available.

EcoCyc Certification

EcoCyc has been designated a Global Core Biodata Resource by the Global Biodata Coalition, meaning that EcoCyc is one of 37 resources whose long-term funding and sustainability are critical to life-science and biomedical research worldwide.

We Encourage Your Feedback

Feedback from the scientific community has been invaluable to improving EcoCyc during its many years of development. We strongly encourage your comments and suggestions for improvements. Please email us your suggestions or questions.


The development of EcoCyc is funded by NIH grants 1R24GM150703 and GM71962 from the NIH National Institute of General Medical Sciences.

Contributors to EcoCyc are listed on the credits page.