Cyclebase.org

Methods

Data Normalization

All data in Cyclebase are normalized in the same manner.

Expression values: The expression values were first log2 transformed. To center each profile at zero, its mean over all time points was subtracted from every value. To make the y-axes comparable between experiments, the expression values were then scaled so that the standard deviation across the entire experiment is one.
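The three steps above can be sketched as follows. This is a minimal illustration, not the Cyclebase code: the function name `normalize_experiment`, the dictionary layout, and the use of the population standard deviation are assumptions made for the example.

```python
import math
import statistics

def normalize_experiment(raw):
    """raw: dict mapping gene -> list of positive expression values (one per time point)."""
    # Step 1: log2 transform.
    logged = {g: [math.log2(v) for v in vals] for g, vals in raw.items()}
    # Step 2: center each profile at zero by subtracting its mean over time points.
    centered = {}
    for g, vals in logged.items():
        m = statistics.fmean(vals)
        centered[g] = [v - m for v in vals]
    # Step 3: scale so the standard deviation over the whole experiment is one.
    # Assumption: population standard deviation, pooled over all genes and time points.
    pooled = [v for vals in centered.values() for v in vals]
    sd = statistics.pstdev(pooled)
    return {g: [v / sd for v in vals] for g, vals in centered.items()}
```

An experiment in which every profile is flat has a pooled standard deviation of zero and cannot be scaled this way; the sketch does not handle that case.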

Time-scale: The original time-scale, in minutes from the start of the experiment, was first divided by the interdivision time (the time it takes to complete one cell cycle). This gives a time-scale in percent of a cell cycle. Since different experimental methods release cells from different points in the cycle, each experiment was then shifted so that zero always corresponds to the time of cytokinesis (M/G1 transition).
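As a small worked example of the time-scale conversion, the sketch below maps minutes to percent of a cell cycle. The function and parameter names are hypothetical, and it assumes the time of cytokinesis (in minutes) is already known for the experiment.

```python
def to_cell_cycle_percent(minutes, interdivision_min, cytokinesis_min):
    """Convert times in minutes to percent of a cell cycle, with 0 at cytokinesis."""
    return [100.0 * (t - cytokinesis_min) / interdivision_min for t in minutes]

def phase(percent):
    """Fold a percent value into the range [0, 100)."""
    return percent % 100.0
```

For an interdivision time of 120 minutes with cytokinesis at 30 minutes, `to_cell_cycle_percent([0, 30, 60, 120], 120, 30)` gives `[-25.0, 0.0, 25.0, 75.0]`, and `phase(-25.0)` is `75.0`.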

Rank

The rank orders the genes of an organism by a score we have assigned based on each gene's pattern of expression and magnitude of regulation. Genes that are the most periodic and the most strongly regulated receive the best ranks (lowest numbers).

The rank combines the P-value for regulation with the P-value for periodicity. First, a P-total value is calculated by multiplying P(per) by P(reg). Because this product can unfairly favor a gene for which only one of the two values is very small, the combined score penalizes genes that are not both regulated and periodic:

[Equation image not reproduced: combined rank score]

This combined score is sorted and the genes are given their rank based on this order.

To calculate the total rank for a single gene across all available experiments, the total P-value for regulation and total P-value for periodicity are used (in the combined score equation) instead of just the single experiment P-values.
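The ranking procedure can be sketched as below. The combined score equation itself is not reproduced on this page, so the sketch takes it as a caller-supplied function `combined_score`; that parameter, the function names, and the dictionary layout are all assumptions made for illustration.

```python
import math

def total_p(p_values):
    """Total P-value across experiments: the product of the per-experiment values."""
    return math.prod(p_values)

def rank_genes(p_per, p_reg, combined_score):
    """p_per, p_reg: dict mapping gene -> list of per-experiment P-values.

    combined_score: function (p_per_total, p_reg_total) -> score. It stands in
    for the combined score equation, which is not reproduced in this sketch.
    """
    scores = {g: combined_score(total_p(p_per[g]), total_p(p_reg[g])) for g in p_per}
    ordered = sorted(scores, key=scores.get)  # smallest score first
    return {g: i + 1 for i, g in enumerate(ordered)}  # best rank is 1
```

Passing `lambda per, reg: per * reg` ranks genes by the unpenalized P-total only; it is a placeholder for trying out the function, not the Cyclebase score.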

P-value for Periodicity - P(Per)

The P-value for periodicity is the chance of observing a periodicity at least as strong after randomly shuffling the individual time-point values of the expression profile. A small P(per) value therefore implies a highly periodic pattern of expression.

In order to calculate the P-value for periodicity, a Fourier score was obtained for each profile. This Fourier score is defined as:

[Equation image not reproduced: Fourier score]

Next, 1,000,000 artificial profiles were generated from random shuffling of the data within the original profile. The fraction of random profiles whose Fourier scores were greater than or equal to the gene's real Fourier score was then normalized to create the final P-value for periodicity.
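The shuffling test can be sketched as follows. Two things here are assumptions rather than the documented method: the Fourier score is taken to be the magnitude of the profile's Fourier component at one cycle per cell cycle, and the function returns the raw fraction of shuffled profiles without the final normalization step, which is not specified above. The default number of shuffles is also much smaller than 1,000,000 to keep the example fast.

```python
import math
import random

def fourier_score(values, cycle_fraction):
    """Magnitude of the Fourier component at one cycle per cell cycle (assumed definition).

    cycle_fraction: time points as fractions of a cell cycle (1.0 = one full cycle).
    """
    s = sum(v * math.sin(2 * math.pi * t) for v, t in zip(values, cycle_fraction))
    c = sum(v * math.cos(2 * math.pi * t) for v, t in zip(values, cycle_fraction))
    return math.hypot(s, c)

def p_periodicity(values, cycle_fraction, n_shuffles=10_000, seed=0):
    """Fraction of shuffled profiles scoring at least as high as the real profile."""
    rng = random.Random(seed)
    observed = fourier_score(values, cycle_fraction)
    shuffled = list(values)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # permute values among the fixed time points
        if fourier_score(shuffled, cycle_fraction) >= observed:
            hits += 1
    return hits / n_shuffles
```

With a finite number of shuffles the smallest non-zero value this can return is `1 / n_shuffles`, which is one reason a large number of random profiles is needed to resolve very small P-values.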

The total P(per) value for a single gene across all available experiments is computed by multiplying all of the P(per)-values for each experiment.

P-value for Regulation - P(Reg)

The P-value for regulation estimates the chance that a magnitude of regulation at least as large would occur at random. A small P(reg) value therefore implies a strongly regulated gene.

In order to calculate the P-value for regulation, the standard deviation of the log-ratio profile was obtained. Next, 1,000,000 random profiles were generated from the global distribution (the entire experiment). The fraction of random profiles whose standard deviation was greater than or equal to the gene's standard deviation was calculated. This fraction was then normalized to create the final P-value for regulation.
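A minimal sketch of this comparison is given below. It assumes that a random profile is built by drawing values with replacement from all values in the experiment, and it returns the raw fraction without the final normalization step; both points, like the function name, are assumptions for the example.

```python
import random
import statistics

def p_regulation(profile, experiment_values, n_random=10_000, seed=0):
    """Fraction of random profiles at least as variable as the real one.

    profile: one gene's log-ratio values.
    experiment_values: all values of the experiment, pooled into one list.
    """
    rng = random.Random(seed)
    observed = statistics.pstdev(profile)
    hits = 0
    for _ in range(n_random):
        # Assumption: sample with replacement from the global distribution.
        random_profile = rng.choices(experiment_values, k=len(profile))
        if statistics.pstdev(random_profile) >= observed:
            hits += 1
    return hits / n_random
```

A completely flat profile has a standard deviation of zero, so every random profile matches or exceeds it and the function returns 1.0.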

The total P(reg) value for a single gene across all available experiments is computed by multiplying all of the P(reg)-values for each experiment.

Peaktime

The peaktime describes when in the cell cycle a gene is maximally expressed. Peaktime is calculated as a percentage, with both 0 and 100 representing the M/G1 transition in the cell cycle. These percentages are displayed as discrete phases or transitions of the cell cycle.

Computing a peaktime for a single expression profile first requires that a sine wave be fitted to the profile. The algorithm scans through all possible offsets and selects the sine wave that has the best correlation with the observed expression profile. The peaktime is then taken as the position of the peak of this sine wave.
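The offset scan can be sketched as follows. The sketch assumes a wave with a period of exactly one cell cycle, a scan in steps of one percent, and Pearson correlation as the measure of fit; none of these details are stated above, and the function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def peaktime(values, percent, step=1.0):
    """Return (offset, correlation) for the best-fitting wave.

    percent: time points in percent of a cell cycle.
    """
    best_offset, best_r = 0.0, -2.0
    offset = 0.0
    while offset < 100.0:
        # A cosine shifted by `offset` has its peak exactly at `offset`.
        wave = [math.cos(2 * math.pi * (t - offset) / 100.0) for t in percent]
        r = pearson(values, wave)
        if r > best_r:
            best_offset, best_r = offset, r
        offset += step
    return best_offset, best_r
```

The correlation returned alongside the offset gives a rough idea of how well a single wave describes the profile. A constant profile has zero variance and makes the correlation undefined, which the sketch does not guard against.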

To compute a peaktime for a single gene across all available experiments, the time scale of each experiment was 'shifted' such that time is represented as a percentage of the cell cycle. In this scale, both 0 and 100 correspond to the M/G1 transition. Because experiments with weakly periodic profiles produce unreliable peaktimes, the combined peaktime is weighted so that such experiments contribute less.
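Because 0 and 100 denote the same point in the cycle, peaktimes cannot be averaged as ordinary numbers: a plain mean of 95 and 5 would give 50 rather than 0. One way to combine them is a weighted circular mean, sketched below. The weighting scheme actually used is not described here, so the weights are left to the caller, and the function name and the `agreement` value are assumptions made for illustration.

```python
import math

def combined_peaktime(peaktimes, weights):
    """Weighted circular mean of per-experiment peaktimes (percent of a cell cycle).

    Returns (mean_percent, agreement), where agreement is 1 when all
    experiments coincide and approaches 0 when they cancel each other out.
    """
    s = sum(w * math.sin(2 * math.pi * p / 100.0) for p, w in zip(peaktimes, weights))
    c = sum(w * math.cos(2 * math.pi * p / 100.0) for p, w in zip(peaktimes, weights))
    mean_percent = (math.atan2(s, c) * 100.0 / (2 * math.pi)) % 100.0
    agreement = math.hypot(s, c) / sum(weights)
    return mean_percent, agreement
```

Two equally weighted experiments peaking at 95 and 5 combine to a peaktime at the M/G1 transition, while experiments peaking at 0 and 50 give an agreement near zero, a situation in which no meaningful combined peaktime exists.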

Peaktime Uncertainty

In certain cases the peaktime will be marked as uncertain. There are several reasons this uncertainty can occur:

  • The experiment(s) are not sufficiently periodic for a peaktime to be determined. Remember, the peaktime is only a meaningful measure for those genes that are periodic.
  • When more than one experiment is being analyzed, the experiments may in some cases disagree with respect to the time of peak expression. In such cases, inconsistency across the different experiments makes it impossible to calculate a reliable peaktime.

Gene Feature Annotations

The Gene Feature annotations that Cyclebase displays include degradation signals, overexpression phenotypes, CDK substrates, and siRNA knockdown phenotypes. Each annotation was manually curated from different sources. References describing the experimental and computational sources are listed below:

Degradation Signals
Jensen, L. J., Jensen, T. S., de Lichtenberg, U., Brunak, S., and Bork, P. (2006) Coevolution of transcriptional and posttranslational cell-cycle regulation. Nature, 443, 594-597. [pubmed]
Overexpression Phenotype
The overexpression annotations were taken from two different studies:
Niu, W., Li, Z., Zhan, W., Iyer, V.R., and Marcotte, E.M. (2008) Mechanisms of cell cycle control revealed by a systematic and quantitative overexpression screen in S. cerevisiae. PLoS Genetics, 4, e1000120. [pubmed]
and
Sopko, R., Huang, D., Preston, N., Chua, G., Papp, B., Kafadar, K., Snyder, M., Oliver, S.G., Cyert, M., Hughes, T.R., Boone, C., and Andrews, B. (2006) Mapping pathways and phenotypes by systematic gene overexpression. Molecular Cell, 21, 319-330. [pubmed]
Phosphorylated By
Annotations of Saccharomyces cerevisiae CDK substrates were curated from two sources:
Loog, M. and Morgan, D.O. (2005) Cyclin specificity in the phosphorylation of cyclin-dependent kinase substrates. Nature, 434, 104-108. [pubmed]
and
Ubersax, J.A., Woodbury, E.L., Quang, P.N., Paraz, M., Blethrow, J.D., Shah, K., Shokat, K.M., and Morgan, D.O. (2003) Targets of the cyclin-dependent kinase Cdk1. Nature, 425, 859-864. [pubmed]
Annotations of Homo sapiens CDK substrates were taken from:
Diella, F., Gould, C.M., Chica, C., Via, A., and Gibson, T.J. (2008) Phospho.ELM: a database of phosphorylation sites--update 2008. Nucleic Acids Research, 36(Database issue), D240-244. [pubmed]
siRNA Knockdown Phenotype
Mukherji, M., Bell, R., Supekova, L., Wang, Y., Orth, A.P., Batalov, S., Miraglia, L., Huesken, D., Lange, J., Martin, C., Sahasrabudhe, S., Reinhardt, M., Natt, F., Hall, J., Mickanin, C., Labow, M., Chanda, S.K., Cho, C.Y., and Schultz, P.G. (2006) Genome-wide functional analysis of human cell-cycle regulators. Proc Natl Acad Sci USA, 103, 14819-14824. [pubmed]

Further Reading

For more detailed information about the analysis methodology and results, please see:

  • de Lichtenberg, U., Jensen, L. J., Fausbøll, A., Jensen, T. S., Bork, P., and Brunak, S. (2005) Comparison of computational methods for the identification of cell cycle regulated genes. Bioinformatics, 21, 1164-1171. [pubmed]
  • Marguerat, S., Jensen, T. S., de Lichtenberg, U., Wilhelm, B. T., Jensen, L. J., and Bähler, J. (2006) The more the merrier: comparative analysis of microarray studies on cell cycle-regulated genes in fission yeast. Yeast, 23, 261-277. [pubmed]
  • Jensen, L. J., Jensen, T. S., de Lichtenberg, U., Brunak, S., and Bork, P. (2006) Coevolution of transcriptional and posttranslational cell-cycle regulation. Nature, 443, 594-597. [pubmed]