Cyclebase
 
 
 
 
Methods
 
 
 
Data Normalization
All data in Cyclebase are normalized in the same manner.

Expression values: The expression values were first log2 transformed. To center each profile at zero, its mean over all time points was subtracted. To make the y-axes comparable between experiments, the values were then scaled so that the standard deviation across the entire experiment is one.

Time-scale: The original time-scale, in minutes from the start of the experiment, was first normalized by the interdivision time (the time it takes to complete one cell cycle). This gives a time-scale in percent of a cell cycle. Since different experimental methods release cells from different points in the cycle, each experiment was then shifted so that zero always corresponds to the time of cytokinesis (M/G1 transition).
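The two normalization steps can be sketched as follows. This is a minimal illustration, not the Cyclebase code: the function names, the `cytokinesis_minute` parameter, and the use of the population standard deviation are assumptions made for the example.

```python
import math
from statistics import mean, pstdev


def normalize_expression(profiles):
    """profiles: dict mapping gene -> list of positive expression ratios
    from one experiment. Returns dict gene -> normalized log2 profile."""
    centered = {}
    for gene, values in profiles.items():
        logged = [math.log2(v) for v in values]      # log2 transform
        m = mean(logged)
        centered[gene] = [x - m for x in logged]     # center profile at zero
    # One scale factor for the whole experiment (assumption: population SD).
    experiment_sd = pstdev([x for prof in centered.values() for x in prof])
    return {g: [x / experiment_sd for x in prof] for g, prof in centered.items()}


def normalize_time(minutes, interdivision_time, cytokinesis_minute):
    """Convert minutes to percent of a cell cycle, with zero at cytokinesis.
    cytokinesis_minute is a hypothetical name for the experiment's shift."""
    return [100.0 * (t - cytokinesis_minute) / interdivision_time for t in minutes]
```

Note that the scale factor is shared by all genes in an experiment, so strongly regulated genes keep a larger amplitude than weakly regulated ones after normalization.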
 
 
 
Rank
The rank orders the genes of an organism by a score that we assign based on the pattern of expression and the magnitude of regulation. Genes that are both highly periodic and strongly regulated receive the best ranks (lowest numbers).

The rank combines the P-value for regulation and the P-value for periodicity. First, a P-total value is calculated by multiplying P(per) by P(reg). Because this product can unfairly favor a gene for which only one of the two values is small, the combined score penalizes genes that are not both regulated and periodic:

The genes are then sorted by this combined score and ranked in that order.

To calculate the total rank for a single gene across all available experiments, the total P-value for regulation and the total P-value for periodicity are used in the combined score equation instead of the single-experiment P-values.
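The ranking step can be illustrated with a short sketch. The exact penalty term is not given in this text; the form below, including the `cutoff` of 0.001, is an assumption modeled on the combined score described by de Lichtenberg et al. (2005) and should be checked against that paper. The function names are hypothetical.

```python
def combined_score(p_reg, p_per, cutoff=0.001):
    """Product of the two P-values, inflated when either one is large.
    The penalty form and the cutoff are assumptions for illustration."""
    p_total = p_reg * p_per
    penalty = (1 + (p_reg / cutoff) ** 2) * (1 + (p_per / cutoff) ** 2)
    return p_total * penalty


def rank_genes(p_values):
    """p_values: dict mapping gene -> (p_reg, p_per).
    Returns dict gene -> rank, where 1 is the best rank."""
    ordered = sorted(p_values, key=lambda g: combined_score(*p_values[g]))
    return {gene: i + 1 for i, gene in enumerate(ordered)}
```

For example, a gene with P(reg) = 1e-6 and P(per) = 1e-6 has the same P-total as a gene with P(reg) = 1e-11 and P(per) = 0.1, but under this penalty the first gene receives the better rank because both of its values are small.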

 
 
 
P(per)
The P-value for periodicity is the probability of observing a periodicity at least as great after randomly shuffling the individual time-point values of the expression profile. A small P(per) value therefore implies a highly periodic pattern of expression.

In order to calculate the P-value for periodicity, a Fourier score was obtained for each profile. This Fourier score is defined as:

Next, 1,000,000 artificial profiles were generated by randomly shuffling the data within the original profile. The fraction of random profiles whose Fourier scores were greater than or equal to the gene's real Fourier score was then normalized to create the final P-value for periodicity.

The total P(per) value for a single gene across all available experiments is computed by multiplying together the P(per) values from the individual experiments.
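The shuffling procedure can be sketched as below. Several details are assumptions made for the example: the Fourier score is taken as the magnitude of the profile's projection onto a sine and a cosine with a period of one cell cycle (the form used by de Lichtenberg et al., 2005), the number of shuffles is reduced for speed, and a pseudocount of one stands in for the unspecified normalization so that the P-value is never exactly zero.

```python
import math
import random


def fourier_score(times, values, period=100.0):
    """Assumed Fourier score; times are in percent of a cell cycle."""
    omega = 2 * math.pi / period
    s = sum(math.sin(omega * t) * x for t, x in zip(times, values))
    c = sum(math.cos(omega * t) * x for t, x in zip(times, values))
    return math.sqrt(s * s + c * c)


def p_periodicity(times, values, n_shuffles=10000, seed=0):
    """Fraction of shuffled profiles scoring at least as high as the real one."""
    rng = random.Random(seed)
    observed = fourier_score(times, values)
    shuffled = list(values)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)                 # permute the time-point values
        if fourier_score(times, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_shuffles + 1)      # pseudocount is an assumption


def total_p(p_values):
    """Total P-value across experiments: the product of the individual values."""
    return math.prod(p_values)
```

With a finite number of shuffles the smallest reportable P-value is about 1 / n_shuffles, which is one reason a large number of artificial profiles is needed.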

 
 
 
P(reg)
The P-value for regulation estimates the probability that a magnitude of regulation at least as large would occur by chance. A small P(reg) value therefore implies a strongly regulated gene.

To calculate the P-value for regulation, the standard deviation of the log-ratio profile was obtained. Next, 1,000,000 random profiles were generated from the global distribution (the entire experiment). The fraction of random profiles whose standard deviations were greater than or equal to the gene's standard deviation was calculated. This fraction was then normalized to create the final P-value for regulation.

The total P(reg) value for a single gene across all available experiments is computed by multiplying together the P(reg) values from the individual experiments.
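A minimal sketch of the regulation test follows. It assumes that "generated from the global distribution" means drawing values with replacement from all values pooled over the experiment, uses far fewer random profiles than the 1,000,000 described above, and uses a pseudocount in place of the unspecified normalization. The function and parameter names are hypothetical.

```python
import random
from statistics import pstdev


def p_regulation(profile, experiment_values, n_random=10000, seed=0):
    """profile: one gene's log-ratio profile.
    experiment_values: all log-ratio values pooled over the experiment."""
    rng = random.Random(seed)
    observed = pstdev(profile)
    hits = 0
    for _ in range(n_random):
        # Assumption: a random profile is a draw with replacement from the pool.
        sample = rng.choices(experiment_values, k=len(profile))
        if pstdev(sample) >= observed:
            hits += 1
    return (hits + 1) / (n_random + 1)   # pseudocount is an assumption
```

A completely flat profile has a standard deviation of zero, so every random profile matches or exceeds it and the function returns 1.0.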

 
 
 
Peaktime
The peaktime describes when in the cell cycle a gene is maximally expressed. It is given as a percentage of the cell cycle, with both 0 and 100 representing the M/G1 transition.

Computing a peaktime for a single expression profile first requires fitting a sine wave to the profile. The algorithm scans through all possible offsets and selects the sine wave that has the best correlation with the observed expression profile. The peaktime is then taken as the position of the peak of this sine wave.
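The offset scan can be sketched as follows. This is an illustration under stated assumptions: time is already in percent of a cell cycle, the sine wave has a period of exactly one cycle, the offsets are scanned on a discrete grid (`n_offsets` is a hypothetical parameter), and the correlation is the Pearson correlation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def fit_peaktime(times, values, n_offsets=100):
    """Scan sine-wave offsets and return (peaktime, best correlation)."""
    best_offset, best_r = 0.0, -2.0
    for k in range(n_offsets):
        offset = 100.0 * k / n_offsets
        wave = [math.sin(2 * math.pi * (t - offset) / 100.0) for t in times]
        r = pearson(wave, values)
        if r > best_r:
            best_offset, best_r = offset, r
    # A sine wave peaks a quarter of a period after its upward zero crossing.
    return (best_offset + 25.0) % 100.0, best_r
```

The best correlation is returned alongside the peaktime because a low value indicates that the profile is poorly described by a sine wave, in which case the peaktime carries little meaning.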

To compute a peaktime for a single gene across all available experiments, the time scale of each experiment was 'shifted' so that time is expressed as a percentage of the cell cycle. On this scale, both 0 and 100 correspond to the M/G1 transition. Because experiments with weakly periodic profiles produce poor peaktimes, the experiments were weighted accordingly when computing the combined peaktime.
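Because 0 and 100 denote the same point in the cycle, peaktimes cannot simply be averaged arithmetically: peaktimes of 95 and 5 should combine to roughly 0, not 50. One way to handle this, shown here purely as an assumption about how such a combination could work, is a weighted circular mean. The weighting scheme actually used is not specified in this text, so the `weights` argument is hypothetical (for example, larger weights for more periodic experiments).

```python
import math


def combined_peaktime(peaktimes, weights):
    """Weighted circular mean of peaktimes given in percent of a cell cycle.
    Returns (peaktime, agreement), where agreement lies between 0 and 1."""
    x = sum(w * math.cos(2 * math.pi * p / 100.0) for p, w in zip(peaktimes, weights))
    y = sum(w * math.sin(2 * math.pi * p / 100.0) for p, w in zip(peaktimes, weights))
    angle = math.atan2(y, x)
    peak = (100.0 * angle / (2 * math.pi)) % 100.0
    # Length of the mean vector: 1 when all peaktimes coincide, near 0 when
    # the experiments point to opposite phases of the cycle.
    agreement = math.hypot(x, y) / sum(weights)
    return peak, agreement
```

In this sketch, a low agreement value signals that the experiments disagree on the time of peak expression, so the combined peaktime should not be trusted.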

 
Peaktime Uncertainty
In certain cases the peaktime will be marked as uncertain. There are several reasons this uncertainty can occur:
  • The experiment(s) are not sufficiently periodic for a peaktime to be determined. Remember, the peaktime is only a meaningful measure for those genes that are periodic.
  • When more than one experiment is being analyzed, the experiments may in some cases disagree with respect to the time of peak expression. In such cases, inconsistency across the different experiments makes it impossible to calculate a reliable peaktime.
 
 
 
Further Reading
For more detailed information about the analysis methodology and results, please see:
  • de Lichtenberg, U., Jensen, L. J., Fausbøll, A., Jensen, T. S., Bork, P., and Brunak, S. (2005) Comparison of computational methods for the identification of cell cycle regulated genes. Bioinformatics, 21, 1164-1171.

  • Marguerat, S., Jensen, T. S., de Lichtenberg, U., Wilhelm, B. T., Jensen, L. J., and Bähler, J. (2006) The more the merrier: comparative analysis of microarray studies on cell cycle-regulated genes in fission yeast. Yeast, 23, 261-277.

  • Jensen, L. J., Jensen, T. S., de Lichtenberg, U., Brunak, S., and Bork, P. (2006) Coevolution of transcriptional and posttranslational cell-cycle regulation. Nature, 443, 594-597.

 
 
©2007 Cyclebase.org