Example analysis


The bisulfite genomic sequencing technique has gained wide acceptance for generating DNA methylation maps at single-base resolution. The method is based on the selective deamination of cytosine to uracil (which is subsequently read as thymine after PCR), whereas 5-methylcytosine remains unchanged. Maps are created by comparing bisulfite-converted sequences with the untreated genomic sequence. Provided the sequences exist in electronic form, you may proceed with the...

Conversion of bisulfite-generated methylation data into "seq1" format:

  1. ALIGNMENT: Align the sequences either by hand or with your favorite alignment program. We use ClustalW for this purpose. The Download page provides a matrix that facilitates the alignment (please refer to the ClustalW manual for the use of external matrices).
  2. EDITING: Manual editing of the alignment will be necessary in most cases. Make sure to bring all sequences to equal length, e.g. by adding "-" to the ends. The sequences are expected to have no internal gaps. Unknown bases can be represented by "n". If you are not sure about the format, refer to the example file.
  3. SAVING: Save the sequences in FASTA format in a single file "filename". The unconverted mother sequence has to be the first sequence. An example of such concatenated FASTA sequences can be obtained from the Download page (example A). Create a new subdirectory (UNIX: "mkdir my_dir") and place the file there. Under UNIX, make this directory your working directory ("cd my_dir").
  4. CONVERSION (UNIX): Convert all the bisulfite-generated sequences into individual sequence files with a capital "C" in place of 5-methylcytosine. Type

    "convert_bisulfite.pl -f filename > filename.log".

    The ">" sign directs a conversion report into the file "filename.log". If you omit this part, the report is only displayed on the screen.

  5. CONVERSION (Mac): Start MacPerl and call the program "convert_bisulfite.pl". A file selector box will prompt you for the name of the input file. The output appears in the standard output window of MacPerl. Save this output to a file using the menu item "Save as..." and add the extension .log.
  6. The sequences are now saved in your current working directory. Each sequence exists in two forms: as a so-called seq1 file (all bases in lowercase letters except 5mC, which is represented by a capital C) and as a seq2 file, which contains a schematic representation of the methylation pattern. Please refer to our publication for the details.
  7. It is recommended to save all .seq1-files in appropriately named subdirectories (e.g. Seq1, Seq2).
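
The conversion rule behind the seq1 format can be illustrated with a minimal Python sketch. It is not the code of convert_bisulfite.pl. The function names (read_fasta, to_seq1), the example sequences and the way gaps are checked are all assumptions made for illustration. The sketch assumes that a C in a bisulfite read at a position where the mother sequence also has a C escaped deamination and is therefore written as a capital C.

```python
def read_fasta(text):
    """Parse concatenated FASTA text into a list of (name, sequence) pairs."""
    records = []
    name, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            name, chunks = line[1:].strip(), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


def to_seq1(reference, read):
    """Return the read in lowercase, with capital C for inferred 5mC.

    Assumption: a C in the read where the reference also has a C was
    protected from conversion and is called methylated.
    """
    if len(reference) != len(read):
        raise ValueError("sequences must be aligned to equal length")
    if "-" in read.strip("-"):
        raise ValueError("internal gaps are not expected")
    out = []
    for ref_base, base in zip(reference.lower(), read.lower()):
        if ref_base == "c" and base == "c":
            out.append("C")
        else:
            out.append(base)
    return "".join(out)


# Hypothetical input: the unconverted mother sequence comes first.
example = """>mother
ACGTTCGA
>clone1
ATGTTCGA
>clone2
ACGTTTG-
"""
records = read_fasta(example)
reference = records[0][1]
for name, seq in records[1:]:
    print(name, to_seq1(reference, seq))
```

For the made-up input above, clone1 becomes "atgttCga" and clone2 becomes "aCgtttg-": only cytosines that are retained at reference cytosine positions are capitalized, and the trailing "-" used for padding is passed through unchanged.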

Analysis of data in seq1-format:

All analysis programs follow the same scheme: the .seq1 files are expected to be in the subdirectory you specify with the parameter -d. Results are written to separate files in this subdirectory, optionally named with the parameter -n. If you omit the parameter -n, the files are named after the subdirectory and given an appropriate extension. A detailed description of all programs is about to be published and will be placed on a separate page after publication. As an example, the generation of a graphical output of the methylation patterns is described here:

  1. Make sure you are in your working directory and that the seq1 files are located in the subdirectory Seq1 and carry the extension .seq1. All other files will be ignored.
  2. UNIX: Type "Plot_CpGs_to_png.pl -d Seq1 -n CpG_plot_1".
  3. Mac: Start MacPerl and call the program "Plot_CpGs_to_png.pl". A file selector box will ask you for the folder with the seq1 files and the name of the output file. Choose "Seq1" as input folder and "CpG_plot_1" as output file.
  4. The graphic is written to a file "CpG_plot_1.png" in PNG format. The extension .png is added automatically.
  5. View the graphics with your favorite picture viewer or simply open it in a WWW browser.
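
The file-handling conventions described above (only .seq1 files are read, and the output is named after the subdirectory unless a name is given) can be sketched in Python as follows. This is an illustration only, not the logic of Plot_CpGs_to_png.pl, and it produces no PNG. The function names are hypothetical, and the sketch assumes that each .seq1 file holds a single sequence as plain text, which is not specified on this page.

```python
import os
import tempfile


def list_seq1_files(directory):
    """Return the sorted .seq1 files in a directory; other files are ignored."""
    return sorted(n for n in os.listdir(directory) if n.endswith(".seq1"))


def output_name(directory, name=None, extension=".png"):
    """Use the given name, or fall back to the subdirectory name."""
    base = name if name else os.path.basename(os.path.normpath(directory))
    return base + extension


def methylated_cpg_positions(seq1):
    """Return 1-based positions of a capital C followed by g (methylated CpG)."""
    return [
        i + 1
        for i in range(len(seq1) - 1)
        if seq1[i] == "C" and seq1[i + 1].lower() == "g"
    ]


# Build a throwaway Seq1 directory with made-up content for the demonstration.
with tempfile.TemporaryDirectory() as work:
    seq_dir = os.path.join(work, "Seq1")
    os.mkdir(seq_dir)
    demo = {
        "clone1.seq1": "atgttCga",
        "clone2.seq1": "aCgtttga",
        "notes.txt": "ignored",
    }
    for filename, content in demo.items():
        with open(os.path.join(seq_dir, filename), "w") as handle:
            handle.write(content + "\n")

    found = list_seq1_files(seq_dir)
    summary = {}
    for filename in found:
        with open(os.path.join(seq_dir, filename)) as handle:
            summary[filename] = methylated_cpg_positions(handle.read().strip())
    default_name = output_name(seq_dir)
    explicit_name = output_name(seq_dir, "CpG_plot_1")

print(found)
print(summary)
print(default_name, explicit_name)
```

Unmethylated CpG sites appear as "tg" in a converted read, so they cannot be located from a seq1 string alone. That is why the sketch reports only the methylated positions.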

Automation:

If you have a number of methylation analyses to perform, you will appreciate that the whole analysis process can be automated. An example script that consecutively runs all the analysis programs and transfers the results into different subdirectories can be downloaded from our web page (autometh.sh). Under MacOS, AppleScript can be used.
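
As an illustration of the idea, a batch run over several data sets might be driven by a short Python sketch like the one below. It does not reproduce autometh.sh, whose contents are not shown here. The only invocation it uses is the one documented above ("Plot_CpGs_to_png.pl -d ... -n ..."). The job list and function names are hypothetical, and the sketch defaults to a dry run that only prints the commands.

```python
import subprocess


def build_plot_command(seq_dir, output_name):
    """Assemble the documented call: Plot_CpGs_to_png.pl -d <dir> -n <name>."""
    return ["Plot_CpGs_to_png.pl", "-d", seq_dir, "-n", output_name]


def run_all(jobs, dry_run=True):
    """Run (or only print) one plot command per (directory, name) job."""
    commands = [build_plot_command(d, n) for d, n in jobs]
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            # Requires the Perl script to be executable and on the PATH.
            subprocess.run(command, check=True)
    return commands


# Hypothetical data sets, each a subdirectory containing .seq1 files.
jobs = [("Seq1", "CpG_plot_1"), ("Seq2", "CpG_plot_2")]
commands = run_all(jobs, dry_run=True)
```

Calling run_all(jobs, dry_run=False) would execute the commands one after another and stop at the first failure, because check=True raises an exception on a non-zero exit status.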