GeneMerge Documentation
Publications
The following documents describe the rationale and statistical methods behind GeneMerge:
Stand-alone GeneMerge Execution
GeneMerge is called as follows:
./GeneMerge.pl gene-association.file description.file population.file study.file output.filename
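If you want to launch GeneMerge from a script rather than typing the command by hand, the call above can be sketched in Python as follows. This is a minimal sketch, not part of GeneMerge itself: the helper name build_genemerge_command is hypothetical, and it assumes GeneMerge.pl is executable in the current directory. The only detail taken from the documentation is the order of the five arguments.

```python
import subprocess


def build_genemerge_command(association, description, population, study, output,
                            script="./GeneMerge.pl"):
    """Return the argument list in the order shown in the usage line.

    Hypothetical helper: only the argument order comes from the documentation.
    """
    return [script, association, description, population, study, output]


cmd = build_genemerge_command(
    "gene-association.file",
    "description.file",
    "population.file",
    "study.file",
    "output.filename",
)

# Uncomment to actually run GeneMerge (assumes GeneMerge.pl is present and executable):
# subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.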
How to make your own Gene Association Files
Structured text files for use with GeneMerge are available for download and a description of these files can be found here. However, it's easy to make your own gene association files for use with GeneMerge. Just use any text editor to make two text files with the following formats:
Gene Association file
Description file

Here's an example of a Gene Association file for Drosophila melanogaster
The FBgn numbers are FlyBase gene identifiers and the GO:XXXXXXX terms are Gene Ontology IDs for specific functions. The white space between the gene name and its IDs is a single tab. Each ID is followed by a semi-colon, so when more than one ID is associated with a gene the IDs are separated by semi-colons and the line still ends in a semi-colon.
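The line format described above can also be produced programmatically. The following is a minimal Python sketch under stated assumptions: the function names are hypothetical, and the gene name and GO IDs in the example are placeholders chosen to show the layout, not real annotations.

```python
def format_association_line(gene, ids):
    """Build one gene-association line: gene name, a single tab,
    then every ID followed by a semi-colon."""
    return gene + "\t" + "".join(f"{term};" for term in ids)


def write_association_file(path, associations):
    """Write a mapping {gene: [ids]} as a gene-association file.

    Hypothetical helper; one gene per line.
    """
    with open(path, "w", newline="\n") as handle:
        for gene, ids in associations.items():
            handle.write(format_association_line(gene, ids) + "\n")


# Placeholder data for illustration only (not real annotations).
example_line = format_association_line("FBgn0000001", ["GO:0000001", "GO:0000002"])
print(repr(example_line))
```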
The ID terms here are Gene Ontology IDs for specific functions. The human-readable functional descriptions follow after a single tab. Note these lines do not have to end in semi-colons.

You can use a text editor and spreadsheet program to make these files. The following are typical steps you can follow to create gene-association and description files using Word and Excel on a Mac:
2. Open it in Excel
3. Organize the data so that there are two columns, one with gene names, the adjacent column with IDs
4. Copy and paste the two columns into Word using Paste Special --> "unformatted text"
5. Do a search and replace for the line ending to add semi-colons. Replace ^p with ;^p.
6. Save file as "text"
Description files can be made along the same lines, just skip step 5.
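For readers who prefer a script to the Word and Excel route, a description file can be written along these lines. This is a sketch only: the function names are hypothetical and the ID and description in the example are placeholders. It follows the format stated above, an ID, a single tab, then the description, with no trailing semi-colon.

```python
def format_description_line(term_id, description):
    """Build one description line: ID, a single tab, then the description.
    No trailing semi-colon is added."""
    return f"{term_id}\t{description}"


def write_description_file(path, descriptions):
    """Write a mapping {id: description} as a description file.

    Hypothetical helper; one ID per line.
    """
    with open(path, "w", newline="\n") as handle:
        for term_id, description in descriptions.items():
            handle.write(format_description_line(term_id, description) + "\n")


# Placeholder data for illustration only.
example_description = format_description_line("GO:0000001", "example function")
print(repr(example_description))
```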
If there are no IDs for your genomic data just make them up in Excel.
A list of numbers works just fine; just make sure that each function/category gets a unique ID.
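Making up IDs can also be done in a few lines of code instead of Excel. The sketch below assigns consecutive numbers to category names in order of first appearance, so each category gets exactly one unique ID. The function name and the category names are hypothetical, chosen only for illustration.

```python
def assign_ids(categories):
    """Give each distinct category a unique numeric ID (as a string),
    numbered from 1 in order of first appearance."""
    ids = {}
    for category in categories:
        if category not in ids:
            ids[category] = str(len(ids) + 1)
    return ids


# Placeholder categories; the repeated entry reuses its existing ID.
category_ids = assign_ids(["kinase", "transporter", "kinase"])
print(category_ids)
```

The resulting mapping can be used for both files: the IDs go after each gene in the gene-association file, and each ID is paired with its category name in the description file.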
Understanding the output
Output is a tab-delimited text file that can be opened in a spreadsheet program like Excel either by cutting and pasting from a text editor or importing "as tab delimited." The output file lists each gene-association term found in the study set along with its English description, frequency in the population set, frequency in the study set, and statistical enrichment score, both uncorrected and corrected. Below is a breakdown of each column header.
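Because the output is plain tab-delimited text, it can also be read without a spreadsheet. The following Python sketch splits each line on tabs and returns the rows as lists of strings. It makes no assumption about the column names or any preamble lines in the file; the function name is hypothetical and output.filename stands for whatever output name you passed to GeneMerge.

```python
import csv


def read_tab_delimited(path):
    """Read a tab-delimited text file into a list of rows,
    skipping completely empty lines."""
    with open(path, newline="") as handle:
        return [row for row in csv.reader(handle, delimiter="\t") if row]


# Example usage (assumes the file exists):
# rows = read_tab_delimited("output.filename")
# for row in rows:
#     print(row)
```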
GeneMerge - Post-genomic analysis, data mining, and hypothesis testing
Cristian I. Castillo-Davis