COMPTABLE

Table of Contents
FUNCTION
DESCRIPTION
EXAMPLE
OUTPUT
INPUT FILES
RELATED PROGRAMS
CONSIDERATIONS
COMMAND-LINE SUMMARY
LOCAL DATA FILES
PARAMETER REFERENCE

FUNCTION

CompTable creates a scoring matrix from the equivalences defined in a simplification scheme, such as the one used by Simplify. (See Chapter 4, Using Data Files, in the User's Guide for more information.)

DESCRIPTION

Scientists comparing protein sequences sometimes want to consider similar amino acids as equivalent. Sequence simplification can be done either by changing the symbols in the sequences being compared (see Simplify) or, for programs that use scoring matrices, by creating a table that scores matches between the symbols you consider to be equivalent.
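
The two approaches can be compared with a short Python sketch. It is an illustration only, not CompTable's or Simplify's actual code: the function names are hypothetical, the groups are copied from the standard simplify.txt shown under INPUT FILES, and the sequences are assumed to be ungapped and of equal length. Symbols that belong to no group are assumed to stay unchanged.

```python
# Equivalence groups copied from the standard simplify.txt (replacement -> members).
GROUPS = {"A": "PAGST", "D": "QNEDBZ", "H": "HKR",
          "I": "LIVM", "F": "FYW", "C": "C"}

# Map every grouped symbol to the replacement symbol of its group.
REPRESENTATIVE = {sym: rep for rep, members in GROUPS.items() for sym in members}


def simplify(seq):
    """Approach 1: rewrite the sequence itself.

    Assumption: symbols that are in no group are left unchanged.
    """
    return "".join(REPRESENTATIVE.get(s, s) for s in seq.upper())


def table_score(a, b, match=10, mismatch=0):
    """Approach 2: leave the sequences alone and score equivalent symbols as matches."""
    same_group = REPRESENTATIVE.get(a, a) == REPRESENTATIVE.get(b, b)
    return match if same_group else mismatch


def compare_simplified(x, y, match=10, mismatch=0):
    """Score two sequences position by position after simplifying both."""
    pairs = zip(simplify(x), simplify(y))
    return sum(match if a == b else mismatch for a, b in pairs)


def compare_with_table(x, y, match=10, mismatch=0):
    """Score the original sequences position by position with the equivalence table."""
    pairs = zip(x.upper(), y.upper())
    return sum(table_score(a, b, match, mismatch) for a, b in pairs)
```

Under these assumptions both functions give the same score for any pair of sequences, which is why a scoring matrix can stand in for rewriting the sequences.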

EXAMPLE

Here is a session using CompTable to make a scoring matrix with the standard simplification file used by Simplify (you can use Fetch to make a copy of simplify.txt and modify it to create the input file for CompTable):


% comptable

 COMPTABLE from what simplification file ?  simplify.txt

 What is the comparison match value (* 10 *) ?

 What is the comparison mismatch value (* -2 *) ?  0

 Are you creating a protein scoring matrix (* Yes *) ?

 What should I call the output file (* simplify.cmp *) ?

%

OUTPUT

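Here is part of the output scoring matrix file. Symbols in the same equivalence group score the match value (10); all other pairs shown score the mismatch value (0) entered in the session above: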


!!AA_SCORING_MATRIX_RECT 1.0
 COMPTABLE of: simplify.txt  FileCheck: 327

A standard simplification used by SIMPLIFY and WORDSEARCH to simplify
peptide sequences.  The first line below means "for all of the P, A, G,
S, or T characters in the sequence, substitute A." The program COMPTABLE
can construct a symbol comparison table with the equivalences from this
file.

                     August 18, 1998 12:19   ..

{
GAP_CREATE 20
GAP_EXTEND 1
}

      A    B    C    D    E     F    G    H    I    J     K    L  ...  ..
A    10    0    0    0    0     0   10    0    0    0     0    0  ...
B     0   10    0   10   10     0    0    0    0    0     0    0  ...
C     0    0   10    0    0     0    0    0    0    0     0    0  ...
D     0   10    0   10   10     0    0    0    0    0     0    0  ...
E     0   10    0   10   10     0    0    0    0    0     0    0  ...

See Appendix VII for more information about scoring matrices.
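
The rows shown above can be reproduced with a minimal Python sketch. This is not CompTable's implementation: `build_matrix` and `format_rows` are hypothetical names, the column spacing only approximates the file layout, and the score of an ungrouped symbol (such as J) against itself is an assumption, because that part of the output is not shown.

```python
import string

# Equivalence groups copied from the standard simplify.txt (replacement -> members).
GROUPS = {"A": "PAGST", "D": "QNEDBZ", "H": "HKR",
          "I": "LIVM", "F": "FYW", "C": "C"}


def build_matrix(groups, match=10, mismatch=0, symbols=string.ascii_uppercase):
    """Return matrix[a][b]: match if a and b are equivalent, otherwise mismatch.

    Assumption: a symbol that is in no group is equivalent only to itself.
    """
    rep = {s: r for r, members in groups.items() for s in members}
    return {
        a: {b: match if rep.get(a, a) == rep.get(b, b) else mismatch
            for b in symbols}
        for a in symbols
    }


def format_rows(matrix, rows, cols):
    """Lay out selected rows and columns roughly like the file excerpt."""
    header = " " + "".join(f"{c:>5}" for c in cols)
    body = [r + "".join(f"{matrix[r][c]:>5}" for c in cols) for r in rows]
    return "\n".join([header] + body)


matrix = build_matrix(GROUPS)  # match 10, mismatch 0, as in the example session
print(format_rows(matrix, "ABCDE", "ABCDEFGHIJKL"))
```

Rows B, D, and E come out identical because all three symbols belong to the group replaced by D.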

INPUT FILES

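CompTable accepts a simplification table file as input. Each line after the dividing line (..) gives a replacement symbol followed by the symbols that are treated as equivalent to it. Here is the input file for the example above: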


!!SIMPLIFY 1.0
A standard simplification used by SIMPLIFY and WORDSEARCH to simplify
peptide sequences.  The first line below means "for all of the P, A, G,
S, or T characters in the sequence, substitute A." The program COMPTABLE
can construct a symbol comparison table with the equivalences from this
file.

10/7/84 ..

A PAGST
D QNEDBZ
H HKR
I LIVM
F FYW
C C
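
A file in this layout could be read with a sketch like the following. It assumes, based only on the example above, that everything up to and including the first line ending in `..` is documentation and that each later non-blank line holds a replacement symbol and its members separated by whitespace. `parse_simplification` is a hypothetical helper, not part of the Wisconsin Package.

```python
def parse_simplification(text):
    """Return {replacement_symbol: member_symbols} from simplification-table text."""
    lines = text.splitlines()
    # Assumption: the first line ending in ".." divides documentation from data.
    divider = next(
        (i for i, line in enumerate(lines) if line.rstrip().endswith("..")),
        None,
    )
    if divider is None:
        raise ValueError("no '..' dividing line found")
    groups = {}
    for line in lines[divider + 1:]:
        fields = line.split()
        if len(fields) >= 2:  # skip blank or incomplete lines
            groups[fields[0].upper()] = fields[1].upper()
    return groups


SAMPLE = """!!SIMPLIFY 1.0
A standard simplification used by SIMPLIFY and WORDSEARCH to simplify
peptide sequences.

10/7/84 ..

A PAGST
D QNEDBZ
H HKR
I LIVM
F FYW
C C
"""

groups = parse_simplification(SAMPLE)
```

The resulting dictionary maps each replacement symbol to the string of symbols it stands for, for example `groups["I"]` is `"LIVM"`.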

RELATED PROGRAMS

Simplify rewrites a sequence file using the equivalences defined in a simplification table.

CONSIDERATIONS

CompTable calculates default gap creation and gap extension penalties and writes them in the auxiliary data block of the output scoring matrix file. The defaults are chosen to suit the type of scoring matrix you are creating (protein or nucleotide) and the comparison match and mismatch values you specify. If you don't want to accept the default values, use -GAPweight and -LENgthweight to specify alternative gap penalties.

COMMAND-LINE SUMMARY

All parameters for this program may be added to the command line. Use -CHEck to view the summary below and to specify parameters before the program executes. In the summary below, the capitalized letters in the parameter names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.


COMPTABLE does not support complete command-line control.

Required Parameters:

-PROtein or -NUCleotide  specifies the type of the scoring matrix

Local Data Files:        None

Optional Parameters:

-GAPweight=50            sets the gap creation penalty
-LENgthweight=3          sets the gap extension penalty

LOCAL DATA FILES

None.

PARAMETER REFERENCE

You can set the parameters listed below from the command line. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.

-PROtein or -NUCleotide

specifies the type of scoring matrix that will be created.

-GAPweight

specifies the default gap creation penalty associated with the scoring matrix. This penalty is written in the auxiliary data block in the output scoring matrix file. If you don't specify a default gap creation penalty with -GAPweight, the program calculates a reasonable default and writes it in the auxiliary data block. (See "Auxiliary Data Block: Setting Gap Creation and Extension Penalties" in the "Scoring Matrices" section of Appendix VII for information about the auxiliary data block in scoring matrix files.)

-LENgthweight

specifies the default gap extension penalty associated with the scoring matrix. This penalty is written in the auxiliary data block in the output scoring matrix file. If you don't specify a default gap extension penalty with -LENgthweight, the program calculates a reasonable default and writes it in the auxiliary data block. (See "Auxiliary Data Block: Setting Gap Creation and Extension Penalties" in the "Scoring Matrices" section of Appendix VII for information about the auxiliary data block in scoring matrix files.)

Printed: December 9, 1998 16:29 (1162)


Documentation Comments: doc-comments@gcg.com
Technical Support: help@gcg.com

Copyright (c) 1982, 1983, 1985, 1986, 1987, 1989, 1991, 1994, 1995, 1996, 1997, 1998 Genetics Computer Group Inc., a wholly owned subsidiary of Oxford Molecular Group, Inc. All rights reserved.

Licenses and Trademarks: Wisconsin Package is a trademark of Genetics Computer Group, Inc. GCG and the GCG logo are registered trademarks of Genetics Computer Group, Inc.

All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.

Genetics Computer Group

www.gcg.com