PEPPLOT(+)

Table of Contents
FUNCTION
DESCRIPTION
EXAMPLE
OUTPUT
GARNIER OUTPUT FILE
CHOU AND FASMAN OUTPUT FILE
HYDROPHOBIC MOMENT OUTPUT FILE
INPUT FILES
RELATED PROGRAMS
RESTRICTIONS
CONSIDERATIONS
GRAPHICS
<CTRL>C
COLOR
COMMAND-LINE SUMMARY
ACKNOWLEDGEMENT
LOCAL DATA FILES
PARAMETER REFERENCE

FUNCTION

PepPlot plots measures of protein secondary structure and hydrophobicity in parallel panels of the same plot.

DESCRIPTION

PepPlot shows several common measures of protein secondary structure together on one coordinated plot. Most of the curves are the average, sum, or product of some residue-specific attribute within a window. In a few cases, the attribute is both specific to the residue and dependent on its position in the window. Throughout the plot, the blue curves are for beta-sheets and the red curves are for alpha-helices; black is used for turns and hydropathy. If your plotter does not have four colors, then dashed lines are for alpha-helix and solid lines are for beta-structures.

This document is only a description of what PepPlot does. You may want to read some of the articles cited below to help you interpret what the curves really mean.

There are ten different panels that can be plotted in any combination and in any order. In the descriptions below they are referred to from top to bottom as if you had plotted them all in the default order as in the example session and figure.

The Sequence

The first part of the plot shows the sequence itself. This panel is extremely crowded if you use a density of more than 100 residues per page.

The Residue Schematic

The second part of the plot shows a schematic representation of the sequence. Each residue is represented by a line at the position where it occurs in the sequence. The lengths and colors of the lines are used to indicate chemically similar groups of amino acids as follows.

Color        Category

Green        hydrophilic, charged
                 down = acidic
                 up   = basic

Red          hydrophilic, uncharged
                 short = amides
                 long  = alcohols

Blue         hydrophobic
                 short = aliphatic
                 long  = aromatic

Black        Proline

Unmarked     Alanine, Glycine, Cysteine

Chou and Fasman Beta-Sheet Forming and Breaking Residues

The third panel is a display of the residues that are beta-sheet forming and breaking as defined by Chou and Fasman (Adv. Enz. 47; 45-147 (1978)). To nucleate beta-structures, there should be at least three beta-forming residues and not more than one breaking residue within a window of five.

Chou and Fasman Alpha and Beta Propensities

The fourth panel of the plot shows the Chou and Fasman (1978 cited above) propensity measures for alpha-helix and beta-sheet. As each curve rises past the threshold for its color, it satisfies one criterion for propagation of an alpha-helix or beta-sheet structure. If the curves for alpha and beta propagation drop below the black threshold (at the 1.00 level) and if there is at least one breaking residue in four, then the structure may terminate. Both curves are the average of a residue-specific attribute over a window of four.
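
The window average behind these curves can be sketched in Python as follows. This is an illustration only, not PepPlot's code: the function name, the peptide, and the choice to start each window at the residue being scored are assumptions, and the four alpha propensities are quoted as illustrative values (PepPlot reads its own table from pepplot.dat).

```python
# Illustrative subset of Chou-Fasman alpha propensities (assumed values;
# the table PepPlot actually uses lives in pepplot.dat).
P_ALPHA = {"M": 1.45, "E": 1.51, "K": 1.16, "L": 1.21}


def window_average(sequence, propensity, window=4):
    """Return one average per full window, in sequence order.

    Assumption: window i covers residues i .. i+window-1.
    """
    values = [propensity[res] for res in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical six-residue peptide, chosen only for the demonstration.
    for start, avg in enumerate(window_average("MEEKLK", P_ALPHA), start=1):
        side = "above" if avg > 1.00 else "at or below"
        print(f"window starting at {start}: {avg:.2f} ({side} the 1.00 level)")
```

The same function works for the beta curve if it is given a table of beta propensities instead.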

Chou and Fasman Alpha-Helix Forming and Breaking Residues

The fifth panel shows the residues that are alpha-helix forming and breaking, as defined by Chou and Fasman (1978 cited above). For alpha-helices to nucleate, there should be four or more alpha-forming residues and not more than one breaking residue within six residues.
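
The nucleation rules quoted for the alpha-helix panel and the beta-sheet panel have the same shape: a minimum number of formers and a maximum number of breakers inside a fixed window. A minimal Python sketch of that test is shown here. The function name and the residue classes are assumptions for illustration; PepPlot takes its own forming and breaking assignments from its data file.

```python
def nucleation_sites(sequence, formers, breakers, window, min_formers,
                     max_breakers=1):
    """Return 1-based start positions of windows that satisfy the rule."""
    sites = []
    for i in range(len(sequence) - window + 1):
        segment = sequence[i:i + window]
        n_formers = sum(res in formers for res in segment)
        n_breakers = sum(res in breakers for res in segment)
        if n_formers >= min_formers and n_breakers <= max_breakers:
            sites.append(i + 1)
    return sites


# Hypothetical residue classes, used only to exercise the function.
ALPHA_FORMERS = set("EMALKFQWIV")
ALPHA_BREAKERS = set("PGYN")

if __name__ == "__main__":
    # Alpha rule from the text: four or more formers, at most one
    # breaker, within six residues.
    print(nucleation_sites("AEPLGKMAAG", ALPHA_FORMERS, ALPHA_BREAKERS,
                           window=6, min_formers=4))
```

For the beta rule, call the same function with `window=5` and `min_formers=3` and the beta residue classes.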

Chou and Fasman Amino Ends

The sixth panel shows regions of the sequence that resemble sequences typically found at the amino end of alpha-helices and beta-structures (Chou and Fasman, 1978 cited above). The curves plot the probabilities for a window of six that the first three residues in the window precede the end of the structure and the last three residues are within the structure. There are two different residue-specific attributes used, one for each half of the product.

Chou and Fasman Carboxyl Ends

The seventh panel shows regions of the sequence typically found at the carboxyl end of alpha-helices and beta-structures (Chou and Fasman, 1978 cited above). The two curves show the probability for a window of six that the first three residues in the window are within the structure and the last three residues are outside the structure. Two different residue-specific attributes are used, one for each half of the product.

Chou and Fasman Turns

The eighth panel shows regions of the sequence typically found in turns (Chou and Fasman, 1978 cited above). The curve is the product of a residue-specific, position-dependent attribute (probability) multiplied across a window of four. The calculated values are multiplied by 10,000 for plotting.
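
As a sketch of the product described above, the Python fragment below multiplies one position-dependent value per residue across a window of four and scales the result by 10,000. The function names and the probability tables are hypothetical; the real values come from PepPlot's data file.

```python
from math import prod


def turn_score(window, position_tables, scale=10_000):
    """Product of position-dependent values across one window, scaled."""
    if len(window) != len(position_tables):
        raise ValueError("window length must match the number of tables")
    return scale * prod(
        table[res] for res, table in zip(window, position_tables)
    )


def turn_curve(sequence, position_tables):
    """Score every full window along the sequence."""
    width = len(position_tables)
    return [
        turn_score(sequence[i:i + width], position_tables)
        for i in range(len(sequence) - width + 1)
    ]


if __name__ == "__main__":
    # Made-up tables: one dictionary per position in the four-residue turn.
    tables = [
        {"N": 0.16, "P": 0.10, "G": 0.10, "D": 0.15},
        {"N": 0.08, "P": 0.30, "G": 0.09, "D": 0.11},
        {"N": 0.19, "P": 0.03, "G": 0.19, "D": 0.18},
        {"N": 0.09, "P": 0.07, "G": 0.15, "D": 0.08},
    ]
    print(turn_curve("DPNGD", tables))
```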

Hydrophobic Moment

The ninth panel shows the helical hydrophobic moment at each position of the sequence. These curves rise when the molecule forms either an alpha-helix or a beta-sheet at the interface between the solvent and the interior of the molecule. Said another way, the moment statistic is the probability that the sequence at each position is amphiphilic, that is, it appears to have hydrophobic residues on one side and hydrophilic residues on the other. The hydrophobic moment is calculated as described by Eisenberg et al. (Proc. Natl. Acad. Sci. USA 81; 140-144 (1984)), except that we have normalized the hydrophobic moment for the local hydrophobicity of the amino acids in the window where the moment is being determined. This makes the method equivalent to that described by Finer-Moore and Stroud (Proc. Natl. Acad. Sci. USA, 81; 155-159 (1984)).

In a typical alpha-helix, each residue is oriented about 100 degrees from the preceding residue. The alpha moment that we plot in this panel is the maximum for all inter-residue angles between 95 and 105 degrees. The alpha moment curve is calculated for a window of eight residues.

Typical beta-strands have 160 degrees of rotation between adjacent residues. The beta hydrophobic moment curve is the maximum for all inter-residue angles between 159 and 161 degrees, calculated over a window of six residues.
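
The moment calculation can be sketched in Python as follows, using the unnormalized form of Eisenberg et al., in which the hydrophobicities in the window are summed as vectors rotated by a fixed angle per residue. The function names, the one-degree angle step, and the hydrophobicity values are assumptions made for illustration, and the normalization for local hydrophobicity that PepPlot applies is not reproduced here.

```python
import math


def hydrophobic_moment(values, angle_deg):
    """Unnormalized moment of one window at a fixed angle per residue."""
    delta = math.radians(angle_deg)
    s = sum(h * math.sin(n * delta) for n, h in enumerate(values))
    c = sum(h * math.cos(n * delta) for n, h in enumerate(values))
    return math.hypot(s, c)


def max_moment(values, angles):
    """Return (angle, moment) for the angle that gives the largest moment."""
    best = max(angles, key=lambda a: hydrophobic_moment(values, a))
    return best, hydrophobic_moment(values, best)


if __name__ == "__main__":
    # Hypothetical hydrophobicities for a window of eight residues.
    window = [1.9, -3.5, -3.5, -3.9, 3.8, -3.9, -3.9, -0.7]
    # Alpha: scan 95..105 degrees (the step size is an assumption).
    print(max_moment(window, range(95, 106)))
    # Beta: scan 159..161 degrees over a window of six.
    print(max_moment(window[:6], range(159, 162)))
```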

Kyte and Doolittle Hydropathy

The tenth panel has two curves based on the average hydrophobicity. The black curve is the Kyte and Doolittle hydropathy measure (J. Mol. Biol. 157; 105-132 (1982)). This curve is the average of a residue-specific hydrophobicity index over a window of nine residues. When the line is in the upper half of the frame, it indicates a hydrophobic region, and when it is in the lower half, a hydrophilic region. You can set the Kyte-Doolittle window to a number other than nine using -HWINdow=n.
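
A minimal Python sketch of this window average is given here, using the Kyte and Doolittle hydropathy index for the 20 standard residues. The function name and the convention of reporting each average at the center of its window are assumptions; the sketch does not handle the B, X, Z, and * symbols.

```python
# Kyte and Doolittle (1982) hydropathy index, 20 standard residues only.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_curve(sequence, window=9):
    """Return (center position, average) pairs, positions 1-based.

    Assumption: each average is reported at the middle of its window,
    so the curve stops half a window short of each end.
    """
    if window < 1 or window > len(sequence):
        return []
    values = [KYTE_DOOLITTLE[res] for res in sequence]
    half = window // 2
    return [
        (start + half + 1, sum(values[start:start + window]) / window)
        for start in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical peptide: a hydrophobic stretch flanked by charged residues.
    for position, average in hydropathy_curve("KKDEILVVLIAGLLKRDE"):
        region = "hydrophobic" if average > 0 else "hydrophilic"
        print(f"{position:3d} {average:6.2f} {region}")
```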

Goldman, Engelman, and Steitz Transbilayer Helices

The green curve in the tenth panel is the Goldman, Engelman, and Steitz (GES) curve for identifying nonpolar transbilayer helices (reviewed in Ann. Rev. Biophys. Biophys. Chem. 15; 321-353 (1986)). The curve is the average of a residue-specific hydrophobicity scale (the GES scale) over a window of 20 residues. When the line is in the upper half of the frame, it indicates a hydrophobic region and when it is in the lower half, a hydrophilic region. You can suppress the GES curve in this panel with -NOGES. You can set the GES window to a number other than 20 with -GESWindow=n.

Garnier Predictions Can Be Written Into a File

With -GARnier, PepPlot also calculates a secondary structure prediction using the method of Garnier et al. (J. Mol. Biol. 120; 97-120 (1978)) and writes it into a file.

EXAMPLE

Here is a session using PepPlot to plot the secondary structure measures for the first 100 residues of adenylate kinase (PIR:Kihua). This session with PepPlot also writes the Garnier predictions, Chou and Fasman values, and helical hydrophobic moment values to separate output files.


% pepplot -GARnier -CFFile -MOMentfile

  PEPPLOT of what protein sequence ?  PIR:Kihua

                      Begin (* 1 *) ?
                    End (*   194 *) ?  100

  The minimum density for a one-page plot is 87.0 residues/100 platen units.
  What density do you want (* 87.0 *) ?

 What Panels do you want to plot?

     a) Sequence
     b) Charged-polar-hydrophobic residue schematic
     c) Beta forming-breaking symbols
     d) Chou-Fasman Alpha-Beta prediction curves
     e) Alpha forming-breaking symbols
     f) Chou-Fasman NH2-end prediction curves
     g) Chou-Fasman CO2-end prediction curves
     h) Chou-Fasman Turn    prediction curve
     i) Helical Hydrophobic Moment for Alpha and Beta
     j) Hydropathy and Hydrophilicity

  Please choose one or more (* ABCDEFGHIJ *):

  When your LaserWriter attached to tty07 is ready, press <Return>.

%

OUTPUT

Here are parts of the text output files. If you are reading the Program Manual, you can see the plot from this session in the figure at the end of this program entry.

GARNIER OUTPUT FILE

Secondary structure prediction using the method of Garnier et al. (J. Mol. Biol. 120; 97-120 (1978)) is also performed by PepPlot when the program is run with -GARnier. The Garnier method calculates a statistic for alpha-helix, beta-sheet, turns, and random coil structures using position-dependent, residue-specific information within a window of 17. The structure predicted for the residue in the center of the window is the structure with the largest calculated statistic at that position.

Output File Structure

The results of the Garnier prediction are written into a file. The file shows different predictions for several different combinations of decision constants (see below). The predicted structure is represented with an A for alpha, B for beta, C for random coil, and T for turn. Question marks indicate that two or more structures are equally probable.
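
The choice of symbol at each position can be sketched in Python as shown here: the structure with the largest statistic wins, and a tie for first place is reported as a question mark. The function name and the numbers are hypothetical; computing the four statistics themselves requires the tables in garnier.dat and is not attempted.

```python
def call_structure(stats):
    """Pick the symbol with the largest statistic, or '?' on a tie.

    `stats` maps 'A' (alpha), 'B' (beta), 'C' (coil), and 'T' (turn)
    to the statistic calculated for one position.
    """
    best = max(stats.values())
    winners = [symbol for symbol, value in stats.items() if value == best]
    return winners[0] if len(winners) == 1 else "?"


if __name__ == "__main__":
    # Made-up statistics for two positions.
    print(call_structure({"A": 120, "B": 45, "C": -10, "T": 30}))
    print(call_structure({"A": 80, "B": 80, "C": 5, "T": -20}))
```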

Decision Constants From Physical Measurements

If you have physical data on the proportion of the protein's secondary structure that is alpha-helix and beta-strand, decision constants ("fudge factors") can be used to bias the Garnier predictions. Read predictions from the column of data whose percent alpha-helix and percent beta-strand correspond most closely to the physical measurements for the entire protein.

Decision Constants Without Physical Measurements

If you have no physical measurement of the percentage of alpha-helix and beta-strand in your protein, Garnier recommends using the percent alpha and beta with no decision constants (the No DC column of data in the file).

Here is part of the Garnier output file, kihua.gar:


PEPPLOT (Garnier prediction) of: Kihua check: 1665 from: 1 to: 100

                           October 12, 1998 10:53

P1;KIHUA - adenylate kinase (EC 2.7.4.3) 1 - human
N;Alternate names: myokinase
C;Species: Homo sapiens (man)
C;Date: 23-Oct-1981 #sequence_revision 23-Oct-1981 #text_change 05-Sep-1997
C;Accession: A33508; A00679
R;Matsuura, S.; Igarashi, M.; Tanizawa, Y.; Yamada, M.; Kishi, F.; Kajii, T.;
 Fujii, H.; Miwa, S.; Sakurai, M.; Nakazawa, A

 Structural composition for no decision constant:  alpha = 32.0%   beta = 21.0%

%Alpha  No DC    <20    <20    <20  20-50  20-50  20-50    >50    >50
% Beta  No DC    <20  20-50    >50    <20  20-50    >50    <20  20-50
   Pos ----------------------------------------------------------- ..
     9      A      B      B      B      A      A      A      A      A
    10      B      B      B      B      B      B      B      B      B
    11      B      B      B      B      B      B      B      B      B

/////////////////////////////////////////////////////////////////////

    90      B      B      B      B      B      B      B      B      B
    91      B      B      B      B      B      B      B      B      B
    92      B      B      B      B      B      B      B      B      B

CHOU AND FASMAN OUTPUT FILE

With -CFFile, PepPlot writes a file with the Chou and Fasman (1978, cited above) values for every position in the sequence written out as a table of numbers. Here is part of the Chou and Fasman output file, kihua.cho:


PEPPLOT (Chou/Fasman) of: Kihua check: 1665 from: 1 to: 100

                           October 12, 1998 10:53

P1;KIHUA - adenylate kinase (EC 2.7.4.3) 1 - human
N;Alternate names: myokinase
C;Species: Homo sapiens (man)
C;Date: 23-Oct-1981 #sequence_revision 23-Oct-1981 #text_change 05-Sep-1997
C;Accession: A33508; A00679
R;Matsuura, S.; Igarashi, M.; Tanizawa, Y.; Yamada, M.; Kishi, F.; Kajii, T.;
 Fujii, H.; Miwa, S.; Sakurai, M.; Nakazawa, A

           Alpha  Alpha   Beta   Beta      Alpha         Beta
  Pos Res   Stat   Ave    Stat    Ave   NH2    COOH   NH2    COOH   Turn  HPhob
    ------------------------------------------------------------------------ ..
    1   M   1.45   1.41   1.05   0.63   0.23   4.50   0.57   0.11   0.30  -1.96
    2   E   1.51   1.35   0.37   0.69   0.28   5.25   1.20   0.09   0.17  -1.67
    3   E   1.51   1.26   0.37   0.79   0.42   4.90   0.54   0.40   0.22  -0.78

///////////////////////////////////////////////////////////////////////////////

   98   E   1.51   1.20   0.37   1.07   1.32   1.28   0.15   0.70   0.10   0.00
   99   V   1.06   0.96   1.70   1.16   1.91   1.06   0.04   0.92   0.34   0.00
  100   Q   1.11   1.08   1.10   0.83   0.00   0.00   0.00   0.00   0.88   0.00
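
If you want to process a table like this one in a script, a parser along the following lines may be a useful starting point. It is a sketch under assumptions, not a GCG utility: the function and column names are invented here, and it assumes the numeric table begins after the divider line that ends in two periods, as in the sample above.

```python
# Column names are hypothetical labels for the ten numeric columns.
COLUMNS = [
    "alpha_stat", "alpha_ave", "beta_stat", "beta_ave",
    "alpha_nh2", "alpha_cooh", "beta_nh2", "beta_cooh",
    "turn", "hphob",
]


def parse_cho(text):
    """Return one dictionary per data row of a Chou and Fasman file."""
    rows = []
    in_table = False
    for line in text.splitlines():
        if not in_table:
            # Assumption: the divider line ending in ".." starts the table.
            in_table = line.rstrip().endswith("..")
            continue
        fields = line.split()
        if len(fields) != len(COLUMNS) + 2 or not fields[0].isdigit():
            continue  # skip blank lines and anything that is not a row
        try:
            numbers = [float(field) for field in fields[2:]]
        except ValueError:
            continue
        row = {"pos": int(fields[0]), "res": fields[1]}
        row.update(zip(COLUMNS, numbers))
        rows.append(row)
    return rows


if __name__ == "__main__":
    with open("kihua.cho") as handle:
        for row in parse_cho(handle.read()):
            if row["alpha_ave"] > row["beta_ave"]:
                print(row["pos"], row["res"], row["alpha_ave"])
```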

HYDROPHOBIC MOMENT OUTPUT FILE

With -MOMentfile, PepPlot writes a file with the helical hydrophobic moment values for every position in the sequence written out as a table of numbers. Here is part of the moment output file, kihua.mom:


PEPPLOT (Hydrophobic Moment) of: Kihua check: 1665 from: 1 to: 100

                           October 12, 1998 10:53

P1;KIHUA - adenylate kinase (EC 2.7.4.3) 1 - human
N;Alternate names: myokinase
C;Species: Homo sapiens (man)
C;Date: 23-Oct-1981 #sequence_revision 23-Oct-1981 #text_change 05-Sep-1997
C;Accession: A33508; A00679
R;Matsuura, S.; Igarashi, M.; Tanizawa, Y.; Yamada, M.; Kishi, F.; Kajii, T.;
 Fujii, H.; Miwa, S.; Sakurai, M.; Nakazawa, A

  Pos Res    Alpha    Alpha     Beta     Beta
             Angle    Value    Angle    Value
    ----------------------------------------- ..
    1   M     95.0     0.52    161.0     0.46
    2   E     95.0     0.32    161.0     0.29
    3   E     95.0     0.28    159.0     0.21

   //////////////////////////////////////////

   98   E    105.0     0.42    159.0     0.37
   99   V      0.0     0.00    161.0     0.24
  100   Q      0.0     0.00      0.0     0.00

INPUT FILES

PepPlot accepts a single protein sequence as input. If PepPlot rejects your protein sequence, see Appendix VI for information on how to change or set the type of a sequence.

RELATED PROGRAMS

PeptideStructure and PlotStructure were sent to us by Dr. Berthold Foertsch of the Max Planck Institute of Munich. Used together, these two programs let you see a graphics representation of the best choice Chou-Fasman or Garnier prediction with hydrophobicity or antigenic index superimposed.

Moment makes a contour plot of the helical hydrophobic moment for all rotation angles between 0 and 180 degrees per residue (Eisenberg, 1984, and Finer-Moore and Stroud, 1984, cited above). HelicalWheel plots a peptide sequence as a helical wheel to help you recognize amphiphilic regions.

RESTRICTIONS

The residue-specific attributes for all of the measurements in PepPlot are only defined for the standard alphabet of protein sequence characters, including B, X, Z, and * (see Appendix III). Sequences containing any other symbols, such as the gap symbols period (.) and tilde (~), are not suitable as input for PepPlot.

CONSIDERATIONS

You should realize that secondary structure predictions are not very reliable, especially for proteins that are not soluble or globular.

Plots with more than about 250 residues per 100 platen units may be too compressed to be useful for structure prediction, although they may be useful for comparing two protein sequences for structurally similar regions. When multiple-page plots are made, successive pages are overlapped by one residue so that the plots can be spliced together. The curves stop one-half window width from the ends of the sequence.

GRAPHICS

The Wisconsin Package must be configured for graphics before you run any program with graphics output! If the % setplot command is available in your installation, this is the easiest way to establish your graphics configuration, but you can also use commands like % postscript that correspond to the graphics languages the Wisconsin Package supports. See Chapter 5, Using Graphics in the User's Guide for more information about configuring your process for graphics.

<CTRL>C

If you need to stop this program, use <Ctrl>C to reset your terminal and session as gracefully as possible. Searches and comparisons write out the results from the part of the search that is complete when you use <Ctrl>C. The graphics device should stop plotting the current page and start plotting the next page. If the current page is the last page, plotters should put the pen away and graphic terminals should return to interactive mode.

COLOR

PepPlot uses dashed lines when four-color plotting is not available. Alpha curves are red when color is available and dashed in black and white. Beta curves are blue in color and solid in black and white. In the hydrophilicity panel, the GES curve is green if color is available and dashed otherwise. In the residue schematic, hydrophilic and charged residues are red and green in the color plot and dashed in black and white. Hydrophobic residues are blue in color and solid in black and white.

There are three threshold lines across the Chou-Fasman panel (panel D). From top to bottom, these lines are as follows: the blue line is the threshold for the beginning of a beta-sheet; the red line is the threshold for the beginning of an alpha-helix; and the black line is the breaking line below which either kind of structure is no longer predicted. In black and white these lines are solid, short dashed, and long dashed, respectively.

COMMAND-LINE SUMMARY

All parameters for this program may be added to the command line. Use -CHEck to view the summary below and to specify parameters before the program executes. In the summary below, the capitalized letters in the parameter names are the letters that you must type in order to use the parameter. Square brackets ([ and ]) enclose parameter values that are optional. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.
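
The abbreviation rule can be illustrated with a short Python check. It is a sketch of the rule as stated above, not of GCG's own parser: the function names are invented, and two points are assumptions, namely that letters are compared without regard to case and that any spelling between the required letters and the full name is accepted (as -GARnier is for -GARnierfile in the example session).

```python
def required_prefix(name):
    """Return the leading capital letters of a parameter name."""
    body = name.lstrip("-")
    count = 0
    for ch in body:
        if not ch.isupper():
            break
        count += 1
    return body[:count]


def matches(typed, name):
    """True if `typed` is an acceptable spelling of parameter `name`."""
    typed_body = typed.split("=")[0].lstrip("-").lower()
    full = name.lstrip("-").lower()
    needed = required_prefix(name).lower()
    return typed_body.startswith(needed) and full.startswith(typed_body)


if __name__ == "__main__":
    for typed in ("-GAR", "-GARnier", "-GA", "-HWIN=11"):
        hits = [name for name in ("-GARnierfile", "-HWINdow", "-NOGES")
                if matches(typed, name)]
        print(typed, "->", hits)
```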


Minimal Syntax: % pepplot PIR:kihua -Default

Prompted Parameters:

-BEGin=1/END=100 sets the range of interest
-DENsity=87      sets the density in residues per 100 platen units
-MENu=a          sets the sequence display
      b            charged-polar-hydrophobic residue cartoon
      c            beta forming-breaking symbols
      d            Chou-Fasman alpha-beta prediction curves
      e            alpha forming-breaking symbols
      f            Chou-Fasman NH2-Ends prediction curves
      g            Chou-Fasman CO2-Ends prediction curves
      h            Chou-Fasman turn prediction curve
      i            helical hydrophobic moment for alpha and beta
      j            hydropathy and hydrophilicity

Local Data Files:

-DATa1=pepplot.dat    specifies amino acid attributes except for Garnier
-DATa2=ges.dat        specifies hydrophobicities for the GES curve
-DATa3=garnier.dat    specifies amino acid attributes for Garnier

Optional Parameters:

-CFFile[=kihua.cho]       writes the Chou and Fasman predictions
-GARnierfile[=kihua.gar]  writes the Garnier predictions
-MOMentfile[=kihua.mom]   writes the Hydrophobic moment values
-HWINdow=9                sets the window for hydropathy averaging
-GESWindow=20             sets the window for GES scale averaging
-NOGES                    suppresses the GES curve
-SHOwseq                  shows the sequence in panel 1
-BOXES                    draws a box around each quantitative panel
-NOTITle                  suppresses the plot's title
-NOPLOt                   suppresses the whole plot

All GCG graphics programs accept these and other switches. See the Using
Graphics chapter of the User's Guide for descriptions.

-FIGure[=FileName]  stores plot in a file for later input to FIGURE
-FONT=3             draws all text on the plot using font 3
-COLor=1            draws entire plot with pen in stall 1
-SCAle=1.2          enlarges the plot by 20 percent (zoom in)
-XPAN=10.0          moves plot to the right 10 platen units (pan right)
-YPAN=10.0          moves plot up 10 platen units (pan up)
-PORtrait           rotates plot 90 degrees

ACKNOWLEDGEMENT

PepPlot was written by Drs. Michael Gribskov and John Devereux of the Genetics Computer Group. It was first described in Nucl. Acids Res. 14(1); 327-334 (1986). The original code was revised by John Devereux to support command-line control for Version 5 and to support plotting the panels independently for Version 6.

LOCAL DATA FILES

The files described below supply auxiliary data to this program. The program automatically reads them from a public data directory unless you either 1) have a data file with exactly the same name in your current working directory; or 2) name a file on the command line with an expression like -DATa1=myfile.dat. For more information see Chapter 4, Using Data Files in the User's Guide.

PepPlot reads three different data files to find the residue-specific attributes: pepplot.dat, which contains Chou-Fasman and hydropathy values; ges.dat, which contains the GES scale; and garnier.dat, which contains the Garnier measures.

PARAMETER REFERENCE

You can set the parameters listed below from the command line. For more information, see "Using Program Parameters" in Chapter 3, Using Programs in the User's Guide.

-DENsity=87

sets the number of bases or amino acids per 100 platen units (PU). This is usually equivalent to the number of bases or amino acids per page. Output from different GCG graphics programs that are run at the same density can be compared by lining up the plots on a light box.

-MENu=ABCDEFGHIJ

specifies one or more plots to include in the output: (A) sequence display, (B) charged-polar-hydrophobic residue cartoon, (C) beta forming-breaking symbols, (D) Chou-Fasman alpha-beta prediction curves, (E) alpha forming-breaking symbols, (F) Chou-Fasman NH2-Ends prediction curves, (G) Chou-Fasman CO2-Ends prediction curves, (H) Chou-Fasman turn prediction curve, (I) helical hydrophobic moment for alpha and beta structures, (J) hydropathy and hydrophilicity.

-CFFile=kihua.cho

causes PepPlot to write an output file with the Chou and Fasman predictions for the sequence. The filename is the sequence name plus the filename extension .cho, unless you set it to something else.

-GARnierfile=kihua.gar

causes PepPlot to write an output file with the Garnier predictions for the sequence. The filename is the sequence name plus the filename extension .gar, unless you set it to something else.

-MOMentfile=kihua.mom

causes PepPlot to write an output file with the helical hydrophobic moments for the sequence. The filename is the sequence name plus the filename extension .mom, unless you set it to something else.

-HWINdow=9

sets the window size for calculating the Kyte and Doolittle hydropathy curve. The hydropathy window size must be between 1 and 50.

-GESWindow=20

sets the window size for calculating the Goldman, Engelman, and Steitz hydropathy curve. The GES window size must be between 1 and 50.

-NOGES

suppresses the Goldman, Engelman, and Steitz curve in the hydropathy (tenth) panel.

-SHOwseq

The sequence display in the top panel is normally suppressed if it seems too crowded. Use this parameter to insist that it be plotted no matter how crowded it seems.

-BOXES

draws a box around each quantitative panel (the ones with the tick marks).

-NOTITle

suppresses the plot's title.

-NOPLOt

suppresses the plot.

The parameters below apply to all Wisconsin Package graphics programs. These and many others are described in detail in Chapter 5, Using Graphics of the User's Guide.

-FIGure=programname.figure

writes the plot as a text file of plotting instructions suitable for input to the Figure program instead of sending it to the device specified in your graphics configuration.

-FONT=3

draws all text characters on the plot using Font 3 (see Appendix I).

-COLor=1

draws the entire plot with the pen in stall 1.

The parameters below let you expand or reduce the plot (zoom), move it in either direction (pan), or rotate it 90 degrees (rotate).

-SCAle=1.2

expands the plot by 20 percent by resetting the scaling factor (normally 1.0) to 1.2 (zoom in). You can expand the axes independently with -XSCAle and -YSCAle. Numbers less than 1.0 contract the plot (zoom out).

-XPAN=30.0

moves the plot to the right by 30 platen units (pan right).

-YPAN=30.0

moves the plot up by 30 platen units (pan up).

-PORtrait

rotates the plot 90 degrees. Usually, plots are displayed with the horizontal axis longer than the vertical (landscape). Note that plots are reduced or enlarged, depending on the platen size, to fill the page.

-NOCLIpping

If the data points on a line fall outside of the window in which the data are supposed to be represented, most programs will clip the graph at the edge of the window. This switch disables that clipping.

Printed: December 9, 1998 16:28 (1162)



Documentation Comments: doc-comments@gcg.com
Technical Support: help@gcg.com

Copyright (c) 1982, 1983, 1985, 1986, 1987, 1989, 1991, 1994, 1995, 1996, 1997, 1998 Genetics Computer Group Inc., a wholly owned subsidiary of Oxford Molecular Group, Inc. All rights reserved.

Licenses and Trademarks: Wisconsin Package is a trademark of Genetics Computer Group, Inc. GCG and the GCG logo are registered trademarks of Genetics Computer Group, Inc.

All other product names mentioned in this documentation may be trademarks, and if so, are trademarks or registered trademarks of their respective holders and are used in this documentation for identification purposes only.

Genetics Computer Group

www.gcg.com