Many of the methods we show you here rely on the sliding windows
technique. (For more on these guys, check out the “Sliding windows” sidebar.)
Predictions based on sliding windows aren’t very sensitive or very precise,
but they are very robust. If you see a strong signal when using a method
based on sliding windows, chances are you’re looking at something that’s a
genuine biological signal.
Ironically, the main advantage of the methods that use sliding windows is
also one of their main shortcomings: They don’t interpret the results for you;
they provide only a raw signal.
Because you have to do the interpretation yourself, two simple rules apply
when interpreting the results of an analysis based on sliding windows:
• Be very strict and consider only strong signals.
• Check the robustness of your signal. Good signals aren’t shy. They don’t
  go away simply because you increase or decrease the window size by
  one amino acid or replace a property table with another similar table.
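
The second rule can be sketched in a few lines of Python. This is a minimal illustration, not a standard tool: the function names, the example numbers, and the choice to cut the window short at the ends of the sequence are all assumptions made for the example. It takes per-residue scale values (already looked up in a property table), recomputes the profile with the window one residue smaller and one residue larger, and reports whether the strongest peak clears your threshold every time.

```python
def window_profile(values, window):
    """Average the scale values in a window placed around each position.

    Assumption: at the ends of the sequence the window is simply cut
    short, so the average covers only the residues that exist.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    profile = []
    for center in range(len(values)):
        start = center - window // 2
        chunk = values[max(0, start):start + window]
        profile.append(sum(chunk) / len(chunk))
    return profile


def signal_is_robust(values, window, threshold):
    """True if the highest peak stays at or above the threshold for
    window - 1, window, and window + 1 (window must be at least 2)."""
    return all(
        max(window_profile(values, w)) >= threshold
        for w in (window - 1, window, window + 1)
    )


if __name__ == "__main__":
    # Hypothetical scale values: a long hydrophobic stretch between polar ends.
    broad = [-3.5] * 5 + [3.8] * 22 + [-3.5] * 5
    # Hypothetical scale values: one isolated high value in a polar region.
    spike = [-3.5] * 15 + [4.5] + [-3.5] * 15
    print(signal_is_robust(broad, window=19, threshold=1.6))
    print(signal_is_robust(spike, window=19, threshold=1.6))
```

To test the other half of the rule, you could rerun the same check with values looked up in a second, similar property table and see whether the answer changes.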
Chapter 6: Working with a Single Protein Sequence
Sliding windows
The “sliding windows” technique is the most ancient way of looking at
sequences. The principle is very simple. What you need is a chemical
property and a list of values associated with each of the 20 amino
acids. This property can be any measurable physico-chemical parameter,
such as size, polarity, hydrophobicity, or even the propensity of amino
acids to be in a specific structural state. The values in this table
are the amino acids’ scale values. Many such tables exist that have
been determined experimentally for almost any characteristic you can
think of.
After you have this table, you choose a window size and slide it along
your sequence. When the window is centered on an amino acid, the scale
values associated with the amino acids it contains are summed up and
averaged. The resulting value is associated with the central amino
acid, and then the window is shifted by one amino acid. This process
goes on to the end of the sequence. The following example illustrates
how the window slides along the sequence. The letter inside each
<---X---> marker indicates the amino acid on which the window is
centered.
Sequence     AGVCFGTRESALPTFREDCYGHZPLIKJFDESAQZ
         <---A--->      Window 1
          <---G--->     Window 2
           <---V--->    Window 3
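
The sliding procedure described above can be sketched in Python as follows. Treat this as one possible reading rather than a reference implementation: the function name is made up for the example, the Kyte-Doolittle hydropathy table stands in for whichever scale you prefer, and shortening the window at the two ends of the sequence is an assumption (other programs may pad the sequence or skip the ends). The demo uses only the first 22 residues of the example sequence, which are all among the 20 standard amino acids covered by the table.

```python
# Kyte-Doolittle hydropathy values, used here as one example of a scale table.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def sliding_window_profile(sequence, scale, window=9):
    """Return one averaged scale value per residue of the sequence.

    Assumption: near the ends of the sequence the window is truncated,
    so the average covers only the residues that actually exist.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    values = [scale[aa] for aa in sequence]  # KeyError on unknown letters
    profile = []
    for i in range(len(values)):
        chunk = values[max(0, i - half):i + half + 1]
        profile.append(sum(chunk) / len(chunk))  # value for central residue
    return profile


if __name__ == "__main__":
    seq = "AGVCFGTRESALPTFREDCYGH"
    for aa, value in zip(seq, sliding_window_profile(seq, KYTE_DOOLITTLE)):
        print(aa, round(value, 2))
```

Requiring an odd window size keeps the window symmetric around the central amino acid, which matches the <---X---> picture above.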
When the sliding operation is finished, the values associated with
every amino acid are plotted against the sequence. Biologists call this
display a property profile (do not confuse it with a domain profile,
which is a formulation of a multiple sequence alignment). If you’re
lucky, you may be able to identify transmembrane segments, loops, or
coiled-coil regions by using sliding windows. Hydrophobicity is the
most popular analysis because it’s a good indicator of transmembrane
segments or core regions within a protein.
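
If you don’t have plotting software at hand, a property profile can be drawn as plain text. The sketch below is only an illustration: the function name, the bar width, and the sample values are assumptions, and it expects a list of already averaged values, one per amino acid.

```python
def text_profile(sequence, profile, width=20):
    """Draw one line per residue: letter, averaged value, and a bar.

    The bars are scaled between the lowest and highest value in the
    profile, so only their relative lengths are meaningful.
    """
    low, high = min(profile), max(profile)
    span = (high - low) or 1.0  # avoid dividing by zero on a flat profile
    lines = []
    for aa, value in zip(sequence, profile):
        bar = "#" * round((value - low) / span * width)
        lines.append(f"{aa} {value:6.2f} {bar}".rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical averaged values for a five-residue stretch.
    print(text_profile("KLIVE", [-1.2, 0.4, 2.9, 1.7, -0.8]))
```

Long bars that stay long over many consecutive residues are the kind of strong, broad signal worth a closer look.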