associated with particular biological properties. Many of these patterns are
associated with post-translational modifications. On the ExPASy server
(www.expasy.org), you can compare your protein sequence with the collection
of patterns in PROSITE and find out which modifications your protein
is likely to undergo.
When you do sequence analysis, there is something you must ALWAYS remember:
Similar short sequences (such as those with fewer than 20 amino acids)
don’t ALWAYS have the same function. Thus, if a small sequence has been
shown to function as an ATP-binding site in a protein, and if the protein you’re
analyzing contains a short segment with EXACTLY the same sequence, it
doesn’t NECESSARILY mean that your protein is also an ATP-binding protein.
What you have is an indication that it may be an ATP-binding protein. Of
course, the longer the segment, the stronger the indication.
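To see why segment length matters so much, here is a rough back-of-the-envelope sketch in Python. It estimates how many times an exact segment would turn up purely by chance in a given amount of sequence. The function name, the 100-million-residue database size, and the assumption that all 20 amino acids are equally likely and independent are ours, chosen for illustration only; real proteins have uneven compositions, so treat the numbers as orders of magnitude.

```python
def expected_chance_hits(motif_length, sequence_length, alphabet_size=20):
    """Expected number of exact matches of one fixed motif occurring by chance.

    Simplifying assumption (ours): every residue is drawn independently
    and uniformly from an alphabet of `alphabet_size` letters.
    """
    # Number of positions at which the motif could start.
    positions = max(sequence_length - motif_length + 1, 0)
    # Probability that one given position matches the whole motif.
    per_position = (1 / alphabet_size) ** motif_length
    return positions * per_position


if __name__ == "__main__":
    database_size = 100_000_000  # hypothetical database size, in residues
    for length in (4, 8, 20):
        hits = expected_chance_hits(length, database_size)
        print(f"segment of {length:2d} residues: about {hits:.3g} chance matches")
```

Under these assumptions, a 4-residue segment is expected to occur by chance many times in a large database, while a 20-residue segment is not expected to occur by chance at all. That is the arithmetic behind treating a short match as a hint rather than as proof.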
This warning on the meaning of short similarity regions also applies to
PROSITE patterns — patterns listed there are often quite short. That said, we
can show you how to check your sequence to see whether it contains any
known PROSITE pattern.
Looking for PROSITE patterns
ScanProsite is a server that allows you to compare your protein with the list
of patterns contained in the PROSITE database. Highly trained specialists
designed each pattern in this database. If you find that your protein contains
a PROSITE pattern, this one fact can (often) give you a pretty clear indication
of its function.
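If you are curious about what such a comparison involves, the following Python sketch translates a PROSITE-style pattern into a regular expression and reports where it matches in a sequence. It is not how ScanProsite itself is implemented; it is a minimal illustration that handles only part of the pattern syntax (single residues, x for any residue, [..] for allowed residues, {..} for forbidden residues, repeat counts, and the < and > anchors). The function names and the test sequence are hypothetical, and the two patterns are the widely quoted ATP/GTP-binding P-loop and N-glycosylation patterns.

```python
import re

# One pattern element: optional '<', a residue / x / [..] / {..},
# an optional repeat such as (4) or (2,4), and an optional '>'.
_ELEMENT = re.compile(
    r"(<?)([A-Zx]|\[[A-Z]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?(>?)"
)


def prosite_to_regex(pattern):
    """Translate a (subset of) PROSITE pattern syntax into a Python regex."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        match = _ELEMENT.fullmatch(element)
        if match is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        start, core, low, high, end = match.groups()
        if core == "x":                 # x means any residue
            core = "."
        elif core.startswith("{"):      # {P} means any residue except P
            core = "[^" + core[1:-1] + "]"
        repeat = ""
        if low:                         # (4) -> {4}, (2,4) -> {2,4}
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(("^" if start else "") + core + repeat + ("$" if end else ""))
    return "".join(parts)


def scan_sequence(sequence, pattern):
    """Return (1-based position, matched segment) for every hit, overlaps included."""
    # The lookahead lets overlapping matches be reported too.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    test_sequence = "MTEYKLVVVGAGGVGKSALTIQ"  # short illustrative fragment
    print(scan_sequence(test_sequence, "[AG]-x(4)-G-K-[ST]"))  # P-loop pattern
    print(scan_sequence("AANGTQ", "N-{P}-[ST]-{P}"))           # N-glycosylation pattern
```

Keep the earlier warning in mind when you read the output: a hit tells you that the pattern is present in your sequence, not that the corresponding function has been demonstrated.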
Chapter 6: Working with a Single Protein Sequence
The PROSITE patterns
When biologists first started having access to
protein sequences, one of their first discoveries
was that some small conserved sequences are
often associated with important properties —
such as cellular localization, ligand binding, and
so on.
Amos Bairoch, the creator of Swiss-Prot,
started exploiting this discovery while building
his protein database. To help organize and
annotate the proteins, he created a collection
of small, well-conserved segments that he
could use to classify and analyze new proteins.
These special segments are known as patterns,
and they are still widely used today as a means
of characterizing new proteins. PROSITE is the
name Amos gave to this particular pattern database.
These days, PROSITE no longer contains
only patterns; it also includes a new, more
sophisticated type of model: the profile. Profiles
describe every position of an entire protein
family, not just a few highly conserved positions,
as patterns do. (These domain profiles are
very different from the property profiles we also
describe in this chapter.)