The content of the Comments section is often sufficient to decide whether a
match (with BLAST or otherwise) of your query with a particular Swiss-Prot
entry makes sense or not.
Let’s skip the Copyright notice — the stuff for the lawyers out there — and
jump into the big Cross-References section that comes next.
The Cross-References
The Cross-References section contains links to entries in other databases
that contain some information about our protein.
Most of the links in this section take you to new databases. With all this Web
jumping about from database to database, it can be easy to get lost and lose
your original Swiss-Prot entry. To avoid this confusion, when you browse the
Cross-References section, open the links in a new window: Click the link with
the right button of your mouse (not the left, as you usually do) and choose
Open in a New Window from the context menu that appears.
There are bunches of fields in the Cross-References section. The following list
will help you keep at least the major ones straight:
EMBL: This field contains all the links you need to the world of nucleotide
sequences. As we point out in Chapter 3, numerous GenBank, EMBL, and DDBJ
entries can be related to a single protein sequence. Click the CoDingSequence
link to send a query to the EBI SRS server to find the CDS (CoDing Segment,
the part of the nucleotide sequence that precisely encodes the protein from
start to stop) of these entries.
PIR: This historical field contains the accession numbers of the
corresponding entries in the late Protein Information Resource (PIR), now
discontinued but incorporated into the UniProt consortium.
UniGene: A link to NCBI’s gene expression database (see also the
CleanEx field for linking to DNA chip experimental data).
PDB: This field contains a link to a sequence homologue of the current
query, for which 3-D structural information (X-ray crystal structures) is
available. Here, this is Swiss-Prot entry P11362, the basic fibroblast
growth factor receptor 1, for which several crystal structures exist in the
Protein Data Bank (PDB). Click the PDB link to see the relevant page. (The
PDB ID here is 1FGK.)
ModBase: This field links you to ModBase, a database of theoretically
calculated models, not experimentally determined structures. The
models may contain significant errors, as pointed out by the authors
themselves.
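If you would rather survey the cross-references with a script than by clicking, the fields listed above can be pulled out of a saved entry. The following is only a minimal sketch. It assumes you have saved the entry in Swiss-Prot flat-file text format, where each cross-reference sits on its own line that starts with the DR line code, with fields separated by semicolons. The sample entry, its accession numbers, and the function name parse_cross_references are all made up for illustration.

```python
from collections import defaultdict

# Made-up excerpt in the style of a Swiss-Prot flat-file entry.
# Every identifier below is an invented placeholder, not real data.
SAMPLE_ENTRY = """\
ID   EXAMPLE_HUMAN           Reviewed;         100 AA.
DR   EMBL; X00001; CAA00001.1; -; mRNA.
DR   EMBL; X00002; CAA00002.1; -; Genomic_DNA.
DR   PIR; A00001; A00001.
DR   PDB; 1ABC; X-ray; 2.00 A; A=1-100.
"""


def parse_cross_references(entry_text):
    """Group the DR (cross-reference) lines of an entry by database name.

    Hypothetical helper: returns a dict that maps each database name
    (EMBL, PIR, PDB, ...) to a list of field lists, one per DR line.
    """
    refs = defaultdict(list)
    for line in entry_text.splitlines():
        # Only lines that begin with the DR line code are cross-references.
        if not line.startswith("DR   "):
            continue
        # Drop the line code and the trailing period, then split on ';'.
        body = line[5:].strip().rstrip(".")
        fields = [field.strip() for field in body.split(";")]
        database, identifiers = fields[0], fields[1:]
        refs[database].append(identifiers)
    return dict(refs)


if __name__ == "__main__":
    for database, entries in parse_cross_references(SAMPLE_ENTRY).items():
        print(database, len(entries), entries)
```

Grouping by database name mirrors the way the list above is organized: you can see at a glance how many nucleotide entries (EMBL) or structures (PDB) hang off a single protein before you start opening links.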