biologically active molecule. Such information would involve helical and
extended regions, loops, residues exposed at the surface, and so on.
If you have nothing else at hand, such information may be useful. For instance,
it may help you predict the potential effect of a mutation — or choose the
right protein fragment to make antibodies. Unfortunately, you’ll soon find out
that secondary structure predictions are far less useful than we would like
them to be. They fall short of being the real thing: the detailed spatial
representation of your molecule. In this section, we show you how to gather and
use 3-D information to better understand what goes on in your protein
sequence.
The good news for biologists is that plenty of experimental 3-D structure
information is available on the Internet. As was the case when the molecular
biologists of the world agreed to centralize their sequence data into the
GenBank/EMBL/DDBJ databank repository, all structural biologists have
agreed to deposit their 3-D structure coordinates into a single database: the
Protein Data Bank. Everybody refers to this database by its acronym: PDB.
Retrieving and displaying a 3-D structure from a PDB site
Like other data repositories, the Protein Data Bank (PDB) offers a rather
daunting interface that wasn’t particularly designed with the nonspecialist in
mind. Yet, in those rare cases where you know precisely what you’re looking
for — and even know what you’re doing! — you may want to retrieve a protein
3-D structure dataset directly from one of the PDB sites. Before you query the
PDB, be sure to collect some precise information about the structure you’re
looking for, such as the exact protein name or (even better) its PDB identifier.
You can usually obtain this identifier from such user-friendly sources as
the ExPASy/Swiss-Prot server or by using the various NCBI query tools. (See
Chapter 4 for more on how to use these tools.)
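If you'd rather script the retrieval than click through a Web page, the short Python sketch below shows one way it might be done once you have a PDB identifier in hand. Several things here are our assumptions and not part of this chapter: the function names are made up for illustration, the identifier check assumes the classic four-character form (a digit followed by three letters or digits), and the download address assumes the RCSB file server at files.rcsb.org. Treat it as a starting point, not as the official way to query the PDB.

```python
import re
import urllib.request

# Assumption: classic PDB identifiers are 4 characters long, a digit followed
# by three letters or digits (for example 1CRZ).
_PDB_ID_PATTERN = re.compile(r"[0-9][A-Za-z0-9]{3}")

# Assumption: RCSB file-download address; it is not taken from this chapter.
DOWNLOAD_TEMPLATE = "https://files.rcsb.org/download/{pdb_id}.pdb"


def normalize_pdb_id(raw_id):
    """Return the identifier in upper case, or raise ValueError if malformed."""
    candidate = raw_id.strip()
    if not _PDB_ID_PATTERN.fullmatch(candidate):
        raise ValueError(f"not a 4-character PDB identifier: {raw_id!r}")
    return candidate.upper()


def build_download_url(raw_id):
    """Build the (assumed) download address for one PDB entry."""
    return DOWNLOAD_TEMPLATE.format(pdb_id=normalize_pdb_id(raw_id))


def fetch_structure(raw_id, timeout=30):
    """Download the coordinate file as text (requires network access)."""
    url = build_download_url(raw_id)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Only builds and prints the address; call fetch_structure() to download.
    print(build_download_url("1crz"))
```

Checking the identifier before going online catches the most common slip, which is typing a protein name (such as TolB) where the PDB ID code (such as 1CRZ) is expected.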
Here’s how to obtain and display a PDB structure. For now, let’s assume that
we are looking for the structure of an Escherichia coli (E. coli) protein named
TolB, with PDB ID code 1CRZ.
1. Point your browser to www.rcsb.org/pdb/.
This takes you to the PDB home page.
2. Enter the PDB ID code 1CRZ in the search box at the top center of
the page.
3. Click the adjacent SEARCH button.
Figure 11-3 shows the resulting output. This one-page Structure Summary
form presents the essential information on this protein structure —
including a small graphic, a bibliographical reference, its function, its
Chapter 11: Working with Protein 3-D Structures