common feature is often their secondary structure. This makes it difficult to
find them using any of the homology-based methods we routinely use for
finding proteins. (For more on such methods, see Chapter 7.)
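To see why sequence comparison alone can miss related RNAs, here is a minimal Python sketch. The two short RNAs below are invented for illustration (they are not real family members), and the helper names percent_identity and forms_hairpin are our own. Both sequences fold into the same 5-base-pair hairpin, yet they do not share a single identical position.

```python
# Hypothetical toy example: two made-up RNAs with the same hairpin
# structure but no sequence identity. Not taken from any real RNA family.

COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def percent_identity(a, b):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def forms_hairpin(seq, stem_len):
    """True if the first stem_len bases can form Watson-Crick pairs
    with the last stem_len bases (read in reverse order)."""
    left = seq[:stem_len]
    right = seq[-stem_len:]
    return all(COMPLEMENT[x] == y for x, y in zip(left, reversed(right)))


# 5' stem + loop + 3' stem, written out so the structure is visible.
rna_a = "GGCAC" + "UUCG" + "GUGCC"
rna_b = "CAGUG" + "GAAA" + "CACUG"

if __name__ == "__main__":
    print("identity (%):", percent_identity(rna_a, rna_b))
    print("rna_a folds into the hairpin:", forms_hairpin(rna_a, 5))
    print("rna_b folds into the hairpin:", forms_hairpin(rna_b, 5))
```

A search that scores only matching letters would see nothing in common between these two sequences, even though their base-pairing pattern is identical.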
The methods we show you in this section search databases using not only
the sequence of an RNA but also what is known about its secondary
structure. If you have just used mfold to predict the fold of the RNA
family you’re interested in, you can now use the fold you’ve discovered to
look for new members of the family. In this section, we first show you how to
use tRNAscan-SE to look specifically for tRNAs, and then we introduce PatScan,
a method that allows you to look for RNAs with a secondary structure that you
specify yourself.
Finding tRNAs in a genome
tRNAs are small non-coding RNAs that the cell uses to assemble proteins.
They constitute one of the best examples of important non-coding genes that
are difficult to locate in a genome. All bacterial and eukaryotic genomes
contain tRNAs. Unfortunately, these ubiquitous genes are just as difficult to
identify as any other small non-coding RNA. Finding them requires database
search techniques much more sophisticated than BLAST.
The good news here is that there are some very efficient programs out there
for predicting tRNA genes in a eukaryotic or a prokaryotic genome. The
state-of-the-art method is tRNAscan-SE, which you can access from the server at
Washington University in St. Louis:
selab.janelia.org/tRNAscan-SE/.
Using PatScan to look for RNA patterns
tRNAscan-SE is perfect if you’re interested only in tRNAs. However, if your
interest lies in another specific RNA family, you want a tool that’s more
general and lets you use your own knowledge of the secondary structure of
your RNA family. PatScan is ideal for this purpose.
PatScan lets you search databases with patterns that can accurately describe
RNA secondary structures. From a computational point of view, this type of
search is quite expensive, which is why PatScan requires your e-mail address
and returns a result after a few hours. (In our experience, however, PatScan
has never taken more than half an hour to return the result.)
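To give a feel for what searching with a structural pattern means, here is a minimal Python sketch that scans a sequence for a simple hairpin: a stem of fixed length, a loop within a length range, and a second stem that pairs with the first. This is not PatScan's pattern syntax or its algorithm. The function name find_hairpins, its parameters, and the choice to accept G-U wobble pairs are all our own assumptions for illustration.

```python
# Illustrative sketch only: a brute-force hairpin scan, not PatScan itself.
# Allowed base pairs: Watson-Crick plus G-U wobble (an assumption made here).
PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def find_hairpins(seq, stem_len=4, min_loop=3, max_loop=8):
    """Return a list of (start, stem5, loop, stem3) for every window in seq
    whose first stem_len bases pair with its last stem_len bases, with a
    loop of min_loop..max_loop bases in between."""
    hits = []
    n = len(seq)
    for start in range(n):
        for loop_len in range(min_loop, max_loop + 1):
            end = start + 2 * stem_len + loop_len
            if end > n:
                break  # longer loops cannot fit from this start either
            stem5 = seq[start:start + stem_len]
            loop = seq[start + stem_len:end - stem_len]
            stem3 = seq[end - stem_len:end]
            # The 5' stem pairs with the 3' stem read backwards.
            if all((x, y) in PAIRS for x, y in zip(stem5, reversed(stem3))):
                hits.append((start, stem5, loop, stem3))
    return hits


if __name__ == "__main__":
    # Made-up test sequence with one obvious hairpin embedded in poly-A.
    for hit in find_hairpins("AAAAGGCACUUCGGUGCCAAAA", stem_len=5):
        print(hit)
```

Every start position is tested against every allowed loop length, so the work grows quickly as the pattern gets looser and the database gets larger. That gives a rough idea of why this kind of search costs more than a plain sequence comparison.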
Chapter 12: Working with RNA