What is the Revenant Database?
Why is Revenant manually curated?
How many structures do we have?
How are ancestral sequences predicted?
Which are the commonly used programs to estimate ancestral sequences?
How are ancestral sequences resurrected in the lab?
How to download Revenant sequences?
How to download the database information?
Q: What is the Revenant Database?
Revenant is a hand-curated collection of resurrected proteins, several of which have known crystallographic structures. Every entry in Revenant includes information about the ancestral node used in the reconstruction, the methodologies used for the estimation, sequence alignments, ligand characterization, etc. Several resurrected proteins additionally have biochemical and structural parameters.
Q: Why is Revenant manually curated?
Revenant was built by gathering information from research articles related to ancestral sequence reconstruction (ASR) and protein resurrection. Biochemical and biophysical parameters characterizing each protein, as well as sequence alignments and other information, were extracted from the primary citations. Further citations were extracted from the PDB database entries characterizing resurrected protein structures.
Q: How many structures do we have?
The database has a total of 211 proteins; 55 of them have known crystallographic structures.
Q: How are ancestral sequences predicted?
The sequences of ancestral proteins from extinct organisms can be estimated using computational methods known as ancestral sequence reconstruction (ASR). ASR uses a multiple sequence alignment (MSA) of extant homologous proteins and a phylogenetic tree to predict the sequences at all internal nodes, including the root of the tree. Once a tree has been obtained, the most plausible ancestral sequences can be deduced using a Bayesian approach that maximizes the probability of the amino acids in the ancient sequences given those present in extant proteins (Pupko et al. 2000).
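To make the Bayesian step concrete, here is a minimal toy sketch in Python for a single alignment column. It assumes a star-shaped tree (every extant sequence attached directly to the ancestor), a uniform prior, and a simple equal-rates substitution model over the 20 amino acids. These are illustrative assumptions only: real ASR programs work on the full phylogenetic tree with empirical substitution matrices, and the function names below are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def transition_prob(a: str, b: str, t: float) -> float:
    """Probability that residue a becomes residue b along a branch of length t.

    Toy equal-rates model (an assumption): every substitution is equally
    likely, scaled so that t is the expected number of substitutions per site.
    """
    n = len(AMINO_ACIDS)
    decay = math.exp(-n * t / (n - 1))
    if a == b:
        return 1.0 / n + (n - 1) / n * decay
    return 1.0 / n - decay / n


def ancestor_posterior(leaf_states, branch_lengths):
    """Posterior probability of each amino acid at the ancestral node.

    leaf_states: residues observed in the extant sequences at one column.
    branch_lengths: branch length from the ancestor to each extant sequence.
    """
    prior = 1.0 / len(AMINO_ACIDS)  # uniform prior (an assumption)
    joint = {}
    for a in AMINO_ACIDS:
        likelihood = prior
        for state, t in zip(leaf_states, branch_lengths):
            likelihood *= transition_prob(a, state, t)
        joint[a] = likelihood
    total = sum(joint.values())
    return {a: value / total for a, value in joint.items()}


# Three extant sequences show A, A and S at this column.
posterior = ancestor_posterior(["A", "A", "S"], [0.1, 0.1, 0.1])
best = max(posterior, key=posterior.get)
print(best, round(posterior[best], 3))
```

The residue with the highest posterior probability is taken as the reconstructed state at that position; repeating this over every column yields the most plausible ancestral sequence under the chosen model.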
Q: Which are the commonly used programs to estimate ancestral sequences?
Commonly used programs to obtain ancestral sequences are MrBayes (Huelsenbeck et al. 2008), PAML (Yang 1997), and FastML (Pupko et al. 2000).
Q: How are ancestral sequences resurrected in the lab?
Once an ancestral sequence has been estimated, it must be synthesized so that it can be cloned, expressed, and purified. If the reconstruction involves recent ancestors, site-directed mutagenesis of an extant gene can be used to obtain the ancestral sequence (Stackhouse et al. 1990). However, when remote ancestral proteins are resurrected, gene synthesis (Dillon and Rosen 1990) or the assembly of gene fragments is required.
Q: How to download Revenant sequences?
The sequences for each Revenant entry can be retrieved in Multi-Fasta format by following this link. The Multi-Fasta file contains all Revenant sequences, each with a description of the form ">RVxx|PDB_entry/entries|sequence_name". In the Fasta description, the fields are separated by the pipe character ("|"), and each has a different meaning: 1) The first field, RVxx, is the Revenant entry, where "xx" is the Revenant ID number. 2) The second field gives the PDB entry IDs if the Revenant entry has been crystallized; when there is more than one PDB entry (i.e., the protein possesses conformational diversity), the IDs are separated by semicolons (";"); when the Revenant entry has no structural information, the field is empty. 3) The third field is the name of the Revenant sequence; in general, this name corresponds to the abbreviation of the protein family to which the sequence belongs.
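The description format above can be parsed with a few lines of Python. The following is a minimal sketch, not an official Revenant tool: the function names are hypothetical, and the entry IDs, PDB IDs, family names, and sequences in the example are made up purely for illustration.

```python
def parse_header(header: str) -> dict:
    """Split a description line of the form '>RVxx|PDB_entries|sequence_name'."""
    fields = header.lstrip(">").strip().split("|", 2)
    if len(fields) != 3:
        raise ValueError(f"expected 3 pipe-separated fields: {header!r}")
    entry_id, pdb_field, name = fields
    # PDB IDs are separated by ';'; an empty field means no structure is available.
    pdb_ids = [p.strip() for p in pdb_field.split(";") if p.strip()]
    return {"entry": entry_id, "pdb_ids": pdb_ids, "name": name}


def read_multifasta(text: str) -> list:
    """Return a list of (parsed_header, sequence) pairs from Multi-Fasta text."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((parse_header(header), "".join(chunks)))
            header = line
            chunks = []
        elif header is not None:
            chunks.append(line)  # sequences may span several lines
    if header is not None:
        records.append((parse_header(header), "".join(chunks)))
    return records


# Made-up example data: these IDs and sequences are NOT real Revenant entries.
EXAMPLE = """>RV01|1ABC;2XYZ|FAM1
MKTAYIAK
QRQISFVK
>RV02||FAM2
MSTNPKPQ
"""

for info, sequence in read_multifasta(EXAMPLE):
    has_structure = bool(info["pdb_ids"])
    print(info["entry"], info["name"], info["pdb_ids"], has_structure, len(sequence))
```

Entries with an empty second field yield an empty `pdb_ids` list, which makes it easy to filter for resurrected proteins that have structural information.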
Q: How to download the database information?