PS03
Solvent Accessibility
Due 9 October, 2025
1
Run solvent accessibility calculations for the small hydrophobic seed protein crambin from Crambe hispanica (1CRN) using naccess. Which four residues are least accessible to solvent (total side chain, by percent)? What does that suggest as the major stabilizing factor of this (very) small protein? Prepare a plot of the crambin structure illustrating these interactions.
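Once naccess has written its .rsa file, ranking residues by relative side chain accessibility can be scripted. The following is a minimal sketch, not part of the course tools. It assumes each RES line ends with ten numeric columns in the order All-atoms, Total-Side, Main-Chain, Non-polar, All-polar (absolute then relative for each); check that against the REM header lines of your own output. The function names and the sample lines are made up for illustration and are not real crambin values.

```python
# Sketch: rank residues by relative total-side-chain accessibility.
# ASSUMPTION: every "RES" line ends with ten numeric columns ordered as
# All-atoms, Total-Side, Main-Chain, Non-polar, All-polar (ABS, REL each).

def parse_rsa(text):
    """Return a list of (res_name, res_id, side_abs, side_rel) tuples."""
    rows = []
    for line in text.splitlines():
        if not line.startswith("RES"):
            continue
        fields = line.split()
        values = [float(v) for v in fields[-10:]]
        res_name = fields[1]
        # Whatever sits between the name and the numbers: chain (if any) + number.
        res_id = "".join(fields[2:-10])
        rows.append((res_name, res_id, values[2], values[3]))
    return rows


def least_accessible(rows, n=4):
    """Return the n rows with the smallest relative side chain accessibility."""
    return sorted(rows, key=lambda row: row[3])[:n]


# Made-up lines for illustration only; NOT real naccess output for 1CRN.
SAMPLE = """\
REM  made-up example lines
RES THR A   1    80.00  57.0  45.00  44.0  35.00  93.0  50.00  65.0  30.00  48.0
RES CYS A   3     5.00   3.7   0.50   0.5   4.50  12.0   1.00   1.0   4.00  12.0
RES ILE A   7    20.00  11.4   2.00   1.4  18.00  47.0  10.00   7.0  10.00  27.0
RES ARG A  10   150.00  62.8 140.00  69.0  10.00  26.0  60.00  70.0  90.00  60.0
"""

for name, res_id, side_abs, side_rel in least_accessible(parse_rsa(SAMPLE), n=2):
    print(name, res_id, side_rel)
```

To use it on real output, read the file contents (for example with `open("1crn.rsa").read()`, where the file name is whatever naccess produced for you) and pass the text to `parse_rsa`.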
2
The accessible surface area story is a bit different for a larger protein. Run solvent accessibility calculations for hexokinase from Sulfolobus tokodaii (thermophilic archaeon) (2E2O). What is the reaction catalyzed by hexokinase? This enzyme is 299 amino acids long, in comparison to the 46 amino acids of crambin. Using the summed absolute accessible surface area for the total side chain calculation, compare the accessible surface per residue for these two proteins. Is the result what you expected? What can you conclude about the conformational stability of this larger protein?
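The per-residue comparison is a sum divided by a count. The sketch below adds up the absolute total side chain column over all RES lines and divides by the number of RES lines. It makes the same column-order assumption as any whitespace-based reading of the .rsa file (Total-Side absolute is the third of the ten trailing numbers), and the sample lines are made up. Note that the count is the number of residues present in the coordinate file, which may differ from the sequence length if residues are missing from the structure.

```python
# Sketch: summed absolute side chain accessible surface area per residue.
# ASSUMPTION: the third of the ten trailing numbers on each "RES" line is
# the absolute Total-Side value.

def side_chain_area_per_residue(rsa_text):
    """Return (summed absolute side chain area) / (number of RES lines)."""
    total = 0.0
    count = 0
    for line in rsa_text.splitlines():
        if line.startswith("RES"):
            values = [float(v) for v in line.split()[-10:]]
            total += values[2]
            count += 1
    if count == 0:
        raise ValueError("no RES lines found")
    return total / count


# Made-up lines for illustration only; NOT real naccess output.
SAMPLE = """\
RES THR A   1    80.00  57.0  45.00  44.0  35.00  93.0  50.00  65.0  30.00  48.0
RES CYS A   3     5.00   3.7   0.50   0.5   4.50  12.0   1.00   1.0   4.00  12.0
RES ILE A   7    20.00  11.4   2.00   1.4  18.00  47.0  10.00   7.0  10.00  27.0
"""

print(round(side_chain_area_per_residue(SAMPLE), 2))
```

Running the same function on the crambin and hexokinase .rsa files gives the two numbers to compare.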
3
Bacteriophage T4 lysozyme may well be the best studied protein structure of all time (917 structures at the RCSB PDB currently; it crystallizes exceptionally well, making it a common choice for structural studies). Using one of those structures (2LZM), run solvent accessibility calculations. Prepare a plot of the protein illustrating all side chains which are less than or equal to two percent accessible (ninety-eight percent buried, see Total Sidechain, Relative). What do you notice about these residues, and what does that suggest for larger proteins, which are typically made of a number of small(er) domains?
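Picking out the buried side chains by hand is tedious for a protein of this size, so the filter can be scripted. This is a rough sketch under the same column-order assumption as before (Total-Side relative is the fourth of the ten trailing numbers on a RES line). The `resi 3+7+15` string it builds follows PyMOL selection syntax; that choice of viewer is an assumption, so adapt the output to whatever graphics program you use. Negative relative values are skipped on the assumption that they flag residues naccess could not scale. The sample lines are made up.

```python
import string

# Sketch: list residues whose side chains are at most `cutoff` percent accessible.
# ASSUMPTION: the fourth of the ten trailing numbers on each "RES" line is the
# relative Total-Side value, and the token before the numbers is the residue
# number (possibly fused with a chain letter).

def buried_residue_numbers(rsa_text, cutoff=2.0):
    numbers = []
    for line in rsa_text.splitlines():
        if not line.startswith("RES"):
            continue
        fields = line.split()
        side_rel = float(fields[-10:][3])
        if side_rel < 0 or side_rel > cutoff:
            continue  # skip unscaled (negative) or exposed residues
        # Drop a chain letter if it is fused to the residue number.
        numbers.append(fields[-11].lstrip(string.ascii_letters))
    return numbers


def selection_string(numbers):
    """Build a PyMOL-style residue selection (viewer choice is an assumption)."""
    return "resi " + "+".join(numbers)


# Made-up lines for illustration only; NOT real naccess output for 2LZM.
SAMPLE = """\
RES CYS A   3     5.00   3.7   0.50   0.5   4.50  12.0   1.00   1.0   4.00  12.0
RES ILE A   7    20.00  11.4   2.00   1.4  18.00  47.0  10.00   7.0  10.00  27.0
RES ARG A  10   150.00  62.8 140.00  69.0  10.00  26.0  60.00  70.0  90.00  60.0
RES LEU A  15    12.00   6.7   3.00   2.0   9.00  24.0   8.00   5.0   4.00  11.0
"""

print(selection_string(buried_residue_numbers(SAMPLE)))
```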
4
Run solvent accessibility calculations for the November 2017 PDBselect list. Now, that list is based on polypeptide chains, which may come in structures with other polypeptides. In order to simplify our analysis logic later, use an option of gfp, the -c option, to output only the specified chain. This option requires a five character input code, the four characters of the PDB ID and the one character chain ID, that is the full five character ID listed in the PDBselect file. The naccess program is a combination of shell script and compiled executable which does not lend itself easily to use with xargs. There is a script which will run naccess for all PDB files (*.pdb) in the current working directory, run-naccess. This script needs no arguments or options, but is likely to run for a while. In order to make sense out of these data, the program accres will mine the *.rsa files extracting the relative accessibilities for the side chains of a given amino acid type. This program "bins" each residue by accessibility, essentially producing the histogram of number of residues as a function of accessibility in bins of width 10 percent. Command line syntax is illustrated below:
[user@451]$ accres res=ALA *.rsa
Here, in brief, the program is called with one named argument (res=ALA), which gives the three letter code of the amino acid for which to prepare the report, while *.rsa relies on shell wildcard expansion to supply every relative solvent accessibility file in the current directory on the command line. Run this analysis on the PDBselect list for all twenty amino acids. Use these data to establish a scale of hydrophobicity for the amino acids.
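To make the binning idea concrete, here is a small sketch of a 10 percent histogram and one possible way to reduce it to a single number per amino acid. It is not the accres source and is not guaranteed to match its output: the overflow bin for values of 100 percent and above, the skipping of negative values, and the use of the fraction of residues in the lowest bin as a "buriedness" score are all assumptions made for illustration. The input records are made up.

```python
from collections import defaultdict

# Sketch: histogram of relative side chain accessibility in 10 percent bins.
# ASSUMPTIONS (not taken from accres): values >= 100 go into one overflow bin,
# negative values are ignored, and "buried fraction" means the share of
# residues falling in the lowest bin.

def bin_accessibilities(records, width=10.0, n_bins=10):
    """records: iterable of (res_name, side_rel). Returns {res_name: counts}."""
    hist = defaultdict(lambda: [0] * (n_bins + 1))
    for res_name, side_rel in records:
        if side_rel < 0:
            continue
        index = min(int(side_rel // width), n_bins)
        hist[res_name][index] += 1
    return dict(hist)


def buried_fraction(counts):
    """Fraction of residues in the lowest-accessibility bin."""
    total = sum(counts)
    return counts[0] / total if total else 0.0


# Made-up records for illustration only.
records = [("ALA", 3.0), ("ALA", 15.2), ("ALA", 8.9),
           ("LYS", 55.0), ("LYS", 101.3), ("LYS", 9.9)]

hist = bin_accessibilities(records)
ranking = sorted(hist, key=lambda name: buried_fraction(hist[name]), reverse=True)
for name in ranking:
    print(name, hist[name], round(buried_fraction(hist[name]), 2))
```

Other summaries of the histogram (mean accessibility, fraction below some other cutoff) are equally defensible; part of the exercise is deciding which one to use and why.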
With these data, compare your results to a measured scale of hydrophobicity such as that of Fauchere and Pliska, 1983, who generated a hydrophobicity scale by measuring the free energy of transfer of amino acids from water to n-octanol. In their scale, positive numbers are hydrophobic, negative numbers hydrophilic, and glycine is the neutral point. Do your results agree? Why might this be (or not) the case?
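Because a survey-based scale and a transfer free energy scale are in different units, one reasonable way to ask whether they "agree" is to compare the rank order of the amino acids. The sketch below computes a Spearman rank correlation in plain Python. The keys and numbers in the two dictionaries are placeholders, not real amino acids and not values from either scale; substitute your own survey numbers and the published values.

```python
# Sketch: Spearman rank correlation between two scales keyed by residue name.
# All names and numbers below are PLACEHOLDERS, not data from any real scale.

def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(scale_a, scale_b):
    """Pearson correlation of the ranks, over the keys both scales share."""
    keys = sorted(set(scale_a) & set(scale_b))
    ra = ranks([scale_a[k] for k in keys])
    rb = ranks([scale_b[k] for k in keys])
    n = len(keys)
    mean_a, mean_b = sum(ra) / n, sum(rb) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(ra, rb))
    var_a = sum((a - mean_a) ** 2 for a in ra)
    var_b = sum((b - mean_b) ** 2 for b in rb)
    return cov / (var_a * var_b) ** 0.5


survey = {"AAA": 0.6, "BBB": 0.1, "CCC": 0.4, "DDD": 0.9}      # placeholder
measured = {"AAA": 1.2, "BBB": -0.8, "CCC": 0.3, "DDD": 0.9}   # placeholder

print(round(spearman(survey, measured), 3))
```

A value near 1 means the two scales order the residues similarly; the residues that move the most between the two rankings are the interesting ones to discuss.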
The laboratory of George Rose performed a similar study in 1985, before bioinformatics was hip, publishing the results in a Science paper. How do their results compare to your results? On which data set would you prefer to base a hydrophobicity scale, model compound data or protein survey data? Why?
And you're not alone in making a hydrophobicity scale; apparently at least 98 others have done it before you (and no, they didn't all agree). Know you're not alone.
Last updated at 08:39:49 on 2025-12-04.