The UMass protein crystallography home page is found at xtal.biochem.umass.edu
This year's journal club is organized by Scott Garman. We meet Fridays at 11:00am in 869 LGRT.
This year, we will process some X-ray diffraction data, solve a couple of molecular replacement problems, calculate experimentally phased electron density maps, and build models into electron density maps.
0. Collecting diffraction data from crystals:
The first thing we did this year was collect some diffraction data from Derrick's crystal on our new X-ray setup. Thanks Derrick!
1. Installing the needed software:
A. Download the ccp4 suite of crystallographic programs. Version 7.0 is the current version. This package will be the workhorse of our class this semester.
• Follow the instructions for installing the package. Different operating systems will have different installations.
• If you are installing on a laptop, make sure you have enough memory and storage space available.
• Beware of spaces in directory and file names. Unix allows them, but many command-line programs and scripts handle them poorly. The CCP4i graphical interface won't run correctly if you have spaces anywhere in the directory tree where you are running CCP4.
• Run the install tests in CCP4 to make sure your installation is working.
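A quick way to check a working directory for the space problem mentioned above is a few lines of Python. This is only a sketch; the function name is made up for illustration and is not part of CCP4.

```python
from pathlib import Path


def components_with_spaces(path):
    """Return the parts of a path that contain whitespace (hypothetical helper)."""
    return [part for part in Path(path).parts
            if any(ch.isspace() for ch in part)]


# Check the directory you plan to run CCP4 from (here: the current directory).
bad_parts = components_with_spaces(Path.cwd())
if bad_parts:
    print("Rename or avoid these directories:", bad_parts)
else:
    print("No spaces found in", Path.cwd())
```

An empty list means every directory in the tree is free of spaces.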
B. Download and install Coot.
• Coot is now packaged with the CCP4 suite.
• You can also install a standalone version.
• If you are a Mac user, there is a Mac OS X build on Bill Scott's website here. His site on scientific computing on OS X (here) is tremendously useful for getting started.
• If you are a Microsoft Windows user, there is a build for that OS here.
• If you are a Linux user, there are binaries available from the main Coot page here.
• Once you have installed Coot, open the program; from within Coot you can download coordinates and structure factors from the Protein Data Bank (PDB). Alternatively, download your favorite coordinates from the PDB, save them as a text file, and open that file from the File menu in Coot.
• Make sure you can open coordinates and maps in Coot. RefMac from the CCP4 suite (see section A above) will calculate maps from structure factor files.
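If you prefer to fetch coordinates from a script rather than a browser, something like the following should work. It is a sketch that assumes the RCSB download address pattern shown in the code is still valid; the function names are made up for illustration.

```python
import urllib.request

# Assumption: the RCSB file server serves PDB-format entries at this address.
RCSB_DOWNLOAD = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_url(pdb_id):
    """Build the download address for a four-character PDB ID."""
    return RCSB_DOWNLOAD.format(pdb_id=pdb_id.upper())


def fetch_pdb(pdb_id, dest=None):
    """Download one entry and return the name of the file written."""
    dest = dest or f"{pdb_id.upper()}.pdb"
    urllib.request.urlretrieve(pdb_url(pdb_id), dest)
    return dest


# Example (needs a network connection):
# fetch_pdb("1ktb")   # writes 1KTB.pdb, which you can then open in Coot
```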
C. Download and install the Phenix package.
• The homepage for the suite is here. You must register as a user to get the password to download the software.
• The Phenix suite does many of the same things as the CCP4 suite (experimental phasing, molecular replacement, refinement, etc.) using different algorithms.
2. Processing X-ray diffraction images
The first task we will take on is processing diffraction images into intensities.
See if you can process the following frames into a scaled reflection file. The frames were collected in house on our rotating anode (wavelength λ=1.5418 Å) with the RaxisIV+ detector set to 170 mm detector distance and a beam center of 150.6 mm in x and 150.3 mm in y. Try iMosflm (included in the ccp4 package) to process the images. The goal is to extract the intensities of the whole h, k, l reflection set.
• Here are the images (in a 1.3 GB file, so make sure you have disk space and a fast connection!)
• They are compressed with zip for faster download, so the first thing to do after downloading is to decompress them. How you do this depends on your operating system; try double-clicking the file to uncompress it into an "images" folder.
• Put the images folder in a sensible place on your computer.
• Create another directory for putting the processing results and see if you can process the images.
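Before processing, it helps to know roughly what resolution the detector geometry can record. The sketch below applies Bragg's law to the wavelength and detector distance given above. The 150 mm edge radius is an assumption (a 300 mm square plate with the beam near the center), not a number from the experiment.

```python
import math


def resolution_at_radius(radius_mm, distance_mm, wavelength_A):
    """d-spacing (in Angstroms) recorded at a given radius on a flat detector."""
    two_theta = math.atan2(radius_mm, distance_mm)   # scattering angle
    return wavelength_A / (2.0 * math.sin(two_theta / 2.0))


WAVELENGTH_A = 1.5418    # from the text above
DISTANCE_MM = 170.0      # from the text above
EDGE_RADIUS_MM = 150.0   # ASSUMPTION: 300 mm plate, beam near the center

d_edge = resolution_at_radius(EDGE_RADIUS_MM, DISTANCE_MM, WAVELENGTH_A)
print(f"Resolution at the detector edge: about {d_edge:.2f} Angstroms")
```

Spots in the corners of the plate sit at a larger radius and so extend to slightly higher resolution, but with lower completeness.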
Some issues to consider:
• What is the correct space group? Getting this right is critical if we are going to get the right molecular replacement solution.
• There are several space groups consistent with the unit cell parameters of the unknown crystal. See if you can sort out which of the possible space groups is correct based upon scaling and comparing symmetry mates.
• Pay particular attention to the translational component of the space group (the subscripted number in the space group name). This is important if we are going to solve the structure.
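One way to think about the translational component is through systematic absences along the screw axis. For a fourfold screw along c, a 4(1) or 4(3) axis allows 00l reflections only when l = 4n, while a 4(2) axis allows l = 2n. The sketch below flags axial reflections that break a proposed condition. The reflection list is invented for illustration, and the I/sigma cutoff of 3 is an arbitrary choice.

```python
def violations(axial_reflections, period, cutoff=3.0):
    """Return (l, I, sigma) entries that should be absent but are clearly observed.

    axial_reflections: list of (l, intensity, sigma) for 00l reflections.
    period: the reflection condition is l = period * n.
    """
    return [(l, i, s) for (l, i, s) in axial_reflections
            if l % period != 0 and i / s > cutoff]


# MADE-UP 00l data: only l = 4 and l = 8 are strong.
example_00l = [(1, 2.0, 5.0), (2, 8.0, 6.0), (3, -1.0, 5.0),
               (4, 950.0, 20.0), (6, 4.0, 6.0), (8, 1200.0, 25.0)]

print("Violations of l = 4n:", violations(example_00l, 4))
print("Violations of l = 2n:", violations(example_00l, 2))
```

No violations of l = 4n is consistent with a 4(1) or 4(3) axis. Absences alone cannot tell those two apart, because they are an enantiomorphic pair; that choice has to be settled later, for example by which one gives a molecular replacement solution.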
Work on the images, and then we will discuss some strategies for processing diffraction data.
If you got stuck or cannot get mosflm to run correctly, here is a reflections file from a successful mosflm run on the 143 frames. It can be used as input to the scaling and merging step in 3A below.
3. Molecular replacement (easy test case)
Now that we have our diffraction data processed and our potential space group identified, we will use molecular replacement to obtain initial phases for our data. The trick is to orient and position a model of a known structure correctly in the unit cell of the new crystal. When the model is correctly placed in the box that is seen in the new data, we will see a (small) signature in the correlation between the observed reflections and those calculated from the model. We will start with an easy molecular replacement problem to make sure our software is working correctly and that we know how to use the programs.
A. Scaling and merging reflections
After integrating our reflections on the diffraction images, we need to get the reflections file ready for molecular replacement. To do that, we scale (to minimize the differences between crystallographically equivalent reflections), merge (to fold the observed reflections into the asymmetric unit of the reciprocal sphere), and reduce the observed reflections from intensities to amplitudes. We will need structure factor amplitudes for the subsequent molecular replacement searches.
• You can use the “Quickscale” feature in iMosflm to convert the integrated intensities to scaled and merged amplitudes.
• Alternatively, you can use ccp4i programs to scale and merge intensities. In the “Data reduction” tab in ccp4i, there are two parallel options called “Symmetry, Scale, Merge (Aimless)” and “Symmetry, Scale, Merge (Scala)” that will get your reflections into the correct format (scaled and merged amplitudes) for molecular replacement calculations.
• If you are stuck with the scaling and cannot get your reflections file scaled and merged, here is one from Cameron that you can use for the following molecular replacement problem.
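To make the merge and reduce steps concrete, here is a toy sketch that merges repeated observations, reports an Rmerge (Rsym) value, and converts intensities to amplitudes. It assumes the observations are already scaled and mapped to the asymmetric unit, uses a plain unweighted mean, and takes F = sqrt(I); real programs weight the observations and treat weak or negative intensities more carefully. The data are invented.

```python
import math
from collections import defaultdict


def group_by_hkl(observations):
    """Collect intensities that share the same (h, k, l) index."""
    groups = defaultdict(list)
    for hkl, intensity in observations:
        groups[hkl].append(intensity)
    return groups


def r_merge(observations):
    """Sum of |I - <I>| over all observations, divided by the sum of I."""
    numerator = denominator = 0.0
    for intensities in group_by_hkl(observations).values():
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(intensities)
    return numerator / denominator


def merged_amplitudes(observations):
    """Unweighted mean intensity per reflection, then F = sqrt(I) for I > 0."""
    merged = {}
    for hkl, intensities in group_by_hkl(observations).items():
        mean = sum(intensities) / len(intensities)
        if mean > 0:
            merged[hkl] = math.sqrt(mean)
    return merged


# MADE-UP observations: ((h, k, l), intensity)
obs = [((1, 0, 0), 100.0), ((1, 0, 0), 110.0),
       ((2, 0, 0), 50.0), ((2, 0, 0), 40.0)]

print(f"Rmerge = {r_merge(obs):.3f}")
print(merged_amplitudes(obs))
```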
B. Questions and issues to consider before getting started with molecular replacement:
• How big is our unit cell? How big is our crystallographic asymmetric unit?
• How confident are we of our space group determination? Do we know the translational components in the space group? Are there any handedness ambiguities in the space group we need to consider? Are there arbitrary choices of origin in the space group?
• How do we know how many objects we are looking for?
• What is our search model? How much sequence identity does it have with the unknown? What is the expected RMS deviation between our known and our unknown?
• Do we need to make any modifications to our model before starting? Should we remove solvent atoms, ligands, post-translational modifications, and/or extra domains before we begin?
• What program are we going to use for molecular replacement? AMoRe, MolRep, Phenix, and Phaser are all reasonable choices.
• Note: The ccp4 molecular replacement tutorials are excellent.
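The question of how many objects to look for is usually answered with the Matthews coefficient, the unit cell volume per dalton of protein. The sketch below tries one to four copies per asymmetric unit. The cell dimensions, number of symmetry operators, and molecular weight are placeholders, not values from our crystal, and the solvent estimate uses the common approximation 1 - 1.23/Vm.

```python
import math


def cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Unit cell volume in cubic Angstroms for a general cell."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def matthews_coefficient(volume, n_sym_ops, n_copies, mw_daltons):
    """Vm in cubic Angstroms per dalton."""
    return volume / (n_sym_ops * n_copies * mw_daltons)


def solvent_fraction(vm):
    """Approximate solvent content from Vm."""
    return 1.0 - 1.23 / vm


# PLACEHOLDER numbers: substitute your own cell, space group, and protein.
volume = cell_volume(90.0, 90.0, 220.0)
N_SYM_OPS = 8          # e.g. a space group with 8 general positions
MW_DALTONS = 45000.0

for n in range(1, 5):
    vm = matthews_coefficient(volume, N_SYM_OPS, n, MW_DALTONS)
    print(f"{n} copies: Vm = {vm:.2f}, solvent = {solvent_fraction(vm):.0%}")
```

Most protein crystals have Vm somewhere around 1.7 to 3.5, so copy numbers that fall well outside that range are unlikely.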
C. Molecular replacement models you might try:
• Here is a best-case search model, a structure with 99% sequence identity but in a different space group: 3S5Y.pdb (human α-GAL D170A mutant)
• Here is the monomeric version of the above, with only the A chain included: 3S5Y_chA.pdb
• Here is a typical MR model, a monomeric chain of a related enzyme with 51% sequence identity: 1KTB.pdb (α-NAGAL from chicken)
• Some programs (e.g. MolRep) want the sequence of the unknown. Here is a fasta sequence file: GLA.fasta
D. Building into electron density maps:
• Now that we have successfully placed two polypeptide chains into our search box, we should be able to build and refine the structure. There are quite a few differences between the search object we started with and the model we need to build. Depending on the program you used for molecular replacement, your model may not have carbohydrates, solvent molecules, and/or ligand molecules. Also, there may be regions where the polypeptide backbone chain changes or the side chains are in different conformations.
• What is your current R-factor? You might do a round of rigid body refinement in refmac to get an R and R-free for all atoms and reflections.
• Go through your molecular replacement solution coordinates and look for differences in the map compared to the model. See if you can rebuild the model in Coot or another tool to improve the fit of the model to the density. Look in the active site for ligand density. Look at the N-linked glycans for carbohydrates that are not in the model.
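As a reminder of what the R-factor measures, here is a bare-bones version of the calculation. It is a sketch with a single overall scale factor and invented amplitudes; refinement programs use more elaborate scaling. R-free is the same quantity computed only on the test reflections that were left out of refinement.

```python
def r_factor(f_obs, f_calc):
    """Sum of |Fobs - k*Fcalc| divided by the sum of Fobs, with one overall scale k."""
    scale = sum(f_obs) / sum(f_calc)
    return sum(abs(fo - scale * fc) for fo, fc in zip(f_obs, f_calc)) / sum(f_obs)


# MADE-UP amplitudes for illustration
f_obs = [100.0, 50.0]
f_calc = [90.0, 60.0]
print(f"R = {r_factor(f_obs, f_calc):.3f}")
```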
4. Heavy atom phasing
For this part of the class, we are going to take the following x-ray diffraction data and see if we can use them to solve a structure.
1. nat1.mtz: Native data collected to 1.9Å at 1.054Å. Rsym = 6.0% (56.8% last shell)
2. lig1.mtz: 3mM ligand soak data to 2.4Å, collected at 1.054Å. Rsym = 5.4% (37.9% last shell)
3. hg1.mtz: 5mM methyl mercury chloride soak for 7 days, collected at 1.007Å to 2.1Å. Rsym = 7.3% (42.4% last shell)
4. pt1.mtz: 1mM PtCl4 soak for 24 hours, collected at 1.072Å to 2.7Å. Rsym = 6.7% (23.9% last shell)
All data sets are at least 99% complete in space group P41212/P43212. The heavy atom sets contain anomalous information.
The steps we need to take are:
1. Scale all of the other data to the native and check that the scaling went well (In ccp4, you can use "cad" and then "scaleit").
Here are the output mtz files and the log files from cad and scaleit.
2. Make isomorphous and anomalous difference Pattersons of the heavy atom data to check for peaks (use the program "patterson")
3. Find the heavy atom locations (lots of possibilities here: solve by hand, use crank, shelx, sharp, phaser, etc.)
4. Calculate phases and make experimental maps (again lots of choices: mlphare, dm, crank, shelx, sharp, etc.)
5. Use the electron density to solve the structure (coot, etc.)
Hints and suggestions on the heavy atom data:
What is the average change in amplitude when the crystal is soaked with heavy atom solution? What do the changes look like with resolution?
One of the heavy atom derivatives is more useful for phasing than the other. Can you figure out which one?
Try scaling the data together and making Patterson maps. Where are the Harker sections in this space group? Can you find peaks in the Patterson on the Harker sections? Can you solve the Patterson? I like to print out sections that show the full unit cell and not just the asymmetric unit, to see the additional symmetry and peaks in Patterson space.
Can you calculate electron density maps from experimental phases? Can you find secondary structure elements in the density?
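For the first hint, the quantity to look at is the mean fractional isomorphous difference, the sum of |FPH - FP| divided by the sum of FP, overall and in resolution shells. The sketch below assumes the derivative has already been scaled to the native; the amplitudes and shell limits are invented for illustration.

```python
def fractional_difference(pairs):
    """pairs: list of (d_spacing, FP, FPH). Returns sum|FPH - FP| / sum(FP)."""
    return sum(abs(fph - fp) for _, fp, fph in pairs) / sum(fp for _, fp, _ in pairs)


def by_resolution(pairs, shells):
    """shells: list of (d_max, d_min). Returns one value per shell."""
    results = []
    for d_max, d_min in shells:
        in_shell = [p for p in pairs if d_min < p[0] <= d_max]
        results.append(fractional_difference(in_shell) if in_shell else None)
    return results


# MADE-UP data: (d-spacing in Angstroms, native F, derivative F)
pairs = [(8.0, 100.0, 110.0), (6.0, 80.0, 72.0),
         (3.0, 50.0, 56.0), (2.5, 40.0, 35.0)]
shells = [(50.0, 4.0), (4.0, 2.0)]

print(f"Overall: {fractional_difference(pairs):.3f}")
print("By shell:", by_resolution(pairs, shells))
```

A difference that climbs steeply at high resolution often points to non-isomorphism rather than heavy atom signal.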
In class on April 21st, we calculated the 7 difference vectors in space group P43212. We then looked at isomorphous Patterson maps for the Pt derivative, looking for peaks that were consistent with the difference vectors predicted by the space group. We ended up with a solution at 0.70, 0.41, 0.13 (in real space) for one of the Pt sites. Not all of the Harker sections showed strong peaks, but many of them had reasonable ones. To confirm the location of the Pt, check whether the difference vectors predicted for that site also show up as peaks in the Pt anomalous Patterson.
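The difference vectors for a trial site can be generated by applying each symmetry operator to the site and subtracting. The sketch below does this for the Pt site above. The P43212 operators are typed in from the standard setting as I remember it, so check them against International Tables before relying on the output.

```python
# Each operator is (rotation rows, translation). ASSUMPTION: standard P43212 setting.
P43212_OPS = [
    (((1, 0, 0), (0, 1, 0), (0, 0, 1)), (0.0, 0.0, 0.0)),      # x, y, z
    (((-1, 0, 0), (0, -1, 0), (0, 0, 1)), (0.0, 0.0, 0.5)),    # -x, -y, z+1/2
    (((0, -1, 0), (1, 0, 0), (0, 0, 1)), (0.5, 0.5, 0.75)),    # -y+1/2, x+1/2, z+3/4
    (((0, 1, 0), (-1, 0, 0), (0, 0, 1)), (0.5, 0.5, 0.25)),    # y+1/2, -x+1/2, z+1/4
    (((-1, 0, 0), (0, 1, 0), (0, 0, -1)), (0.5, 0.5, 0.75)),   # -x+1/2, y+1/2, -z+3/4
    (((1, 0, 0), (0, -1, 0), (0, 0, -1)), (0.5, 0.5, 0.25)),   # x+1/2, -y+1/2, -z+1/4
    (((0, 1, 0), (1, 0, 0), (0, 0, -1)), (0.0, 0.0, 0.0)),     # y, x, -z
    (((0, -1, 0), (-1, 0, 0), (0, 0, -1)), (0.0, 0.0, 0.5)),   # -y, -x, -z+1/2
]


def apply_op(op, xyz):
    """Apply one symmetry operator to fractional coordinates."""
    rotation, translation = op
    return tuple(sum(r * x for r, x in zip(row, xyz)) + t
                 for row, t in zip(rotation, translation))


def harker_vectors(xyz, ops):
    """Vectors from each symmetry mate to the site, wrapped into the range 0 to 1."""
    vectors = []
    for op in ops[1:]:                       # skip the identity
        mate = apply_op(op, xyz)
        vectors.append(tuple(round((a - b) % 1.0, 4) % 1.0
                             for a, b in zip(xyz, mate)))
    return vectors


pt_site = (0.70, 0.41, 0.13)
for u, v, w in harker_vectors(pt_site, P43212_OPS):
    print(f"u = {u:.2f}  v = {v:.2f}  w = {w:.2f}")
```

The Patterson map is centrosymmetric, so each vector (u, v, w) has an equivalent peak at (-u, -v, -w). Vectors whose w is fixed at 1/2, 1/4, or 3/4 regardless of the site lie on the Harker sections.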
Once you trust the location of a heavy atom in real space, you can use phases from it to search for other heavy atoms. Use the Pt location to make a difference Fourier of the Pt and of the Hg data. A "double difference" Fourier can find additional peaks in the Pt data. Once you get the locations of the Pt and Hg sites, you can calculate phases and make maps from them. Once you have maps from the experimental phases, you can use density modification routines to make the maps look more "protein like" and to improve the figure of merit of the map. Using the two heavy atom datasets and density modification, you should be able to get a nice map that is buildable.