The UMass protein crystallography home page is found at xtal.biochem.umass.edu
This year's journal club is organized by Scott Garman. We meet Fridays at noon in 745E LGRT.
This year, we will process some x-ray diffraction data, solve a couple of molecular replacement problems, experimentally phase electron density maps, and build into electron density maps.
1. Installing the needed software:
A. Download the ccp4 suite of crystallographic programs. Version 7.0 is the current version. This package will be the workhorse of our class this semester.
• Follow the instructions for installing the package. Different operating systems will have different installations.
• If you are installing on a laptop, make sure you have enough memory and storage space available.
• Beware of spaces in directory and file names. Unix allows them, but many scripts and crystallographic programs handle them badly. The CCP4i graphical interface won't run correctly if there are spaces anywhere in the directory tree where you are running CCP4.
• Run the install tests in CCP4 to make sure your installation is working.
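A quick way to check a working directory for the space problem mentioned above is sketched here in Python. The function name and the example paths are made up for illustration; substitute the path where you plan to run CCP4.

```python
from pathlib import PurePosixPath

def components_with_spaces(path):
    """Return the directory or file names in `path` that contain a space."""
    return [part for part in PurePosixPath(path).parts if " " in part]

# Hypothetical example paths, not real locations on any class machine.
print(components_with_spaces("/home/user/xtal/images"))     # []
print(components_with_spaces("/home/user/My Data/images"))  # ['My Data']
```

An empty list means the path is safe to use; anything else names the folder to rename.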
B. Download and install Coot.
• Coot is now packaged with the CCP4 suite.
• You can also install a standalone version.
• If you are a Mac user, there is a Mac OS X build on Bill Scott's website here. His site on scientific computing on OS X (here) is tremendously useful for getting started.
• If you are a Microsoft Windows user, there is a build for that OS here.
• If you are a Linux user, there are binaries available from the main Coot page here.
• Once you have installed Coot, open the program. From within Coot you can download coordinates and structure factors directly from the Protein Data Bank (PDB). Alternatively, download your favorite coordinates from the PDB, save them as a text file, and open the file from the File menu in Coot.
• Make sure you can open coordinates and maps in Coot. RefMac from the CCP4 suite (see section A above) will calculate maps from structure factor files.
C. Download and install the Phenix package.
• The homepage for the suite is here. You must register as a user to get the password to download the software.
• The Phenix suite does many of the same things as the CCP4 suite (experimental phasing, molecular replacement, refinement, etc.) using different algorithms.
2. Test the software by processing X-ray diffraction images
The first test we will try is to process diffraction images into intensities.
See if you can process the following frames into a scaled reflection file. The frames were collected in house on our old Rigaku rotating Cu anode (wavelength λ=1.5418 Å) with the RaxisIV+ detector set to 170 mm detector distance and a beam center of 150.6 mm in x and 150.3 mm in y. Try iMosflm (included in the ccp4 package) to process the images. The goal is to extract the intensities of the whole h, k, l reflection set.
• Here are the images (in a 1.3 GB file, so make sure you have disk space and a fast connection!)
• They are compressed with zip for faster download, so the first thing to do after downloading is to decompress them. How you do this depends on your operating system; on most systems, double-clicking the file will unpack it into an "images" folder.
• Put the images folder in a sensible place on your computer. Do not have any spaces in your directory tree. Do not put them 20 folders deep.
• Create another directory to hold the processing results.
• See if you can open the iMosflm program in the CCP4 package.
• See if you can load the images into the image viewer in iMosflm.
• See if you can process the images.
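Before processing, it can help to estimate what resolution the detector geometry allows. The sketch below applies Bragg's law to a flat detector perpendicular to the beam, using the wavelength and distance given above. The 150 mm radius from the beam center to the detector edge is an assumption (the text gives only the beam center), so treat the result as a rough guide.

```python
import math

def resolution_at_radius(wavelength, distance_mm, radius_mm):
    """Bragg spacing d (in Å) for a spot `radius_mm` from the beam center,
    on a flat detector perpendicular to the beam at `distance_mm`."""
    two_theta = math.atan2(radius_mm, distance_mm)   # scattering angle 2θ
    return wavelength / (2.0 * math.sin(two_theta / 2.0))

WAVELENGTH = 1.5418   # Å, Cu anode (from the text)
DISTANCE = 170.0      # mm, crystal-to-detector distance (from the text)
EDGE_RADIUS = 150.0   # mm, ASSUMED distance from beam center to detector edge

d_edge = resolution_at_radius(WAVELENGTH, DISTANCE, EDGE_RADIUS)
print(f"Resolution at the assumed detector edge: {d_edge:.2f} Å")
```

Spots closer to the beam center correspond to larger d spacings (lower resolution); try a few radii to see how the resolution changes across the image.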
Some issues to consider:
• What is the correct space group? Getting this right is critical if we are going to get the right molecular replacement solution.
• There are several space groups consistent with the unit cell parameters of the unknown crystal. See if you can sort out which of the possible space groups is correct based upon scaling and comparing symmetry mates.
• Pay particular attention to the translational component of the space group (the subscripted number in the space group name). This is important if we are going to solve the structure.
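As a small illustration of the translational component, the sketch below lists which 00l reflections are expected for a threefold axis along c with and without a screw translation. It uses the standard reflection condition l = 3n for 3₁ and 3₂ screw axes; the function name and labels are invented for this example.

```python
def allowed_00l(l, axis):
    """True if the 00l reflection is allowed for a threefold axis along c.
    axis is '3' (pure rotation, no condition) or '3_1' / '3_2' (screw, l = 3n)."""
    if axis in ("3_1", "3_2"):
        return l % 3 == 0
    return True

for axis in ("3", "3_1", "3_2"):
    present = [l for l in range(1, 13) if allowed_00l(l, axis)]
    print(f"{axis:>4}: 00l present for l = {present}")
```

Note that 3₁ and 3₂ give identical absences, so the diffraction intensities alone cannot tell a space group from its mirror image; both have to be tested later in the structure solution.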
Work on the images, and then we will discuss some strategies for processing diffraction data.
During class on February 9th, we decided that the data scaled into space group P3₁21 or its mirror image (enantiomorph) P3₂21. Let's see if we can now find out where the molecule sits in the unit cell.
3. Get phases from molecular replacement
Here is a pdb file for molecular replacement. See if you can figure out where the molecule falls in the unit cell. The search object is 99% identical to the unknown object we are looking for in the box, so the search in principle should be easy.
Some questions to consider before we start:
• How many objects are we looking for?
• What is the best search object, the monomer or the dimer?
• Which of the two mirror image space groups is the right one? We need to check both.
• What program are we going to use for molecular replacement? AMoRe, MolRep, Phenix, and Phaser are all reasonable choices.
• Note: The ccp4 molecular replacement tutorials are excellent.
• Some programs (e.g. MolRep) want the sequence of the unknown. Here is a fasta sequence file: GLA.fasta
Starting with the scaled data set (containing h, k, l, Fobs, σF), see if you can get started with molecular replacement, which we will work on in class on Feb 16th.
During class on Feb 16th, we took the scaled reflections data and the pdb file above and did molecular replacement in Phaser, leading to a model with a dimer in the asymmetric unit in space group P3₂21. We checked the packing of the molecule relative to symmetry equivalents, which all seemed to pack nicely into a three-dimensional lattice. The R-factor was still 50% after molecular replacement, so we still have work to do to refine the structure down to a reasonable R-factor. See if you can do the molecular replacement on your own, and we will move into refinement in the next class on Feb 23rd.
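For reference, the R-factor quoted above compares observed and calculated structure factor amplitudes. A minimal sketch of the usual definition, R = Σ|Fobs − Fcalc| / ΣFobs, is shown here. It assumes the calculated amplitudes are already on the same scale as the observed ones, and the numbers are invented purely to show the arithmetic.

```python
def r_factor(f_obs, f_calc):
    """R = sum(|Fobs - Fcalc|) / sum(Fobs), for amplitudes on a common scale."""
    numerator = sum(abs(fo - fc) for fo, fc in zip(f_obs, f_calc))
    return numerator / sum(f_obs)

# Invented amplitudes, for illustration only.
f_obs = [100.0, 50.0, 80.0, 20.0]
f_calc = [90.0, 60.0, 70.0, 30.0]
print(f"R = {r_factor(f_obs, f_calc):.2f}")  # R = 0.16
```

Refinement programs report this statistic separately for the working reflections and for a test set left out of refinement.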
4. Build into electron density and refine the structure
Now that we have amplitudes and phases, we can calculate electron density maps for the unknown. Take the output coordinates from molecular replacement, run 10 cycles of rigid body refinement in Refmac, and calculate maps from the output coordinates. Some questions at this stage:
• How do we know the molecular replacement was successful?
• Do the molecules pack into the unit cell without too many collisions?
• How do we avoid model bias? The maps are likely to look quite a bit like the search object at this point.
• What do we need to fix to get to a publishable structure?
On March 9th, we discussed how to proceed from the initial electron density map to the final model. Starting with the molecular replacement model, we removed the parts of the model that are unreliable (the glycans, waters, and ligands), and then we did two refinement steps. First, we did rigid body refinement of each monomer, which dropped the working R-factor from 38% to 33%. Second, we did restrained refinement using medium-strength non-crystallographic symmetry restraints, which dropped the R-factor to 25%. The next stage is to build into the 25% R-factor map, which includes fixing loops that are wrong (like the 210 loop) and adding glycans, waters, and ligands.
We will have an optional meeting on March 16th for those who are interested, where we will talk about how to check the model for problems and finalize it.
5. Phasing by multiple isomorphous replacement with anomalous scattering (MIRAS)
On March 23rd, we will start working on data to phase a structure by experimental phasing of electron density maps. We have three reflection files below, one from a native crystal and two from derivatives (one soaked with gold and one with platinum). All three crystals are in space group C2. The data have been internally scaled and merged so that the reflection files already have Fobs and σFobs calculated.
See if you can answer the following questions about the diffraction data:
• Why are there so many degrees of data in the collection?
• Why were different wavelengths chosen?
• What counts as isomorphous for the unit cells?
• What are the good and bad features of the diffraction data?
• What are the Harker sections in space group C2?
• Do we have enough information to solve the structure by SIR, SIRAS, MIR, MIRAS, SAD, and/or MAD phasing? How should we proceed?
A. Native crystal smnat1 (smnat1_f.mtz)
collected at Stanford 7-1, mar300 detector
wavelength = 1.08 Ångstroms
206° collected using 1° frames
mosaicity = 1.4°
unit cell = 88.8Å, 70.0Å, 49.4Å, 90°, 116.7°, 90°.
resolution = 2.4 Ångstroms
Rsym = 5.7% (22.6% last shell)
B. Gold soak amau3a (amau3a_f_2016.mtz).
1 mM AuCl4, soaked for 18 days
collected at Argonne DND-CAT, marCCD detector
wavelength = 1.0230 Ångstroms
360° collected using 1° frames
mosaicity = 0.8°
unit cell = 89.8Å, 71.0Å, 50.1Å, 90°, 115.4°, 90°.
resolution = 3.0 Ångstroms
Rsym = 10.1% (39.8% last shell)
ΔF/F vs native = 20.8%
C. Platinum soak ampt8 (ampt8_f_2016.mtz).
5 mM PtBr4, soaked for 2 days
collected at Argonne DND-CAT, marCCD detector
wavelength = 1.0539 Ångstroms
360° collected using 1° frames
mosaicity = 1.7°
unit cell = 90.0Å, 71.1Å, 50.6Å, 90°, 116.8°, 90°.
resolution = 4.0 Ångstroms
Rsym = 5.1% (7.0% last shell)
ΔF/F vs native = 9.1%
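One way to start on the isomorphism question is simply to compare the unit cells listed above. The sketch below computes the percent change in each cell edge and the change in β for each derivative relative to the native crystal. It only does the arithmetic; deciding how large a change is still acceptable is left as part of the exercise.

```python
# (a, b, c in Å, beta in degrees), copied from the data set descriptions above.
CELLS = {
    "native smnat1": (88.8, 70.0, 49.4, 116.7),
    "gold amau3a": (89.8, 71.0, 50.1, 115.4),
    "platinum ampt8": (90.0, 71.1, 50.6, 116.8),
}

def cell_changes(reference, derivative):
    """Percent change in a, b, c and absolute change in beta (degrees)."""
    lengths = [100.0 * (d - r) / r
               for r, d in zip(reference[:3], derivative[:3])]
    d_beta = derivative[3] - reference[3]
    return lengths, d_beta

native = CELLS["native smnat1"]
for name in ("gold amau3a", "platinum ampt8"):
    (da, db, dc), d_beta = cell_changes(native, CELLS[name])
    print(f"{name}: a {da:+.2f}%  b {db:+.2f}%  c {dc:+.2f}%  "
          f"beta {d_beta:+.1f} deg")
```

Compare these numbers with the ΔF/F values listed for each derivative when judging which soak changed the crystal the most.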
The ccp4 documentation has nice tutorials that step through the entire heavy-atom phasing procedure. There are lots of different ways to get your heavy atom data converted into electron density maps. We will work through one path.
The first step is to collect all of the data into one file. Here is the merged mtz file containing the native, Au, and Pt datasets all collected together but not yet scaled. Run "Scaleit" on the file to check the statistics on the files and to find any outliers. Then make isomorphous difference Patterson and anomalous difference Patterson maps from the data. Do you see any peaks? What do they represent?
We spent the past few classes looking at Pattersons, finding peaks in Pattersons, and converting them to heavy atom locations in real space. How many heavy atom sites are there for each derivative? Starting with the locations of the heavy atoms, we can calculate phases in the MLPhare program. Once we have phases, we can calculate electron density maps. Which heavy atoms give strong phasing power? What do the maps look like after refining the heavy atom location, occupancy, and B-factor?
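To connect Patterson peaks with real-space positions, it helps to run the calculation in the forward direction: pick a trial site and see where its Patterson vectors land. The sketch below does this for C2, assuming the standard setting with b as the unique axis (equivalent positions x,y,z and -x,y,-z, plus the C-centering translation ½,½,0). The trial site is hypothetical and is not one of the heavy atom positions found in class.

```python
from fractions import Fraction as F

def c2_equivalents(x, y, z):
    """Equivalent positions in C2 (unique axis b), including C-centering."""
    half = F(1, 2)
    base = [(x, y, z), (-x, y, -z)]
    centred = [(px + half, py + half, pz) for px, py, pz in base]
    return base + centred

def patterson_vectors(site):
    """Vectors (u, v, w) from the first copy of `site` to its symmetry
    mates, each component reduced into the range 0 <= value < 1."""
    copies = c2_equivalents(*site)
    first = copies[0]
    vectors = set()
    for other in copies[1:]:
        u, v, w = (first[i] - other[i] for i in range(3))
        vectors.add((u % 1, v % 1, w % 1))
    return sorted(vectors)

# HYPOTHETICAL trial site in fractional coordinates, for illustration only.
trial_site = (F(1, 10), F(1, 5), F(3, 10))
for u, v, w in patterson_vectors(trial_site):
    print(f"u = {u}, v = {v}, w = {w}")
```

Whatever y you choose, the v coordinate of every vector stays fixed at 0 or 1/2, and the vector (1/2, 1/2, 0) is just the C-centering translation. Reading a real peak at (u, 0, w) as (2x, 0, 2z) is the reverse step used to locate a site.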
Some of the phasing programs want the sequence of the unknown structure. If we have a starting map, we can improve it using image-processing methods like solvent flattening and histogram matching. The density modification programs need an estimate of the solvent content. When we ran the "Matthews coefficient" program to calculate the solvent content of the crystal, the most likely result was 1 copy of the protein in the asymmetric unit, leading to 60% solvent content. Because the calculation does not include the carbohydrate part of the glycoprotein, try 50% solvent in the solvent-flattening programs. What do the maps look like after density modification? Can we build the structure from the maps, or do we need better phases?
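The arithmetic behind the solvent content estimate can be sketched as follows, using the native unit cell listed earlier. The Matthews coefficient is Vm = V / (Z × n × M), and the solvent fraction is approximated as 1 − 1.23/Vm. Z = 4 is the number of asymmetric units in a C2 cell. The protein mass is not given in the text, so the sketch back-calculates the mass implied by the stated 60% solvent with one copy; treat that mass as illustrative, not as the real molecular weight.

```python
import math

def monoclinic_cell_volume(a, b, c, beta_deg):
    """Unit cell volume in Å^3 for a monoclinic cell (alpha = gamma = 90)."""
    return a * b * c * math.sin(math.radians(beta_deg))

def matthews_coefficient(volume, n_asu, copies, mass_da):
    """Vm in Å^3 per Dalton."""
    return volume / (n_asu * copies * mass_da)

def solvent_fraction(vm):
    """Approximate solvent fraction from Vm (standard 1.23 constant)."""
    return 1.0 - 1.23 / vm

N_ASU = 4  # asymmetric units per cell in C2
volume = monoclinic_cell_volume(88.8, 70.0, 49.4, 116.7)  # native cell

# Back-calculate the mass implied by "1 copy, 60% solvent" (illustrative).
vm_at_60 = 1.23 / (1.0 - 0.60)
implied_mass = volume / (N_ASU * 1 * vm_at_60)
print(f"Cell volume: {volume:.0f} Å^3")
print(f"Implied mass for 60% solvent, 1 copy: {implied_mass:.0f} Da")

# With that mass, two copies per asymmetric unit would leave this much solvent:
vm_two = matthews_coefficient(volume, N_ASU, 2, implied_mass)
print(f"Solvent fraction with 2 copies: {solvent_fraction(vm_two):.2f}")
```

Adding carbohydrate mass to the protein mass lowers Vm and therefore the estimated solvent fraction, which is the reasoning behind trying 50% rather than 60%.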
Using the position of the Au atom from the Patterson maps, we refined the location and occupancy of the gold and calculated phases in MLPhare. This refinement produced an overall figure of merit of 0.49 and an overall phasing power of 1.97 to 3Å. The electron density maps showed connected density and beta strands. We took the phases from the MLPhare run and modified them in the DM program, which did solvent flattening and histogram matching. The DM protocol improved the figure of merit to 0.77. The maps were improved and clearly buildable. Here is an mtz file that includes the MLPhare and DM phases.
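To give a feel for what the figure of merit measures, the sketch below computes it for a sampled phase probability distribution as the length of the probability-weighted mean of exp(iφ). This is the textbook definition, not a reproduction of what MLPhare or DM do internally, and the distributions are invented for illustration.

```python
import cmath
import math

def figure_of_merit(phases_deg, probabilities):
    """Return (FOM, best phase in degrees) for a sampled phase distribution."""
    total = sum(probabilities)
    centroid = sum(p * cmath.exp(1j * math.radians(phi))
                   for phi, p in zip(phases_deg, probabilities)) / total
    return abs(centroid), math.degrees(cmath.phase(centroid)) % 360.0

# Invented example: two equally likely phases, as in a single-derivative
# ambiguity.
fom, best = figure_of_merit([60.0, 120.0], [0.5, 0.5])
print(f"bimodal: FOM = {fom:.3f}, best phase = {best:.1f} deg")

# A completely flat distribution carries no phase information.
flat_phases = [10.0 * k for k in range(36)]
flat_fom, _ = figure_of_merit(flat_phases, [1.0] * 36)
print(f"flat: FOM = {flat_fom:.3f}")

# The FOM is often read as the expected cosine of the phase error.
for m in (0.49, 0.77):
    print(f"FOM {m}: phase error of about {math.degrees(math.acos(m)):.0f} deg")
```

On that reading, the improvement from 0.49 to 0.77 corresponds to a drop in the typical phase error from roughly 61° to roughly 40°.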