Two crucial steps in structure determination using NMR spectra are to assign each nucleus to a specific chemical shift (so-called sequence-specific assignment) and to assign each peak in relevant spectra to these assigned resonances. These two steps together constitute the assignment procedure. Three example lessons are presented in the following section, which highlight the basic steps.
This lesson presents the basic steps of an NMR spectrum assignment using homonuclear spectra, including TOCSY, DQF-COSY, and NOESY 2D NMR spectra of Zn-rubredoxin (Blake et al. 1991).
The topics covered in this lesson are:
If not done yet, set up the tutorial files as described in "Setting up tutorial files" in the preface, How to use this book. The files for this lesson are located in the Assignment\Lesson1 folder.
Start FELIX by double-clicking the FELIX icon on your desktop, or by clicking the Start button on the Windows taskbar, then selecting Programs/Accelrys FELIX 2004/FELIX 2004. If FELIX prompts you to restore from last session, click Cancel.
Change your Current Working Directory to C:\Felix_Practice\Assign\Lesson1\ using the Preference/Directory... command. Build a new database by opening the File/New... command and choosing Create a new matrix or DBA file. Make sure the File Type is set to DBA(*.dba). Enter zn.dba and click OK.
This procedure typically takes several seconds. Then the program asks for the library. The library is an ASCII file, as described in "Assign/Define Library" in the separate FELIX User Guide. FELIX contains a standard library for proteins and DNA (pd.rdb) which you should read in.
This is the protein/DNA library. A few seconds later the project setup procedure finishes.
5. Viewing the project entity through a spreadsheet
The project entity is presented in the spreadsheet, and you can browse through its fields.
Many fields contain zeros or nulls, since the full definition is not finished yet. There are nine experiment columns, therefore you can define nine experiments in one project.
6. Adding an experiment to the project
If you want, you can change the display parameters using the Experiment/Change Attribute menu item in the Experiments table.
The program plots a density or contour plot of the DQF-COSY using the parameters you defined. The coloring scheme is a predefined blue and green colormap with 16 blue colors for positive peaks and 16 green colors for negative peaks.
What you enter for Experiment Title should be descriptive, but not too long (for example, COSY or DQF is appropriate for this spectrum). Leave Use Default Names toggled on (which automatically fills in the peak table, volume table, and J table names). Set the remaining parameters to these values:
It is important to define the spectrum-specific tolerances, which are used in many automated and semi-automated procedures.
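As a rough illustration of how such a tolerance is typically applied, the sketch below tests whether a picked peak lies within a per-dimension tolerance of two known frequencies. The function names are hypothetical and this is not FELIX's internal code; the 0.01 ppm default simply mirrors the tolerances reported in this lesson's auto-assignment output.

```python
def within_tolerance(peak_ppm, freq_ppm, tol_ppm=0.01):
    """Return True if a peak position matches a frequency within tol_ppm."""
    return abs(peak_ppm - freq_ppm) <= tol_ppm


def match_peak(peak, freqs, tol=(0.01, 0.01)):
    """List the (f1, f2) frequency pairs that a 2D peak (d1, d2) could belong to."""
    d1, d2 = peak
    return [
        (f1, f2)
        for f1 in freqs
        for f2 in freqs
        if within_tolerance(d1, f1, tol[0]) and within_tolerance(d2, f2, tol[1])
    ]
```

With the four frequencies of the first pattern in this lesson, a peak at (9.698, 5.370) ppm matches the pair (9.703, 5.369), whereas a peak at (9.64, 5.37) ppm matches nothing because 9.64 is more than 0.01 ppm away from every frequency.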
7. Repeating Step 6 for the TOCSY and NOESY spectra
This brings up a spreadsheet with the currently-defined experiments. You can use this spreadsheet to add, delete, or edit experiments.
Now go to the spreadsheet menubar and select the Experiment/Add item.
When the control panel appears, select zh.mat for TOCSY and zn.mat for NOESY. The required values for each run are different:
Leave the remaining parameters at their default values and click OK.
8. Checking the project entity
You may need to highlight the spectral window if the main menu is not displayed. Next select Assign/Project to display the Project table. You can click the maximize button in the upper right corner of the table to expand it for a better view. After reviewing the items in the table, select File/Close to close it.
Note that previously zero or null fields now have values.
9. Drawing the full DQF-COSY spectrum
The next step in the assignment procedure is to do a peak picking. This procedure is very important, since all other steps rely on proper peak picking.
Usually peak picking involves several steps. First the automatic peak picker should be run. You can run the regular peak picker or the Stella peak picker. The results are then filtered automatically (symmetrizing, deleting the diagonals, deleting artifacts such as solvent ridges, and deleting peaks with invalid widths). You should also thoroughly inspect the results visually, to ensure there is enough confidence in the data. FELIX also provides a tool to fit the 2D peaks via the Peaks/Optimize menu item (see "Peaks/Optimize" in the separate FELIX User Guide), which also increases the accuracy of peak picking.
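Two of the automatic filters mentioned above, deleting the diagonal and symmetrizing, can be sketched as follows. This is a minimal illustration on a plain list of (D1, D2) positions in ppm; the function names and tolerance values are assumptions made for the example and do not reproduce the FELIX implementation.

```python
def remove_diagonal(peaks, tol=0.05):
    """Drop peaks lying within tol ppm of the diagonal (d1 == d2)."""
    return [(d1, d2) for d1, d2 in peaks if abs(d1 - d2) > tol]


def symmetrize(peaks, tol=0.02):
    """Keep only peaks that have a mirror partner (d2, d1) within tol ppm."""
    def has_partner(p):
        return any(
            abs(p[0] - q[1]) <= tol and abs(p[1] - q[0]) <= tol
            for q in peaks
        )
    return [p for p in peaks if has_partner(p)]
```

Running remove_diagonal first and symmetrize second on a small list leaves only the cross peaks that appear on both sides of the diagonal, which is the usual reason for applying the two filters in that order.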
The footprints of the imported peaks are displayed on the DQF-COSY spectrum. You may need to click the Plot icon from the main tool bar to redraw the spectrum. A Peaks-xpk:dqf table is also displayed, showing the imported peaks.
From the main menu select the File/Import/Peaks menu item. Set the Peak File Type to FELIX Peak Type(*.*). Select tocsy.xpk as File name and click OK.
When the query box appears, asking about overwriting the entity, select Overwrite.
Select File/Close from the Peaks-xpk:tocsy table menu to close the table.
Next you repeat the procedure for the NOE spectrum.
From the main menu, select the File/Import/Peaks menu item. Set the Peak File Type to FELIX Peak Type(*.*). Select noe.xpk as File name. Leave FELIX Peak Table Name as xpk:noe and click OK.
When the query box appears, asking about overwriting the entity, select Overwrite.
Select File/Close from the Peaks-xpk:noe table menu to close the table.
Now you have a full peak set defined for all three experiments.
11. Selecting the DQF-COSY spectrum
The displayed footprints belong to this spectrum, not to the NOESY, which was read in last. The database took care of reloading the spectrum-specific information.
The next step is the collection of prototype patterns, i.e., sets of frequencies, which later are promoted to patterns and assigned to specific amino acid residues. The menu items relating to prototype patterns are in the third subsection of the Assign pulldown. First we demonstrate a method which uses all three available (COSY, TOCSY, and NOESY) spectra to generate prototype patterns.
12. Performing a prototype pattern detection
In this tutorial the homonuclear 2D spectra are used for assignment.
A control panel with several options appears. The program tries to fill in reasonable values.
The Frequency Collapse Tolerance is the tolerance for aligning and finding connected expansion peaks with seed peaks.
At this point you could just start the collection and use the defaults for the seed peak and expansion peak area. You can also look at them by clicking the More... button instead of the OK button. In the next control panel leave all parameters at the default values:
The Seed Area D2 (High) is the amide proton region above the diagonal. Remove Intraproto Frequency on Number of Frequencies in Proto is the minimum and maximum number of frequencies in a prototype pattern. Number of Iterations is the maximum number of expansion loops, and Frequencies Per Iteration is the number of frequencies in each loop to keep.
Only those prototype patterns are kept which have at least (and at most) one frequency in the 6-12 ppm region (amide proton) and at least one (and at most three) frequencies in the 3-5.5 ppm region.
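The acceptance rule just described can be written out as a small check. The sketch below is only an illustration of the stated criteria (exactly one frequency in 6-12 ppm, and one to three frequencies in 3-5.5 ppm); the function names are hypothetical.

```python
def count_in_range(freqs, low, high):
    """Count frequencies (ppm) that fall inside [low, high]."""
    return sum(1 for f in freqs if low <= f <= high)


def keep_prototype(freqs):
    """Apply the two region rules to one prototype pattern."""
    amide = count_in_range(freqs, 6.0, 12.0)   # amide proton region
    alpha = count_in_range(freqs, 3.0, 5.5)    # second required region
    return amide == 1 and 1 <= alpha <= 3
```

For example, the pattern (9.703, 5.369, 1.786, 0.895) passes, while a pattern with two amide frequencies or with no frequency between 3 and 5.5 ppm is rejected.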
Be sure to leave the Min # cont (the minimum number of contacts) values at their defaults (1 1 2 2, 1 2 2 3, and 2 2 3 4).
In the output window, information is displayed about the current stage of prototype pattern collection. Within a minute or so, the prototype pattern collection is finished for 106 seed peaks and 3240 expansion peaks, and the following information appears in the output window:
Nr of prototype patterns generated:(57)
The 2D protopattern detection took 7 seconds
Also, a spreadsheet containing the prototype patterns is displayed (Protopatterns).
13. Saving the results of prototype pattern detection as a file
In the output window you are informed about the success of the command:
The next step is to visually inspect the prototype patterns. The Protopatterns spreadsheet provides several ways for you to see prototype patterns: you can draw frequencies of prototype patterns as lines on top of a contour plot, spawn tiles, or draw a strip plot.
You see four lines at 9.7, 5.37, 1.78, and 0.89 ppm, which are frequencies in this prototype pattern.
The second way to visualize prototype patterns is to spawn tile plots from them. This lets you concentrate on only the frequencies present in this prototype pattern and the peaks that belong to them.
14. Making a tile plot of prototype pattern 1
Using the tile plot functionality, you can concentrate on peaks and their immediate surroundings which belong to a prototype pattern. Also, you can use strip plots to see strips surrounding the frequencies in vertical or in horizontal position.
You see four vertical strips with the frequencies of the first prototype pattern in the middle of each.
From the strip plot you can see that there are no outstanding peaks that have common chemical shifts with the frequencies in this prototype pattern. Therefore you can continue to promote this prototype pattern to a pattern. The first step in this procedure is to copy these frequencies to the clipboard.
16. Copying a prototype pattern to the frequency clipboard
The first prototype pattern is now copied to the clipboard list. This list can be manipulated (you may add or delete frequencies to or from the list, swap the order of two frequencies, delete duplicate frequencies, sort the list, or zero the list). You can also display the list as lines on top of the matrix plot or spawn a tile and strip plot from it.
The results should look like this:
# Freq(ppm) Atom
--- --------- ----
If there is no appropriate frequency to add or delete, the clipboard list can be promoted to a pattern, and the pattern can then be subjected to database searches and naming.
17. Copying the frequency clipboard to the pattern
Now the four frequencies in the clipboard are promoted to a spin system. Also, a new spreadsheet is displayed - Spinsystems. You can list this pattern to a file or to the output window or you can examine it in the spreadsheet. Also, you can close this spreadsheet and reopen it using the Edit/Spin Systems menu item.
This prints the following information to the output window:
Next you copy the generic shifts for the frequencies to the spectrum-specific category. Since these chemical shifts were detected in the TOCSY spectrum, you can copy this to the experiment without any change.
19. Finding the residues that the pattern belongs to
In the output window, a report is generated of the probabilities that this pattern belongs to a certain type of amino acid residue:
Since the highest score and lowest average are for the Ile and Val residues, since you have seen from the strip plots that there are no extra resonances, and since Ile theoretically has seven resonances while Val has only five (with methyl degeneracy, likely four), you can now assume that pattern pa1 is a valine type.
Generally, the probability is higher if the score is higher and the average is lower. The best-matched atoms give a higher confidence in the probability. You can store the result with the same control panel, by selecting the Store Result option.
You can try to perform this action again, using the DQF-COSY peaks to help distinguish between equally likely residue types. If you do, the printout will be:
This can help in further distinguishing residue types. After you decide that this pattern is a valine type, you need to see which frequency belongs to which atom. For this you must query the database.
The result is a table in the output window showing the relative differences of each frequency from its expectation value. The smallest absolute value shows the highest matching:
Matching pattern pa1 versus val
9.703 5.369 1.786 0.895
H H H H
HN 2.579 -4.526 -10.400 -11.861
HA 10.187 2.307 -4.207 -5.827
HB 30.572 13.236 -1.096 -4.660
HG1* 40.332 20.632 4.345 0.295
HG2* 48.906 24.828 4.922 -0.028
You can see that the frequency with 9.703 ppm probably belongs to the HN resonance and that HA is the frequency with 5.369 ppm. HB is the frequency with 1.786 ppm. The two gamma methyls are not resolved, but the 0.895 ppm frequency belongs to them. You need to set these findings in your database. To do this, you must assign these frequencies.
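The way the match table is read (for each frequency, pick the atom with the smallest absolute value) can be sketched as below. The numbers are copied from the table above; the dictionary layout and the function name are assumptions made for this illustration.

```python
FREQS = [9.703, 5.369, 1.786, 0.895]

# Rows of the match table: relative differences per atom, one column per frequency.
MATCH = {
    "HN":   [2.579, -4.526, -10.400, -11.861],
    "HA":   [10.187, 2.307, -4.207, -5.827],
    "HB":   [30.572, 13.236, -1.096, -4.660],
    "HG1*": [40.332, 20.632, 4.345, 0.295],
    "HG2*": [48.906, 24.828, 4.922, -0.028],
}


def best_atoms(freqs, table):
    """For each frequency, return the atom whose entry has the smallest |value|."""
    result = {}
    for col, freq in enumerate(freqs):
        result[freq] = min(table, key=lambda atom: abs(table[atom][col]))
    return result
```

For the 0.895 ppm column the winner is HG2* (-0.028), with HG1* (0.295) a close second, which is consistent with the two gamma methyls not being resolved.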
Select the Assign/Assign Spin System/Frequency menu item. Select pa1, and click Next. This brings up a control panel in which you can make the assignment. Choose the first frequency 9.703 in the Frequency list and click the Select button. Select the VAL item from the Residues list and click the Filter button. This fills in the #S list with 4, 37, and *.
Since you do not know which valine this pattern belongs to, choose the wild card (*), then select the HN item from the Nuclei list. Click the Build button, which fills in the Atom Spec with 1:VAL_*:HN. Now click Add. Since the database contains only atoms that belong to certain residues and the selected residue is not a real one, a dialog box appears asking you for confirmation to include this "new atom" in the database. Click Yes.
Since the frequencies were defined from the TOCSY experiment in the pattern, you need to edit the NOE and DQF frequencies. For this, use the Assign/Spin System/Tile+Show+Edit Frequencies menu item.
22. Adjusting the spectrum-specific shifts for NOE
First you need to define the NOE spectrum-specific shifts for frequencies.
You see the spectrum-specific shifts drawn on top of the tiled plot of the NOE spectrum, and an instruction message in the status bar (as well as in the title of the main window):
Click-drag-release to include the frequency you want to edit, or click <ESC> to quit.
Suppose the frequency 0.895 does not coincide with the NOE peaks well and you want to edit this frequency for the NOE spectrum. You may edit this frequency:
A new instruction message appears in the status bar (as well as in the title of the main window):
Pick on the new position or hit <ESC> to quit.
The new shift is displayed, and a message appears with the new chemical shifts. You may have something similar to:
Selected frequency(s) to change:
D2: 0.892
Changed to new frequency(s):
D2: 0.9150143
Applied the changes to frequencies of experiment 'noe'.
You should notice that, under Frequencies, the specific shifts for NOE have been changed.
You now need to inspect the other prototype patterns and promote them to patterns, as was done in Steps 16, 17, and 18.
23. Copying the 55th prototype pattern to the clipboard list
Select Assign/Frequency Clipboard/Zero Clipboard to clear the frequency clipboard. Next select Assign/Frequency Clipboard/Copy Proto to Clipboard and copy prototype pattern 55.
If TOCSY is not currently displayed, select it in the Experiment table and click the Draw icon. Next activate the spectral window and select the Assign/Frequency Clipboard/Strip Plot Clipboard menu item to display the strip plot.
You can see that, in the column of 9.194 ppm, there is an extra frequency around 2.5 ppm.
Use the Zoom icon in the main tool bar to zoom to the (9.19, 2.5 ppm) peak region.
Add this 2.5 ppm frequency to the frequency clipboard by selecting the Assign/Frequency Clipboard/Add One menu item, clicking the cursor on this peak, and then setting the Frequency parameter to W1 2.664909 and the Nucleus 1 to HX. Click OK.
Click <ESC> to terminate the frequency adding mode. Select Assign/Frequency Clipboard/Copy Clipboard to Pattern. Leave all the default settings in the dialog box and click OK. This promotes the frequencies in the clipboard to a new pattern, pa2.
Score the pattern as in Step 19. The result is:
If you check the DQF-COSY spectrum (by selecting to display the COSY from the Experiment table, and then selecting pa2 in the Spinsystems table and clicking the Tile Plot icon), you can see that the frequency with the chemical shift of 2.845 has a cross peak with an amide proton (9.194 ppm), therefore this must be the alpha proton. HB2 is probably the frequency with 3.060 ppm, and HB1 the frequency with 2.665 ppm. Assign this pattern as described in Step 21. The result should be similar to:
24. Copying the 52nd prototype pattern to the clipboard list
Clear the clipboard using the Assign/Frequency Clipboard/Zero Clipboard menu item, then copy the 52nd prototype pattern to the clipboard with the Assign/Frequency Clipboard/Copy Proto to Clipboard menu item. Spawn a tile for the TOCSY spectrum as in Step 14, but use the clipboard as the source (Assign/Frequency Clipboard/Tile Clipboard).
You can see that, in the left-most tile in the row with 1.983 ppm, there is an extra frequency around 1.89 ppm.
Add this frequency to the clipboard list with the Assign/Frequency Clipboard/Add One menu item. Copy this clipboard list to the pattern pa3 and score it with Min atoms set to 6.
You should get the following output:
Further inspecting the TOCSY spectrum (strip plots), you can see two extra frequencies, at around 2.99 and 2.88 ppm, which you can add to the pattern with the Assign/Spin System/Add Frequency via Cursor menu item.
The result should be similar to:
which clearly shows that the original assumption - that the pattern is a lysine type - was a valid one (leucine is now ruled out, although it was possible from the score itself).
Unambiguous assignment is possible only for the amide and alpha proton, therefore the pattern listing will show the following results:
25. Copying the 49th prototype pattern to the frequency clipboard
Follow the procedure described in Step 24 to copy prototype pattern 49 to the clipboard. Then spawn a tile plot from the clipboard for the TOCSY spectrum using the Assign/Frequency Clipboard/Tile Clipboard menu item. You can see that a frequency at 0.853 ppm was missed during the automated routine. Add it to the clipboard and then copy the list to the pa4 pattern. Set the spectrum-specific shifts. Now score and store the result of the pattern using 5 as the Min Atoms:
Since there are clearly at least six frequencies in the pattern, the valine possibility can be dropped. Also, since there is an HN frequency, the proline can be excluded. The remaining possibilities are leucine, lysine, and isoleucine. Since the frequencies with 0.836 and 0.735 ppm are methyl groups, the lysine can also be excluded.
From this you can see that the frequency with 2.545 ppm belongs to a possible beta methine or methylene proton. This is connected with a strong COSY interaction with the methyl frequency at 0.846 ppm, which is only possible in an isoleucine spin system. Therefore, this spin system is an isoleucine type.
In the Spinsystems table, select the Spinsystem/List Residue Type menu item.
Residue type of pattern pa4 is set to ile
The probability for pa4 to be ILE is : 1.000
After matching the pattern against the ILE residue, you can assign the frequencies.
Since you saw from the DQF spectrum that the frequency with 2.545 ppm is the beta methine proton and that it has a cross peak with the methyl at 0.846 ppm, the assignments are as follows:
26. Copying the 4th prototype pattern to clipboard list
This time you will promote (copy) a prototype pattern directly to a pattern (spin system).
This adds a new spin system, pa5, in the Spinsystems table.
The residue type scoring appears in the output window as:
with the most likely candidates as phenylalanine and cysteine. Match against these two residues:
Since we do not know at this stage what the residue type is, we can leave that undetermined and let the automated routines come up with a possible answer later.
27. Finding the sequential connectivities
Once a set of patterns is determined, the next step is to connect these patterns. This is possible with the neighbor-finding algorithm. Generally, it is very important that your spectrum-specific shifts for the NOE spectrum be set for all patterns, as well as for the root frequencies, before you attempt to perform this action. In this tutorial, however, proceed without doing that for each pattern. Also, make sure to select the NOE spectrum from the Experiments table.
If NOESY is not displayed, select the NOE spectrum from the Experiments table and click the Draw icon.
Select the Assign/Neighbor/Find Neighbor Via 2D NOE menu item. Set these values:
Leave the other parameters at their default values and click More... to see the default parameters.
In the next control panel, leave the defaults as they are and click OK.
After one or two seconds the results are printed and stored:
From this listing you can see which pattern is neighbor to which, i.e., what the sequential connection is (i - i+1). For example, pa2 is neighbor to pa1, pa3 is neighbor to pa2, pa4 is neighbor to pa3, and pa5 is neighbor to pa4.
The possible neighbors for pattern pa1 are:
pattern pa2 with probability: 1.0000
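To make the idea of chaining neighbors concrete, the sketch below follows the most probable i to i+1 link from one pattern to the next. The matrix values are the ones printed in the neighbour-probability matrix by the sequence-matching step later in this lesson; the greedy walk itself is only an illustrative assumption and is not how FELIX searches for stretches.

```python
# Row i holds the probabilities that pattern j follows pattern i (pa1..pa5).
NBRS = [
    [0.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.4, 0.0, 0.6],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]


def follow_chain(nbrs, start=0):
    """Greedily follow the most probable neighbor until none is left."""
    chain = [start]
    current = start
    while True:
        row = nbrs[current]
        best = max(range(len(row)), key=lambda j: row[j])
        # Stop when there is no neighbor or the walk would revisit a pattern.
        if row[best] == 0.0 or best in chain:
            break
        chain.append(best)
        current = best
    return chain


def names(chain):
    """Convert zero-based indices to pattern names such as 'pa1'."""
    return ["pa%d" % (i + 1) for i in chain]
```

Starting from pa1, the walk visits pa1, pa2, pa3, pa4, and pa5 in turn; at pa4 it prefers pa5 (0.6) over pa3 (0.4).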
28. Visually verifying the results of neighbor detection
You now should see the results as a tile plot of the inter-pattern peaks.
Now you see the frequencies displayed on top of the tile plot.
Inspecting the plot reveals that there are really inter-residue (inter-pattern) cross peaks. There is a well-defined cross peak at frequencies 8.928 and 9.094 ppm, which is an amide-amide cross peak between the two neighboring residues (dHN(LYS)HN(ILE)). Also, there is a cross peak between 3.972 and 9.094 ppm, which is an alpha-amide cross peak (dHa(LYS)HN(ILE)). There is a cross peak at 1.893 and 9.094 ppm, which can be a beta-amide cross peak, since from residue matching you can see that this frequency likely belongs to a beta proton in the lysine residue. These two (three) interactions usually determine a sequential connectivity.
After neighbor detection, the next step is to match the found patterns against the known amino acid sequence.
29. Matching the found patterns against the known amino acid sequence
After a few seconds, the suggestion is ready. The output contains information about several steps in the automated routine:
Constructing assignment-probability matrix
Probs for ALA :( 0.000 0.000 0.000 0.000 0.000 )
Probs for LYS :( 0.990 0.000 0.800 0.000 0.000 )
Probs for TRP :( 0.000 0.000 0.000 0.000 0.000 )
Probs for VAL :( 0.990 0.000 0.000 0.000 0.000 )
Probs for CYS :( 0.000 0.990 0.000 0.000 0.990 )
Probs for LYS :( 0.990 0.000 0.800 0.000 0.000 )
Probs for ILE :( 0.990 0.000 0.000 0.990 0.000 )
Probs for CYS :( 0.000 0.990 0.000 0.000 0.990 )
Probs for GLY :( 0.000 0.000 0.000 0.000 0.000 )
...
Constructing neighbour-probability matrix
Nbrs for null :( 0.000 1.000 0.000 0.000 0.000 )
Nbrs for null :( 0.000 0.000 1.000 0.000 0.000 )
Nbrs for null :( 0.000 0.000 0.000 1.000 0.000 )
Nbrs for null :( 0.000 0.000 0.400 0.000 0.600 )
Nbrs for null :( 0.000 0.000 0.000 0.000 0.000 )
Generating the assignments ...
... found 0 stretches starting at residue 1
... found 0 stretches starting at residue 2
... found 0 stretches starting at residue 3
... found 1 stretches starting at residue 4
... found 1 stretches starting at residue 5
... found 0 stretches starting at residue 6
...
Number of assignments generated :( 2)
Buffer usage pointers (%) :( 0.040 )
Buffer usage assignments (%) :( 0.002 )
Sorting out the generated assignments
Assignments left :( 1 )
assignment # 1 -- length = 5 residues
...stretch of residues = 4 - 8 total scores:4.76 3.60
Residues:VAL_4 CYSH_5 LYS+_6 ILE_7 CYSH_8
Patterns: 1 2 3 4 5
Scores: 0.99 0.99 0.80 0.99 0.99
I>I+1: 1.00 1.00 1.00 0.60
The Pattern Suggest Assignment took 1 seconds!
The program thus suggests that pa1 belongs to residue 4 (VAL_4), pa2 belongs to residue 5 (CYSH_5), pa3 belongs to LYS+_6, pa4 belongs to ILE_7, and pa5 belongs to CYSH_8. The residue type of this latter spin system was in question - based on frequencies, the program could not distinguish between cysteine and phenylalanine. Now this ambiguity is resolved through the use of systematic search.
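The two "total scores" in the listing appear to be the sums of the per-residue scores and of the I>I+1 neighbor probabilities. This reading is inferred from the printed numbers rather than taken from the FELIX documentation, and the short check below is only a sketch of that arithmetic.

```python
SCORES = [0.99, 0.99, 0.80, 0.99, 0.99]   # "Scores:" line of assignment # 1
LINKS = [1.00, 1.00, 1.00, 0.60]          # "I>I+1:" line of assignment # 1


def stretch_totals(scores, links):
    """Return the summed residue scores and summed neighbor probabilities."""
    return round(sum(scores), 2), round(sum(links), 2)
```

Calling stretch_totals(SCORES, LINKS) reproduces the 4.76 and 3.60 shown in the output.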
A new spreadsheet came up - one which contains this possible sequential assignment in tabular form: Stretches. Now you can make sequence-specific assignments for the known frequencies.
30. Making the sequence-specific assignment for pa1
a. Use the Stretches table to make a quick assignment, then recheck the results and possibly edit the assignments using the Spinsystems table.
b. Use the following sequence of commands:
Following the procedures in Step 21, you next assign the sequence-specific assignment for each frequency of pa1.
Alternatively, you can select any item in the Assignments list and then edit it in the Atom Spec box. Next, click Add to accept it.
You can do the assignments on the Spinsystems table, too. For this, you just edit the fields next to the resonances.
Once you assign the frequencies, you must transfer these assignments to peaks in order to use them together with volume measurements of those peaks in a refinement procedure.
Displaying a full-spectrum contour plot with several contour levels can be time consuming. Using the hot keys <Ctrl>-i is an alternative.
All the peaks should now be red, indicating that none of them are assigned yet, although some of the frequencies are assigned.
Now you need to transfer frequency assignments to peak assignments.
32. Automatically generating peak assignments from frequency assignments
In a few seconds you should see output similar to the following:
Assign peaks for spectrum :(noe)
Tolerances :( 0.010 0.010 )
Spins (h h)
Folding (0 0 )
Transfers ( N )
Nr of peaks unambiguously assigned :( 95 )
Nr of peaks with competing assmnts :( 0 )
Nr with no or too many assignments :( 1787 )
The peak auto assignment took 9 seconds
The cross peaks have different colors, depending on the assignment status: green for fully assigned peaks, red for non-assigned peaks, blue for multiply assigned peaks, and turquoise and purple for partially assigned peaks. You should see several green peaks, with the majority of peaks still being red.
Next you go back and check whether the peaks belonging to different patterns were assigned correctly.
33. Checking the peak assignment for pattern 1
The peak at 9.698 and 5.370 ppm is now green, showing that the peak was assigned along both frequencies. The peak at 5.37 and 9.64 ppm and the symmetric peak at 9.64 and 5.37 ppm are both red, showing that the peaks have not been assigned yet.
34. Checking the inter-residue peak assignment
Next you follow a similar procedure for inter-residue peaks.
The peak at 5.369 and 9.194 ppm is green, denoting that this is a fully assigned inter-residue peak between VAL_4 and CYSH_5.
You now should see output in the output window that looks like:
This indicates a daN(i,i+1) NOE connectivity. If you have the corresponding peak table open as a spreadsheet (Peaks-xpk:noe), this peak is highlighted in the table.
Note the red peak at around 9.7 and 9.1 ppm, which is in the lower-left box of the tile display - if you list it with the Assign/List Peak menu item, you see that the two frequencies defining this peak are assigned to 1:VAL_4:HN and 1:ILE_7:HN but that the peak itself was not assigned:
This is because the two atoms are farther apart than the NOE cutoff used in automated assignment (8 Å). You can check this with the following action.
You see the following output in the output window:
Peak # 154
Frequency Assignment:
W2 W1 Distance (A)
1:VAL_4:HN 1:ILE_7:HN 11.1772
This proves that the peak was not assigned because the distance criterion was not met.
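The distance criterion amounts to comparing an interatomic distance with the cutoff. The sketch below is a minimal illustration using the 8 Å cutoff quoted above; the function names are hypothetical and the coordinates in the usage note are made up.

```python
import math


def distance(a, b):
    """Euclidean distance between two (x, y, z) points, in the input units."""
    return math.dist(a, b)


def within_cutoff(dist_angstrom, cutoff=8.0):
    """Return True if a proton pair is close enough to be auto-assigned."""
    return dist_angstrom <= cutoff
```

The listed distance of 11.1772 Å fails the test, so the peak is left unassigned; a pair at, say, (0, 0, 0) and (3, 4, 0) is 5 Å apart and would pass.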
Turn off the tile mode using the View/Tile Plot/Tile Plot menu item.
To generate structures you need to assign all the peaks. Usually the peak assignment should be done on an NOE spectrum where buildup (i.e., multiple mixing time experiments) information is also available. One such spectrum, a 450-ms mixing-time NOESY experiment, is defined in the following database.
35. Reading in the database containing fully-assigned patterns
Select the File/Open menu item. Select DBA for the File Type and select the zn_model.dba file from the list. Click OK.
If it prompts you to save changes to the previous DBA file you opened, click Save Changes to save them.
All tables are automatically closed. Select Assign/Experiment to open the Experiments table. Click OK to the next dialog box. Select Edit/Spin Systems to open the Spinsystems table.
The spectrum-specific shifts for this experiment are not exactly set yet; you need to adjust them in the next step:
The next step is to assign all the peaks with the help of these newly defined spectrum-specific chemical shifts.
36. Assigning the buildup automatically
After one or two minutes the auto-assignment is done. You should see something like this in the output window:
Generating automatic assignments
Please wait ...
Press <Esc> to quit.
Assign peaks for spectrum :(buildup)
W1 W2
Spins :(hh )
Folding :( 0 0 )
Transfers :(N )
Tolerances :( 0.010 0.010 )
Nr of peaks unambiguously assigned :( 858)
Nr of peaks with competing assmnts :( 0 )
Nr of peaks with no new assignment :( 1104 )
The peak auto assignment took 81 seconds
The following step is to define NOE distance restraints from this spectrum. In restraint definition, the first step is to define a scalar peak, for which the distance between the atoms it is assigned to is fixed. There are several ways of doing this, but for now we will demonstrate with a fixed HB1-HB2 peak.
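The role of the scalar (reference) peak can be illustrated with the isolated spin pair approximation (ISPA), in which NOE volume scales as the inverse sixth power of distance, so r = r_ref * (V_ref / V)^(1/6). The sketch below assumes that relation; the function name and the example values (including a 1.78 Å geminal proton distance) are illustrative assumptions, not values taken from the FELIX database.

```python
def ispa_distance(volume, ref_volume, ref_distance):
    """Estimate a distance from a peak volume using a reference peak.

    ISPA: volume is proportional to r**-6, so
    r = ref_distance * (ref_volume / volume) ** (1/6).
    """
    return ref_distance * (ref_volume / volume) ** (1.0 / 6.0)
```

For example, a peak 64 times weaker than the reference peak corresponds to a distance twice the reference distance, because 64 to the power 1/6 is 2.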
List this peak with the Assign/List Peak menu item:
Now set the intensity plot (if you were in contour plot) and draw a full plot (View/Limits/Full Limits).
Yellow footprints appear on the plot, indicating the peaks from which restraints are generated. At the end of the procedure a message appears:
After you have finished peak assignment and restraint generation, you can move on to generate structures in NMR Refine. Therefore, the last step is to write out a database that you can import to Insight II.
This action writes out the znrdlec.pks, znrdlec.ppm, znrdlec.asn, and znrdlec.rstrnt files.
After running DGII or simulated annealing, the first structures are generated. A new set of assignments can be generated based on this new structure(s). First you can redefine the molecular structure and then rerun auto-assignment.
40. Redefining the coordinates
This replaces the linear-chain coordinates of Zn-rubredoxin with the first DG-II structure coordinates.
41. Rerunning autoassignment for the buildup
Generating automatic assignments
Please wait ...
Press <Esc> to quit.
Assign peaks for spectrum :(buildup)
W1 W2
Spins :(hh )
Folding :( 0 0 )
Transfers :(N )
Tolerances :( 0.010 0.010 )
Nr of peaks unambiguously assigned :( 1123 )
Nr of peaks with competing assmnts :( 0 )
Nr of peaks with no new assignment :( 839 )
The peak auto assignment took 87 seconds
More than 250 new peak assignments were made based on the preliminary DG-II structure.
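The figure can be checked directly from the two auto-assignment reports. The sketch below only restates the printed counts; the dictionary keys are names chosen for this example.

```python
first_run = {"assigned": 858, "competing": 0, "unassigned": 1104}
second_run = {"assigned": 1123, "competing": 0, "unassigned": 839}


def new_assignments(before, after):
    """Number of peaks newly assigned between two auto-assignment runs."""
    return after["assigned"] - before["assigned"]


def total_peaks(run):
    """Total number of peaks accounted for in one report."""
    return run["assigned"] + run["competing"] + run["unassigned"]
```

The difference is 265 peaks, and both reports account for the same total of 1962 peaks.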
42. Regenerating the restraints
Yellow footprints again appear on the plot, indicating the peaks from which restraints are generated. At the end of the procedure a message appears:
Typically, after a DGII or simulated annealing run you need to analyze your restraints. This can be done in Insight II, and the results can be printed as a file containing a list of restraints that are violated in multiple structures. In FELIX you can then use that file to help you to redefine or reassign erroneous assignments or restraints. This is what you do in the next step.
The program now brings up a new spreadsheet containing the distance restraint violations - Violations. In this table you can zoom in on the peak defining the first problematic restraint and can also see the values of the restraint and the violations.
Select the first row in the Violations table and click the Zoom icon. The restraint for 1:GLU-_47:HN and 1:GLU-_47:HG2, which had bounds of 1.8 and 4.5 Å, was violated in 14 conformations out of a total of 20, and the violation average was 0.17 Å. The average distance measured in the 20 conformations is 4.64 Å and the calculated distance based on ISPA is 2.78 Å. You can see that this peak is heavily overlapped, and the symmetric peak has not been assigned at all (click the Symmetric Peak icon to check). Since this restraint is very unreliable, you may want to delete it.
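How such violation statistics can be derived from a set of conformations is sketched below. The distances in the usage note are hypothetical, and averaging over only the violating conformations is an assumption; the text does not state which convention Insight II uses.

```python
def violation(distance, lower, upper):
    """Return how far a distance lies outside [lower, upper]; 0.0 if inside."""
    if distance > upper:
        return distance - upper
    if distance < lower:
        return lower - distance
    return 0.0


def summarize(distances, lower, upper):
    """Count violating conformations and average their violations."""
    viols = [violation(d, lower, upper) for d in distances]
    violated = [v for v in viols if v > 0.0]
    average = sum(violated) / len(violated) if violated else 0.0
    return len(violated), average
```

With bounds of 1.8 and 4.5 Å and four made-up distances of 4.4, 4.7, 4.6, and 4.5 Å, two conformations violate the restraint, with an average violation of about 0.15 Å.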
Item 320 deleted from biosym:noe_dist.
Now you can see in the NOE-Restraints table that this restraint was indeed deleted from the database.
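The violation statistics quoted for a restraint can be understood with a small calculation. The sketch below is illustrative only: the function names and the example distances are hypothetical, and averaging the violation over the violating conformations only is an assumption, since the tutorial does not state how Insight II computes the average.

```python
def violation(distance, lower, upper):
    """Amount by which a distance falls outside the [lower, upper] bounds."""
    if distance > upper:
        return distance - upper
    if distance < lower:
        return lower - distance
    return 0.0

def summarize(distances, lower, upper):
    """Violation statistics for one restraint over an ensemble of structures."""
    violated = [v for v in (violation(d, lower, upper) for d in distances) if v > 0]
    return {
        "n_violated": len(violated),
        # assumption: average taken over violating conformations only
        "mean_violation": sum(violated) / len(violated) if violated else 0.0,
        "mean_distance": sum(distances) / len(distances),
    }

# Hypothetical four-structure ensemble with the 1.8-4.5 Å bounds from this step.
stats = summarize([4.40, 4.60, 4.70, 4.80], lower=1.8, upper=4.5)
print(stats)
```

A restraint that is violated in most structures by a small, consistent amount, as here, usually points to bounds that are slightly too tight or to an unreliable peak volume rather than to a wrong structure.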
A new peak appears in the spectrum; it defines the next problematic restraint. This is a well-defined peak, and the distance calculated from the symmetric peak (use Violation/Calculate Distance in the Violations table) is larger than both the distance from this peak and the original restraint:
You may want to simply increase the bounds for this as in the following step.
Item 243 updated in biosym:noe_dist.
Since there is no other violated restraint left in this file, this finishes the redefinition.
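The ISPA distances used in this step rest on the isolated spin-pair approximation, in which the NOE peak volume is taken to scale as the inverse sixth power of the interproton distance. The following minimal sketch shows that relation and one possible way to widen an upper bound. The function names, the reference pair, and the 0.5 Å margin are assumptions made for illustration; they are not FELIX settings.

```python
def ispa_distance(volume, ref_volume, ref_distance):
    """Distance from an NOE volume, calibrated on a reference pair (V ~ r**-6)."""
    if volume <= 0 or ref_volume <= 0:
        raise ValueError("volumes must be positive")
    return ref_distance * (ref_volume / volume) ** (1.0 / 6.0)

def widened_upper_bound(current_upper, distances, margin=0.5):
    """Raise the upper bound so it covers the largest estimate plus a margin."""
    return max(current_upper, max(distances) + margin)

# A peak 64 times weaker than the reference corresponds to twice the distance.
print(ispa_distance(volume=1.0, ref_volume=64.0, ref_distance=2.0))

# Hypothetical estimates from a peak and its symmetric partner.
print(widened_upper_bound(4.5, [3.9, 4.8]))
```

Because of the sixth-root dependence, even large volume errors translate into modest distance errors, which is why a disagreement between a peak and its symmetric partner is often handled by loosening the bounds rather than deleting the restraint.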
44. Calculating the chemical shift index
The chemical shifts of certain spins can be informative about regular secondary structural elements. This can be exploited as shown in the following step.
In a few moments the calculation is done and a spreadsheet (HA-CSI) appears, showing the residues, the assigned HA chemical shifts, the CSI index and grouping, and the Richardson classification. By browsing through the table you can see regions that were found to be beta-sheets or alpha-helices. The program also writes a file, csi.txt, containing this classification to the disk. This file can be imported into Insight II and can (for example) be rendered on the ZNRDDG molecule.
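The idea behind the HA chemical shift index can be sketched in a few lines. This is a simplified illustration and not FELIX's implementation: the ±0.1 ppm threshold and the run lengths (four for helix, three for strand) follow the commonly used HA convention and are assumptions here, the random-coil reference value must be supplied by the caller, and only uninterrupted runs are grouped.

```python
def csi_value(shift, random_coil, threshold=0.1):
    """+1 if HA is downfield of random coil, -1 if upfield, 0 if within threshold."""
    delta = shift - random_coil
    if delta > threshold:
        return 1
    if delta < -threshold:
        return -1
    return 0

def classify(indices, helix_run=4, strand_run=3):
    """Label uninterrupted runs: 'H' for -1 runs, 'E' for +1 runs, 'C' otherwise."""
    labels = ["C"] * len(indices)
    start = 0
    while start < len(indices):
        end = start
        while end < len(indices) and indices[end] == indices[start]:
            end += 1
        run = end - start
        if indices[start] == -1 and run >= helix_run:
            labels[start:end] = ["H"] * run
        elif indices[start] == 1 and run >= strand_run:
            labels[start:end] = ["E"] * run
        start = end
    return "".join(labels)

# Hypothetical index string for an 11-residue stretch.
print(classify([1, 1, 1, 0, -1, -1, -1, -1, 0, 1, 1]))
```

Upfield HA shifts (index -1) cluster in helices and downfield shifts (index +1) in strands, so the grouping step matters more than any single residue's index.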
This action writes out the znrddg.pks, znrddg.ppm, znrddg.asn, and znrddg.rstrnt files.
This tutorial shows typical steps involved in assignment of a singly-labelled protein. The data set is the 15N-HSQC, 15N-HMQC-TOCSY, and 15N-HMQC-NOE spectra of the 15N-enriched MCP-1 protein from P. J. Domaille (DuPont Merck, Wilmington) and T. Handel (University of California, Berkeley).
The topics covered in this lesson are:
If not done yet, set up the tutorial files as described in "Setting up tutorial files" in the preface, How to use this book.
The files for this lesson are all located in the Assignment\Lesson2 folder.
Start FELIX by double-clicking the FELIX icon on your desktop, or by clicking the Start button on the Windows taskbar, then selecting Programs/Accelrys FELIX 2004/FELIX 2004. If FELIX prompts you to restore from last session, click Cancel.
Change your Current Working Directory to C:\Felix_Practice\Assign\Lesson2\ using the Preference/Directory... command. Build a new database by opening the File/New... command and choosing Create a new matrix or DBA file. Make sure the File Type is set to DBA(*.dba). Enter mcp.dba and click OK.
FELIX informs you that no project was found in the database.
FELIX then asks you for a new project name.
After this step is successfully completed, a library should be defined. The library is an ASCII file, as described in the Assign/Define Library section of the separate FELIX User Guide. FELIX contains a standard library (pd.rdb) which you should read in.
4. Adding experiments to the project
If you want, you can change the display parameters by going to the Experiments table and using its Experiment/Change Attribute menu item later.
The program plots a density or contour plot of the 15N-HSQC spectrum using the parameters you defined.
Now another control panel appears.
The spectrum-specific tolerances are important to define and are used in many automated and semi-automated procedures.
A new window containing an Experiments table is opened and displayed to the left of the spectral window. Note that, by default, whenever a new window (table or spectral) is opened, FELIX automatically rearranges the layout of the windows. You can turn off this feature by selecting Preference/Frame Layout from the main menu and setting Action to None. You can also trigger the automatic rearrangement at any time by selecting Window/Auto Arrange.
5. Adding the 15N-HMQC-TOCSY experiment to the project
Set these values for the plot using the 3D Display Parameters control panel:
Leave the other parameters at their default values and select Apply.
If you want, you can change the display parameters using the Experiment/Change Attribute menu item in the Experiments table later.
The program plots a density plot or contour plot of the first D1-D2 plane of the 15N-HMQC-TOCSY spectrum using the parameters you defined.
Now another control panel appears.
Set the parameters to these values:
6. Repeating Step 4 for the 15N-HMQC-NOE spectrum
In the next control panel select these values:
Activate the spectral window so that the main menu is displayed. Select the File/Import/Peaks menu item. Set the Selection parameter to hsqc.xpk. Leave the FELIX Peak Table Name parameter at its current value (xpk:hsqc) and the Peak File Type as FELIX Peak File. Click OK.
When the program asks you whether to overwrite the entity, click OK.
This command reads in the peaks and displays them in a spreadsheet. The peaks are also displayed as boxes.
Now you have a full peak set defined for all three experiments.
8. Selecting the HSQC spectrum
The next step is the collection of prototype patterns, that is, sets of frequencies, which are later promoted to patterns and assigned to specific amino acid residues. The commands connected to prototype patterns are in the third subsection of the Assign pulldown. Since we have the 15N-HSQC, 15N-HMQC-TOCSY, and the 15N-HMQC-NOESY spectra in our project, we demonstrate the use of the two currently available double-resonance prototype pattern-collection methods.
9. Performing prototype pattern detection
You see a control panel with several options. The program tries to automatically fill in reasonable values.
Make sure these values are set in the third control panel:
Now click More... and set these values:
Information about the current stage of prototype pattern collection appears in the output window. After one or two minutes, the prototype pattern collection finishes and a spreadsheet of prototype patterns is displayed. The following information appears in the output window:
Nr of protos generated : ( 52)
The 3D protopattern detection took 4 seconds
The protein has 77 residues, of which you can theoretically expect only 71 spin systems to be found, since there are 5 prolines and the N-terminal spin system is probably missing. If you have recorded a well-resolved 2D 15N-HSQC spectrum, it can greatly help in spin-system collection. Therefore, we present here the other prototype pattern-detection method implemented in FELIX.
10. Performing the second prototype pattern detection
You get a third control panel with several options. The program tries to automatically fill in reasonable values.
Set these values in the third control panel:
Now click More... and set these values in the resulting control panel:
Information about the current stage of prototype pattern collection appears in the output window. After one minute, the prototype pattern collection is finished, and the following information appears in the output window:
Nr of protos generated : ( 13)
The 3D protopattern detection took 1 seconds
Now you have 65 prototype patterns in all. While this method relies on well resolved 2D HSQC peaks, the previous one depends on well resolved pseudo-diagonal peaks of the HMQC-TOCSY spectrum. In certain cases, the higher digital resolution and better-defined peak shapes of 2D spectra help find more spin systems, while in other cases relying on the third dimension yields better results. Sometimes the combination of the two is the best choice, as you can see from this example (the 15N-HSQC + 15N-HMQC-TOCSY would generate only 58 spin systems).
Since clearly some spin systems were missed, it is always advisable to inspect the peaks in the spectrum to see which ones were not assigned to spin systems. This procedure is shown in Step 13.
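The bookkeeping behind the numbers quoted in this section (71 expected spin systems, 52 + 13 = 65 found) can be checked with a short calculation. This is a sketch with hypothetical function names. The sequence-based variant uses a made-up one-letter sequence, not the MCP-1 sequence, and the count-based variant assumes that the N-terminal residue is not itself a proline.

```python
def expected_from_sequence(sequence):
    """Residues expected to give a backbone HN/N spin system.

    Prolines have no amide proton, and the N-terminal residue is
    assumed to be unobserved.
    """
    return sum(1 for i, res in enumerate(sequence) if i > 0 and res != "P")

def expected_from_counts(n_residues, n_prolines, n_terminal_missing=True):
    """Same estimate from counts (assumes the N-terminus is not a proline)."""
    return n_residues - n_prolines - (1 if n_terminal_missing else 0)

n_expected = expected_from_counts(77, 5)   # values quoted in the text
n_found = 52 + 13                          # first + second detection run
print(n_expected, n_found, f"{n_found / n_expected:.0%} coverage")

# Hypothetical six-residue toy sequence: M is skipped as N-terminus, two P's drop out.
print(expected_from_sequence("MPAKPG"))
```

The remaining gap between found and expected spin systems is what the visual inspection in the following steps is meant to close.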
11. Visually inspecting several prototype patterns
The region (strip) containing peaks of the 3rd spin system is displayed.
Now connect the HSQC and HMQC-TOCSY spectra.
12. Connecting the HSQC and TOCSY spectra
This connects the D1(1H) of HSQC with D1(1H) of HMQC-TOCSY, and D2(15N) of HSQC with D3(15N) of HMQC-TOCSY. If you switch to the frame containing the tocsy experiment, you can zoom in on the spectra (Zoom in Protopatterns table), but this time display the same region in both spectra.
You can use various methods to browse through the spectrum in Frame 2, and the same action occurs in Frame 1, too.
13. Coloring the peaks based on prototype patterns
A new control panel appears, where you can set the colors for peaks which have each frequency belonging to a prototype pattern (To The Same), and for those which do not (None).
From now on, when you use the View/Draw Peaks menu item, the peaks will be drawn according to this coloring scheme: green peak boxes will be drawn at peaks that belong to a prototype pattern, and red peak boxes will be drawn at peaks that were not assigned to any particular spin system. Therefore, the manual spin-system detection should proceed from those peaks which are, in this case, red.
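The coloring rule described above can be expressed as a simple membership test. The sketch below is an illustration under stated assumptions, not FELIX code: the data layout (a peak as a sequence of nucleus/ppm pairs, a prototype pattern as a dictionary of frequency lists), the function names, and the tolerances are all hypothetical.

```python
def belongs_to(peak, pattern, tolerances):
    """True if every coordinate of the peak matches a frequency of one pattern."""
    return all(
        any(abs(ppm - freq) <= tolerances[nucleus]
            for freq in pattern.get(nucleus, []))
        for nucleus, ppm in peak
    )

def peak_color(peak, patterns, tolerances):
    """Green if all frequencies fall in the same prototype pattern, else red."""
    if any(belongs_to(peak, pattern, tolerances) for pattern in patterns):
        return "green"
    return "red"

# Hypothetical prototype pattern and tolerances (ppm).
patterns = [{"H": [8.10, 4.35], "N": [118.2]}]
tolerances = {"H": 0.02, "N": 0.2}

print(peak_color((("H", 8.11), ("N", 118.3)), patterns, tolerances))
print(peak_color((("H", 8.744), ("N", 124.156)), patterns, tolerances))
```

The second peak uses the position of the red HSQC peak discussed in this step: it matches no prototype pattern, so it is a starting point for manual spin-system detection.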
When the full plot is drawn you can see that there are red and green peak boxes. You may notice a red peak box at the lower edge of the HSQC spectrum at around 124 ppm and 8.7 ppm.
This moves the display of the 3D TOCSY spectrum to that particular plane.
Now you learn to create a new spin system manually.
Now you are ready to add frequencies to this clipboard.
You can check what is in the clipboard by listing it (Assign/Frequency Clipboard/View Clipboard), and the result is printed in the output window:
The Frequency Clipboard List contains the following frequencies:
# Freq(ppm) Atom
--- --------- ----
1 8.744 H
2 124.156 N
Now switch to the frame containing the 3D TOCSY spectrum.
No peak box is drawn for this peak: it was missed during peak picking because it lies on the very edge of the spectrum.
Since there were no peaks picked for these latter frequencies, your actual results may be different from those presented. Check the clipboard again (Assign/Frequency Clipboard/View Clipboard):
The Clipboard List contains the following frequencies:
# Freq(ppm) Atom
--- --------- ----
1 8.744 H
2 124.156 N
3 5.183 X
4 3.032 X
14. Promoting the prototype patterns to patterns
After a couple of seconds, 65 new patterns are generated and displayed in a spreadsheet. You can inspect the patterns using the Assign/Report Spin System menu item:
15. Adding the manually detected spin system to the patterns
A new pattern named pa66 is added to the database and is also displayed in the Spinsystems table.
After a few minutes, all 66 patterns are scored and the residue type probabilities are stored. Using the Assign/Report Spin System menu item for the first pattern will give similar results:
After the spin-system probabilities are defined, the next step is to find neighboring spin systems. This can be achieved here by using the 15N-HMQC-NOESY spectrum. In such a spectrum you can expect cross peaks between the spins of the ith and (i+1)th residue, as well as between further separated residues. The algorithm should search for NOE cross peaks, such as HN,i-HN,i+1(-Ni+1), Ha,i-HN,i+1(-Ni+1), and Hb,i-HN,i+1(-Ni+1), whose presence makes the connectivity between the two spin systems probable. Before you start the neighbor search, the spectrum-specific shifts of the patterns for the 15N-HMQC-NOESY spectrum should be updated.
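The neighbor search described above can be pictured as counting the expected sequential NOE cross peaks. The following is a minimal sketch and not the FELIX algorithm: the data layout, the function names, the tolerances, and the scoring (fraction of protons of pattern i that show a peak on the amide of pattern j) are assumptions, and the score is not FELIX's neighbor probability measure.

```python
def has_peak(peaks, target, tols):
    """True if some peak lies within tolerance of the target in every dimension."""
    return any(
        all(abs(p - t) <= tol for p, t, tol in zip(peak, target, tols))
        for peak in peaks
    )

def neighbor_score(pattern_i, pattern_j, noesy_peaks, tols=(0.02, 0.02, 0.2)):
    """Fraction of pattern_i protons with an NOE peak on the amide of pattern_j.

    Assumed peak order: (1H of residue i, HN of residue i+1, 15N of residue i+1).
    """
    protons = pattern_i["protons"]
    hits = sum(
        has_peak(noesy_peaks, (h, pattern_j["HN"], pattern_j["N"]), tols)
        for h in protons
    )
    return hits / len(protons) if protons else 0.0

# Hypothetical patterns and NOESY peak list (ppm).
pat_a = {"HN": 8.20, "N": 120.0, "protons": [8.20, 4.50, 1.90]}
pat_b = {"HN": 7.90, "N": 115.0, "protons": [7.90, 4.10]}
peaks = [(8.21, 7.90, 115.1), (4.50, 7.91, 115.0), (3.00, 7.90, 115.0)]

print(neighbor_score(pat_a, pat_b, peaks))  # a -> b direction
print(neighbor_score(pat_b, pat_a, peaks))  # b -> a direction
```

The score is directional: peaks on the amide of pattern b support the order a, b but say nothing about b, a, which is why the neighbor lists are reported as (i - i+1) pairs.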
17. Setting the spectrum-specific shifts for all patterns
Since the chemical shifts in the patterns were defined using the HSQC and the 15N-HMQC-TOCSY spectra, a slight difference is expected between those shifts and the actual shifts in the 15N-HMQC-NOESY spectrum. To take this possible shift difference into account, you need to edit the spectrum-specific shifts of the patterns. This can be done either manually (where for each pattern the chemical shifts of frequencies are adjusted based on displayed intrapattern peaks) or automatically.
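One simple way to picture an automatic adjustment is to move each pattern frequency to the nearest observed peak position in the target spectrum, provided it lies within the tolerance. This is a hypothetical sketch of that idea, with made-up positions; the tutorial does not describe how FELIX performs the adjustment internally.

```python
def spectrum_specific_shift(generic_shift, observed_positions, tolerance):
    """Nearest observed position within tolerance, else the unchanged shift."""
    candidates = [p for p in observed_positions
                  if abs(p - generic_shift) <= tolerance]
    if not candidates:
        return generic_shift
    return min(candidates, key=lambda p: abs(p - generic_shift))

# Hypothetical 1H positions of NOESY peaks near a pattern frequency (ppm).
print(spectrum_specific_shift(8.744, [8.70, 8.751, 8.76], tolerance=0.02))
print(spectrum_specific_shift(8.744, [8.60, 8.90], tolerance=0.02))
```

Keeping the original shift when nothing lies within the tolerance avoids pulling a frequency onto an unrelated peak.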
In a few minutes the spectrum-specific chemical shifts are set for all the patterns. You can see the results by using the Assign/Report Spin System menu item for, e.g., the first pattern:
You must update the root frequency attribute of the patterns:
18. Performing neighbor searches
Select the Assign/Neighbor/Find Neighbor Via 3D N-15 NOE menu item. Set these values:
Click More... and set these values:
In a few seconds the neighbor search is done. There are several ways to check the result of the run; using the previously described Assign/Report Spin System menu item now results in output such as:
You can also use the Assign/Neighbor/List Neighbors menu item:
The output will be similar to:
The possible neighbors (i - i+1) for pattern pa1 are:
pattern pa12 with probability: 0.2500
pattern pa28 with probability: 0.1875
pattern pa51 with probability: 0.1875
pattern pa48 with probability: 0.1250
pattern pa50 with probability: 0.1250
pattern pa62 with probability: 0.1250
Or you can visually inspect the neighboring patterns:
Seven strips appear on the screen, containing plots of the region containing the frequencies of pattern pa1, and the neighboring patterns: pa12, pa28, pa48, pa50, pa51, and pa62. Also, the output window shows:
Strip plot of pattern 1 with neighbors:
pa12 0.2500
pa28 0.1875
pa51 0.1875
pa48 0.1250
pa50 0.1250
pa62 0.1250
The next step is to generate possible sequence-specific assignments for the patterns, that is, to compare spin-system type and neighbor-probability information with the primary sequence and make suggestions about which pattern belongs to which particular amino acid in the sequence. This can be done in FELIX through the Assign/Sequential menu items. Here we show one approach, using the Assign/Sequential/Systematic Search menu item; other approaches can be found in the Tasks chapter of the separate FELIX User Guide.
Since the 15N-HMQC-TOCSY spectrum does not contain spin systems from prolines, we need to find stretches of sequential assignments between the Pro residues. The first such stretch in the sequence is between residues 4 and 8. Of course, if the Pro spin systems have been detected, this limitation does not exist.
19. Generating sequence-specific assignments for the stretch of residues 4-8
Select the Assign/Sequential/Systematic Search menu item and set these values:
Leave the other parameters at the default values and click OK.
After a few seconds you will see a listing of several possible assignments, the beginning of which looks like:
Also, a new spreadsheet, Stretches, is activated, in which stretches of possible sequential assignments are stored.
20. Showing the first possible sequential assignment
Now you can inspect the first possible stretch.
In a few seconds a strip plot appears which contains five vertical strips, along the HN-N frequencies of the patterns pa36, pa38, pa4, pa44, and pa58. Also, a message is printed in the output window:
Strip plot of stretch # 1
The patterns in the stretch are:
pa36
pa38
pa4
pa44
pa58
21. Generating sequence-specific assignments for the stretch of residues 4-8 via simulated annealing
Next you try the other method available for sequential assignment: simulated annealing.
Select the Assign/Sequential/Simulated Annealing menu item. Set these values:
Leave the other parameters at the default value (1.0) and click OK.
The output in the output window should be similar to this:
Now you can inspect this assignment too.
In the output window a message appears:
Strip plot of pattern 10
Strip plot of pattern 33
Strip plot of pattern 4
Strip plot of pattern 44
Strip plot of pattern 58
and five vertical strips are displayed in the frame. This solution was not found in the systematic search, since the neighbor probability measure between pa10 and pa33 is 0.125, which is lower than the cutoff we set (0.15).
Using the simulated annealing and systematic search methods for sequential assignment one can then assign the patterns to specific residues.
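The effect of the neighbor-probability cutoff on a systematic search can be illustrated with a small depth-first enumeration. This is a simplified sketch, not the FELIX implementation: it ignores the residue-type scores that the real search compares against the sequence, the function name is hypothetical, and the pattern names and probabilities are invented to mirror the 0.125 versus 0.15 situation described above.

```python
def systematic_search(neighbors, length, cutoff):
    """Enumerate chains of `length` distinct patterns whose consecutive
    neighbor probabilities are all >= cutoff, best product first."""
    results = []

    def extend(chain, score):
        if len(chain) == length:
            results.append((tuple(chain), score))
            return
        for nxt, prob in neighbors.get(chain[-1], {}).items():
            if prob >= cutoff and nxt not in chain:
                extend(chain + [nxt], score * prob)

    for start in neighbors:
        extend([start], 1.0)
    return sorted(results, key=lambda item: item[1], reverse=True)

# Hypothetical neighbor table: neighbors[i][j] = probability that j follows i.
neighbors = {
    "A": {"B": 0.25, "C": 0.125},
    "B": {"D": 0.1875},
    "C": {"D": 0.25},
}

print(systematic_search(neighbors, length=3, cutoff=0.15))
print(systematic_search(neighbors, length=3, cutoff=0.10))
```

With the 0.15 cutoff the chain through C is pruned at its first link, even though the rest of that chain is well supported. A stochastic method such as simulated annealing, which scores whole stretches, can still reach such a solution.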
This lesson takes you through typical steps in the assignment of a doubly-labelled protein. The data set consists of the HNCACB and CBCA(CO)NH spectra of the 13C- and 15N-enriched RNA-binding domain of hnRNP C from Luciano Mueller (Bristol-Myers Squibb, Princeton: see Wittekind 1992).
The topics covered in this lesson are:
If not done yet, set up the tutorial files as described in "Setting up tutorial files" in the preface, How to use this book.
The files for this lesson are all located in the Assignment\Lesson3 folder.
Start FELIX by double-clicking the FELIX icon on your desktop, or by clicking the Start button on the Windows taskbar, then selecting Programs/Accelrys FELIX 2004/FELIX 2004.
If FELIX prompts you to restore from last session, click Cancel.
Change your Current Working Directory to C:\Felix_Practice\Assign\Lesson3\ using the Preference/Directory... command. Build a new database by opening the File/New... command and choosing Create a new matrix or DBA file. Make sure the File Type is set to DBA(*.dba). Enter hnrp.dba and click OK.
This procedure typically takes several seconds; a meter on the status bar shows the progress.
After the setup is successfully completed, you should define a library. The library is an ASCII file, as described in the Assign/Define Library section of the separate FELIX User Guide. FELIX contains a standard library (pd.rdb) which you should read in.
5. Adding experiments to the project
If you want, you can change the display parameters using the Experiment/Change Attribute menu item in the Experiment table.
The program plots a density plot or contour plot of the HNCACB spectrum using the parameters you defined. Note that, because the first plane is displayed by default, no cross peaks may be visible; you can select a different plane (or a different view) later using the New Plane button.
Now another control panel appears.
Set the parameters to these values:
It is important to define the spectrum-specific tolerances, since they are used in many automated and semi-automated procedures.
A new window containing an Experiments table is opened and displayed to the left of the spectral window. Note that, by default, whenever a new window (table or spectral) is opened, FELIX automatically rearranges the layout of the windows. You can turn off this feature by selecting Preference/Frame Layout from the main menu and setting Action to None. You can also trigger the automatic rearrangement at any time by selecting Window/Auto Arrange.
Note: When one or more table windows are open, only the menu and tool bar of the currently activated window are visible. If you want to select a certain menu item or tool bar icon, be sure to click the corresponding window first to activate its menu and tool bar (if any).
6. Adding the CBCA(CO)NH spectrum to the project
In the next control panel, set the parameters to these values:
In the Experiments table select the hncacb experiment (click the first row) and click the Draw icon. This draws the hncacb spectrum in the spectral frame.
Activate the spectral frame and select the File/Import/Peaks menu item. Select hncacb.xpk and leave the FELIX Peak Table Name parameter at its current value (xpk:hncacb). Click OK.
This action reads in the peaks and also displays them as a peak table.
Now you have a full peak set defined for both experiments. Close the two peak tables by selecting File/Close from their menus.
8. Selecting the HNCACB spectrum
The next step is the collection of prototype patterns, i.e., sets of frequencies, which later are promoted to patterns and assigned to specific amino acid residues. The commands related to prototype patterns are in the Assign pulldown in the third subsection. Since we have the HNCACB and the CBCA(CO)NH spectra in our project, we demonstrate the use of one of the triple resonance prototype pattern collection methods.
9. Performing a prototype pattern detection
After a couple of minutes, the prototype pattern collection is finished, and the following information appears in the output window:
A new table of prototype patterns also appears. Comparing these prototype patterns with the spin systems assigned previously (1), we confirm that 82% of the spin systems were picked correctly.
10. Visually inspecting several prototype patterns
The region containing the four HNCACB peaks of the first protopattern is displayed.
In order to see the HNCACB and CBCACONH peaks simultaneously when you browse the prototypes, you use two connected spectral windows as follows.
11. Connecting the HNCACB and CBCACONH spectra
For the following action you need to have room for two spectral frames.
Two spectral frames are displayed and the other table windows are reorganized automatically. Since the up/down layout of two spectral windows is not the default layout of FELIX, you have to toggle off the automatic rearrangement option so that your layout is not disturbed when a new window is opened.
Now the two spectra are connected in all three dimensions. Now you can navigate in the first frame and the display is updated in the second frame, too. For example, if you zoom on a prototype pattern in the first frame, the same region is displayed in the second frame, too.
You can use different activities to browse through the spectrum in Frame 1, and the same action also takes place for Frame 2.
12. Correcting the prototype patterns
In this section you find out how to make corrections to the automatically detected prototype patterns. To do this, you need to zoom on a prototype pattern and then inspect the frequencies (which are drawn through the peaks).
Alternatively, you can double-click the first row, since the default action for double-clicking is to zoom. You can change the default action using the popup on the combo box of the table.
You can see that the first prototype pattern is correct. Now zoom on the second prototype pattern, and you can see that this one is also correct. When you zoom on the third prototype pattern, you can see that only three carbon frequencies were found and that there is a well-defined peak at 64.1, 121.8, 9.4 ppm. If you look at the prototype pattern in the table you can see that the CA(i) was missed, and most likely this peak contains that frequency (64.1 ppm). Here, you will add that frequency:
Now the prototype pattern is redrawn on the screen with four carbon frequencies and the table is updated: there is now a CA frequency for the third protopattern.
You would typically go through the full set of prototype patterns and make similar adjustments. Also, you can edit the frequencies in the table directly or by using the Assign/Edit Prototype menu item. You can also delete spurious protopatterns through the table or through the Assign/Edit Prototype menu item.
After the prototype patterns are cleaned up you can promote them to patterns.
13. Promoting the prototype patterns to patterns
Select the Assign/Promote Prototype Patterns menu item. In the first control panel, select the Copy Prototype Patterns to Spin Systems (Patterns) option and click OK.
In the next control panel, set Sequential Tolerances for C to 0.25, set Find Prolines to Yes, and leave the Compare Similarities parameter as No. Click OK.
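The two settings in this panel can be pictured with a short sketch. During promotion, FELIX reports that it looks for proline-like protopatterns, with a Ca,i-1 in the 60-66 ppm range, a Cb,i-1 in the 28-34 ppm range, and no appropriate HN and N. The code below is one reading of that criterion and is not FELIX's implementation: the data layout and function names are hypothetical, the shifts are invented, and "no appropriate HN and N" is interpreted as "no other protopattern whose own CA/CB match within the carbon tolerance".

```python
def sequential_link(prev, nxt, tol=0.25):
    """True if prev's own CA/CB match nxt's CA(i-1)/CB(i-1) within tol ppm."""
    pairs = ((prev["CA"], nxt["CA_prev"]), (prev["CB"], nxt["CB_prev"]))
    return all(
        a is not None and b is not None and abs(a - b) <= tol
        for a, b in pairs
    )

def proline_candidates(protos, tol=0.25):
    """Names of protopatterns whose preceding residue looks like a proline."""
    found = []
    for nxt in protos:
        ca, cb = nxt["CA_prev"], nxt["CB_prev"]
        if ca is None or cb is None:
            continue
        in_range = 60.0 <= ca <= 66.0 and 28.0 <= cb <= 34.0
        # assumption: "no appropriate HN and N" = no amide-bearing pattern owns these carbons
        explained = any(sequential_link(p, nxt, tol) for p in protos if p is not nxt)
        if in_range and not explained:
            found.append(nxt["name"])
    return found

# Hypothetical protopatterns (13C shifts in ppm).
protos = [
    {"name": "pp1", "CA": 55.0, "CB": 30.5, "CA_prev": 58.2, "CB_prev": 40.1},
    {"name": "pp2", "CA": 52.3, "CB": 19.0, "CA_prev": 55.1, "CB_prev": 30.4},
    {"name": "pp3", "CA": 60.1, "CB": 38.0, "CA_prev": 63.0, "CB_prev": 31.9},
]

print(sequential_link(protos[0], protos[1]))  # pp1 precedes pp2
print(proline_candidates(protos))
```

A pattern without a CB (glycine, for example) would need separate handling in a real implementation; this sketch simply never links it.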
You can follow the progress in the output window. In the first stage, the regular spin systems (i.e., non-prolines) are promoted. FELIX then tries to find spin systems that can be interpreted as prolines, i.e., protopatterns that contain a Ca,i-1 in the 60-66 ppm range and a Cb,i-1 in the 28-34 ppm range and no appropriate HN and N. In the third stage, FELIX tries to find the possible sequential connectivities and stores them with the newly generated patterns as probabilities. After a few minutes, 86 new patterns are generated and the table containing them (Spinsystems) is displayed. You can either inspect the patterns in the table or use the Assign/Report Spin System menu item:
After a few seconds, all 86 patterns are scored and the residue type probabilities are stored. Using the Assign/Report Spin System menu item for the first pattern gives a similar result:
15. Generating sequence-specific assignments for patterns
After a few seconds you will see many possible assignments; the beginning of the output will look like:
The scoring stores the results in the database as stretches. At the end of the action the stretches are shown in tabular form. This table can be closed and reopened using the Edit/Stretch menu item. Next, we show how to use stretches in helping with the assignment procedure.
16. Reviewing the first stretch
You will see the strip plot of patterns 49, 64, 58, 41, 77, 20, 22, 2, 31, and 76.
17. Setting the sequence-specific assignments for patterns 49, 64, 58, 41, 77, 20, 22, 2, 31, and 76
During execution the following lines appear in the output window:
Pattern pa49 assigned to residue 1:GLY_69.
Pattern pa64 assigned to residue 1:GLU-_70.
Pattern pa58 assigned to residue 1:ASP-_71.
Pattern pa41 assigned to residue 1:GLY_72.
Pattern pa77 assigned to residue 1:ARG+_73.
Pattern pa20 assigned to residue 1:MET_74.
Pattern pa22 assigned to residue 1:ILE_75.
Pattern pa2 assigned to residue 1:ALA_76.
Pattern pa31 assigned to residue 1:GLY_77.
Pattern pa76 assigned to residue 1:GLN_78.
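If you save the output window text, the assignment messages above can be collected into a lookup table outside FELIX. The following is a minimal sketch with a hypothetical function name; the regular expression is written against the message format shown in this step and may need adjusting for other output.

```python
import re

# Matches e.g. "Pattern pa64 assigned to residue 1:GLU-_70."
ASSIGN_RE = re.compile(
    r"Pattern (\S+) assigned to residue (\d+):([A-Z]+)[+-]?_(\d+)\."
)

def parse_assignments(log_text):
    """Map residue number -> (pattern name, residue name)."""
    result = {}
    for match in ASSIGN_RE.finditer(log_text):
        pattern, _molecule, resname, resnum = match.groups()
        result[int(resnum)] = (pattern, resname)
    return result

# Three of the lines printed in this step.
LOG = """\
Pattern pa49 assigned to residue 1:GLY_69.
Pattern pa64 assigned to residue 1:GLU-_70.
Pattern pa77 assigned to residue 1:ARG+_73.
"""

assignments = parse_assignments(LOG)
for resnum in sorted(assignments):
    print(resnum, *assignments[resnum])
```

The optional charge sign after the residue name (as in GLU- or ARG+) is dropped, so the table can be compared directly with a plain three-letter sequence.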
When you are done you can check the resulting pattern using the Assign/Report Spin System menu item. For example, pattern 58 would yield a similar result: