Tasks


Task: Importing data

FELIX can read seven specific data formats:

All other spectrometer data must be converted to one of the first three formats before FELIX can access it. The new FELIX format is preferred: unlike the old format, it moves data accurately and consistently between systems with different byte ordering. The ASCII format is generally used only when no other method of data conversion is available.


Transferring data

First, ensure that FELIX can access the spectrometer data on disk. One option is to mount the spectrometer data disks via a network connection so that they are accessible to the workstation running FELIX. If this is not possible, transfer the data from the spectrometer computer to the FELIX workstation. This is usually accomplished by using ftp to transfer the data via Ethernet.

Note: All of the data filters require that the spectrometer data files be kept in their native form. For example, the X32 data filter for Bruker data expects to see the same file and directory structure that exists on the spectrometer. The Bruker "data file" is actually a directory containing subdirectories with experimental and processed data files. When the spectrometer data is transferred to the FELIX workstation, this entire directory structure must remain unchanged.

One way to assure that the directory structure remains unchanged is to first make a .tar file of the desired data directories on the spectrometer. This .tar file can then be transferred to the FELIX workstation using ftp in binary mode. Once the data have been transferred, you can untar the data to use it in FELIX.

Note: Transfer the data in binary mode to avoid corrupting the data.
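The tar-and-transfer steps above can be sketched with Python's standard tarfile module; the directory names here are hypothetical placeholders for a Bruker-style tree:

```python
# Sketch: preserve a Bruker-style directory tree by tarring it before
# transfer, then extracting it unchanged on the FELIX workstation.
# All paths below are hypothetical; substitute your own data directories.
import tarfile
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
expt = root / "expt1" / "pdata" / "1"          # mimic nested Bruker layout
expt.mkdir(parents=True)
(expt / "2rr").write_bytes(b"\x00" * 16)       # placeholder processed data

archive = root / "expt1.tar"
with tarfile.open(archive, "w") as tar:        # pack the whole directory
    tar.add(root / "expt1", arcname="expt1")

dest = root / "workstation"
with tarfile.open(archive) as tar:             # unpack; structure survives
    tar.extractall(dest)

assert (dest / "expt1" / "pdata" / "1" / "2rr").exists()
```

The archive itself would be moved with ftp in binary mode, exactly as described above; only the packing and unpacking are shown here.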

Converting processed data

Use a conversion filter to convert processed data directly to the FELIX matrix format. Use the File/Convert/Matrix command to perform this task. Currently, you can convert NMRPipe, NMRCompass, Bruker 2rr and 3rrr files, and Varian data. The resulting FELIX matrix file can then be directly accessed by the FELIX program.

Converting Varian spectra

You can import processed Varian spectral data (that is, phase files) into FELIX. These phase files can be for 2D or 3D data sets. Using a macro called sv2dpf, you create a parameter (.par) file corresponding to 2D processed data sets. Similarly, using a macro called sv3dpf, you create a parameter (.par) file corresponding to 3D processed data sets. Each of these macros is provided below, and both must be run by VNMR (that is, Varian processing software). These two macros are located in the FELIX gifts\VARIAN\matrix\ directory and must be transferred to the VNMR macro library directory before they can be run by VNMR.

When you prompt FELIX to import a Varian spectrum and provide the filename for that spectrum, FELIX searches for a corresponding .par file and converts the data in that file into FELIX matrix format. If FELIX does not find a .par file with the same root name as the spectrum filename that you specified, FELIX prompts you to enter the spectral parameters manually.

If you are importing a 2D spectrum, you must rename the phase file to a name that corresponds to the root name of the .par file. For example, if you use the sv2dpf macro to create a file called test.par, you must rename the phase file to test.

"sv2dpf - save 2d phasefile"
"usage: sv2dpf (basename)"

" $# is the number of input arguments. It must be greater than zero. "

if ($# < 1) then
write ('error','usage: sv2dpf(filename)')
return
endif

" If the file already exists, delete it. "

exists ($1,'file') : $e
if $e then
rm($1)
endif

"Flush the phasefile completely from memory"

write ('line3', 'saving raw data to disk')
trace='f1' dcon flush
trace='f2' dcon flush

" Create the text file with suffix .par containing parameters of the data"

$parfile=$1+'.par'
write ('line3', 'saving parameters to disk')
write ('reset', $parfile)
write ('file', $parfile, '%d %d %d',ni,np,0)
write ('file', $parfile, '%10.1f %10.1f %10.1f',sw1,sw,0)
if (tn=dn) then
$frq1=sfrq
else
$frq1=dfrq
endif
write ('file', $parfile, '%10.1f %10.1f %10.1f', $frq1, sfrq, 0)

" We are finished writing the parameters"

write ('line3', 'Data written to %s and %s', $1, $parfile)
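For illustration, the three-line .par layout that sv2dpf writes can be reproduced with a short Python sketch. The parameter values below are hypothetical, and the exact field widths are an assumption based on the %10.1f formats in the macro:

```python
# Sketch: write a FELIX-style .par file with the same three lines that
# sv2dpf produces (matrix sizes, sweep widths, spectrometer frequencies).
# The parameter values passed in below are hypothetical examples.
def write_par(basename, ni, np_, sw1, sw, frq1, sfrq):
    lines = [
        f"{ni} {np_} {0}",                         # matrix sizes
        f"{sw1:10.1f} {sw:10.1f} {0.0:10.1f}",     # sweep widths (Hz)
        f"{frq1:10.1f} {sfrq:10.1f} {0.0:10.1f}",  # frequencies (MHz)
    ]
    with open(basename + ".par", "w") as f:
        f.write("\n".join(lines) + "\n")

write_par("test", ni=512, np_=2048, sw1=6000.0, sw=8000.0,
          frq1=125.7, sfrq=500.1)
```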

You use the sv3dpf macro to create phase files corresponding to different planes (f1f2, f2f3, f3f1) in a 3D data set. You generate the .par file corresponding to each plane by using the sv2dpf macro (described above). The sv3dpf macro is shown below:

"sv3dpf - save 3d phasefile"
" usage: sv3dpf(basefilename,orientation)"
" where basefilename is the basic filename; to that will be appended"
" '.orientation' to indicate the orientation and"
" '.xxx' to indicate the index"
" orientation is 'f1f2','f1f3', or 'f2f3'"

if ($#<2) then write('error','improper arguments') return endif
if ($2='f1f2') or ($2='all') then "save f1f2 data"
$i=1
repeat
select('f1f2',$i)
trace='f1' dcon flush
trace='f2' dcon flush
format($i,0,0):n1 length(n1):$len
if ($len=2) then n1='0'+n1 endif
if ($len=1) then n1='00'+n1 endif
copy(curexp+'/datdir/phasefile',$1+'.f1f2.'+n1)
$i=$i+1
until $i>(fn/2)
endif

if ($2='f1f3') or ($2='all') then "save f1f3 data"
$i=1
repeat
select('f1f3',$i)
trace='f1' dcon flush
trace='f3' dcon flush
format($i,0,0):n1 length(n1):$len
if ($len=2) then n1='0'+n1 endif
if ($len=1) then n1='00'+n1 endif
copy(curexp+'/datdir/phasefile',$1+'.f1f3.'+n1)
$i=$i+1
until $i>(fn2/2)
endif
if ($2='f2f3') or ($2='all') then "save f2f3 data"
$i=1
repeat
select('f2f3',$i)
trace='f2' dcon flush
trace='f3' dcon flush
format($i,0,0):n1 length(n1):$len
if ($len=2) then n1='0'+n1 endif
if ($len=1) then n1='00'+n1 endif
copy(curexp+'/datdir/phasefile',$1+'.f2f3.'+n1)
$i=$i+1
until $i>(fn1/2)
endif
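The plane-naming convention used by sv3dpf (the format/length/if sequence above) amounts to zero-padding the plane index to three digits. A Python sketch of the same scheme, with a hypothetical base name:

```python
# Sketch of the naming scheme sv3dpf uses: each extracted plane is saved
# as <basename>.<orientation>.<index>, with the index zero-padded to
# three digits (the format/length/if logic in the macro above).
def plane_name(basename, orientation, i):
    return f"{basename}.{orientation}.{i:03d}"

names = [plane_name("hnco", "f1f2", i) for i in (1, 25, 300)]
# -> ['hnco.f1f2.001', 'hnco.f1f2.025', 'hnco.f1f2.300']
```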

As described above, each macro creates a parameter (.par) file. The name of this file must correspond to the name of the spectrum that you want to import. The table below lists the contents of the .par file.

You do not need a fourth line in the .par file to import a Varian spectrum. However, you can add this line to set flags that define spectral zooming or the ppm range along the tier dimension for 3D data.

The format of the .par file is shown below. According to VNMR convention, the order of the values in the first three lines of the .par file varies, depending on which set of planes is being read. That is:

1.   For sv3d-generated f1/f3 planes (where f2 is the tier dimension):

2.   For sv3d-generated f2/f3 planes (where f1 is the tier dimension):

3.   For sv3d-generated f1/f2 planes (where f3 is the tier dimension):

A space is used as a delimiter in all the examples.

Note: Because only a few pulse sequences use the second and third channels for f1 and f2, respectively, the dfrq and dfrq2 values will rarely be in the correct positions. Therefore, you need to supply the correct spectrometer frequency yourself. You can use the axis parameter in VNMR to determine the correct spectrometer frequency for all three axes.


Task: Modifying the quick-access (context) menu

FELIX includes a quick-access (also called context) menu that is invoked with the secondary mouse button. This menu is intended to make it easy to customize the FELIX interface: although it is possible to rewrite the whole user interface, you can instead simply change this menu and its macro to make your favorite commands easily accessible. The necessary changes are described here.

Locate mouser.mot in the macros\mot\ folder under your installation directory. (See Starting FELIX, How to Use This Guide in this book for more details about the installation directory.) This macro defines the items displayed in the context menu and the callback macros (and, in some cases, parameters) that are executed when the items are clicked.

mouser.mot
item "Draw Frequency" D NULL NULL "ex drawfreqs 1" NULL NULL
item "Add Frequencies" A NULL NULL "ex drawfreqs 2" NULL NULL
item "Clear Frequencies" F NULL NULL "ex drawfreqs 3" NULL NULL
Separator Separator NULL NULL NULL NULL NULL NULL
item "Correlated Cursors" C NULL NULL "ex cursorval 2" NULL NULL
Separator Separator NULL NULL NULL NULL NULL NULL
item "Pick One Peak" O NULL NULL "ex pickone" NULL NULL
item "Edit One Peak" E NULL NULL "ex editpeak" NULL NULL
item "Merge One Multiplet" M NULL NULL "ex mergeone" NULL NULL
item "Remove One Peak" n NULL NULL "ex removeone" NULL NULL

The format of this file follows that of the general FELIX menu files. For example, the first line displays a "Draw Frequency" item in the context menu; when it is clicked, FELIX executes the drawfreqs macro with its first parameter set to 1. For more details about the format, see the separate Command Reference Guide.
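As a rough illustration of the field layout, one line of the menu file can be split into its quoted and unquoted fields with Python's shlex. The field interpretation here (label, mnemonic, callback) follows the description above and is not an official parser:

```python
# Sketch: split one quick-access menu line into its fields. shlex honors
# the double-quoted label, so "Draw Frequency" stays a single token.
import shlex

line = 'item "Draw Frequency" D NULL NULL "ex drawfreqs 1" NULL NULL'
fields = shlex.split(line)
# fields[0] is the keyword, then label, mnemonic, two placeholders,
# the callback string, and two more placeholders.
label, mnemonic, callback = fields[1], fields[2], fields[5]
# label == 'Draw Frequency', callback == 'ex drawfreqs 1'
```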


Task: Working with 1D data

FELIX provides a comprehensive set of tools for processing, displaying, and analyzing NMR data. This section outlines ways to use the menu interface for processing and analyzing one-dimensional (1D) NMR data. For information on converting data files into a format that can be read by FELIX, please see "Tasks" on page 33, and Appendix D., Data files.


Reading data files

FELIX can read data files in several formats, and you must tell FELIX what format your data is in for it to be interpreted correctly. The read commands are:

rf for reading FELIX for Windows FID files

rb for reading Bruker FID or SER files

rv for reading Varian FID files

rj for reading JEOL ALPHA .nmf files and LAMBDA .nmfid files.

To read a data file, use the File/Open command. You can specify the data as either one-dimensional or N-dimensional, and as either Bruker, Varian, FELIX for Windows, JEOL, or FELIX old or new format.

If you define the data type as one-dimensional, the data file is explicitly closed before it is read, to reset the record pointer to the first record. If you define the data as N-dimensional, FELIX reads the first record from the serial data file on the first read; each subsequent read (during the same FELIX session) increments the record pointer by one and reads the next data record.

As an example of reading in a sample data file, copy the 1D data file sample.dat from the %ACCELRYS_FELIX%\tutorial directory to your working directory.

Once you have copied the sample data file to your working directory, try reading it into FELIX using the File/Open command.


Saving data

Sometimes, you might want to save the data to disk as a permanent record. For example, you can save a fully processed spectrum to a file so you can display it quickly without re-transformation. To do this, select the File/Save As command. Specify a file name and the data are saved in FELIX new format. If the file already exists, FELIX prompts you to quit or overwrite the file.


Displaying 1D spectra

The most common way of manipulating displays is to use the View pulldown. View contains a set of commands designed to manipulate the display of 1D data. For example, to redraw the current workspace, select the View/Plot command.

By default, when FELIX initially reads and displays a new file, it draws all the data points (the entire spectral width). To display or work with an expanded region of the spectrum, you must select the expansion limits.


Changing 1D limits

Use the View/Limits/Set Limits command to choose spectrum limits in real time. FELIX displays a crosshair cursor. Move this cursor to the region you want to expand, push and hold down the left button on your mouse, drag out the region to be expanded, and then release the mouse button.

Alternatively, you can use the icons - particularly the Zoom icon - to execute this task.


Adjusting plot parameters

Use control panels to tailor the appearance of your data plots. FELIX shows the current settings for the plot appearance; you can change these settings. Access the control panels from the Preference/Plot Parameters command; or click the Plot Parameters icon.


1D data buffers

FELIX enhances the processing and analysis of one-dimensional and N-dimensional spectra by providing additional addressable memory locations. These storage spaces, called buffers, can be used to temporarily save data. The data stored in a buffer may be as simple as a single data value or as complex as a protein spectrum.

The FELIX buffers are addressed by number: buffer 1, buffer 2, ... buffer n. The size and number of buffers available is determined by the amount of memory configured for this use. Use the Preference/Memory command to change the program's memory allocation. For example, if four 1D buffers are defined, each having a buffer size of 16384 complex points, enough space can be reserved to let you work with the current data plus four buffers for data containing up to 16384 points.
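The memory arithmetic in the example above can be checked with a quick sketch; the 4-byte-float storage size is an assumption, since the actual FELIX internal representation may differ:

```python
# Back-of-the-envelope check of the example above: four buffers of 16384
# complex points, assuming 4-byte floats (an assumption; actual FELIX
# storage may differ).
points = 16384
buffers = 4
bytes_per_complex = 2 * 4            # real + imaginary, 4 bytes each
total = buffers * points * bytes_per_complex
# total == 524288 bytes, i.e. 512 KB for the four buffers
```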

Buffers are accessed with the Tools/Buffers command. For example, to store the current information to a buffer, select the Tools/Buffers/Store Work to Buffer command and enter the buffer number in the control panel. To visualize this information, you must change the stack depth to include the buffers that you want to visualize. The Tools/Buffers command contains many additional controls that allow you to manipulate the contents of the buffers.


Adjusting stack and buffer display

The stacks and buffers are generally used for displaying multiple spectra simultaneously. The stack represents the range of buffers selected for display. Change the various stack parameters (Depth, Order, and Overlap) in the PLOT PARAMETERS - BASIC control panel. Access it by selecting the Preference/Plot Parameters command or the Plot Parameters icon.


Axes and referencing

FELIX can display a spectral axis with several types of labels. The default axis label is points, because point positions remain reliable when the values for the spectrum width and spectrometer frequency change. Change the axis units with the Preference/Plot Parameters command, in the Scale control panel. Choices for the Axis Units parameter are None, Points, Hertz, PPM, Seconds, and 1/cm.

To reference a one-dimensional spectrum, select the Preference/Reference command, which asks you for the referencing information using the REFERENCE 1D DATA control panel. The Spectral Frequency and Spectral Width parameters must be set to the values of the spectrometer frequency in MHz and the spectrum width in Hz.

Enter the Reference Point value either by typing it in the box on the control panel or by clicking the Cursor button in the control panel. This prompts FELIX to display a vertical cursor; here you move the cursor to the desired reference point on the plot and then click the left mouse button.

At this time FELIX displays the REFERENCE 1D DATA control panel. Now enter the reference value in the appropriate box: for axis units of Hertz, enter a reference value in the Reference Hertz box; for ppm units, enter a reference value in the Reference PPM box. When you are finished, click OK to close the control panel and redisplay the data with the selected axis units.


Finding data values in spectra

Use the Measure/Cursor Position command to obtain point numbers, ppm values, and corresponding data values. When you select the Measure/Cursor Position command, FELIX displays a vertical cursor, which updates the current axis position and data value when you move it with the mouse. The axis position is in axis-based units: if your axis is in points it tracks in points; if your axis is set to ppm it tracks in ppm. To quit the cursor-tracking mode, press <Esc>. The data value shown is the actual data value stored at that location in the workspace.


Correlated cursors

When you display data in several frames at once, you can use correlated cursors (multiple crosshair cursors) that track in more than one frame. Use the mouse to point to the same data-point values in all the frames simultaneously.

For example, if there are three frames, three crosshair cursors track the mouse motion. The center of each cursor falls on the same data point value in each frame. To activate correlated cursors, select the Measure/Correlated Cursors command. To find the cursor position and data-point value, click the mouse button. To quit the correlated-cursor mode, press <Esc>.


Spectrum separation

You can calculate the separation between any two spectrum features in a display with the Measure/Distance/Separation command. Use a crosshair cursor to select two locations on the display. FELIX reports the separation in points, ppm, and Hertz. To quit this mode, press <Esc>.


Zero-filling and removing DC offset

Zero-filling is a common manipulation performed on time-domain data. Select the Process1D/Zero Fill command to increase the number of points in the transformed spectrum, and thereby increase the spectrum's apparent digital resolution.
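Conceptually, zero-filling just appends zeros to the FID before transformation; a minimal Python sketch with a toy four-point FID:

```python
# Sketch: zero-fill a short FID to a larger size before the Fourier
# transform, increasing the apparent digital resolution of the spectrum.
# Pure-Python illustration; real data would be complex point pairs.
def zero_fill(fid, target_size):
    if target_size < len(fid):
        raise ValueError("target smaller than data")
    return fid + [0.0] * (target_size - len(fid))

fid = [1.0, 0.5, 0.25, 0.125]        # toy 4-point FID
padded = zero_fill(fid, 8)
# padded == [1.0, 0.5, 0.25, 0.125, 0.0, 0.0, 0.0, 0.0]
```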

To remove any DC offset that may have occurred during data acquisition, raw FIDs are usually corrected prior to Fourier transformation. To do this, select the Process1D/DC Offset command. Select regular BC (baseline correction) or DBC (for Bruker oversampled data) and set a Baseline Correction Fraction. The Baseline Correction Fraction specifies the fraction of the FID, starting from the right side, that is averaged to eliminate the DC offset.
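A minimal sketch of this correction: average the right-hand fraction of the FID (assumed to contain only noise and offset) and subtract that constant from every point. The fraction value below is illustrative:

```python
# Sketch of the DC-offset correction described above: average the last
# fraction of the FID (where the signal has decayed away) and subtract
# that constant from every point. The fraction value is hypothetical.
def remove_dc(fid, fraction=0.25):
    n = max(1, int(len(fid) * fraction))
    tail = fid[-n:]                      # right-hand fraction of the FID
    dc = sum(tail) / len(tail)           # estimated constant offset
    return [v - dc for v in fid]

corrected = remove_dc([3.0, 2.0, 1.0, 1.0], fraction=0.5)
# tail average is 1.0, so corrected == [2.0, 1.0, 0.0, 0.0]
```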


Linear prediction

Linear prediction estimates the value of a point based on the values of adjacent points; it can be used to replace corrupted values in an FID. Often, the first several data-point values are corrupted by instrumentation-induced artifacts. The spectrum can therefore be noticeably improved when these corrupted point values are replaced with values estimated by linear prediction from subsequent points, which have greater integrity.

A second application of linear prediction to NMR data is to extend an FID. This is useful for experiments in which the data collection stopped before the signals completely decayed; that is, the FID is truncated. Here, the data values of the FID are used to estimate new data values that are appended to the end of the FID.
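As a toy illustration of the idea, the sketch below fits a single prediction coefficient by least squares and uses it to extend a truncated decay. Real LP calculations use many coefficients and complex data; this is only the one-coefficient case:

```python
# Toy illustration of linear prediction: model each point as a multiple
# of the previous one (a one-coefficient predictor), estimate the
# coefficient by least squares, and extend a truncated decay.
def extend_fid(fid, extra):
    num = sum(fid[i] * fid[i - 1] for i in range(1, len(fid)))
    den = sum(v * v for v in fid[:-1])
    a = num / den                        # fitted prediction coefficient
    out = list(fid)
    for _ in range(extra):
        out.append(a * out[-1])          # predict forward, point by point
    return out

extended = extend_fid([8.0, 4.0, 2.0, 1.0], extra=2)
# the decay is exactly halving, so a == 0.5 and the new points are 0.5, 0.25
```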

Process1D/Linear Prediction/First Point Prediction

Use the Process1D/Linear Prediction/First Point Prediction command to use linear prediction to replace data values at the beginning of the FID. FELIX displays a control panel; specify the following parameters for the prediction:

The time taken to complete an LP calculation is determined in large part by the number of data points that are used to determine the LP coefficients. In general, the quality of the LP calculation also increases with the number of data points used in the calculation.

Therefore, you must pick a value for the Points to use that is large enough to produce accurate predicted data values but that does not unduly lengthen the time taken by the LP calculation. The value for the Number of Coefficients is generally set to be one-quarter to one-third of the Points to use.

Process1D/Linear Prediction/Last Point (or general) Prediction

Use the Process1D/Linear Prediction/Last Point (or general) Prediction command to use linear prediction to replace data values at the end of the FID. You can also use it to extend the FID.

Conventional linear prediction works well when the data are being extended by a small fraction of the number of points. However, when the number of data points is being extended by a large fraction (e.g., doubling the number of points), the predicted data can contain signals with increasing amplitude due to the effect of noise on the predicted LP coefficients. The net effect can be an FID with increasing (as opposed to decreasing) amplitude as a function of time.

To prevent this you may use Root Reflection when predicting data in FELIX. Root Reflection ensures that the calculated frequency components decay as a function of time and thus more accurately reflect the correct physical nature of an FID. Using Root Reflection increases the time needed to perform the LP calculation, but the predicted points are more representative of a true FID. Root Reflection is essentially required when predicting large numbers of data points, to avoid having noise components with increasing amplitude dominate the predicted FID at longer time values.

LP calculation methods

FELIX allows great flexibility in how the LP calculation is performed. By default, the Process1D/Linear Prediction/Last Point (or general) Prediction command extends the data, but you may also specify exactly which points to use in the LP calculation and which points to predict.

You specify the method used to perform the LP calculation. The options include Forward, Backward, Forward-Backward, and Mirror Image.

When you use the Mirror Image technique, you can increase the Number of Coefficients to between one-half and two-thirds the value of the Points to use.

The Mirror Image technique requires prior knowledge of the phase of the data and nondecaying signals. Because of these restrictions, the Mirror Image technique is used primarily for severely truncated indirect dimensions of N-dimensional data sets when there is a need to calculate more LP coefficients than would be possible with the Forward-Backward method. The Mirror Image method includes options for data collected with no sampling delay and data collected with a one-half dwell time sampling delay.


Solvent signal suppression

FELIX offers three methods for reducing the intensity of strong solvent signals: a linear-prediction-based algorithm, a convolution-based method, and a polynomial-based method. All three methods are accessible by using one of the options in the Process1D/Solvent Suppression command.

Linear-prediction-based solvent reduction

To access linear-prediction-based solvent reduction, select Linear Prediction from the Method popup in the control panel that FELIX displays when you select the Process1D/Solvent Suppression command. This technique uses the LP algorithm to estimate and remove contributions from the most intense components in the spectrum. This technique works well when the intensity of the solvent signal to be removed is much greater than the other signals that are present.

Convolution-based solvent reduction

To access convolution-based solvent reduction, select Time-Domain Convolution from the Method popup in the control panel that FELIX displays when you select the Process1D/Solvent Suppression command. In this technique, FELIX first performs a convolution to identify the lowest-frequency components that are present, then subtracts these components from the data. This technique is very useful when the solvent signals to be removed lie at the carrier frequency.
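The convolution step can be illustrated with a simple moving average, which estimates the slowly varying (near-carrier) component that is then subtracted; the window width below is a hypothetical choice:

```python
# Sketch of time-domain convolution suppression: estimate the slowly
# varying (low-frequency, e.g. on-carrier solvent) component with a
# moving average, then subtract it from the FID.
def suppress_solvent(fid, width=3):
    half = width // 2
    smooth = []
    for i in range(len(fid)):
        lo, hi = max(0, i - half), min(len(fid), i + half + 1)
        smooth.append(sum(fid[lo:hi]) / (hi - lo))   # local average
    return [v - s for v, s in zip(fid, smooth)]

residual = suppress_solvent([5.0, 5.0, 5.0, 5.0], width=3)
# a pure constant ("solvent") component is removed completely
```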

Polynomial-based solvent reduction

To access polynomial-based solvent reduction, select Polynomial from the Method popup in the control panel that appears when you select the Process1D/Solvent Suppression command. In this technique, FELIX fits a polynomial to the data. Then, FELIX subtracts the resulting function from the time domain data. This technique works best when the solvent resonance is close to zero frequency.


Viewing and applying window functions

Time domain NMR data can be multiplied by window functions that perform digital filtering for the purpose of reducing noise or increasing spectral resolution. For example, the noise level in 1D NMR data can be attenuated by multiplying the FID by an exponential window function.

FELIX offers two methods for selecting window function parameters: you can enter the parameter values directly or you can adjust them interactively. The Process1D/Window Function command allows you to select a window function and adjust its parameters interactively while FELIX displays plots of both the window function and the product of the FID and window function. FELIX can also display the spectrum rather than the FID and window function product, while you adjust the window function parameters. This function is extremely useful in determining which window function is appropriate for your data.

You can also explicitly specify a window function and its parameters with this command. When you know exactly what window function you want to use, as well as its parameters, this action lets you apply it quickly and precisely.

Window function descriptions

Sinebell, Sinebell squared, Skewed sinebell, and Skewed sinebell squared windows are a useful (but potentially dangerous) family of apodization functions. These sinebell windows can be shaped in many different ways, depending on their size, phase, and skew.

Exponential linebroadening

Exponential linebroadening is the most commonly used window function. This function equals one at the first point and decays exponentially at the specified rate. Because the value of the first point in the FID is not changed, exponential windows do not alter integral intensities and are a good choice if integrals are to be measured. Exponential windows let you trade line width for signal to noise while preserving the Lorentzian line shape.

Note: If you use an exponential window to reduce noise, be aware that the lines in the spectrum still have Lorentzian shapes but no longer have natural line widths. For more detailed information, please refer to the em command in the FELIX Command Language Reference Guide.


lbroad

(slider) Adjusts the line-broadening parameter for the exponential.
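A sketch of an exponential window follows; the form w(t) = exp(-pi * lb * t) with lb in Hz is one common convention, and the convention used by the em command may differ:

```python
# Sketch of an exponential (line-broadening) window. One common form is
# w(t) = exp(-pi * lb * t) with lb in Hz; treat this as illustrative,
# since the exact convention used by the em command may differ.
import math

def exp_window(n_points, lb, dwell):
    return [math.exp(-math.pi * lb * i * dwell) for i in range(n_points)]

w = exp_window(4, lb=1.0, dwell=0.001)
# w[0] == 1.0, so the first FID point (and hence the integral) is preserved
```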

Gaussian linebroadening

Gaussian linebroadening is another popular window that changes not only the line width, but also the line shape. Gauss/Lorentz multiplication modifies the value of the first point of the FID and hence the value of the integral. Gauss/Lorentz is commonly used for resolution enhancement and changes the line shape to be partly Gaussian. Gaussian lines have narrow tails and yield a nicer-looking spectrum. This window is appropriate for cosmetic resolution enhancement, but the line widths and line shapes are no longer natural. Gaussian multiplication also alters the integral of spectral lines and differentially reduces the integral of broad lines with respect to narrow lines.

Although a spectrum with Gaussian line shapes looks great, use caution if you attempt to integrate it, since the integral is affected by the line shape. For more detailed information, please refer to the gm command in the FELIX Command Language Reference Guide.


lbroad

(slider) Adjusts the line broadening parameter for the exponential.

gbroad

(slider) Adjusts the Gaussian parameter for the Gauss/Lorentz window.

Kaiser

The Kaiser window is a window function described by Hamming (1989). This window is useful for apodizing data that are truncated. For more detailed information, please refer to the kw command in the FELIX Command Language Reference Guide.

wsize

(slider) Adjusts the number of data points for the window function.

alpha

(slider) Adjusts the alpha parameter of the Kaiser window.

Trapezoidal

The Trapezoidal function multiplies the data in the workspace by a window that rises from zero at the first point to one at <p1>, is equal to one from <p1> to <p2>, and falls to zero from <p2> to <p3>. For more detailed information, please refer to the tm command in the FELIX Command Language Reference Guide.

p1

(slider) Adjusts the first point of the trapezoid.

p2

(slider) Adjusts the second point of the trapezoid.

p3

(slider) Adjusts the third point of the trapezoid.
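The trapezoidal shape described above (rise to one at p1, flat to p2, fall to zero at p3) can be sketched as follows; the point values are hypothetical:

```python
# Sketch of the trapezoidal window: ramps 0 -> 1 up to p1, stays at 1
# from p1 to p2, and ramps 1 -> 0 from p2 to p3 (all point indices).
def trapezoid(n, p1, p2, p3):
    w = []
    for i in range(n):
        if i < p1:
            w.append(i / p1)                 # rising edge
        elif i <= p2:
            w.append(1.0)                    # flat top
        elif i <= p3:
            w.append((p3 - i) / (p3 - p2))   # falling edge
        else:
            w.append(0.0)
    return w

w = trapezoid(9, p1=2, p2=5, p3=8)
```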


Fourier transforms

The advantages of applying a pulse to the nuclear spins and collecting the resulting transient - advantages that range from improving the signal-to-noise ratio to making possible forbidden detection of multiple quantum coherence - have made this the predominant NMR technique. The Fourier integral transform is central to modern NMR data processing because it transforms the transient from the time domain to the frequency domain, thereby yielding a spectrum.

Accordingly, FELIX provides a battery of Fourier integral transforms, accessed by selecting the Process1D/Transform command, which opens a control panel containing a list of the integral transforms that FELIX can perform on the data in the workspace. Five Fourier transforms and a Hilbert transform are available. The most commonly used transform methods are listed below.


Phasing

After Fourier transformation, a spectrum frequently appears to be out of phase; that is, the resonance lines appear to be a mixture of absorptive and dispersive shapes. This is due to several factors, including finite pulse lengths, acquisition delays, and analog filter response. NMR spectra can be phase-corrected after transformation by multiplying each data-point value pair by a phase factor.

FELIX has three ways to apply phase correction to a spectrum. To access these, select the Process1D/Phase Correction menu item. The automatic phase-correction algorithms include the PAMPAS and APSL methods for spectra with non-split peaks (such as decoupled 13C and DEPT spectra), an algorithm based on peak integration for general in-phase 1D spectra, and a basic algorithm intended for common proton spectra. In addition to the automatic phase-correction routines, FELIX provides an easy-to-use interactive real-time phasing routine. FELIX also allows you to enter the zero- and first-order phase parameters explicitly.
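The underlying arithmetic of zero- and first-order phase correction can be sketched as a point-by-point complex rotation; the sign and pivot conventions here are assumptions and may not match FELIX exactly:

```python
# Sketch of zero- and first-order phase correction: multiply each complex
# point by exp(i * (phi0 + phi1 * k / N)), with phases in radians. Sign
# and pivot conventions vary between packages; this only shows the idea.
import cmath
import math

def phase(spectrum, phi0, phi1):
    n = len(spectrum)
    return [pt * cmath.exp(1j * (phi0 + phi1 * k / n))
            for k, pt in enumerate(spectrum)]

# a purely dispersive point (0+1j) rotated by -90 degrees becomes absorptive
fixed = phase([1j], phi0=-math.pi / 2, phi1=0.0)
```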


Correcting baseline distortions

An NMR spectrum can exhibit substantial baseline distortions caused by non-ideal experimental conditions. Such distortions can interfere with analysis of the spectrum (for example peak picking and integral calculation), so they must be minimized. Fortunately, most baseline distortions can be minimized easily, by first identifying a set of points on the spectrum that are free of peaks. These points are called baseline points. Next, a smoothly varying function is fitted to the set of baseline points. This function is expected to closely approximate the baseline distortion. Finally, at each point, the value of the smoothly varying function is subtracted from the data value in the work space, thereby removing the baseline distortion from the spectrum.

With the Process1D/Baseline Correction command, you can either add and delete baseline points or you can use different options to correct the baseline.


Baseline point entities and files

FELIX uses an integrated database for storing spectrum information, including the identities of selected baseline points. By default, the name of the baseline point entity is bas:baseline; however you can change this name. In fact, if you want to retain the baseline points stored in the current entity while you pick a new set of baseline points, you must change the entity name.


Spectrum display for baseline correction

When you are correcting the baselines of spectra, you may need to test several sets of baseline correction points and functions before a spectrum can be satisfactorily corrected. This is especially true if it is difficult to define baseline points due to spectral crowding.

Therefore, before applying any baseline-correction function, select the File/Save As command and save your spectrum. Later, if you are not satisfied with the results of any given baseline-correction function and its application to your data, select the File/Open command to read your saved spectral data in again.


Adding and deleting baseline points

Select the Process1D/Baseline Correction command to open a control panel containing the available choices for adding and deleting baseline points. First you have to toggle on the Baseline Points radio button.

To define baseline points

To define the baseline points, select the Auto Pick Points or Auto Pick Points w/FLATT option from the popup. FELIX generates a list of baseline points and displays a marker at the bottom of the spectrum for each baseline point picked.

To add baseline points singly

To add baseline points singly, use the Pick Points Via Cursor option. Move the crosshair cursor to baseline points, and click them with the left mouse button. To quit this mode, click outside of the spectrum. If you want, you may add baseline points explicitly using point numbers with the Manual Pick Points option.

To modify the list of baseline points

If you make a mistake while selecting individual baseline points or if you want to modify the current list of baseline points, you can delete a region of points. Select the Delete Points in Region option in the Process1D/Baseline Correction command's control panel to create a small crosshair cursor. Then drag out a region of baseline points to delete.

To delete all baseline points

To delete all the baseline points, select the Delete All Points option. This deletes the current baseline points entity from the database; the action requires confirmation via a dialog box.


Applying baseline-correction functions

Once the baseline points are defined, you can choose a baseline-correction algorithm.

Cubic spline algorithm

The cubic spline algorithm generates a baseline that passes exactly through each baseline point. To apply it, set the Baseline Correction option to Cubic Spline in the control panel for the Process1D/Baseline Correction command. A cubic spline may yield a kinked baseline if the defined baseline data points are close together and noisy.

Polynomial baseline correction

The polynomial baseline correction algorithm generates smoother baseline correction functions from baseline points. To apply this baseline correction, set the Baseline Correction option to Polynomial in the control panel for the Process1D/Baseline Correction command. The polynomial correction differs from the cubic spline correction algorithm in that the baseline does not necessarily pass exactly through each baseline point, but a best fit is calculated.
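In general terms, a polynomial baseline correction of order n chooses coefficients that minimize the squared deviation of the polynomial from the picked baseline points (x_i, y_i); the order and fitting details FELIX uses internally are not specified here:

\[
\min_{c_0,\dots,c_n} \sum_i \left( y_i - \sum_{k=0}^{n} c_k x_i^k \right)^2
\]

The fitted polynomial is then subtracted from the spectrum, which is why the corrected baseline need not pass exactly through every baseline point.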

Real-time baseline correction

The FELIX real-time baseline correction feature lets you adjust the coefficients of a polynomial baseline function while displaying the resulting baseline function superimposed on the baseline-corrected spectrum. To access the real-time baseline correction feature, set the Baseline Correction option to Real-Time Polynomial in the control panel for the Process1D/Baseline Correction command.

Automatic baseline flattening

The baseline-correction methods described above require that a set of baseline points already exist. In contrast, the three baseline correction methods discussed below do not need pre-defined baseline points. Instead, they alone determine what constitutes the baseline.

Automatic baseline correction

One of the baseline-correction functions supported by FELIX that does not require explicit baseline points is applied by setting the Baseline Correction option to Automatic w/abl in the control panel for the Process1D/Baseline Correction command. FELIX selects noise points and performs a baseline correction for each point. The method involves the DC convolution of baseline points with a moving average, while applying a straight-line correction to non-baseline intervals (Dietrich et al. 1991).

FLATT baseline-correction

FELIX also performs the FLATT baseline-correction algorithm (Güntert and Wüthrich 1992), which finds baseline segments in the spectrum and uses a linear least-squares solution to fit a truncated Fourier series to these points. To use FLATT, set the Baseline Correction option to Automatic w/FLATT in the control panel for the Process1D/Baseline Correction command.

FaceLift baseline correction

The third baseline-correction function that does not require explicit baseline points is based on FaceLift, a technique introduced by Chylla and Markley (1993). The FaceLift algorithm automatically identifies base points from peak signals. To use FaceLift, set the Baseline Correction option to Automatic w/FaceLift in the control panel for the Process1D/Baseline Correction command.


Miscellaneous work tools

FELIX contains several menu items that affect frequency domain spectra in the workspace. Although most of these controls are directly related to the transformation of multidimensional spectra, several of them affect the processing of one-dimensional data. To access these tools, select the Tools pulldown.


Task: 1D peak picking and integration

Picking 1D peaks or resonances within FELIX is performed using the menu items within the Peaks pulldown.


1D peak entities and files

FELIX uses an integrated database for storing spectrum information, including the identities of picked peaks. By default, the name of the 1D peaks entity is pic:1d_picks; however, you can change this name. In fact, if you want to retain the 1D picked peaks stored in the current entity while you pick a different set of peaks, you must change the entity name.

To change or view the name of the current 1D peaks entity, select the Preference/Pick Parameters command. FELIX displays a control panel with the name of the current 1D peaks entity shown in the Peak Pick Table box. To change the name, enter the name of a new or existing 1D peaks entity and click OK.

To view the contents of the 1D peaks entity, select the Edit/Peaks command. FELIX creates a spreadsheet display of the 1D peaks entity contents. You may add or delete 1D peaks on the spreadsheet, which also changes the information in the database.

To save a 1D peaks entity in an ASCII file, select the File/Export/Peaks command. FELIX prompts you to specify an output file name, as well as the name of the entity to save. FELIX then writes the ASCII file to the current directory or the text subdirectory, depending on how the FELIX directory structure is configured.

To read a 1D peaks ASCII file into a 1D peaks entity, select the File/Import/Peaks command. FELIX prompts you to specify an input file name as well as an entity name.


Working with picked peaks

You must define the threshold value before picking peaks. To do this, select the Preference/Pick Parameters command, set the Threshold Value option to Cursor, and click OK. Move the cursor so that the horizontal half-crosshair is located at the level of the smallest peak you want to pick, and then click the left mouse button. FELIX displays a dialog box with the newly set threshold. To accept it, click OK.

Selecting peaks

To automatically select peaks in the current display, select the Peaks/Pick All command.

To select peaks in a sub region of your display, select the Peaks/Pick Region command. This creates a small crosshair cursor that you may drag to select a picking region.

To select one peak at a time, use the Peaks/Pick One command.

Use the Preference/Pick Parameters command to open a control panel listing the parameters that affect 1D peak picking. You can inspect and set threshold values here. To control how picked peaks are labeled, use the Peak Pick Units parameter: peaks can be labeled in Points, PPM, or Hertz, or left unlabeled (None). To display assignment names on the peaks, set the Peak Pick Units parameter to Assignment.

Three different peak-selection modes are provided. They are:

You can also vary the style of the peak markers, specifying arrowheads only, lines only, or lines with arrowheads.

Deleting picked peaks

There are three options for deleting picked peaks.


1D line fitting

FELIX offers a powerful line-fitting interface for deconvolution of complex spectra into individual peaks that are described by an analytic function of intensity, linewidth, and frequency. These functions allow precise integration of peaks individually.

To access the interface and display utilities, select the Peaks/Optimize command. This gives you access to the 1D line fitting function and other functions that display the actual data, the synthetic data, or residual data, either separately or all together in an overlay.

To fit a spectrum, select the Optimize option in the control panel associated with the Peaks/Optimize command.

Note: The spectrum's baseline must be flat in order for you to obtain meaningful optimization results. So, before you fit the spectrum, be sure to perform baseline correction on it.


Integration

Peak integration of a spectrum gives information about the relative number of spin species. Accurate integration is an important part of 1D data analysis. The integration options are accessed with the Measure/Integral/Volume command.

Note: The spectrum's baseline must be flat in order for you to obtain an accurate integral. So, before you calculate integrals, be sure to perform baseline correction on your spectrum.

FELIX allows you to integrate the entire spectrum either as a single integral or in shorter segments. To integrate the entire spectrum, select the View/Draw Integrals command. However, if segments are currently defined, the integrals for each segment are displayed separately by default. You can select the Measure/Integral/Volume command to add or remove segments, change the display parameters, or normalize the integral values. Options are displayed in the control panel upon selecting Measure/Integral/Volume.

Segment entities and integral files

FELIX uses an integrated database for storing spectrum information, including the identities of selected integral segments. By default, the name of the integral segment entity is seg:segments; however, you can change this name. In fact, if you want to retain the integral segments stored in the current entity while you pick a new set of integral segments, you must change the entity name.

To change the name of the current integral segment entity, type a name in the Use Database Entity option in the control panel belonging to Measure/Integral/Volume. Alternatively, select the Preference/Table command, set the Table Type parameter to Integral Segments, enter the name of a new or existing integral segment entity, and click OK.

To view the contents of the integral segment entity, select the Edit/Table command and choose the segment entity. This creates a spreadsheet display of the integral segment entity contents. You may add or delete integral segments on the spreadsheet.

At times you may want to save an integral segment entity in an ASCII file. Select the File/Export/Table command. FELIX prompts you to specify the name of the entity (table) to save, as well as for an output file name. The ASCII file will be written to the directory selected in the file selection control panel.

An integral segment ASCII file may be read into an integral segment entity by selecting the File/Import/Table command. FELIX prompts you to specify an input file name and an entity name.

Spectrum display for integrals

When integrating spectra, it is often necessary to try several permutations of integral segments and normalization before a spectrum can be integrated satisfactorily. This is especially true if it is difficult to define segments or baseline points due to crowding. Thus, FELIX provides keypad navigation tools to quickly change displayed regions, as well as the Adjust option in the control panel belonging to the Measure/Integral/Volume command. Use it to change display type or manipulate integrals in real time.

While changing display type, you have the following options:

Defining and deleting integral segments

To define integral segments, select the Add Segment option in the control panel belonging to the Measure/Integral/Volume command. Integral segments are added by dragging out a segment region with the cursor. When you select a valid region within your spectrum, you can add additional segments without reselecting the Measure/Integral/Volume command. To exit this mode, press the <Esc> key.

If you make a mistake while selecting individual segments or if you want to modify the current list of segments, you may delete a small subregion of segments graphically. First, select the Remove option in the control panel belonging to the Measure/Integral/Volume command. Use the small crosshair cursor to drag out a region of segments to delete.

To delete all segments, select the Remove All option in the same control panel. This action deletes the current integral segments entity from the database and requires confirmation via a dialog box.

Adjusting integral slope and bias

For some spectra, it is impossible to accurately define baseline points. Use the Adjust option in the control panel belonging to the Measure/Integral/Volume command to adjust the slope and bias parameters in real time using sliders. If the baseline is significantly distorted, even adjusting the slope and bias may not be able to generate correct integral shapes.

Note: By adjusting the slope and bias, you are able to dial an integral value to anything you want. Thus, use these adjustments cautiously.

Integral normalization

FELIX enables you to normalize the integral of any segment of the spectrum to an arbitrary value. Four different normalization options are available via the Measure/Scalar/Normalize command. After normalization, the volume element in the integral segment entity is updated to the normalized value.


Task: Processing 2D data

This section describes the general procedures for processing 2D data. For information on converting your data files into a format that FELIX can read, please see "Tasks" and Appendix D., Data files.


General processing steps

Multidimensional data are processed and stored by FELIX using matrix files. Matrix files are designed to allow easy access to and manipulation of the individual vectors that compose the data. In fact, virtually all N-dimensional processing is a repeated process of loading a vector from the matrix into the workspace, processing that vector, and then storing that vector back in the matrix. Therefore, you must build a matrix file that will be used to hold the N-dimensional data.

Once a matrix file has been created, the individual FIDs that make up the FELIX data file are read in one by one, processed, and saved to the matrix file. Subsequent dimensions are processed by loading each vector in the given dimension one by one from the matrix into the workspace, processing each vector in turn, then storing the processed vectors back in the matrix file. This process is repeated for each dimension of the matrix.
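This cycle can be sketched as a skeleton macro in the FELIX command language. This is a sketch only: the matrix name is a placeholder, the single ft command stands in for the full set of processing steps, and it assumes a 2D matrix has already been built. All of the commands used appear in the example macros later in this section.

c**loopsketch.mac
; Sketch of the load-process-store cycle for the second dimension
cmx
def matrix `my2dfile.mat'
mat &matrix w
for col 1 &d1size
loa &col 0
ft
sto &col 0
nex
cmx
ret
end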

In practice, these steps are generally carried out using macros. Macros are files that contain a series of instructions used to process the data. FELIX includes a very flexible macro language (FCL), which allows you to control all aspects of how the data are processed. You can use your own macros to process the data or use the EZ macros. The EZ macros are a series of predefined processing macros designed for the most common types of multidimensional data. Generally, using your own custom macros is preferred, but the EZ macros will often suffice.

Processing the D1 dimension using macros

The macro below shows an example of how to process the D1 dimension of a 2D data set. This sample macro is appropriate for either States or TPPI data. (The line numbers are for reference only and are not included in the actual file.)

1   c**simpled1.mac
2   ;
3   def datfil `my2dfile.dat'
4   def matrix `my2dfile.mat'
5   def d1zfil 2048
6   def d2zfil 2048
7   def numd1 512
8   ;
9   cmx
10  bld &matrix 2 &d1zfil &d2zfil 0 y
11  mat &matrix w
12  ;
13  def temph0 -63.7941
14  def temph1 -130.5324
15  ;
16  def datsiz &d1zfil
17  def datype 1
18  set 1
19  ss 2048 60
20  stb 1
21  ;
22  ; D1 processing
23  ;
24  ty D1(t2) Processing.
25  ty ---------------------
26  ;
27  cl
28  for row 1 &numd1
29  esc out
30  if &out ne 0 quit
31  rn &datfil
32  if &status ne 0 eof
33  def phase0 &temph0
34  def phase1 &temph1
35  def datype 1
36  bc 0.2
37  lpf 32 16 8 1
38  cnv 0 48
39  mwb 1
40  ft
41  ph
42  red
43  sto 0 &row
44  ty Row=&row
45  next
46  eof:
47  if &status ne 0 then
48  def status 0
49  ty End-of-file on record &row
50  eif
51  ty D1(t2) transform completed.
52  ty --------------------------------
53  quit:
54  cmx
55  def status 0
56  ret
57  end

Lines 3-7:

These lines define a series of symbols whose values determine how the data processing is to be performed. You do not need to define these symbols separately in the beginning of the macro. You can choose to enter processing-specific parameters on the individual command lines. However, grouping important symbol definitions at the beginning of a macro tends to remind you which parameter values may need to be changed for different processing sessions.

Lines 9-12:

The cmx command is used to close any open matrix files. Then the bld command is used to create an empty matrix file of the appropriate size. The zero in the bld command line indicates that the matrix to be created will be real. Most multidimensional data processing is done with real matrix files. The mat command is then given to open the matrix that was just created with write access.

Lines 13-14:

These two lines define a pair of symbols (temph0 and temph1) which are used to store the zero and first-order phasing parameters. You need to store the desired phasing parameters in temporary symbols because when the input data files are read in later, the phase0 and phase1 symbols will be overwritten by the phasing values that were stored in the input data file.

Lines 16-20:

This section of the macro illustrates how to set up an apodization function in buffer 1. Later in the processing-loop section of the macro, each FID is multiplied by the contents of buffer 1. This is a more efficient method of doing the apodization than recalculating the appropriate apodization function for each FID. Lines 16 and 17 define the datsiz symbol as the appropriate number of complex data points and the datype symbol as one that indicates complex data. The set command then sets all the real values in the workspace to 1. The appropriate apodization function is then performed and the result is stored in buffer 1.

Lines 24-25:

These lines print out a message to the text window that the D1 processing loop is about to begin.

Line 27:

The cl command is used to close any open data file. This ensures that a subsequent read command (re or rn) will read the first record of the data file.

Lines 28-45:

This section of the macro forms the main processing loop. This loop is executed once for each data file in the input data set.

Lines 29-30:

These two lines of the macro show how the esc command can be used to allow you to interrupt a macro during execution. The esc command monitors keyboard input for the escape character. When the <Esc> key is pressed the user-defined symbol (here, the out symbol) is set to one. An if statement transfers control out of the loop when the out symbol is no longer equal to zero.

Lines 31-32:

These two lines read the next data set from the input data file and check the status of the read command. The re command is used to read old-format data and the rn command is used to read new-format data. If the read command is successful, the status symbol is set to zero. If the read command is not successful because, for example, you tried to read more data sets than were present in the input data file, then the status symbol is set to a non-zero value. If the read is not successful, then control is transferred out of the loop.

Lines 33-34:

These two lines make sure that the phase0 and phase1 symbols, which are used by the subsequent phasing command (ph), are set correctly based on the saved phasing parameters (temph0 and temph1). This is necessary (as mentioned above) because, when a read command is given (re or rn), the phase0 and phase1 symbol values are overwritten with the phasing values that are stored in the input data file.

Line 35:

This line sets the datype symbol to 1, which indicates complex data. This is required so that subsequent processing operations behave properly.

Lines 36-41:

These are the main processing steps: the data are baseline corrected, the first point is corrected, the solvent signal is removed, and the data are apodized, Fourier transformed, and phased. You would customize this section of the macro to correctly process your data. Apodization is performed by multiplying by buffer 1, which contains the apodization function that was set up earlier in the macro.

Lines 42-44:

The data are then reduced so that only the real part of the FID is retained. These real data are then stored in the matrix. A type command (ty) is used to print out the current row so that you can monitor the progress of the macro.

Lines 46-50:

If a read command should fail (such as by trying to read more input data sets than are present in the input data file) then control transfers to this section of the macro. A message is printed telling you which record triggered the failed read.

Lines 53-57:

If the macro finishes normally or you exit the macro early, then control transfers to this section. A cmx command closes the open matrix. The status symbol is set to zero, indicating a normal exit, and control is returned to the menu interface.

Processing the D2 dimension using macros

The macro below shows an example of how to process the D2 dimension of a 2D data set. This macro is appropriate for States data. This example assumes that 512 FIDs (256 complex points in D2) were collected in the D2(t1) dimension and that a real matrix was used with 1024 × 1024 points. (The line numbers are for reference only and are not included in the actual file.)

1   c**simpled2.mac
2   cmx
3   def matrix `my2dfile.mat'
4   mat &matrix w
5   ;
6   ty D2(t1) Processing.
7   ty ---------------------
8   ;
9   for col 1 &d1size
10  esc out
11  if &out ne 0 quit
12  loa &col 0
13  def datype 1
14  def datsiz 512
15  zf 1024
16  ss 256 90
17  ft
18  ph
19  red
20  sto &col 0
21  ty Col=&col
22  nex
23  quit:
24  ty D2(t1) transform completed.
25  ty ------------------------------
26  cmx
27  ret
28  end

Lines 2-3:

The cmx command is used to close any open matrix files. Then the mat command is used to open the appropriate matrix file for writing.

Lines 9-22:

This section forms the main processing loop. Since this macro operates on the D2 dimension, a column must be processed for each D1 point value. The for loop increments the symbol col from 1 to the value of the symbol d1size. When a matrix is opened, the value of the symbols d1size, d2size, etc. are set to the number of points in the corresponding dimension. Thus, the symbol d1size is automatically set to the number of points in the D1 dimension and is therefore a logical choice to use as the maximum value of the for loop.

Lines 10-11:

These two lines show how the esc command can be used to allow you to interrupt a macro during execution. (Such checking slows down the processing, so you may prefer not to use it.) The esc command monitors keyboard input for the escape character. When the <Esc> key is pressed the user-defined symbol (in this case the out symbol) is set to one. An if statement transfers control out of the loop when the out symbol is no longer equal to zero.

Line 12:

The loa command is used to load the next column into the workspace for processing.

Lines 13-15:

Since this is States data, the datype symbol is set to 1, indicating complex data. Because the data are now complex, the data size (datsiz symbol) must be reduced by half: a vector contains only half as many complex points as the number of real points it held when it was first read in. If the data size is not reduced by half, the second half of the workspace will contain invalid data. The data size is then doubled by zero-filling, so that the second half of the FID contains zeros. The data size in complex points (datsiz) is now equal to the number of points in the D2 dimension of the matrix.
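For the concrete sizes in this example (a 1024 × 1024 real matrix with 512 FIDs collected in D2), the size bookkeeping on each column works out as follows:

loa &col 0                1024 real points in the workspace
def datype 1, datsiz 512  reinterpreted as 512 complex points
zf 1024                   zero-filled to 1024 complex points
ss 256 90, ft, ph         apodized (over the 256 complex points of acquired data), transformed, and phased
red                       real part kept: 1024 real points, stored back to the matrix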

Lines 16-18:

The data are then apodized, Fourier transformed, and phased. The apodization is over 256 complex points, because the 512 FIDs collected in the D2 dimension correspond to 256 complex points in D2. Because this is States data, a complex FT is performed.

Lines 19-21:

The data are then reduced so that only the real part of the FID is retained. The data size (datsiz parameter) remains unchanged and is equal to the number of points in the D2 dimension of the matrix. These real data are then stored in the matrix. A type command (ty) is used to print out the current column so that you can monitor the progress of the macro.

Lines 23-28:

If the macro finishes normally or you exit the macro early, then control transfers to this section. A cmx command is given to close the open matrix. Control is returned to the menu interface.

Processing 2D data with the supplied macros

The supplied precoded processing macros (in the Process pulldown) provide an alternative to writing your own processing macros. These macros are designed to process the most common kinds of 2D data. These macros do not provide the same flexibility inherent in writing your own custom macros, but they do offer the capability to quickly process 2D data collected with most of the typical acquisition schemes.

Access the precoded 2D processing macros with the ProcessND/Open and Process 2D command. FELIX displays a file-selection control panel, in which you select both the data type (FELIX old- or new-format data (.dat) file, FELIX matrix (.mat) file, Bruker fid or ser file, Varian fid file, or JEOL Alpha or Lambda file) and the filename.

If FELIX can access the data, it opens a second control panel with the header parameters. The acquisition parameters can be valid only if the appropriate files are present (for Varian, the procpar file; for Bruker, the acqus and acqu2s files, and the program also attempts to read the pdata/1/procs and proc2s files). In this control panel, you can update the header parameters if they show incorrect values.

When you select the D1 dimension for processing, FELIX displays a menu of choices which define how the D1 dimension is to be processed. For more information on each of the various processing options see Chapter 5., Processing, visualization, and analysis interface (1D/2D/ND). When you click OK, the macro first builds a real matrix of the size that you specified with the Dimension 1 Size and Dimension 2 Size parameters. The macro then reads in each of the FIDs from the specified input file, processes the data according to the options you selected, and saves the processed vector to the matrix. The macro completes when each of the FIDs, as specified by the D2 Parameters Data Size header parameter, has been processed.

When the D1 macro finishes processing the data, you will have a FELIX matrix file (normally with a .mat extension) with each of the D1(t2) vectors processed. At this point you can process the D2(t1) vectors by again selecting the Macro/2D Data Processing command and choosing the matrix from the previous processing.

Then, FELIX displays a menu of choices, which in this case defines how the D2 dimension will be processed. When you click OK to begin processing the D2 dimension, the macro opens the matrix you have selected, reads each of the D2 vectors into the workspace, processes each vector according to the options you selected, saves the processed vector back to the matrix, and closes the matrix. When the D2 processing macro is finished, you will have a FELIX matrix file where all of the D1 and D2 vectors have been processed.

Checking/examining the data as they are being processed

At each stage of the transformation process, before proceeding to the next, check the data and verify that everything is correct.

Whether you choose to process your data using the precoded processing macros or your own custom macros, the process of transforming your data in the D1 dimension is basically the same. The macro generally first builds a new real matrix to hold the data, reads in each FID from the input data file into the workspace, processes each FID in turn, and stores the result to the matrix.

1.   The first check is to make sure that the raw data are correct or that the spectrometer-format data were correctly converted to a FELIX old- or new-format data file. To do this, you can use the File/Open command to examine the individual FIDs that make up the input data file. By changing the Dimension parameter between 1D and ND, you can control whether this action always reads the first FID in the series or whether subsequent reads load successive FIDs from the input data file.

2.   Once you have verified that the input data file is correct, you should experiment with processing the first FID in the series to determine the optimum processing parameters for the D1(t2) dimension. Only after you have determined a good set of starting parameters to use for the D1 dimension should you proceed to processing the entire D1 dimension using a matrix.

Note: At this point in the processing, you will have a FELIX matrix file where all the D1 vectors have been processed but none of the D2 vectors have been processed.

3.   Examine the matrix at this point to make sure that the D1 processing is acceptable. Open the matrix file, extract 1D slices (using the View/Draw 1D Slices command), and then examine the individual 1D vectors using commands such as View/Limits/Manual Limits. The View/Draw 1D Slices command is especially helpful because it allows you, through the use of a slider, to scan through the individual D1 vectors in order, to determine if the data needs to be reprocessed.

4.   If you decide that the basic processing operations such as apodization and solvent suppression were performed correctly, then you generally do not have to fully reprocess the D1 dimension. It is possible to rephase or baseline flatten the existing data.

5.   You can write your own macro to rephase or baseline correct the data in the D1 dimension. Once you have determined that the D1 dimension has been processed correctly and have reprocessed the data if necessary, you should look at some of the D2 vectors. Select some peaks in the first transformed row that you expect to have strong cross peaks and note the point numbers. Alternatively, you could click the Vertical 1D Slice icon to select an initial column vector to view, then use the left and right arrows on the keyboard to scan through a series of column vectors noting the D1 point numbers for a few columns which show a strong interferogram.

6.   It is often desirable to load these selected columns into the workspace using the View/Limits/Manual Limits command and then save them to FELIX data files. Use these saved column vectors to determine the optimum processing parameters for the D2 dimension. Once you have decided on the optimum processing parameters for the D2 dimension, you can move on to processing all the D2 vectors using your own macros or the precoded processing macros.

Note: Whether you process the D2 dimension with your own macros or the precoded macros, the basic process is the same. The macro opens the appropriate matrix, reads each column vector into the workspace one by one, processes the vector, saves the vector back to the matrix, and closes the matrix file.

7.   After you have transformed the D2 dimension, you need to examine the individual D2 vectors by either selecting the View/Limits/Manual Limits command or clicking the Vertical 1D Slice icon. If the D2 vectors need to be rephased or baseline corrected, you can do this with your own macros or the precoded macros in the Macro pulldown as discussed above for the D1 dimension.


Task: Analysis of relaxation data

Throughout this documentation, we use the terms R1 and R2 for the relaxation rates, which are simply the inverse of the relaxation times T1 and T2.

We assume that R1 and R2 relaxation data and heteronuclear NOE were measured as a series of 2D HSQC (or equivalent) spectra (Skelton et al. 1993). You must pick peaks in one of the relaxation spectra or in an equivalent HSQC. Peak assignments are helpful, but not necessary.


Evaluate peak heights or peak volumes

The first step in relaxation analysis is to evaluate peak heights or peak volumes in a series of 2D spectra using the Measure/Relaxation/Measure Heights/Volumes command. In practice, peak heights have proved more reliable and yield better signal to noise ratios (Skelton et al. 1993). FELIX automatically centers the peak boxes (if they were picked in another spectrum and are slightly off) and then determines the peak heights or volumes in the relaxation spectra.

For peak-volume determination, the raw volumes are optimized by fitting a 2D Gaussian function to the spectra and deriving volumes from the fit. In the first spectrum of the series, peak widths are optimized, then peak amplitudes, and finally both are optimized simultaneously. In subsequent spectra, centers and widths are left as they are, and only amplitudes are adjusted. For peak-height determination when peak boxes overlap, the peak box is iteratively shrunk by one point in each dimension until no other peak box overlaps it, to prevent cross talk from any overlapping peak maxima.

Thus volumes and heights are accurate, within the limits of the methodology, even for overlapped peaks, and the procedure requires no interaction on your part. Peak heights or volumes are stored in a regular FELIX volume table, except that the last row holds relaxation delays instead of mixing times.
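The box-shrinking step for overlapped peak heights can be sketched as follows. This is an illustrative interpretation, not FELIX source code: boxes are hypothetical (x0, x1, y0, y1) point-index tuples, and each overlapped box is shrunk symmetrically, one point per dimension per pass, until it no longer overlaps any other box.

```python
# Sketch (not FELIX code): isolate a peak box before height measurement by
# shrinking it until it overlaps no other box, so a neighboring maximum
# cannot leak into the measurement.

def overlaps(a, b):
    """True if two boxes (x0, x1, y0, y1) share any points."""
    return not (a[1] < b[0] or b[1] < a[0] or a[3] < b[2] or b[3] < a[2])

def shrink_until_isolated(boxes, i):
    """Return a shrunken copy of box i that overlaps no other box
    (or the smallest box reachable if it cannot be isolated)."""
    x0, x1, y0, y1 = boxes[i]
    while any(overlaps((x0, x1, y0, y1), b)
              for j, b in enumerate(boxes) if j != i):
        if x1 - x0 <= 0 or y1 - y0 <= 0:
            break  # cannot shrink any further
        x0, x1, y0, y1 = x0 + 1, x1 - 1, y0 + 1, y1 - 1
    return (x0, x1, y0, y1)
```

The same idea generalizes to 3D boxes by adding a (z0, z1) pair.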


Estimate signal-to-noise ratio

The next step is to estimate the signal-to-noise ratio from one or more duplicate spectra, using the Measure/Relaxation/Signal/Noise Ratio command. The average standard deviations obtained for these spectra are interpolated to the other points in the series as appropriate. The time courses of peak heights/volumes that you obtain in this way can be visualized and plotted by:

Selecting the Measure/Relaxation/View Timecourse via Cursor command and then selecting a peak with the mouse

or

Selecting the Measure/Relaxation/View Timecourse via Item command and entering a peak number

If the time series is already fitted to an appropriate exponential function, FELIX displays this function, and reports the relaxation rate in the text window.

FELIX stores duplicate peak volumes and heights in their own volume table. This is essentially a copy of the original volume table, with the appropriate column overwritten with values from the duplicate spectrum. FELIX records uncertainties in the last row of this table.
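One common way to turn a pair of duplicate spectra into a per-measurement uncertainty is sketched below. This is the standard paired-difference estimator and may differ in detail from the statistic FELIX actually records.

```python
# Sketch (assumption: not the exact FELIX statistic): for each peak the two
# duplicate heights should agree to within the noise, so the standard
# deviation of the paired differences, divided by sqrt(2), estimates the
# uncertainty of a single height measurement.
import math

def duplicate_uncertainty(heights_a, heights_b):
    """Estimate sigma of one height measurement from duplicate spectra."""
    diffs = [a - b for a, b in zip(heights_a, heights_b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return math.sqrt(var / 2.0)  # each difference carries two noise terms
```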


Fit data to exponential function

After obtaining peak heights and volumes and their errors, you can fit these time series to an appropriate exponential function. Use the Measure/Relaxation/Fit R1/R2/NOE command. For R1 data, this is a general exponential function:

Eq. 21    I(t) = a0 + a1 exp(a2 t)

where R1 = -a2.

For R2 data, where in theory the values decay to zero, the simpler equation:

Eq. 22    I(t) = a0 exp(a1 t)

where R2 = -a1 may be appropriate. In practice, the more general Eq. 21 often yields statistically better fits (Skelton et al. 1993). FELIX tries both fits and retains the coefficients for the fit with the lower χ2 (chi-squared) value.
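The try-both-and-keep-the-better-fit procedure can be sketched with SciPy, assuming the common parameterizations I(t) = a0 + a1 exp(a2 t) for Eq. 21 (so R1 = -a2) and I(t) = a0 exp(a1 t) for Eq. 22 (so R2 = -a1); the function and parameter names are illustrative, not FELIX's.

```python
# Sketch (illustrative, not FELIX's fitter): fit a time course to both
# exponential models and keep the one with the lower chi-squared.
import numpy as np
from scipy.optimize import curve_fit

def eq21(t, a0, a1, a2):          # general exponential, R1 = -a2
    return a0 + a1 * np.exp(a2 * t)

def eq22(t, a0, a1):              # decays to zero, R2 = -a1
    return a0 * np.exp(a1 * t)

def fit_relaxation(t, heights, sigma):
    t, heights = np.asarray(t, float), np.asarray(heights, float)
    best = None
    for model, p0 in ((eq21, (0.0, heights[0], -1.0)),
                      (eq22, (heights[0], -1.0))):
        popt, _ = curve_fit(model, t, heights, p0=p0)
        chi2 = float(np.sum(((heights - model(t, *popt)) / sigma) ** 2))
        if best is None or chi2 < best[2]:
            best = (model.__name__, popt, chi2)
    return best  # (model name, coefficients, chi-squared)
```

In both parameterizations the relaxation rate is the negative of the last coefficient.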


Analysis of heteronuclear NOE data

Heteronuclear NOE data are analyzed in a single step. Select the Measure/Relaxation/Fit R1/R2/NOE command. After obtaining the filenames for the spectra obtained with and without 1H saturation, FELIX again centers the peak boxes (see above) and then proceeds to determine peak heights or volumes as desired. Next, FELIX analyzes two duplicate spectra to derive the uncertainty of the measurement. The NOE is the ratio of the volumes obtained with and without 1H saturation.
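The final ratio and its uncertainty follow from standard error propagation for a quotient; this sketch is illustrative and not FELIX code.

```python
# Sketch (illustrative): heteronuclear NOE as the ratio of intensities with
# and without 1H saturation, with its uncertainty from the duplicate-derived
# sigmas of the two measurements.
import math

def het_noe(i_sat, i_ref, sigma_sat, sigma_ref):
    """Return (NOE, sigma_NOE) for one peak."""
    noe = i_sat / i_ref
    sigma = abs(noe) * math.sqrt((sigma_sat / i_sat) ** 2 +
                                 (sigma_ref / i_ref) ** 2)
    return noe, sigma
```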


Preparing input data files for Modelfree

In addition to generating R1, R2, and NOE values, the relaxation analysis tools provided in FELIX let you prepare input data files for the Modelfree program of A. G. Palmer (accessed via the Measure/Relaxation/Modelfree Input command). Modelfree can be downloaded from Columbia University's website at
http://cpmcnet.columbia.edu/dept/gsas/biochem/labs/palmer

For further details on analyzing relaxation data with the Modelfree program, see Mandel et al. (1993).


Task: Working with assignment databases

The assignment work starts with defining a project. It is highly advisable to define only one project within one database file. The project then contains all the information needed in the course of the work, such as spectrum definitions, last display parameters, and entities necessary to store the assignment information.

The definition of the project starts with reading in an Insight II or X-PLOR coordinate file of the molecule (or complex) being studied. This coordinate file does not necessarily need to be the true structure, especially since at the beginning of the assignment the final structure is not usually known. As the assignment progresses and refined structures become available, the new coordinates can be read into the project.

The crucial part of residue identification is the library. The library is an ASCII file, where the mean chemical shifts and standard deviations are listed for the usual residues. You can change these values and can also add non-standard residues to the end of the file. This procedure is illustrated later. This file is then read into the project and used in spin-system identification.

You can define the matrices that are used in the assignment. These matrices should be referenced in ppm and are in the standard FELIX format. In adding the experiments, it is important to define a reasonable tolerance for each axis, since the automated spin-system detection routines rely heavily on them. This tolerance should mirror the uncertainty of the peak positions in each dimension (in ppm). Later on you can delete unused experiments from the database and add new ones. For example, you can start with J spectra (TOCSY, DQF-COSY, HSQC-TOCSY, or CBCA(CO)NH) to do the sequence-specific resonance assignment. Once that is done, you can delete the J spectra and define NOE-type experiments (e.g., NOESY or HSQC-NOESY) to do peak assignment and restraint generation.


Task: Adding modified residues to the Assign database

If you work with non-standard amino acids or nucleic acids, you can add these new residue types to the database by editing two files in the data/felix/asglib/ folder under the installation folder (see "Starting FELIX" on page xix for more details about the installation directories). One file contains the residue definition (atoms, median chemical shifts, standard deviations, and topology); the other contains the alias names.

Editing the residue-definition file

The residue-definition file is called pd.rdb or rna.rdb. For example, to add an aminobutyric acid residue, you should go to the end of the pd.rdb file, which looks like:

!
!
! ROOM FOR MORE RESIDUE TYPES ...
!

1.   Each line that starts with an exclamation mark is a comment. So first you add the formula as a comment:

     !
! ABU (aminobutyric acid)
!
! H - N    H
!     |    |
! H - Ca - Cb - Cg - H3
!     |    |
! O = C    H
!

2.   The next part of the file is the residue name:

     RESIDU ABU X

where X is just a one-letter shortcut for the residue name.

3.   Now you need to enter the atoms:

First enter the keyword RESATM, then the atom type, atom name (in Insight notation), a number (this field is not active yet), a mean chemical shift, a standard deviation, and finally, if there is more than one atom (e.g., a pseudoatom) in that group, enter the number of atoms.

This section of the file should now look something like:

     RESATM H HN    1    8.16    0.60
     RESATM H HA    2    4.30    0.20
     RESATM H HB1   3    3.30    0.20
     RESATM H HB2   3    3.30    0.20
     RESATM H HG*   4    1.10    0.10   3
     RESATM N N     8  126.0     5.0
     RESATM C CA   10   51.0     5.0
     RESATM C C     9  150.0    50.0
     RESATM O O    15    0.0   999.0
     RESATM C CB   12   26.0     5.0
     RESATM C CG   12   17.0     5.0

4.   Next is another comment line that contains the atoms used to help construct the neighbor matrix:

     !      HN HA HB1 HB2 HG* N CA C O CB CG

5.   The next part of the file is the connectivity matrix:

Each element shows how many bonds are between the constituent atoms (groups).

     CONECT 0
     CONECT 3 0
     CONECT 4 3 0
     CONECT 4 3 2 0
     CONECT 5 4 3 3 0
     CONECT 1 2 3 3 4 0
     CONECT 2 1 2 2 3 1 0
     CONECT 3 2 3 3 4 2 1 0
     CONECT 4 3 4 4 5 3 2 1 0
     CONECT 3 2 1 1 2 2 1 2 3 0
     CONECT 4 3 2 2 1 3 2 3 4 1 0
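A hand-edited connectivity block is easy to get wrong, so a short script can check its shape. This is a hypothetical helper, not part of FELIX; it assumes the lower-triangular layout shown above, where line i carries i+1 bond counts and ends with the zero self-distance.

```python
# Hypothetical sanity check for a hand-edited CONECT block (not FELIX code).
# Line i of the lower-triangular matrix should hold i+1 numbers, and the
# last entry on each line (the atom's distance to itself) must be 0.

def check_conect(lines):
    rows = [list(map(int, ln.split()[1:]))
            for ln in lines if ln.split()[:1] == ["CONECT"]]
    for i, row in enumerate(rows):
        assert len(row) == i + 1, f"row {i}: expected {i + 1} entries"
        assert row[-1] == 0, f"row {i}: diagonal entry must be 0"
    return len(rows)
```

Feeding it the eleven CONECT lines of the ABU example returns 11 without raising.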

6.   Finally a keyword finishes the residue definition:

     ENDRES

Editing the alias file

The other file to edit is the biosym.alias file.

7.   Append a line to the alias file that contains the names of that residue for all the charged/terminal forms.

After these two files are edited, Assign can deal with the new residue type.


Task: Peak picking

Usually the first step in assignment is peak picking. FELIX has two types of peak pickers: the regular one and one that uses example peaks to distinguish genuine peaks. Depending on the data, it is probably advisable to try several rounds of peak picking using both methods, and then use whichever filtering functions (such as symmetrizing, deleting the diagonal, and merging multiplets) you prefer. Good peak sets are crucial for the success of the analysis procedure.

After picking peaks, you may want to use peak optimization to better define the peak centers, widths, and/or volumes. To do this you need to first measure the peak volumes. Sometimes the peak optimizer can merge close peaks; therefore, it is advisable to measure the peak volumes interactively and save the peak entity in a backup file within the database. By defining small regions where the optimizer should work and examining the results, you can see if the peaks get misplaced too much or merged unnecessarily. Then you can retrieve the saved copy and omit optimization for those peaks.


Task: Using connected frames to navigate among multiple spectra within Assign

In assignment work you often need to analyze multiple spectra simultaneously. To help with this, FELIX allows you to connect multiple frames so that the definition of plot limits in one frame automatically triggers the same changes in the other connected frames.

To connect frames, use the Preference/Frame Connection menu item. To illustrate the process, assume you have a CBCANH and a CBCA(CO)NH experiment. The CBCANH was transformed so that D1 is 13C, D2 is 15N, and D3 is 1H. Assume that CBCA(CO)NH was transformed so that D1 is 1H, D2 is 13C, and D3 is 15N.

Also assume that you want to see 1H-13C slices, and that CBCANH is the experiment in Frame 1 and CBCA(CO)NH in Frame 2. In Frame 1 you would select the CBCANH spectrum from the Experiment table and in Frame 2 you would select the CBCA(CO)NH spectrum. After going back to Frame 1 and selecting the Preference/Frame Connection menu item you would do the following:

1.   In the control panel, leave the First Frame set to 1 and the Second Frame as 2.

2.   For the connection method, click the D1-D2-D3<=>D1-D2-D3 radio button and then click OK.

Now, if you zoom in on a region in the CBCANH spectrum (Frame 1), the same plot limits are defined in the CBCA(CO)NH spectrum (Frame 2). Similarly, if you step to another plane in, for example, the CBCA(CO)NH spectrum (Frame 2), a plane at the same 15N frequency is displayed in the CBCANH spectrum (Frame 1).

Certainly, if you have 2D, 3D, and 4D spectra to connect, you can use this interface to define quite sophisticated schemes.

The following is a theoretical example of how to connect a 2D spectrum with 3D spectra:

Assume you have a 2D 15N-1H HSQC spectrum with 15N as D2 and 1H as D1. Assume also that you have 15N TOCSY and 15N NOE spectra, where D1 is the amide 1H, D2 the full 1H, and D3 the 15N dimension. You want to look through the amide peaks in the HSQC spectrum and then see the corresponding slices in the TOCSY and NOE spectra.

1.   First, select the HSQC spectrum from the Experiment table for Frame 1 and the TOCSY and the NOE spectra, respectively, for Frame 2 and Frame 3.

2.   In the first control panel leave First Frame set to 1 and Second Frame set to 2. This connects the HSQC spectrum with the TOCSY spectrum. Select the Custom option and click OK.

3.   In the next control panel, set:

First Frame 1        Second Frame 2

D1                   D1
D2                   Null

4.   Toggle Define Jump to on and for Jump Direction set:

First Frame 1        Second Frame 2

D2                   D3
Null                 Null

5.   Click OK.

6.   Next, connect the HSQC spectrum with the NOE spectrum using the first control panel (which should appear again). Leave the First Frame set to 1 and set the Second Frame to 3. Select the Custom option and click OK.

7.   In the next control panel, set:

First Frame 1        Second Frame 3

D1                   D1
D2                   Null

8.   Toggle Define Jump to on and for Jump Direction set:

First Frame 1        Second Frame 3

D2                   D3
Null                 Null

9.   Click OK.

Linking all these spectra assures that plot navigation in the HSQC spectrum (e.g., zooming in on a peak) translates to the same movement in the HSQC-TOCSY and HSQC-NOE spectra. If you press the period <.> on the keyboard while in the HSQC frame you will be able to use the cursor to select a new 15N position (e.g., clicking on a HSQC peak), which then translates to new planes also in the TOCSY and NOE spectra.

If in addition you want to navigate the TOCSY and NOE spectra together, you need to set that up by using the Preference/Frame Connection menu item again.

10.   Make Frame 2 current. Select the Preference/Frame Connection menu item. In the first control panel, leave First Frame and Second Frame set to 2 and 3, respectively. Select D1-D2-D3<=>D1-D2-D3 as the method and click OK. After the control panel appears again, click Cancel. This way you have connected all three axes of the TOCSY and NOE spectra.


Spin systems

The key step in sequential assignment is spin-system detection. In FELIX spin systems are detected in three ways:

Automated spin-system detection

The first type of spin system results from automated spin-system detection. These are called prototype patterns, or protos. A proto can be intraresidue or can contain spins from neighboring residues, depending on the spectrum and the method by which the data were collected.

Frequency clipboard

The second type of spin system is the frequency clipboard: a unique list of frequencies that originates from a proto, from a concatenation of several protos, or from frequencies picked by hand from a spectrum. The clipboard is typically used when you want to delete spurious resonances from a spin system or manually add resonances that were missed.

Pattern

The third type of spin system is the pattern. Patterns can be scored against a library to obtain a residue-type probability score, and they can have sequential neighbors. The frequencies in a pattern can then be assigned to atoms (or atom groups) in the molecule. Patterns can be edited, but with more limited functionality.

At each stage, the spin system can be visualized by spawning tile plots or strip plots or by drawing lines along the frequencies. Visual interaction is an important element in assignment, since automated routines are not 100% reliable.


Task: Manual spin-system collection

You can collect spin systems manually; that is, by picking peaks in a spectrum and promoting the chemical shifts to the clipboard. You would typically do this if there were unresolved or missed spin systems left after automated spin-system detection. Then you would go through the spectrum (or spectra) and use the controls in Assign/Frequency Clipboard to collect the peak positions as frequencies in the clipboard. After each spin system is collected this way, you then need to copy it to patterns, then clear the current spin system and start over with the next spin system.


Task: Automated spin-system detection

The automated prototype pattern-detection routines work on one or several peak-picked spectra, depending on the method selected. You should optimize the values of several parameters by trial-and-error. To find as complete a set of spin systems as possible, several iterations are necessary. Guidelines for selecting prototype pattern parameters are given below.

Tolerances

One of the most important variables in automated spin-system detection and neighbor-finding methods is the tolerance. A spectrum-specific tolerance for each experiment must be entered at the beginning. This number represents how well peaks line up in each dimension for any given frequency, that is, how much the centers of peaks arising from the same spin system differ along a particular dimension. As a rule of thumb, this number is roughly equivalent to the digital resolution in ppm (e.g., first try the ppm equivalent of one point). The tolerance is used as a boundary: every peak whose center lies within the target frequency plus or minus the tolerance is assigned to that frequency.

A second tolerance is also required during automated spin-system detection, which is used to decide whether a peak belongs to a spin system. If you use the homonuclear method with one spectrum, this tolerance is defined by the resolution of the dimension with the worse digital resolution in ppm, usually F1. If you use multiple spectra, this tolerance should be based on a comparison of the chemical shifts of peaks belonging to the same connectivity in different spectra. For example, if you measure the positions of a few of the "same" peaks in the NOESY and TOCSY spectra, the differences define the tolerance. If the automated methods do not give satisfactory results, the first thing to try changing is the tolerance.
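The tolerance test itself is simple. As a sketch (illustrative names, not FELIX code), peaks are matched to a target frequency like this:

```python
# Sketch (illustrative): every peak whose center lies within
# target +/- tolerance is considered to belong to that frequency.

def peaks_within_tolerance(peak_centers, target, tol):
    """Return indices of peaks (positions in ppm) matching the target."""
    return [i for i, c in enumerate(peak_centers)
            if abs(c - target) <= tol]
```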

It can also be of benefit to choose one base spectrum and re-reference the other spectra against it so that known peaks overlay optimally. This can significantly increase the success of the automated spin-system detection and peak-assignment routines.

2D systematic search method

The 2D systematic search method can be used on different combinations of COSY, TOCSY, and NOESY type spectra. In the first iteration for a protein, the spin systems are usually collected starting from a Ha-Hb region, and you must exclude spin systems having frequencies in the aromatic/amide region. Similarly, if you want to search for aromatic side chains, you must use the aromatic region as a seed and exclude the aliphatic region by filtering.

Between iterations, you can selectively delete prototype patterns that are clearly not right, thus releasing peaks, or you can delete all of them. Also, it is useful to note that after you assign or define several reliable patterns, then those patterns' frequencies can be copied back to prototype patterns, thus preventing you from re-detecting them in the following iterations.

2D simulated annealing method

This search should begin with the longest spin systems. Since the algorithm tries to fit peaks into a defined motif, it does not account for possible additional correlated frequencies, which means that an AMX portion of a long spin system could be assigned to a four-spin system. Initially, the program should be run on the whole residue set of the primary sequence (which automatically takes the above priorities into account), and the resulting patterns should be examined with the usual interactive tools. It should then be rerun on specific missing amino-acid types. To compensate for the limited number of iterations in simulated annealing, the process should be run for several loops (typically 6), from which the program retains the best results. One loop of the program for the whole sequence of a 53-residue protein requires about 10 min of computation time on an R4000 Silicon Graphics Indigo workstation. For aromatic residues, this method assigns only the AMX subsystems; therefore, the aromatic resonances should be found with the systematic search method and added through the clipboard.

Double-resonance methods

Double-resonance spin-system detection is implemented in two different ways: one method which finds spin systems starting from the backbone, and one which finds spins in existing spin systems and extends them.

The first method works on 15N-1H HSQC-TOCSY and 15N-1H HMQC-TOCSY spectra. For this method, the spectrum should contain pseudodiagonal peaks (that is, HN-HN-N peaks), from which the spin systems are collected along the sidechain. If the pseudodiagonal is not well resolved or peaks are missing, you can use a 15N-1H HSQC spectrum to help: the program tries to find new frequencies in the 3D spectrum starting from each peak in the 2D spectrum and stores them as spin systems. However, if the 15N-1H HSQC spectrum is not well resolved, this method is less effective.

In the second spin-system detection method, you can use double resonance experiments to extend already existing spin systems--usually these are results of a triple-resonance spin-system detection run. Here, the purpose of the detection is to expand spin systems containing only backbone frequencies, to include sidechain information. Currently one method is implemented in FELIX which uses a 3D HCCH-TOCSY spectrum to achieve this. You can expand spin systems in the prototype pattern stage or in the pattern stage.

Triple-resonance methods

A variety of triple-resonance measurements can be used for making sequence-specific assignments. Several approaches are implemented, which share some common elements: usually you must first peak-pick several triple-resonance spectra and determine the uncertainty in the peak positions within each spectrum for a few resonances. This is the spectrum-specific tolerance for each axis. You also need to determine, by inspecting the spectra simultaneously, how well the peaks belonging to the same resonances can be overlaid. For example, you need to know, for a given residue x, the differences between the peak frequencies along the HN and 15N axes, and along Ca, in the HNCO, HNCA, and HN(CO)CA spectra (HN,x-Nx-C'x-1, HN,x-Nx-Ca,x, and HN,x-Nx-Ca,x-1, respectively). These are the interspectrum tolerances for the respective axes.

After that you can run one of the spin-system detection routines starting with, for example, a tolerance that is a little less than these inter-spectrum tolerances, and then telling the program to iterate and increase the tolerance at each iteration step. Depending on the spectrum quality, you can automatically get 50-90% of the theoretical spin systems. These spin systems may not be perfect; therefore, you need to inspect them before promotion and discard or alter any inadequate spin systems. Also, you can manually add missed spin systems to the list of prototype patterns. There is an advantage, since usually triple resonance experiments are measured as (at least) pairs, so if you add spin systems at this stage, you can manually add frequencies belonging to this residue as well as to the neighboring residue. As long as you let the program know what those frequencies are, then Assign will establish these connectivities in the promotion stage.

User-definable automated spin-system detection

In addition to the predefined combinations of experiments given here, you can design your own spin-system detection methods. Typically you would design your own protocol if you have a good-sensitivity 2D or 3D double- or triple-resonance spectrum (e.g., 15N HSQC or HNCO), known as a "seed" spectrum, together with a couple of double- or triple-resonance 3D "secondary" spectra (e.g., HNCA and HN(CO)CA). The typical procedure is to start from the seed spectrum with each peak as a trial peak and try to collect connected resonances in the secondary spectra. For example, if you want to detect spin systems using a combination of HNCA and HN(CO)CA experiments, you can use the following procedure (assuming that the order of dimensions is HN, N, Ca):

1.   First, select the Assign/Collect Prototype Patterns/User Settable menu item and set the Number of steps to 1, since the seed peak selection is considered as the zeroth step. Select the HN(CO)CA spectrum as the seed spectrum (Step # 0) and the HNCA spectrum as the first step.

2.   Since you need to match the HN and N frequencies in the two spectra for the corresponding peaks in the second control panel (Experiment hncoca in Step 0), you select the following for the Match to parameters:

Match to     hnca   null   ...   null

hncoca D1    D1     No     ...   No
hncoca D2    D2     No     ...   No
hncoca D3           No     ...   No

3.   For the D1 Type select HN, for D2 Type select N, and for D3 Type select null. The typical tolerances would be comparable to the differences between the chemical shifts of the same HN-N-Ca peak in the two spectra, for example, 0.02, 0.15, and 0.15 ppm for D1 Tol, D2 Tol, and D3 Tol, respectively. You can perform selective detection using chemical shift ranges to reflect that, or you can use the full spectrum using -1 for all Range variables. Typically, several trials are needed from each peak, to find a prototype pattern.

The next set of variables reflects this, that is, if with the given tolerances not enough frequencies (Minimum Freqs in Proto) are collected from a particular peak, then new iterations are carried out (up to Number of Iterations times) increasing the tolerances by the Tolerance Factor. You can also direct the program to delete peaks that may belong to already detected spin systems (Remove First is then True). For this, you would select 3 for the Number of Iterations and 1.2 for the factor by which the tolerances are multiplied after each unsuccessful trial (you need 4 frequencies).

4.   In the next control panel you must decide how many peaks to store and which frequencies to store. Here, we need to use the D3 frequency (the 13C dimension) (Use D3) and you need to store the frequencies for two peaks (Store 2). The Type variable is active if you select the All option for Store (it is useful in, for example, a 15N-HSQC-TOCSY experiment, because the program would not assign the resonances along the TOCSY line but would call them HXs). You set the first peak's frequency to store as Ca (this peak in the HNCA spectrum is usually more intense: #1 Ca and Attr Larger). The second peak's D3 frequency then results in the Ca,i-1 frequency (#2 Ca(-1) and Attr Smaller). Besides the magnitude of the intensity, you can use the sign of the intensity as a distinction (as for the HNCACB spectrum, where the HN-N-Ca and HN-N-Cb peaks have opposite signs) or use a distinctive chemical shift range.

This procedure makes it possible to collect spin systems automatically in the user-defined combination of spectra.
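Steps 1-4 above can be summarized in a schematic loop. The data structures and names below are hypothetical, not FELIX's: each seed peak in the HN(CO)CA list is matched against HNCA peaks on D1 (HN) and D2 (N); if too few frequencies are collected, the tolerances are widened by the tolerance factor and the search is repeated, up to the given number of iterations.

```python
# Sketch (hypothetical data structures, not FELIX code): seed-based
# prototype-pattern collection across two triple-resonance spectra.
# Peaks are (HN, N, Ca) positions in ppm.

def collect_protos(seed_peaks, secondary_peaks, tols=(0.02, 0.15),
                   min_freqs=2, n_iter=3, factor=1.2):
    protos = []
    for seed in seed_peaks:
        t1, t2 = tols
        for _ in range(n_iter):
            matches = [p for p in secondary_peaks
                       if abs(p[0] - seed[0]) <= t1
                       and abs(p[1] - seed[1]) <= t2]
            if len(matches) >= min_freqs:
                # keep the seed plus the matched D3 (Ca) frequencies
                protos.append((seed, [p[2] for p in matches]))
                break
            t1, t2 = t1 * factor, t2 * factor  # widen tolerances and retry
    return protos
```

Distinguishing Ca from Ca,i-1 by relative intensity (as in step 4) would be one further comparison on the matched peaks.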


Task: Semiautomated spin-system detection

You can collect spin systems using a semiautomated method. Here, you use the cursor to select a peak in one spectrum, and the program then tries to extend this trial spin system in the spectra that you connected to it.

To illustrate: assume you have a 2D 15N-1H HSQC spectrum with 15N as D2 and 1H as D1. Moreover assume that you have a 15N TOCSY, where D1 is the amide 1H, D2 the full 1H, and D3 the 15N dimension. You want to select an amide peak in the HSQC spectrum and then let the program collect frequencies in the corresponding slice of the 15N TOCSY spectrum.

1.   First you select the Assign/Collect Prototype Patterns/Semiautomated Setup menu item and set the Number of frames/steps to 2, putting the HSQC in Frame 1 and the HSQC-TOCSY in Frame 2.

2.   In the second control panel, select the following for the Connect to parameters:

Connect to     #2   #3   #4

Frame 1 D1     D1   No   No
Frame 2 D2     No   No   No

3.   Now set the Slice Position in Frame 1 parameter to D2 and the Sliceplane in #2 to Along D3. That means that you want to use the D2 coordinate of the HSQC spectrum to select a plane in HSQC-TOCSY along D3.

4.   Next you must specify how to match the different frequencies. Here, you need to match the D1 of the HSQC to the D1 of the TOCSY, and the D2 of the HSQC to the D3 of the TOCSY spectrum (HN to HN and 15N to 15N). You also need to specify the spin types you expect in this first spectrum, together with search tolerances: HN with 0.02 and N with 0.1.

5.   You can also set the number of iterations and a tolerance factor, which specify that the spin-system collection is tried up to that many times, increasing the tolerances by that factor each time (you can set them to 3 and 1.4, for example).

6.   In the third control panel you would set the Orientation parameter D1-D2 to D3: 118 ppm. You would set the Connect to parameters as follows:

Connect to     #2   #3   #4

Frame 2 D1     D1   No   No
Frame 2 D2     No   No   No
Frame 2 D3     No   No   No

7.   Now set the Use parameter to D2 since you want to collect new frequencies along D2 of the HSQC-TOCSY spectrum (with D1 and D3 defined by the HSQC) and Store All with type HX.

This setup then would make it possible to zoom in on a peak in HSQC and on a strip in HSQC-TOCSY. Moreover, if you want the 15N position from HSQC transferred to 3D spectra, you need to press the < . > key to activate the hidden Jump function. This function gives you a cursor: clicking the required peak (or 15N position) with this cursor triggers a jump to a new slice. Finally, if you find a peak in the HSQC spectrum where you want a new spin system to be collected, you need to select the Assign/Collect Prototype Patterns/Semiautomated Collect menu item (for which the hotkey is =) and click that peak with the crosshair cursor. If the program can find connected peaks in the HSQC-TOCSY spectrum, it collects the spin system and shows it in the prototype pattern table.


Task: Spin-system extension

Most of the automated methods in the Assign module collect spin systems containing backbone atoms. To automatically extend the spin systems to include atoms from the sidechain, you may want to run the routines accessed from the Assign/Collect Prototype Patterns/Extend Prototype Patterns menu item.

To illustrate this method, assume that the spin systems were collected so that the prototype patterns contain the HN, N, Ca, and Cb shifts. Then you can use a 3D HCCH-TOCSY spectrum to automatically find the corresponding Ha, Hb frequencies. In the Extend Prototype Pattern Using HCCH-TOCSY control panel, you would select appropriate tolerances for the search along the 13C and 1H dimensions (taking into account how well the Ca and Cb resonances align between the spectra they were detected from (e.g., CBCANH) and this spectrum). You have to set how many times the extension will be attempted from each prototype pattern (Number of iterations) and how much to increase the tolerances each time (Tolerance factor). You also have to define the primary and secondary search dimensions for protons, as well as the carbon dimension. The primary proton dimension is where the heteronuclear transfer was done, and the secondary dimension is where the TOCSY was done.

You can use the extend option to add Ha,i-1 and Hb,i-1 frequencies to spin systems containing HN and N frequencies using the HBHA(CO)NH spectrum (Extend Prototype Pattern Along One Axis).


Task: Spin-system promotion

As soon as reliable spin systems are detected, you can promote them to patterns. This can be done in two ways.

One way is to copy the protos one by one, using the clipboard. This is the preferred method for spin systems detected in homonuclear or heteronuclear double-resonance spectra, since it is wise to visually inspect and correct the results of automated methods. After inspecting and correcting the frequency clipboard you can verify that the new "pattern" is unique by using the fuzzy algebra comparison control panel (Assign/Frequency Clipboard/Compare Frequencies menu item).

The other way, which might be more dangerous if you did not verify the prototype patterns, is to copy all prototype patterns to patterns directly. If you choose this method, you must delete false prototype patterns before copying. This method is preferred for prototype patterns resulting from triple-resonance heteronuclear spectra, since the neighbor information contained in protos detected in triple resonance spectra can be preserved this way.


Task: Spin-system identification

The next step in peak assignment is to identify the spin systems-the patterns resulting from an automated or manual search. This can be done in Assign based on all-atom chemical shifts contained in the database, by using either a simple scoring algorithm or by using a probability distribution for Ca/Cb chemical shifts (Grzesiek and Bax 1993). The latter method gives better scores, since the proton chemical shifts are not so well dispersed for different residue types, but this method can only be used if labeled proteins are accessible.
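As a rough sketch of the all-atom scoring idea (an assumed Gaussian scorer; FELIX's actual algorithm may differ), each pattern frequency can be scored against the library's mean and standard deviation for a candidate residue type:

```python
# Sketch (assumption: simple Gaussian likelihood, not FELIX's exact
# algorithm): score a pattern against one residue type from the
# chemical-shift library.
import math

def type_score(shifts, library_entry):
    """shifts: {atom: ppm}; library_entry: {atom: (mean, std)}.
    Returns the product of Gaussian densities over the shared atoms."""
    score = 1.0
    for atom, ppm in shifts.items():
        if atom in library_entry:
            mean, std = library_entry[atom]
            z = (ppm - mean) / std
            score *= math.exp(-0.5 * z * z) / (std * math.sqrt(2 * math.pi))
    return score
```

Ranking these scores over all residue types gives the kind of type-probability list described above; carbon shifts discriminate better than proton shifts because their ranges are more dispersed.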

The all-atom method gives scores that are not very distinguishable from each other, but, combined with the sequential probability scores, it still gives a good starting point for the sequential assignment-generation step. To help with the assignment, it is advisable to manually unset the very unlikely residue-type probabilities, based on a DQF spectrum.


Task: Establishing connectivities

This is the next step in the sequential assignment strategy. Usually this step uses the NOE effect as its basis, but in triple-resonance spectra this connectivity shows up in J peaks, so the J-peak-based method is more reliable. Nevertheless, because of the difficulties encountered during labeling, and because a significant portion of the connectivities do not show up for spectroscopic reasons, the NOE-based methods are still important.

The Assign module includes several menu items that deal with the sequential connection of patterns. It is advisable to have as complete a set of patterns as possible when you start to find sequential connectivities. It is also important that the spectrum-specific shifts be set correctly for the NOE spectrum that is to be used to find the sequentials. For the homonuclear neighbor-finding methods, the root frequency for each pattern should also reflect the real (average) HN frequencies of the NOE spectrum.

You can use the Assign/Neighbor/Find Neighbor Via 2D NOE or 3D NOE menu item to connect the aromatic sidechains with the aliphatic sidechains. You usually end up with prototype patterns in which the aliphatic and aromatic parts of the same residue appear as two different entries; therefore, you promote them to separate patterns. You must then find which aliphatic sidechain has several contacts to the Ha's (of Tyr or Phe), which should be the root frequency for that particular pattern. Once you find the correct connection (possibly by visual inspection using tiling), you must merge the two patterns (you can use the Assign/Frequency Clipboard/Copy Pattern To Clipboard and then the Assign/Frequency Clipboard/Copy Clipboard To Pattern menu items). Lastly, you must delete the purely aromatic sidechain pattern(s).


Task: Sequential assignment

The Assign/Sequential/Systematic Search menu item can be used to match the sequence against a set of patterns that have at least residue-type probabilities and sequential probabilities assigned. The algorithm is flexible, so several strategies can be followed. One approach is to generate all possible assignments for the entire molecule (Min length of assigned stretches = full sequence). Here, the assumption is that the best scores will result from the correct assignment, but the Min neighbor prob score variable should be set to 0, since there can be missing sequentials. This can mean that the number of possible assignments is very large (Kleywegt et al 1993), but restricting the Max # of assignments to generate variable to a small number (100-1000) can give a usable result.

The other approach would be to assign shorter stretches (20-30 residues), still keeping the neighbor probability score comparatively small (e.g., 0.1). This method has the disadvantage that you cannot assume that the first solution is the correct one; therefore, you must check all high-scoring possibilities.

The third approach is called "iterative assignment by consensus" (Kleywegt et al 1993): assignment generation starts by restricting the neighbor probability score to high values and letting the Min length of assigned stretches be relatively small.

Thus, well-connected stretches are found first; then the neighbor score is gradually relaxed and the minimum length is increased, so that longer, well-defined stretches are assigned. This procedure continues until no new assignments can be obtained. You can then try to assign the remaining stretches specifically. At all stages, "consensus" means that those residues are assigned which are conserved in a majority of the generated assignments.
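The consensus step itself reduces to a simple vote over the generated candidate assignments. The sketch below assumes each candidate is a mapping of residue number to pattern id; names and the majority threshold are illustrative, not FELIX's implementation.

```python
from collections import Counter

def consensus(candidates, threshold=0.5):
    """Keep residue->pattern choices made by more than `threshold`
    of the candidate assignments (dicts of residue number -> pattern id)."""
    accepted = {}
    residues = set().union(*candidates)
    for res in residues:
        votes = Counter(c[res] for c in candidates if res in c)
        pattern, count = votes.most_common(1)[0]
        if count / len(candidates) > threshold:
            accepted[res] = pattern        # conserved in a majority
    return accepted
```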

The Assign/Sequential/Simulated Annealing menu item can be used on any set of homonuclear or heteronuclear patterns. It uses only the type and neighbor scores, obtained by any method, to find the sequence-specific assignment. Optionally, previous assignments are loaded and respected. The amino acid type and/or residue number are considered assigned for a pattern if they are consistent over all frequencies of the pattern (unique or specified assignments).

After careful inspection of the patterns and scoring of types and neighbors, the process might be run on the full sequence. Then you might inspect the result, modify it using the Pattern Assign functions, perhaps try another run, and identify some satisfactory parts from the scores listed. You should then discard the ambiguous assignments and rerun the program with the correct residues used as anchor points. If several such iterative processes fail to unambiguously determine the complete assignment, then some additional information should be input, such as more accurate scoring or some new patterns.

The results are stored as assignment pointers for all frequencies of the patterns (and set as the current specified frequencies). There should not be any residue named "null" in the molecule, or its assignment will be discarded.

Optionally, some parameters of the simulated annealing might be adjusted (scaled by a factor of 0.1 to 10) according to the complexity of the problem:

Initial temperature, number of iterations: if most parts of the sequence are well defined, these parameters can be decreased to speed up the program.

Sequential/Individual factor: the weight accorded to the neighbor information, relative to the spin-system fit scores.
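These parameters map naturally onto a toy simulated-annealing sketch: patterns are permuted over sequence positions, and a swap is accepted if it improves the combined type + neighbor score, or with Boltzmann probability otherwise. The scoring functions and cooling schedule are placeholders, not FELIX's actual scores or schedule.

```python
import math, random

def total_score(order, type_score, neigh_score, seq_weight=1.0):
    """Spin-system fit plus weighted neighbor score (Sequential/Individual factor)."""
    s = sum(type_score(pos, pat) for pos, pat in enumerate(order))
    s += seq_weight * sum(neigh_score(order[i], order[i + 1])
                          for i in range(len(order) - 1))
    return s

def anneal(patterns, type_score, neigh_score, t0=1.0, n_iter=2000, seed=0):
    rng = random.Random(seed)
    order = list(patterns)
    score = total_score(order, type_score, neigh_score)
    for step in range(n_iter):                        # number of iterations
        t = t0 * (1.0 - step / n_iter) + 1e-6         # linear cooling from t0
        i, j = rng.randrange(len(order)), rng.randrange(len(order))
        order[i], order[j] = order[j], order[i]       # propose a swap
        new = total_score(order, type_score, neigh_score)
        if new >= score or rng.random() < math.exp((new - score) / t):
            score = new                               # accept
        else:
            order[i], order[j] = order[j], order[i]   # reject: undo
    return order, score
```

If most of the sequence is already well defined, fewer iterations and a lower initial temperature suffice, which is exactly the speed-up the control panel parameters allow.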


Task: Resonance assignment

When you assign particular resonances or frequencies in the patterns, you need to use the Insight II atom names if you plan to use NMR_Refine to generate or refine structures. If you make the assignments through the control panels, the atom names are automatically correct. The usual form of the so-called nmrspec is: 1:RESIDUENAME_RESIDUENUMBER:ATOMNAME(NUMBER) (e.g., 1:VAL_4:HN). If you need to use pseudoatoms, then the specification is: 1:RESIDUENAME_RESIDUENUMBER:ATOMNAME(NUMBER)*.

For example, one of the methyl groups in valine would be named 1:VAL_4:HG1*, which encompasses atoms 1:VAL_4:HG11, 1:VAL_4:HG12, and 1:VAL_4:HG13. The methyl group in alanine would be named 1:ALA_23:HB*, which encompasses methyl protons 1:ALA_23:HB1, 1:ALA_23:HB2, and 1:ALA_23:HB3.
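The pseudoatom convention lends itself to a small expansion sketch. The hard-wired multiplicity of 3 below assumes a methyl group; real code would consult the residue topology, and the function name is an assumption.

```python
def expand_pseudoatom(nmrspec, multiplicity=3):
    """Expand a trailing-* pseudoatom spec to individual atom names,
    e.g. '1:VAL_4:HG1*' -> ['1:VAL_4:HG11', '1:VAL_4:HG12', '1:VAL_4:HG13']."""
    if not nmrspec.endswith("*"):
        return [nmrspec]                  # already a single atom
    stem = nmrspec[:-1]                   # drop the '*'
    return [stem + str(i) for i in range(1, multiplicity + 1)]
```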


Task: Peak assignment

After the resonance assignment is finished, you can try to assign your peaks (usually in an NOE spectrum). It is very important to have spectrum-specific shifts for all patterns that are as good as possible, since otherwise peak assignment can be very ambiguous. You can adjust the spectrum-specific shifts manually or automatically. Preferably you would do it both ways: first automatically, then manually for what was missed. After this is done you can make auto-assignments.

You can approach the automated assignment in different ways: you can use a linear chain as a model and assign only intraresidue and sequential peaks, as well as peaks having unique frequency assignments (that is, peaks that have only one possible assignment in each dimension); or you can use a homology model or a low-resolution starting model (for example, from X-ray crystallography) and assign based on those distances; or you can make ambiguous assignments and later use them as overlap restraints in SA or rMD protocols.

After a set of assignments is generated and (possibly manually) verified, structures are generated based on them. You can then refine your assignments based on the new model or based on hotspots (where restraints are highly violated or the structure is bad).


Task: Restraint generation

A crucial step in generating NOE distance restraints is to define suitable scalar peaks, that is, those peaks for which the assigned atoms are in a rigid part of the molecule and which are associated with well defined distances. These peaks are often referred to as "reference peaks" and are used to calibrate the conversion of peak intensities (volumes) to distances. As a rule of thumb, it is always safe to use good-intensity, clean, non-overlapped peaks as scalars. One category would be Hb1 to Hb2 methylene peaks in the same residue. You can also use an intraresidue HN-Ha peak as a scalar, since the variability of this distance is small across different secondary structures. If you use either single mixing time (Single tm) volumes or volume buildups through fitting (Fit First N tm) for calculation of the restraints (Calculation Method), you need to have scalar peaks for which the distance is roughly equal (e.g., use only methylenes or only intraresidue HN-Ha peaks).
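The single-mixing-time calculation rests on the isolated-spin-pair approximation, in which the NOE volume scales as r**-6: given a reference (scalar) peak of known distance, an unknown distance follows from r = r_ref * (V_ref / V) ** (1/6). The sketch below illustrates that relation generically; it is not FELIX's internal code.

```python
def volume_to_distance(volume, ref_volume, ref_distance):
    """Distance from NOE volume via the r**-6 isolated-spin-pair relation,
    calibrated against a scalar peak of known distance (e.g. a methylene)."""
    return ref_distance * (ref_volume / volume) ** (1.0 / 6.0)
```

The sixth-root dependence is why the method is forgiving of volume errors: a 64-fold drop in volume only doubles the calculated distance.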

On the other hand, if you use an empirical fit of volumes versus distances (Calculation Method is Empirical Fit), then you typically need several types of peaks, representing some short, some medium, and some longer distances. FELIX then fits an empirical function through the volume/distance pairs, and each volume is converted using this empirical fit.

If you use a 2D NOESY spectrum, you may want to control which peaks' volumes are converted to distance restraints (Symmetry Selection). Use All converts all peaks (restraints are then generated only for the first occurrence of each symmetric pair), Select Regions uses different sides of the diagonal in certain regions of the spectrum, and Use Weaker uses only the weaker peak of a symmetry-related pair.
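The Use Weaker rule can be sketched as follows: each cross peak and its mirror across the diagonal share a sorted shift-pair key, and only the weaker volume is kept, since the stronger one is more likely inflated by overlap. The function name and rounding tolerance are illustrative assumptions.

```python
def use_weaker(peaks, digits=2):
    """peaks: dict mapping (f1, f2) shifts -> volume.
    Returns one entry per symmetry-related pair, keeping the weaker volume."""
    out = {}
    for (f1, f2), vol in peaks.items():
        # a peak and its diagonal mirror collapse onto the same sorted key
        key = tuple(sorted((round(f1, digits), round(f2, digits))))
        if key not in out or vol < out[key]:
            out[key] = vol
    return out
```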

Converting the calculated distance to a restraint can happen in many different ways (Method): Exact Distance, S-M-W Bins, VdW-Exact, Percentage.
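A bin-style method such as S-M-W Bins can be sketched as follows: rather than restraining to the calculated distance itself, the peak is classified as strong, medium, or weak, and loose per-bin bounds are emitted. The cutoff and bound values below are common rule-of-thumb choices, not FELIX defaults.

```python
def distance_to_bounds(distance):
    """Map a calculated distance to (lower, upper) bounds in Angstroms
    via strong/medium/weak bins (illustrative bound values)."""
    if distance < 2.7:
        return (1.8, 2.7)   # strong
    elif distance < 3.3:
        return (1.8, 3.3)   # medium
    else:
        return (1.8, 5.0)   # weak
```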

There is always a danger in doing a totally automated conversion: there can be overlapped peaks whose measured volumes do not represent the real volumes. To avoid over-restraining when using such peaks, FELIX can handle them differently if you use the Partial Overlap option, which checks how much of the area of integration is overlapped for each peak. You can then define a threshold (Area Threshold) above which the algorithm either skips generating restraints from the overlapped peaks (Discard) or uses a different method to generate bounds from the calculated distances (e.g., Use as Qual, which generates only qualitative upper-bound restraints).
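The overlap handling can be summarized in a small decision sketch: each peak carries the fraction of its integration area shared with other peaks, and above the threshold it is either discarded or demoted to a qualitative upper bound. The function name, the +10% upper-bound margin, and the 5.5 A qualitative bound are illustrative assumptions.

```python
def restraint_from_peak(distance, overlap_fraction,
                        threshold=0.2, overlap_action="qual"):
    """Return (kind, lower, upper) bounds in Angstroms, or None if discarded."""
    if overlap_fraction <= threshold:
        return ("quantitative", 1.8, distance * 1.1)   # normal bounds
    if overlap_action == "discard":
        return None                                    # skip overlapped peak
    return ("qualitative", 0.0, 5.5)                   # upper bound only
```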


Task: Checking and redefining restraints

After assigning NOE peaks and generating restraints, you would typically run structure calculations either by using distance geometry (the DGII command within Insight II's NMR_Refine module) or simulated annealing (the MD_Schedule command within Insight II's NMR_Refine module).

Usually the assignment/restraint generation and structure calculation are done in an iterative way. That means that, after generating a set of structures, you need to analyze the structures and find the so-called hotspots (e.g., where many restraints are violated).

Those hotspots can result from misassignment or from overlapped peaks. For misassignments, you use a list to reassign (or unassign) those peaks. You can use a simple ASCII file that just has the two (or three or four, depending on dimensionality) names per row, separated by blanks. You can also create a list within Insight II using this procedure:

1.   First load the restraints on all the refined molecules in Insight II, using the NMR_Refine module's Restraints/Read molname* command.

2.   Execute the NMR_Refine module's Distance/List command. On the resulting output file, run the provided numvioltofelix script, which redirects the output to another file. This is the file you can use in the Filename parameter.

3.   Proceed as in the previously described Manual Assign Singly action, except that, instead of your selecting the peaks with the cursor, the peaks are automatically centered in your current frame and the corresponding control panel appears.
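The simple ASCII reassignment list mentioned above (2-4 blank-separated names per row, one peak per row) can be parsed with a few lines; the function name and error handling here are illustrative, not part of FELIX.

```python
def read_assignment_list(text, ndim=2):
    """Parse rows of blank-separated atom names, one peak per row."""
    peaks = []
    for line in text.splitlines():
        names = line.split()
        if not names:
            continue                       # skip blank lines
        if len(names) != ndim:
            raise ValueError(f"expected {ndim} names, got: {line!r}")
        peaks.append(tuple(names))
    return peaks
```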

Hotspots can also be due to erroneous restraints. If so, the NOE Distance Redefine option in the Measure/DISCOVER Restraints menu can help you loosen, tighten, or delete the restraints showing the highest violations (or the most violations within a family). To do this you must:

1.   Load the restraints on all the refined molecules in Insight II, using the NMR_Refine module's Restraints/Read <molname*> command.

2.   Execute the NMR_Refine module's Distance/List command. On the resulting output file, run the provided numvioltofelix script, which redirects the output to another file.

3.   Use the Measure/DISCOVER Restraints/NOE Distance Redefine command on this file. Specify the Restraint entity that you want to work with (usually msi:noe_dist) and the Buildup Rate Calculation Method. The program brings up a violation table, through which you can zoom in on each peak whose defined restraint was violated and report the calculated distance. The table also contains the restrained values and the violation statistics.

4.   From the violation table you can redefine the restraints, delete restraints, or delete assignments.