What's Happening? (part 1: Baseline Correction)
iNMR is, by design, extremely interactive. The user must learn it by doing, not by reading the manual or a textbook on NMR processing. You can understand everything by playing with the parameters and critically observing the visual feedback provided by iNMR. Perhaps you are curious to know more, in mathematical terms, or you are only at ease with math formulas: this article is for you. Even if that is not the case, a minimal knowledge of the underlying math can improve your observation skills.
First things first: it is not imperative to play with the controls. If the initial solution, automatically proposed by iNMR, satisfies you, just press the OK button. At present, the interactive dialogs include:
- Reference Deconvolution
- Automatic Integration
- Automatic Baseline Correction
- Manual Baseline Correction
With the single exception of the last one, they all respond to keyboard actions (arrow keys, +, -, Home, etc.). You can navigate through the spectrum and closely examine the effect of processing on each peak. Every dialog contains many adjustable controls. You can consult the manual to discover which controls matter most and which matter less. Always adjust the controls in this order: from the most important to the least important. The articles in this series will also provide this information.
Moving Average Filter
Automatic baseline correction and automatic integration share an algorithm that creates a binary map. The map has a Boolean variable for each point in the spectrum. Only two values are possible: YES (that point of the spectrum contains a peak) or NO (no peak here). This is a very crude approximation and can never yield accurate results. Precisely because we know the map can't be 100% correct, we minimize the damage. We force the error in the benign direction: we don't care if a point is mistakenly assigned to a peak (it does no harm), but at least we know that, when the map says a certain region is free of peaks, it is true. We accomplish this through a convolution of the spectrum with a function like: 00000000000000000111110000000000000000000000 (to give an idea). In other words, we generate the map not directly from the spectrum, but from a line-broadened version of it. In practice, we replace each point of the spectrum with the integral of the region around it. This operation is commonly called a “moving average filter”.
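The filter can be sketched in a few lines of Python. This is only an illustration, not iNMR's code: the function name `moving_sum`, the half-width argument `n` (the filter value) and the way the edges are handled are all assumptions made for the example.

```python
def moving_sum(spectrum, n):
    """Replace each point with the sum of itself and n neighbours on each side.

    Neighbours that would fall outside the spectrum are simply skipped
    (an assumption of this sketch; other edge treatments are possible).
    """
    size = len(spectrum)
    out = []
    for i in range(size):
        lo = max(0, i - n)
        hi = min(size, i + n + 1)
        out.append(sum(spectrum[lo:hi]))
    return out


# A single sharp "peak" at index 5 on a flat baseline.
spectrum = [0.0] * 10
spectrum[5] = 1.0

broadened = moving_sum(spectrum, 2)
print(broadened)
```

With a half-width of 2, the one-point peak spreads over indices 3 to 7, which is the broadening that later protects the tails of real peaks.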
In summary: iNMR duplicates the spectrum, applies the moving average filter to the copy, and creates the map from the copy. iNMR uses the map to find regions of pure baseline (for baseline correction) or to find peaks (for automatic integration). The regions marked as "peak" are wider if you have used a large value for the moving average filter, and narrower otherwise. For example, if the value of the filter is 2, the point at position 100 is replaced by the sum of points no. 98, 99, 100, 101 and 102 (2 on each side). As a very rough description, if the value of the filter is n, you are adding n points to the left and n points to the right of each region labeled as “peak”, points that are stolen from the baseline. This is very important, because the algorithm that creates the map does not recognize the tails of the peaks and tends to treat them as baseline. The filter compensates for the errors of the mapping algorithm.
If you really want to know how the map is created, iNMR calculates, in this order:
- the derivative (for each point);
- the square of it (for each point);
- the average of the squares;
- points whose squared derivative is too high (compared to the average) are marked as peak and therefore removed;
- to recalculate the average over the remaining points, iNMR returns to the third step (looping).
Dietrich, W.; Rudel, C.H.; Neumann, M.
Fast and Precise Automatic Baseline Correction of One- and Two-Dimensional NMR Spectra
J. Magn. Reson., 1991, vol. 91, pages 1-10
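The loop listed above can be sketched as follows. This is a minimal illustration, not iNMR's implementation: the first difference stands in for the derivative, and the `threshold` factor, the `max_iterations` cap and the rule of stopping when nothing changes are hypothetical choices that the article does not specify.

```python
def peak_map(spectrum, threshold=3.0, max_iterations=50):
    """Return one Boolean per point: True = peak, False = baseline.

    `threshold` and `max_iterations` are hypothetical values for this sketch.
    """
    size = len(spectrum)
    # Steps 1 and 2: squared derivative (approximated by the first difference).
    squared = [0.0] * size
    for i in range(1, size):
        d = spectrum[i] - spectrum[i - 1]
        squared[i] = d * d
    if size > 1:
        squared[0] = squared[1]  # the first point has no left neighbour

    is_peak = [False] * size
    for _ in range(max_iterations):
        remaining = [s for s, p in zip(squared, is_peak) if not p]
        if not remaining:
            break
        mean = sum(remaining) / len(remaining)  # step 3: average of the squares
        changed = False
        for i in range(size):  # step 4: remove points that are too high
            if not is_peak[i] and squared[i] > threshold * mean:
                is_peak[i] = True
                changed = True
        if not changed:  # step 5: loop until the map is stable
            break
    return is_peak


# Toy spectrum: small alternating "noise" plus a three-point peak.
spectrum = [0.01 * (-1) ** i for i in range(40)]
spectrum[19] += 1.0
spectrum[20] += 5.0
spectrum[21] += 1.0

marked = [i for i, p in enumerate(peak_map(spectrum)) if p]
print(marked)
```

On this toy input the loop needs more than one pass: the steepest points are removed first, which lowers the average and exposes the smaller slopes on the next pass.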
Monitoring the Filter
All the above processing can be visualized, using a 1-D spectrum or an extract, with the dialog for Automatic Baseline Correction. You must:
- uncheck the option “Fit to a Polynomial”;
- select the 7th degree from the leftmost menu;
- amplify the spectrum (press “+” on the keyboard) until the satellites are visible.
You'll see a (red) line over the spectrum. Where the map says there is pure baseline, the line follows the spectrum (that is, the two lines are ALMOST superimposed); where the map says there is a peak, the two lines are far apart. With a typical high-resolution 1H example, you will see that this holds when the value of the filter is 64 or 128 (the implied unit is points). Experiment with a value of 1: the map will be wrong and the (red) line will climb up the tails and the slopes of all peaks. There's no better demonstration of the necessity of the moving average filter.
How to perform the baseline correction
Remember that, in the general case, the line represents not the map but the baseline as fitted by iNMR. When you click the OK button, the line is subtracted from the spectrum. In practice it is neither necessary nor advisable to select the 7th degree. That menu is the last control you will touch. First decide whether to use a polynomial correction or not (the alternative is the smoothed baseline). Rules of thumb:
- In most 1-D cases you get better results with the smoother.
- In the 2-D case there is not so much difference, and it's not clear which is better. I prefer the polynomial; if a small degree polynomial is not enough, you should try the smoother.
- In the case of overcrowded spectra (e.g.: spectra of urine) you can find regions of pure baseline only at the margins; a small degree polynomial is the first choice, in such cases.
You can, however, try both solutions. Just remember that, generally speaking, the smoother prefers a larger filter. For example, if you get a good polynomial correction with filter=16, the smoother will probably not be as good with an identical filter, but may work better with filter=64.
The fundamental parameter is certainly the filter. Once you have found the best value for it, you can experiment with changing the degree. There is no need to iterate the adjustment. One pass is enough. What you can do, if you are curious, is to toggle the state of the top control and compare the polynomial correction versus the smoother.
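To make the polynomial option concrete, here is a minimal sketch of fitting a low-degree polynomial to the points that the map marks as baseline and subtracting it. It is plain least squares written for illustration; the function name and the scaling of the x axis are assumptions, and iNMR's own fitting (which also deals with outliers) is not reproduced here.

```python
def fit_polynomial_baseline(spectrum, is_peak, degree):
    """Least-squares polynomial through the points marked as baseline."""
    size = len(spectrum)
    # Scale x to [-1, 1] to keep the normal equations well conditioned.
    xs = [2.0 * i / (size - 1) - 1.0 for i in range(size)]
    m = degree + 1
    # Normal equations A c = b, built from baseline points only.
    a = [[0.0] * m for _ in range(m)]
    b = [0.0] * m
    for x, y, peak in zip(xs, spectrum, is_peak):
        if peak:
            continue
        powers = [x ** k for k in range(m)]
        for r in range(m):
            b[r] += powers[r] * y
            for c in range(m):
                a[r][c] += powers[r] * powers[c]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, m):
            f = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * m
    for r in range(m - 1, -1, -1):
        s = b[r] - sum(a[r][c] * coeffs[c] for c in range(r + 1, m))
        coeffs[r] = s / a[r][r]
    # Evaluate the fitted baseline at every point, peaks included.
    return [sum(c * x ** k for k, c in enumerate(coeffs)) for x in xs]


# Tilted baseline with one peak; the map (widened by the filter) hides 23..27.
spectrum = [2.0 + 0.1 * i for i in range(50)]
spectrum[25] += 10.0
is_peak = [23 <= i <= 27 for i in range(50)]

baseline = fit_polynomial_baseline(spectrum, is_peak, degree=1)
corrected = [y - b0 for y, b0 in zip(spectrum, baseline)]
print(round(corrected[25], 6), round(corrected[0], 6))
```

Because the fit samples the whole spectrum at once, a few mislabeled points carry little weight, which is consistent with the polynomial tolerating a smaller filter.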
What is the smoother?
To learn all about it, read:
Cobas J.C., Bernstein M.A., Martin-Pastor M. and Garcia Tahoces P.
A new general purpose fully automatic baseline-correction procedure for 1D and 2D data
J. Magn. Reson., 2006, vol. 183, pages 145-151
The smoother does what the word says. It starts from the experimental baseline and smooths it until it resembles a continuous function. Every point of the smoothed baseline is correlated both to the corresponding experimental point and to its nearest neighbors on the smoothed version. Where the map says that there is a peak, the experimental point is ignored and the point in the smoothed baseline is correlated to zero instead. In the original paper a weight (called lambda) is used to increase the correlation with the neighbors and decrease the correlation with the experimentally found baseline. You can read, after the fact, the value of lambda used by iNMR with the command “Edit/Copy/Processing” and then pasting the clipboard. The hidden formula is:
lambda = exp[ (9 - degree) * log( 4 * n ) / 9 ]
where n = number of complex points of the spectrum
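As a quick arithmetic check, the formula can be evaluated directly. The sketch below assumes that "log" is the natural logarithm, in which case the expression is the same as (4 * n) raised to the power (9 - degree) / 9. The function name and the value of n are hypothetical, and the degrees are chosen only to show the trend, not to list what the menu offers.

```python
import math


def smoother_lambda(degree, n):
    """lambda = exp[(9 - degree) * log(4 * n) / 9], with log taken as natural."""
    return math.exp((9 - degree) * math.log(4 * n) / 9)


n = 16384  # hypothetical number of complex points
for degree in (1, 3, 5, 7):
    print(degree, round(smoother_lambda(degree, n), 1))
```

A lower degree gives a larger lambda, and therefore a stiffer (smoother) baseline; a higher degree gives a smaller lambda and a more flexible one.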
Was this over-complication necessary? Not at all, but it hides the less critical parameter, recycles the "degree" from the polynomial, simplifies the interface and lets you easily toggle between the smoother and the polynomial. Certainly lambda can in theory take all values from 0 to 100,000 and beyond, while degree only leaves you 8 possibilities. From a different perspective, you have gained freedom of choice, because the original method was proposed and marketed as “fully automatic”.
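The idea of this section can be sketched as a weighted, penalized least-squares smoother. Several things here are assumptions: the sketch uses a Whittaker-style form with a first-difference penalty, it reads "the experimental point is ignored" as a weight of zero, and the name `smooth_baseline` is made up. It is not a transcription of the cited paper or of iNMR. It needs at least one point marked as baseline and a positive `lam`.

```python
def smooth_baseline(spectrum, is_peak, lam):
    """Minimise sum w_i (y_i - z_i)^2 + lam * sum (z_i - z_(i-1))^2.

    w_i is 0 where the map says "peak" and 1 elsewhere (assumed weighting).
    """
    size = len(spectrum)
    w = [0.0 if p else 1.0 for p in is_peak]
    # Tridiagonal system (W + lam * D^T D) z = W y, D = first differences.
    diag = [w[i] + lam * (1.0 if i in (0, size - 1) else 2.0)
            for i in range(size)]
    off = -lam  # every off-diagonal element
    rhs = [w[i] * spectrum[i] for i in range(size)]
    # Thomas algorithm: forward elimination...
    for i in range(1, size):
        f = off / diag[i - 1]
        diag[i] -= f * off
        rhs[i] -= f * rhs[i - 1]
    # ...then back substitution.
    z = [0.0] * size
    z[-1] = rhs[-1] / diag[-1]
    for i in range(size - 2, -1, -1):
        z[i] = (rhs[i] - off * z[i + 1]) / diag[i]
    return z


# Flat baseline with one peak; the map hides points 10..14.
spectrum = [3.0] * 30
spectrum[12] += 50.0
is_peak = [10 <= i <= 14 for i in range(30)]

baseline = smooth_baseline(spectrum, is_peak, lam=100.0)
corrected = [y - b for y, b in zip(spectrum, baseline)]
print(round(corrected[12], 6), round(corrected[0], 6))
```

Under the masked region the smoothed line is held up only by its neighbors, so any tail point that the map leaves unmasked pulls the line directly toward the peak.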
Back to the Filter
Why does the smoother require a larger filter? It is easier to answer a different question: why does the polynomial correction allow smaller filters? If the map has erroneously marked a few points as "baseline" while they are on the tail of some peak, the consequences can be small, because the fitting algorithm will recognize them as outliers and, if they represent only 1% of the sampled points, they will have a negligible effect. The smoother, instead, performs no sampling and no fitting. It “thinks” at a local level only. If a point has been attributed to the baseline, it will be subtracted. Period. It is therefore imperative to mark as “peak” as many points as possible, in other words to overdo the filtering.
What about 2D?
It looks more difficult, because you have no visual clue, but quite often you can rely on the fully automatic correction. In difficult cases you can work on extracts, or you can reload and reprocess everything, but that is time-consuming.