High-Throughput Robust Analysis for Capillary Electrophoresis
For 1D reactivity data, each modifier profile has replicates, so we recommend checking their consistency first. The first 8 lanes after the 2 nomod lanes are DMS, and we can plot the 4 (-) ligand lanes together with:
plot(normalized_reactivity(:, 3:6))
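Besides eyeballing the plot, replicate agreement can be checked numerically. The following is a minimal sketch, not part of HiTRACE: it uses synthetic data in place of normalized_reactivity(:, 3:6), and the variable names (lanes, threshold, low_pairs) and the 0.9 cutoff are assumptions made for illustration.

```matlab
% Synthetic stand-in for normalized_reactivity(:, 3:6): 4 replicate lanes.
% (Illustrative data only; substitute your own lanes in practice.)
base = [0.1; 0.8; 0.3; 1.2; 0.05; 0.6; 0.9; 0.2];
lanes = [base, base * 1.05, flipud(base), flipud(base) * 0.95];

% Pearson correlation between every pair of lanes (columns).
R = corrcoef(lanes);

% Collect lane pairs whose correlation falls below the cutoff.
threshold = 0.9;            % assumed cutoff, adjust to your data
n_lanes = size(lanes, 2);
low_pairs = zeros(0, 2);
for i = 1:n_lanes-1
    for j = i+1:n_lanes
        if R(i, j) < threshold
            low_pairs(end+1, :) = [i, j];
            fprintf('Lanes %d and %d disagree (r = %.2f)\n', i, j, R(i, j));
        end
    end
end
```

Pairs listed in low_pairs are candidates for a closer look in the plot; a low correlation alone does not say which lane of the pair is the better one.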
HiTRACE offers the function average_data_filter_outliers(), which takes traces and initial error estimates; figures out outlier points and even outlier traces; and then returns reasonable final values and error estimates:
[d_DMS_minus, da_DMS_minus, flags] = average_data_filter_outliers( ...
normalized_reactivity(:, 3:6), normalized_error(:, 3:6), [], seqpos_out, sequence, offset);
This step only concerns error estimates across replicates. If your experiment has only one lane for each condition, you do not need to run this command; use normalized_error directly instead.
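To give a rough feel for what averaging replicates with outlier filtering means, here is a simplified sketch. It is not the algorithm that average_data_filter_outliers() uses: the median-based rule, the cutoff of 5 deviations, and all variable names are assumptions chosen for illustration, and the data are made up.

```matlab
% Illustrative replicates: rows = positions, columns = replicate lanes.
reps = [0.50 0.52 0.48 0.51;
        1.00 1.05 0.95 3.00;    % last value is an obvious outlier
        0.20 0.22 0.18 0.21];

n_pos = size(reps, 1);
avg = zeros(n_pos, 1);
err = zeros(n_pos, 1);
is_outlier = false(size(reps));

for k = 1:n_pos
    row = reps(k, :);
    med = median(row);
    mad_k = median(abs(row - med));          % median absolute deviation
    % Flag points more than 5 MADs from the median (assumed cutoff).
    is_outlier(k, :) = abs(row - med) > 5 * max(mad_k, eps);
    kept = row(~is_outlier(k, :));
    avg(k) = mean(kept);
    err(k) = std(kept) / sqrt(numel(kept));  % standard error of the mean
end
```

Note that with only two replicates per position this rule can never flag anything, because both values are equally far from their median.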
In the plot, the first 2 lanes (blue & green) are in good agreement, and so are the last 2 (red & magenta), but the two pairs do not agree with each other. In our experiment, 2 different modifier concentrations were tried, and in this case we think the first condition (e.g. DMS 1.0%) is better. We therefore average only the first 2 lanes of each condition:
[d_DMS_minus, da_DMS_minus, flags] = average_data_filter_outliers( ...
normalized_reactivity(:, 3:4), normalized_error(:, 3:4), [], seqpos_out, sequence, offset);
[d_DMS_plus, da_DMS_plus, flags] = average_data_filter_outliers( ...
normalized_reactivity(:, 7:8), normalized_error(:, 7:8), [], seqpos_out, sequence, offset);
[d_CMCT_minus, da_CMCT_minus, flags] = average_data_filter_outliers( ...
normalized_reactivity(:, 11:12), normalized_error(:, 11:12), [], seqpos_out, sequence, offset);
[d_CMCT_plus, da_CMCT_plus, flags] = average_data_filter_outliers( ...
normalized_reactivity(:, 15:16), normalized_error(:, 15:16), [], seqpos_out, sequence, offset);
[d_SHAPE_minus, da_SHAPE_minus, flags] = average_data_filter_outliers( ...
normalized_reactivity(:, 19:20), normalized_error(:, 19:20), [], seqpos_out, sequence, offset);
[d_SHAPE_plus, da_SHAPE_plus, flags] = average_data_filter_outliers( ...
normalized_reactivity(:, 23:24), normalized_error(:, 23:24), [], seqpos_out, sequence, offset);
For simplicity, we created variables (e.g. d_DMS_minus) to hold individual reactivity values. We can now make a figure and evaluate the data (see Step #7).
Here are some checkpoints to go over:
The 1D example has good GAGUA pentaloops: the SHAPE profile shows 5 reactive nucleotides, with protected stems flanking the GAGUA. If your profile looks different, your flanking sequence may be interfering with proper folding by interacting with the region of interest.
The 1D example here has attenuation issues. The CMCT profile is generally bad, the DMS profile is still attenuation-biased, and the SHAPE profile is better but not perfect.
Trace back where the negative values are. If there is a strong nomod background, they might be excusable. They could be due to degradation or RNA activity (e.g. self-cleavage). A consecutive region of negative numbers ‘flying off’ is a red flag. Slightly negative values (magnitude < 0.1) are nothing to worry about.
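The negative-value checkpoint can be partly automated. The sketch below is not a HiTRACE function: it runs on a made-up profile, and the names (small_cutoff, min_run, flagged) and the choice of 3 consecutive positions as a "run" are assumptions. It ignores slightly negative values and reports consecutive stretches of clearly negative ones.

```matlab
% Illustrative reactivity profile (not real data); e.g. use d_DMS_minus.
d = [0.30 -0.02 0.50 -0.20 -0.35 -0.40 0.10 -0.05 0.80];
d = d(:)';                  % force a row vector

small_cutoff = 0.1;         % values above -0.1 are treated as harmless
min_run = 3;                % assumed: 3+ consecutive positions is a run

is_bad = d < -small_cutoff; % clearly negative positions

% Locate the start and end of each consecutive block of bad positions.
edges = diff([0, is_bad, 0]);
run_start = find(edges == 1);
run_end = find(edges == -1) - 1;
run_len = run_end - run_start + 1;

% Keep only blocks long enough to be worth tracing back.
long_run = run_len >= min_run;
flagged = [run_start(long_run); run_end(long_run)]';
for k = 1:size(flagged, 1)
    fprintf('Negative stretch at indices %d-%d\n', flagged(k, 1), flagged(k, 2));
end
```

The reported numbers are indices into d, not sequence positions; map them through your own position vector before comparing with the nomod lanes.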