fullsibQTL
Tutorial
fullsibQTL
is an R
package to perform QTL mapping in outbred/outcrossing species using the model develop by Gazaffi et al. 2014. We consider as a mapping population a full-sib progeny (or F1 population) derived by a bi-parental cross between two non-homozygous parents, with a genetic map obtained from markers possibly showing different segregation patterns. We assumed that the phenotypic traits are continuous and have a normal distribution. Also, the genetic map must be previously made using the R package onemap
.
If you are not familiar with onemap
and its approach to build a map, we strongly encourage you to read onemap
vignettes (Margarido et al., 2007).
In this tutorial, to illustrate the usage of fullsibQTL
we analyze the dataset present by Gazaffi et al. (2014) when proposing the method. The purpose is to help users dealing with this package and understanding the outputs. It is not our intention to teach the basis doing a QTL analysis, so for a better understand about the method see Gazaffi et al. (2014).
Users of fullsibQTL
are supposed to have some experience with R, since the analysis is done using the command line. Previous knowledge of onemap
is also desirably, once the genetic map is obtained with this software. But by no means one needs to be on expert on any of them to use the package.
fullsibQTL
is available under the GNU General Public License, it is open-source and the code can be changed freely. It comes with no warranty. It is implemented as a package to be used under the freely distributed R
software, which is a language and environment for statistical computing
If you are not familiar with R
, we recommend reading the vignette Introduction to R. If you are not familiar with onemap
, we recommend the reading of vignette How to build a linkage map for outcrossing populations.
Moreover, there are three additional tutorials:
Citation
To cite the model for QTL mapping in a full-sib progeny:
Gazaffi, R., Margarido, G. R., Pastina, M. M., Mollinari, M., Garcia, A. A. F. (2014). A model for quantitative trait loci mapping, linkage phase, and segregation pattern estimation for a full-sib progeny. Tree Genetics & Genomes, 10(4), 791-801. (https://link.springer.com/article/10.1007/s11295-013-0664-2)
To cite this R package:
Article in submission
Functions
The following table presents the fullsibQTL
functions which can be directly accessed by users:
Type | Name | Description |
---|---|---|
input | read_outcross_pheno |
Reads the data file containing markers and phenotypic values |
create_fullsib |
Creates the object to perform QTL | |
interval mapping | im_scan |
Scan the genome using IM approach |
im_char |
Provides the genetic effects, LOD Score and the p-values of complementary testes | |
cofactors | cof_selection |
Selects cofactors using multiple linear regression |
cof_definition |
Defines locations on the genome to be used as cofactors | |
composite interval mapping | cim_scan |
Scan the genome using CIM approach |
cim_char |
Provides the genetic effects, LOD Score and the p-values of complementary testes | |
summaries (S3 Methods) |
print |
class fullsib : Prints a summary for the object of class fullsib |
print |
class fullsib_scan : Prints the result of im_scan or cim_scan |
|
summary |
class fullsib_scan : Summarizes the QTL search for im_scan or cim_scan |
|
plot |
class fullsib_scan : Plot the QTL profile for the mapped groups |
|
summary |
class fullsib_perm : Provides the threshold values for permutations test |
|
plot |
class fullsib_perm : Plot the empiral distribution for permutations test |
|
draw_phase |
Returns the linkage phase between QTL and markers | |
get_segr |
Returns the QTL segregation | |
r2_ls |
Provides the phenotypic proportion of each QTL mapped and altogether (least square estimation) |
Required data
The input file of fullsibQTL
is the same as the one required by onemap
package. If you have VCF or MAPMAKER format files, onemap
provides functions to read (onemap_read_vcfR
and read_mapmaker
) and export (write_onemap_raw
) them as the onemap
raw file. There are details about it at onemap
vignette.
Below there is an example of such a file. The header line indicates the data type, the following line with 10 5 0 0 2
indicates that the data has, respectively, 10 individuals, 5 markers, no chromosome information for these markers (indicated by 0), no physical position information for the markers (0), and two phenotypic traits (0). This file is somehow similar to a MAPMAKER/EXP
file (used a long time ago), but has additional information about the crosstype. It uses the symbol -
for missing data.
data type outcross
10 5 0 0 2
I1 I2 I3 I4 I5 I6 I7 I8 I9 I10
*M1 B3.7 ab ab - ab b ab ab - ab b
*M2 D2.18 o - a a - o a - o o
*M3 D1.13 o a a o o - a o a o
*M4 A.4 ab b - ab a b ab b - a
*M5 D2.18 a a o - o o a o o o
*pheno1 4.8 2.1 10.6 3.7 -3.7 -7.5 - 3.9 3.6 -5.8
*pheno2 5 -5.8 - 14.8 -2.4 -1.9 -1.1 8.1 -11.1 4.8
If your onemap raw data do not have the phenotypic information, you can also provide it separately in a matrix with individuals in rows and traits in columns. See example:
pheno <- read.table(system.file("extdata/pheno_example.txt", package = "fullsibQTL"),
header = T)
head(pheno)
#> ind Trait1 Trait2
#> 1 IND1 4.985 5.985
#> 2 IND2 4.616 5.616
#> 3 IND3 -0.133 0.867
#> 4 IND4 -5.002 -4.002
#> 5 IND5 6.815 7.815
#> 6 IND6 2.678 3.678
If one wants to see an example of the phenotypic data file format, type the following, that will show the location of a file that is distributed with the package:
The individuals in the rows must be the same as the ones used to build the linkage map, they are specified in the third row of onemap raw file. If you do not have phenotypic information for all of them, complete their rows with NA
.
Creating the working object
Once the file with raw data was created, it is necessary to import it and provide more details required for QTL mapping.
First you need to read the raw data with onemap::read_onemap
function. After, you need to combine it with another object with information of linkages groups, created with function create_fullsib
.
This is illustrated in the following example, where we import the raw data and a list with four linkages groups, with distances measure with
kosambi mapping function (Kosambi, 1944) (argument map.function
). Note that we also specified argument step
with value 1. This option is used for the computation of the conditional multipoint probability of QTL genotypes, which is used for QTL mapping (see Gazaffi et al. 2014 for details). If step=0
is used, the probabilities will be computed only in the position of each marker (allowing the analysis even with missing data).
library(fullsibQTL)
# Raw data
fs_data <- read_onemap( dir = "C:/workingdirectory", inputfile = "filename.raw")
## Information of linkage groups, step and mapping function
fsib <- create_fullsib( fs_data,
map.list = list( LG1_final, LG2_final, LG3_final, LG4_final),
step = 1, map.function = "kosambi")
# Or
## Provide the phenotypic information separately
fsib <- create_fullsib( fs_data,
map.list = list( LG1_final, LG2_final, LG3_final, LG4_final),
step = 1, map.function = "kosambi", pheno=pheno)
After loading the data, the user can look at the onemap
object’s summary:
For fullsibQTL
object’s:
If one wants to see an example of the data file format (raw data), type the following, that will show the location of a file that is distributed with the package:
#> Loading required package: onemap
#> Warning in rgl.init(initValue, onlyNULL): RGL: unable to open X11 display
#> Warning: 'rgl.init' failed, running with 'rgl.useNULL = TRUE'.
In this tutorial we will use as an example an R object distributed with the package. It has: a) raw data in the onemap
format (example_QTLfullsib
); b) four ordered linkage groups (LG1_final
, LG2_final
, LG3_final
, and LG4_final
); two objects with permutations, that will be explained later (named im_perm
and cim_perm
). They are the same ones described at Gazaffi et al. (2014).
To load the example data:
## Loading the example data
data("example_QTLfullsib")
## Checking if it worked
ls()
#> [1] "cim_perm" "example_QTLfullsib" "im_perm"
#> [4] "LG1_final" "LG2_final" "LG3_final"
#> [7] "LG4_final"
## Printing some of the loaded objects
example_QTLfullsib
LG1_final # If you found an error here, don't worry. It is because of onemap version incompatibility, we automatically fix it in the next steps
Again, the map was built with onemap
package. Now, we have to combine the raw data with the linkage groups (hereafter named LG) into a single object. This is done with function create_fullsib
:
fsib <- create_fullsib( example_QTLfullsib,
map.list = list( LG1_final, LG2_final, LG3_final, LG4_final ),
step = 1, map.function = "kosambi" )
Checking it:
fsib
#> This is an object of class 'fullsib'
#>
#> The linkage map has 4 groups, with 560 cM and 60 markers
#> No. individuals genotyped: 300
#> Group 1 : 140 cM, 15 markers (A, B1, B3, C, D1, D2)
#> Group 2 : 140 cM, 15 markers (A, B1, B2, B3, C, D1, D2)
#> Group 3 : 140 cM, 15 markers (A, B2, B3, C, D1, D2)
#> Group 4 : 140 cM, 15 markers (A, B1, B2, C, D1, D2)
#> And 5 unlinked markers
#>
#>
#> 2 phenotypes are avaliable for QTL mapping
#> Multipoint probability for QTL genotype was obtained for each 1 cM
This object contains a genetic map composed by four LGs, with their length and segregation patterns indicated in the output. There are also five unlinked makers. This might be helpful for further QTL analysis for species where the map does not cover the whole genome, using the single marker analysis. The object also has information for two phenotypes, with conditional probabilities calculated every 1 cM (using Kosambi function).
Interval mapping
To perform interval mapping, one should use two functions, im_scan
and im_char
. im_scan
scans all the genome testing for statistical associations between genotype and phenotype, i.e., trying to detect QTL. im_char
shows the genetic effects and their significance tests, as well as some additional tests to infer QTL segregation pattern.
IM Genome scan
To perform a genome scan:
Argument fullsib
refers to the object created using function create_fullsib
; lg
indicates which linkage groups must be scanned (another way of representing it is lg=1:4
). The function default is lg=all
, to scan every LG. The argument pheno.col
indicates which phenotype will be considered (it defaults to 1, the first trait), using an integer. The argument LOD=TRUE
indicates that the mapping result must be shown using LOD scores, if FALSE
, the scale will be shown as \(-log_{10}(p_{value})\).
The mapping result can be viewed by typing the object name:
im
#> lg pos.cM LOD model
#> M1 1 0 4.91459840 0
#> loc1 1 1 4.97917274 0
#> loc2 1 2 5.02580977 0
#> loc3 1 3 5.05401343 0
#> loc4 1 4 5.06380070 0
#> loc5 1 5 5.05567153 0
#> loc6 1 6 5.03060917 0
#> loc7 1 7 4.99010241 0
#> loc8 1 8 4.93589188 0
#> M2 1 10 4.79637977 0
#> loc10 1 10 4.79637977 0
#> loc11 1 11 4.78808670 0
#> [ reached getOption("max.print") -- omitted 548 rows ]
Another way to see the results is using print
and specifying the LG:
print( im, lg = 1 )
#> lg pos.cM LOD model
#> M1 1 0 4.9145984 0
#> loc1 1 1 4.9791727 0
#> loc2 1 2 5.0258098 0
#> loc3 1 3 5.0540134 0
#> loc4 1 4 5.0638007 0
#> loc5 1 5 5.0556715 0
#> loc6 1 6 5.0306092 0
#> loc7 1 7 4.9901024 0
#> loc8 1 8 4.9358919 0
#> M2 1 10 4.7963798 0
#> loc10 1 10 4.7963798 0
#> loc11 1 11 4.7880867 0
#> [ reached getOption("max.print") -- omitted 128 rows ]
The resulting object of im_scan
is a matrix with five columns and k lines, where k is the number of genome positions in which the conditional probabilities have been computed based on the step
argument of the earlier create_fullsib
function. In this object, the row names are the identification of each position (it can be the marker name itself or a pseudo-marker, with an automatically created name in the format loc
+ number (corresponding to a step between markers). The second column is the linkage group, and the third is the position (in centiMorgan). The fourth column has the LOD Score value (or the \(-log_{10}(p_{value})\)). The fifth column reefers to the model utilized for the computations (that will be detailed later).
For helping visualization of the results of QTL mapping, function summary(im)
can be used to print just the highest LOD score value (or the \(-log_{10}(p_{value})\)) for every LG. The user can set a threshold (thr
argument) to see only the scores above this. When there is more than one QTL in the same LG, the user has to manually look for it using some graphical visualization.
QTL mapping plot options
For graphical visualizations, there are two functions: plot
and plot_fullsibQTL
. Function plot
is simpler and it is a good option for quick observations. Function plot_fullsibQTL
is recommended for more detailed analysis; it has also an option for interactive analysis, allowing tools such as mouse zoom and hover.
To use function plot
:
Users can customize the output. For example, it is possible to plot in the same panel up to five mapping profiles. Moreover, the user can indicate the molecular marker’s position, change the labels, among others features. You can find more information about available options typing ?plot.fullsib_scan
or on its vignette, typing
Function plot_fullsibQTL
is more sophisticated and interactive. Try it with
To plot interactive graphics:
It will generate this interactive html graphic.
You can find more information about plot_fullsibQTL
in the vignette:
Covariates
To improve the mapping statistical power, it is possible to include covariates (as fixed effects) in the QTL model, in order to control the phenotypic variability caused by known sources other than genetic (e.g. some known plants did not receive the correctly irrigation). To set a full-rank matrix which controls the phenotypic variance, first you need to define an incidence vector, and, then, set it as an argument of the function im_scan
. An example follows:
IM Permutation test
The threshold for QTL detection can be set using permutation tests (Churchill and Doerge, 1994) adding the argument n.perm
in the im_scan
function. As the permutation test is a sampling process, it is recommended that users set a seed before the permutations, if they want to repeat the procedure. The argument write.perm
allows the user to save a text file with information about the permutations.
In this tutorial, the permutation test object im_perm
with the seed 1234
is already created and is part of the example_QTLfullsib
object (previously loaded), so there is no need to run the following lines. Anyway, an example of how to perform the permutation is the following:
## Perfoming the permutation
set.seed(1234)
im_perm <- im_scan( fsib, pheno.col = 1, n.perm = 1000,
write.perm = "im_permutations.txt" )
If one needs to safe keep the object with the results of permutations, for further analysis, the object can be saved as an Rdata
object. To save and load it, use:
The permutation test demands some time. At the end of the process, im_perm
receives a matrix with p lines, where p is the number of the realized permutations, and two columns with the highest and second highest peak values. The definition of the second peak is the same as the one defined by Chen and Storey (2006), i.e, the highest LOD score (or \(-log_{10}(p_{value})\)) of the genome disregarding the LG where the highest peak was achieved. The threshold values can be visualized with function summary
applied to the object with the permutation tests. In this case, the user can set an \(\alpha\) significance threshold. The result will be the quantile \((1-\alpha) \times 100\) of the samples. The function default is to show \(\alpha\) values equal to \(0.10\) and \(0.05\).
## Summary of the permutation test
summary( im_perm )
#> Threshold considering 1000 permutations
#> First column indicates 'classical' threshold showed by Churchill and Doerge, 1994
#> Second column means threshold values suggested by Chen and Storey, 2006
#> peak.1 peak.2
#> 0.05 3.419819 2.497431
#> 0.1 3.103702 2.287990
summary( im_perm, alpha = 0.05 )
#> Threshold considering 1000 permutations
#> First column indicates 'classical' threshold showed by Churchill and Doerge, 1994
#> Second column means threshold values suggested by Chen and Storey, 2006
#> peak.1 peak.2
#> 0.05 3.419819 2.497431
## Getting the peaks for alpha 0.05.
im.peak1 <- summary(im_perm, alpha = 0.05)[1, 1]
#> Threshold considering 1000 permutations
#> First column indicates 'classical' threshold showed by Churchill and Doerge, 1994
#> Second column means threshold values suggested by Chen and Storey, 2006
im.peak1
#> [1] 3.419819
im.peak2 <- summary(im_perm,alpha = 0.05)[1, 2]
#> Threshold considering 1000 permutations
#> First column indicates 'classical' threshold showed by Churchill and Doerge, 1994
#> Second column means threshold values suggested by Chen and Storey, 2006
im.peak2
#> [1] 2.497431
## Plotting the second peak in the graphic
plot(im)
abline(h = im.peak2, col = "red")
## Plotting both thresholds
plot( im, lty = 1, lwd = 2, incl.mkr = NULL, cex.incl = 0.7,
cex.axis = 0.8, col = "blue", ylab = "LOD Score", xlab = "Linkage Group",
main = "IM - Trait 1" )
abline( h = im.peak1, lty = 2, lwd = 1, col = "black" )
abline( h = im.peak2, lty = 2, lwd = 2, col = "black" )
If one wants to see the distribution of maximum values (selected for each permutation), function plot
can be applied to perm
object. If peak
argument is set as 2, the distribution of the second highest peaks will be showed. Vertical lines represent the cut-off threshold for \(\alpha=0.05\).
IM QTL characterization
After the determination of the number and position of the QTLs, the user can characterize the mapped QTLs. This relates to identify the significant effects, the segregation pattern and linkage phases between QTLs and flanking markers. The already created object im
shows the presence of at least 4 QTLs:
summary(im)
#> lg pos.cM LOD model
#> loc4 1 4 5.063801 0
#> M27 2 110 8.381168 0
#> loc64 3 64 6.343023 0
#> loc56 4 56 6.733203 0
The characterization of these 4 regions is achieved with function im_char
:
qtl1 <- im_char( fsib, pheno.col = 1, lg = 1, pos = "loc4" )
qtl2 <- im_char( fsib, pheno.col = 1, lg = 2, pos = "M27" )
qtl3 <- im_char( fsib, pheno.col = 1, lg = 3, pos = "loc64" )
qtl4 <- im_char( fsib, pheno.col = 1, lg = 4, pos = "loc56" )
Notice that positions “loc4”, “M27”, “loc64”, and “loc56” are the ones presented in the im
object and visualized when function summary
was applied. For example, the object with the characterization of the QTL on LG 1 is:
qtl1
#> M1-M2
#> LG 1.00000000
#> pos 4.00000000
#> -log10(pval) 4.46566439
#> LOD_Ha 5.06392065
#> mu -0.12075333
#> alpha_p 1.85844505
#> LOD_H1 4.34832177
#> alpha_q 0.43196490
#> LOD_H2 0.23481276
#> delta_pq -0.51729787
#> LOD_H3 0.29268801
#> H4_pvalue 0.01402575
#> H5_pvalue 0.03295641
#> H6_pvalue 0.89215754
#> model 0.00000000
#> attr(,"class")
#> [1] "fullsib_char" "matrix"
Object qtl1
(a matrix) has several information: * Line 1: LG information; * Line 2: position in centiMorgan where the QTL is localized; * Line 3: \(-log_{10}(p_{value})\)); * Line 4: LOD Score; * Line 5: intercept (mu
); * Lines 6, 8 and 10: alpha.p
, alpha.q
, and delta.pq
represent the QTL genetic effects; * Lines 7, 9 and 11: LOD.alpha.p
, LOD.alpha.q
, and LOD.delta.pq
are the statistics associated with the marginal tests of significance of these genetic effects, with 1 degree of freedom (testing just if the effect differs from zero). In this case, a LOD score above or equal \(0.83\) (equivalent to a likelihood ratio test of LRT \(= 3.81\), which has an approximate \(\chi^2\) distribution with 1 degree of freedom under the null hypothesis). Since this is done only on specific positions, it is not necessary to correct for multiple tests. * Lines 12, 13 and 14: H4.pvalue
, H5.pvalue
, and H6.pvalue
show the p-values for additional tests necessary to infer QTL segregation. For more information, please see Gazaffi et al. (2014). * Line 15 (last line): model
shows which model was considered; please see details on the vignette QTL mapping with partially informative markers.
In some cases, a given test cannot be performed, showing NA
as their values. In short, outcrossing species can have a combination of markers with different segregation patterns, and in some genomic regions the amount of information can be different. For example, a genomic with markers segregating only in one of the parents (marker types D1 or D2) will not allow inference about QTL in the other parent, so the model needs to be adapted accordingly. For details see Gazaffi et al. (2014).
We can print information for all QTL simultaneously:
cbind( qtl1, qtl2, qtl3, qtl4 )
#> M1-M2 M27 M37-M38 M51-M52
#> LG 1.00000000 2.000000000 3.0000000000 4.00000000
#> pos 4.00000000 109.999999871 64.0000000000 56.00000000
#> -log10(pval) 4.46566439 7.686445158 5.7006656820 6.07922441
#> LOD_Ha 5.06392065 8.383998018 6.3431104016 6.73344163
#> mu -0.12075333 -0.120753333 -0.1207533333 -0.12075333
#> alpha_p 1.85844505 2.001547484 2.1843732488 1.60658603
#> LOD_H1 4.34832177 5.185292053 6.3174806045 3.55807185
#> alpha_q 0.43196490 -0.462907691 0.0954155095 1.55777619
#> LOD_H2 0.23481276 0.341625090 0.0121428301 3.18133346
#> delta_pq -0.51729787 -1.715739744 0.1644144863 0.36960164
#> LOD_H3 0.29268801 3.887091384 0.0332241748 0.17050296
#> H4_pvalue 0.01402575 0.005461553 0.0001530791 0.92901795
#> H5_pvalue 0.03295641 0.597567873 0.0005160089 0.03000588
#> H6_pvalue 0.89215754 0.019293472 0.9112272264 0.05257837
#> model 0.00000000 0.000000000 0.0000000000 0.00000000
The analysis of the linkage phases allows to detect the localization of the QTL allele which increases or reduces the trait. To do so, look for LOD.alpha.p
and LOD.alpha.q
. In order to facilitate interpretation, the function draw_phase
and can be used as follows:
draw_phase(fsib, qtl1, 0.05)
#>
#> Printing QTL and its linkage phase between markers across the LG:
#>
#> Markers Position Parent 1 Parent 2
#>
#> M1 0.00 a | | o a | | o
#> QTL 4.00 P1 | | P2 Q0 | | Q0
#> M2 10.00 a | | b c | | d
#> M3 20.00 c | | c a | | b
#> M4 30.00 a | | o a | | o
#> M5 40.00 a | | b a | | b
#> M6 50.00 a | | b a | | b
#> M7 60.00 a | | o a | | o
#> M8 70.00 a | | o a | | o
#> M9 80.00 a | | b c | | d
#> M10 90.00 a | | b c | | c
#> M11 100.00 a | | b c | | d
#> M12 110.00 a | | b a | | o
#> M13 120.00 a | | b c | | c
#> M14 130.00 a | | b c | | d
#> M15 140.00 a | | b c | | d
#>
#> P1 and Q1 have positive effect (increase phenotypic value)
#> P2 and Q2 have negative effect (reduce phenotypic value)
#> P0 and Q0 have neutral effect (non signif.)
The value \(0.05\) relates to \(\alpha\) (significance level) and it is used to show only QTL statistically significant at this level. Alleles \(P1\) and \(Q1\) have positive effects, while \(P2\) and \(Q2\) have negative ones; \(P0\) and \(Q0\) have no significant effects.
To infer the QTL segregation, the user should use function get_segr
function. It uses the default of \(\alpha=0.05\) for the significance level of marginal tests (LOD.alpha.p
, LOD.alpha.q
, LOD.delta.pq
) and complementary ones (H4.pvalue
, H5.pvalue
, H6.pvalue
). For example:
Selection of Cofactors
Before proceeding to composite interval mapping (CIM), it is necessary to define which markers will be used as cofactors. They will be used to control variation outside the mapping interval. To do so, there are two functions: cof_selection
and cof_definition
.
Cofactor selection using multiple linear regression
Function cof_selection
allows to select cofactors using multiple linear regression with the stepwise procedure. The criteria information is utilized to determine if a given cofactor is kept or not in the model. Basically, this function first prepares the data and then uses step
from stats
package. For more information, use ?cof_selection
.
Object fsib
is the same one created using function create_fullsib
; pheno.col
indicates which phenotypic trait will be used in the analysis; k
is the penalty for the information criteria computation. If k=2
, it used Akaike Information Criteria; if k=log(n)
(where n
is the mapping population size), it uses the Bayesian Information Criteria. If the user wants to use another penalty values, we recommend to look at the QTL Cartographer v1.17 manual, page 76, and also the help of the R function step
. In some situations, the process would recommend to keep several cofactors which can lead to overparametrization of the model. To control it, the argument n.cofactor
allows to the user to restrict the number of selected cofactors up to a maximum, i.e, the proceeding to select cofactors stops when the number of cofactors reaches what is defined in n.cofactor
(the default is n.cofactor = 10
).
The following example shows how to proceed to select cofactors in a situation where the fixed covariates (see Covariates section) are set to control the character variability. To do it, we used the argument addcovar
. For example:
covar <- matrix( rep( c( 1, -1 ), each = 150 ), ncol = 1 )
cofs_fs2 <- cof_selection( fsib, pheno.col = 1, addcovar = covar,
k = 2, thres.effect = 0.05 )
#> Number of Cofactors selected: 1 ... 2 ... 3 ... 4 ... 5 ... 6 ... 7 ... 8 ... 9 ... 10 ... done
After the cofactor selection, thres.effect=0.05
is used to verify if there are remaining cofactors with no significant effects, and if this is the case, they are removed. It is worth noting that each cofactor can have up to three genetical effects (additive in both parents and dominance), but not all them might be statistically significant. The default of this argument is 1, i.e, there is no removal of non-significant effects. This could lead to overparametrization, so use it carefully. For more information, use (?cof_selection
).
Cofactor ad-hoc definition
Users can also select the cofactors using and ad-hoc procedure. This is the case for example when someone knows that certain markers are supposed to be linked with QTL. To do so, it is necessary to build a matrix and then incorporate it in the cof_definition
function:
Composite Interval Mapping (CIM)
In order to proceed with composite interval mapping, there are two functions: cim_scan
and cim_char
. They have analogous options of functions im_scan
and im_char
. The first one scan all the genome computing associations between genotype and phenotype trying to detect QTL. The second shows the genetic effects and their statistical significance. It also shows additional tests to infer the QTL segregation pattern. The mapping procedure is performed in a similar way as already presented for im_scan
. The inclusion of cofactors allows a better mapping of QTL, since this model has more statistical power than interval mapping. For more information, use ?cim_scan
and ?cim_char
.
CIM Genome scan
The genome scan can be performed with function cim_scan
, indicating the cofactors object (class: fullsib_cofactors
) previously created with cof_selection
or cof_definition
.
Option ws
(window size) is to indicate the size of a window outside the mapping interval to block the inclusion of cofactors. This is important to not markers linked with the QTL being mapped to be considered as other QTL.
Permutation test
The threshold for QTL detection can be set using permutation tests (Churchill and Doerge, 1994) adding the argument n.perm
in the cim_scan
function. As the permutation test is a re-sampling process, it is recommended that users set a seed before the permutations, in order to be able to replicate the results. Argument write.perm
allows to save a text file with information about all the permutations. In this tutorial, the permutation test object cim_perm
with the seed 4321
is already created and is part of the example_QTLfullsib
object previously loaded, so it is not necessary to run the following lines. But just as an example:
set.seed( 4321 )
cim_perm <- cim_scan( fullsib = cofs_fs, lg = "all", pheno.col = 1,
LOD = TRUE, n.perm = 1000,
write.perm = "cim_permutations.txt" )
To keep the results of permutations for later analysis, one just needs to save (and eventually load) an R object:
The permutation test demands some time. At the end of the process, cim_perm
receives a matrix with p lines, where p is the number of the realized permutations, and 4 columns representing the first, second, third, and fourth highest peak value, respectively, for the statistic LOD (or \(-log_{10}(p_{value})\)). The definition of the second peak is the same of the method two of Chen and Storey (2006), i.e, the highest LOD score (or \(-log_{10}(p_{value})\)) of the genome disregarding the LG where the highest peak was achieved. The third and fourth peak definitions are analogous. This threshold values can be visualized through summary
for the object with the permutation tests. In this case, the user can set an \(\alpha\) for the significance threshold. The result will be the quantile \((1-\alpha) \times 100\) of permutation samples for the statistic (defaults \(0.10\) an \(0.05\)).
## Summary of the permutation test
summary(cim_perm)
#> Threshold considering 1000 permutations
#> peak.1 peak.2 peak.3 peak.4
#> 0.05 4.023724 2.900906 2.300657 1.860476
#> 0.1 3.707559 2.650663 2.140063 1.692707
## just with alpha=0.05
summary(cim_perm, alpha = 0.05)
#> Threshold considering 1000 permutations
#> peak.1 peak.2 peak.3 peak.4
#> 0.05 4.023724 2.900906 2.300657 1.860476
## Getting the peaks for alpha 0.05.
cim.peak1 <- summary(cim_perm, alpha=0.05)[1,1]
#> Threshold considering 1000 permutations
cim.peak1
#> [1] 4.023724
cim.peak2 <- summary(cim_perm, alpha=0.05)[1,2]
#> Threshold considering 1000 permutations
cim.peak2
#> [1] 2.900906
## Ploting the thrshold with the second peak
plot(cim)
abline(h = cim.peak2, col = "red")
## Plotting both thresholds
plot(cim, lty = 1, lwd = 2, incl.mkr = NULL, cex.incl = 0.7,
cex.axis = 0.8, col = "red", ylab = "LOD Score", xlab = "Linkage Group",
main = "CIM - Trait 1")
abline(h = cim.peak1, lty = 2, lwd = 1, col = "black")
abline(h = cim.peak2, lty = 2, lwd = 2, col = "black")
If one wants to see the distribution of the maximum of the LODs, function plot
can be used with perm
object. If the peak
argument is 2
, the distribution of the second highest peaks will be showed. The vertical line represents the cut-off threshold for \(\alpha=0.05\).
Characterization of mapped QTL with CIM
After the determination of the number and position of the QTLs, it is possible to characterize the QTL, i.e., identify their significant effects, segregation patterns and linkage phases with flanking markers.
The already created object cim
shows the presence of at least 4 QTLs:
summary(cim)
#> lg pos.cM LOD model
#> loc13 1 13 6.420684 0
#> loc45 2 45 14.985451 0
#> loc65 3 65 9.367385 0
#> M52 4 60 11.397918 0
However, the following graphic shows that there is evidence for more QTL. This is because the automatic selection of peaks is not perfect, demanding some user interaction. We do not recommend using automatic procedures.
## Plotting the CIM results
plot( cim, lty = 1, lwd = 2, incl.mkr = NULL, cex.incl = 0.7,
cex.axis = 0.8, col = "red", ylab = "LOD Score", xlab = "Linkage Group",
main = "CIM - Trait 1" )
abline( h = cim.peak1, lty = 2, lwd = 1, col = "black" )
abline( h = cim.peak2, lty = 2, lwd = 2, col = "black" )
The location of other peaks (meaning evidence for more QTL) can be verified examining object cim
. For example, for some LGs:
## Visual Analysis of every location of LG1
## QTLs peak at loc13 and M12
print(cim,lg = 1)
#> lg pos.cM LOD model
#> M1 1 0 5.79367870 0
#> loc1 1 1 5.91181131 0
#> loc2 1 2 6.01211556 0
#> loc3 1 3 6.09560281 0
#> loc4 1 4 6.16354985 0
#> loc5 1 5 6.21737749 0
#> loc6 1 6 6.25856640 0
#> loc7 1 7 6.28857607 0
#> loc8 1 8 6.30875904 0
#> M2 1 10 6.32409143 0
#> loc10 1 10 6.32409143 0
#> loc11 1 11 6.37420419 0
#> loc12 1 12 6.40668720 0
#> loc13 1 13 6.42068404 0
#> loc14 1 14 6.41583249 0
#> loc15 1 15 6.39210295 0
#> loc16 1 16 6.34966143 0
#> loc17 1 17 6.29271090 0
#> loc18 1 18 6.21525449 0
#> M3 1 20 0.08055465 0
#> loc20 1 20 0.08055465 0
#> loc21 1 21 0.18998040 0
#> loc22 1 22 0.29355109 0
#> loc23 1 23 0.40492461 0
#> loc24 1 24 0.48753060 0
#> loc25 1 25 0.56386105 0
#> loc26 1 26 0.63065734 0
#> loc27 1 27 0.68585027 0
#> loc28 1 28 0.72837864 0
#> M4 1 30 0.75643287 0
#> loc30 1 30 0.75643287 0
#> loc31 1 31 0.81806962 0
#> loc32 1 32 0.87103269 0
#> loc33 1 33 0.91419103 0
#> loc34 1 34 0.92929996 0
#> loc35 1 35 0.95314871 0
#> loc36 1 36 0.96357427 0
#> loc37 1 37 0.96005953 0
#> loc38 1 38 0.94264939 0
#> M5 1 40 0.86006181 0
#> loc40 1 40 0.86006181 0
#> loc41 1 41 0.81052668 0
#> loc42 1 42 0.75138109 0
#> loc43 1 43 0.68301187 0
#> loc44 1 44 0.60697044 0
#> loc45 1 45 0.52607856 0
#> loc46 1 46 0.44414778 0
#> loc47 1 47 0.36536289 0
#> loc48 1 48 0.28445785 0
#> M6 1 50 0.17439630 0
#> loc50 1 50 0.17439630 0
#> loc51 1 51 0.19465351 0
#> loc52 1 52 0.21951677 0
#> loc53 1 53 0.24871485 0
#> loc54 1 54 0.28182034 0
#> loc55 1 55 0.31827237 0
#> loc56 1 56 0.36712610 0
#> loc57 1 57 0.40889707 0
#> loc58 1 58 0.45165496 0
#> M7 1 60 0.53820711 0
#> loc60 1 60 0.53820711 0
#> loc61 1 61 0.51824229 0
#> loc62 1 62 0.49986308 0
#> loc63 1 63 0.48264112 0
#> loc64 1 64 0.46615181 0
#> loc65 1 65 0.45007353 0
#> loc66 1 66 0.43003498 0
#> loc67 1 67 0.41567820 0
#> loc68 1 68 0.40183289 0
#> M8 1 70 0.37670533 0
#> loc70 1 70 0.37670532 0
#> loc71 1 71 0.30090156 0
#> loc72 1 72 0.22973170 0
#> loc73 1 73 0.16654602 0
#> loc74 1 74 0.11464860 0
#> loc75 1 75 0.07679051 0
#> loc76 1 76 0.05468877 0
#> loc77 1 77 0.04871900 0
#> loc78 1 78 0.05790017 0
#> M9 1 80 0.11370984 0
#> loc80 1 80 0.11370985 0
#> loc81 1 81 0.13441827 0
#> loc82 1 82 0.16022553 0
#> loc83 1 83 0.19122816 0
#> loc84 1 84 0.22733276 0
#> loc85 1 85 0.26820134 0
#> loc86 1 86 0.32089346 0
#> loc87 1 87 0.37124990 0
#> loc88 1 88 0.42286918 0
#> M10 1 90 0.52258518 0
#> loc90 1 90 0.52258519 0
#> loc91 1 91 0.55666866 0
#> loc92 1 92 0.58802902 0
#> loc93 1 93 0.62657991 0
#> loc94 1 94 0.64883535 0
#> loc95 1 95 0.65984363 0
#> loc96 1 96 0.65719359 0
#> loc97 1 97 0.63961502 0
#> loc98 1 98 0.60733055 0
#> M11 1 100 0.50067092 0
#> loc100 1 100 0.50067092 0
#> loc101 1 101 0.52015751 0
#> loc102 1 102 0.54480557 0
#> loc103 1 103 0.54290181 0
#> loc104 1 104 0.52817260 0
#> loc105 1 105 0.52551697 0
#> loc106 1 106 0.50633660 0
#> loc107 1 107 0.49849389 0
#> loc108 1 108 0.51489059 0
#> M12 1 110 5.97066700 0
#> loc110 1 110 5.97066700 0
#> loc111 1 111 5.95459428 0
#> loc112 1 112 5.92220227 0
#> loc113 1 113 5.87235208 0
#> loc114 1 114 5.80413814 0
#> loc115 1 115 5.71724937 0
#> loc116 1 116 5.61223033 0
#> loc117 1 117 5.49053478 0
#> loc118 1 118 5.35431731 0
#> M13 1 120 5.04451339 0
#> loc120 1 120 5.04451338 0
#> loc121 1 121 4.94913994 0
#> loc122 1 122 4.85024247 0
#> loc123 1 123 4.74692594 0
#> loc124 1 124 4.63807846 0
#> loc125 1 125 4.52247566 0
#> loc126 1 126 4.39893440 0
#> loc127 1 127 4.26652408 0
#> loc128 1 128 4.12483679 0
#> M14 1 130 1.48723693 0
#> loc130 1 130 1.48723691 0
#> loc131 1 131 1.30635788 0
#> loc132 1 132 1.07946868 0
#> loc133 1 133 0.84706137 0
#> loc134 1 134 0.64011396 0
#> loc135 1 135 0.47685836 0
#> loc136 1 136 0.36142950 0
#> loc137 1 137 0.30583532 0
#> loc138 1 138 0.26798851 0
#> M15 1 140 0.29660051 0
## Visual Analysis LG2
## QTLs peaks at loc45, loc77, and loc 108
print(cim, lg = 2)
## Visual Analysis LG3
## QTLs peaks at M33 and loc65
print(cim,lg = 3)
## Visual Analysis LG4
## QTL peak at M52
print(cim,lg = 4)
Another way is by pointing the mouse for an interactive session of function plot_fullsibQTL
:
Through the visual analysis we found the following eight QTLs peaks corresponding to positions next to the cofactors.
print( cim, lg = 1, pos = c( "loc13", "M12", "loc45" ) )
#> lg pos.cM LOD model
#> loc13 1 13 6.4206840 0
#> loc45 1 45 0.5260786 0
#> M12 1 110 5.9706670 0
In a similar way:
print( cim, lg=2, pos = c( "loc45", "loc77", "loc108" ) )
print( cim, lg=3, pos = c( "M33", "loc65" ) )
print( cim, lg=4, pos = "M52" )
In order to estimate the total of proportion explained by the QTL (including the new ones manually selected) and its characterization, one can use function cim_char
in the following way:
cim_QTL1_lg1 <- cim_char( fullsib = cofs_fs, lg = 1, pheno.col = 1, pos = "loc13" )
cim_QTL2_lg1 <- cim_char( fullsib = cofs_fs, lg = 1, pheno.col = 1, pos = "M12" )
cim_QTL1_lg2 <- cim_char( fullsib = cofs_fs, lg = 2, pheno.col = 1, pos = "loc45" )
cim_QTL2_lg2 <- cim_char( fullsib = cofs_fs, lg = 2, pheno.col = 1, pos = "loc77" )
cim_QTL3_lg2 <- cim_char( fullsib = cofs_fs, lg = 2, pheno.col = 1, pos = "loc108" )
cim_QTL1_lg3 <- cim_char( fullsib = cofs_fs, lg = 3, pheno.col = 1, pos = "M33" )
cim_QTL2_lg3 <- cim_char( fullsib = cofs_fs, lg = 3, pheno.col = 1, pos = "loc65")
cim_QTL1_lg4 <- cim_char( fullsib = cofs_fs, lg = 4, pheno.col = 1 , pos = "M52" )
## Printing the object
cim_QTL1_lg1
#> M2-M3
#> LG 1.000000000
#> pos 13.000000000
#> -log10(pval) 5.776307817
#> LOD_Ha 6.421161966
#> mu -0.190090732
#> alpha_p 1.515466989
#> LOD_H1 5.739391706
#> alpha_q -0.287266743
#> LOD_H2 0.217143175
#> delta_pq -0.437730736
#> LOD_H3 0.459059356
#> H4_pvalue 0.003131291
#> H5_pvalue 0.010658793
#> H6_pvalue 0.709783594
#> model 0.000000000
#> attr(,"class")
#> [1] "fullsib_char" "matrix"
cim_QTL1_lg1
is a matrix with 1 column and 15 lines. The flanking markers of this specific QTL are in the header (M2-M3 in this case). Then, the first line has the LG information. The second line has the position in centiMorgan where the region is localized. The third and fourth lines shows, respectively, \(-log_{10}(p_{value})\)) and LOD Score. The fifth line (mu
) contains the intercept effect of the model. The lines alpha.p
, alpha.q
, and delta.pq
represent the QTL genetic effects. The lines LOD.H1
, LOD.H2
, and LOD.H3
are the statistics for the marginal tests of the respective genetic effects. These three tests have each 1 degree of freedom (testing if the effect are different from zero). In this case, a LOD score above \(0.83\) indicates a significant effect (since LOD \(= 0.83\) is equivalent as a LRT \(= 3.81\), which under H0 follows a \(\chi^2\) distribution with 1 degree of freedom; \(3.81\) is for a significance level of \(5%\)). The lines H4.pvalue
, H5.pvalue
, and H6.pvalue
show the p-values for the complementary test in order to detect the QTL segregation. For more information on interpretations, please see Gazaffi et al. (2014). Finally, the line model
shows which model was evaluated; see details on the vignette QTL mapping with partially informative markers . In some cases, a given test cannot be evaluated, and so the cells will have NA
values.
We can also print all the QTLs in a single table:
## Printing in a single table
cbind( cim_QTL1_lg1, cim_QTL2_lg1, cim_QTL1_lg2, cim_QTL2_lg2,
cim_QTL3_lg2, cim_QTL1_lg3, cim_QTL2_lg3, cim_QTL1_lg4)
#> M2-M3 M12 M20-M21 M23-M24 M26-M27
#> LG 1.000000000 1.000000e+00 2.000000000 2.0000000 2.000000e+00
#> pos 13.000000000 1.100000e+02 45.000000000 77.0000000 1.080000e+02
#> -log10(pval) 5.776307817 5.342661e+00 14.175532464 8.4245522 9.761468e+00
#> LOD_Ha 6.421161966 5.973284e+00 14.987625223 9.1391751 1.050362e+01
#> mu -0.190090732 -2.866931e-01 -0.249944242 -0.2020260 -2.141821e-01
#> alpha_p 1.515466989 1.105868e-01 2.018470697 -1.2669553 2.198974e+00
#> LOD_H1 5.739391706 3.563618e-02 6.928530566 2.2693891 6.722619e+00
#> alpha_q -0.287266743 -1.471497e+00 0.688679248 -1.6428034 2.516817e-01
#> LOD_H2 0.217143175 5.281494e+00 0.884675396 3.0130132 1.119172e-01
#> delta_pq -0.437730736 2.649273e-01 -1.846287237 1.2301934 -1.277437e+00
#> LOD_H3 0.459059356 1.834512e-01 6.990072004 2.3963207 3.521562e+00
#> H4_pvalue 0.003131291 6.668927e-04 0.008110192 0.5277092 1.447511e-04
#> H5_pvalue 0.010658793 7.090789e-01 0.724646833 0.9461134 8.033794e-02
#> H6_pvalue 0.709783594 7.237503e-03 0.014040379 0.5212641 4.101040e-02
#> model 0.000000000 0.000000e+00 0.000000000 0.0000000 0.000000e+00
#> M33 M37-M38 M52
#> LG 3.00000000 3.000000e+00 4.000000000
#> pos 19.99999998 6.500000e+01 59.999999930
#> -log10(pval) 5.33337949 8.649308e+00 10.639847207
#> LOD_Ha 5.96368600 9.368843e+00 11.398141754
#> mu -0.13202914 -3.817750e-02 -0.029297883
#> alpha_p 0.99415759 2.046954e+00 1.411320036
#> LOD_H1 2.15964145 8.880128e+00 5.200490627
#> alpha_q -1.30846024 2.340881e-01 1.536338315
#> LOD_H2 3.78326129 1.109314e-01 6.700681952
#> delta_pq -0.49565965 4.211738e-01 0.298028729
#> LOD_H3 0.70778406 4.071333e-01 0.246488920
#> H4_pvalue 0.44260139 3.682693e-05 0.744413839
#> H5_pvalue 0.22998428 3.115411e-04 0.004033456
#> H6_pvalue 0.05329961 6.987909e-01 0.002403344
#> model 0.00000000 0.000000e+00 0.000000000
To evaluate QTL phases, we can use the get_segr
function:
get_segr( cim_QTL1_lg1, probs1 = 0.05, probs2 = 0.05 )
#> QTL segregation is 1:1
#> [1] "1:1"
get_segr( cim_QTL2_lg1, probs1 = 0.05, probs2 = 0.05 )
#> QTL segregation is 1:1
#> [1] "1:1"
get_segr( cim_QTL1_lg2, probs1 = 0.05, probs2 = 0.05 )
#> QTL segregation is 1:2:1
get_segr( cim_QTL2_lg2, probs1 = 0.05, probs2 = 0.05 )
#> QTL segregation is 3:1
get_segr( cim_QTL3_lg2, probs1 = 0.05, probs2 = 0.05 )
#> QTL segregation is 1:2:1
get_segr( cim_QTL1_lg3, probs1 = 0.05, probs2 = 0.05 )
#> QTL segregation is 1:2:1
get_segr( cim_QTL2_lg3, probs1 = 0.05, probs2 = 0.05 )
#> QTL segregation is 1:1
#> [1] "1:1"
get_segr( cim_QTL1_lg4, probs1 = 0.05, probs2 = 0.05 )
#> QTL segregation is 1:2:1
And finally, to draw a visualization of the genetic map with QTLs and infered phases:
cim_QTL1_lg1_draw <- draw_phase(fullsib = cofs_fs, fullsib.char = cim_QTL1_lg1)
# Two ways of visualize
cim_QTL1_lg1_draw
#>
#> Printing QTL and its linkage phase between markers across the LG:
#>
#> Markers Position Parent 1 Parent 2
#>
#> M1 0.00 a | | o a | | o
#> M2 10.00 a | | b c | | d
#> QTL 13.00 P1 | | P2 Q0 | | Q0
#> M3 20.00 c | | c a | | b
#> M4 30.00 a | | o a | | o
#> M5 40.00 a | | b a | | b
#> M6 50.00 a | | b a | | b
#> M7 60.00 a | | o a | | o
#> M8 70.00 a | | o a | | o
#> M9 80.00 a | | b c | | d
#> M10 90.00 a | | b c | | c
#> M11 100.00 a | | b c | | d
#> M12 110.00 a | | b a | | o
#> M13 120.00 a | | b c | | c
#> M14 130.00 a | | b c | | d
#> M15 140.00 a | | b c | | d
#>
#> P1 and Q1 have positive effect (increase phenotypic value)
#> P2 and Q2 have negative effect (reduce phenotypic value)
#> P0 and Q0 have neutral effect (non signif.)
plot_fullsib_phases(cim_QTL1_lg1_draw)
# If include fullsib.scan object you can also visualize the confidence interval considering 1 or 2 LOD
cim_QTL1_lg1_draw <- draw_phase(fullsib = cofs_fs, fullsib.char = cim_QTL1_lg1, fullsib.scan = cim)
plot_fullsib_phases(cim_QTL1_lg1_draw, CI = "LOD1")
# Too access the interval limits in cM
cim_QTL1_lg1_draw[2:3]
#> $LOD1
#> [1] 0 18
#>
#> $LOD2
#> [1] 0 18
# You can also plot multiple QTLs of same group
cim_QTL2_lg1_draw <- draw_phase( cofs_fs, cim_QTL2_lg1 )
cim_QTL1_lg2_draw <- draw_phase( cofs_fs, cim_QTL1_lg2 )
cim_QTL2_lg2_draw <- draw_phase( cofs_fs, cim_QTL2_lg2 )
cim_QTL3_lg2_draw <- draw_phase( cofs_fs, cim_QTL3_lg2 )
cim_QTL1_lg3_draw <- draw_phase( cofs_fs, cim_QTL1_lg3 )
cim_QTL2_lg3_draw <- draw_phase( cofs_fs, cim_QTL2_lg3 )
cim_QTL1_lg4_draw <- draw_phase( cofs_fs, cim_QTL1_lg4 )
p1 <- plot_fullsib_phases(x = list(cim_QTL1_lg1_draw,
cim_QTL2_lg1_draw))
p2 <- plot_fullsib_phases(list(cim_QTL1_lg2_draw,
cim_QTL2_lg2_draw,
cim_QTL3_lg2_draw))
p3 <- plot_fullsib_phases(list(cim_QTL1_lg3_draw,
cim_QTL2_lg3_draw))
p4 <- plot_fullsib_phases(cim_QTL1_lg4_draw)
library(ggpubr)
#> Loading required package: ggplot2
ggarrange(p1,p2,p3,p4, common.legend = T)
Estimating QTL effects using least squares
Function r2_l2
calculates the proportion of phenotypic variation explained by the QTL in the model (altogether, denoted by R2.trait
, and individually, indicated as R2.lgX.Y
, where lgX.Y
means the LG number \(X\) in the \(Y\)-th position). To do this, a least squares approximation is used (instead of the mixture model with the EM algorithm). This is good enough for interpretations, but much faster. For more information, use ?r2_l2
.
In this example, for the model with all QTL mapped with CIM and without the cofactors (since they are only a strategy to control background variation during the mapping procedure):
## Estimating QTL effects without covariate
qtls.cim <- r2_ls( fsib, pheno.col = 1, lg = c( 1, 1, 2, 2, 2, 3, 3, 4 ),
pos = c( "loc13", "M12", "loc45", "loc77", "loc108", "M33", "loc65", "M52" ) )
If for some reason one wants also to consider the cofactors:
## Estimating QTL effects with covariates
covar <- matrix( rep( c( 1, -1 ), each = 150 ), ncol = 1 )
r2_ls( fsib, pheno.col = 2, addcovar = covar, lg = c( 1, 1, 2, 2, 2, 3, 3, 4 ),
pos = c( "loc13", "M12", "loc45", "loc77", "loc108", "M33", "loc65", "M52" ) )
#> lg loc r2
#> R2.trait NA All 57.4916
#> R2.lg1.loc13 1 loc13 4.4783
#> R2.lg1.M12 1 M12 3.2839
#> R2.lg2.loc45 2 loc45 11.9193
#> R2.lg2.loc77 2 loc77 7.5766
#> R2.lg2.loc108 2 loc108 9.0162
#> R2.lg3.M33 3 M33 4.5839
#> R2.lg3.loc65 3 loc65 6.2476
#> R2.lg4.M52 4 M52 8.4961
Graphical visualization of QTLs
It is possible to plot a useful visualization of the CIM indicating the r2ls.out
object:
plot_fullsibQTL( fullsib = fsib,
fullsib.scan = cim,
r2ls.out = qtls.cim,
qtlmapping = "CIM",
thr = c( 4.023724, 2.900906 ) )
Reporting of Results
The following articles can be used as examples of how to report the QTL mapping results obtained with package fullsibQTL
:
Conson ARO, Taniguti CH, Amadeu RR, Andreotti IAA., de Souza LM, dos Santos LHB, Rosa JRBF, Mantello CC, da Silva CC, José Scaloppi Junior E, Ribeiro RV, Le Guen V, Garcia AAF, Gonçalves P de S, & de Souza AP (2018). High-Resolution Genetic Map and QTL Analysis of Growth-Related Traits of Hevea brasiliensis Cultivated Under Suboptimal Temperature and Humidity Conditions. Frontiers in Plant Science, 9(August), 1–16. ISSN 1664-462X. https://doi.org/10.3389/fpls.2018.01255
Balsalobre TWA, da Silva Pereira G, Margarido GRA, Gazaffi R, Barreto FZ, Anoni CO, Cardoso-Silva CB, Costa EA, Mancini MC, Hoffmann HP, de Souza AP, Garcia AAF, Carneiro MS (2017). GBS-based single dosage markers for linkage and QTL mapping allow gene mining for yield-related traits in sugarcane. BMC Genomics, 18(1), 72. ISSN 1471-2164. https://doi.org/10.1186/s12864-016-3383-x.
Costa EA, Anoni CO, Mancini MC, Santos FRC, Marconi TG, Gazaffi R, Pastina MM, Perecin D, Mollinari M, Xavier MA, Pinto LR, Souza AP, Garcia AAF (2016). QTL mapping including codominant SNP markers with ploidy level information in a sugarcane progeny. Euphytica, 211(1), 1–16. ISSN 1573-5060. https://doi.org/10.1007/s10681-016-1746-7.
da Silva Pereira G, Di Cassia Laperuta L, Nunes ES, Chavarrı́a L, Pastina MM, Gazaffi R, Geraldi IO, Garcia AAF, Vieira MLC (2017). The Sweet Passion Fruit (Passiflora alata) Crop: Genetic and Phenotypic Parameter Estimates and QTL Mapping for Fruit Traits. Tropical Plant Biology, 10(1), 18–29. ISSN 1935-9764. https://doi.org/10.1007/s12042-016-9181-4.
Margarido GRA, Pastina MM, Souza AP, Garcia AAF (2015). Multi-trait multi-environment quantitative trait loci mapping for a sugarcane commercial cross provides insights on the inheritance of important traits. Molecular Breeding, 35(8), 175. ISSN 1572-9788. https://doi.org/10.1007/s11032-015-0366-6.
References
Belsley, D. A., Edwin K., and Roy E. W. (2005). Regression diagnostics: Identifying influential data and sources of collinearity. Vol. 571. John Wiley & Sons.
Cockerham, C. C. (1954). An extension of the concept of partitioning hereditary variance for analysis of covariances among relatives when epistasis is present. Genetics 39(6), 859.
Chen, L., and Storey, J. D. (2006). Relaxed significance criteria for linkage analysis. Genetics, 173(4), 2371-2381.
Churchill, G. A., and Doerge, R. W. (1994). Empirical threshold values for quantitative trait mapping. Genetics, 138(3), 963-971.
Gazaffi, R., Margarido, G. R., Pastina, M. M., Mollinari, M., Garcia, A. A. F. (2014). A model for quantitative trait loci mapping, linkage phase, and segregation pattern estimation for a full-sib progeny. Tree Genetics & Genomes, 10(4), 791-801.
Grattapaglia, D., and Sederoff, R. (1994). Genetic linkage maps of Eucalyptus grandis and Eucalyptus urophylla using a pseudo-testcross: mapping strategy and RAPD markers. Genetics, 137(4), 1121-1137.
Margarido, G. R. A., Souza, A.P. and Garcia, A. A. F. OneMap: software for genetic mapping in outcrossing species. Hereditas 144, 78-79, 2007.
Wu, R., Ma, C.X., Painter, I. and Zeng, Z.-B. Simultaneous maximum likelihood estimation of linkage and linkage phases in outcrossing species. Theoretical Population Biology 61, 349-363, 2002a.
Wu, R., Ma, C.-X., Wu, S. S. and Zeng, Z.-B. Linkage mapping of sex-specific differences. Genetical Research 79, 85-96, 2002b.
Session Info
sessionInfo()
#> R version 3.6.2 (2019-12-12)
#> Platform: x86_64-pc-linux-gnu (64-bit)
#> Running under: Debian GNU/Linux 10 (buster)
#>
#> Matrix products: default
#> BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so
#>
#> locale:
#> [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
#> [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
#> [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=C
#> [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
#> [9] LC_ADDRESS=C LC_TELEPHONE=C
#> [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] ggpubr_0.4.0 ggplot2_3.3.2 fullsibQTL_0.0.9010
#> [4] onemap_2.2.0 rmdformats_0.3.7 knitr_1.29
#>
#> loaded via a namespace (and not attached):
#> [1] colorspace_1.4-1 ggsignif_0.6.0
#> [3] ellipsis_0.3.1 class_7.3-15
#> [5] rio_0.5.16 htmlTable_2.0.1
#> [7] RcppArmadillo_0.9.900.1.0 base64enc_0.1-3
#> [9] rstudioapi_0.11 mice_3.10.0
#> [11] farver_2.0.3 codetools_0.2-16
#> [13] splines_3.6.2 doParallel_1.0.15
#> [15] Formula_1.2-3 polynom_1.4-0
#> [17] jsonlite_1.7.0 MDSMap_1.1
#> [19] broom_0.7.0 cluster_2.1.0
#> [21] png_0.1-7 shiny_1.5.0
#> [23] httr_1.4.1 compiler_3.6.2
#> [25] backports_1.1.8 assertthat_0.2.1
#> [27] lazyeval_0.2.2 Matrix_1.2-18
#> [29] fastmap_1.0.1 later_1.1.0.1
#> [31] acepack_1.4.1 htmltools_0.5.0
#> [33] tools_3.6.2 gtable_0.3.0
#> [35] glue_1.4.1 reshape2_1.4.4
#> [37] dplyr_1.0.0 ggthemes_4.2.0
#> [39] Rcpp_1.0.5 carData_3.0-4
#> [41] cellranger_1.1.0 vctrs_0.3.1
#> [43] gdata_2.18.0 iterators_1.0.12
#> [45] crosstalk_1.1.0.1 xfun_0.15
#> [47] stringr_1.4.0 openxlsx_4.1.5
#> [49] mime_0.9 miniUI_0.1.1.1
#> [51] lifecycle_0.2.0 weights_1.0.1
#> [53] gtools_3.8.2 rstatix_0.6.0
#> [55] princurve_2.1.4 candisc_0.8-3
#> [57] MASS_7.3-51.4 scales_1.1.1
#> [59] heplots_1.3-5 hms_0.5.3
#> [61] promises_1.1.1 parallel_3.6.2
#> [63] smacof_2.1-1 RColorBrewer_1.1-2
#> [65] yaml_2.2.1 curl_4.3
#> [67] gridExtra_2.3 updog_1.2.0
#> [69] rpart_4.1-15 reshape_0.8.8
#> [71] latticeExtra_0.6-29 stringi_1.4.6
#> [73] foreach_1.5.0 plotrix_3.7-8
#> [75] e1071_1.7-3 checkmate_2.0.0
#> [77] zip_2.0.4 manipulateWidget_0.10.1
#> [79] rlang_0.4.7 pkgconfig_2.0.3
#> [81] rgl_0.100.54 evaluate_0.14
#> [83] lattice_0.20-38 purrr_0.3.4
#> [85] labeling_0.3 htmlwidgets_1.5.1
#> [87] cowplot_1.1.0 tidyselect_1.1.0
#> [89] plyr_1.8.6 magrittr_1.5
#> [91] bookdown_0.21 R6_2.4.1
#> [93] generics_0.0.2 nnls_1.4
#> [95] Hmisc_4.4-0 withr_2.2.0
#> [97] pillar_1.4.6 haven_2.3.1
#> [99] foreign_0.8-72 survival_3.1-8
#> [101] abind_1.4-5 nnet_7.3-12
#> [103] tibble_3.0.3 crayon_1.3.4
#> [105] car_3.0-8 wordcloud_2.6
#> [107] plotly_4.9.2.1 ellipse_0.4.2
#> [109] rmarkdown_2.5 jpeg_0.1-8.1
#> [111] grid_3.6.2 readxl_1.3.1
#> [113] data.table_1.12.8 forcats_0.5.0
#> [115] digest_0.6.25 webshot_0.5.2
#> [117] xtable_1.8-4 tidyr_1.1.0
#> [119] httpuv_1.5.4 munsell_0.5.0
#> [121] viridisLite_0.3.0