Identifies matches between bloodmeal STR profiles and a database
of human STR profiles. The euroformix::contLikSearch() function is used
to calculate log10 likelihood ratios (log10_lrs) that are then used to
identify human contributors to each bloodmeal. For more details than are
present here, see vignette('bistro').
Usage
bistro(
  bloodmeal_profiles,
  human_profiles,
  kit,
  peak_thresh,
  pop_allele_freqs = NULL,
  calc_allele_freqs = FALSE,
  bloodmeal_ids = NULL,
  human_ids = NULL,
  rm_twins = TRUE,
  rm_markers = c("AMEL"),
  model_degrad = TRUE,
  model_bw_stutt = FALSE,
  model_fw_stutt = FALSE,
  difftol = 1,
  threads = 4,
  seed = NULL,
  time_limit = 3,
  return_lrs = FALSE
)
Arguments
- bloodmeal_profiles
Tibble or data frame with alleles for all bloodmeals in reference database including 4 columns: SampleName, Marker, Allele, Height. Height must be numeric or coercible to numeric.
- human_profiles
Tibble or data frame with alleles for all humans in reference database including three columns: SampleName, Marker, Allele.
- kit
STR kit name from euroformix. To see a list of all kits embedded in euroformix, use euroformix::getKit(). If your kit is not included, see vignette("bistro") for details on how to include your own kit.
- peak_thresh
Allele peak height threshold in RFUs. All peaks under this threshold will be filtered out. If prior filtering was performed, this number should be equal to or greater than that number. Also used for the threshT argument in euroformix::contLikSearch().
- pop_allele_freqs
Tibble or data frame where the first column is the STR allele and the following columns are the frequency of that allele for different markers. Alleles that do not exist for a given marker are coded as NA. If NULL and calc_allele_freqs = TRUE, then population allele frequencies will be calculated from human_profiles.
- calc_allele_freqs
A boolean indicating whether or not to calculate allele frequencies from human_profiles. If FALSE, a pop_allele_freqs input is required. Default: FALSE
- bloodmeal_ids
Vector of bloodmeal ids from the SampleName column in bloodmeal_profiles for which to compute log10_lrs. If NULL, all ids in the input data frame will be used. Default: NULL
- human_ids
Vector of human ids from the SampleName column in human_profiles for which to compute log10_lrs. If NULL, all ids in the input data frame will be used. Default: NULL
- rm_twins
A boolean indicating whether or not to remove likely twins (identical STR profiles) from the human database prior to identifying matches. Default: TRUE
- rm_markers
A vector indicating what markers should be removed prior to calculating log10LRs. NULL to include all markers. By default, for the bistro function AMEL is removed as it is not standard to include it in LR calculations.
- model_degrad
A boolean indicating whether or not to model peak degradation. Used for the modelDegrad argument in euroformix::contLikSearch(). Default: TRUE
- model_bw_stutt
A boolean indicating whether or not to model peak backward stutter. Used for the modelBWstutt argument in euroformix::contLikSearch(). Default: FALSE
- model_fw_stutt
A boolean indicating whether or not to model peak forward stutter. Used for the modelFWstutt argument in euroformix::contLikSearch(). Default: FALSE
- difftol
Tolerance for difference in log likelihoods across 2 iterations. euroformix::contLikSearch() argument. Default: 1
- threads
Number of threads to use when calculating log10_lrs. euroformix::contLikSearch() argument. Default: 4
- seed
Seed when calculating log10_lrs. euroformix::contLikSearch() argument. Default: NULL (no seed)
- time_limit
Time limit in minutes to run the euroformix::contLikSearch() function on 1 bloodmeal-human pair. Default: 3
- return_lrs
A boolean indicating whether or not to return log10LRs for all bloodmeal-human pairs. Default: FALSE
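As a rough illustration of the input shapes described above, the base R sketch below builds toy bloodmeal_profiles and human_profiles tables, applies a peak height filter, and derives a wide allele frequency table in the layout described for pop_allele_freqs. All sample names, alleles, and heights are invented for illustration, and the frequency calculation is only one plausible approach; it is not necessarily what bistro does internally when calc_allele_freqs = TRUE.

```r
# Hypothetical toy inputs: every id, allele, and height below is made up.
bloodmeal_profiles <- data.frame(
  SampleName = c("bm1", "bm1", "bm1", "bm1"),
  Marker     = c("TH01", "TH01", "FGA", "FGA"),
  Allele     = c("6", "9.3", "21", "24"),
  Height     = c(850, 790, 150, 1200)
)

human_profiles <- data.frame(
  SampleName = c("h1", "h1", "h1", "h1", "h2", "h2", "h2", "h2"),
  Marker     = c("TH01", "TH01", "FGA", "FGA", "TH01", "TH01", "FGA", "FGA"),
  Allele     = c("6", "9.3", "21", "24", "6", "7", "22", "24")
)

# Peaks under the threshold are dropped (here the 150 RFU peak at FGA).
peak_thresh <- 200
kept <- bloodmeal_profiles[bloodmeal_profiles$Height >= peak_thresh, ]

# Assumed approach: per-marker allele proportions from the human database.
counts <- table(human_profiles$Allele, human_profiles$Marker)
freqs <- prop.table(counts, margin = 2)  # each marker column sums to 1
freqs[counts == 0] <- NA                 # allele not observed at that marker

# Wide layout: first column is the allele, then one column per marker.
freq_mat <- unclass(freqs)
pop_allele_freqs <- data.frame(
  Allele = rownames(freq_mat),
  freq_mat,
  check.names = FALSE
)
rownames(pop_allele_freqs) <- NULL

print(kept)
print(pop_allele_freqs)
```

Tables shaped like these could then be passed as the first three data arguments of bistro(), together with a kit name and peak_thresh.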
Value
Tibble with matches for bloodmeal-human pairs including the columns listed below. Note that if multiple matches are found for a bloodmeal, these are included as separate rows.
- bloodmeal_id: bloodmeal id
- locus_count: number of loci successfully typed in the bloodmeal
- est_noc: estimated number of contributors to the bloodmeal
- match: whether a match was identified for a given bloodmeal (yes or no)
- human_id: If match, human id (NA otherwise)
- log10_lr: If match, log10 likelihood ratio (NA otherwise)
- notes: Why the bloodmeal does or doesn't have a match
If return_lrs = TRUE, then a named list of length 2 is returned:
- matches - the tibble described above
- lrs - log10LRs for each bloodmeal-human pair including some of the columns described above and an additional column: efm_noc, which is the number of contributors used as input into euroformix, which is min(est_noc, 3).
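The two return shapes and the efm_noc cap can be sketched in base R as follows. The helper get_matches() is hypothetical (it is not part of bistro), and the est_noc values and mock results are invented; the sketch only assumes what is stated above, namely that the result is either the matches table itself or a list with elements matches and lrs.

```r
# efm_noc is described as min(est_noc, 3); pmin() applies that cap
# element-wise. The est_noc values here are invented.
est_noc <- c(1, 2, 4, 5)
efm_noc <- pmin(est_noc, 3)
print(efm_noc)

# Hypothetical helper (not a bistro function): return the matches table
# whether or not the call used return_lrs = TRUE.
get_matches <- function(res) {
  if (is.data.frame(res)) {
    res          # return_lrs = FALSE: the result is the matches table
  } else {
    res$matches  # return_lrs = TRUE: named list with matches and lrs
  }
}

# Mock results standing in for real bistro() output.
mock_matches <- data.frame(
  bloodmeal_id = c("bm1", "bm2"),
  match        = c("yes", "no"),
  human_id     = c("h1", NA),
  log10_lr     = c(12.4, NA)
)
mock_list <- list(matches = mock_matches, lrs = data.frame())

print(get_matches(mock_matches))
print(get_matches(mock_list))
```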
Examples
bistro(bloodmeal_profiles, human_profiles,
  pop_allele_freqs = pop_allele_freqs,
  kit = "ESX17", peak_thresh = 200
)
#> 1/17 markers in kit but not in pop_allele_freqs: AMEL
#> Formatting bloodmeal profiles
#> Removing 6 peaks under the threshold of 200 RFU.
#> For 1/4 bloodmeal ids, all peaks are below the threshold
#> Formatting human profiles
#> Markers being used: D10S1248, D12S391, D16S539, D18S51, D19S433, D1S1656, D21S11, D22S1045, D2S1338, D2S441, D3S1358, D8S1179, FGA, SE33, TH01, VWA
#> Calculating log10LRs
#> # bloodmeal ids: 3
#> # human ids: 3
#> Bloodmeal id 1/3
#> Human id 1/3
#> Human id 2/3
#> Human id 3/3
#> Bloodmeal id 2/3
#> Human id 1/3
#> Human id 2/3
#> Human id 3/3
#> Bloodmeal id 3/3
#> Human id 1/3
#> Human id 2/3
#> Human id 3/3
#> Identifying matches
#> # A tibble: 4 × 8
#>   bloodmeal_id locus_count est_noc match human_id log10_lr notes      thresh_low
#>   <chr>              <int>   <dbl> <chr> <chr>       <dbl> <chr>           <dbl>
#> 1 evid1                 16       2 yes   P1           21.8 passed al…        9.5
#> 2 evid1                 16       2 yes   P2           10.3 passed al…        9.5
#> 3 evid2                  1       2 no    NA           NA   all log10…       NA
#> 4 evid3                  8       1 no    NA           NA   all log10…       NA
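One possible way to summarise a result like the one printed above is to keep only the matched rows and count matched humans per bloodmeal. The sketch below re-creates the printed table by hand in base R rather than calling bistro(), and it omits the notes and thresh_low columns because the notes text is truncated in the printed output. The object names matched and n_matched are illustrative.

```r
# Values copied from the example output above (notes and thresh_low omitted).
matches <- data.frame(
  bloodmeal_id = c("evid1", "evid1", "evid2", "evid3"),
  locus_count  = c(16L, 16L, 1L, 8L),
  est_noc      = c(2, 2, 2, 1),
  match        = c("yes", "yes", "no", "no"),
  human_id     = c("P1", "P2", NA, NA),
  log10_lr     = c(21.8, 10.3, NA, NA)
)

# Keep only bloodmeal-human pairs flagged as matches.
matched <- matches[matches$match == "yes", ]

# Count matched humans per bloodmeal, keeping bloodmeals with zero matches.
n_matched <- table(
  factor(matched$bloodmeal_id, levels = unique(matches$bloodmeal_id))
)

print(matched[, c("bloodmeal_id", "human_id", "log10_lr")])
print(n_matched)
```

Because multiple matches for one bloodmeal appear as separate rows, evid1 contributes two rows to matched, while evid2 and evid3 contribute none.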