
Identifies matches between bloodmeal STR profiles and a database of human STR profiles. The euroformix::contLikSearch() function is used to calculate log10 likelihood ratios (log10_lrs), which are then used to identify the human contributors to each bloodmeal. For more details, see vignette("bistro").

Usage

bistro(
  bloodmeal_profiles,
  human_profiles,
  kit,
  peak_thresh,
  pop_allele_freqs = NULL,
  calc_allele_freqs = FALSE,
  bloodmeal_ids = NULL,
  human_ids = NULL,
  rm_twins = TRUE,
  rm_markers = c("AMEL"),
  model_degrad = TRUE,
  model_bw_stutt = FALSE,
  model_fw_stutt = FALSE,
  difftol = 1,
  threads = 4,
  seed = NULL,
  time_limit = 3,
  return_lrs = FALSE
)

Arguments

bloodmeal_profiles

Tibble or data frame with alleles for all bloodmeals, including four columns: SampleName, Marker, Allele, Height. Height must be numeric or coercible to numeric.

human_profiles

Tibble or data frame with alleles for all humans in reference database including three columns: SampleName, Marker, Allele.

kit

STR kit name from euroformix. To see a list of all kits embedded in euroformix use euroformix::getKit(). If your kit is not included, see vignette("bistro") for details on how to include your own kit.

peak_thresh

Allele peak height threshold in RFUs. All peaks under this threshold will be filtered out. If the profiles were already filtered, this value should be equal to or greater than the threshold used for that earlier filtering. Also used for the threshT argument in euroformix::contLikSearch().

pop_allele_freqs

Tibble or data frame where the first column is the STR allele and the following columns are the frequency of that allele for different markers. Alleles that do not exist for a given marker are coded as NA. If NULL and calc_allele_freqs = TRUE, then population allele frequencies will be calculated from human_profiles.

calc_allele_freqs

A boolean indicating whether or not to calculate allele frequencies from human_profiles. If FALSE, a pop_allele_freqs input is required. Default: FALSE

bloodmeal_ids

Vector of bloodmeal ids from the SampleName column in bloodmeal_profiles for which to compute log10_lrs. If NULL, all ids in the input dataframe will be used. Default: NULL

human_ids

Vector of human ids from the SampleName column in human_profiles for which to compute log10_lrs. If NULL, all ids in the input dataframe will be used. Default: NULL

rm_twins

A boolean indicating whether or not to remove likely twins (identical STR profiles) from the human database prior to identifying matches. Default: TRUE

rm_markers

A vector of markers to remove before calculating log10_lrs. Use NULL to include all markers. By default, AMEL is removed because it is not standard to include it in LR calculations. Default: c("AMEL")

model_degrad

A boolean indicating whether or not to model peak degradation. Used for modelDegrad argument in euroformix::contLikSearch(). Default: TRUE

model_bw_stutt

A boolean indicating whether or not to model peak backward stutter. Used for modelBWstutt argument in euroformix::contLikSearch(). Default: FALSE

model_fw_stutt

A boolean indicating whether or not to model peak forward stutter. Used for modelFWstutt argument in euroformix::contLikSearch(). Default: FALSE

difftol

Tolerance for difference in log likelihoods across 2 iterations. euroformix::contLikSearch() argument. Default: 1

threads

Number of threads to use when calculating log10_lrs. euroformix::contLikSearch() argument. Default: 4

seed

Seed when calculating log10_lrs. euroformix::contLikSearch() argument. Default: NULL (no seed)

time_limit

Time limit in minutes to run the euroformix::contLikSearch() function on 1 bloodmeal-human pair. Default: 3

return_lrs

A boolean indicating whether or not to return log10LRs for all bloodmeal-human pairs. Default: FALSE
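The input formats described above can be sketched with a few toy tables. This is only an illustration under stated assumptions: every sample name, allele, peak height, and frequency below is invented, alleles are stored as character strings by assumption, and the name of the first column of pop_allele_freqs ("Allele") is also an assumption. The last lines mimic the effect of peak_thresh on the toy data and are not the package's own filtering code.

```r
# Hypothetical toy inputs: all sample names, alleles, heights, and
# frequencies below are invented for illustration.
bloodmeal_profiles <- data.frame(
  SampleName = rep("bm1", 4),
  Marker     = c("TH01", "TH01", "FGA", "FGA"),
  Allele     = c("6", "9.3", "21", "24"),   # character alleles (assumption)
  Height     = c(1520, 1310, 980, 150),     # peak heights in RFU
  stringsAsFactors = FALSE
)

human_profiles <- data.frame(
  SampleName = rep("h1", 4),
  Marker     = c("TH01", "TH01", "FGA", "FGA"),
  Allele     = c("6", "9.3", "21", "24"),
  stringsAsFactors = FALSE
)

# First column: allele; remaining columns: one per marker, with NA where
# the allele does not exist for that marker.
pop_allele_freqs <- data.frame(
  Allele = c("6", "9.3", "21", "24"),       # column name is an assumption
  TH01   = c(0.40, 0.60, NA, NA),
  FGA    = c(NA, NA, 0.55, 0.45),
  stringsAsFactors = FALSE
)

# Sanity checks on the documented column requirements.
stopifnot(
  all(c("SampleName", "Marker", "Allele", "Height") %in% names(bloodmeal_profiles)),
  all(c("SampleName", "Marker", "Allele") %in% names(human_profiles)),
  is.numeric(bloodmeal_profiles$Height)
)

# Preview which peaks would survive a given threshold: peaks under
# peak_thresh are dropped, so peaks at or above it are kept.
peak_thresh <- 200
kept_peaks <- bloodmeal_profiles[bloodmeal_profiles$Height >= peak_thresh, ]
nrow(kept_peaks)
```

Tables shaped like these could then be passed as the bloodmeal_profiles, human_profiles, and pop_allele_freqs arguments, together with a kit name and the same peak_thresh.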

Value

Tibble with matches for bloodmeal-human pairs including the columns listed below. Note that if multiple matches are found for a bloodmeal, these are included as separate rows.

  • bloodmeal_id: bloodmeal id

  • locus_count: number of loci successfully typed in the bloodmeal

  • est_noc: estimated number of contributors to the bloodmeal

  • match: whether a match was identified for a given bloodmeal (yes or no)

  • human_id: If match, human id (NA otherwise)

  • log10_lr: If match, log10 likelihood ratio (NA otherwise)

  • notes: Why the bloodmeal does or doesn't have a match

If return_lrs = TRUE, then a named list of length 2 is returned:

  • matches - the tibble described above

  • lrs - log10LRs for each bloodmeal-human pair, including some of the columns described above and one additional column, efm_noc. This is the number of contributors passed to euroformix, calculated as min(est_noc, 3).
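The efm_noc rule above caps the estimated number of contributors at 3. A minimal sketch of that arithmetic in base R, using invented est_noc values; the commented lines show how the two list elements named above would be accessed, assuming an object called res that holds the result of a call with return_lrs = TRUE:

```r
# Hypothetical estimated numbers of contributors for four bloodmeals.
est_noc <- c(1, 2, 3, 5)

# Element-wise min(est_noc, 3): values above 3 are capped at 3.
efm_noc <- pmin(est_noc, 3)
efm_noc

# With return_lrs = TRUE the result is a named list, so (not run here):
# res <- bistro(..., return_lrs = TRUE)
# res$matches   # the tibble of matches
# res$lrs       # log10LRs for every bloodmeal-human pair
```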

Examples

bistro(bloodmeal_profiles, human_profiles,
  pop_allele_freqs = pop_allele_freqs,
  kit = "ESX17", peak_thresh = 200
)
#> 1/17 markers in kit but not in pop_allele_freqs: AMEL
#> Formatting bloodmeal profiles
#> Removing 6 peaks under the threshold of 200 RFU.
#> For 1/4 bloodmeal ids, all peaks are below the threshold
#> Formatting human profiles
#> Markers being used: D10S1248, D12S391, D16S539, D18S51, D19S433, D1S1656, D21S11, D22S1045, D2S1338, D2S441, D3S1358, D8S1179, FGA, SE33, TH01, VWA
#> Calculating log10LRs
#> # bloodmeal ids: 3
#> # human ids: 3
#> Bloodmeal id 1/3
#> Human id 1/3
#> Human id 2/3
#> Human id 3/3
#> Bloodmeal id 2/3
#> Human id 1/3
#> Human id 2/3
#> Human id 3/3
#> Bloodmeal id 3/3
#> Human id 1/3
#> Human id 2/3
#> Human id 3/3
#> Identifying matches
#> # A tibble: 4 × 8
#>   bloodmeal_id locus_count est_noc match human_id log10_lr notes      thresh_low
#>   <chr>              <int>   <dbl> <chr> <chr>       <dbl> <chr>           <dbl>
#> 1 evid1                 16       2 yes   P1           21.8 passed al…        9.5
#> 2 evid1                 16       2 yes   P2           10.3 passed al…        9.5
#> 3 evid2                  1       2 no    NA           NA   all log10…       NA  
#> 4 evid3                  8       1 no    NA           NA   all log10…       NA
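As a possible follow-up to the example output, the matches table can be summarised with base R. The sketch below retypes the printed rows by hand into a data frame (the truncated notes column and thresh_low are left out), so it runs without the package; in practice the object returned by bistro() would be used directly. The object names matches, hits, and n_matches are illustrative.

```r
# Rows copied from the printed example output above.
matches <- data.frame(
  bloodmeal_id = c("evid1", "evid1", "evid2", "evid3"),
  locus_count  = c(16L, 16L, 1L, 8L),
  est_noc      = c(2, 2, 2, 1),
  match        = c("yes", "yes", "no", "no"),
  human_id     = c("P1", "P2", NA, NA),
  log10_lr     = c(21.8, 10.3, NA, NA),
  stringsAsFactors = FALSE
)

# Keep only bloodmeal-human pairs that were called as matches.
hits <- matches[matches$match == "yes",
                c("bloodmeal_id", "human_id", "log10_lr")]

# Number of matched humans per bloodmeal (0 for bloodmeals with no match).
n_matches <- tapply(matches$match == "yes", matches$bloodmeal_id, sum)

hits
n_matches
```

Because multiple matches for one bloodmeal appear as separate rows, counting rows with match equal to "yes" per bloodmeal_id gives the number of human contributors identified for that bloodmeal.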