Multiple platform build/check report for BioC 3.9 - CHECK report for atSNP on tokay2

Back to Multiple platform build/check report for BioC 3.9

[A]

CHECK report for atSNP on tokay2

This page was generated on 2019-10-16 12:43:27 -0400 (Wed, 16 Oct 2019).

atSNP 1.0.0
Sunyoung Shin

Snapshot Date: 2019-10-15 17:01:26 -0400 (Tue, 15 Oct 2019)

URL: https://git.bioconductor.org/packages/atSNP

Branch: RELEASE_3_9

Last Commit: a54ff2e

Last Changed Date: 2019-05-02 11:54:14 -0400 (Thu, 02 May 2019)

malbec2

Linux (Ubuntu 18.04.2 LTS) / x86_64

tokay2

Windows Server 2012 R2 Standard / x64

[ OK ]

celaya2

OS X 10.11.6 El Capitan / x86_64

Summary

Package: atSNP

Version: 1.0.0

Command: C:\Users\biocbuild\bbs-3.9-bioc\R\bin\R.exe CMD check --force-multiarch --install=check:atSNP.install-out.txt --library=C:\Users\biocbuild\bbs-3.9-bioc\R\library --no-vignettes --timings atSNP_1.0.0.tar.gz

StartedAt: 2019-10-16 02:01:18 -0400 (Wed, 16 Oct 2019)

EndedAt: 2019-10-16 02:10:21 -0400 (Wed, 16 Oct 2019)

EllapsedTime: 543.3 seconds

RetCode: 0

Status: OK

CheckDir: atSNP.Rcheck

Warnings: 0

Command output

##############################################################################
##############################################################################
###
### Running command:
###
###   C:\Users\biocbuild\bbs-3.9-bioc\R\bin\R.exe CMD check --force-multiarch --install=check:atSNP.install-out.txt --library=C:\Users\biocbuild\bbs-3.9-bioc\R\library --no-vignettes --timings atSNP_1.0.0.tar.gz
###
##############################################################################
##############################################################################


* using log directory 'C:/Users/biocbuild/bbs-3.9-bioc/meat/atSNP.Rcheck'
* using R version 3.6.1 (2019-07-05)
* using platform: x86_64-w64-mingw32 (64-bit)
* using session charset: ISO8859-1
* using option '--no-vignettes'
* checking for file 'atSNP/DESCRIPTION' ... OK
* checking extension type ... Package
* this is package 'atSNP' version '1.0.0'
* checking package namespace information ... OK
* checking package dependencies ... OK
* checking if this is a source package ... OK
* checking if there is a namespace ... OK
* checking for hidden files and directories ... OK
* checking for portable file names ... OK
* checking whether package 'atSNP' can be installed ... OK
* checking installed package size ... OK
* checking package directory ... OK
* checking 'build' directory ... OK
* checking DESCRIPTION meta-information ... OK
* checking top-level files ... OK
* checking for left-over files ... OK
* checking index information ... OK
* checking package subdirectories ... OK
* checking R files for non-ASCII characters ... OK
* checking R files for syntax errors ... OK
* loading checks for arch 'i386'
** checking whether the package can be loaded ... OK
** checking whether the package can be loaded with stated dependencies ... OK
** checking whether the package can be unloaded cleanly ... OK
** checking whether the namespace can be loaded with stated dependencies ... OK
** checking whether the namespace can be unloaded cleanly ... OK
* loading checks for arch 'x64'
** checking whether the package can be loaded ... OK
** checking whether the package can be loaded with stated dependencies ... OK
** checking whether the package can be unloaded cleanly ... OK
** checking whether the namespace can be loaded with stated dependencies ... OK
** checking whether the namespace can be unloaded cleanly ... OK
* checking dependencies in R code ... NOTE
Namespace in Imports field not imported from: 'graphics'
  All declared Imports should be used.
* checking S3 generic/method consistency ... OK
* checking replacement functions ... OK
* checking foreign function calls ... OK
* checking R code for possible problems ... NOTE
ComputePValues: no visible binding for global variable 'motif'
ComputePValues: no visible binding for global variable 'snpid'
ComputePValues: no visible binding for global variable 'snpbase'
ComputePValues: no visible binding for global variable 'pval_ref'
ComputePValues: no visible binding for global variable 'pval_snp'
ComputePValues: no visible binding for global variable 'pval_cond_ref'
ComputePValues: no visible binding for global variable 'pval_cond_snp'
ComputePValues: no visible binding for global variable 'pval_diff'
ComputePValues: no visible binding for global variable 'pval_rank'
LoadSNPData: no visible binding for global variable 'IUPAC_CODE_MAP'
MatchSubsequence: no visible binding for global variable 'motif'
MatchSubsequence: no visible binding for global variable 'snpid'
MatchSubsequence: no visible binding for global variable 'snpbase'
MatchSubsequence: no visible binding for global variable 'len_seq'
MatchSubsequence: no visible binding for global variable 'ref_seq'
MatchSubsequence : : no visible binding for global variable
  'motif'
dtMotifMatch: no visible binding for global variable 'ref_seq'
dtMotifMatch: no visible binding for global variable 'len_seq'
dtMotifMatch: no visible binding for global variable 'snp_ref_start'
dtMotifMatch: no visible binding for global variable 'ref_start'
dtMotifMatch: no visible binding for global variable 'snp_start'
dtMotifMatch: no visible binding for global variable 'snp_ref_end'
dtMotifMatch: no visible binding for global variable 'ref_end'
dtMotifMatch: no visible binding for global variable 'snp_end'
dtMotifMatch: no visible binding for global variable 'snp_ref_length'
dtMotifMatch: no visible binding for global variable
  'ref_aug_match_seq_forward'
dtMotifMatch: no visible binding for global variable
  'ref_aug_match_seq_reverse'
dtMotifMatch: no visible binding for global variable
  'snp_aug_match_seq_forward'
dtMotifMatch: no visible binding for global variable 'snp_seq'
dtMotifMatch: no visible binding for global variable
  'snp_aug_match_seq_reverse'
dtMotifMatch: no visible binding for global variable 'ref_strand'
dtMotifMatch: no visible binding for global variable 'ref_location'
dtMotifMatch: no visible binding for global variable 'snp_strand'
dtMotifMatch: no visible binding for global variable 'snp_location'
dtMotifMatch: no visible binding for global variable
  'ref_extra_pwm_left'
dtMotifMatch: no visible binding for global variable
  'ref_extra_pwm_right'
dtMotifMatch: no visible binding for global variable
  'snp_extra_pwm_left'
dtMotifMatch: no visible binding for global variable
  'snp_extra_pwm_right'
dtMotifMatch: no visible binding for global variable 'snpid'
match_subseq_par: no visible binding for global variable 'snpid'
match_subseq_par: no visible binding for global variable 'motif'
match_subseq_par: no visible binding for global variable 'snpbase'
match_subseq_par: no visible binding for global variable 'ref_strand'
match_subseq_par: no visible binding for global variable
  'ref_match_seq'
match_subseq_par: no visible binding for global variable 'ref_seq'
match_subseq_par: no visible binding for global variable 'ref_start'
match_subseq_par: no visible binding for global variable 'ref_end'
match_subseq_par: no visible binding for global variable 'ref_seq_rev'
match_subseq_par: no visible binding for global variable 'len_seq'
match_subseq_par: no visible binding for global variable 'snp_strand'
match_subseq_par: no visible binding for global variable
  'snp_match_seq'
match_subseq_par: no visible binding for global variable 'snp_seq'
match_subseq_par: no visible binding for global variable 'snp_start'
match_subseq_par: no visible binding for global variable 'snp_end'
match_subseq_par: no visible binding for global variable 'snp_seq_rev'
match_subseq_par: no visible binding for global variable
  'snp_seq_ref_match'
match_subseq_par: no visible binding for global variable
  'ref_seq_snp_match'
match_subseq_par: no visible binding for global variable 'motif_len'
match_subseq_par: no visible binding for global variable 'log_lik_ref'
match_subseq_par: no visible binding for global variable 'log_lik_snp'
match_subseq_par: no visible binding for global variable
  'log_lik_ratio'
match_subseq_par: no visible binding for global variable
  'log_enhance_odds'
match_subseq_par: no visible binding for global variable
  'log_reduce_odds'
match_subseq_par: no visible binding for global variable 'IUPAC'
motif_score_par: no visible binding for global variable 'motif'
motif_score_par: no visible binding for global variable 'snpbase'
results_motif_par: no visible global function definition for 'ggplot'
results_motif_par: no visible global function definition for 'aes'
results_motif_par: no visible binding for global variable 'p.value'
results_motif_par: no visible global function definition for
  'geom_point'
results_motif_par: no visible global function definition for
  'scale_y_log10'
results_motif_par: no visible global function definition for
  'geom_errorbar'
results_motif_par: no visible global function definition for 'ggtitle'
Undefined global functions or variables:
  IUPAC IUPAC_CODE_MAP aes geom_errorbar geom_point ggplot ggtitle
  len_seq log_enhance_odds log_lik_ratio log_lik_ref log_lik_snp
  log_reduce_odds motif motif_len p.value pval_cond_ref pval_cond_snp
  pval_diff pval_rank pval_ref pval_snp ref_aug_match_seq_forward
  ref_aug_match_seq_reverse ref_end ref_extra_pwm_left
  ref_extra_pwm_right ref_location ref_match_seq ref_seq ref_seq_rev
  ref_seq_snp_match ref_start ref_strand scale_y_log10
  snp_aug_match_seq_forward snp_aug_match_seq_reverse snp_end
  snp_extra_pwm_left snp_extra_pwm_right snp_location snp_match_seq
  snp_ref_end snp_ref_length snp_ref_start snp_seq snp_seq_ref_match
  snp_seq_rev snp_start snp_strand snpbase snpid
* checking Rd files ... OK
* checking Rd metadata ... OK
* checking Rd cross-references ... OK
* checking for missing documentation entries ... OK
* checking for code/documentation mismatches ... OK
* checking Rd \usage sections ... OK
* checking Rd contents ... OK
* checking for unstated dependencies in examples ... OK
* checking contents of 'data' directory ... OK
* checking data for non-ASCII characters ... OK
* checking data for ASCII and uncompressed saves ... OK
* checking line endings in C/C++/Fortran sources/headers ... OK
* checking compiled code ... NOTE
Note: information on .o files for i386 is not available
Note: information on .o files for x64 is not available
File 'C:/Users/biocbuild/bbs-3.9-bioc/R/library/atSNP/libs/i386/atSNP.dll':
  Found 'abort', possibly from 'abort' (C), 'runtime' (Fortran)
  Found 'exit', possibly from 'exit' (C), 'stop' (Fortran)
  Found 'printf', possibly from 'printf' (C)
File 'C:/Users/biocbuild/bbs-3.9-bioc/R/library/atSNP/libs/x64/atSNP.dll':
  Found 'abort', possibly from 'abort' (C), 'runtime' (Fortran)
  Found 'exit', possibly from 'exit' (C), 'stop' (Fortran)
  Found 'printf', possibly from 'printf' (C)

Compiled code should not call entry points which might terminate R nor
write to stdout/stderr instead of to the console, nor use Fortran I/O
nor system RNGs. The detected symbols are linked into the code but
might come from libraries and not actually be called.

See 'Writing portable packages' in the 'Writing R Extensions' manual.
* checking files in 'vignettes' ... OK
* checking examples ...
** running examples for arch 'i386' ... OK
Examples with CPU or elapsed time > 5s
                  user system elapsed
MatchSubsequence  0.95   0.02   16.56
ComputePValues    0.87   0.03   13.03
dtMotifMatch      0.72   0.08   14.15
ComputeMotifScore 0.72   0.01   16.71
** running examples for arch 'x64' ... OK
Examples with CPU or elapsed time > 5s
                  user system elapsed
MatchSubsequence  1.17   0.00   13.57
dtMotifMatch      0.96   0.04   13.35
ComputePValues    0.96   0.01   14.19
ComputeMotifScore 0.95   0.00   16.20
* checking for unstated dependencies in 'tests' ... OK
* checking tests ...
** running tests for arch 'i386' ...
  Running 'test.R'
  Running 'test_change.R'
  Running 'test_diff.R'
  Running 'test_is.R'
 OK
** running tests for arch 'x64' ...
  Running 'test.R'
  Running 'test_change.R'
  Running 'test_diff.R'
  Running 'test_is.R'
 OK
* checking for unstated dependencies in vignettes ... OK
* checking package vignettes in 'inst/doc' ... OK
* checking running R code from vignettes ... SKIPPED
* checking re-building of vignette outputs ... SKIPPED
* checking PDF version of manual ... OK
* DONE

Status: 3 NOTEs
See
  'C:/Users/biocbuild/bbs-3.9-bioc/meat/atSNP.Rcheck/00check.log'
for details.

Installation output

atSNP.Rcheck/00install.out

##############################################################################
##############################################################################
###
### Running command:
###
###   C:\cygwin\bin\curl.exe -O https://malbec2.bioconductor.org/BBS/3.9/bioc/src/contrib/atSNP_1.0.0.tar.gz && rm -rf atSNP.buildbin-libdir && mkdir atSNP.buildbin-libdir && C:\Users\biocbuild\bbs-3.9-bioc\R\bin\R.exe CMD INSTALL --merge-multiarch --build --library=atSNP.buildbin-libdir atSNP_1.0.0.tar.gz && C:\Users\biocbuild\bbs-3.9-bioc\R\bin\R.exe CMD INSTALL atSNP_1.0.0.zip && rm atSNP_1.0.0.tar.gz atSNP_1.0.0.zip
###
##############################################################################
##############################################################################


  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  512k  100  512k    0     0  7665k      0 --:--:-- --:--:-- --:--:-- 8544k

install for i386

* installing *source* package 'atSNP' ...
** using staged installation
** libs
C:/Rtools/mingw_32/bin/g++  -I"C:/Users/BIOCBU~1/BBS-3~1.9-B/R/include" -DNDEBUG  -I"C:/Users/biocbuild/bbs-3.9-bioc/R/library/Rcpp/include"   -I"C:/extsoft/include"     -O2 -Wall  -mtune=generic -c ImportanceSample.cpp -o ImportanceSample.o
C:/Rtools/mingw_32/bin/g++  -I"C:/Users/BIOCBU~1/BBS-3~1.9-B/R/include" -DNDEBUG  -I"C:/Users/biocbuild/bbs-3.9-bioc/R/library/Rcpp/include"   -I"C:/extsoft/include"     -O2 -Wall  -mtune=generic -c ImportanceSampleChange.cpp -o ImportanceSampleChange.o
C:/Rtools/mingw_32/bin/g++  -I"C:/Users/BIOCBU~1/BBS-3~1.9-B/R/include" -DNDEBUG  -I"C:/Users/biocbuild/bbs-3.9-bioc/R/library/Rcpp/include"   -I"C:/extsoft/include"     -O2 -Wall  -mtune=generic -c ImportanceSampleDiff.cpp -o ImportanceSampleDiff.o
C:/Rtools/mingw_32/bin/g++  -I"C:/Users/BIOCBU~1/BBS-3~1.9-B/R/include" -DNDEBUG  -I"C:/Users/biocbuild/bbs-3.9-bioc/R/library/Rcpp/include"   -I"C:/extsoft/include"     -O2 -Wall  -mtune=generic -c MotifScore.cpp -o MotifScore.o
C:/Rtools/mingw_32/bin/g++ -shared -s -static-libgcc -o atSNP.dll tmp.def ImportanceSample.o ImportanceSampleChange.o ImportanceSampleDiff.o MotifScore.o -LC:/extsoft/lib/i386 -LC:/extsoft/lib -LC:/Users/BIOCBU~1/BBS-3~1.9-B/R/bin/i386 -lR
installing to C:/Users/biocbuild/bbs-3.9-bioc/meat/atSNP.buildbin-libdir/00LOCK-atSNP/00new/atSNP/libs/i386
** R
** data
** byte-compile and prepare package for lazy loading
** help
*** installing help indices
  converting help for package 'atSNP'
    finding HTML links ... done
    ComputeMotifScore                       html  
    ComputePValues                          html  
    GetIUPACSequence                        html  
    LoadFastaData                           html  
    LoadMotifLibrary                        html  
    LoadSNPData                             html  
    MatchSubsequence                        html  
    atSNP-package                           html  
    dtMotifMatch                            html  
    encode_motif                            html  
    encode_motifinfo                        html  
    jaspar_motif                            html  
    jaspar_motifinfo                        html  
    motif_library                           html  
    motif_match                             html  
    motif_scores                            html  
    plotMotifMatch                          html  
    prior                                   html  
    snpInfo                                 html  
    snp_tbl                                 html  
    transition                              html  
** building package indices
** installing vignettes
** testing if installed package can be loaded from temporary location
** testing if installed package can be loaded from final location
** testing if installed package keeps a record of temporary installation path

install for x64

* installing *source* package 'atSNP' ...
** libs
C:/Rtools/mingw_64/bin/g++  -I"C:/Users/BIOCBU~1/BBS-3~1.9-B/R/include" -DNDEBUG  -I"C:/Users/biocbuild/bbs-3.9-bioc/R/library/Rcpp/include"   -I"C:/extsoft/include"     -O2 -Wall  -mtune=generic -c ImportanceSample.cpp -o ImportanceSample.o
C:/Rtools/mingw_64/bin/g++  -I"C:/Users/BIOCBU~1/BBS-3~1.9-B/R/include" -DNDEBUG  -I"C:/Users/biocbuild/bbs-3.9-bioc/R/library/Rcpp/include"   -I"C:/extsoft/include"     -O2 -Wall  -mtune=generic -c ImportanceSampleChange.cpp -o ImportanceSampleChange.o
C:/Rtools/mingw_64/bin/g++  -I"C:/Users/BIOCBU~1/BBS-3~1.9-B/R/include" -DNDEBUG  -I"C:/Users/biocbuild/bbs-3.9-bioc/R/library/Rcpp/include"   -I"C:/extsoft/include"     -O2 -Wall  -mtune=generic -c ImportanceSampleDiff.cpp -o ImportanceSampleDiff.o
C:/Rtools/mingw_64/bin/g++  -I"C:/Users/BIOCBU~1/BBS-3~1.9-B/R/include" -DNDEBUG  -I"C:/Users/biocbuild/bbs-3.9-bioc/R/library/Rcpp/include"   -I"C:/extsoft/include"     -O2 -Wall  -mtune=generic -c MotifScore.cpp -o MotifScore.o
C:/Rtools/mingw_64/bin/g++ -shared -s -static-libgcc -o atSNP.dll tmp.def ImportanceSample.o ImportanceSampleChange.o ImportanceSampleDiff.o MotifScore.o -LC:/extsoft/lib/x64 -LC:/extsoft/lib -LC:/Users/BIOCBU~1/BBS-3~1.9-B/R/bin/x64 -lR
installing to C:/Users/biocbuild/bbs-3.9-bioc/meat/atSNP.buildbin-libdir/atSNP/libs/x64
** testing if installed package can be loaded
* MD5 sums
packaged installation of 'atSNP' as atSNP_1.0.0.zip
* DONE (atSNP)
* installing to library 'C:/Users/biocbuild/bbs-3.9-bioc/R/library'
package 'atSNP' successfully unpacked and MD5 sums checked

Tests output

atSNP.Rcheck/tests_i386/test.Rout R version 3.6.1 (2019-07-05) -- "Action of the Toes" Copyright (C) 2019 The R Foundation for Statistical Computing Platform: i386-w64-mingw32/i386 (32-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. > library(atSNP) > library(BiocParallel) > library(testthat) > > ## process the data > data(example) > > motif_scores <- ComputeMotifScore(motif_library, snpInfo, ncores = 1) > > motif_scores <- MatchSubsequence(motif_scores$snp.tbl, motif_scores$motif.scores, ncores = 1, motif.lib = motif_library) > > motif_scores[which(motif_scores$snpid == "rs7412" & motif_scores$motif == "SIX5_disc1"), ] snpid motif 4 rs7412 SIX5_disc1 ref_seq 4 CTCCTCCGCGATGCCGATGACCTGCAGAAGCGCCTGGCAGTGTACCAGGCCGGGGCCCGCG snp_seq motif_len 4 CTCCTCCGCGATGCCGATGACCTGCAGAAGTGCCTGGCAGTGTACCAGGCCGGGGCCCGCG 10 ref_start ref_end ref_strand snp_start snp_end snp_strand log_lik_ref 4 29 38 - 22 31 + -42.60672 log_lik_snp log_lik_ratio log_enhance_odds log_reduce_odds IUPAC 4 -38.4083 -4.198418 23.013 -2.917768 GARWTGTAGT ref_match_seq snp_match_seq ref_seq_snp_match snp_seq_ref_match snpbase 4 GCCAGGCGCT CTGCAGAAGT CTGCAGAAGC GCCAGGCACT T > > len_seq <- sapply(motif_scores$ref_seq, nchar) > snp_pos <- as.integer(len_seq / 2) + 1 > > i <- which(motif_scores$snpid == "rs7412" & motif_scores$motif == "SIX5_disc1") > > test_that("Error: reference bases are not the same as the sequence matrix.", { + expect_equal(sum(snpInfo$sequence_matrix[31, ] != snpInfo$ref_base), 0) + expect_equal(sum(snpInfo$sequence_matrix[31, ] == snpInfo$snp_base), 0) + }) > > test_that("Error: log_lik_ratio is not correct.", { + expect_equal(motif_scores$log_lik_ref - motif_scores$log_lik_snp, motif_scores$log_lik_ratio) + }) > > test_that("Error: log likelihoods are not correct.", { + + log_lik <- sapply(seq(nrow(motif_scores)), + function(i) { + motif_mat <- motif_library[[motif_scores$motif[i]]] + colind<-which(snpInfo$snpids==motif_scores$snpid[i]) + bases <- snpInfo$sequence_matrix[motif_scores$ref_start[i]:motif_scores$ref_end[i], colind] + if(motif_scores$ref_strand[i] == "-") + bases <- 5 - rev(bases) + log(prod( + motif_mat[cbind(seq(nrow(motif_mat)), + bases)])) + }) + + expect_equal(log_lik, motif_scores$log_lik_ref) + + snp_mat <- snpInfo$sequence_matrix + snp_mat[cbind(snp_pos, seq(ncol(snp_mat)))] <- snpInfo$snp_base + log_lik <- sapply(seq(nrow(motif_scores)), + function(i) { + motif_mat <- motif_library[[motif_scores$motif[i]]] + colind<-which(snpInfo$snpids==motif_scores$snpid[i]) + bases <- snp_mat[motif_scores$snp_start[i]:motif_scores$snp_end[i], colind] + if(motif_scores$snp_strand[i] == "-") + bases <- 5 - rev(bases) + log(prod( + motif_mat[cbind(seq(nrow(motif_mat)), + bases)])) + }) + + expect_equal(log_lik, motif_scores$log_lik_snp) + }) > > test_that("Error: log_enhance_odds not correct.", { + + len_seq <- sapply(motif_scores$ref_seq, nchar) + snp_pos <- as.integer(len_seq / 2) + 1 + + ## log odds for reduction in binding affinity + + pos_in_pwm <- snp_pos - motif_scores$ref_start + 1 + neg_ids <- which(motif_scores$ref_strand == "-") + pos_in_pwm[neg_ids] <- motif_scores$ref_end[neg_ids]- snp_pos[neg_ids] + 1 + snp_base <- sapply(substr(motif_scores$snp_seq, snp_pos, snp_pos), function(x) which(c("A", "C", "G", "T") == x)) + ref_base <- sapply(substr(motif_scores$ref_seq, snp_pos, snp_pos), function(x) which(c("A", "C", "G", "T") == x)) + snp_base[neg_ids] <- 5 - snp_base[neg_ids] + ref_base[neg_ids] <- 5 - ref_base[neg_ids] + my_log_reduce_odds <- sapply(seq(nrow(motif_scores)), + function(i) + log(motif_library[[motif_scores$motif[i]]][pos_in_pwm[i], ref_base[i]]) - + log(motif_library[[motif_scores$motif[i]]][pos_in_pwm[i], snp_base[i]]) + ) + + expect_equal(my_log_reduce_odds, motif_scores$log_reduce_odds) + + ## log odds in enhancing binding affinity + + pos_in_pwm <- snp_pos - motif_scores$snp_start + 1 + neg_ids <- which(motif_scores$snp_strand == "-") + pos_in_pwm[neg_ids] <- motif_scores$snp_end[neg_ids]- snp_pos[neg_ids] + 1 + snp_base <- sapply(substr(motif_scores$snp_seq, snp_pos, snp_pos), function(x) which(c("A", "C", "G", "T") == x)) + ref_base <- sapply(substr(motif_scores$ref_seq, snp_pos, snp_pos), function(x) which(c("A", "C", "G", "T") == x)) + snp_base[neg_ids] <- 5 - snp_base[neg_ids] + ref_base[neg_ids] <- 5 - ref_base[neg_ids] + my_log_enhance_odds <- sapply(seq(nrow(motif_scores)), + function(i) + log(motif_library[[motif_scores$motif[i]]][pos_in_pwm[i], snp_base[i]]) - + log(motif_library[[motif_scores$motif[i]]][pos_in_pwm[i], ref_base[i]]) + ) + + expect_equal(my_log_enhance_odds, motif_scores$log_enhance_odds) + + + }) > > test_that("Error: the maximum log likelihood computation is not correct.", { + + snp_mat <- snpInfo$sequence_matrix + snp_mat[cbind(snp_pos, seq(ncol(snp_mat)))] <- snpInfo$snp_base + + .findMaxLog <- function(seq_vec, pwm) { + snp_pos <- as.integer(length(seq_vec) / 2) + 1 + start_pos <- snp_pos - nrow(pwm) + 1 + end_pos <- snp_pos + rev_seq <- 5 - rev(seq_vec) + + maxLogProb <- -Inf + for(i in start_pos : end_pos) { + LogProb <- log(prod(pwm[cbind(seq(nrow(pwm)), + seq_vec[i - 1 + seq(nrow(pwm))])])) + if(LogProb > maxLogProb) + maxLogProb <- LogProb + } + for(i in start_pos : end_pos) { + LogProb <- log(prod(pwm[cbind(seq(nrow(pwm)), + rev_seq[i - 1 + seq(nrow(pwm))])])) + if(LogProb > maxLogProb) + maxLogProb <- LogProb + } + return(maxLogProb) + } + + ## find the maximum log likelihood on the reference sequence + my_log_lik_ref <- sapply(seq(nrow(motif_scores)), + function(x) { + colind<-which(snpInfo$snpids==motif_scores$snpid[x]) + seq_vec<- snpInfo$sequence_matrix[, colind] + pwm <- motif_library[[motif_scores$motif[x]]] + return(.findMaxLog(seq_vec, pwm)) + }) + + ## find the maximum log likelihood on the SNP sequence + + my_log_lik_snp <- sapply(seq(nrow(motif_scores)), + function(x) { + colind<-which(snpInfo$snpids==motif_scores$snpid[x]) #ADDED + seq_vec<- snp_mat[, colind] + pwm <- motif_library[[motif_scores$motif[x]]] + return(.findMaxLog(seq_vec, pwm)) + }) + + expect_equal(my_log_lik_ref, motif_scores$log_lik_ref) + expect_equal(my_log_lik_snp, motif_scores$log_lik_snp) + + }) > > proc.time() user system elapsed 12.96 0.98 13.90	atSNP.Rcheck/tests_x64/test.Rout R version 3.6.1 (2019-07-05) -- "Action of the Toes" Copyright (C) 2019 The R Foundation for Statistical Computing Platform: x86_64-w64-mingw32/x64 (64-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. > library(atSNP) > library(BiocParallel) > library(testthat) > > ## process the data > data(example) > > motif_scores <- ComputeMotifScore(motif_library, snpInfo, ncores = 1) > > motif_scores <- MatchSubsequence(motif_scores$snp.tbl, motif_scores$motif.scores, ncores = 1, motif.lib = motif_library) > > motif_scores[which(motif_scores$snpid == "rs7412" & motif_scores$motif == "SIX5_disc1"), ] snpid motif 4 rs7412 SIX5_disc1 ref_seq 4 CTCCTCCGCGATGCCGATGACCTGCAGAAGCGCCTGGCAGTGTACCAGGCCGGGGCCCGCG snp_seq motif_len 4 CTCCTCCGCGATGCCGATGACCTGCAGAAGTGCCTGGCAGTGTACCAGGCCGGGGCCCGCG 10 ref_start ref_end ref_strand snp_start snp_end snp_strand log_lik_ref 4 29 38 - 22 31 + -42.60672 log_lik_snp log_lik_ratio log_enhance_odds log_reduce_odds IUPAC 4 -38.4083 -4.198418 23.013 -2.917768 GARWTGTAGT ref_match_seq snp_match_seq ref_seq_snp_match snp_seq_ref_match snpbase 4 GCCAGGCGCT CTGCAGAAGT CTGCAGAAGC GCCAGGCACT T > > len_seq <- sapply(motif_scores$ref_seq, nchar) > snp_pos <- as.integer(len_seq / 2) + 1 > > i <- which(motif_scores$snpid == "rs7412" & motif_scores$motif == "SIX5_disc1") > > test_that("Error: reference bases are not the same as the sequence matrix.", { + expect_equal(sum(snpInfo$sequence_matrix[31, ] != snpInfo$ref_base), 0) + expect_equal(sum(snpInfo$sequence_matrix[31, ] == snpInfo$snp_base), 0) + }) > > test_that("Error: log_lik_ratio is not correct.", { + expect_equal(motif_scores$log_lik_ref - motif_scores$log_lik_snp, motif_scores$log_lik_ratio) + }) > > test_that("Error: log likelihoods are not correct.", { + + log_lik <- sapply(seq(nrow(motif_scores)), + function(i) { + motif_mat <- motif_library[[motif_scores$motif[i]]] + colind<-which(snpInfo$snpids==motif_scores$snpid[i]) + bases <- snpInfo$sequence_matrix[motif_scores$ref_start[i]:motif_scores$ref_end[i], colind] + if(motif_scores$ref_strand[i] == "-") + bases <- 5 - rev(bases) + log(prod( + motif_mat[cbind(seq(nrow(motif_mat)), + bases)])) + }) + + expect_equal(log_lik, motif_scores$log_lik_ref) + + snp_mat <- snpInfo$sequence_matrix + snp_mat[cbind(snp_pos, seq(ncol(snp_mat)))] <- snpInfo$snp_base + log_lik <- sapply(seq(nrow(motif_scores)), + function(i) { + motif_mat <- motif_library[[motif_scores$motif[i]]] + colind<-which(snpInfo$snpids==motif_scores$snpid[i]) + bases <- snp_mat[motif_scores$snp_start[i]:motif_scores$snp_end[i], colind] + if(motif_scores$snp_strand[i] == "-") + bases <- 5 - rev(bases) + log(prod( + motif_mat[cbind(seq(nrow(motif_mat)), + bases)])) + }) + + expect_equal(log_lik, motif_scores$log_lik_snp) + }) > > test_that("Error: log_enhance_odds not correct.", { + + len_seq <- sapply(motif_scores$ref_seq, nchar) + snp_pos <- as.integer(len_seq / 2) + 1 + + ## log odds for reduction in binding affinity + + pos_in_pwm <- snp_pos - motif_scores$ref_start + 1 + neg_ids <- which(motif_scores$ref_strand == "-") + pos_in_pwm[neg_ids] <- motif_scores$ref_end[neg_ids]- snp_pos[neg_ids] + 1 + snp_base <- sapply(substr(motif_scores$snp_seq, snp_pos, snp_pos), function(x) which(c("A", "C", "G", "T") == x)) + ref_base <- sapply(substr(motif_scores$ref_seq, snp_pos, snp_pos), function(x) which(c("A", "C", "G", "T") == x)) + snp_base[neg_ids] <- 5 - snp_base[neg_ids] + ref_base[neg_ids] <- 5 - ref_base[neg_ids] + my_log_reduce_odds <- sapply(seq(nrow(motif_scores)), + function(i) + log(motif_library[[motif_scores$motif[i]]][pos_in_pwm[i], ref_base[i]]) - + log(motif_library[[motif_scores$motif[i]]][pos_in_pwm[i], snp_base[i]]) + ) + + expect_equal(my_log_reduce_odds, motif_scores$log_reduce_odds) + + ## log odds in enhancing binding affinity + + pos_in_pwm <- snp_pos - motif_scores$snp_start + 1 + neg_ids <- which(motif_scores$snp_strand == "-") + pos_in_pwm[neg_ids] <- motif_scores$snp_end[neg_ids]- snp_pos[neg_ids] + 1 + snp_base <- sapply(substr(motif_scores$snp_seq, snp_pos, snp_pos), function(x) which(c("A", "C", "G", "T") == x)) + ref_base <- sapply(substr(motif_scores$ref_seq, snp_pos, snp_pos), function(x) which(c("A", "C", "G", "T") == x)) + snp_base[neg_ids] <- 5 - snp_base[neg_ids] + ref_base[neg_ids] <- 5 - ref_base[neg_ids] + my_log_enhance_odds <- sapply(seq(nrow(motif_scores)), + function(i) + log(motif_library[[motif_scores$motif[i]]][pos_in_pwm[i], snp_base[i]]) - + log(motif_library[[motif_scores$motif[i]]][pos_in_pwm[i], ref_base[i]]) + ) + + expect_equal(my_log_enhance_odds, motif_scores$log_enhance_odds) + + + }) > > test_that("Error: the maximum log likelihood computation is not correct.", { + + snp_mat <- snpInfo$sequence_matrix + snp_mat[cbind(snp_pos, seq(ncol(snp_mat)))] <- snpInfo$snp_base + + .findMaxLog <- function(seq_vec, pwm) { + snp_pos <- as.integer(length(seq_vec) / 2) + 1 + start_pos <- snp_pos - nrow(pwm) + 1 + end_pos <- snp_pos + rev_seq <- 5 - rev(seq_vec) + + maxLogProb <- -Inf + for(i in start_pos : end_pos) { + LogProb <- log(prod(pwm[cbind(seq(nrow(pwm)), + seq_vec[i - 1 + seq(nrow(pwm))])])) + if(LogProb > maxLogProb) + maxLogProb <- LogProb + } + for(i in start_pos : end_pos) { + LogProb <- log(prod(pwm[cbind(seq(nrow(pwm)), + rev_seq[i - 1 + seq(nrow(pwm))])])) + if(LogProb > maxLogProb) + maxLogProb <- LogProb + } + return(maxLogProb) + } + + ## find the maximum log likelihood on the reference sequence + my_log_lik_ref <- sapply(seq(nrow(motif_scores)), + function(x) { + colind<-which(snpInfo$snpids==motif_scores$snpid[x]) + seq_vec<- snpInfo$sequence_matrix[, colind] + pwm <- motif_library[[motif_scores$motif[x]]] + return(.findMaxLog(seq_vec, pwm)) + }) + + ## find the maximum log likelihood on the SNP sequence + + my_log_lik_snp <- sapply(seq(nrow(motif_scores)), + function(x) { + colind<-which(snpInfo$snpids==motif_scores$snpid[x]) #ADDED + seq_vec<- snp_mat[, colind] + pwm <- motif_library[[motif_scores$motif[x]]] + return(.findMaxLog(seq_vec, pwm)) + }) + + expect_equal(my_log_lik_ref, motif_scores$log_lik_ref) + expect_equal(my_log_lik_snp, motif_scores$log_lik_snp) + + }) > > proc.time() user system elapsed 11.48 0.60 12.07
atSNP.Rcheck/tests_i386/test_change.Rout R version 3.6.1 (2019-07-05) -- "Action of the Toes" Copyright (C) 2019 The R Foundation for Statistical Computing Platform: i386-w64-mingw32/i386 (32-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. > library(atSNP) > library(BiocParallel) > library(testthat) > data(example) > > trans_mat <- matrix(rep(snpInfo$prior, each = 4), nrow = 4) > test_pwm <- motif_library$SIX5_disc1 > scores <- as.matrix(motif_scores$motif.scores[3:4, 4:5]) > score_diff <- abs(scores[,2]-scores[,1]) > > pval_a <- .Call("test_p_value", test_pwm, snpInfo$prior, snpInfo$transition, scores, 0.15, 100) > pval_ratio <- abs(log(pval_a[seq(nrow(scores)),1]) - log(pval_a[seq(nrow(scores)) + nrow(scores), 1])) > > test_score <- test_pwm > for(i in seq(nrow(test_score))) { + for(j in seq(ncol(test_score))) { + test_score[i, j] <- exp(mean(log(test_pwm[i, j] / test_pwm[i, -j]))) + } + } > > adj_mat <- test_pwm + 0.25 > motif_len <- nrow(test_pwm) > > ## these are functions for this test only > drawonesample <- function(theta) { + prob_start <- rev(rowSums(test_score ^ theta) / rowSums(adj_mat)) + id <- sample(seq(motif_len), 1, prob = prob_start) + sample <- sample(1:4, 2 * motif_len - 1, replace = TRUE, prob = snpInfo$prior) + delta <- adj_mat + delta[motif_len - id + 1, ] <- test_score[motif_len - id + 1, ] ^ theta + sample[id - 1 + seq(motif_len)] <- apply(delta, 1, function(x) sample(seq(4), 1, prob = x)) + ## compute weight + sc <- 0 + for(s in seq(motif_len)) { + delta <- adj_mat + delta[motif_len + 1 - s, ] <- test_score[motif_len + 1 - s, ] ^ theta + sc <- sc + prod(delta[cbind(seq(motif_len), sample[s - 1 + seq(motif_len)])]) / + prod(snpInfo$prior[sample[s - 1 + seq(motif_len)]]) + } + sample <- c(sample, id, sc) + return(sample) + } > jointprob <- function(x) prod(test_pwm[cbind(seq(motif_len), x)]) > maxjointprob <- function(x) { + maxp <- -Inf + p <- -Inf + for(i in 1:motif_len) { + p <- jointprob(x[i:(i+motif_len - 1)]) + if(p > maxp) + maxp <- p + } + for(i in 1:motif_len) { + p <- jointprob(5 - x[(i+motif_len - 1):i]) + if(p > maxp) + maxp <- p + } + return(maxp) + } > get_freq <- function(sample) { + emp_freq <- matrix(0, nrow = 2 * motif_len - 1, ncol = 4) + for(i in seq(2 * motif_len - 1)) { + for(j in seq(4)) { + emp_freq[i, j] <- sum(sample[i, ] == j - 1) + } + } + emp_freq <- emp_freq / rowSums(emp_freq) + return(emp_freq) + } > > test_that("Error: quantile function computing are not equivalent.", { + for(p in c(0.01, 0.1, 0.5, 0.9, 0.99) ) { + delta <- .Call("test_find_percentile_change", score_diff, p, package = "atSNP") + delta.r <- as.double(sort(abs(scores[,2]-scores[,1]))[ceiling((1 - p) * (nrow(scores)))]) + expect_equal(delta, delta.r) + } + }) > > test_that("Error: the scores for samples are not equivalent.", { + p <- 0.1 + delta <- .Call("test_find_percentile_change", score_diff, p, package = "atSNP") + theta <- .Call("test_find_theta_change", test_score, adj_mat, delta, package = "atSNP") + ## Use R code to generate a random sample + for(i in seq(10)) { + sample <- drawonesample(theta) + sample_score <- .Call("test_compute_sample_score_change", test_pwm, test_score, adj_mat, sample[seq(2 * motif_len - 1)] - 1, snpInfo$prior, trans_mat, sample[2 * motif_len] - 1, theta, package = "atSNP") + expect_equal(sample[2 * motif_len + 1], sample_score[1]) + sample1 <- sample2 <- sample3 <- sample + sample1[motif_len] <- seq(4)[-sample[motif_len]][1] + sample2[motif_len] <- seq(4)[-sample[motif_len]][2] + sample3[motif_len] <- seq(4)[-sample[motif_len]][3] + sample_score_r <- log(maxjointprob(sample[seq(2 * motif_len - 1)])) - + log(c(maxjointprob(sample1[seq(2 * motif_len - 1)]), + maxjointprob(sample2[seq(2 * motif_len - 1)]), + maxjointprob(sample3[seq(2 * motif_len - 1)]))) + expect_equal(sample_score_r, sample_score[2:4]) + } + + ## Use C code to generate a random sample + for(i in seq(10)) { + sample <- .Call("test_importance_sample_change", test_score, snpInfo$prior, trans_mat, test_pwm, theta, package = "atSNP") + start_pos <- sample[2 * motif_len] + 1 + adj_score <- 0 + for(s in seq_len(motif_len)) { + adj_s <- sum(log(adj_mat[cbind(seq(motif_len), sample[s - 1 + seq(motif_len)] + 1)]) - + log(snpInfo$prior[sample[s - 1 + seq(motif_len)] + 1])) + adj_s <- adj_s + theta * log(test_score[motif_len + 1 - s, sample[motif_len] + 1]) - + log(adj_mat[motif_len + 1 - s, sample[motif_len] + 1]) + adj_score <- adj_score + exp(adj_s) + } + sample_score <- .Call("test_compute_sample_score_change", test_pwm, test_score, adj_mat, sample[seq(2 * motif_len - 1)], snpInfo$prior, trans_mat, sample[2 * motif_len], theta, package = "atSNP") + expect_equal(adj_score, sample_score[1]) + } + }) > > test_that("Error: compute the normalizing constant.", { + ## parameters + for(p in seq(9) / 10) { + delta <- .Call("test_find_percentile_change", score_diff, p, package = "atSNP") + theta <- .Call("test_find_theta_change", test_score, adj_mat, delta, package = "atSNP") + const <- .Call("test_func_delta_change", test_score, adj_mat, theta, package = "atSNP") + ## in R + adj_sum <- rowSums(adj_mat) + wei_sum <- rowSums(test_score ^ theta) + const.r <- prod(adj_sum) * sum(wei_sum / adj_sum) + expect_equal(const, const.r) + } + }) > > test_that("Error: sample distributions are not expected.", { + ## parameters + p <- 0.1 + delta <- .Call("test_find_percentile_change", score_diff, p, package = "atSNP") + theta <- .Call("test_find_theta_change", test_score, adj_mat, delta, package = "atSNP") + prob_start <- rev(rowSums(test_score ^ theta) / rowSums(adj_mat)) + ## construct the delta matrix + delta <- matrix(1, nrow = 4 * motif_len, ncol = 2 * motif_len - 1) + for(pos in seq(motif_len)) { + delta[seq(4) + 4 * (pos - 1), ] <- snpInfo$prior + delta[seq(4) + 4 * (pos - 1), pos - 1 + seq(motif_len)] <- t(test_pwm) + delta[seq(4) + 4 * (pos - 1), motif_len] <- test_score[motif_len + 1 - pos, ] ^ theta + delta[seq(4) + 4 * (pos - 1), ] <- delta[seq(4) + 4 * (pos - 1),] / rep(colSums(delta[seq(4) + 4 * (pos - 1), ]), each = 4) + } + target_freq <- matrix(0, nrow = 4, ncol = 2 * motif_len - 1) + for(pos in seq(motif_len)) { + target_freq <- target_freq + delta[seq(4) + 4 * (pos - 1), ] * prob_start[pos] + } + target_freq <- t(target_freq) + target_freq <- target_freq / rowSums(target_freq) + + results_i <- function(i) { + ## generate 100 samples + sample1 <- sapply(seq(100), function(x) + .Call("test_importance_sample_change", + adj_mat, snpInfo$prior, trans_mat, test_score, theta, package = "atSNP")) + emp_freq1 <- get_freq(sample1) + sample2 <- sapply(rep(theta, 100), drawonesample) + emp_freq2 <- get_freq(sample2 - 1) + ## print(rbind(emp_freq1[10, ], emp_freq2[10, ], target_freq[10, ])) + max(abs(emp_freq1 - target_freq)) > max(abs(emp_freq2 - target_freq)) + } + + if(Sys.info()[["sysname"]] == "Windows"){ + snow <- SnowParam(workers = 1, type = "SOCK") + results<-bpmapply(results_i, seq(20), BPPARAM = snow,SIMPLIFY = FALSE) + }else{ + results<-bpmapply(results_i, seq(20), BPPARAM = MulticoreParam(workers = 1), + SIMPLIFY = FALSE) + } + + print(sum(unlist(results))) + print(pbinom(sum(unlist(results)), size = 20, prob = 0.5)) + }) [1] 10 [1] 0.5880985 > > test_that("Error: the chosen pvalues should have the smaller variance.", { + .structure_diff <- function(pval_mat) { + id <- apply(pval_mat[, c(2, 4)], 1, which.min) + return(cbind(pval_mat[, c(1, 3)][cbind(seq_along(id), id)], + pval_mat[, c(2, 4)][cbind(seq_along(id), id)])) + } + for(p in c(0.05, 0.1, 0.2, 0.5)) { + p_values <- .Call("test_p_value_change", test_pwm, test_score, adj_mat, snpInfo$prior, snpInfo$transition, score_diff, pval_ratio, quantile(score_diff, 1 - p), 100, package = "atSNP")$score + p_values_s <- .structure_diff(p_values) + expect_equal(p_values_s[, 2], apply(p_values[, c(2, 4)], 1, min)) + } + }) > > proc.time() user system elapsed 10.15 1.03 11.15	atSNP.Rcheck/tests_x64/test_change.Rout R version 3.6.1 (2019-07-05) -- "Action of the Toes" Copyright (C) 2019 The R Foundation for Statistical Computing Platform: x86_64-w64-mingw32/x64 (64-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. > library(atSNP) > library(BiocParallel) > library(testthat) > data(example) > > trans_mat <- matrix(rep(snpInfo$prior, each = 4), nrow = 4) > test_pwm <- motif_library$SIX5_disc1 > scores <- as.matrix(motif_scores$motif.scores[3:4, 4:5]) > score_diff <- abs(scores[,2]-scores[,1]) > > pval_a <- .Call("test_p_value", test_pwm, snpInfo$prior, snpInfo$transition, scores, 0.15, 100) > pval_ratio <- abs(log(pval_a[seq(nrow(scores)),1]) - log(pval_a[seq(nrow(scores)) + nrow(scores), 1])) > > test_score <- test_pwm > for(i in seq(nrow(test_score))) { + for(j in seq(ncol(test_score))) { + test_score[i, j] <- exp(mean(log(test_pwm[i, j] / test_pwm[i, -j]))) + } + } > > adj_mat <- test_pwm + 0.25 > motif_len <- nrow(test_pwm) > > ## these are functions for this test only > drawonesample <- function(theta) { + prob_start <- rev(rowSums(test_score ^ theta) / rowSums(adj_mat)) + id <- sample(seq(motif_len), 1, prob = prob_start) + sample <- sample(1:4, 2 * motif_len - 1, replace = TRUE, prob = snpInfo$prior) + delta <- adj_mat + delta[motif_len - id + 1, ] <- test_score[motif_len - id + 1, ] ^ theta + sample[id - 1 + seq(motif_len)] <- apply(delta, 1, function(x) sample(seq(4), 1, prob = x)) + ## compute weight + sc <- 0 + for(s in seq(motif_len)) { + delta <- adj_mat + delta[motif_len + 1 - s, ] <- test_score[motif_len + 1 - s, ] ^ theta + sc <- sc + prod(delta[cbind(seq(motif_len), sample[s - 1 + seq(motif_len)])]) / + prod(snpInfo$prior[sample[s - 1 + seq(motif_len)]]) + } + sample <- c(sample, id, sc) + return(sample) + } > jointprob <- function(x) prod(test_pwm[cbind(seq(motif_len), x)]) > maxjointprob <- function(x) { + maxp <- -Inf + p <- -Inf + for(i in 1:motif_len) { + p <- jointprob(x[i:(i+motif_len - 1)]) + if(p > maxp) + maxp <- p + } + for(i in 1:motif_len) { + p <- jointprob(5 - x[(i+motif_len - 1):i]) + if(p > maxp) + maxp <- p + } + return(maxp) + } > get_freq <- function(sample) { + emp_freq <- matrix(0, nrow = 2 * motif_len - 1, ncol = 4) + for(i in seq(2 * motif_len - 1)) { + for(j in seq(4)) { + emp_freq[i, j] <- sum(sample[i, ] == j - 1) + } + } + emp_freq <- emp_freq / rowSums(emp_freq) + return(emp_freq) + } > > test_that("Error: quantile function computing are not equivalent.", { + for(p in c(0.01, 0.1, 0.5, 0.9, 0.99) ) { + delta <- .Call("test_find_percentile_change", score_diff, p, package = "atSNP") + delta.r <- as.double(sort(abs(scores[,2]-scores[,1]))[ceiling((1 - p) * (nrow(scores)))]) + expect_equal(delta, delta.r) + } + }) > > test_that("Error: the scores for samples are not equivalent.", { + p <- 0.1 + delta <- .Call("test_find_percentile_change", score_diff, p, package = "atSNP") + theta <- .Call("test_find_theta_change", test_score, adj_mat, delta, package = "atSNP") + ## Use R code to generate a random sample + for(i in seq(10)) { + sample <- drawonesample(theta) + sample_score <- .Call("test_compute_sample_score_change", test_pwm, test_score, adj_mat, sample[seq(2 * motif_len - 1)] - 1, snpInfo$prior, trans_mat, sample[2 * motif_len] - 1, theta, package = "atSNP") + expect_equal(sample[2 * motif_len + 1], sample_score[1]) + sample1 <- sample2 <- sample3 <- sample + sample1[motif_len] <- seq(4)[-sample[motif_len]][1] + sample2[motif_len] <- seq(4)[-sample[motif_len]][2] + sample3[motif_len] <- seq(4)[-sample[motif_len]][3] + sample_score_r <- log(maxjointprob(sample[seq(2 * motif_len - 1)])) - + log(c(maxjointprob(sample1[seq(2 * motif_len - 1)]), + maxjointprob(sample2[seq(2 * motif_len - 1)]), + maxjointprob(sample3[seq(2 * motif_len - 1)]))) + expect_equal(sample_score_r, sample_score[2:4]) + } + + ## Use C code to generate a random sample + for(i in seq(10)) { + sample <- .Call("test_importance_sample_change", test_score, snpInfo$prior, trans_mat, test_pwm, theta, package = "atSNP") + start_pos <- sample[2 * motif_len] + 1 + adj_score <- 0 + for(s in seq_len(motif_len)) { + adj_s <- sum(log(adj_mat[cbind(seq(motif_len), sample[s - 1 + seq(motif_len)] + 1)]) - + log(snpInfo$prior[sample[s - 1 + seq(motif_len)] + 1])) + adj_s <- adj_s + theta * log(test_score[motif_len + 1 - s, sample[motif_len] + 1]) - + log(adj_mat[motif_len + 1 - s, sample[motif_len] + 1]) + adj_score <- adj_score + exp(adj_s) + } + sample_score <- .Call("test_compute_sample_score_change", test_pwm, test_score, adj_mat, sample[seq(2 * motif_len - 1)], snpInfo$prior, trans_mat, sample[2 * motif_len], theta, package = "atSNP") + expect_equal(adj_score, sample_score[1]) + } + }) > > test_that("Error: compute the normalizing constant.", { + ## parameters + for(p in seq(9) / 10) { + delta <- .Call("test_find_percentile_change", score_diff, p, package = "atSNP") + theta <- .Call("test_find_theta_change", test_score, adj_mat, delta, package = "atSNP") + const <- .Call("test_func_delta_change", test_score, adj_mat, theta, package = "atSNP") + ## in R + adj_sum <- rowSums(adj_mat) + wei_sum <- rowSums(test_score ^ theta) + const.r <- prod(adj_sum) * sum(wei_sum / adj_sum) + expect_equal(const, const.r) + } + }) > > test_that("Error: sample distributions are not expected.", { + ## parameters + p <- 0.1 + delta <- .Call("test_find_percentile_change", score_diff, p, package = "atSNP") + theta <- .Call("test_find_theta_change", test_score, adj_mat, delta, package = "atSNP") + prob_start <- rev(rowSums(test_score ^ theta) / rowSums(adj_mat)) + ## construct the delta matrix + delta <- matrix(1, nrow = 4 * motif_len, ncol = 2 * motif_len - 1) + for(pos in seq(motif_len)) { + delta[seq(4) + 4 * (pos - 1), ] <- snpInfo$prior + delta[seq(4) + 4 * (pos - 1), pos - 1 + seq(motif_len)] <- t(test_pwm) + delta[seq(4) + 4 * (pos - 1), motif_len] <- test_score[motif_len + 1 - pos, ] ^ theta + delta[seq(4) + 4 * (pos - 1), ] <- delta[seq(4) + 4 * (pos - 1),] / rep(colSums(delta[seq(4) + 4 * (pos - 1), ]), each = 4) + } + target_freq <- matrix(0, nrow = 4, ncol = 2 * motif_len - 1) + for(pos in seq(motif_len)) { + target_freq <- target_freq + delta[seq(4) + 4 * (pos - 1), ] * prob_start[pos] + } + target_freq <- t(target_freq) + target_freq <- target_freq / rowSums(target_freq) + + results_i <- function(i) { + ## generate 100 samples + sample1 <- sapply(seq(100), function(x) + .Call("test_importance_sample_change", + adj_mat, snpInfo$prior, trans_mat, test_score, theta, package = "atSNP")) + emp_freq1 <- get_freq(sample1) + sample2 <- sapply(rep(theta, 100), drawonesample) + emp_freq2 <- get_freq(sample2 - 1) + ## print(rbind(emp_freq1[10, ], emp_freq2[10, ], target_freq[10, ])) + max(abs(emp_freq1 - target_freq)) > max(abs(emp_freq2 - target_freq)) + } + + if(Sys.info()[["sysname"]] == "Windows"){ + snow <- SnowParam(workers = 1, type = "SOCK") + results<-bpmapply(results_i, seq(20), BPPARAM = snow,SIMPLIFY = FALSE) + }else{ + results<-bpmapply(results_i, seq(20), BPPARAM = MulticoreParam(workers = 1), + SIMPLIFY = FALSE) + } + + print(sum(unlist(results))) + print(pbinom(sum(unlist(results)), size = 20, prob = 0.5)) + }) [1] 8 [1] 0.2517223 > > test_that("Error: the chosen pvalues should have the smaller variance.", { + .structure_diff <- function(pval_mat) { + id <- apply(pval_mat[, c(2, 4)], 1, which.min) + return(cbind(pval_mat[, c(1, 3)][cbind(seq_along(id), id)], + pval_mat[, c(2, 4)][cbind(seq_along(id), id)])) + } + for(p in c(0.05, 0.1, 0.2, 0.5)) { + p_values <- .Call("test_p_value_change", test_pwm, test_score, adj_mat, snpInfo$prior, snpInfo$transition, score_diff, pval_ratio, quantile(score_diff, 1 - p), 100, package = "atSNP")$score + p_values_s <- .structure_diff(p_values) + expect_equal(p_values_s[, 2], apply(p_values[, c(2, 4)], 1, min)) + } + }) > > proc.time() user system elapsed 12.65 0.60 13.25
atSNP.Rcheck/tests_i386/test_diff.Rout R version 3.6.1 (2019-07-05) -- "Action of the Toes" Copyright (C) 2019 The R Foundation for Statistical Computing Platform: i386-w64-mingw32/i386 (32-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. > library(atSNP) > library(BiocParallel) > library(testthat) > data(example) > > trans_mat <- matrix(rep(snpInfo$prior, each = 4), nrow = 4) > test_pwm <- motif_library$SIX5_disc1 > scores <- as.matrix(motif_scores$motif.scores[3:4, 4:5]) > score_diff <- abs(scores[,2]-scores[,1]) > > test_score <- test_pwm > for(i in seq(nrow(test_score))) { + for(j in seq(ncol(test_score))) { + test_score[i, j] <- exp(mean(log(test_pwm[i, j] / test_pwm[i, -j]))) + } + } > > adj_mat <- test_pwm + rowMeans(test_pwm) > motif_len <- nrow(test_pwm) > > ## these are functions for this test only > drawonesample <- function(theta) { + prob_start <- sapply(seq(motif_len), + function(j) + sum(snpInfo$prior * test_score[motif_len + 1 - j, ] ^ theta * + adj_mat[motif_len + 1 - j, ]) / + sum(snpInfo$prior * adj_mat[motif_len + 1 - j, ]) + ) + id <- sample(seq(motif_len), 1, prob = prob_start) + sample <- sample(1:4, 2 * motif_len - 1, replace = TRUE, prob = snpInfo$prior) + delta <- adj_mat + delta[motif_len + 1 - id, ] <- delta[motif_len + 1 - id, ] * test_score[motif_len + 1 - id, ] ^ theta + sample[id - 1 + seq(motif_len)] <- apply(delta, 1, function(x) + sample(seq(4), 1, prob = x * snpInfo$prior)) + sc <- 0 + for(s in seq(motif_len)) { + delta <- adj_mat + delta[motif_len + 1 - s, ] <- delta[motif_len + 1 - s, ] * test_score[motif_len + 1 - s, ] ^ theta + sc <- sc + prod(delta[cbind(seq(motif_len), sample[s - 1 + seq(motif_len)])]) + } + sample <- c(sample, id, sc) + return(sample) + } > jointprob <- function(x) prod(test_pwm[cbind(seq(motif_len), x)]) > maxjointprob <- function(x) { + maxp <- -Inf + p <- -Inf + for(i in 1:motif_len) { + p <- jointprob(x[i:(i+motif_len - 1)]) + if(p > maxp) + maxp <- p + } + for(i in 1:motif_len) { + p <- jointprob(5 - x[(i+motif_len - 1):i]) + if(p > maxp) + maxp <- p + } + return(maxp) + } > get_freq <- function(sample) { + emp_freq <- matrix(0, nrow = 2 * motif_len - 1, ncol = 4) + for(i in seq(2 * motif_len - 1)) { + for(j in seq(4)) { + emp_freq[i, j] <- sum(sample[i, ] == j - 1) + } + } + emp_freq <- emp_freq / rowSums(emp_freq) + return(emp_freq) + } > > test_that("Error: quantile function computing are not equivalent.", { + for(p in c(0.01, 0.1, 0.5, 0.9, 0.99)) { + delta <- .Call("test_find_percentile_diff", score_diff, p, package = "atSNP") + delta.r <- as.double(sort(abs(scores[,2]-scores[,1]))[ceiling((1 - p) * (nrow(scores)))]) + expect_equal(delta, delta.r) + } + }) > > test_that("Error: the scores for samples are not equivalent.", { + p <- 0.1 + delta <- .Call("test_find_percentile_diff", score_diff, p, package = "atSNP") + theta <- .Call("test_find_theta_diff", test_score, adj_mat, snpInfo$prior, snpInfo$transition, delta, package = "atSNP") + ## Use R code to generate a random sample + for(i in seq(10)) { + sample <- drawonesample(theta) + sample_score <- .Call("test_compute_sample_score_diff", test_pwm, test_score, adj_mat, sample[seq(2 * motif_len - 1)] - 1, sample[2 * motif_len] - 1, theta, package = "atSNP") + expect_equal(sample[2 * motif_len + 1], sample_score[1]) + sample1 <- sample2 <- sample3 <- sample + sample1[motif_len] <- seq(4)[-sample[motif_len]][1] + sample2[motif_len] <- seq(4)[-sample[motif_len]][2] + sample3[motif_len] <- seq(4)[-sample[motif_len]][3] + sample_score_r <- log(maxjointprob(sample[seq(2 * motif_len - 1)])) - + log(c(maxjointprob(sample1[seq(2 * motif_len - 1)]), + maxjointprob(sample2[seq(2 * motif_len - 1)]), + maxjointprob(sample3[seq(2 * motif_len - 1)]))) + expect_equal(sample_score_r, sample_score[-1]) + } + + ## Use C code to generate a random sample + delta <- matrix(1, nrow = 4 * motif_len, ncol = 2 * motif_len - 1) + for(pos in seq(motif_len)) { + for(j in (pos + motif_len - 1) : 1) { + if(j < pos + motif_len - 1) { + delta[4 * (pos - 1) + seq(4), j] <- sum(snpInfo$prior * delta[4 * (pos - 1) + seq(4), j + 1]) + } + if(j >= pos) { + delta[4 * (pos - 1) + seq(4), j] <- delta[4 * (pos - 1) + seq(4), j] * adj_mat[j - pos + 1, ] + } + if(j == motif_len) { + delta[4 * (pos - 1) + seq(4), j] <- delta[4 * (pos - 1) + seq(4), j] * test_score[j - pos + 1, ] ^ theta + } + } + } + for(i in seq(10)) { + sample <- .Call("test_importance_sample_diff", delta, snpInfo$prior, trans_mat, test_pwm, theta, package = "atSNP") + start_pos <- sample[2 * motif_len] + 1 + adj_score <- 0 + for(s in seq_len(motif_len)) { + adj_s <- sum(log(adj_mat[cbind(seq(motif_len), sample[s - 1 + seq(motif_len)] + 1)])) + adj_s <- adj_s + theta * log(test_score[motif_len + 1 - s, sample[motif_len] + 1]) + adj_score <- adj_score + exp(adj_s) + } + sample_score <- .Call("test_compute_sample_score_diff", test_pwm, test_score, adj_mat, sample[seq(2 * motif_len - 1)], sample[2 * motif_len], theta, package = "atSNP") + expect_equal(adj_score, sample_score[1]) + } + }) > > test_that("Error: compute the normalizing constant.", { + + ## parameters + p <- 0.1 + delta <- .Call("test_find_percentile_diff", score_diff, p, package = "atSNP") + theta <- .Call("test_find_theta_diff", test_score, adj_mat, snpInfo$prior, snpInfo$transition, delta, package = "atSNP") + + ## + const <- .Call("test_func_delta_diff", test_score, adj_mat, snpInfo$prior, trans_mat, theta, package = "atSNP") + + prob_start <- sapply(seq(motif_len), + function(j) + sum(snpInfo$prior * test_score[motif_len + 1 - j, ] ^ theta * + adj_mat[motif_len + 1 - j, ]) / + sum(snpInfo$prior * adj_mat[motif_len + 1 - j, ]) + ) + + const.r <- prod(colSums(snpInfo$prior * t(adj_mat))) * sum(prob_start) + expect_equal(const, const.r) + }) > > test_that("Error: sample distributions are not expected.", { + + ## parameters + p <- 0.1 + delta <- .Call("test_find_percentile_diff", score_diff, p, package = "atSNP") + theta <- .Call("test_find_theta_diff", test_score, adj_mat, snpInfo$prior, snpInfo$transition, delta, package = "atSNP") + + ## construct the delta matrix + delta <- matrix(1, nrow = 4 * motif_len, ncol = 2 * motif_len - 1) + for(pos in seq(motif_len)) { + for(j in (pos + motif_len - 1) : 1) { + if(j < pos + motif_len - 1) { + delta[4 * (pos - 1) + seq(4), j] <- sum(snpInfo$prior * delta[4 * (pos - 1) + seq(4), j + 1]) + } + if(j >= pos) { + delta[4 * (pos - 1) + seq(4), j] <- delta[4 * (pos - 1) + seq(4), j] * adj_mat[j - pos + 1, ] + } + if(j == motif_len) { + delta[4 * (pos - 1) + seq(4), j] <- delta[4 * (pos - 1) + seq(4), j] * test_score[j - pos + 1, ] ^ theta + } + } + } + + target_freq <- matrix(0, nrow = 4, ncol = 2 * motif_len - 1) + + mat <- snpInfo$prior * matrix(delta[, 1], nrow = 4) + wei <- colSums(mat) + for(j in seq(2 * motif_len - 1)) { + for(pos in seq(motif_len)) { + tmp <- delta[seq(4) + 4 * (pos - 1), j] * snpInfo$prior + target_freq[, j] <- target_freq[, j] + tmp / sum(tmp) * wei[pos] + } + } + target_freq <- t(target_freq) + target_freq <- target_freq / rowSums(target_freq) + + results_i <- function(i) { + ## generate 100 samples + sample1 <- sapply(seq(100), function(x) + .Call("test_importance_sample_diff", + delta, snpInfo$prior, trans_mat, test_score, theta, package = "atSNP")) + emp_freq1 <- get_freq(sample1) + + sample2 <- sapply(rep(theta, 100), drawonesample) + emp_freq2 <- get_freq(sample2 - 1) + + ## print(rbind(emp_freq1[10, ], emp_freq2[10, ], target_freq[10, ])) + max(abs(emp_freq1 - target_freq)) > max(abs(emp_freq2 - target_freq)) + } + + if(Sys.info()[["sysname"]] == "Windows"){ + snow <- SnowParam(workers = 1, type = "SOCK") + results<-bpmapply(results_i, seq(20), BPPARAM = snow,SIMPLIFY = FALSE) + }else{ + results<-bpmapply(results_i, seq(20), BPPARAM = MulticoreParam(workers = 1), + SIMPLIFY = FALSE) + } + + print(sum(unlist(results))) + + print(pbinom(sum(unlist(results)), size = 20, prob = 0.5)) + + }) [1] 10 [1] 0.5880985 > > test_that("Error: the chosen pvalues should have the smaller variance.", { + + .structure_diff <- function(pval_mat) { + id <- apply(pval_mat[, c(2, 4)], 1, which.min) + return(cbind(pval_mat[, c(1, 3)][cbind(seq_along(id), id)], + pval_mat[, c(2, 4)][cbind(seq_along(id), id)])) + } + + for(p in c(0.05, 0.1, 0.2, 0.5)) { + p_values <- .Call("test_p_value_diff", test_pwm, test_score, adj_mat, snpInfo$prior, snpInfo$transition, score_diff, quantile(score_diff, 1 - p), 100, package = "atSNP") + p_values_s <- .structure_diff(p_values) + expect_equal(p_values_s[, 2], apply(p_values[, c(2, 4)], 1, min)) + } + }) > > proc.time() user system elapsed 12.20 0.92 13.10	atSNP.Rcheck/tests_x64/test_diff.Rout R version 3.6.1 (2019-07-05) -- "Action of the Toes" Copyright (C) 2019 The R Foundation for Statistical Computing Platform: x86_64-w64-mingw32/x64 (64-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. > library(atSNP) > library(BiocParallel) > library(testthat) > data(example) > > trans_mat <- matrix(rep(snpInfo$prior, each = 4), nrow = 4) > test_pwm <- motif_library$SIX5_disc1 > scores <- as.matrix(motif_scores$motif.scores[3:4, 4:5]) > score_diff <- abs(scores[,2]-scores[,1]) > > test_score <- test_pwm > for(i in seq(nrow(test_score))) { + for(j in seq(ncol(test_score))) { + test_score[i, j] <- exp(mean(log(test_pwm[i, j] / test_pwm[i, -j]))) + } + } > > adj_mat <- test_pwm + rowMeans(test_pwm) > motif_len <- nrow(test_pwm) > > ## these are functions for this test only > drawonesample <- function(theta) { + prob_start <- sapply(seq(motif_len), + function(j) + sum(snpInfo$prior * test_score[motif_len + 1 - j, ] ^ theta * + adj_mat[motif_len + 1 - j, ]) / + sum(snpInfo$prior * adj_mat[motif_len + 1 - j, ]) + ) + id <- sample(seq(motif_len), 1, prob = prob_start) + sample <- sample(1:4, 2 * motif_len - 1, replace = TRUE, prob = snpInfo$prior) + delta <- adj_mat + delta[motif_len + 1 - id, ] <- delta[motif_len + 1 - id, ] * test_score[motif_len + 1 - id, ] ^ theta + sample[id - 1 + seq(motif_len)] <- apply(delta, 1, function(x) + sample(seq(4), 1, prob = x * snpInfo$prior)) + sc <- 0 + for(s in seq(motif_len)) { + delta <- adj_mat + delta[motif_len + 1 - s, ] <- delta[motif_len + 1 - s, ] * test_score[motif_len + 1 - s, ] ^ theta + sc <- sc + prod(delta[cbind(seq(motif_len), sample[s - 1 + seq(motif_len)])]) + } + sample <- c(sample, id, sc) + return(sample) + } > jointprob <- function(x) prod(test_pwm[cbind(seq(motif_len), x)]) > maxjointprob <- function(x) { + maxp <- -Inf + p <- -Inf + for(i in 1:motif_len) { + p <- jointprob(x[i:(i+motif_len - 1)]) + if(p > maxp) + maxp <- p + } + for(i in 1:motif_len) { + p <- jointprob(5 - x[(i+motif_len - 1):i]) + if(p > maxp) + maxp <- p + } + return(maxp) + } > get_freq <- function(sample) { + emp_freq <- matrix(0, nrow = 2 * motif_len - 1, ncol = 4) + for(i in seq(2 * motif_len - 1)) { + for(j in seq(4)) { + emp_freq[i, j] <- sum(sample[i, ] == j - 1) + } + } + emp_freq <- emp_freq / rowSums(emp_freq) + return(emp_freq) + } > > test_that("Error: quantile function computing are not equivalent.", { + for(p in c(0.01, 0.1, 0.5, 0.9, 0.99)) { + delta <- .Call("test_find_percentile_diff", score_diff, p, package = "atSNP") + delta.r <- as.double(sort(abs(scores[,2]-scores[,1]))[ceiling((1 - p) * (nrow(scores)))]) + expect_equal(delta, delta.r) + } + }) > > test_that("Error: the scores for samples are not equivalent.", { + p <- 0.1 + delta <- .Call("test_find_percentile_diff", score_diff, p, package = "atSNP") + theta <- .Call("test_find_theta_diff", test_score, adj_mat, snpInfo$prior, snpInfo$transition, delta, package = "atSNP") + ## Use R code to generate a random sample + for(i in seq(10)) { + sample <- drawonesample(theta) + sample_score <- .Call("test_compute_sample_score_diff", test_pwm, test_score, adj_mat, sample[seq(2 * motif_len - 1)] - 1, sample[2 * motif_len] - 1, theta, package = "atSNP") + expect_equal(sample[2 * motif_len + 1], sample_score[1]) + sample1 <- sample2 <- sample3 <- sample + sample1[motif_len] <- seq(4)[-sample[motif_len]][1] + sample2[motif_len] <- seq(4)[-sample[motif_len]][2] + sample3[motif_len] <- seq(4)[-sample[motif_len]][3] + sample_score_r <- log(maxjointprob(sample[seq(2 * motif_len - 1)])) - + log(c(maxjointprob(sample1[seq(2 * motif_len - 1)]), + maxjointprob(sample2[seq(2 * motif_len - 1)]), + maxjointprob(sample3[seq(2 * motif_len - 1)]))) + expect_equal(sample_score_r, sample_score[-1]) + } + + ## Use C code to generate a random sample + delta <- matrix(1, nrow = 4 * motif_len, ncol = 2 * motif_len - 1) + for(pos in seq(motif_len)) { + for(j in (pos + motif_len - 1) : 1) { + if(j < pos + motif_len - 1) { + delta[4 * (pos - 1) + seq(4), j] <- sum(snpInfo$prior * delta[4 * (pos - 1) + seq(4), j + 1]) + } + if(j >= pos) { + delta[4 * (pos - 1) + seq(4), j] <- delta[4 * (pos - 1) + seq(4), j] * adj_mat[j - pos + 1, ] + } + if(j == motif_len) { + delta[4 * (pos - 1) + seq(4), j] <- delta[4 * (pos - 1) + seq(4), j] * test_score[j - pos + 1, ] ^ theta + } + } + } + for(i in seq(10)) { + sample <- .Call("test_importance_sample_diff", delta, snpInfo$prior, trans_mat, test_pwm, theta, package = "atSNP") + start_pos <- sample[2 * motif_len] + 1 + adj_score <- 0 + for(s in seq_len(motif_len)) { + adj_s <- sum(log(adj_mat[cbind(seq(motif_len), sample[s - 1 + seq(motif_len)] + 1)])) + adj_s <- adj_s + theta * log(test_score[motif_len + 1 - s, sample[motif_len] + 1]) + adj_score <- adj_score + exp(adj_s) + } + sample_score <- .Call("test_compute_sample_score_diff", test_pwm, test_score, adj_mat, sample[seq(2 * motif_len - 1)], sample[2 * motif_len], theta, package = "atSNP") + expect_equal(adj_score, sample_score[1]) + } + }) > > test_that("Error: compute the normalizing constant.", { + + ## parameters + p <- 0.1 + delta <- .Call("test_find_percentile_diff", score_diff, p, package = "atSNP") + theta <- .Call("test_find_theta_diff", test_score, adj_mat, snpInfo$prior, snpInfo$transition, delta, package = "atSNP") + + ## + const <- .Call("test_func_delta_diff", test_score, adj_mat, snpInfo$prior, trans_mat, theta, package = "atSNP") + + prob_start <- sapply(seq(motif_len), + function(j) + sum(snpInfo$prior * test_score[motif_len + 1 - j, ] ^ theta * + adj_mat[motif_len + 1 - j, ]) / + sum(snpInfo$prior * adj_mat[motif_len + 1 - j, ]) + ) + + const.r <- prod(colSums(snpInfo$prior * t(adj_mat))) * sum(prob_start) + expect_equal(const, const.r) + }) > > test_that("Error: sample distributions are not expected.", { + + ## parameters + p <- 0.1 + delta <- .Call("test_find_percentile_diff", score_diff, p, package = "atSNP") + theta <- .Call("test_find_theta_diff", test_score, adj_mat, snpInfo$prior, snpInfo$transition, delta, package = "atSNP") + + ## construct the delta matrix + delta <- matrix(1, nrow = 4 * motif_len, ncol = 2 * motif_len - 1) + for(pos in seq(motif_len)) { + for(j in (pos + motif_len - 1) : 1) { + if(j < pos + motif_len - 1) { + delta[4 * (pos - 1) + seq(4), j] <- sum(snpInfo$prior * delta[4 * (pos - 1) + seq(4), j + 1]) + } + if(j >= pos) { + delta[4 * (pos - 1) + seq(4), j] <- delta[4 * (pos - 1) + seq(4), j] * adj_mat[j - pos + 1, ] + } + if(j == motif_len) { + delta[4 * (pos - 1) + seq(4), j] <- delta[4 * (pos - 1) + seq(4), j] * test_score[j - pos + 1, ] ^ theta + } + } + } + + target_freq <- matrix(0, nrow = 4, ncol = 2 * motif_len - 1) + + mat <- snpInfo$prior * matrix(delta[, 1], nrow = 4) + wei <- colSums(mat) + for(j in seq(2 * motif_len - 1)) { + for(pos in seq(motif_len)) { + tmp <- delta[seq(4) + 4 * (pos - 1), j] * snpInfo$prior + target_freq[, j] <- target_freq[, j] + tmp / sum(tmp) * wei[pos] + } + } + target_freq <- t(target_freq) + target_freq <- target_freq / rowSums(target_freq) + + results_i <- function(i) { + ## generate 100 samples + sample1 <- sapply(seq(100), function(x) + .Call("test_importance_sample_diff", + delta, snpInfo$prior, trans_mat, test_score, theta, package = "atSNP")) + emp_freq1 <- get_freq(sample1) + + sample2 <- sapply(rep(theta, 100), drawonesample) + emp_freq2 <- get_freq(sample2 - 1) + + ## print(rbind(emp_freq1[10, ], emp_freq2[10, ], target_freq[10, ])) + max(abs(emp_freq1 - target_freq)) > max(abs(emp_freq2 - target_freq)) + } + + if(Sys.info()[["sysname"]] == "Windows"){ + snow <- SnowParam(workers = 1, type = "SOCK") + results<-bpmapply(results_i, seq(20), BPPARAM = snow,SIMPLIFY = FALSE) + }else{ + results<-bpmapply(results_i, seq(20), BPPARAM = MulticoreParam(workers = 1), + SIMPLIFY = FALSE) + } + + print(sum(unlist(results))) + + print(pbinom(sum(unlist(results)), size = 20, prob = 0.5)) + + }) [1] 8 [1] 0.2517223 > > test_that("Error: the chosen pvalues should have the smaller variance.", { + + .structure_diff <- function(pval_mat) { + id <- apply(pval_mat[, c(2, 4)], 1, which.min) + return(cbind(pval_mat[, c(1, 3)][cbind(seq_along(id), id)], + pval_mat[, c(2, 4)][cbind(seq_along(id), id)])) + } + + for(p in c(0.05, 0.1, 0.2, 0.5)) { + p_values <- .Call("test_p_value_diff", test_pwm, test_score, adj_mat, snpInfo$prior, snpInfo$transition, score_diff, quantile(score_diff, 1 - p), 100, package = "atSNP") + p_values_s <- .structure_diff(p_values) + expect_equal(p_values_s[, 2], apply(p_values[, c(2, 4)], 1, min)) + } + }) > > proc.time() user system elapsed 16.90 0.46 17.35
atSNP.Rcheck/tests_i386/test_is.Rout R version 3.6.1 (2019-07-05) -- "Action of the Toes" Copyright (C) 2019 The R Foundation for Statistical Computing Platform: i386-w64-mingw32/i386 (32-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. > library(atSNP) > library(BiocParallel) > library(testthat) > data(example) > > trans_mat <- matrix(rep(snpInfo$prior, each = 4), nrow = 4) > test_pwm <- motif_library$SIX5_disc1 > scores <- as.matrix(motif_scores$motif.scores[3:4, 4:5]) > > motif_len <- nrow(test_pwm) > > ## these are functions for this test only > drawonesample <- function(theta) { + delta <- snpInfo$prior * t(test_pwm ^ theta) + delta <- delta / rep(colSums(delta), each = 4) + sample <- sample(1:4, 2 * motif_len - 1, replace = TRUE, prob = snpInfo$prior) + id <- sample(seq(motif_len), 1) + sample[id : (id + motif_len - 1)] <- apply(delta, 2, function(x) sample(1:4, 1, prob = x)) + sc <- s_cond <- 0 + for(s in seq(motif_len)) { + sc <- sc + prod(test_pwm[cbind(seq(motif_len), + sample[s : (s + motif_len - 1)])]) ^ theta + } + s_cond <- prod(test_pwm[cbind(seq(motif_len), + sample[id : (id + motif_len - 1)])]) ^ theta + sample <- c(sample, id, sc, s_cond) + return(sample) + } > jointprob <- function(x) prod(test_pwm[cbind(seq(motif_len), x)]) > maxjointprob <- function(x) { + maxp <- -Inf + p <- -Inf + for(i in 1:motif_len) { + p <- jointprob(x[i:(i+motif_len - 1)]) + if(p > maxp) + maxp <- p + } + for(i in 1:motif_len) { + p <- jointprob(5 - x[(i+motif_len - 1):i]) + if(p > maxp) + maxp <- p + } + return(maxp) + } > get_freq <- function(sample) { + ids <- cbind( + rep(sample[motif_len * 2, ], each = motif_len) + seq(motif_len), + rep(seq(100), each = motif_len)) + sample_motif <- matrix(sample[ids], nrow = motif_len) + 1 + emp_freq <- matrix(0, nrow = motif_len, ncol = 4) + for(i in seq(motif_len)) { + for(j in seq(4)) { + emp_freq[i, j] <- sum(sample_motif[i, ] == j) + } + } + emp_freq <- emp_freq / rowSums(emp_freq) + return(emp_freq) + } > > test_that("Error: quantile function computing are not equivalent.", { + for(p in c(0.01, 0.1, 0.5, 0.9, 0.99)) { + delta <- .Call("test_find_percentile", c(scores), p, package = "atSNP") + delta.r <- -sort(-c(scores))[as.integer(p * length(scores)) + 1] + delta==delta.r + expect_equal(delta, delta.r) + } + }) > > test_that("Error: the scores for samples are not equivalent.", { + p <- 0.01 + delta <- .Call("test_find_percentile", scores, p, package = "atSNP") + theta <- .Call("test_find_theta", test_pwm, snpInfo$prior, snpInfo$transition, delta, package = "atSNP") + ## Use R code to generate a random sample + for(i in seq(10)) { + sample <- drawonesample(theta) + sample_score <- .Call("test_compute_sample_score", test_pwm, sample[seq(2 * motif_len - 1)] - 1, sample[motif_len * 2] - 1, theta, package = "atSNP") + expect_equal(sample[2 * motif_len + 1], sample_score[2]) + expect_equal(sample[2 * motif_len + 2], sample_score[3]) + } + ## Use C code to generate a random sample + for(i in seq(10)) { + delta <- t(test_pwm ^ theta) + delta <- cbind(matrix( + sum(snpInfo$prior * delta[, 1]), + nrow = 4, ncol = motif_len - 1), delta) + sample <- .Call("test_importance_sample", delta, snpInfo$prior, trans_mat, test_pwm, theta, package = "atSNP") + start_pos <- sample[motif_len * 2] + adj_score <- 0 + for(s in seq(motif_len) - 1) { + adj_score <- adj_score + prod(test_pwm[cbind(seq(motif_len), + sample[s + seq(motif_len)] + 1)]) ^ theta + } + adj_score_cond <- prod(test_pwm[cbind(seq(motif_len), sample[start_pos + seq(motif_len)] + 1)]) ^ theta + sample_score <- .Call("test_compute_sample_score", test_pwm, sample[seq(2 * motif_len - 1)], sample[motif_len * 2], theta, package = "atSNP") + expect_equal(adj_score, sample_score[2]) + expect_equal(adj_score_cond, sample_score[3]) + } + }) > > test_that("Error: compute the normalizing constant.", { + ## parameters + p <- 0.01 + delta <- .Call("test_find_percentile", scores, p, package = "atSNP") + theta <- .Call("test_find_theta", test_pwm, snpInfo$prior, snpInfo$transition, delta, package = "atSNP") + ## + const <- .Call("test_func_delta", test_pwm, snpInfo$prior, trans_mat, theta, package = "atSNP") + const.r <- prod(colSums(snpInfo$prior * t(test_pwm) ^ theta)) * motif_len + expect_equal(abs(const - const.r) / const < 1e-5, TRUE) + }) > > test_that("Error: sample distributions are not expected.", { + ## parameters + p <- 0.1 + delta <- .Call("test_find_percentile", scores, p, package = "atSNP") + theta <- .Call("test_find_theta", test_pwm, snpInfo$prior, trans_mat, delta, package = "atSNP") + delta <- t(test_pwm ^ theta) + delta <- cbind(matrix( + sum(snpInfo$prior * delta[, 1]), + nrow = 4, ncol = motif_len - 1), delta) + + results_i <- function(i) { + ## generate 100 samples + sample <- sapply(seq(100), function(x) + .Call("test_importance_sample", + delta, snpInfo$prior, trans_mat, test_pwm, theta, package = "atSNP")) + emp_freq1 <- get_freq(sample) + target_freq <- test_pwm ^ theta * snpInfo$prior + target_freq <- target_freq / rowSums(target_freq) + ## generate samples in R + sample <- sapply(rep(theta, 100), drawonesample) + emp_freq2 <- get_freq(sample[seq(2 * motif_len), ] - 1) + max(abs(emp_freq1 - target_freq)) > max(abs(emp_freq2 - target_freq)) + } + + if(Sys.info()[["sysname"]] == "Windows"){ + snow <- SnowParam(workers = 1, type = "SOCK") + results<-bpmapply(results_i, seq(20), BPPARAM = snow,SIMPLIFY = FALSE) + }else{ + results<-bpmapply(results_i, seq(20), BPPARAM = MulticoreParam(workers = 1), + SIMPLIFY = FALSE) + } + + print(sum(unlist(results))) + print(pbinom(sum(unlist(results)), size = 20, prob = 0.5)) + }) [1] 13 [1] 0.9423409 > > test_that("Error: the chosen pvalues should have the smaller variance.", { + .structure <- function(pval_mat) { + id1 <- apply(pval_mat[, c(2, 4)], 1, which.min) + return(cbind( + pval_mat[, c(1, 3)][cbind(seq_along(id1), id1)], + pval_mat[, c(2, 4)][cbind(seq_along(id1), id1)]) + ) + } + for(p in c(0.01, 0.05, 0.1)) { + theta <- .Call("test_find_theta", test_pwm, snpInfo$prior, trans_mat, quantile(c(scores), 1 - p), package = "atSNP") + p_values <- .Call("test_p_value", test_pwm, snpInfo$prior, snpInfo$transition, c(scores), theta, 100, package = "atSNP") + p_values_s <- .structure(p_values) + expect_equal(p_values_s[, 2], apply(p_values[, c(2, 4)], 1, min)) + } + }) > > proc.time() user system elapsed 10.68 0.84 11.51	atSNP.Rcheck/tests_x64/test_is.Rout R version 3.6.1 (2019-07-05) -- "Action of the Toes" Copyright (C) 2019 The R Foundation for Statistical Computing Platform: x86_64-w64-mingw32/x64 (64-bit) R is free software and comes with ABSOLUTELY NO WARRANTY. You are welcome to redistribute it under certain conditions. Type 'license()' or 'licence()' for distribution details. R is a collaborative project with many contributors. Type 'contributors()' for more information and 'citation()' on how to cite R or R packages in publications. Type 'demo()' for some demos, 'help()' for on-line help, or 'help.start()' for an HTML browser interface to help. Type 'q()' to quit R. > library(atSNP) > library(BiocParallel) > library(testthat) > data(example) > > trans_mat <- matrix(rep(snpInfo$prior, each = 4), nrow = 4) > test_pwm <- motif_library$SIX5_disc1 > scores <- as.matrix(motif_scores$motif.scores[3:4, 4:5]) > > motif_len <- nrow(test_pwm) > > ## these are functions for this test only > drawonesample <- function(theta) { + delta <- snpInfo$prior * t(test_pwm ^ theta) + delta <- delta / rep(colSums(delta), each = 4) + sample <- sample(1:4, 2 * motif_len - 1, replace = TRUE, prob = snpInfo$prior) + id <- sample(seq(motif_len), 1) + sample[id : (id + motif_len - 1)] <- apply(delta, 2, function(x) sample(1:4, 1, prob = x)) + sc <- s_cond <- 0 + for(s in seq(motif_len)) { + sc <- sc + prod(test_pwm[cbind(seq(motif_len), + sample[s : (s + motif_len - 1)])]) ^ theta + } + s_cond <- prod(test_pwm[cbind(seq(motif_len), + sample[id : (id + motif_len - 1)])]) ^ theta + sample <- c(sample, id, sc, s_cond) + return(sample) + } > jointprob <- function(x) prod(test_pwm[cbind(seq(motif_len), x)]) > maxjointprob <- function(x) { + maxp <- -Inf + p <- -Inf + for(i in 1:motif_len) { + p <- jointprob(x[i:(i+motif_len - 1)]) + if(p > maxp) + maxp <- p + } + for(i in 1:motif_len) { + p <- jointprob(5 - x[(i+motif_len - 1):i]) + if(p > maxp) + maxp <- p + } + return(maxp) + } > get_freq <- function(sample) { + ids <- cbind( + rep(sample[motif_len * 2, ], each = motif_len) + seq(motif_len), + rep(seq(100), each = motif_len)) + sample_motif <- matrix(sample[ids], nrow = motif_len) + 1 + emp_freq <- matrix(0, nrow = motif_len, ncol = 4) + for(i in seq(motif_len)) { + for(j in seq(4)) { + emp_freq[i, j] <- sum(sample_motif[i, ] == j) + } + } + emp_freq <- emp_freq / rowSums(emp_freq) + return(emp_freq) + } > > test_that("Error: quantile function computing are not equivalent.", { + for(p in c(0.01, 0.1, 0.5, 0.9, 0.99)) { + delta <- .Call("test_find_percentile", c(scores), p, package = "atSNP") + delta.r <- -sort(-c(scores))[as.integer(p * length(scores)) + 1] + delta==delta.r + expect_equal(delta, delta.r) + } + }) > > test_that("Error: the scores for samples are not equivalent.", { + p <- 0.01 + delta <- .Call("test_find_percentile", scores, p, package = "atSNP") + theta <- .Call("test_find_theta", test_pwm, snpInfo$prior, snpInfo$transition, delta, package = "atSNP") + ## Use R code to generate a random sample + for(i in seq(10)) { + sample <- drawonesample(theta) + sample_score <- .Call("test_compute_sample_score", test_pwm, sample[seq(2 * motif_len - 1)] - 1, sample[motif_len * 2] - 1, theta, package = "atSNP") + expect_equal(sample[2 * motif_len + 1], sample_score[2]) + expect_equal(sample[2 * motif_len + 2], sample_score[3]) + } + ## Use C code to generate a random sample + for(i in seq(10)) { + delta <- t(test_pwm ^ theta) + delta <- cbind(matrix( + sum(snpInfo$prior * delta[, 1]), + nrow = 4, ncol = motif_len - 1), delta) + sample <- .Call("test_importance_sample", delta, snpInfo$prior, trans_mat, test_pwm, theta, package = "atSNP") + start_pos <- sample[motif_len * 2] + adj_score <- 0 + for(s in seq(motif_len) - 1) { + adj_score <- adj_score + prod(test_pwm[cbind(seq(motif_len), + sample[s + seq(motif_len)] + 1)]) ^ theta + } + adj_score_cond <- prod(test_pwm[cbind(seq(motif_len), sample[start_pos + seq(motif_len)] + 1)]) ^ theta + sample_score <- .Call("test_compute_sample_score", test_pwm, sample[seq(2 * motif_len - 1)], sample[motif_len * 2], theta, package = "atSNP") + expect_equal(adj_score, sample_score[2]) + expect_equal(adj_score_cond, sample_score[3]) + } + }) > > test_that("Error: compute the normalizing constant.", { + ## parameters + p <- 0.01 + delta <- .Call("test_find_percentile", scores, p, package = "atSNP") + theta <- .Call("test_find_theta", test_pwm, snpInfo$prior, snpInfo$transition, delta, package = "atSNP") + ## + const <- .Call("test_func_delta", test_pwm, snpInfo$prior, trans_mat, theta, package = "atSNP") + const.r <- prod(colSums(snpInfo$prior * t(test_pwm) ^ theta)) * motif_len + expect_equal(abs(const - const.r) / const < 1e-5, TRUE) + }) > > test_that("Error: sample distributions are not expected.", { + ## parameters + p <- 0.1 + delta <- .Call("test_find_percentile", scores, p, package = "atSNP") + theta <- .Call("test_find_theta", test_pwm, snpInfo$prior, trans_mat, delta, package = "atSNP") + delta <- t(test_pwm ^ theta) + delta <- cbind(matrix( + sum(snpInfo$prior * delta[, 1]), + nrow = 4, ncol = motif_len - 1), delta) + + results_i <- function(i) { + ## generate 100 samples + sample <- sapply(seq(100), function(x) + .Call("test_importance_sample", + delta, snpInfo$prior, trans_mat, test_pwm, theta, package = "atSNP")) + emp_freq1 <- get_freq(sample) + target_freq <- test_pwm ^ theta * snpInfo$prior + target_freq <- target_freq / rowSums(target_freq) + ## generate samples in R + sample <- sapply(rep(theta, 100), drawonesample) + emp_freq2 <- get_freq(sample[seq(2 * motif_len), ] - 1) + max(abs(emp_freq1 - target_freq)) > max(abs(emp_freq2 - target_freq)) + } + + if(Sys.info()[["sysname"]] == "Windows"){ + snow <- SnowParam(workers = 1, type = "SOCK") + results<-bpmapply(results_i, seq(20), BPPARAM = snow,SIMPLIFY = FALSE) + }else{ + results<-bpmapply(results_i, seq(20), BPPARAM = MulticoreParam(workers = 1), + SIMPLIFY = FALSE) + } + + print(sum(unlist(results))) + print(pbinom(sum(unlist(results)), size = 20, prob = 0.5)) + }) [1] 8 [1] 0.2517223 > > test_that("Error: the chosen pvalues should have the smaller variance.", { + .structure <- function(pval_mat) { + id1 <- apply(pval_mat[, c(2, 4)], 1, which.min) + return(cbind( + pval_mat[, c(1, 3)][cbind(seq_along(id1), id1)], + pval_mat[, c(2, 4)][cbind(seq_along(id1), id1)]) + ) + } + for(p in c(0.01, 0.05, 0.1)) { + theta <- .Call("test_find_theta", test_pwm, snpInfo$prior, trans_mat, quantile(c(scores), 1 - p), package = "atSNP") + p_values <- .Call("test_p_value", test_pwm, snpInfo$prior, snpInfo$transition, c(scores), theta, 100, package = "atSNP") + p_values_s <- .structure(p_values) + expect_equal(p_values_s[, 2], apply(p_values[, c(2, 4)], 1, min)) + } + }) > > proc.time() user system elapsed 14.37 0.34 14.70

Example timings

atSNP.Rcheck/examples_i386/atSNP-Ex.timings

name	user	system	elapsed
ComputeMotifScore	0.72	0.01	16.71
ComputePValues	0.87	0.03	13.03
GetIUPACSequence	0	0	0
LoadFastaData	0.41	0.02	0.43
LoadMotifLibrary	0.31	0.15	0.51
LoadSNPData	0	0	0
MatchSubsequence	0.95	0.02	16.56
dtMotifMatch	0.72	0.08	14.15
plotMotifMatch	1.16	0.25	1.40

atSNP.Rcheck/examples_x64/atSNP-Ex.timings

name	user	system	elapsed
ComputeMotifScore	0.95	0.00	16.20
ComputePValues	0.96	0.01	14.19
GetIUPACSequence	0	0	0
LoadFastaData	0.65	0.02	0.67
LoadMotifLibrary	0.47	0.02	0.49
LoadSNPData	0	0	0
MatchSubsequence	1.17	0.00	13.57
dtMotifMatch	0.96	0.04	13.35
plotMotifMatch	1.08	0.03	1.11