Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

support for integer64 type in runXXX functions #134

Open
ethanbsmith opened this issue May 18, 2024 · 6 comments
Open

support for integer64 type in runXXX functions #134

ethanbsmith opened this issue May 18, 2024 · 6 comments
Labels
enhancement Enhancement to existing feature

Comments

@ethanbsmith
Copy link
Contributor

Description

runXXX functions produce incorrect results on integer64 input

Expected behavior

would be nice if runXXX functions either supported int64 or generated an "unsupported" error

Minimal, reproducible example

runSum(bit64::as.integer64(1:10), 3)

Session Info

> sessionInfo()
R version 4.4.0 (2024-04-24 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 22631)

Matrix products: default


locale:
[1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8    LC_MONETARY=English_United States.utf8 LC_NUMERIC=C                           LC_TIME=English_United States.utf8    

time zone: America/Denver
tzcode source: internal

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] kableExtra_1.4.0  jsonlite_1.8.8    readxl_1.4.3      xml2_1.3.6        curl_5.2.1        Rcpp_1.0.12       matrixStats_1.3.0 data.table_1.15.4 quantmod_0.4.26   TTR_0.24.4        doFuture_1.0.1    future_1.33.2     doParallel_1.0.17 iterators_1.0.14  foreach_1.5.2     xts_0.13.2.1     
[17] zoo_1.8-12  
@joshuaulrich joshuaulrich added the enhancement Enhancement to existing feature label Jun 17, 2024
@joshuaulrich
Copy link
Owner

joshuaulrich commented Jun 17, 2024

I think the easiest way to do this would be to add Suggests: bit64 to the DESCRIPTION and special case that class inside the functions.

Here's a helper function to test for integer64 objects and that bit64 is installed.

is.integer64 <- function(x)  
{
    if (inherits(x, "integer64")) {
        if (requireNamespace("bit64", quietly = TRUE)) {
            return(TRUE)
        } else {
            stop("install the 'bit64' package to use this function on integer64 objects")
        }
    } else {
        return(FALSE)
    }
}  

Then something like this for runSum():

if (is.integer64(x)) {
    top <- bit64::as.integer64(rep(0, n))
    csum <- cumsum(x)
    result <- csum - c(top, head(csum, -n))
    is.na(result) <- seq_len(n - 1 + NAs)
}

@ethanbsmith
Copy link
Contributor Author

ethanbsmith commented Jun 17, 2024

i vaguely remember reading somewhere that supporting the bit64 types just involved adding headers and linking , but no code changes, so wasnt sure. what was actually involved. thats the only reason i left that open.

for me, i ran into this because another package has loaded some data as int64 and it took me a while to track that down as the source of my issue. i fixed it by forcing the data to be loaded as a float. im not sure its worth adding the support unless someone else actually needs it. even an "int64 not currently supported." message would be pretty low priority, but i'd take that on if you want.

@joshuaulrich
Copy link
Owner

I don't see any packages that currently link to bit64, so there isn't a package with an example of how to handle integer64 objects in C using the bit64 C API.

data.table supports integer64, and they just check inherits(x, "integer64") and then cast to int64_t and then work with the cast object.

@ethanbsmith
Copy link
Contributor Author

ethanbsmith commented Jun 17, 2024

just looked at some internals. its a weird beast. im not sure there is value in supporting this in xts. im now more in the camp of an error message, at best

> mean(bit64::as.integer64(1:10))
#integer64
#[1] 5

> mean(1L:10L)
#[1] 5.5

thinking further, most TTR and runXX functions, other than runSum, would probably return a double anyway, so. im im even more suspect in adding any complexity here.

@joshuaulrich
Copy link
Owner

integer64 is weird because it's actually a REALSXP (R's internal double vector). So anything that doesn't know about integer64 treats it as a double, but the result will be a weird number because the raw bits of a 64-bit integer and the raw bits of a 64-bit double don't share any equivalence as base-10 numbers.

The result of mean(bit64::as.integer64(1:10)) is correct because the sum of integers 1:10 is 55 (also an integer), but integer division by 10 discards any remainder, so you get 5.

@ethanbsmith
Copy link
Contributor Author

ethanbsmith commented Jul 6, 2024

Findings:

  • integer64 doesnt seem to support matrix. since xts is based on matrix, xts cant really be used with integer64
    • bit64::is.integer64(matrix(bit64::as.integer64(1:9)))
  • The following function will solve a lot of scenarios quite simply, since it would get dispatched to by try.xts
    • as.xts.integer64 <- function(x, order.by, dateFormat = "POSIXct", frequency = NULL, ...) {stop("integer64 not supported")}
    • this would not handle explicit xts creation with integer64 input.

This leaves me here:

  • It probably makes sense to do something in xts instead of TTR
  • How far down the rabbit hole we should go with this?
    • what about a data.frame as input that has integer64 columns (which was my actual real world issue)
    • What about functions like diff.xts that dont use try.xts? xts::diff.xts(bit64::as.integer64(1:9))
  • might be simplest to just check if bit64 has been loaded and just issue a blanket warning when xts gets loaded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Enhancement to existing feature
Projects
None yet
Development

No branches or pull requests

2 participants