logdensityof(::For, x) slower than it should be #243

Open
cscherrer opened this issue Oct 20, 2022 · 3 comments

@cscherrer (Collaborator)
Say we have

using MeasureTheory, BenchmarkTools

m = For(1:1000) do j
    Normal(μ = j)
end

x = rand(m)

Then compare

julia> @btime logdensityof($m, $x)
  3.114 μs (0 allocations: 0 bytes)
-1414.4176763297564

julia> @btime sum(j -> logdensityof(Normal(μ = j), $x[j]), 1:1000)
  684.154 ns (0 allocations: 0 bytes)
-1414.4176763297567

The For should be faster, because it knows about the common base measure. What's going on here?
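To spell out the intuition (a hedged sketch in plain Julia, with illustrative names rather than MeasureTheory internals): each Normal(μ = j) has log-density -(x - j)^2 / 2 relative to a shared base measure that carries the -log(2π)/2 normalization, so a For that knows the base is common can pay that constant once rather than once per component:

# Per-component evaluation folds the shared constant into every term:
per_component(x) = sum(j -> -(x[j] - j)^2 / 2 - log(2π) / 2, eachindex(x))

# Knowing the common base measure lets us hoist the constant out of the loop:
shared_base(x) = sum(j -> -(x[j] - j)^2 / 2, eachindex(x)) - length(x) * log(2π) / 2

# The two agree up to floating-point rounding:
# per_component(x) ≈ shared_base(x)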

Note that "slowness" is only relative to another approach in MeasureTheory. We're still well ahead of Distributions:

julia> @btime sum(j -> logdensityof(Dists.Normal(j, 1), $x[j]), 1:1000)
  7.166 μs (0 allocations: 0 bytes)
-1414.4176763297567
@jeremiahpslewis (Contributor)

Random observation… it seems weird that the second (faster) version is exactly equal in floating point to the Distributions.jl result, while the other one is off in the last digit. Also… can’t wait to have a project where I get to use this package again!

@cscherrer (Collaborator, Author)

Yeah, that is weird, but it doesn't seem too suspicious to me. Probably just floating-point addition not being associative, so different summation orders disagree in the last digit.

I think there are a few things going on...

First, here's the logdensity_def it's calling:

@inline function logdensity_def(
    d::For{T,F,I},
    x::AbstractVector{X};
) where {X,T,F,I<:Tuple{<:AbstractVector}}
    ind = only(d.inds)
    sum(eachindex(x)) do j
        # Look up the j-th index, build that component's measure with d.f,
        # and evaluate its log-density at x[j]
        i = getindex(ind, j)
        @inbounds logdensity_def(d.f(i), x[j])
    end
end

That's meant to be pretty generic, but it doesn't get to take advantage of regularities in ind.

With the current setup we get

julia> @btime logdensity_def($m, $x)
  1.073 μs (0 allocations: 0 bytes)
-1.6689849749288806e8

But we can specialize for UnitRanges. If we add this method

@inline function logdensity_def(
    d::For{T,F,I},
    x::AbstractVector{X};
) where {X,T,F,I<:Tuple{<:AbstractUnitRange}}
    ind = only(d.inds)
    # Constant offset aligning the indices of x with the unit range ind,
    # replacing the per-iteration getindex on ind
    Δj = firstindex(x) - first(ind)
    sum(ind) do j
        @inbounds logdensity_def(d.f(j), x[j + Δj])
    end
end
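
As an aside on the Δj offset (my reading, not from the package docs): x need not share indices with ind, e.g. ind could start at 5 while x is a plain 1-based vector, so the constant offset keeps the two axes aligned:

julia> ind = 5:8; x = [10.0, 20.0, 30.0, 40.0];

julia> Δj = firstindex(x) - first(ind)
-4

julia> [x[j + Δj] for j in ind] == x
true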

With this method in place, logdensity_def comes down to

julia> @btime logdensity_def($m, $x)
  674.781 ns (0 allocations: 0 bytes)
-1.6689849749288797e8

Second, the base measure of our For is traversing the data a second time, which it doesn't need to do (it just adds lots of zeros). I'll need to fix this.
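
To make the wasted pass concrete, here's a hedged sketch (plain Julia with illustrative names, not the actual package internals) of the shape the evaluation currently has:

# The total splits into the real work plus the base-measure recursion:
logdensity_total(x) = logdensity_varying(x) + logdensity_base(x)

# The varying part is the useful O(n) pass:
logdensity_varying(x) = sum(j -> -(x[j] - j)^2 / 2, eachindex(x))

# The base-measure part traverses x a second time even though every term
# it contributes is zero, so its result is known before the loop runs:
logdensity_base(x) = sum(_ -> 0.0, x)

The fix is to recognize this case and skip the second traversal, returning the constant directly.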


Finally, our logdensity_def for Normal could be faster. @oschulz suggested

# Fold the -log(2π)/2 normalization into logdensity_def itself, so the base
# measure becomes plain Lebesgue measure:
@inline logdensity_def(d::Normal{()}, x) = -muladd(x, x, log2π) / 2
@inline basemeasure(::Normal{()}) = LebesgueBase()

which does much better:

# Before:
julia> @btime sum(j -> logdensityof(Normal(), $x[j]), 1:1000)
  149.426 ns (0 allocations: 0 bytes)
-1425.5727928221847

# After:
julia> @btime sum(j -> logdensityof(Normal(), $x[j]), 1:1000)
  87.670 ns (0 allocations: 0 bytes)
-1425.5727928221847
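
For context on why muladd helps (general Julia behavior, not specific to this package): muladd(a, b, c) computes a*b + c while permitting the compiler to emit a fused multiply-add, one instruction with a single rounding, where the hardware supports it. A quick sanity check that the rewritten formula matches the plain one:

julia> x = 0.3;

julia> -muladd(x, x, log(2π)) / 2 ≈ -(x^2 + log(2π)) / 2
true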

With these changes (and avoiding the base measure issue above) we're down to

# MeasureTheory
julia> @btime sum(j -> logdensityof(Normal(μ = j), $x[j]), 1:1000)
  677.175 ns (0 allocations: 0 bytes)
-1.6689849749288797e8

# Distributions
julia> @btime sum(j -> logdensityof(Dists.Normal(j,1), $x[j]), 1:1000)
  7.121 μs (0 allocations: 0 bytes)
-1.6689849749288803e8

Over 10x faster!! Well, that's not entirely fair, since we specialize on the "no sigma" case. It really should be:

julia> @btime sum(j -> logdensityof(Normal(j, 1), $x[j]), 1:1000)
  681.201 ns (0 allocations: 0 bytes)
-1.6689849749288797e8

@cscherrer (Collaborator, Author)

With some new updates in dev:

julia> x = randn(1000) .+ (1:1000);

julia> # Distributions.jl
       @btime logdensityof(Dists.Product(Dists.Normal.(1:1000, 1)), $x)
  8.168 μs (1 allocation: 15.75 KiB)
-1400.7073038573099

julia> @btime sum(j -> logdensityof(Dists.Normal(j, 1), $x[j]), 1:1000)
  7.123 μs (0 allocations: 0 bytes)
-1400.7073038573099

julia> # MeasureTheory.jl
       @btime logdensityof(For(j -> Normal(j,1), 1:1000), x)
  697.123 ns (1 allocation: 16 bytes)
-1400.7073038573094

julia> @btime sum(j -> logdensityof(Normal(j, 1), $x[j]), 1:1000)
  674.877 ns (0 allocations: 0 bytes)
-1400.7073038573094
