Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Excessive type instability in this package #117

Open
jakobnissen opened this issue Apr 27, 2021 · 4 comments
Open

Excessive type instability in this package #117

jakobnissen opened this issue Apr 27, 2021 · 4 comments
Labels
help wanted Extra attention is needed question Further information is requested RFC

Comments

@jakobnissen
Copy link

jakobnissen commented Apr 27, 2021

I encountered some severe type instabilities in code using this package. When attempting to fix the type instability, I discovered that it runs deep in this package.

For example:

  • PosixPath stores an NTuple of strings. When N is not known at compile time, this is much less efficient than storing a vector
  • join is completely type unstable in multiple ways: The splatting of the unknown-sized tuple, the fact that p can be any of multiple types, the fact that p can switch type halfway through
  • The Path function is fundamentally unstable due to how override works.
  • Splatting is used a lot on objects with a length unknown at compile time

To me at least, one of the major promises of this package is increased type safety: I use AbstractPath instead of String because it allows better static analysis, and catches more bugs. But this is hindered when the code is so opaque to the compiler.

It would be nice if this package became type stable.

@rofinn
Copy link
Owner

rofinn commented Apr 28, 2021

FWIW, it feels like you're conflating type safety and stability. The type instability concerns associated with using Tuple vs Vector{String} internally has been raised before and it was an intentional tradeoff. I went with tuples because:

  1. The immutability of tuples is a nice way to avoid bugs, as path types should be immutable. I'm also pretty sure StaticArrays would have the same problem as tuples in this context.
  2. Performance wasn't as much of a priority given that the latency of the underlying filesystem was more likely to be the bottleneck.

If you disagree with either of these assumptions, feel free to share specific examples or alternatives. Simply listing operations that aren't type stable isn't very helpful because it's hard to determine the overall impact relative to the convenience. Having looser constraints often improves code readability and usability, so without more context it's hard to prioritize changing that.

To me at least, one of the major promises of this package is increased type safety: I use AbstractPath instead of String because it allows better static analysis, and catches more bugs.

And it still usually does because even if the AbstractString components aren't type stable the wrapper path type should be known at compile time and you can dispatch on that API appropriately. The other intention of this package is to provide a common API for working with a variety of filesystems (AWSS3.S3Path).

It would be nice if this package became type stable.

Specific suggestions or PRs are always welcome :)

@jakobnissen
Copy link
Author

FWIW, it feels like you're conflating type safety and stability

Right, sorry, I was being imprecise. I use this package for type safety. I agree that for file operations, the performance impact of small type instabilities is usually negligible.

However, type instability does interfere with type safety tools like JET or Cthulhu. When the compiler can't infer what types you have before runtime, it can't make any promises before runtime, either.

So my issue is this: Given that type safety is the number one reason to use path types in the first place, it's a shame that type instabilities prevent the user from actually getting verifiable, analyzable type safety.

@rofinn rofinn added help wanted Extra attention is needed RFC question Further information is requested labels Nov 17, 2021
@rofinn
Copy link
Owner

rofinn commented Nov 18, 2021

When the compiler can't infer what types you have before runtime, it can't make any promises before runtime, either.

Hmm, I see. I'm not familiar with JET, but Cthulhu.jl as I understand it's intended more for debugging inference issues like invalidation, rather than enforcing any kind of type safety. If the instabilities are making it hard to debug the Cthulhu.jl output, my understanding is that you can actually run the code in question and it'll figure out the runtime types for you?
Without any specific recommendations or examples (preferably as separate issues) I'm not sure what action to take here.

@rofinn
Copy link
Owner

rofinn commented Nov 24, 2021

FWIW, there are some cases where we can do better about knowing types at compile time (e.g., default system path behaviour). Turns out the join function is a bit special snowflake because we need to mimic Base.Filesystem.

JuliaLang/julia#20912
#124

Basically, since Julia decided several years ago that joinpath("/my/root/path", "segment", "/another/absolute/path") should return "/another/absolute/path" we need to handle the case where the return type isn't determined by the type of the first argument. For example, joinpath(S3Path("s3://mybucket/prefix"), "foo", p"/bar") needs to return p"/bar" for backwards compatibility.

Again, I think improving type stability where possible is a good idea to help in reading output from Cthulhu.jl, but I don't think it should override easy-of-use/interop with Base.Filesystem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
help wanted Extra attention is needed question Further information is requested RFC
Projects
None yet
Development

No branches or pull requests

2 participants