Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance #43

Open
lo48576 opened this issue Sep 19, 2024 · 0 comments
Open

Improve performance #43

lo48576 opened this issue Sep 19, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@lo48576
Copy link
Owner

lo48576 commented Sep 19, 2024

There is a lot of room for improvement.

Forced validation which can fail only by a bug

For example, this part.

iri-string/src/normalize.rs

Lines 641 to 657 in f95d50a

#[cfg(feature = "alloc")]
impl<S: Spec> ToDedicatedString for Normalized<'_, RiStr<S>> {
type Target = RiString<S>;
fn try_to_dedicated_string(&self) -> Result<Self::Target, TryReserveError> {
let s = self.try_to_string()?;
Ok(TryFrom::try_from(s).expect("[validity] the normalization result must be a valid IRI"))
}
}
#[cfg(feature = "alloc")]
impl<S: Spec> From<Normalized<'_, RiStr<S>>> for RiString<S> {
#[inline]
fn from(v: Normalized<'_, RiStr<S>>) -> Self {
v.to_dedicated_string()
}
}

When the syntax of the normalization result is already known, the algorithm guarantees that the resulting string is conformant to the syntax. However, Some part of this crate uses general and public interface to convert string types (such as TryFrom::try_from instead of unsafe counterparts and debug_assert!).
This kind of "should fail only in case of bug" test should basically only be run in debug build if the performance is important.

Non-allocating algorithm even when alloc or std is available

For example, this part.

// None: No segments are written yet.
// Some(false): Something other than `/` is already written as the path.
// Some(true): Only a `/` is written as the path.
let mut only_a_slash_is_written = None;
let mut too_deep_area_may_have_dot_segments = true;
while !rest.is_empty() && too_deep_area_may_have_dot_segments {
/// The size of the queue to track the path segments.
///
/// This should be nonzero.
const QUEUE_SIZE: usize = 8;

let mut queue: [Option<&'_ str>; QUEUE_SIZE] = Default::default();
let mut level: usize = 0;
let mut first_segment_has_leading_slash = false;
// Find higher path segments.
let mut end = 0;
for seg in PathSegmentsIter::new(&rest) {
let kind = seg.kind(&rest);
match kind {
SegmentKind::Dot => {
too_deep_area_may_have_dot_segments = true;
}
SegmentKind::DotDot => {
level = level.saturating_sub(1);
too_deep_area_may_have_dot_segments = true;

In this algorithm, the extra string iteration is necessary if . or .. exists in the deep compontent of the path, because the constant-sized queue is used in order to support non-alloc usage.
However, most of the use case of this crate would be web-related and thus std would be available, so in that case more efficient implementation would be possible. (For example, pre-allocating extra buffer and modifying it inline.)

Other parts that uses core::fmt::Display for serialization could also be improved using the memory allocation.

@lo48576 lo48576 added the enhancement New feature or request label Sep 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

1 participant