From 3b6124171323c45f3701d326fbf8f0750f48f4ff Mon Sep 17 00:00:00 2001
From: Richard Gibson
Date: Tue, 3 Sep 2024 11:23:32 -0400
Subject: [PATCH 1/2] Editorial: Rename aliases in Canonicalize for better readability

---
 spec.html | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/spec.html b/spec.html
index e8fa3043cf..1e9baab074 100644
--- a/spec.html
+++ b/spec.html
@@ -37175,12 +37175,12 @@

  1. If _rer_.[[IgnoreCase]] is *false*, return _ch_.
  1. Assert: _ch_ is a UTF-16 code unit.
  1. Let _cp_ be the code point whose numeric value is the numeric value of _ch_.
- 1. Let _u_ be toUppercase(« _cp_ »), according to the Unicode Default Case Conversion algorithm.
- 1. Let _uStr_ be CodePointsToString(_u_).
- 1. If the length of _uStr_ ≠ 1, return _ch_.
- 1. Let _cu_ be _uStr_'s single code unit element.
- 1. If the numeric value of _ch_ ≥ 128 and the numeric value of _cu_ < 128, return _ch_.
- 1. Return _cu_.
+ 1. Let _mapped_ be toUppercase(« _cp_ »), according to the Unicode Default Case Conversion algorithm.
+ 1. Let _mappedStr_ be CodePointsToString(_mapped_).
+ 1. If the length of _mappedStr_ ≠ 1, return _ch_.
+ 1. Let _mappedCU_ be _mappedStr_'s single code unit element.
+ 1. If the numeric value of _ch_ ≥ 128 and the numeric value of _mappedCU_ < 128, return _ch_.
+ 1. Return _mappedCU_.

In case-insignificant matches when HasEitherUnicodeFlag(_rer_) is *true*, all characters are implicitly case-folded using the simple mapping provided by the Unicode Standard immediately before they are compared. The simple mapping always maps to a single code point, so it does not map, for example, `ß` (U+00DF LATIN SMALL LETTER SHARP S) to `ss` or `SS`. It may however map code points outside the Basic Latin block to code points within it—for example, `ſ` (U+017F LATIN SMALL LETTER LONG S) case-folds to `s` (U+0073 LATIN SMALL LETTER S) and `K` (U+212A KELVIN SIGN) case-folds to `k` (U+006B LATIN SMALL LETTER K). Strings containing those code points are matched by regular expressions such as `/[a-z]/ui`.
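
The behavior described in the note above can be checked directly in a JavaScript engine. The following is a minimal sketch, not part of the patch; the helper names `matchesWithUnicodeFlag` and `matchesWithoutUnicodeFlag` are hypothetical and exist only for illustration.

```javascript
// U+017F LATIN SMALL LETTER LONG S and U+212A KELVIN SIGN, as named in the note.
const longS = "\u017F";
const kelvinSign = "\u212A";
const sharpS = "\u00DF";

// Hypothetical helpers: the same character class with and without the u flag.
const matchesWithUnicodeFlag = (s) => /[a-z]/ui.test(s);
const matchesWithoutUnicodeFlag = (s) => /[a-z]/i.test(s);

// With u and i together, simple case folding maps both characters into a-z.
console.log(matchesWithUnicodeFlag(longS));      // true
console.log(matchesWithUnicodeFlag(kelvinSign)); // true

// With i alone, the uppercase-based mapping never turns a non-ASCII
// code unit into an ASCII one, so neither character matches.
console.log(matchesWithoutUnicodeFlag(longS));      // false
console.log(matchesWithoutUnicodeFlag(kelvinSign)); // false

// Simple case folding is one code point to one code point, so ß does not match "ss".
console.log(/^ss$/ui.test(sharpS)); // false
```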

From 21057df08ce4b02f14884725b50cc535792b3ace Mon Sep 17 00:00:00 2001
From: Richard Gibson
Date: Tue, 3 Sep 2024 11:54:58 -0400
Subject: [PATCH 2/2] Editorial: Remove an alias from Canonicalize

---
 spec.html | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)

diff --git a/spec.html b/spec.html
index 1e9baab074..02b91dfcc9 100644
--- a/spec.html
+++ b/spec.html
@@ -37176,11 +37176,10 @@

  1. Assert: _ch_ is a UTF-16 code unit.
  1. Let _cp_ be the code point whose numeric value is the numeric value of _ch_.
  1. Let _mapped_ be toUppercase(« _cp_ »), according to the Unicode Default Case Conversion algorithm.
- 1. Let _mappedStr_ be CodePointsToString(_mapped_).
- 1. If the length of _mappedStr_ ≠ 1, return _ch_.
- 1. Let _mappedCU_ be _mappedStr_'s single code unit element.
- 1. If the numeric value of _ch_ ≥ 128 and the numeric value of _mappedCU_ < 128, return _ch_.
- 1. Return _mappedCU_.
+ 1. If the number of elements in _mapped_ ≠ 1 or the length of CodePointsToString(_mapped_) ≠ 1, return _ch_.
+ 1. Let _mappedCP_ be the single code point element of _mapped_.
+ 1. If the numeric value of _ch_ ≥ 128 and _mappedCP_ < 128, return _ch_.
+ 1. Return the code unit whose numeric value is _mappedCP_.

In case-insignificant matches when HasEitherUnicodeFlag(_rer_) is *true*, all characters are implicitly case-folded using the simple mapping provided by the Unicode Standard immediately before they are compared. The simple mapping always maps to a single code point, so it does not map, for example, `ß` (U+00DF LATIN SMALL LETTER SHARP S) to `ss` or `SS`. It may however map code points outside the Basic Latin block to code points within it—for example, `ſ` (U+017F LATIN SMALL LETTER LONG S) case-folds to `s` (U+0073 LATIN SMALL LETTER S) and `K` (U+212A KELVIN SIGN) case-folds to `k` (U+006B LATIN SMALL LETTER K). Strings containing those code points are matched by regular expressions such as `/[a-z]/ui`.
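
For readers who want to trace the steps as they stand after PATCH 2/2, the non-Unicode branch of Canonicalize can be approximated in JavaScript. This is a rough sketch, not spec text. The function name `canonicalizeNonUnicode` and its parameters are hypothetical, and it assumes `String.prototype.toUpperCase` is an adequate stand-in for toUppercase under the Unicode Default Case Conversion algorithm.

```javascript
// Hypothetical model of the non-Unicode branch of Canonicalize after PATCH 2/2.
// `ch` is a one-code-unit string; `ignoreCase` models _rer_.[[IgnoreCase]].
function canonicalizeNonUnicode(ch, ignoreCase) {
  if (!ignoreCase) return ch;
  if (ch.length !== 1) {
    throw new RangeError("ch must be a single UTF-16 code unit");
  }

  // Assumption: toUpperCase stands in for toUppercase(« cp »).
  // Array.from splits the result into code points, modelling the list _mapped_.
  const mapped = Array.from(ch.toUpperCase());

  // Reject mappings to several code points (ß -> "SS") and mappings to a
  // code point that needs two code units.
  if (mapped.length !== 1 || mapped[0].length !== 1) return ch;

  const mappedCP = mapped[0].codePointAt(0);

  // Never map a non-ASCII code unit to an ASCII one.
  if (ch.charCodeAt(0) >= 128 && mappedCP < 128) return ch;

  // Return the code unit whose numeric value is mappedCP.
  return String.fromCharCode(mappedCP);
}

console.log(canonicalizeNonUnicode("a", true));      // "A"
console.log(canonicalizeNonUnicode("\u00DF", true)); // "ß", since the uppercase form is two code points
console.log(canonicalizeNonUnicode("\u017F", true)); // "ſ", since mapping to ASCII "S" is refused
```

Checking the element count of the mapped list before reading its single code point mirrors the combined condition that replaces the removed _mappedStr_ alias.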