A prime sieve #197

Open
wants to merge 1 commit into
base: develop

czurnieden
Contributor

The actual sieve from #190 plus the two functions mp_next_small_prime and mp_prec_small_prime

@czurnieden czurnieden force-pushed the bn_sieve branch 4 times, most recently from 85d0276 to c9c37f2 Compare April 8, 2019 02:21
# define LTM_SIEVE_PR_UINT PRIu32
# define LTM_SIEVE_UINT_MAX 0xFFFFFFFFlu
# define LTM_SIEVE_UINT_MAX_SQRT 0xFFFFlu
#endif
Member

Could you move those definitions to the private header with a MP_* prefix? Maybe just use size_t instead of LTM_SIEVE_UINT?

Contributor Author

> Could you move those definitions to the private header with a MP_* prefix

It already needs rebasing, so: no problem.
(Apropos rebasing: should I wait a day or two until your stuff gets merged, or is there much more to come?)

> Maybe just use size_t instead of LTM_SIEVE_UINT

Mmh…no, I need to know the exact sizes here. size_t most likely has the same width as LTM_SIEVE_UINT, but I cannot be completely sure, and I need to be.

Member

@czurnieden From me there is not much more to come for now. I would rather like to reduce the backlog a bit.

> Mmh…no, I need to know the exact sizes here and size_t is most likely LTM_SIEVE_UINT but I cannot be completely sure and I need to.

We already include limits.h?

Contributor Author

(Sorry for the delay, everybody seems to have pushed their urgent things to "after Easter" which didn't make it better *sigh*)

> Could you move those definitions to the private header with a MP_* prefix?

The MP_ prefix I can do. Was already planned when I watched you harmonizing the style.

Making (some of) the macros private, not so much.

The sieve exists in three sizes, one for 8-bit, one extra-large, and one for the whole rest.

#ifdef MP_8BIT
#   define LTM_SIEVE_BIGGEST_PRIME      65521lu
#   define LTM_SIEVE_UINT               uint16_t
#   define LTM_SIEVE_PR_UINT            PRIu16
#   define LTM_SIEVE_UINT_MAX           0xFFFFlu
#   define LTM_SIEVE_UINT_MAX_SQRT      0xFFlu
#elif ( (defined MP_64BIT) && (defined LTM_SIEVE_USE_LARGE_SIEVE) )
#   define LTM_SIEVE_BIGGEST_PRIME      18446744073709551557llu
#   define LTM_SIEVE_UINT               uint64_t
#   define LTM_SIEVE_PR_UINT            PRIu64
#   define LTM_SIEVE_UINT_MAX           0xFFFFFFFFFFFFFFFFllu
#   define LTM_SIEVE_UINT_MAX_SQRT      0xFFFFFFFFllu
#else
#   define LTM_SIEVE_BIGGEST_PRIME      4294967291lu
#   define LTM_SIEVE_UINT               uint32_t
#   define LTM_SIEVE_PR_UINT            PRIu32
#   define LTM_SIEVE_UINT_MAX           0xFFFFFFFFlu
#   define LTM_SIEVE_UINT_MAX_SQRT      0xFFFFlu
#endif
  • The macro LTM_SIEVE_UINT is used in the definition of the functions, a replacement is not easy:

> Maybe just use size_t instead of LTM_SIEVE_UINT?

According to the standard (ISO/IEC 9899:2011 7.20.3), the limit of size_t is SIZE_MAX, which has only a guaranteed minimum (65535), not a fixed value. As a replacement I would need to use a large type (mp_word would do it, I think), and that would make it slower for high-mp architectures and is also a waste of memory.

I could also do something in the line of typedef TYPE mp_small_prime which looks a bit more elegant.

  • LTM_SIEVE_BIGGEST_PRIME can be set private, although it comes quite handy in loops.
  • LTM_SIEVE_PR_UINT can be set private, although it would be a bit of a work for the user to find out the correct way to print the values.
  • LTM_SIEVE_UINT_MAX can be set private. It is convenient but LTM_SIEVE_BIGGEST_PRIME should suffice.
  • LTM_SIEVE_UINT_MAX_SQRT can be set private, but it is the size of the base, which might be useful to know for low-memory architectures. On the other hand, it is not directly the size; it needs a bit of work to get the actual size from this information. This is the macro I would be able to kick out with the clearest conscience.
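
The typedef idea mentioned above could look roughly like the sketch below. It is only an illustration of the default 32-bit case: the names mp_small_prime, MP_SMALL_PRIME_PR, MP_SMALL_PRIME_MAX and format_small_prime are assumptions made up for this example and are not part of the PR.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names, default (32-bit) variant only. */
typedef uint32_t mp_small_prime;
#define MP_SMALL_PRIME_PR   PRIu32                /* printf conversion matching the type */
#define MP_SMALL_PRIME_MAX  UINT32_C(4294967291)  /* largest prime below 2^32 */

/* Format a sieve value without the caller having to know its width.
   Returns what snprintf returns: the number of characters needed. */
static int format_small_prime(char *buf, size_t len, mp_small_prime p)
{
   assert(buf != NULL && len > 0u);
   return snprintf(buf, len, "%" MP_SMALL_PRIME_PR, p);
}
```

With a fixed-width typedef the exact size is known at compile time, which size_t cannot guarantee, and the user still gets a matching format macro for printing.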

Tell me what you think while I run the rebase and the replace LTM_SIEVE_ MP_SIEVE_ -- *.[ch]

@czurnieden czurnieden force-pushed the bn_sieve branch 4 times, most recently from ad19a9a to 880fdf0 Compare May 19, 2019 20:13
@czurnieden czurnieden force-pushed the bn_sieve branch 2 times, most recently from 6659295 to 8a82665 Compare May 22, 2019 16:26
@czurnieden
Contributor Author

czurnieden commented May 22, 2019

@minad rebased and tried to adapt to the new API. I'm pretty sure that some slipped through, please check if I missed some.

Oh, and this is not really a "work in progress". I don't plan to add anything in this PR (re. logic parts and all that); this one is complete.
Being complete doesn't mean that it is fully polished, of course, or free of errors ;-)

@czurnieden czurnieden force-pushed the bn_sieve branch 5 times, most recently from fdb02b0 to 52402ab Compare June 3, 2019 13:57
@minad
Member

minad commented Feb 20, 2020

Close, see #160 for the reason

@minad minad closed this Feb 20, 2020
@czurnieden
Contributor Author

Rebased to the current (see date) develop branch.

Did a bit of a cleanup (no featuritis anymore, at least not that much) and added some documentation.
The base sieve has a size of 4096 bytes (that is fixed) as do the segments, but that is not fixed and can go down to 670 bits (largest prime-gap < 2^32 is 335). Default is 4096 bits, 512 bytes, which gives a good random access time (about 100 microsec on my machine) and is not bad sequentially, especially in the base sieve. Warm-up time of the base sieve is about 50 microseconds.
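
The byte counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes that the base sieve stores one bit per odd number below 2^16; the helper name odd_only_sieve_bytes is made up for illustration and does not appear in the PR.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for a bitmap holding one bit per odd number in a range of
   `span` consecutive integers (hypothetical helper, not from the PR). */
static uint32_t odd_only_sieve_bytes(uint32_t span)
{
   uint32_t bits;
   assert(span > 0u && span < UINT32_MAX);
   bits = (span + 1u) / 2u;   /* at most ceil(span / 2) odd numbers */
   return (bits + 7u) / 8u;   /* round up to whole bytes */
}
```

For span = 65536 (all numbers below 2^16) this gives 32768 bits, that is 4096 bytes, which matches the fixed base sieve size. The default segment of 4096 bits is simply 4096 / 8 = 512 bytes.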

Binary size (stripped) is a combined 6k, the stripped s_mp_prime_tab.o has 2.8k (all compiled for 64-bit) so about double the size.

This sieve is useful for adding a large-ish random prime to the Miller-Rabin test. It is quite simple (although computationally intense) to generate strong pseudoprimes in cryptographically relevant sizes if the tests use small primes only. In our case that is two and three in the first run and one random large one (which could be composite with very small factors). Just checking the whole table in s_mp_prime_tab does not do it; there are known pseudoprimes that pass all 255 tests.

It would be better to combine the 2,3 test with a random large-ish prime from the sieve. We could generate that small prime with the deterministic version of MR, but that costs; the little space and negligible runtime needed to do it with this sieve would pay off.
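
To illustrate why the bases 2 and 3 alone are not enough, here is a minimal strong-probable-prime test for 32-bit inputs. It is a sketch under stated assumptions: it uses native 64-bit arithmetic rather than LTM's bigint-based test, and the function names powmod32 and is_strong_probable_prime are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* (base^exp) mod m for m < 2^32; the 64-bit products cannot overflow. */
static uint32_t powmod32(uint32_t base, uint32_t exp, uint32_t m)
{
   uint64_t result = 1u % m, b = base % m;
   while (exp > 0u) {
      if (exp & 1u) result = (result * b) % m;
      b = (b * b) % m;
      exp >>= 1;
   }
   return (uint32_t)result;
}

/* Returns 1 if the odd number n > 2 is a strong probable prime to base a. */
static int is_strong_probable_prime(uint32_t n, uint32_t a)
{
   uint32_t d = n - 1u, x;
   int s = 0, i;
   assert(n > 2u && (n & 1u) == 1u);
   while ((d & 1u) == 0u) { d >>= 1; s++; }   /* n - 1 = d * 2^s, d odd */
   x = powmod32(a, d, n);
   if (x == 1u || x == n - 1u) return 1;
   for (i = 1; i < s; i++) {
      x = (uint32_t)(((uint64_t)x * x) % n);
      if (x == n - 1u) return 1;
      if (x == 1u) return 0;                  /* nontrivial square root of 1 */
   }
   return 0;
}
```

The composite 1373653 = 829 * 1657 passes this test for both bases 2 and 3, but it is rejected for base 5, which is the kind of gap an additional base closes.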

@czurnieden czurnieden reopened this Mar 13, 2023
@czurnieden czurnieden force-pushed the bn_sieve branch 2 times, most recently from e7a3b41 to 8ae39ca Compare March 14, 2023 05:11
@czurnieden
Contributor Author

I don't know what is going on with the VS tests, maybe a caching problem?
9dabf48 complained about a comparison it had never complained about before.

18beae6 tried some changes, same error, same line number.

b21bef4 tested my guess by changing the line of the error. VS threw the same error with the same line number.

@czurnieden
Contributor Author

So, VS, it was just coincidence that the numbers 1630 and later 1634 could be taken as line numbers because they were inside newly added code?
Yes, it was a problem with signedness, but that kind of error message is not very helpful!

@czurnieden czurnieden added this to the v2.0.0 milestone Mar 16, 2023
@czurnieden
Contributor Author

It is just the sieve itself, optimized for quick random access for the primes up to 2^32 - 5. Sequential access above the prime 2^16 - 15 is rather slow but not unusable.

The time to add all primes up to 2^32 is 2:40 min with the default segment-size and 0:55 min with a slightly larger segment size.

Warm-up time (building the base sieve) is about 60 microseconds. Random access in the base sieve is <1 microsecond; random access in a segment is about 100 microseconds including the building of the segment, but that timing depends on the size of the segment. See the documentation in bn.tex for more details.

Fun fact: Pari/gp s = 0;forprime(n=0, 2^32, s+=n);s needs 1:24 min on the same machine.
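
The layout described above (an odd-only base sieve below 2^16 plus small segments sieved with the base primes) can be sketched as follows. This is a simplified illustration, not the PR's code: all names are made up, it uses a native 64-bit accumulator, and it walks every candidate p for each segment instead of keeping a list of base primes, so it is far slower than the timings quoted above.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define BASE_LIMIT 65536u   /* base sieve covers [0, 2^16) */
#define SEG_BITS   4096u    /* numbers per segment (one bit each) */

/* 4096 bytes: one bit per odd number, bit set = composite. */
static uint8_t base_sieve[BASE_LIMIT / 16u];
static int base_ready = 0;

static int base_is_composite(uint32_t odd)
{
   return (int)((base_sieve[odd >> 4] >> ((odd >> 1) & 7u)) & 1u);
}

static void base_mark(uint32_t odd)
{
   base_sieve[odd >> 4] |= (uint8_t)(1u << ((odd >> 1) & 7u));
}

static void build_base(void)
{
   uint32_t i, j;
   memset(base_sieve, 0, sizeof(base_sieve));
   base_mark(1u);                               /* 1 is not prime */
   for (i = 3u; i * i < BASE_LIMIT; i += 2u) {
      if (!base_is_composite(i)) {
         for (j = i * i; j < BASE_LIMIT; j += 2u * i) base_mark(j);
      }
   }
   base_ready = 1;
}

/* Primality of n < 2^16 by lookup in the base sieve. */
static int base_is_prime(uint32_t n)
{
   assert(base_ready && n < BASE_LIMIT);
   if (n == 2u) return 1;
   if (n < 2u || (n & 1u) == 0u) return 0;
   return !base_is_composite(n);
}

/* Sum of all primes <= limit: base sieve first, then segment by segment. */
static uint64_t sum_primes_upto(uint32_t limit)
{
   uint8_t seg[SEG_BITS / 8u];
   uint64_t sum = 0u, lo, hi, p, m, n;
   if (!base_ready) build_base();
   for (n = 0u; n < BASE_LIMIT && n <= limit; n++) {
      if (base_is_prime((uint32_t)n)) sum += n;
   }
   for (lo = BASE_LIMIT; lo <= limit; lo += SEG_BITS) {
      hi = lo + SEG_BITS - 1u;                  /* inclusive end of segment */
      if (hi > limit) hi = limit;
      memset(seg, 0, sizeof(seg));
      for (p = 2u; p * p <= hi; p++) {          /* base primes up to sqrt(hi) */
         if (!base_is_prime((uint32_t)p)) continue;
         m = ((lo + p - 1u) / p) * p;           /* first multiple of p >= lo */
         for (; m <= hi; m += p) {
            seg[(m - lo) >> 3] |= (uint8_t)(1u << ((m - lo) & 7u));
         }
      }
      for (n = lo; n <= hi; n++) {
         if (((seg[(n - lo) >> 3] >> ((n - lo) & 7u)) & 1u) == 0u) sum += n;
      }
   }
   return sum;
}
```

Because every p with p * p <= hi is below 2^16, the base sieve alone is enough to sieve any segment below 2^32.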

@MasterDuke17
Collaborator

MasterDuke17 commented Mar 17, 2023 via email

@czurnieden
Contributor Author

@MasterDuke17 It is in the base sieve but it is not in the segments.

It is rarely as simple in real life as it is in the papers ;-)

We support 16-bit architectures. That means that the largest type we can use is a 32-bit integer (which is already a bigint on 16-bit archs), so we have to avoid anything larger than that. Forišek's implementation of the Miller-Rabin test needs 64-bit variables. That can be avoided, but at a cost in runtime. Whether it would still be faster is a good question. And we already have a Miller-Rabin test. Do we need another?

But it is much smaller in code, admittedly, even with such a large table.
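
As a rough sketch of what avoiding 64-bit variables costs: a modular multiplication of two 32-bit operands can be done with 32-bit arithmetic only, by binary double-and-add. The function names addmod32 and mulmod32 are assumptions for this example and come neither from LTM nor from Forišek's code.

```c
#include <assert.h>
#include <stdint.h>

/* (a + b) mod m for a, b < m, without overflowing 32 bits. */
static uint32_t addmod32(uint32_t a, uint32_t b, uint32_t m)
{
   /* a + b >= m  <=>  a >= m - b; the subtraction form cannot wrap. */
   return (a >= m - b) ? a - (m - b) : a + b;
}

/* (a * b) mod m by binary double-and-add, at most 32 loop iterations. */
static uint32_t mulmod32(uint32_t a, uint32_t b, uint32_t m)
{
   uint32_t result = 0u;
   assert(m > 0u);
   a %= m;
   b %= m;
   while (b > 0u) {
      if (b & 1u) result = addmod32(result, a, m);
      a = addmod32(a, a, m);   /* double the addend for the next bit */
      b >>= 1;
   }
   return result;
}
```

Each multiplication turns into up to 32 rounds of compare-and-add, which is where the runtime cost mentioned above comes from.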

My code has some possibilities to optimize, I was just happy that I got VS to finally accept it ;-)

  • the segments are normal sieves, they should be restricted to odd numbers, too. Less memory used and speedier.
  • extend the wheel from 2 to 210 = 2*3*5*7 or maybe even higher: more speed (to an extent) but also more code
  • get rid of the segments altogether and compute the primes directly (complicated but prob. the fastest)
  • minimize the segments. The largest prime-gap < 2^32 is 335. If we only keep the odd numbers, we need 21 bytes, that is, six 32-bit integers, to find the next prime without recomputing that little segment. And that memory is on the stack, not on the heap (which might be a disadvantage if stack space is tight). A lot of the code managing the memory can go, so we have more speed with less code and less memory.
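
The wheel extension from the list above can be made concrete with a small table generator. This is only a sketch of the idea; the names gcd32 and build_wheel210 are hypothetical and nothing like this is in the PR yet.

```c
#include <assert.h>
#include <stdint.h>

#define WHEEL_MOD 210u   /* 2 * 3 * 5 * 7 */
#define WHEEL_MAX 48u    /* phi(210) = 1 * 2 * 4 * 6 */

static uint32_t gcd32(uint32_t a, uint32_t b)
{
   while (b != 0u) { uint32_t t = a % b; a = b; b = t; }
   return a;
}

/* Fill gaps[] with the distances between consecutive residues coprime to
   210, starting at residue 1 and wrapping around to 211.
   Returns the number of gaps written. */
static uint32_t build_wheel210(uint8_t gaps[WHEEL_MAX])
{
   uint32_t count = 0u, prev = 1u, r;
   for (r = 2u; r <= WHEEL_MOD + 1u; r++) {
      if (gcd32(r, WHEEL_MOD) == 1u) {
         assert(count < WHEEL_MAX);
         gaps[count++] = (uint8_t)(r - prev);
         prev = r;
      }
   }
   return count;
}
```

Only 48 of every 210 candidates survive this wheel (about 23%), compared with 50% for the odd-only wheel, at the price of a 48-entry gap table and the extra code to step through it.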

But further optimization depends heavily on the success of the actual use: adding a random small prime to the first M-R tests in mp_prime_is_prime. The base of our M-R test is a bigint, so we can multiply the three together (2*3*random_prime) and have a better first test without much more overhead (how much? t.b.d.) to reduce the calls to the quite costly Lucas and/or Frobenius-Underwood tests. LTM offers a way to avoid them and rely on M-R only. That always gave me the…uhm…heebie-jeebies; it made me a bit uncomfortable.

It is quite easy to construct strong pseudoprimes to several small bases, even to all of the bases in LTM's prime table[1]. The algorithm is highly parallelizable; a medium-sized botnet makes quick work of it. I put a quick&dirty example in this gist here. The example call at the end produces a strong pseudoprime to the bases 2, 3, 5, 7, 11, 13, 17, 19, 23, 29 in a couple of seconds (378 milliseconds when I tried it just now, but the trials are random, so it might take longer sometimes), or to the first 15 bases with parameters 50, 1, 61, 173, 64, 100000, or to all 25 prime bases smaller than 100 with parameters 97, 1, 101, 173, 64, 100000. For smaller pseudoprimes use 97, 1, 101, 113, 16, 1000, which is quite quick and spits out 100 spsp's in under 2 minutes.

We could use a normal random base as we already do in the final step which is composite in about n - n/log(n) cases and could be composed out of all small primes. And counting the latter is a problem I'm now stuck with ;-)

A risk that can be minimized to some extent with another large-ish prime in the first M-R round.
The obvious question: is it worth it?
I'll implement it and we can take a look at it. If it is too much hassle for too little gain: dump it, if not: take it.
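
The "n - n/log(n)" estimate for composite random bases used above follows from the prime number theorem. As a rough worked example for n = 2^32 (an approximation only, with log read as the natural logarithm):

```latex
\[
\pi(n) \approx \frac{n}{\ln n}
\quad\Longrightarrow\quad
\#\{\text{composites} \le n\} \approx n - \frac{n}{\ln n},
\qquad
\frac{1}{\ln 2^{32}} = \frac{1}{32 \ln 2} \approx 0.045 .
\]
```

So roughly 95% of uniformly random bases below 2^32 would be composite, and the share grows with n.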

[1] Albrecht, Martin R., et al. "Prime and prejudice: primality testing under adversarial conditions." Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. 2018.
PDF: https://eprint.iacr.org/2018/749.pdf

@MasterDuke17
Collaborator

MasterDuke17 commented Mar 17, 2023 via email

@czurnieden
Contributor Author

czurnieden commented Apr 5, 2023

@MasterDuke17 I ran some tests and found the method using BigInts about ten times slower than the native versions. The native version, which can generate primes up to 32 bits, is about the same speed as the sieve for random access, sometimes even faster. I have not tested sequential generation, but it should be in the same ballpark.

If we use the whole mp_prime_is_prime() shebang it averages out at about three times slower for generating random primes up to 64 bit than the native version.

I have not run the tests against #541 which should shave another usec or two off in that range.

So: the whole work for nothing? Well, that's life ;-)
