Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ATtiny85 PulseIn() issue #384

Closed
5chufti opened this issue Jan 30, 2020 · 30 comments
Closed

ATtiny85 PulseIn() issue #384

5chufti opened this issue Jan 30, 2020 · 30 comments
Labels
Critical Serious issue requiring priority fix Pre-2.0.0 bug Bug present in versions of the core older than 2.0.0 - needs to be reteted in 2.0.0
Milestone

Comments

@5chufti
Copy link

5chufti commented Jan 30, 2020

Hi there,
I don't know if it is an issue or just me missing some information. I tried to migrate a fraction of a 328 project to a tiny: model-RC channel decoder. Input is a pulse of 0.7-1.7ms at a rate of 22.5ms that I "decode" via PulseIn(Port,Pol). Somehow the results from PulseIn() on ATTinyCore differ (too much) from standard Arduino (and digispark). For my testpulse (0,5-2,4ms) I get approx this values on the Uno but only 0.34-1.5 on ATTinyCore. This holds true for 16/8/1MHz whereas baudrate and delay seem to be within reasonable tolerance.
Any hints? suggestions?

thanks,
schufti

P.S.: on digispark the values are comparable to the ones from Uno.

@5chufti
Copy link
Author

5chufti commented Feb 1, 2020

@SpenceKonde
so far I checked the relevant sources but found no apparent differences in wiring_pulse.c and the relevant macros microsecondsToClockCycles() and clockCyclesToMicroseconds()
I also checked the build (compile, link) options/parameters with no (to me) relevant differences.
So all that was left was to look at the asm-listing and that showed a signifficant difference in the code of the pulsein() "inner loop", so the calculation of puleswidth had to be wrong.
Most likely due to different versions of avr-gcc : 4.8.1-arduino5 vs. 7.3.0-atmel3.6.1-arduino5, as result of q'n'd coding - such timing critical sections should be avoided or at least inlined, otherwise they are kind of timebomb or sword of damokles.
Compiling ATTinyCore with 4.8.1-arduino5 (have to disable LTO) resuts in almost the same values for PulseIn() as on Uno/digispark.

real pulsewidth: 542-2390µs
Uno/digispark: 550-2410µs
ATTinyCore org: 340-1500µs
ATTinyCore old gcc: 499-2110µs

Fixing it (like in Arduino core from ~1.6.7) for compiler independence is beyond my avr-asm capabilities.

@sleemanj
Copy link

sleemanj commented Feb 1, 2020

clockCyclesToMicroseconds and vice-versa should not differ from compilation options, that's a straight forward calculation

The pulseIn function just counts how many times it went around the loop waiting for the pulse to end, and multiplies that by a constant (and adds an offset) which represents the number of cycles the loop should take.

So the issue there is that the compilation changes that constant considerably, the solution, is to figure out the new constant, I wouldn't expect that it should change too much other than between LTO enabled or not enabled.

In other words return clockCyclesToMicroseconds(width * 21 + 16); 21 and 16 are no longer correct.

https://github.com/SpenceKonde/ATTinyCore/blob/master/avr/cores/tiny/wiring_pulse.c#L68

I'll assume you are running at 8MHz, so that would mean that each cycle is 1/8000000 seconds which is 0.125 microseconds.

So, for ATTinyCore org: 340-1500µs that would I think make the ideal calculation producing a 1500uS result be.... 1500 = (width * 21 + 16) * 0.125 so you can work that back to find the number of loops that was counted being 570.

The 16 offset is just 2uS so we might as well just accept that as is, what we want is to find a new 21 based on the real pulsewidth. 2390 = ( 570 * X + 16 ) * 0.125 which we can back calculate to 33.51, call it 34.

570 loops for 2390 microseconds means 0.2384 loops per uS, so we should be able to now do some checking to see if 34 is reasonable.

With a real pulsewidth of 542uS, that is 129.26 loops, ( 129 * 34 + 16 ) * 0.125 gives us 550us... that's pretty close to what we want.

So, short answer,

https://github.com/SpenceKonde/ATTinyCore/blob/master/avr/cores/tiny/wiring_pulse.c#L68

at least for LTO compilation, change 21 to 34 and see how that goes for you.

@sleemanj
Copy link

sleemanj commented Feb 1, 2020

NB: it is interesting that the loop is slower by so much (was 21 cycles, now 34 cycles), finding out why would be interesting.

@SpenceKonde
Copy link
Owner

Ugh!

Thanks for your investigation here. Need to see how this is handled on other board packages that use LTO, as this can't possibly be broken on the official core, others would have noticed in the several years that LTO has been enabled there.

@sleemanj
Copy link

sleemanj commented Feb 1, 2020

It looks like the main core basically compiled pulseIn as it stood (well they wrote a temporary function which does the counting and returns the counted width and compiled that), and then took the assembly and used that directly, so that it's always going to end up producing the same "width" regardless of compiler options (optimiser though?)

https://github.com/arduino/ArduinoCore-avr/blob/master/cores/arduino/wiring_pulse.S

arduino/ArduinoCore-avr@2bac15c#diff-b72a8b22493b7c991a53249e7a05c8e3

arduino/ArduinoCore-avr@6940c1d

@5chufti
Copy link
Author

5chufti commented Feb 1, 2020

Hi,
no, LTO is not the problem. The problem seems to be that the new avr-gcc implements/optimizes loops different. Thats why wiring_pulse in default arduino core was inlined 5 years ago to be compiler neutral.
Ofcourse recalculating the constants might work, but then the results would be wrong for people using older versions of ATTinyCore based on older version gcc.

p.s.: I tried taking over the "new" wiring_rules.c and .S from default arduino core, but that only gave me constant 0 as result.

@sleemanj
Copy link

sleemanj commented Feb 1, 2020

then the results would be wrong for people using older versions of ATTinyCore

Well the only way somebody can fix an older version would be to upgrade anyway unless they patch it themselves and in that case changing one number of probably easiest ;-)

ATTinyCore specifies the version of gcc it needs in the package .json file, so an updated fixed constant isn't too bad a fix probably.

Having inlined assembly might be tricky when dealing with the large number of chips that ATTinycore does, vs arduino's own avr core which is rather more limited in what it needs to support.

With the simple constant the scope exists to change that constant for different chips (or even potentially allow the user to "calibrate").

@5chufti
Copy link
Author

5chufti commented Feb 2, 2020

ok, I see, the old cores (with matching gcc) are not affected because they also get the old code.
The ones with the new gcc up to maybe next release have to upgrade or fix it on their own or upgrade.
But: do "the large number of chips" really differ in assembly language? Aren't they based on the same core?
And if "only" corrected constants, then: wouldn't it be better based on counted instructions (cpu cycles) from resulting code than average (not even stable) measured values?

@sleemanj
Copy link

sleemanj commented Feb 2, 2020

"the large number of chips" ... Aren't they based on the same core?

Not necessarily. Within @SpenceKonde 's ATTinyCore itself there are two cores, one for the classic avr, one for modern and I think the timing at least may be different for some instructions there. I also have a fork of the classic which supports the Tiny4/5/9/10 which have a reduced core comprising fewer register, fewer instructions, and different timings.

The AVR Instruction Set reference has this to say...

Several updates of the AVR CPU during its lifetime has resulted in different flavors of the instruction set, especially for the timing of the instructions. Machine code level of compatibility is intact for all CPU versions with a very few exceptions related to the Reduced Core (AVRrc), though not all
instructions are included in the instruction set for all devices.

That's not to say that the assembly can't be written such that it works and maintains the same instruction count/timings across all targets... just that adjusting the constant might be easier is all.

@5chufti
Copy link
Author

5chufti commented Feb 2, 2020

yes, but these two variants (old & new) of aTTiny allready have seperate codebase, so different wiring_pulse.c

SpenceKonde added a commit that referenced this issue Feb 5, 2020
part of fix for #384
@SpenceKonde SpenceKonde reopened this Feb 5, 2020
@SpenceKonde
Copy link
Owner

SpenceKonde commented Feb 5, 2020

As far as I can tell for wiring_pulse.S, that code should work for any register (possibly only if that register is in the low 64 register addresses, but that's the case for all the PINx registers). Hopefully what is in github now will work on all parts except the "modern" ones (where that code just needs to be added).

I just pulled in the stock pulseIn() code, which means we will also get pulseInLong() which was previously missing (though the fact that nobody complained about this is hardly an endorsement of how popular pulseInLong() is)

@SpenceKonde SpenceKonde added this to the 1.3.4 release milestone Feb 5, 2020
@SpenceKonde SpenceKonde added the Pre-2.0.0 bug Bug present in versions of the core older than 2.0.0 - needs to be reteted in 2.0.0 label Feb 5, 2020
@5chufti
Copy link
Author

5chufti commented Feb 8, 2020

Hi @SpenceKonde ,
should be reopenden because:
a) does not compile (.s should be .S because "recipe.S.o.pattern" is case sensitive)
b) does not work, pulseIn() allways returns 0

rgds,
schfti

@SpenceKonde
Copy link
Owner

Huh, will investigate this, wondering why it didn't turn up in travis build checks

@SpenceKonde SpenceKonde reopened this Feb 8, 2020
@SpenceKonde
Copy link
Owner

And, as an aside, if I close an issue, and the issue isn't fixed, you don't need my permission to reopen it and complain! I much prefer that to users suffering with a problem that I think I have fixed. I want to know about all the bugs and issues! Even if I closed something with a dismissive comment and "no plans to change", feel free to make your case that I really should do something about it.

Honestly, I don't even mind when people create issues when they want tech support, as I do give support via email/dm anyway - and with github issues, sometimes someone else following the repo sorts it out before I have a chance to respond! Better than the arduino forums, since the people who follow my repos and respond to issues are at least semi-experts, whereas on the forums, people's eagerness to reply often outweighs their understanding, so I then have to step up and not only answer the original question, but correct the other people who tried to help.

As long as you don't like persist in reopening something repeatedly, without contributing a new case for a fix or information that would help to fix it, or posting straight up spam (actually, it sort of amazes me that github issues don't have a spam problem, or maybe they do for repos with tons of people following them), it's all good. Even heaping verbal abuse on me - it doesn't make me want to stick my neck out to help you, but if it's a legit bug, I'll still be happy to have gotten the report. I think playing an MMO for years, I built up an immunity to verbal abuse...

@SpenceKonde
Copy link
Owner

SpenceKonde commented Feb 9, 2020

NO, this is not fixed, just because I mention an issue in a commit, doesn't mean the commit successfully solved the problem!!!

pulseInLong is now actually available and works, I think (though haven't actually tested it's accuracy)
pulseIn is completely hosed - the timeout parameter is being hopelessly mangled and is much shorter than it should be, but I can't for the life of me figure out why!

have been pounding on this shit for HOURS, and while the fixes I put in were easy and quick, there is a serious problem with pulseIn timing out far too early.

@SpenceKonde SpenceKonde reopened this Feb 9, 2020
@SpenceKonde SpenceKonde added the Critical Serious issue requiring priority fix label Feb 9, 2020
@SpenceKonde
Copy link
Owner

Also, marking critical as multiple people are hounding me about this one

@SpenceKonde
Copy link
Owner

Okay, some improvement has been made, as the timeouts are now consistently 0.7x what was requested. But it seriously looks like we are literally back where we started - why does this solution not work correctly here?

@SpenceKonde
Copy link
Owner

That loop is supposed to run in 16 clock cycles. I was looking at the ratio between the expected and actual times... it is starting to look like, somehow, that loop is instead running in 11 clock cycles - 11/16 is 0.6875, while the ratio of actual to expected, I recorded 0.691 - a little overhead, and there you go.... Yet I most definitely do not understand how a block of assembly could be running in a different number of clock cycles than expected, nor why it doesn't impact the official core.... Turning off LTO does't make a difference unless I botched my testing of that...

@sleemanj
Copy link

Sounds to me like the analysis (clock counting) of the assembly is incorrect, or there is some optimisation going on (check the disassembly from the final compiled).

This sounds like something @nerdralph might have some insight, he knows his way around AVR assembly

@nerdralph
Copy link

nerdralph commented Feb 10, 2020

Hans also asked me to write an assembler version of pulseIn() for MicroCore, but I haven't looked much into it.
MCUdude/MicroCore#30

I'm willing to write a good implementation in asm, but have yet to be convinced (sorry @MCUdude) that there is a need for 100% compatibility with the Arduino implementation which uses unsigned long for timeout and duration. For example, 2^32 us is 4295s, or over 1 hour. While I'm confident I could make an implementation that is 100% compatible, it would come at the cost of increased code size and reduced resolution. And, it would be untested for long durations because I won't waste my time setting up a test to generate an hour long pulse.

The API docs say pulseIn() works on lengths up to 3 minutes, so the full range of unsigned long is clearly not required.

From searching around for code that uses pulseIn, the most common use case I found is ultrasonic distance sensors. Pulse durations are in the range of 1-10ms, so a uint16_t would work well in this case, which would go up to 65.5ms.

So, absent any input on the requirements, my plan would be to maintain only API code compatibility, and internally work with uint16_t rather than unsigned long. I may even be able to do it in C, with minor use of inline asm where the compiler can't be persuaded to generate the asm code I want. This is how I'm doing the new version of my soft UART, as I've found heavy use of inline asm much harder to understand and maintain.

@5chufti
Copy link
Author

5chufti commented Feb 10, 2020

Hi all,
today I had time to (re)test the latest git and .... TADAH: it works as expected.
the times I get are [464-560] to [2224-2336] for real values (542/2390) measured on DSO with both compilers (4.8.1/7.3.0). I think the "noise" is from using tiny_soft_serial for debugging.
For me the issue is solved - appart from the .s/.S problem.

thanks for your effort,
schufti

@SpenceKonde
Copy link
Owner

SpenceKonde commented Feb 10, 2020 via email

@5chufti
Copy link
Author

5chufti commented Feb 10, 2020

If there is antything (send you sketch, schematic, .lst, etc) I can do to help you solve this puzzle, please feel free to ask. That is the least I can do.

@SpenceKonde
Copy link
Owner

Well, I see what I think is the same behavior that you are - I get the correct length for pulses. I do wish I had thought to do that sooner... But the timeout is wrong, though. I wonder how that could be wrong without anything else being wrong, though......

SpenceKonde added a commit that referenced this issue Feb 12, 2020
Okay.... FFS, I wish I had thought of checking it this weekend....
because timeout is broken on the official core too, and here I was
trying to figure out how it could be different between official and my
core. And, yes, of course it is fucking broken! It's right there, black
and white, clear as crystal - in the first two loops, all they have to
do is decrement an unsigned long, and there is no need for a separate
comparison because the Z-flag will be set if the last result was zero.
But on the third loop, they have to increment an unsigned long and then
compare it to another unsigned long.
Once you unpick the hideous mess that the compiler output and count the
instructions, the first two loops run 11 clocks and only the final one,
which measures the pulse length, runs 16.

This should fix it... Give this a spin for me will you, and make sure I
haven't broken it for short pulses or something?
@SpenceKonde
Copy link
Owner

Okay.... FFS, I wish I had thought of checking it this weekend....
because timeout is broken on the official core too, and here I was
trying to figure out how it could be different between official and my
core. And, yes, of course it is fucking broken! It's right there, black
and white, clear as crystal - in the first two loops, all they have to
do is decrement an unsigned long, and there is no need for a separate
comparison because the Z-flag will be set if the last result was zero.
But on the third loop, they have to increment an unsigned long and then
compare it to another unsigned long.
Once you unpick the hideous mess that the compiler output and count the
instructions, the first two loops run 11 clocks and only the final one,
which measures the pulse length, runs 16.

This should fix it... Give this a spin for me will you, and make sure I
haven't broken it for short pulses or something?

@5chufti
Copy link
Author

5chufti commented Feb 12, 2020

ok, tested it with 1,4,8,16MHz internal and get plausible results for the range I need.
With my current setup I can't test timeout situations apart from "no pulse", but in general the primary function (pulseIn) is definitively improved/fixed.

please review the .s/.S topic, it has a reason the original arduino uses .S

@SpenceKonde
Copy link
Owner

ugh, github isn't honoring the case of the file name :-(

@SpenceKonde
Copy link
Owner

let's see if it sticks now. I think it doesn't honor file name case through the windows client because windows file system isn't case sensitive

@pmcclay
Copy link

pmcclay commented Mar 12, 2020

A treat to find work in progress when I hit a problem - I first came by here 7 Feb.
Latest pulseIn() giving results within a fraction of a % on 16.5MHz Digispark clone.
Appreciated. Thanks.

@SpenceKonde
Copy link
Owner

So glad to hear it's working for you!
Also, if you haven't noticed, the timeout is actually correctly honored if if doesn't get a pulse - before the timeout would tick by 5/16ths too fast, and this is busted in official core as well. When I finally figured that out, it was so gratifying.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Critical Serious issue requiring priority fix Pre-2.0.0 bug Bug present in versions of the core older than 2.0.0 - needs to be reteted in 2.0.0
Projects
None yet
Development

No branches or pull requests

5 participants