Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing endless loop (100% cpu), adding IO exceptions when ports closes (unplug USB serial ports), refactored blocking read/write support with interrupt() capabilities, added Input/Output stream support #127

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

factoritbv
Copy link

First of all, we noticed that the return value of select(), read() and write() is not properly checked in the native library (jSSC.cpp). We propose a simple patch, which throws an IOException when something goes wrong while calling the kernel system calls. We have added a time-out of 100ms there, to regularly (re)check for a valid file-descriptor, while using negligible system resources. It fixes the problem of performing a (blocked) read in one thread, and getting the file-descriptor (serial port) closed in another thread. The linux kernel does not return the select() call when this happens, however, it does return -1 for a read() or a select() when it is re-executed (every 100ms). Moreover, it also allows exiting the blocked read (or write) loop when the executing thread is terminated using an interrupt() invocation.

Furthermore, Java works great with Input/Output streams. After you get use to it, you never want to use other ways of handling streaming data (e.g. sockets, i/o ports, files). There are also many helper derivative classes which help parsing data like DataInputStream, which helps to avoid having all kind of read/write functions for several different types:
https://docs.oracle.com/javase/7/docs/api/java/io/DataInputStream.html

You will find a simple SerialPortStream class extension of the jSSC package and adds an Input/Output stream derivative.

import java.net.*;
import java.util.*;
import java.util.concurrent.*;
import java.io.*;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(trivial) Looks as those imports are unused. Should we remove them?

import java.net.*;
import java.util.concurrent.*;

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it is better to remove them

}

if (failure) {
throw new IOException("Error occured while closing and cleaning up the in/output streams");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(trivial) Typo? occured -> occurred

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good fix

}

if (failure) {
throw new IOException("Error occured while closing and cleaning up the in/output streams");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throw new IOException("Error occured while closing and cleaning up the in/output streams");

In case we throw here, we won't call m_serial_port.closePort(). Is this intended?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no this is indeed wrong, I've merged some internal test-classes to get to this result. Better to make sure m_serial_port.closePort() is executed before this io exception is thrown.

m_serial_port_input_stream.available();
} catch (IOException ex) {
try {
this.close();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(just wondering) To me it sounds strange to call close() in scope of isClosed() which looks like a simple getter.

Don't get me wrong. Maybe it's perfectly fine :) I just mention it as it "feels" strange to me.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IOException should be handled correctly by the caller function, which included calling closing of the port. However, it does not breaks much here. Since there is boolean double-check anyway that registers if the port was already closed (before) or not.

I think that calling close() is more fail-safe for less experienced users, and does not really adds more overhead, however, it is not a must-have. There I agree on.

}

public void close() throws IOException {
SerialPortInputStream.this.close();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

private class SerialPortInputStream extends InputStream {
    ...
    public void close() throws IOException {
        SerialPortInputStream.this.close();
    }

I suspect this to end up in an infinite recursion. Isn't it?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good one, you are perfectly right, this should be SerialPortStream.this.close() of course.

I'm afraid this was caused by our turbo refactoring as well ;)

m_buffer_len = m_serial_port.getInputBufferBytesCount();
if (m_buffer_len == 0) {
// Nothing available, just block until the first byte comes available and return directly
return (int)m_serial_port.readBytes(1)[0];

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(int)m_serial_port.readBytes(1)[0]

(WARN) I suspect this to return bad values if the byte value is 0x80 or larger.

Legend for the table below:

byte: The byte value returned by readBytes(1)[0]
expect: The value I would expect it to return
actual: The value I suspect it to return (ToBeVerified)

byte expect actual comment
42 42 42 OK, because below 0x80
200 200 -56 Not the value I would expect (doc says the range is 0-255)
255 255 -1 Not so fun as doc says that -1 indicates EOF

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could be right, however, we never run into problem with this code. But maybe we only focussed on ASCII tranmissions. Feel free to adjust and cast/handle the in the right type.


public void close() throws IOException {
SerialPortStream.this.close();
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor) I mention this as it looks like an oversight. If it is by intent then feel free to ignore this comment :)

It seems as the only thing SerialPortOutputStream.close() does, is to call its caller (which doesn't make sense to me).

class SerialPortStream {
    void close() {
        ...
        try {
            m_serial_port_output_stream.close();
        }
        ...
    }
    ...

    // Seems as it only calls the method which did call us.
    class SerialPortOutputStream {
        void close() {
            SerialPortStream.this.close();
        }
    }
    ...
}

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually makes sense, because as with the Socket class you will perform call .getInputStream() somewhere in the application code and feed that to a another Stream or Reader. When that is closed, the InputStream will be closed, but that should trigger a complete close (also the OutputStream and the SerialPort) self.

A sanity check in the main close() function can be added to check if the Input/Output stream is closed already, but actually the Java docs specify it does not harm to close an Input/Output stream more than once (it is and stays closed).


result = select(portHandle + 1, NULL, &write_fd_set, NULL, &timeout);
if (result < 0) {
env->ThrowNew(env->FindClass("java/io/IOException"), "Waiting for serial port to become writable failed");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(trivial) Only a suggestion. Just ignore that comment if it isn't relevant enough :)

env->ThrowNew(env->FindClass("java/io/IOException"), "Waiting for serial port to become writable failed");

I guess in this case here, we could use strerror(errno) to get a more concise error message provided by select(). Eg:

#include <errno.h>
#include <string.h>

env->ThrowNew(env->FindClass("java/io/IOException"), strerror(errno));

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome suggestion, makes it more readable


result = write(portHandle, jBuffer + (bufferSize - byteRemains), byteRemains);
if(result < 0){
env->ThrowNew(env->FindClass("java/io/IOException"), "Error writing to serial port");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(trivial) Another place where we potentially could use strerror(errno)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same awesome suggestion as before ;)

}

/**
* Waits until 'read()' has something to tell for the specified filedescriptor.
*/
/*

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

    // Default impl using 'poll'. This is more robust against fd>=1024 (eg
    // SEGFAULT problems).
    ....
        int result = poll(fds, 1, -1);

Ough :( too bad we drop poll again. So our project won't be able to upgrade to any newer jssc versions.

I let the decision up to other maintainers if we want to go this route or not 👍

Just for reference:

  • poll got introduced in Use poll in readBytes to fix segfaults with high fd numbers #90
  • We did this because we saw SEGFAULTs irregularly. It took us a long time to find that those FDs were the cause as the issue was hard to reproduce.
  • About why we have large FDs, I only can speculate. One possibility is that there is a lot going on in this process as it is a jetty, hosting half a dozen apps, all of them doing lots of network and device IO.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are completely right, to be honest, we have already commented to the authors of issue #90 that it would be awesome if the poll() technique could be used for this proposal.

We never had any problems with select() because of to many (open?) file-descriptors. So we never really dived into this problem/solution. But we encourage you to adjust/refactor our code and migrate to using the poll() technique.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had an idea how we could reduce the impact without the need for "poll". I did not find a good way to show that patch here. So I opened PR_128 that we can "see" the idea. Basically it just replaces the SEGFAULT by a java exception and so at least we get a proper error message. Additionally I removed the now unused stuff around "poll".

hiddenalpha added a commit to hiddenalpha/jssc that referenced this pull request Jun 20, 2022
In [PR_127] we got miscellaneous improvements. But we do no longer use
poll and therefore are back to the FD_SETSIZE limit of *select*.

This commit suggests some fine-tuning which IMHO are enough so we can
make use of the improvements from [PR_127]. It does so by replacing a
possible SEGFAULT (aka crashing the whole JVM) by throwing a java
exception instead. This improves the reported error and even enables
user code to react to the situation.

[PR_127]: java-native#127
hiddenalpha added a commit to hiddenalpha/jssc that referenced this pull request Jun 20, 2022
In [PR_127] we got miscellaneous improvements. But we do no longer use
poll and therefore are back to the FD_SETSIZE limit of *select*.

This commit suggests some fine-tuning to [PR_127]. It does so by
replacing a possible SEGFAULT (aka crashing the whole JVM) by throwing a
java exception instead. This improves the reported error and even
enables user code to react to the situation.

Still not perfect. But IMHO good enough for now. We can still improve
later (for example as soon there's enough time to do so).

[PR_127]: java-native#127
@tresf tresf marked this pull request as draft November 10, 2023 20:37
@tresf
Copy link

tresf commented Nov 10, 2023

@hiddenalpha I know you've already started work on certain findings here as well as opened a new PR #128. I've converted this PR to a draft, but I think it should continue to be split up into smaller parts, then superseded.

@pietrygamat pietrygamat added this to the 2.9.6 milestone Dec 1, 2023
@pietrygamat pietrygamat removed this from the 2.9.6 milestone Dec 15, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants