
Can't access retrievedStdout/retrievedStderr #29

Open
martinmcclure opened this issue Feb 24, 2018 · 13 comments

Comments

@martinmcclure

#waitForExitPollingEvery: aDelay retrievingStreams: true
seems less useful than it could be due to having no way (that I can see) to access the retrieved output and error streams' contents.

The use case is for testing -- to fork a subprocess, write something to its stdin, wait for it to exit, then check its outputs and exit status for expected values.

For now I've subclassed OSSUnixSubprocess to give myself access. Please let me know if there's a better way to do this.
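(For comparison with other environments: the pattern described here -- spawn a child, feed its stdin, wait for exit, then check outputs and exit status -- looks roughly like this in Python. The helper name `run_and_check` is made up for illustration; `subprocess.run` drains stdout/stderr while writing stdin, which sidesteps the pipe-blocking issue that comes up later in this thread.)

```python
# Sketch of the same test pattern using Python's subprocess module,
# shown only to illustrate the use case (not OSSUnixSubprocess itself).
import subprocess

def run_and_check(argv, input_text, expected_stdout,
                  expected_stderr, expected_status):
    # run() feeds stdin and collects stdout/stderr concurrently,
    # so large outputs cannot deadlock the pipes.
    result = subprocess.run(argv, input=input_text,
                            capture_output=True, text=True)
    assert result.stdout == expected_stdout, result.stdout
    assert result.stderr == expected_stderr, result.stderr
    assert result.returncode == expected_status, result.returncode

run_and_check(["/bin/cat"], "hello\n", "hello\n", "", 0)
```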

@marianopeck
Collaborator

marianopeck commented Feb 24, 2018

Hi @martinmcclure
Isn't testBasicCommandWriteToStdin something like what you want to do? In that test I use waitForExit, which waits on the SIGCHLD signal. You may want to use waitForExitPollingEvery: aDelay retrievingStreams: true if you want to poll and, while polling, retrieve the streams (to make room in the pipes). You can also use waitForExitWithTimeout:, which is safer in case your child hangs; otherwise Pharo itself would hang, waiting for a SIGCHLD that never arrives.

Is it clear? If not, show me how you had to subclass OSSUnixSubprocess so that I understand better.

@martinmcclure
Author

Well, testBasicCommandWriteToStdin is something like what I want to do. But I was worried that the program might write too much to stdout or stderr, block on the pipe, and get stuck, so using waitForExit seemed like a bad idea.
Using waitForExitPollingEvery: aDelay retrievingStreams: true is exactly what I'm doing, but the "retrievingStreams: true" part seems largely useless, because there is no way, after the process exits, to actually see what the process wrote.

Which is the subject of this bug report.

My current subclass just adds accessors like this one:

retrievedStderr
	^ retrievedStderr contents

and another for stdout, so one can see what the process wrote.

@marianopeck
Collaborator

OK, I understand now. I guess the main reason is that its most common sender is runAndWaitPollingEvery: aDelay retrievingStreams: retrieveStreams onExitDo: onExitClosure; as you can see in its senders, the accumulated streams are culled into the block, so you have them right there.

OK, I agree with your change, but I would add a normal accessor that answers the stream rather than its contents. Do you agree?

@martinmcclure
Author

What I'd originally done was use run to launch the subprocess, then write to its stdin, then use waitForExitPollingEvery:retrievingStreams: to wait for it to exit. Eventually I realized that if the subprocess wrote too much to its stdout/stderr before I got around to doing the wait, I'd have problems.

So I then switched to forking a new Smalltalk Process to do the waitForExitPolling... and wrote to stdin from the original Smalltalk Process, and used a Smalltalk semaphore to synchronize when the subprocess had exited.

And now that I'm using that kind of approach, I think that runAndWaitPollingEvery... is indeed a better message to use.
So perhaps waitForExitPolling... is not really supposed to be useful for anything but runAndWaitPollingEvery... and the best fix for my misunderstanding is to mark waitForExitPolling... as private.

You could add on the accessors as I've done (or straight accessors for the streams, as you suggest) but I no longer think that's the best solution.

@martinmcclure
Author

In the end, I ended up with the method below. Does this look like correct/best usage of OSSUnixSubprocess? Thanks for your patience with this OSSubprocess newbie. :-)

run: executablePath 
withArguments: argArray 
stdin: inputString 
expectingStdout: expectedStdOut 
expectingStderr: expectedStdErr 
expectingExitStatus: expectedExitStatus
	"Does not change the current working directory, or the environment."

	| process exitedSemaphore actualStdout actualStderr |
	exitedSemaphore := Semaphore new.
	process := OSSUnixSubprocess new
		command: executablePath;
		arguments: argArray;
		redirectStdin;
		redirectStdout;
		redirectStderr;
		yourself.
	[ process
		runAndWaitPollingEvery: (Delay forMilliseconds: 50)
		retrievingStreams: true
		onExitDo: [ :p :stdout :stderr | 
			actualStdout := stdout.
			actualStderr := stderr.
			exitedSemaphore signal ] ] forkAt: Processor activeProcess priority + 1.
	process stdinStream
		nextPutAll: inputString;
		close.
	exitedSemaphore wait.
	self
		assert: actualStdout equals: expectedStdOut;
		assert: actualStderr equals: expectedStdErr;
		assert: process exitStatusInterpreter exitStatus equals: expectedExitStatus

@marianopeck
Collaborator

marianopeck commented Feb 24, 2018 via email

@martinmcclure
Author

The fork is an attempt to avoid deadlock. It is necessary with large strings (>1MB or so) but, it turns out, not sufficient. If I invoke my test method with this kind of simple but large test:

testLargeCat
	| stream bigString |
	stream := WriteStream on: String new.
	50000 timesRepeat: [ stream nextPutAll: 'This is a string of significant length ' ].
	bigString := stream contents.
	self
		run: '/bin/cat'
		withArguments: #()
		stdin: bigString
		expectingStdout: bigString
		expectingStderr: ''
		expectingExitStatus: 0

Then it deadlocks, because I'm attempting to dump all of the characters into cat's stdin at once, over a blocking pipe. cat echoes each character back, but I only poll every 50 ms, so cat's write pipe fills up because I haven't read from it yet; cat blocks on write and stops reading its stdin, while I keep sending characters, so the VM blocks on its own write. Deadlock. So what I have to do is send a little, read a little, send a little, read a little... until done. The code below (getting complicated) seems to work. I don't immediately see a simpler way to do it (maybe after you have threaded FFI callouts...).

run: executablePath withArguments: argArray stdin: inputString expectingStdout: expectedStdOut expectingStderr: expectedStdErr expectingExitStatus: expectedExitStatus
	"Does not change the current working directory, or the environment."

	| inputStream process exitedSemaphore actualStdout actualStderr counter |
	exitedSemaphore := Semaphore new.
	inputStream := ReadStream on: inputString.
	actualStdout := actualStderr := ''.
	process := OSSUnixSubprocess new
		command: executablePath;
		arguments: argArray;
		redirectStdin;
		redirectStdout;
		redirectStderr;
		yourself.
	[ process
		runAndWaitPollingEvery: (Delay forMilliseconds: 50)
		retrievingStreams: true
		onExitDo: [ :p :stdout :stderr | 
			actualStdout := stdout.
			actualStderr := stderr.
			exitedSemaphore signal ] ] forkAt: Processor activeProcess priority + 1.
	counter := 0.
	[ inputStream atEnd ]
		whileFalse: [ counter := counter + 1.
			counter \\ 100000 = 0
				ifTrue: [ (Delay forMilliseconds: 50) wait ].
			process stdinStream nextPut: inputStream next ].
	process stdinStream close.
	exitedSemaphore wait.
	self
		assert: actualStdout equals: expectedStdOut;
		assert: actualStderr equals: expectedStdErr;
		assert: process exitStatusInterpreter exitStatus equals: expectedExitStatus
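(The same send-a-little/read-a-little interleaving can be sketched against plain Unix pipes. The Python version below illustrates only the mechanism, not the OSSUnixSubprocess API: stdin is made non-blocking and select() decides whether to write or to drain stdout, so neither end can block the other. The function name `pump` and the chunk size are arbitrary.)

```python
import os
import select
import subprocess

def pump(argv, input_bytes):
    """Feed input_bytes to argv's stdin while draining its stdout,
    interleaving writes and reads so neither pipe fills up."""
    p = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    os.set_blocking(p.stdin.fileno(), False)
    out, pos = [], 0
    while pos < len(input_bytes):
        # Block until stdin is writable or stdout has data.
        readable, writable, _ = select.select([p.stdout], [p.stdin], [])
        if writable:
            try:
                pos += os.write(p.stdin.fileno(),
                                input_bytes[pos:pos + 65536])
            except BlockingIOError:
                pass  # pipe is full; go drain some output first
        if readable:
            out.append(os.read(p.stdout.fileno(), 65536))
    p.stdin.close()
    rest = p.stdout.read()  # drain whatever remains after EOF on stdin
    p.wait()
    return b"".join(out) + rest

big = b"This is a string of significant length " * 50000
assert pump(["/bin/cat"], big) == big
```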

@martinmcclure
Author

...actually, this solution works for cat, where the number of characters on stdin and stdout is the same, but what if you had a program "cat5000" that output 5000 characters for every character on stdin? My code would still lock up the VM. One good answer would be to use a non-blocking pipe for stdin, and when you get an EWOULDBLOCK, stop writing and block the Smalltalk Process doing the write on a Smalltalk semaphore. Then add the file descriptor to the VM's list of descriptors being monitored via poll() or select(), so the VM can signal the semaphore when the pipe is ready to accept more characters. A similar strategy would work for stdout and stderr -- you could have a Smalltalk Process for each, and you wouldn't have to poll at the Smalltalk level at all. I don't know whether the VM offers support for such a thing, but it would be useful if it did.

@marianopeck
Collaborator

Hi @martinmcclure

Nice experiment. As you realized, I do have support for makeNonBlockingPipe, but we are currently using that (by default) only for stdout and stderr. For stdin we use blocking pipes, so yes, your forked process must read quickly enough. Another alternative you could try is to use files for stdin rather than a blocking pipe, but I have experienced problems with that in the past. Another (admittedly crude) idea is to just make the writes into stdin slower: write some, then wait, then write more, then wait, etc., just to give the forked process time to read from it.

"that output 5000 characters for every character on stdin? My code would still lock up the VM" -- it's not clear to me that it would lock up the VM. In that case, maybe using a smaller polling interval for retrieving from the output streams would help?

About EWOULDBLOCK: I do have support for attaching a semaphore to a given signal. I am using an OSProcess (yes, OSProcess) primitive for this, but it's OK for now. See implementors and senders of forwardSigChld and sigChldSemaphore. But once you get the signal and you signal the semaphore, how do you know which pipe is the one that became ready again?

BTW, @guillep may also have some ideas as he has been doing a lot of stress tests recently...

Let me know what do you think,

@martinmcclure
Author

Hi @marianopeck
What I'm suggesting would be nice is not the ability to associate a semaphore with a signal (which is already there), but the ability to associate a semaphore with a file descriptor. It looks like this is done in some places -- AsyncFile and Socket>>waitForDataFor:ifClosed:ifTimedOut:, for instance -- but it's not clear whether the primitives provided there are general enough to be used with pipes, which is a pity.

With this kind of approach, there would be a separate semaphore for each non-blocking pipe. You'd read (or write, depending on which end of the pipe you had) through a primitive that basically calls the OS read() or write() function. If the result was EWOULDBLOCK, you'd wait on the semaphore for that pipe. The VM would then signal the semaphore when data could once again be read (or written).
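(This semaphore-per-descriptor scheme can be modelled outside the VM. In the Python sketch below, a threading.Event stands in for the Smalltalk semaphore and a select() loop stands in for the VM's poll; the class name and structure are hypothetical, purely to show the wait-on-EWOULDBLOCK/signal-on-writable handshake.)

```python
import os
import select
import threading

class NonBlockingWriter:
    """Writer that parks on an event when the pipe would block;
    a poller signals the event when the fd is writable again."""
    def __init__(self, fd):
        self.fd = fd
        os.set_blocking(fd, False)
        self.writable = threading.Event()

    def write_all(self, data):
        pos = 0
        while pos < len(data):
            try:
                pos += os.write(self.fd, data[pos:])
            except BlockingIOError:
                # EWOULDBLOCK: wait until the poller signals readiness.
                self.writable.clear()
                self.writable.wait()

    def poll_loop(self, done):
        # Stand-in for the VM's select()/poll() monitoring loop.
        while not done.is_set():
            _, w, _ = select.select([], [self.fd], [], 0.01)
            if w:
                self.writable.set()

# Demonstrate with a plain pipe: one thread drains, one writes 1 MB.
r, w = os.pipe()
writer = NonBlockingWriter(w)
done = threading.Event()
poller = threading.Thread(target=writer.poll_loop, args=(done,))
poller.start()
chunks = []
def drain():
    while True:
        b = os.read(r, 65536)
        if not b:
            break
        chunks.append(b)
reader = threading.Thread(target=drain)
reader.start()
writer.write_all(b"x" * 1_000_000)
done.set()
poller.join()
os.close(w)       # EOF for the reader
reader.join()
assert b"".join(chunks) == b"x" * 1_000_000
```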

@marianopeck
Collaborator

Hi @martinmcclure
OK, I think I get it now. Unfortunately, I am not aware of that functionality in the VM. Would you like to ask on the VM mailing list? Maybe it is... somewhere...

@marianopeck
Collaborator

@martinmcclure

Could you please take a look at this conversation? pharo-project/pharo-vm#142 (comment) Do you see any relation to what @zecke is saying?

Cheers,

@martinmcclure
Author

@marianopeck
I can't immediately tell whether that is the same idea or not. It does seem related, but doesn't look quite the same.
