-
Notifications
You must be signed in to change notification settings - Fork 74
[#722] fix segfault and hung threads on KeyboardIinterrupt during parallel get #728
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
2700f86
3548b0d
a252628
8cf893f
e02c504
e9d9800
e3d5123
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -312,6 +312,106 @@ will spawn a number of threads in order to optimize performance for | |
| iRODS server versions 4.2.9+ and file sizes larger than a default | ||
| threshold value of 32 Megabytes. | ||
|
|
||
| Because multithreaded processes under Unix-type operating systems sometimes | ||
| need special handling, it is recommended that any put or get of a large file | ||
| be appropriately handled in the case that a terminating signal aborts the | ||
| transfer: | ||
|
|
||
| ```python | ||
| from irods.parallel import abort_parallel_transfers | ||
|
|
||
| def handler(signal, _): | ||
| abort_parallel_transfers() | ||
|
|
||
| signal(SIGTERM, handler) | ||
|
|
||
| try: | ||
| # A multi-1247 put or get can leave non-daemon threads running if not treated with care. | ||
| session.data_objects.put(...) | ||
| except KeyboardInterrupt: | ||
| # Internally, the library has likely already started a shutdown for the | ||
| # present put operation, but we can non-destructively issue the following | ||
| # call to ensure any other ongoing transfers are aborted, prior to re-raising: | ||
| abort_parallel_transfers() | ||
|
|
||
| printf('Due to a SIGINT or Control-C, the put failed.') | ||
| # Raise again, as is customary when catching a directly BaseException-derived object. | ||
| raise | ||
| except RuntimeError: | ||
| printf('The put failed.') | ||
| # ... | ||
| ``` | ||
|
|
||
| In general it is better (for applications wanting to gracefully handle | ||
| interrupted lengthy data transfers to/from iRODS data objects) to anticipate | ||
| control-C by handling both `KeyboardInterrupt` and `RuntimeError`, as shown | ||
| above. | ||
|
|
||
| Of course, had we intercepted SIGINT (meaning we assigned it a custom | ||
| handler), we could avoid the need to handle the `KeyboardInterrupt` in an | ||
| exception clause. | ||
|
|
||
| Internally, of course, the PRC must handle all eventualities, including | ||
| `KeyboardInterrupt`, by closing down the current transfer being setup or | ||
| waited on, in consideration of very simple applications which might do no | ||
| signal or exception handling of their own. Otherwise we would risk some | ||
| non-daemon threads not finishing, which could risk preventing a prompt and | ||
| orderly exit from the main program. | ||
|
|
||
| When a signal or exception handler calls `abort_parallel_transfers()`, all | ||
| parallel transfers are aborted immediately. Upon return to the normal flow | ||
| of the main program, the affected transfers will raise `RuntimeError` to | ||
| indicate the PUT or GET operation has failed. | ||
|
|
||
| Note that `abort_parallel_transfers()` is designed to be safe for inclusion | ||
| in signal handlers (e.g. it may be called several times without detrimental | ||
| effects); and that it returns promptly after having initiated the process of | ||
| shutting down the data transfer threads, rather than waiting for them to | ||
| terminate first. This owes to the best practice of minimizing time spent in | ||
| signal handlers. However, if desired, `abort_parallel_transfers()` may be | ||
| iterated subsequently with `(dry_run=True, ...)` to track the progress of the | ||
| shutdown. The default object returned (a dictionary whose keys are weak | ||
| references to the thread managers) will have a boolean value of `False` once | ||
| all transfer threads have exited. | ||
|
|
||
| The following example shows how to abort all synchronous ("foreground") puts | ||
| while leaving background transfers alone: | ||
|
|
||
| ```python | ||
| import irods.helpers, threading | ||
| from irods.parallel import abort_parallel_transfers, FILTER_FUNCTIONS, io_main, Oper | ||
|
|
||
| session = irods.helpers.make_session() | ||
| hc = irods.helpers.home_collection(session) | ||
|
|
||
| sessions_for_threads = [session.clone() for _ in range(3)] | ||
|
|
||
| # Launch an asynchronous (i.e. "background") transfer | ||
| io_main(session, '{hc}/target_0', Oper.PUT|Oper.NONBLOCKING, 'my_large_file') | ||
|
|
||
| Threads = [ threading.Thread(target=lambda _sess, put_args: _sess.data_objects.put(*put_args, num_threads=2), | ||
| args=(sess,["my_large_file", f"{hc}/target_"+str(i+1)])) | ||
| for i, sess in enumerate(sessions_for_threads) ] | ||
|
|
||
| try: | ||
| # Launch transfer threads | ||
| for t in Threads: | ||
| t.start() | ||
| # Wait on transfer threads | ||
| for t in Threads: | ||
| t.join() | ||
| except KeyboardInterrupt: | ||
| # Trapping control-C, stop transfers launched synchronously, i.e. in the foreground of each transfer thread | ||
| # explicitly launched above. | ||
| abort_parallel_transfers(filter_function = FILTER_FUNCTIONS.foreground) | ||
|
Comment on lines
+404
to
+406
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is If so, is it documented?
Collaborator
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this would be it, for documentation, right now. I could and probably should add a comment above the definition, now that you mention it.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes please. |
||
|
|
||
| # [...] | ||
| # As the main application continues (or exits), the asynchronous transfer will be the only one to finish | ||
| # successfully after this call to `abort_parallel_transfers`. Note: Although care is taken not to | ||
| # leave replicas in a locked state, these relics of unfinished transfers can still remain in the | ||
| # object catalog. | ||
| ``` | ||
|
|
||
| Progress bars | ||
| ------------- | ||
|
|
||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.