Node cannot access the sandboxes home directory
The sandboxes home directory is the location of shared sandboxes (configured by the sandboxes.home server property). The directory can be on a local or network file system. If the directory is not accessible, the node cannot work correctly (e.g. jobs cannot be executed). In that case, the affected node must be suspended so that no jobs are sent to it.
The suspended node can be resumed once the directory is accessible again; see the Auto-Resuming in Unreliable Network section.
Timeline describing the scenario:
- sandboxes home is connected to a remote file system
- the connection to the file system is lost
- periodic check is executed trying to access the directory
- if the check fails, the node is suspended
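The check-and-suspend step in the timeline can be illustrated with a minimal Python sketch. It is not the server's actual implementation: the function names (probe_write, is_accessible, check_once) and the suspend callback are hypothetical, and the idea that the check writes a small file is only inferred from the name of the filewrite.timeout property.

```python
import concurrent.futures
import os
import uuid


def probe_write(directory: str) -> None:
    """Write and delete a small marker file; raises OSError on failure.

    Assumption: the real check is some form of file write, as the
    'filewrite.timeout' property name suggests.
    """
    path = os.path.join(directory, f".probe-{uuid.uuid4().hex}")
    with open(path, "w") as handle:
        handle.write("ok")
    os.remove(path)


def is_accessible(directory: str, timeout_ms: int) -> bool:
    """Return True if the probe finishes without error within timeout_ms."""
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = executor.submit(probe_write, directory)
    try:
        future.result(timeout=timeout_ms / 1000)
        return True
    except (concurrent.futures.TimeoutError, OSError):
        # Either the write hung past the timeout or it failed outright.
        return False
    finally:
        # Do not wait for a probe that may be stuck on a dead mount.
        executor.shutdown(wait=False)


def check_once(directory: str, timeout_ms: int, suspend) -> bool:
    """Run one check; call the (hypothetical) suspend callback on failure."""
    ok = is_accessible(directory, timeout_ms)
    if not ok:
        suspend()
    return ok
```

A scheduler would call check_once repeatedly, waiting at least the configured check interval between calls. Running the probe in a worker thread is what allows a hung network mount to be reported as a failure instead of blocking the checker indefinitely.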
The following configuration properties set the time intervals mentioned above:
sandboxes.home.check.checkMinInterval
- Interval between sandboxes home checks, in milliseconds.
  Default: 20000
sandboxes.home.check.filewrite.timeout
- Timeout for accessing sandboxes home, in milliseconds.
  Default: 600000
  Be careful: setting the timeout too low might force a node under heavy load to suspend even if the sandboxes home is actually available.
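To get a feel for how the two properties combine, the following Python sketch estimates how long it may take before a lost connection leads to a suspend. It rests on an assumption that is not stated in this section: the connection drops right after a successful check, the next check starts one interval later, and that check blocks for the full timeout before failing. The helper names are hypothetical, and the result is a rough estimate, not documented behavior.

```python
# Defaults taken from the property descriptions above.
CHECK_MIN_INTERVAL_MS = 20_000   # sandboxes.home.check.checkMinInterval
FILEWRITE_TIMEOUT_MS = 600_000   # sandboxes.home.check.filewrite.timeout


def worst_case_detection_ms(interval_ms: int, timeout_ms: int) -> int:
    """Estimate the delay between losing the directory and suspending.

    Assumes one full interval passes before the next check starts and
    that the check then hangs until the timeout expires.
    """
    return interval_ms + timeout_ms


def format_ms(ms: int) -> str:
    """Render a millisecond value as whole minutes and seconds."""
    minutes, remainder = divmod(ms, 60_000)
    return f"{minutes} min {remainder // 1000} s"


if __name__ == "__main__":
    delay = worst_case_detection_ms(CHECK_MIN_INTERVAL_MS, FILEWRITE_TIMEOUT_MS)
    print(format_ms(delay))  # prints "10 min 20 s"
```

Under these assumptions, the default values allow roughly ten minutes before a hung file system causes a suspend. Lowering the timeout shortens that window but increases the risk, noted above, of suspending a node that is merely slow under load.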