CWL conformance test test_conformance_v1_2_timelimit_basic is a false green - it passes because a broad exception handler (lib/galaxy_test/base/populators.py:3191-3194) catches an unrelated failure, not because Galaxy actually enforces the ToolTimeLimit requirement. Galaxy declares ToolTimeLimit in SUPPORTED_TOOL_REQUIREMENTS but never extracts, propagates, or enforces the timelimit value.
- Add `timelimit` as a first-class resource requirement in Galaxy XML/YAML tools (non-CWL)
- Wire it through the existing resource requirements infrastructure (which implicitly reaches TPV)
- Enforce it in the local job runner
- Wire CWL `ToolTimeLimit` into Galaxy's resource requirement system
Commits 1-3 are CWL-independent and can be branched off for a standalone PR. Commit 4 is CWL-specific.
- `ResourceType` literal (`lib/galaxy/tool_util/deps/requirements.py:236-249`): 12 types (cores, ram, tmpdir, cuda, shm). No timelimit.
- `ResourceRequirement` class (same file, line 253): stores `value_or_expression` + `resource_type`, supports numeric values and expressions (expressions raise `NotImplementedError`).
- `resource_requirements_from_list()` (line 280): maps CWL camelCase keys to Galaxy snake_case keys via the `cwl_to_galaxy` dict. For Galaxy-format items (`type: resource`), valid keys come from `cwl_to_galaxy.values()`.
- `ResourceRequirement` Pydantic model (`lib/galaxy/tool_util_models/tool_source.py:56-73`): separate model for YAML tool validation/schema generation. Has explicit fields for each resource type. Drives `ToolSourceSchema.json` generation.
- Local runner (`lib/galaxy/jobs/runners/local.py`): already has `__poll_if_needed()` (line 235) that polls running processes and calls `job_wrapper.check_limits()` for output size and global walltime.
- Job wrapper `check_limits()` (`lib/galaxy/jobs/__init__.py:2414`): checks the global `walltime_delta` from job config. The `runtime` parameter is a `datetime.timedelta`. This is an admin-set global limit, NOT a per-tool limit.
- Job wrapper `has_limits()` (line 2442): gates whether `__poll_if_needed()` activates polling. Currently only checks global output_size and walltime.
- Runner states (`lib/galaxy/jobs/runners/util/__init__.py:10-17`): includes `WALLTIME_REACHED` and `GLOBAL_WALLTIME_REACHED` as distinct states.
- CWL ToolTimeLimit format: `ToolTimeLimit: { timelimit: <seconds> }` where the value is an int/float in seconds. Can also be an expression `$(...)`.
- XSD schema (`galaxy.xsd:8090-8156`): `ResourceType` simpleType with 12 enumerated values.
- Other runners using check_limits(): `drmaa.py` and `pbs.py` also call `check_limits()` via the runner state wrapper - they'll automatically benefit. Kubernetes and Pulsar do NOT use `check_limits()` and won't get timelimit support from this change.
The existing walltime limit in check_limits() is a global admin setting from job_conf.xml. Our new timelimit is a per-tool declaration by the tool author. These are complementary - both should be checked, with the more restrictive one winning.
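As a rough illustration of "the more restrictive one wins", the helper below picks the effective limit from an optional global walltime and an optional per-tool timelimit. The name `effective_limit` and its signature are hypothetical and not part of Galaxy. The plan itself checks both limits independently instead of merging them, which should lead to the same outcome.

```python
import datetime
from typing import Optional


def effective_limit(
    global_walltime: Optional[datetime.timedelta],
    tool_timelimit_seconds: Optional[float],
) -> Optional[datetime.timedelta]:
    """Return the stricter of the two limits, or None if neither applies.

    Hypothetical helper for illustration only. A per-tool value of zero
    (or less) is treated as "no limit".
    """
    candidates = []
    if global_walltime is not None:
        candidates.append(global_walltime)
    if tool_timelimit_seconds is not None and tool_timelimit_seconds > 0:
        candidates.append(datetime.timedelta(seconds=tool_timelimit_seconds))
    # The smallest timedelta is the most restrictive limit.
    return min(candidates) if candidates else None
```

With a one-hour global walltime and a 3 second tool timelimit, the tool limit is the effective one; with a zero tool timelimit, only the global walltime applies.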
| Test | Tool | should_fail | Expected Behavior |
|---|---|---|---|
| timelimit_basic | timelimit.cwl | true | sleep 15, 3s limit - killed by timeout |
| timelimit_invalid | timelimit2.cwl | true | negative timelimit -1 - CWL schema validation error |
| timelimit_zero_unlimited | timelimit3.cwl | false | zero timelimit = no limit, sleep 15 succeeds |
| timelimit_from_expression | timelimit4.cwl | true | $(1+2) expression - requires JS eval |
| timelimit_expressiontool | timelimit5.cwl | false | ExpressionTool ignores timelimit |
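Files to modify (Commit 1, add `timelimit` resource type):
- `lib/galaxy/tool_util/deps/requirements.py`
  - Add `"timelimit"` to the `ResourceType` literal (line 248, before the closing paren).
  - No need to add a CWL key to the `cwl_to_galaxy` dict yet: CWL ToolTimeLimit is a separate requirement class, not a ResourceRequirement field, and the dict's CWL keys are only consulted for `class: ResourceRequirement` items. The `"timelimit"` key is expected to be found via `cwl_to_galaxy.values()` for Galaxy-format `type: resource` items. Caveat: because that key set is derived from the dict's values, confirm that `"timelimit"` actually ends up in it; if it does not, a Galaxy-only entry or an explicit addition to the valid key set is needed.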
- `lib/galaxy/tool_util_models/tool_source.py`
  - Add a `timelimit: Optional[Union[int, float]] = None` field to the `ResourceRequirement` Pydantic model (after `shm_size`, ~line 73). This is required for YAML tool validation and `ToolSourceSchema.json` generation.
- `lib/galaxy/tool_util/xsd/galaxy.xsd`
  - Add `<xs:enumeration value="timelimit">` with documentation to the `ResourceType` simpleType (after `shm_size`, before line 8155). Doc: "Maximum time in seconds the tool is allowed to run. Job will be terminated if exceeded."
- `test/functional/tools/resource_requirements.xml`
  - Add `<resource type="timelimit">60</resource>` after line 14.
- `test/unit/tool_util/test_parsing.py`
  - Update the `TOOL_XML_1` fixture: add `<resource type="timelimit">60</resource>` (~line 55).
  - Update the `TOOL_YAML_1` fixture: add a `- type: resource` / `timelimit: 60` block (~line 166).
  - Update `TestXmlLoader.test_requirements()`: change the count from 7 to 8 and add `assert resource_requirements[7].resource_type == "timelimit"` after line 386.
  - Update `TestYamlLoader.test_requirements()`: change `len(resource_requirements) == 7` to `== 8` (line 574) and add an assertion for `resource_requirements[7]`.
- Regenerate `ToolSourceSchema.json` (via `client/src/components/Tool/rebuild.py`) after the `tool_source.py` change.
Red-to-green test: Write the test assertion for timelimit first, see it fail (resource type not found), then add the type.
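To make the red-to-green step concrete, here is a minimal, self-contained sketch of how a Galaxy-format `type: resource` item could be validated against the set of known resource types. `ResourceRequirementSketch` and `parse_galaxy_resources` are hypothetical stand-ins and not Galaxy's real classes or functions. The sketch also assumes valid keys are derived from the `ResourceType` literal, which may differ from how Galaxy actually builds its key set.

```python
from typing import Any, Dict, List, Literal, get_args

# Simplified stand-in for Galaxy's ResourceType literal: only a few of the
# existing types are listed, plus the proposed "timelimit".
ResourceType = Literal["cores_min", "ram_min", "tmpdir_min", "timelimit"]
VALID_GALAXY_KEYS = set(get_args(ResourceType))


class ResourceRequirementSketch:
    """Hypothetical, minimal model of a parsed resource requirement."""

    def __init__(self, value_or_expression: Any, resource_type: str):
        self.value_or_expression = value_or_expression
        self.resource_type = resource_type
        try:
            # Numbers (and numeric strings from XML) are usable immediately.
            self._value = float(value_or_expression)
            self.runtime_required = False
        except (TypeError, ValueError):
            # Anything else is treated as an expression such as "$(1+2)".
            self._value = None
            self.runtime_required = True

    def get_value(self) -> float:
        if self.runtime_required:
            raise NotImplementedError("expression evaluation is not implemented")
        return self._value


def parse_galaxy_resources(items: List[Dict[str, Any]]) -> List[ResourceRequirementSketch]:
    """Turn Galaxy-format `type: resource` items into requirement objects."""
    parsed = []
    for item in items:
        if item.get("type") != "resource":
            continue
        for key, value in item.items():
            if key == "type":
                continue
            if key not in VALID_GALAXY_KEYS:
                raise ValueError(f"unknown resource type: {key}")
            parsed.append(ResourceRequirementSketch(value, key))
    return parsed
```

Removing `"timelimit"` from the literal in this sketch makes the parse raise `ValueError`, which corresponds to the "red" state before the type is added.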
Files to modify (Commit 2, local runner timelimit enforcement):
- `lib/galaxy/jobs/runners/util/__init__.py`
  - Add a new runner state: `TOOL_TIMELIMIT_REACHED="tool_timelimit_reached"` (after `GLOBAL_WALLTIME_REACHED`, ~line 15). Distinct from the walltime states for operational visibility.
- `lib/galaxy/tools/__init__.py`
  - After `self.resource_requirements` is set (~line 1535), extract the timelimit:
    ```python
    self.timelimit = None
    for rr in self.resource_requirements:
        if rr.resource_type == "timelimit" and not rr.runtime_required:
            self.timelimit = rr.get_value()
            break
    ```
  - No dedicated `parse_timelimit()` on the tool source interface - `cores_min` having its own accessor is a historical artifact. Extracting from the already-parsed resource_requirements list is cleaner.
- `lib/galaxy/jobs/__init__.py` (JobWrapper)
  - In `has_limits()` (~line 2442): add a check for the per-tool timelimit:
    ```python
    has_tool_timelimit = self.tool is not None and getattr(self.tool, 'timelimit', None) is not None
    return has_output_limit or has_walltime_limit or has_tool_timelimit
    ```
    This is critical - without it, `__poll_if_needed()` won't activate when only a per-tool timelimit exists (no global walltime configured).
  - In `check_limits()` (~line 2414): after the global walltime check, add the per-tool timelimit check:
    ```python
    if self.tool and getattr(self.tool, 'timelimit', None) and runtime is not None:
        timelimit_seconds = self.tool.timelimit
        if timelimit_seconds > 0:  # zero = no limit (CWL spec)
            timelimit_delta = datetime.timedelta(seconds=timelimit_seconds)
            if runtime > timelimit_delta:
                return (
                    JobState.runner_states.TOOL_TIMELIMIT_REACHED,
                    f"Job exceeded tool time limit ({timelimit_seconds}s)"
                )
    ```
    Note: `runtime` is already a `timedelta`, so we convert the timelimit seconds to a `timedelta` for comparison.
- `lib/galaxy/jobs/runners/local.py`
  - No changes needed. `__poll_if_needed()` already calls `job_wrapper.has_limits()` and `job_wrapper.check_limits(runtime=...)`.
- `lib/galaxy/jobs/runners/state_handlers/resubmit.py`
  - Add `tool_timelimit_reached` to the MESSAGES dict (~line 19):
    ```python
    tool_timelimit_reached="it exceeded the tool's time limit",
    ```
  - Add to `_ExpressionContext` (~line 147):
    ```python
    "tool_timelimit_reached": runner_state == JobState.runner_states.TOOL_TIMELIMIT_REACHED,
    ```
- `test/unit/app/jobs/test_runner_local.py`
  - `MockJobWrapper.has_limits()` is hardcoded to `False` (line 214). Update it to check `self.tool.timelimit`:
    ```python
    def has_limits(self):
        return getattr(self.tool, 'timelimit', None) is not None
    ```
  - Add a `check_limits()` mock method that mirrors the real implementation.
  - Add a test: mock tool with `timelimit=3`, run `sleep 15`, assert the job is killed and `fail()` is called.
Red-to-green test: Add test that runs a tool with timelimit: 3 and sleep 15, expect job failure. Should fail initially since timelimit isn't enforced, then pass after implementation.
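The poll-and-kill behaviour this test exercises can be sketched outside Galaxy as follows. This is a simplified stand-in, not the local runner's `__poll_if_needed()`: the function names, the polling interval, and the use of `subprocess.Popen` are all assumptions made for illustration. The returned state string mirrors the proposed `TOOL_TIMELIMIT_REACHED` value.

```python
import datetime
import subprocess
import sys
import time
from typing import List, Optional, Tuple


def check_tool_timelimit(
    runtime: datetime.timedelta, timelimit_seconds: Optional[float]
) -> Optional[Tuple[str, str]]:
    """Return (state, message) if the per-tool limit is exceeded, else None."""
    if not timelimit_seconds or timelimit_seconds <= 0:
        return None  # None, zero, or negative: nothing to enforce
    if runtime > datetime.timedelta(seconds=timelimit_seconds):
        return (
            "tool_timelimit_reached",
            f"Job exceeded tool time limit ({timelimit_seconds}s)",
        )
    return None


def run_with_timelimit(
    command: List[str], timelimit_seconds: Optional[float], poll_interval: float = 0.05
) -> Optional[Tuple[str, str]]:
    """Run `command`, polling until it exits or the limit is exceeded."""
    start = datetime.datetime.now()
    proc = subprocess.Popen(command)
    while proc.poll() is None:
        runtime = datetime.datetime.now() - start
        limit_state = check_tool_timelimit(runtime, timelimit_seconds)
        if limit_state is not None:
            proc.kill()  # terminate the over-limit process
            proc.wait()  # reap it so no zombie is left behind
            return limit_state
        time.sleep(poll_interval)
    return None  # process finished within the limit


if __name__ == "__main__":
    # A 5 second sleep under a 0.2 second limit should be killed early.
    sleeper = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_timelimit(sleeper, 0.2))
```

A unit test along the lines described above would assert that the returned state is set for the long sleep and is `None` for a command that exits immediately.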
The timelimit resource requirement is automatically available to TPV through tool.resource_requirements - no explicit wiring needed. But an integration test confirming TPV can read it would be good.
Files to modify (Commit 3, TPV integration test):
- `test/integration/test_user_defined_tool_job_conf.py`
  - Add a `TOOL_WITH_TIMELIMIT_SPECIFICATION` constant following the pattern of `TOOL_WITH_RESOURCE_SPECIFICATION` (~line 18).
  - Verify TPV receives it (similar to the existing `test_user_defined_applies_resource_requirements` test for `cores_min`).
Note: This commit is optional if TPV doesn't yet have a {timelimit} template variable. The important thing is that timelimit appears in tool.resource_requirements which TPV already iterates.
Files to modify (Commit 4, wire CWL ToolTimeLimit):
- `lib/galaxy/tool_util/cwl/parser.py`
  - Add a `timelimit_requirements()` method on the tool proxy:
    ```python
    def timelimit_requirements(self) -> List:
        return self.hints_or_requirements_of_class("ToolTimeLimit")
    ```
- `lib/galaxy/tool_util/parser/cwl.py`
  - In `parse_requirements()`, extract ToolTimeLimit and add it to the resource_requirements list:
    ```python
    for tl in self.tool_proxy.timelimit_requirements():
        timelimit_value = tl.get("timelimit")
        if timelimit_value is not None:
            resource_requirements.append({"type": "resource", "timelimit": timelimit_value})
    ```
  - Pass these through to `parse_requirements_from_lists(resource_requirements=...)`. The `"timelimit"` key will be found in `cwl_to_galaxy.values()` (since Commit 1 adds it to `ResourceType`) and converted to a `ResourceRequirement` object.
- CWL conformance test expectations
  - `timelimit_basic`: converts from false-green to true-green. The job is actually killed by timeout after 3s. The broad exception handler at `populators.py:3191` still catches it, but now it's the right exception.
  - `timelimit_invalid` (negative): fails at CWL schema validation before Galaxy. No change.
  - `timelimit_zero_unlimited`: zero = no limit. `check_limits()` skips enforcement when `timelimit_seconds <= 0`. Job completes successfully.
  - `timelimit_from_expression`: `$(1+2)` expression. `ResourceRequirement` marks `runtime_required=True`, `get_value()` raises `NotImplementedError`. Acceptable failure mode - expression evaluation is a pre-existing TODO across all resource requirements.
  - `timelimit_expressiontool`: ExpressionTools don't go through the job runner, so timelimit enforcement doesn't apply. Passes as expected.
Red-to-green test: Add unit test that loads a CWL tool with ToolTimeLimit and verifies it appears in parse_requirements() output as a ResourceRequirement with resource_type="timelimit".
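The conversion step that such a unit test would cover can be sketched in isolation. The function below is a hypothetical stand-alone version of the mapping from CWL `ToolTimeLimit` entries to Galaxy-format resource items. It takes a plain list of hint/requirement dicts instead of a tool proxy, so it makes no claims about the real proxy API.

```python
from typing import Any, Dict, List


def timelimit_resource_items(
    hints_and_requirements: List[Dict[str, Any]],
) -> List[Dict[str, Any]]:
    """Map CWL ToolTimeLimit entries to Galaxy-format resource items."""
    items = []
    for entry in hints_and_requirements:
        if entry.get("class") != "ToolTimeLimit":
            continue  # other requirement classes are handled elsewhere
        timelimit_value = entry.get("timelimit")
        if timelimit_value is not None:
            # Numbers and expression strings such as "$(1+2)" are passed
            # through unchanged; evaluating expressions is out of scope here.
            items.append({"type": "resource", "timelimit": timelimit_value})
    return items
```

For example, a list containing a `ResourceRequirement` and a `ToolTimeLimit` with `timelimit: 3` yields a single item, `{"type": "resource", "timelimit": 3}`.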
```
master
  |
  +-- Branch: tool_timelimit (commits 1-3, non-CWL PR)
  |     |-- Commit 1: Add timelimit resource type
  |     |-- Commit 2: Local runner timelimit enforcement
  |     +-- Commit 3: TPV integration test (optional)
  |
  +-- Branch: cwl_tool_state (commit 4, CWL PR, depends on tool_timelimit)
        |-- ... existing CWL work ...
        +-- Commit 4: Wire CWL ToolTimeLimit
```
Commits 1-3 can be branched off independently. Commit 4 depends on commits 1-3.
- Units: Seconds (matches CWL spec). No HH:MM:SS support - that's a global walltime format concern.
- Negative values: CWL spec treats negative as schema validation error (timelimit2.cwl). Galaxy should skip enforcement for values <= 0.
- No `timelimit_max` variant: CWL has a single `timelimit` field. Just `timelimit`.
- Precedence: Stricter of global walltime and per-tool timelimit wins. Both are checked independently in `check_limits()`.
- Zero = no limit: Per CWL spec (timelimit3.cwl conformance test). `check_limits()` skips when `timelimit_seconds <= 0`.
- Runner state: New `TOOL_TIMELIMIT_REACHED` state, distinct from `WALLTIME_REACHED`/`GLOBAL_WALLTIME_REACHED`.
- Expression timelimit: Not implemented (pre-existing TODO for all resource requirement expressions). `NotImplementedError` is an acceptable failure mode.
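The value-handling decisions above reduce to a small classification of the raw `timelimit` value. The function below is a hypothetical summary for illustration; its name and the returned labels are not part of Galaxy or CWL.

```python
from typing import Union


def classify_timelimit(value: Union[int, float, str]) -> str:
    """Classify a raw timelimit value according to the decisions above."""
    if isinstance(value, str):
        return "expression"  # e.g. "$(1+2)"; evaluation is not implemented
    if value < 0:
        return "invalid"  # rejected by CWL schema validation, never enforced
    if value == 0:
        return "unlimited"  # zero means no limit
    return "enforced"  # positive number of seconds
```

Applied to the conformance tools, a limit of 3 is enforced, -1 is invalid, 0 is unlimited, and `$(1+2)` is an expression.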
- Should the TPV integration test (Commit 3) be deferred if TPV doesn't have timelimit template support yet?