Failing test: test_conformance_v1_2_nested_prefixes_arrays
err_msg: "2 validation errors for binding-test.cwl (request model)
min_std_max_min - Extra inputs are not permitted
minimum_seed_length - Extra inputs are not permitted"
CWL conformance tests share job JSON files across multiple tools. bwa-mem-job.json has 4 keys:
reference (File), reads (File[]), min_std_max_min ([1, 2, 3, 4]), minimum_seed_length (3)
But binding-test.cwl only defines inputs: reference, reads, #args.py.
Galaxy's Pydantic request model uses extra="forbid" and rejects them.
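The failure mode can be reproduced with a minimal sketch. This is a toy model, not Galaxy's generated one: `BindingTestRequest` and its loose `dict`/`list` field types are stand-ins for what `create_model_strict()` produces, but the `extra="forbid"` behavior is the same.

```python
from pydantic import BaseModel, ConfigDict, ValidationError

class BindingTestRequest(BaseModel):
    # Mirrors Galaxy's strict request models: undeclared keys are rejected.
    model_config = ConfigDict(extra="forbid")
    reference: dict
    reads: list

# bwa-mem-job.json shape: 2 declared inputs + 2 extra keys
job = {
    "reference": {"class": "File", "path": "chr20.fa"},
    "reads": [{"class": "File", "path": "reads_1.fastq"}],
    "min_std_max_min": [1, 2, 3, 4],
    "minimum_seed_length": 3,
}

try:
    BindingTestRequest(**job)
    errors = []
except ValidationError as e:
    errors = e.errors()

print(len(errors))                              # → 2
print(sorted(err["loc"][0] for err in errors))  # → ['min_std_max_min', 'minimum_seed_length']
```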
The CWL v1.2 spec does NOT explicitly address extra job inputs. It says "Validate the input
object against the inputs schema" but doesn't specify whether extra properties should be
rejected or ignored.
However, cwltool (reference implementation) keeps extra keys and makes them available in JS:
- process.py:_init_job() copies the raw joborder dict → job
- validates with validate_ex(schema, job, strict=False) — strict=False means extra keys produce a warning, not an error (schema_salad/validate.py:438-441)
- job (with extras still present) is passed to Builder.__init__ → self.job
- Builder.do_eval() passes self.job to expression.do_eval() as jobinput
- expression.do_eval() sets {"inputs": jobinput, ...} as the JS root context (cwl_utils/expression.py:292)
So extra job keys survive into $(inputs.extra_key) in JavaScript expressions. cwltool's
--strict flag would reject them, but extras are accepted in conformance runs: although
strict=True is the argparse default, the conformance test runner explicitly passes --non-strict.
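The strict/non-strict split can be sketched in plain Python. This is a toy analogue of cwltool's behavior, not schema_salad's actual API: `validate_job_keys` is a hypothetical helper, but it captures the key point that in non-strict mode extras warn and survive in the job dict.

```python
import warnings

def validate_job_keys(declared, job, strict=False):
    """Toy analogue of cwltool's strict handling: unknown job keys
    either fail validation (strict) or warn and remain (non-strict)."""
    extras = [k for k in job if k not in declared]
    if extras and strict:
        raise ValueError(f"extra keys not permitted: {extras}")
    for k in extras:
        warnings.warn(f"keeping undeclared job key: {k}")
    return job  # extras remain, so $(inputs.extra_key) would resolve in JS

declared = {"reference", "reads"}
job = {"reference": "f", "reads": ["r"], "minimum_seed_length": 3}

validated = validate_job_keys(declared, job)  # non-strict: warns, keeps the key
assert validated["minimum_seed_length"] == 3
```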
The conformance test itself is implicit evidence: nested_prefixes_arrays uses
bwa-mem-job.json (4 keys) with binding-test.cwl (3 inputs). The test expects success.
No explicit "extra inputs" conformance test exists.
For Galaxy's purposes, filtering extras before validation is safe: Galaxy doesn't run cwltool JS expressions at request validation time; it passes the job through to cwltool later, during job execution, where cwltool builds and handles the full job dict independently.
- CwlPopulator.run_cwl_job() loads bwa-mem-job.json with all 4 keys
- stage_inputs() processes ALL keys, even creating an HDCA from [1, 2, 3, 4] for min_std_max_min
- _run_cwl_tool_job() → tool_request_raw() POSTs to /api/jobs with all 4 keys
- JobsService.create() builds RequestToolState from the inputs
- RequestToolState.validate() creates a Pydantic model with only the tool's defined params (reference, reads, #args.py)
- the Pydantic model has extra="forbid" (via create_model_strict() in tool_util_models/parameters.py:2497)
- validation rejects min_std_max_min and minimum_seed_length as extra forbidden inputs
- lib/galaxy/webapps/galaxy/services/jobs.py:241-255 — JobsService.create(), validation entry point
- lib/galaxy/tool_util/parameters/state.py:87-92 — RequestToolState uses create_request_model
- lib/galaxy/tool_util_models/parameters.py:2495-2497 — create_model_strict with extra="forbid"
- lib/galaxy_test/base/populators.py:3053-3085 — _run_cwl_tool_job, submits the raw job dict
- lib/galaxy_test/base/populators.py:3111-3175 — run_cwl_job, loads job JSON and stages inputs
- test/functional/tools/cwl_tools/v1.2/tests/binding-test.cwl — tool with 3 inputs
- test/functional/tools/cwl_tools/v1.2/tests/bwa-mem-job.json — job with 4 keys (1 extra)
lib/galaxy/tool_util/cwl/job_conversion.py already has cwl_job_to_request(), which strips extra keys:

    param_names = {p.name for p in input_models.parameters}
    for key in list(job.keys()):
        if key not in param_names:
            del job[key]

This function isn't used in the conformance test submission path, though.
galactic_job_json() (tool_util/cwl/util.py:418-422) iterates every key in the job dict
with zero schema awareness:
    replace_keys = {}
    for key, value in job.items():
        replace_keys[key] = replacement_item(value)
    job.update(replace_keys)

replacement_item() dispatches purely on the Python type / class field:

- {class: File} → upload → {src: hda, id: ...}
- {class: Directory} → tar + upload → {src: hda, id: ...}
- list → each item uploaded, wrapped in an HDCA → {src: hdca, id: ...}
- scalar (for tools) → passed through unchanged
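That dispatch can be sketched as follows. This is illustrative, not the real implementation: the upload calls are replaced with placeholder ids, and the branches are reduced to their shape.

```python
def replacement_item(value):
    # Dispatch purely on Python type / "class" field; no schema is consulted.
    if isinstance(value, dict) and value.get("class") == "File":
        return {"src": "hda", "id": "<uploaded-file>"}
    if isinstance(value, dict) and value.get("class") == "Directory":
        return {"src": "hda", "id": "<uploaded-tar>"}
    if isinstance(value, list):
        # Each item would be uploaded, then wrapped in an HDCA.
        return {"src": "hdca", "id": "<collection>"}
    return value  # scalars pass through unchanged

job = {"min_std_max_min": [1, 2, 3, 4], "minimum_seed_length": 3}
staged = {k: replacement_item(v) for k, v in job.items()}

# [1, 2, 3, 4] becomes an HDCA reference even though the schema calls it int[]
assert staged["min_std_max_min"] == {"src": "hdca", "id": "<collection>"}
assert staged["minimum_seed_length"] == 3
```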
No schema is used client-side at any point — not for staging, not for submission. All the CWL input schema parsing and parameter model generation happens server-side via the tool parameter models (which already have good test coverage for CWL types).
Filter extra keys server-side in JobsService.create(), after loading the tool but
before request validation. The tool's parameter models are already available at this point
and correctly handle all CWL schema complexity. The JobRequest Pydantic model accepts
inputs: dict[str, Any], so extras pass through FastAPI fine — rejection happens at
RequestToolState.validate() inside create().
    # jobs.py:create(), after line 247 (inputs = job_request.inputs)
    if inputs and tool.tool_type in ("cwl", "galactic_cwl"):
        param_names = {p.name for p in tool.parameters}
        inputs = {k: v for k, v in inputs.items() if k in param_names}

This reuses the server's already-parsed parameter models — no CWL schema parsing needed. The CWL job runner builds its own job dict independently from the tool source, so filtering at the API boundary doesn't lose anything.
stage_inputs() still blindly uploads all job keys (e.g. creating an HDCA from [1,2,3,4]
for min_std_max_min). This is harmless but wasteful. Fixing it would require either:
- Passing tool parameter info to the client (more invasive)
- Parsing the CWL file client-side (fragile: list vs dict forms,
  # prefixes, nested types, $import/$mixin, etc.)
Not worth it for now.
- Do any other conformance tests hit the same issue? Likely yes: any test sharing a job JSON across tools with different input sets.