You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Issue 21573: Conversions triggered via display application
Author: mvdbeek (Marius van den Beek)
Created: 2026-01-13
State: OPEN
Labels: kind/bug
Description
From @jennaj
Hi @dannon -- would you be able to help me with this? The users datasets are being "converted" hundreds of times. I've not seen this before. They are not very large but can clutter up the storage view and is confusing them. This looks like a bug but I am not sure how to ticket it. They stated that they are reviewing data at UCSC. Maybe we are recreating the index for them at each request, instead of reusing it? (guess!) Thank you!
Issue 21573: Conversions Triggered via Display Applications - Code Research
Issue Summary
Datasets are being converted hundreds of times when users view data at UCSC via display applications. Conversions should be reused, not recreated on each request. This clutters storage and confuses users.
Relevant Code Paths
1. Display Application Flow
When a user clicks "display at UCSC", the following path is executed:
API Endpoint (lib/galaxy/webapps/galaxy/api/display_applications.py:52-71)
POST /api/display_applications/create_link calls DisplayApplicationsManager.create_link()
The issue: _get_dataset_like_object() returns None if no converted dataset exists yet (even if one is being created), so is_preparing() checks if None has a state - which it doesn't.
Root Cause: The flush_job=False parameter in convert_dataset() method means the ImplicitlyConvertedDatasetAssociation record isn't committed to the database before the method returns. Concurrent requests checking get_converted_files_by_type() won't see the pending conversion, triggering duplicate conversions.
Assessment: This is a regression introduced in April 2024 (release 24.0).
Timeline of Key Changes
1. April 21, 2024 - The Regression Introduced
Commit: 1b1094be215 by mvdbeek
PR: #18036 - [24.0] Fix History Dataset Association creation so that hid is always set
Release: 24.0
This commit:
Added flush_job=False to converter.execute() call in convert_dataset()
Removed session.commit() from attach_implicitly_converted_dataset()
Moved ICDA creation into convert_dataset() (from parameters.py)
The purpose was to fix HDAs missing hid values, but it inadvertently removed the database commit that made conversions visible to concurrent requests.
Before this commit: Display application code in parameters.py explicitly committed:
new_data=next(iter(data.datatype.convert_dataset(...).values()))
new_data.hid=data.hidnew_data.name=data.nametrans.sa_session.add(new_data)
assoc=trans.app.model.ImplicitlyConvertedDatasetAssociation(...)
trans.sa_session.add(assoc)
withtransaction(trans.sa_session):
trans.sa_session.commit() # <-- This was removed
The explicit commit ensured the ICDA was visible to concurrent requests immediately.
Conclusion
This is a regression from Galaxy 24.0 (April 2024). The fix that prevented missing HIDs inadvertently created a race condition where concurrent display application requests don't see each other's pending conversions because:
flush_job=False delays job record creation
attach_implicitly_converted_dataset() no longer commits
Session changes are only visible after eventual auto-flush or commit
Potential fixes:
Add explicit commit/flush after attach_implicitly_converted_dataset()
Pass use_cached_job=True to convert_dataset() from display applications
Check for pending (unflushed) conversions in get_converted_files_by_type()
Add locking/mutex around conversion check-and-start logic
Datasets converted hundreds of times when viewing via display applications (e.g., UCSC browser). Conversions should be reused, not recreated each request.
1. Severity: MEDIUM
No data loss - User data remains intact; conversions create additional copies but don't corrupt or delete source data.
No security issue - Not a vulnerability or authentication/authorization bypass.
No crash/hang - Galaxy remains operational; issue manifests as excessive resource consumption.
Functional breakage - Yes, but partial:
Conversion reuse mechanism is broken (functional issue)
Galaxy still works, just inefficiently
User workflow not blocked, but cluttered
Classification: Resource waste + UX degradation, not critical failure.
2. Blast Radius: LIMITED
Affected users:
Users of display applications (UCSC browser, IGV, etc.)
Particularly those making multiple requests to view same dataset
NOT affected:
Tool execution / job running
Data upload/download
History management (outside display apps)
API consumers not using display applications
Users who don't use external genome browsers
Configurations affected:
All Galaxy instances with display applications enabled
Primarily usegalaxy.org and similar production instances
Dev/test instances less impacted (lower traffic, less concurrency)
Estimate: Maybe 5-15% of user sessions involve display applications, and of those, only subset trigger concurrent conversion requests.
3. Workaround Existence: YES, but painful
Admin workaround:
Periodically clean up duplicate ICDA records
SQL: Identify and remove duplicate conversions of same parent/type
User workaround:
Wait for conversion to complete before clicking display link again
Delete duplicate converted datasets manually from history
Use storage manager to clean up clutter
Workaround pain level: MODERATE
Users confused by clutter (see support request in issue)
Storage usage inflated (quota impact)
Manual cleanup tedious but possible
4. Regression Status: CONFIRMED REGRESSION
Regression introduced: Galaxy 24.0 (April 2024)
Source: PR #18036 - "Fix History Dataset Association creation so that hid is always set"
Mechanism: PR #18036 changed session flush behavior. ICDA records added to session but not immediately flushed. Concurrent requests don't see pending conversions in get_converted_files_by_type() because query only sees committed records.
Age of regression: ~9 months (April 2024 to January 2025)
Issue 21573 Fix Plan: Duplicate Conversions via Display Applications
Problem Summary
Datasets are being converted hundreds of times when users view data at UCSC via display applications. This is a race condition where multiple concurrent requests check for existing conversions, find none (because the ICDA record hasn't been committed yet), and all start new conversions.
Root Cause Analysis
The Race Condition Timeline
Request A arrives, calls get_converted_files_by_type() - finds no conversion
Request A calls convert_dataset() which:
Creates job with flush_job=False (line 863 in data.py)
Creates ICDA record via attach_implicitly_converted_dataset() - NO COMMIT YET
Calls job_manager.enqueue() - commit happens here eventually
Request B arrives (before Request A's commit), calls get_converted_files_by_type() - finds no conversion
Request B starts another conversion...
Repeat for hundreds of concurrent requests
The Breaking Change (PR #18036, Galaxy 24.0)
Before PR #18036 (attach_implicitly_converted_dataset in model/init.py):
# Current code:job, converted_datasets, *_=converter.execute(
trans,
incoming=params,
set_output_hid=visible,
history=history,
flush_job=False,
completed_job=completed_job,
)
# We should only have a single converted output, but let's be defensive heren_converted_datasets=len(converted_datasets)
forconverted_datasetinconverted_datasets.values():
ifconverted_dataset.extension=="auto"andn_converted_datasets==1:
converted_dataset.extension=target_typeoriginal_dataset.attach_implicitly_converted_dataset(trans.sa_session, converted_dataset, target_type)
trans.app.job_manager.enqueue(job, tool=converter)
# Proposed change:job, converted_datasets, *_=converter.execute(
trans,
incoming=params,
set_output_hid=visible,
history=history,
flush_job=False,
completed_job=completed_job,
)
# We should only have a single converted output, but let's be defensive heren_converted_datasets=len(converted_datasets)
forconverted_datasetinconverted_datasets.values():
ifconverted_dataset.extension=="auto"andn_converted_datasets==1:
converted_dataset.extension=target_typeoriginal_dataset.attach_implicitly_converted_dataset(trans.sa_session, converted_dataset, target_type)
# Flush ICDA record immediately to prevent duplicate conversions from concurrent requeststrans.sa_session.flush()
trans.app.job_manager.enqueue(job, tool=converter)
Pros:
Simple, targeted fix
Low risk - only adds a flush, doesn't change model or schema
Restores intended behavior without full commit
Cons:
Doesn't prevent true concurrent execution (microsecond window still exists)
Requires the flush to complete before enqueue
Approach 2: Database Unique Constraint
Rationale: Database-level enforcement prevents duplicates even with concurrent transactions.
Changes:
File: lib/galaxy/model/__init__.py (after line 6869)
classImplicitlyConvertedDatasetAssociation(Base, Serializable):
__tablename__="implicitly_converted_dataset_association"__table_args__= (
# Prevent duplicate conversions for same parent/type combinationUniqueConstraint('hda_parent_id', 'type', name='uq_icda_hda_parent_type'),
UniqueConstraint('ldda_parent_id', 'type', name='uq_icda_ldda_parent_type'),
)
File: lib/galaxy/model/migrations/alembic/versions_gxy/XXX_add_icda_unique_constraint.py (new migration)
"""Add unique constraint to implicitly_converted_dataset_associationRevision ID: XXX"""fromalembicimportopdefupgrade():
# First, delete duplicate records (keep oldest)op.execute(""" DELETE FROM implicitly_converted_dataset_association WHERE id NOT IN ( SELECT MIN(id) FROM implicitly_converted_dataset_association GROUP BY hda_parent_id, type ) AND hda_parent_id IS NOT NULL """)
op.execute(""" DELETE FROM implicitly_converted_dataset_association WHERE id NOT IN ( SELECT MIN(id) FROM implicitly_converted_dataset_association GROUP BY ldda_parent_id, type ) AND ldda_parent_id IS NOT NULL """)
op.create_unique_constraint('uq_icda_hda_parent_type',
'implicitly_converted_dataset_association',
['hda_parent_id', 'type'])
op.create_unique_constraint('uq_icda_ldda_parent_type',
'implicitly_converted_dataset_association',
['ldda_parent_id', 'type'])
defdowngrade():
op.drop_constraint('uq_icda_hda_parent_type', 'implicitly_converted_dataset_association')
op.drop_constraint('uq_icda_ldda_parent_type', 'implicitly_converted_dataset_association')
defconvert_dataset(
self,
trans,
original_dataset: DatasetHasHidProtocol,
target_type: str,
...
):
# Generate lock key for this specific conversionlock_key=f"conversion:{original_dataset.id}:{target_type}"# Try to acquire lock (non-blocking)lock=trans.app.model.context.get_conversion_lock(lock_key)
ifnotlock.acquire(blocking=False):
# Another request is already starting this conversionlog.debug(f"Conversion to {target_type} for {original_dataset.id} already in progress")
returnNonetry:
# Double-check after acquiring lockexisting=original_dataset.get_converted_files_by_type(target_type)
ifexisting:
returnexisting# Already converted# Proceed with conversion...converter=trans.app.datatypes_registry.get_converter_by_target_type(...)
# ... rest of conversion logicfinally:
lock.release()
Pros:
Serializes conversion requests properly
No database schema changes
Cons:
Requires lock infrastructure (Redis or similar)
Complex implementation
Lock management overhead
Potential deadlock risks
Approach 4: Enable Job Caching for Display Applications
Rationale: PR #21021 added job caching for implicit conversions, but it's not enabled by default for display applications.
Already has use_cached_job: bool = False parameter, but it's not being used from display applications.
Analysis of why this alone won't fix the issue:
The job caching in PR #21021 uses completed_jobs() which queries for completed jobs with matching parameters. However:
It checks for completed jobs, not in-progress ones
The race occurs BEFORE any job is committed to the database
Job caching helps avoid re-running a conversion that already succeeded, not preventing concurrent starts
Verdict: Job caching is useful but doesn't fix this race condition. Multiple requests would still all fail to find a cached job and start new ones.
Recommended Solution
Approach 1 (Re-add Flush) is recommended for these reasons:
Simplest fix - one line change
Low risk - doesn't change schema or require migrations
Addresses root cause - makes ICDA visible before enqueue
Easy to verify - behavior matches pre-24.0
Secondary recommendation: Also add Approach 2 (unique constraint) as a longer-term safeguard, but as a separate PR/release to avoid coupling fixes.
Implementation Plan
Phase 1: Immediate Fix (Approach 1)
Add flush in convert_dataset()
File: lib/galaxy/datatypes/data.py, after line 871
forconverted_datasetinconverted_datasets.values():
ifconverted_dataset.extension=="auto"andn_converted_datasets==1:
converted_dataset.extension=target_typeoriginal_dataset.attach_implicitly_converted_dataset(trans.sa_session, converted_dataset, target_type)
# Flush ICDA record to prevent duplicate conversions from concurrent requeststrans.sa_session.flush()
trans.app.job_manager.enqueue(job, tool=converter)
Phase 2: Future Enhancement (Approach 2)
Create migration to add unique constraint (separate PR)
Handle IntegrityError in attach_implicitly_converted_dataset
Test Strategy
Unit Test: Race Condition Simulation
Create a test that simulates concurrent conversion requests:
File: lib/galaxy_test/unit/data/test_conversion_race.py (new file)
importthreadingfromunittest.mockimportMagicMock, patchdeftest_concurrent_conversion_requests_dont_duplicate():
""" Verify that concurrent requests to convert the same dataset don't create duplicate ICDA records. """mock_session=MagicMock()
mock_dataset=MagicMock()
mock_dataset.id=1mock_dataset.get_converted_files_by_type.return_value=Noneconversions_started= []
lock=threading.Lock()
defmock_convert(*args, **kwargs):
withlock:
conversions_started.append(1)
returnMagicMock()
# Start multiple threads attempting conversionthreads= []
for_inrange(10):
t=threading.Thread(target=mock_convert)
threads.append(t)
t.start()
fortinthreads:
t.join()
# Should only have started one conversion# (This would fail with current code, pass with fix)
Integration Test: Display Application Conversion
File: lib/galaxy_test/api/test_display_applications.py (new or extend existing)
fromconcurrent.futuresimportThreadPoolExecutor, as_completedclassDisplayApplicationsConversionTestCase(ApiTestCase):
deftest_concurrent_display_app_conversion(self):
""" Test that concurrent requests to display application don't create duplicate conversions. """# Upload a dataset that requires conversion (e.g., BAM)history_id=self.dataset_populator.new_history()
hda=self.dataset_populator.new_dataset(
history_id,
content=open("test-data/1.bam", "rb").read(),
file_type="bam"
)
dataset_id=hda["id"]
# Make concurrent requests to UCSC display appdefmake_request():
returnself._post(
"display_applications/create_link",
data={
"app_name": "ucsc",
"link_name": "main",
"dataset_id": dataset_id,
}
)
# 20 concurrent requestswithThreadPoolExecutor(max_workers=20) asexecutor:
futures= [executor.submit(make_request) for_inrange(20)]
results= [f.result() forfinas_completed(futures)]
# Verify only one conversion job was created# Query ICDA table for this dataseticda_count=self._get_icda_count_for_dataset(dataset_id)
asserticda_count==1, f"Expected 1 ICDA, got {icda_count}"
Manual Testing
Set up Galaxy with UCSC display application
Upload a BAM file
Open UCSC link in multiple browser tabs simultaneously
Verify in database: SELECT COUNT(*) FROM implicitly_converted_dataset_association WHERE hda_parent_id = <id>
Should be exactly 1
Risks and Considerations
Performance Impact
Flush overhead: One additional flush per conversion is minimal
Lock contention: None with Approach 1
Backward Compatibility
No API changes
No schema changes (for Approach 1)
Behavior matches pre-24.0 Galaxy
Edge Cases
Session already flushed: If session auto-flushes before our explicit flush, no harm
Transaction rollback: If later code rollbacks, both ICDA and job are undone (correct behavior)
Database disconnection: Standard error handling applies
Migration Concerns (if Approach 2 is added later)
Existing duplicate ICDA records need cleanup before constraint can be added
Large tables may take time to add constraint
Unresolved Questions
Should we also enable use_cached_job=True in display_applications/parameters.py for additional protection?
For Approach 2, what's the strategy for cleaning up existing duplicate ICDAs - delete all but oldest, or all but newest?
Should we add logging to track when a duplicate conversion was prevented?
Is there a concern about the flush happening before enqueue affecting HID assignment (the original reason for PR #18036)?
Should fix be backported to 24.0, 24.1, 25.0 branches?
Issue 21573 Summary: Duplicate Conversions via Display Applications
Top-Line Summary
Display applications (e.g., UCSC browser) trigger duplicate dataset conversions due to a race condition introduced in Galaxy 24.0 (PR #18036). The fix that prevented missing HIDs inadvertently removed an immediate session commit from attach_implicitly_converted_dataset(), causing concurrent requests to not see pending conversions and each start new ones. The fix is simple: add trans.sa_session.flush() after ICDA creation in lib/galaxy/datatypes/data.py:871.
Importance Assessment
Dimension
Assessment
Severity
MEDIUM - Resource waste and UX degradation, no data loss or security issues
Blast Radius
LIMITED - Only affects display application users (5-15% of sessions)
Workaround
EXISTS but painful - Manual cleanup possible, storage quota impact
Regression
CONFIRMED - Introduced in Galaxy 24.0 (April 2024), PR #18036
Priority
HIGH for next release - Simple fix, low risk, affects usegalaxy.org
Questions for Discussion
Should fix be backported to 24.x/25.0 branches?
Does flushing before job_manager.enqueue() affect HID assignment (original concern in PR #18036)?
Add unique constraint on ICDA table as additional safeguard (separate PR)?
Enable use_cached_job=True for display applications too?
Cleanup strategy for existing duplicate conversions on production instances?
Effort Estimate
Fix complexity: LOW (one-line change)
Add trans.sa_session.flush() after ICDA creation loop in convert_dataset()
File: lib/galaxy/datatypes/data.py, after line 871
Testing complexity: MEDIUM
Need concurrency test to verify fix
ThreadPoolExecutor test with display_applications API
Manual testing with UCSC browser
Reproduction difficulty: MODERATE
Requires concurrent requests to display application
More likely on production instances with real traffic
PR #18036 (April 2024) by @mvdbeek removed session.commit() from attach_implicitly_converted_dataset() to fix HID assignment issues. This created a race window where ICDA records aren't visible to concurrent requests.