Triage documents for Galaxy Issue #21604

Issue #21604: How can I publish a dataset?

Author: Simon Bray (@simonbray)
Created: 2026-01-16
URL: galaxyproject/galaxy#21604
Version: 25.1 (EU)

Problem Description

The user has a published history containing one already-published dataset. After uploading a second dataset to that history, they cannot publish it to make it publicly available.

Current Behavior

  • Published history shows mixed state: 1 published dataset, 1 unpublished dataset
  • No UI option to publish individual dataset
  • Unpublishing and republishing the whole history leaves the situation unchanged
  • BioBlend's gi.datasets.publish_dataset() has no effect

Dataset Permission Differences

Published dataset:

{'manage': ['my_user_id'], 'access': []}

Unpublished dataset:

{'manage': ['my_user_id'], 'access': ['my_user_id']}
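
For reference, these permission dictionaries can be pulled programmatically from the detailed dataset view. A minimal BioBlend sketch (the instance URL, API key, and dataset IDs are placeholders; the presence of a "permissions" key in the detailed view is an assumption based on recent Galaxy releases):

from bioblend.galaxy import GalaxyInstance

gi = GalaxyInstance("https://usegalaxy.eu", key="<api-key>")  # placeholders

for label, dataset_id in [("published", "<published_dataset_id>"),
                          ("unpublished", "<unpublished_dataset_id>")]:
    details = gi.datasets.show_dataset(dataset_id)
    # Detailed dataset views include a "permissions" dict with "access" and
    # "manage" role lists (assumption: recent Galaxy releases).
    print(label, details.get("permissions"))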

API Error

Attempting to update permissions via BioBlend:

gi.datasets.update_permissions('dataset_id', manage_ids=['my_user_id'], access_ids=[])

Returns:

ConnectionError: Unexpected HTTP status code: 400: {"err_msg":"Attempting to share a non-shareable dataset.","err_code":400008}

Key Questions

  1. Why does a dataset become "non-shareable"?
  2. How should datasets in published histories handle new uploads?
  3. Is this a regression or intended behavior?

Issue #21604 Code Research

Executive Summary

The issue involves datasets uploaded to a published history becoming non-shareable, preventing them from being published via BioBlend's publish_dataset() method. The API returns error code 400008 ("Attempting to share a non-shareable dataset") when attempting to update permissions.

Relevant Code Locations

Error Source

  • lib/galaxy/model/__init__.py:325 - CANNOT_SHARE_PRIVATE_DATASET_MESSAGE constant
  • lib/galaxy/model/__init__.py:4520-4522 - Dataset.ensure_shareable() method that raises the error
  • lib/galaxy/model/__init__.py:4510-4518 - Dataset.shareable property that checks object store privacy
  • lib/galaxy/model/security.py:875-898 - SecurityAgent.set_all_dataset_permissions() where error is returned (line 898)

Permission Update Flow

  • lib/galaxy/webapps/galaxy/api/datasets.py:234-246 - API endpoint for updating dataset permissions
  • lib/galaxy/managers/datasets.py:569-623 - DatasetManager.update_permissions() and _set_permissions()
  • lib/galaxy/managers/hdas.py:338-356 - HistoryDatasetAssociationManager._set_permissions() implementation

History Publishing Flow

  • lib/galaxy/webapps/galaxy/services/sharable.py:94-101 - ShareableService.publish() calls make_members_public()
  • lib/galaxy/managers/histories.py - HistoryManager.make_members_public() makes existing datasets public
  • lib/galaxy/model/security.py:1193-1204 - SecurityAgent.make_dataset_public() removes DATASET_ACCESS permissions
  • lib/galaxy/model/security.py:865-873 - SecurityAgent.history_get_default_permissions() retrieves history default permissions

Dataset Permission Inheritance

  • lib/galaxy/tools/actions/upload_common.py - Datasets inherit permissions from history when uploaded
  • lib/galaxy/job_execution/output_collect.py - Output datasets use history_get_default_permissions()

Shareable Property Check

  • lib/galaxy/objectstore/__init__.py:354-355 - ObjectStore.is_private() abstract method
  • lib/galaxy/objectstore/__init__.py:1739-1743 - HierarchicalObjectStore._is_private() returns self.private

How Dataset Publishing Works

  1. Normal Dataset: When a dataset is created, it inherits permissions from the history via history_get_default_permissions()
  2. Shareable Check: A dataset is shareable only if NOT stored in a private object store:
    • Dataset.shareable property checks: not object_store.is_private(self) (line 4518)
  3. Permission Update: When updating permissions, set_all_dataset_permissions() checks if dataset is shareable (line 891)
  4. History Publishing: When a history is published:
    • make_members_public() is called on all activatable datasets
    • For each dataset: if not already public, calls make_dataset_public()
    • make_dataset_public() calls dataset.ensure_shareable() first (line 1196), which raises an exception if the dataset is in a private object store (see the sketch below)
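
A condensed paraphrase of that publish path (illustrative only, not the actual Galaxy implementation; see the file references above for the real code):

# Paraphrased sketch of the history publish flow described above.
def publish_history(history, security_agent):
    history.published = True
    for hda in history.activatable_datasets:
        if not security_agent.dataset_is_public(hda.dataset):
            make_dataset_public(hda.dataset, security_agent)

def make_dataset_public(dataset, security_agent):
    # Raises "Attempting to share a non-shareable dataset" (error 400008)
    # when the dataset sits in a private object store.
    dataset.ensure_shareable()
    # ...then the DATASET_ACCESS permissions are removed so anyone can read it.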

What Makes a Dataset "Non-Shareable"

From lib/galaxy/model/__init__.py:4510-4518:

@property
def shareable(self) -> bool:
    """Return True if placed into an objectstore not labeled as ``private``."""
    if self.external_filename:
        return True
    else:
        object_store = self._assert_object_store_set()
        return not object_store.is_private(self)

A dataset is non-shareable if:

  • It doesn't have an external_filename AND
  • Its object store is marked as private (object_store.is_private() returns True)

Error Code 400008 Mapping

Error code 400008 is associated with:

  • RequestParameterInvalidException
  • ActionInputError

Thrown when set_all_dataset_permissions() detects a non-shareable dataset attempting to be shared (line 898 in security.py).

Root Cause Theories

Theory 1: Private Object Store Assignment (Most Probable)

When a dataset is uploaded to a published history, the dataset is assigned to a private object store (configured with private=True). This causes the shareable property to return False, preventing any permission updates. The history publishing code tries to make datasets public but fails when the dataset is in a private store.

Why this is probable: The EU instance likely has a user object store or default object store configured as private. When new datasets are uploaded, they go to this private store regardless of history publication status.

Theory 2: Timing/Ordering Issue

  1. User creates/publishes history with 1 dataset
  2. User uploads new dataset - it inherits default permissions from history
  3. New dataset placed in private object store despite being in a published history
  4. New dataset's permissions remain tied to private store
  5. BioBlend's publish_dataset() fails because it can't update permissions on a non-shareable dataset

Theory 3: UI/UX Gap for Private Store Datasets

There's no UI mechanism to publish a dataset stored in a private object store - this is by design (private stores are meant to be private). However, the user expectation is that uploading to a published history should make the dataset public, which conflicts with the private store behavior.

Key Code Flow for Publishing a Dataset

POST /api/datasets/{dataset_id}/permissions
  -> DatasetManager.update_permissions()
    -> HDAManager._set_permissions()
      -> security_agent.set_all_dataset_permissions()
        -> Checks: if not new and not dataset.shareable
          -> Returns CANNOT_SHARE_PRIVATE_DATASET_MESSAGE

When dataset.shareable is False due to private object store, the error is returned as HTTP 400 with error code 400008.
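
The guard itself is small; a paraphrase of the check in set_all_dataset_permissions() (signature approximated, not an exact quote of security.py):

# Paraphrase of the guard around line 891 of lib/galaxy/model/security.py.
def set_all_dataset_permissions(self, dataset, permissions, new=False, **kwargs):
    if not new and not dataset.shareable:
        # CANNOT_SHARE_PRIVATE_DATASET_MESSAGE is defined in lib/galaxy/model/__init__.py:325;
        # the message bubbles up through the managers/API as HTTP 400, err_code 400008.
        return CANNOT_SHARE_PRIVATE_DATASET_MESSAGE
    # ...otherwise the requested permissions are applied as usual.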

Issue #21604 Importance Assessment

Issue: Cannot publish datasets uploaded to an already-published history
Reporter: Simon Bray (@simonbray)
Version: Galaxy 25.1 (EU instance)

1. Severity: MEDIUM

Rationale:

  • Functional breakage - user workflow blocked but no data loss/security issue
  • Operation fails with clear error message (not crash/hang)
  • Workaround exists (though painful, see below)
  • Not a regression - appears to be working as designed for private object stores

Key distinction: This is a design gap, not a bug. The system correctly prevents sharing datasets from private storage. The issue is the UX when uploading to a published history - the dataset is quietly stored privately with no warning.

2. Blast Radius: SPECIFIC CONFIGURATION

Affected users:

  • Users on instances with private object stores (EU, institutional instances)
  • Users with new_user_dataset_access_role_default_private = True
  • Users who upload to already-published histories

NOT affected:

  • UseGalaxy.org (public storage, public defaults)
  • Instances without private object stores
  • Users who publish histories after uploading all datasets

Estimate: ~20-30% of Galaxy deployments use private object stores (EU is a major public instance). Within those, only the subset of users who upload to already-published histories is affected.

3. Workaround Existence: PAINFUL

Available workarounds:

  1. Delete dataset from published history, create new history, upload, then copy dataset to published history (if permitted)
  2. Change default object store to non-private before upload (if user has option)
  3. Ask admin to move dataset to non-private storage (requires admin intervention)
  4. Re-upload to new non-published history, make public, then copy

Why painful:

  • No clear UI guidance - user must understand object store architecture
  • Error message doesn't explain cause or solution
  • Multiple steps required
  • May require admin help

4. Regression Status: NOT A REGRESSION

Evidence:

  • Private object store feature introduced in Galaxy 23.1 (PR #14073)
  • Test TestPrivatePreventsSharingObjectStoreIntegration explicitly tests and expects this behavior
  • Related issue #21536 confirmed "not a regression" - design gap since 23.1
  • User statement "I seem to remember it was possible in the past" - likely before private object stores existed or before EU adopted them

History:

  • Pre-23.1: All storage was shareable by default
  • 23.1+: Private object stores introduced with explicit private=True flag
  • Current: System correctly enforces privacy but creates UX gap

5. User Impact Signals

Issue reactions: None
Duplicate reports: None directly
Related issues:

  • #21536 - Same root cause (upload fails with private storage)
  • #19608 - Related privacy handling gap
  • #13001 - Historical sharing issues (2021, closed)

Support requests: Low - the issue was filed by an experienced developer, suggesting this is not yet a widespread complaint.

Note: Lack of reactions/duplicates may indicate:

  1. Low occurrence (specific configuration)
  2. Users finding workarounds without reporting
  3. Users attributing to "expected behavior" and not reporting

6. Related Context

This issue is part of a broader class of private object store UX gaps:

| Issue | Description | Status |
| --- | --- | --- |
| #21604 | Can't publish uploaded dataset | Open |
| #21536 | Upload fails with private storage | Open |
| #19608 | Making history private doesn't work | Open |

All stem from the same gap: private storage semantics are not integrated into the UX flows.

7. Recommendation: BACKLOG (with UX enhancement flag)

NOT hotfix because:

  • No data loss/security issue
  • Not a regression
  • Working as designed (enforcing privacy)
  • Workarounds exist

NOT wontfix because:

  • Legitimate user expectation that upload to published history should "just work"
  • UX is confusing - no warning, no guidance
  • EU is major public instance

Recommended priority: Medium-backlog

Suggested approach (if fixing):

  1. Short-term (UX): When uploading to a published history and only private storage is available, warn the user that the dataset will not be shareable.

  2. Medium-term: Consider allowing object store selection at upload time if non-private options exist.

  3. Long-term: Holistic review of private storage UX as part of #18128 (Bring Your Own Storage initiative).

Unresolved Questions

  1. Should uploads to published histories prefer non-private storage if available?
  2. Should system warn at upload time vs fail silently at share time?
  3. Is EU object store config intentionally private? If so, is this expected behavior for their users?
  4. Should we consolidate #21604, #21536, #19608 into a single private-storage UX epic?

Fix Plan: Issue #21604 - Publishing Datasets Uploaded to Already-Published History

Executive Summary

When a user uploads a dataset to an already-published history, the dataset ends up in a private object store, making it impossible to publish or share. The Dataset.shareable property returns False for datasets in private stores, causing make_dataset_public() to fail with error 400008.

Root Cause Analysis

The issue occurs due to a disconnect between:

  1. History state (published) - indicates sharing intent
  2. Object store selection - determined by job handler based on requires_shareable_storage()
  3. Default permissions - inherited from history

Flow when uploading to published history:

  1. Dataset created with history's default permissions (likely public for published history)
  2. Upload job queued with the dataset
  3. Job handler calls _set_object_store_ids() which:
    • Calls job.requires_shareable_storage() to check if dataset needs shareable storage
    • requires_shareable_storage() checks if dataset is "private to a user" via dataset_is_private_to_a_user()
    • For public datasets (no access role), requires_shareable_storage() returns True -> shareable storage is needed
  4. If user/history has preferred_object_store_id pointing to a private store, ObjectStorePopulator.set_object_store_id() raises ObjectCreationProblemSharingDisabled
  5. If NO preferred store is set and only private stores are available (EU instance config), dataset goes to private store
  6. Later attempts to make dataset public fail because dataset.shareable == False

The key disconnect: When there's only ONE object store and it's private, the require_shareable check happens in ObjectStorePopulator.set_dataset_object_store_id() (lines 2112-2113) but only if object store selection is active. The basic path doesn't have the same guard.
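
A self-contained toy model of that disconnect (the class and function names here are illustrative stand-ins, not Galaxy's real classes):

from dataclasses import dataclass

@dataclass
class ToyStore:
    id: str
    private: bool

def choose_store(stores, preferred_id, require_shareable):
    """Mimic the selection behavior described above: the shareability guard
    only fires when a store is explicitly selected; the default path does not
    check at all."""
    if preferred_id is not None:
        store = next(s for s in stores if s.id == preferred_id)
        if require_shareable and store.private:
            raise RuntimeError("sharing disabled for this storage")  # guarded path
        return store
    return stores[0]  # default path: no shareability check

# EU-like configuration: the only backend is private.
stores = [ToyStore("eu_private", private=True)]
picked = choose_store(stores, preferred_id=None, require_shareable=True)
print(picked.private)  # True -> dataset is non-shareable; failure only surfaces at share time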

Recommended Approach: Proactive Validation with Fail-Fast

Strategy: Prevent upload to published histories when the only available/selected object store is private. This is cleaner than attempting recovery or copy-on-share approaches.

Rationale:

  1. Fail-fast principle - Users get immediate feedback rather than discovering the problem later
  2. Data integrity - No half-states where datasets exist but can't be shared
  3. Clear UX - Users understand why upload failed and can take corrective action
  4. Minimal complexity - Doesn't require dataset migration or complex recovery logic
  5. Aligns with existing behavior - Similar to how ObjectCreationProblemSharingDisabled works for jobs

Implementation Steps

Step 1: Add Helper Method to Check Shareability Requirement

File: lib/galaxy/model/__init__.py

Add method to History class to determine if uploaded datasets need shareable storage:

# In History class (around line 3600)
def requires_shareable_storage_for_new_datasets(self) -> bool:
    """Return True if new datasets added to this history need shareable storage.

    This is True if the history is published or importable, indicating
    datasets will need to be accessible to others.
    """
    return self.published or self.importable

Step 2: Add Object Store Shareability Check

File: lib/galaxy/objectstore/__init__.py

Add method to ObjectStore base class to check if a given object store ID (or default) is private:

# In ObjectStore base class (around line 380)
def is_store_private(self, object_store_id: Optional[str] = None) -> bool:
    """Check if a specific object store (or default) is private.

    Args:
        object_store_id: The store ID to check, or None for the default store.

    Returns:
        True if the store is private, False otherwise.
    """
    return False  # Default implementation for non-distributed stores

# In DistributedObjectStore class (around line 1630)
def is_store_private(self, object_store_id: Optional[str] = None) -> bool:
    if object_store_id is None:
        # Check if all weighted backends are private
        if not self.weighted_backend_ids:
            return False
        for backend_id in set(self.weighted_backend_ids):
            backend = self.backends.get(backend_id)
            if backend and not backend.private:
                return False
        return True
    else:
        backend = self.backends.get(object_store_id)
        return backend.private if backend else False

Step 3: Add Validation in Upload Action

File: lib/galaxy/tools/actions/upload.py

Add validation before creating upload job:

# In BaseUploadToolAction.execute() around line 58, after getting history
# (needs: from galaxy.exceptions import RequestParameterInvalidException)
def execute(self, tool, trans, ...):
    trans.check_user_activation()
    incoming = incoming or {}

    # Get the target history
    if history is None:
        history = trans.history

    # Check if uploading to a published/importable history with private storage
    if history.requires_shareable_storage_for_new_datasets():
        # Determine which object store would be used
        effective_object_store_id = self._get_effective_object_store_id(
            trans, history, preferred_object_store_id
        )
        if trans.app.object_store.is_store_private(effective_object_store_id):
            raise RequestParameterInvalidException(
                "Cannot upload to a published history when using private storage. "
                "Either unpublish the history, or select a shareable storage location."
            )

    # ... rest of existing code
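
The _get_effective_object_store_id() helper referenced above does not exist in Galaxy; a minimal sketch, assuming an explicit selection wins and that History and User expose a preferred_object_store_id attribute (they do on recent Galaxy models, but treat this as an assumption):

# Hypothetical helper (not existing Galaxy code): resolve which object store
# id an upload would end up using.
def _get_effective_object_store_id(self, trans, history, preferred_object_store_id=None):
    if preferred_object_store_id:
        return preferred_object_store_id
    if getattr(history, "preferred_object_store_id", None):
        return history.preferred_object_store_id
    user = trans.user
    if user is not None and getattr(user, "preferred_object_store_id", None):
        return user.preferred_object_store_id
    # None means "instance default"; is_store_private(None) from Step 2 handles it.
    return None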

Step 4: Add Validation in Fetch Tool Action

File: lib/galaxy/tools/actions/upload.py

Apply same validation to FetchUploadToolAction._setup_job() (around line 101).

Step 5: Add UI Warning for Object Store Selection

File: client/src/components/ObjectStore/SelectObjectStore.vue

Add warning when selecting private store for published history:

<template>
  <!-- Add after existing content -->
  <BAlert
    v-if="isPrivateStoreWarning"
    variant="warning"
    show>
    This storage location is private. Datasets stored here cannot be shared or
    published. If your history is published, new uploads will fail.
  </BAlert>
</template>

<script setup>
// Add computed property
const isPrivateStoreWarning = computed(() => {
  if (props.selectedObjectStoreId) {
    const store = selectableObjectStores.value?.find(
      s => s.object_store_id === props.selectedObjectStoreId
    );
    return store?.private === true;
  }
  return false;
});
</script>

Step 6: Add API Endpoint for Checking Upload Compatibility

File: lib/galaxy/webapps/galaxy/api/histories.py

Add endpoint to check if a history can accept uploads with current storage settings:

@router.get(
    "/api/histories/{id}/upload_compatible",
    summary="Check if history can accept uploads with current storage config"
)
def check_upload_compatible(
    trans: ProvidesUserContext = DependsOnTrans,
    id: DecodedDatabaseIdField = HistoryIDPathParam,
) -> dict:
    history = trans.sa_session.get(History, id)
    # ... validation

    can_upload = True
    reason = None

    if history.requires_shareable_storage_for_new_datasets():
        effective_store_id = _get_effective_object_store_id(trans, history)
        if trans.app.object_store.is_store_private(effective_store_id):
            can_upload = False
            reason = "History is published but current storage location is private"

    return {"can_upload": can_upload, "reason": reason}

Test Plan

Unit Tests

File: test/unit/data/test_galaxy_mapping.py

# assumes: from galaxy.model import History
def test_history_requires_shareable_storage(self):
    """Test History.requires_shareable_storage_for_new_datasets()."""
    history = History()
    assert not history.requires_shareable_storage_for_new_datasets()

    history.published = True
    assert history.requires_shareable_storage_for_new_datasets()

    history.published = False
    history.importable = True
    assert history.requires_shareable_storage_for_new_datasets()

Integration Tests

File: test/integration/objectstore/test_private_handling.py

class TestPrivateStoreBlocksPublishedHistoryUpload(BaseObjectStoreIntegrationTestCase):
    """Test that uploads to published histories fail when only private store available."""

    @classmethod
    def handle_galaxy_config_kwds(cls, config):
        config["new_user_dataset_access_role_default_private"] = False
        cls._configure_object_store(PRIVATE_OBJECT_STORE_CONFIG_TEMPLATE, config)

    def test_upload_to_published_history_fails(self):
        with self.dataset_populator.test_history() as history_id:
            # Publish the history
            self._put(f"histories/{history_id}", json={"published": True})

            # Attempt upload should fail with clear error
            response = self.dataset_populator.new_dataset_request(
                history_id, content="test", wait=False, assert_ok=False
            )
            assert response.status_code == 400
            assert "private storage" in response.json()["err_msg"].lower()

    def test_upload_to_unpublished_history_succeeds(self):
        with self.dataset_populator.test_history() as history_id:
            # Upload to unpublished history should work
            response = self.dataset_populator.new_dataset_request(
                history_id, content="test", wait=True
            )
            assert response.status_code == 200

Risks and Considerations

Risks

  1. Breaking change for users - Users who rely on uploading to published histories with private storage will now see an error at upload time instead of a confusing failure later at share time
  2. Object store configuration complexity - Admins need to ensure at least one non-private store exists for public sharing use cases
  3. Edge case: History published after upload - Datasets uploaded before publishing remain stuck in private store

Mitigations

  1. Clear error messages - Provide actionable guidance in error messages
  2. Documentation - Update admin docs explaining private store + published history incompatibility
  3. Future enhancement - Consider dataset migration tool for edge case (out of scope for this fix)

Alternative Approaches Considered

  1. Route to public store automatically - Rejected: violates user's storage preference, quota implications
  2. Copy on share - Rejected: doubles storage, complex implementation, quota issues
  3. Silent skip - Rejected: current behavior, causes user confusion (the bug we're fixing)
  4. Documentation only - Rejected: doesn't solve the UX problem

Unresolved Questions

  1. Should we add a config option to auto-unpublish history when user selects private storage?
  2. Should the UI proactively check storage compatibility when publishing a history?
  3. Should existing datasets uploaded before this fix get a migration path?
  4. Is there a use case for "private to me but published for viewing" that we're breaking?

Critical Files for Implementation

  • lib/galaxy/model/__init__.py - Add requires_shareable_storage_for_new_datasets() to History class
  • lib/galaxy/objectstore/__init__.py - Add is_store_private() method to ObjectStore and DistributedObjectStore
  • lib/galaxy/tools/actions/upload.py - Add validation in upload action before job creation
  • test/integration/objectstore/test_private_handling.py - Add integration tests for new behavior
  • client/src/components/ObjectStore/SelectObjectStore.vue - Add UI warning for private store selection

Issue #21604 Summary: Cannot Publish Dataset in Published History

Top-Line Summary

Issue: User cannot publish a dataset uploaded to an already-published history. API returns error 400008: "Attempting to share a non-shareable dataset."

Root Cause: This is a design gap, not a regression. The dataset is stored in a private object store (object_store.is_private() returns True), which correctly prevents sharing per the private storage contract introduced in Galaxy 23.1. However, the UX fails to warn users when uploading to a published history with only private storage available, leading to a confusing failure at share-time rather than upload-time.

Most Probable Fix: Implement fail-fast validation - reject uploads to published/importable histories when the target object store is private, with a clear error message guiding users to either unpublish the history or select a shareable storage location.

Importance Assessment

| Dimension | Rating | Notes |
| --- | --- | --- |
| Severity | Medium | Functional breakage, no data loss/security issue |
| Blast Radius | Specific config | EU instance + private object store users |
| Workaround | Painful | Requires understanding object store architecture |
| Regression | No | Behavior introduced with private stores in 23.1 |
| Priority | Backlog | Medium priority, not hotfix |

Questions for Discussion

  1. EU Configuration: Is the EU object store intentionally configured as private? If so, is this expected behavior for EU users, and should documentation clarify?

  2. Consolidation: Should we consolidate this with related issues (#21536 - upload fails with private storage, #19608 - making history private) into a single private-storage UX epic?

  3. Behavior Choice: Should uploads to published histories:

    • Fail immediately with clear error (proposed fix)?
    • Warn but allow (creates non-shareable datasets)?
    • Auto-route to public store if available (changes user preference)?
  4. History Publish Flow: Should publishing a history warn/block if it contains datasets in private storage that cannot be made public?

Effort Estimate

| Component | Effort | Complexity |
| --- | --- | --- |
| Backend validation | Small | Low - add check in upload action |
| UI warning | Small | Low - add alert component |
| Integration tests | Medium | Medium - requires object store config |
| Documentation | Small | Low |
| Total | ~1-2 days | Low-Medium |

Reproducibility

Easy to reproduce on any Galaxy instance (scripted with BioBlend below):

  1. Configure a private object store (or set new_user_dataset_access_role_default_private = True)
  2. Publish a history containing at least one dataset
  3. Upload a new dataset to the published history
  4. Try to share/publish the new dataset
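
A rough BioBlend sketch of these steps (the URL, API key, and IDs are placeholders; it assumes update_history() accepts published=True, and the failing calls at the end behave as reported in the issue):

from bioblend.galaxy import GalaxyInstance

gi = GalaxyInstance("https://usegalaxy.eu", key="<api-key>")  # placeholders

history = gi.histories.create_history(name="repro-21604")
gi.tools.paste_content("first dataset", history["id"])
gi.histories.update_history(history["id"], published=True)        # step 2

second = gi.tools.paste_content("second dataset", history["id"])  # step 3
dataset_id = second["outputs"][0]["id"]

# Step 4: sharing/publishing the new dataset fails or has no effect.
gi.datasets.publish_dataset(dataset_id)
gi.datasets.update_permissions(dataset_id, manage_ids=["<my_user_id>"], access_ids=[])
# -> ConnectionError 400: "Attempting to share a non-shareable dataset." (err_code 400008)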

The fix is harder to test - it requires an integration test environment with a private object store configuration.

Related Issues

  • #21536 - Same root cause (private storage + sharing)
  • #19608 - Related privacy handling gap
  • #14073 - Original PR introducing private object stores (23.1)