Summary
Migrate the Galaxy Google Drive file source plugin from the deprecated fs backend to fsspec, utilizing the gdrive_fsspec.GoogleDriveFileSystem implementation.
The Google Cloud Storage source (googlecloudstorage.py) is the closest reference since it also handles Google OAuth2 credentials. It passes the OAuth tokens as a dict.
Standard fsspec interface (ls, get_file, put_file, walk, glob, etc.)
The key question for this migration is how to pass the existing OAuth2 access token to gdrive_fsspec. The library may accept a token dict (like gcsfs) or may need a Google credentials object.
Files That Must Be Modified
Primary changes:
lib/galaxy/files/sources/googledrive.py -- Complete rewrite: swap base class, imports, config classes, and _open_fs implementation.
lib/galaxy/dependencies/conditional-requirements.txt (line 28) -- Replace fs.googledrivefs with gdrive_fsspec.
lib/galaxy/dependencies/__init__.py (lines 272-273) -- Rename check_fs_googledrivefs to check_gdrive_fsspec.
Secondary changes:
packages/files/setup.cfg -- Consider adding gdrive_fsspec to test extras.
pyproject.toml -- Add gdrive_fsspec if it should be a default dependency.
test/unit/files/test_googledrive.py -- Update test if config or import changes require it.
test/unit/files/googledrive_file_sources_conf.yml -- Update config keys if necessary.
Potentially affected:
lib/galaxy/files/templates/examples/production_google_drive.yml -- Verify template still works.
lib/galaxy/files/templates/models.py -- OAuth2 configuration model for googledrive; review for compatibility.
client/src/api/schema/schema.ts -- Auto-generated; will update if API schema changes.
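The dependency rename listed under the primary changes can be sketched as follows. This is a simplified stand-in, not the code in lib/galaxy/dependencies/__init__.py: the class shape, the file_sources attribute, and the convention that a check_<package> method decides whether <package> is installed are all assumptions made for illustration.

```python
class ConditionalDependencies:
    """Simplified stand-in for a conditional dependency resolver (hypothetical)."""

    def __init__(self, file_source_types):
        # Plugin types found in the file sources configuration (assumed input shape).
        self.file_sources = set(file_source_types)

    def check_gdrive_fsspec(self):
        # Replaces check_fs_googledrivefs: request gdrive_fsspec only when a
        # googledrive file source is configured.
        return "googledrive" in self.file_sources

    def conditional_packages(self):
        # Collect the package name from every check_<name> method that returns True.
        return sorted(
            name[len("check_"):]
            for name in dir(self)
            if name.startswith("check_") and getattr(self, name)()
        )
```

With this shape, a configuration containing a googledrive source yields gdrive_fsspec as a conditional package, and one without it yields nothing. The method name has to match the package name declared in conditional-requirements.txt for the rename to take effect.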
Reactions on issue: Zero (no thumbs-up or other reactions)
Comments: Zero
Assignees: None
Milestone: None
Related Issues
This is part of a batch of 5 issues filed simultaneously by davelopez on 2026-02-17:
| Issue  | Title                                            | Reactions | Comments |
|--------|--------------------------------------------------|-----------|----------|
| #21865 | Migrate dropbox file source plugin to fsspec     | 0         | 0        |
| #21866 | Migrate ftp file source plugin to fsspec         | 0         | 0        |
| #21867 | Migrate ssh file source plugin to fsspec         | 0         | 0        |
| #21868 | Migrate googledrive file source plugin to fsspec | 0         | 0        |
| #21869 | Migrate webdav file source plugin to fsspec      | 0         | 0        |
The broader fsspec migration effort traces back to issue #20415 (referenced in PR #20698).
Duplicate / Related Issues
No duplicate issues found. No community help forum threads requesting this specific migration.
Community Sentiment
This is entirely developer-driven technical debt work. There are no user reports of problems with the current Google Drive implementation that would be solved by this migration. The motivation is architectural: moving away from a deprecated library ecosystem.
Indirect Demand Signals
Production usage: Google Drive is available as a configured file source template (production_google_drive.yml) with full OAuth2 integration, suggesting it is actively used in production Galaxy instances.
Library health: The PyFilesystem2 (fs) ecosystem is deprecated and less actively maintained than fsspec, and the fs.googledrivefs package itself sees limited maintenance activity. The gdrive_fsspec package lives under the official fsspec GitHub organization, which suggests institutional backing and better long-term support.
Ecosystem trend: fsspec has become the de facto standard for filesystem abstraction in the Python data ecosystem (used by pandas, dask, xarray, etc.).
Demand Assessment
LOW direct user demand, HIGH indirect/strategic demand.
This is maintenance-driven debt reduction rather than a user-requested feature. No users have asked for this. However, the risk of staying on deprecated libraries increases over time -- eventual incompatibilities, security issues, or Python version support gaps could force an urgent migration later. Proactive migration while the effort is small is the prudent approach.
No community discussion threads requesting this migration
Developer-driven technical debt initiative, not user-reported
However, Google Drive is used in production (has OAuth2 template configuration in production_google_drive.yml)
Strategic Value: HIGH
Part of a planned migration of all file sources from deprecated PyFilesystem2 to fsspec (#21865-#21869)
The fs (PyFilesystem2) ecosystem is deprecated and less maintained than fsspec
fsspec is the modern standard for filesystem abstraction in the Python ecosystem (used by pandas, dask, xarray, etc.)
Staying on deprecated libraries increases maintenance risk and security exposure over time
Completing this migration (along with the other 4 sibling issues) unblocks eventual removal of the _pyfilesystem2.py base class and the fs dependency
The gdrive_fsspec package is under the official fsspec GitHub organization, indicating better institutional support
Consistent architecture: having all file sources on the same base class simplifies maintenance, testing, and feature development (e.g., cache options, pagination improvements apply to all sources at once)
Effort Estimate: SMALL
Current implementation is only 58 lines
The migration pattern is well-established with 3+ sources already migrated successfully
PR #21590 (GCS migration, the closest analog) was +135/-171 lines, touching 8 files
The FsspecFilesSource base class handles all the heavy lifting
Main work: swap base class, swap imports, adjust _open_fs to use gdrive_fsspec, update dependency declarations
OAuth2 credential mapping (MEDIUM risk): The current implementation uses google.oauth2.credentials.Credentials(token=access_token) passed to GoogleDriveFS(credentials). The new gdrive_fsspec library uses a different authentication model (token parameter, creds dict). This mapping needs careful implementation and testing to ensure backward compatibility.
Configuration compatibility (LOW risk): The YAML configuration format uses token, refresh_token, token_uri, client_id, client_secret fields. These must continue to work. Since the configuration models are separate from the filesystem implementation, the risk is manageable.
Library maturity (LOW risk): gdrive_fsspec is relatively new but is part of the official fsspec organization. Its API stability and completeness for Galaxy's use cases (ls, get_file, put_file, walk, glob) should be verified.
Testing gap (LOW risk): The existing test requires live Google Drive credentials (GALAXY_TEST_GOOGLE_DRIVE_ACCESS_TOKEN), making CI verification difficult. However, the fsspec base class itself is well-tested, and the plugin surface area is small.
Mitigations:
PR #21590 (GCS migration) provides a proven pattern for Google OAuth2 credential handling in fsspec
The fsspec base class is well-tested with memory and temp filesystem backends
The scope is small enough that manual testing by a developer with Google Drive access is feasible
Can be reverted independently if issues arise
Recommendation: PRIORITIZE NOW
Rationale:
This is low-effort, low-risk work that contributes to an important strategic goal (eliminating the PyFilesystem2 dependency). It follows an established pattern with clear prior art. The five fsspec migration issues (#21865-#21869) should ideally be worked as a batch since they all follow the same pattern, and completing the set enables removing the deprecated _pyfilesystem2.py base class entirely.
The Google Drive migration specifically is one of the simpler ones in the batch given:
The existing GCS migration as a direct reference for Google OAuth2 handling
The small size of the current implementation (58 lines)
The straightforward authentication model (single access token)
This is a good candidate for a newer contributor familiar with the codebase patterns, or for batch implementation by the original issue author (davelopez) who designed and implemented the fsspec base class.
Follow the exact pattern established by PR #21590 (GCS migration). The implementation is straightforward: swap the base class from PyFilesystem2FilesSource to FsspecFilesSource, replace the fs.googledrivefs import with gdrive_fsspec, and update the _open_fs method to use the new library's authentication model.
Google credentials object: creds=Credentials(token=...)
The GCS implementation (googlecloudstorage.py) uses a dict approach. The gdrive_fsspec library documentation suggests it uses token and creds parameters.
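A minimal sketch of the credential mapping step, assuming a resolved config object that carries the five YAML fields named in this document. GoogleDriveConfig and build_credential_kwargs are hypothetical names, not Galaxy code. The keys are chosen to match the keyword arguments of google.oauth2.credentials.Credentials, so the result could be handed over either as a plain dict or as Credentials(**kwargs). Which of the two forms gdrive_fsspec accepts is not settled here and needs to be checked against the library.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class GoogleDriveConfig:
    """Hypothetical resolved config mirroring the YAML fields (assumption)."""

    token: str
    refresh_token: Optional[str] = None
    token_uri: Optional[str] = None
    client_id: Optional[str] = None
    client_secret: Optional[str] = None


def build_credential_kwargs(config: GoogleDriveConfig) -> Dict[str, str]:
    """Return the OAuth2 fields as a dict, dropping any that are unset."""
    # Unset optional fields are omitted so the consumer can apply its own defaults.
    return {key: value for key, value in asdict(config).items() if value is not None}
```

Keeping the mapping in one small function means the existing YAML keys stay untouched, and only the final hand-off to the filesystem constructor changes once the accepted form is known.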
Step 3: Update test configuration
test/unit/files/googledrive_file_sources_conf.yml: Review field names. Current format:
    - type: googledrive
      id: test1
      doc: Test access to a Google drive.
      token: ${user.preferences['googledrive|access_token']}
      refresh_token: ${user.preferences['googledrive|refresh_token']}
      token_uri: "https://www.googleapis.com/oauth2/v4/token"
      client_id: ${user.preferences['googledrive|client_id']}
      client_secret: ${user.preferences['googledrive|client_secret']}
This may need updates to match the new configuration model.
test/unit/files/test_googledrive.py: Minimal changes expected. The test structure should remain the same; only import paths or config field names might change.
Step 4: Update package configuration
packages/files/setup.cfg: Consider adding gdrive_fsspec to test extras.
pyproject.toml: Add gdrive_fsspec alongside other fsspec implementations if it should be available by default.
Step 5: Verify template compatibility
lib/galaxy/files/templates/examples/production_google_drive.yml: Verify the template still works with the new implementation. The configuration keys oauth2_client_id and oauth2_client_secret are handled by the template system before reaching the plugin.
lib/galaxy/files/templates/models.py: The GoogleDriveFileSourceConfiguration and GoogleDriveFileSourceTemplateConfiguration models may need updates if the resolved config class interface changes.
Testing Strategy
Automated unit tests: The existing test (test_googledrive.py) requires live credentials. Consider adding a mock-based test using the BaseFileSourceTestSuite pattern established in PR #20698 with a mocked GoogleDriveFileSystem.
Integration testing: Run test_googledrive.py with valid Google Drive credentials:
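Before running the integration test, it can help to confirm that the credential variables are present. The sketch below uses the two environment variable names mentioned in this document. The helper names missing_credentials and skip_reason are hypothetical and are not part of Galaxy's test framework.

```python
import os

# Environment variables the live Google Drive test is described as needing.
REQUIRED_ENV_VARS = (
    "GALAXY_TEST_GOOGLE_DRIVE_ACCESS_TOKEN",
    "GALAXY_TEST_GOOGLE_DRIVE_REFRESH_TOKEN",
)


def missing_credentials(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


def skip_reason(environ=None):
    """Return a human-readable skip reason, or None when all variables are set."""
    missing = missing_credentials(environ)
    if not missing:
        return None
    return "Google Drive credentials not set: " + ", ".join(missing)
```

In a pytest module, the result could feed a marker such as pytest.mark.skipif(skip_reason() is not None, reason=skip_reason() or ""), so the test is skipped with a clear message on machines without credentials.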
Issue #21868 requests migrating Galaxy's Google Drive file source plugin from the deprecated PyFilesystem2 (fs.googledrivefs) backend to the modern fsspec framework (gdrive_fsspec). This is one of five coordinated migration issues (#21865-#21869) filed by davelopez to eliminate Galaxy's dependency on the unmaintained PyFilesystem2 ecosystem. The recommended approach follows the established pattern from PR #21590 (GCS migration) and involves swapping the base class from PyFilesystem2FilesSource to FsspecFilesSource, replacing the fs.googledrivefs import with gdrive_fsspec, and updating the _open_fs method to use the new library's authentication model. This is small, well-scoped work with clear prior art.
PRIORITIZE NOW -- low effort, high strategic value, proven pattern
Key Questions for Group Discussion
Should all five fsspec migration issues (#21865-#21869) be worked as a single coordinated batch, or can they be picked up independently? Working them together enables removing _pyfilesystem2.py sooner.
Who has access to Google Drive credentials for manual testing? The CI test requires GALAXY_TEST_GOOGLE_DRIVE_ACCESS_TOKEN and GALAXY_TEST_GOOGLE_DRIVE_REFRESH_TOKEN.
Is there a plan for the remaining PyFilesystem2-based sources not covered in this batch (azure, posix, rspace, onedata, anvil, basespace)?
Should we invest in mock-based unit tests for these cloud file sources, or is the integration test pattern sufficient?
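To make the mock-based option concrete, here is a rough sketch using only unittest.mock. GoogleDriveFilesSourceSketch is a hypothetical stand-in for the migrated plugin, not Galaxy's class, and the token keyword passed to the filesystem factory is an assumption about the constructor. The only real interface relied on is fsspec's ls(path, detail=True), which returns a list of dicts with a name key.

```python
from unittest.mock import MagicMock


class GoogleDriveFilesSourceSketch:
    """Hypothetical stand-in for the migrated plugin (not Galaxy's class)."""

    def __init__(self, fs_factory):
        # fs_factory stands in for the filesystem class so tests can inject a mock.
        self._fs_factory = fs_factory

    def _open_fs(self, access_token):
        # ASSUMPTION: the keyword name "token" is illustrative only.
        return self._fs_factory(token=access_token)

    def list_names(self, access_token, path):
        # Delegate listing to the filesystem and keep only the entry names.
        fs = self._open_fs(access_token)
        return [entry["name"] for entry in fs.ls(path, detail=True)]


def make_mock_fs_factory(entries):
    """Build a mock factory whose filesystem returns `entries` from ls()."""
    fs = MagicMock(name="GoogleDriveFileSystem")
    fs.ls.return_value = entries
    factory = MagicMock(return_value=fs)
    return factory, fs
```

A test would build the factory with a few fake entries, call list_names, and then assert both the returned names and the arguments the factory received. This checks the credential hand-off without any network access, though it cannot catch differences in how the real library behaves.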
Concerns
Scope creep: Keep each migration as a standalone PR. Do not combine multiple file source migrations into one PR, even though they follow the same pattern.
Breaking changes: The OAuth2 credential mapping between fs.googledrivefs and gdrive_fsspec must preserve backward compatibility for existing configured Galaxy instances. The YAML configuration format must not change without a documented migration path.
Maintenance burden: Minimal ongoing burden. The fsspec base class handles all generic file source operations; the plugin only implements _open_fs. Moving to gdrive_fsspec (under the official fsspec GitHub org) should actually reduce maintenance burden compared to fs.googledrivefs.
Library maturity: gdrive_fsspec is relatively new. Its API stability and completeness for Galaxy's use cases (ls, download, upload, walk, glob) should be verified before merging.