kurokobo/WEAVIATE_MIGRATION_GUIDE.md

## migrate_weaviate_collections_ce.py
"""
# NOTE: THIS SCRIPT IS DEPRECATED AND OUTDATED

## TL;DR;

Use the official migration script instead.
https://github.com/langgenius/dify-docs/blob/main/assets/migrate_weaviate_collections.py

## Background

This script was originally released as a community-edited version of the draft of the script presented by the Dify Team,
to address some issues encountered during migration of Weaviate collections in certain environments,
before the official script was finalized, as a temporary workaround.

However, all modifications made in this script have already been backported to the official script.
Therefore, this unofficial script is deprecated and will not be maintained in the future.
We strongly recommend using the latest official script.
If you face any issues with the official script, please report them to the Dify Team via their GitHub repository or any supported channels.

You can see the revisions made in this script by checking the git history: https://gist.github.com/kurokobo/51fbe7f92f4526957e12dacfa7783cdf/revisions
The original source for this script can be found at: https://github.com/langgenius/dify/issues/27291#issuecomment-3501003678.
The key changes made in this script were:

- Retrieve Weaviate connection info from environment variables to make this script run in the Worker container.
- Switch to cursor-based pagination in "replace_old_collection", since the migration could fail with large collections.
- Fix an issue where both the old and new collections remained without being deleted after migrating an empty collection.

"""

import sys

print("WARNING: This migration script is DEPRECATED and OUTDATED.")
print("Please use the following official migration script instead:")
print("https://github.com/langgenius/dify-docs/blob/main/assets/migrate_weaviate_collections.py")
print("This script will now exit without making any changes.")

sys.exit(1)
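
The switch to cursor-based pagination listed among the key changes can be illustrated without Weaviate itself. The sketch below is not the script's code: `iterate_with_cursor`, `make_in_memory_fetcher`, and the `fetch_page` callback are hypothetical names, and an in-memory list stands in for a collection. The idea is that each request asks for the objects that come after the last ID already seen, instead of skipping an ever-growing offset.

```python
from typing import Callable, Iterator, Optional

Record = dict  # assumption: every record carries a unique, sortable "id"


def iterate_with_cursor(
    fetch_page: Callable[[Optional[str], int], list],
    page_size: int = 100,
) -> Iterator[Record]:
    """Yield every record by repeatedly asking for the page after the last seen ID."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor, page_size)
        if not page:
            return  # an empty page means the collection is exhausted
        yield from page
        cursor = page[-1]["id"]  # the next request resumes after this record


def make_in_memory_fetcher(records: list) -> Callable[[Optional[str], int], list]:
    """Hypothetical stand-in for a real backend: pages through a sorted list."""
    ordered = sorted(records, key=lambda r: r["id"])

    def fetch_page(after: Optional[str], limit: int) -> list:
        if after is None:
            start = 0
        else:
            # Position just past the record whose ID equals the cursor.
            start = next(
                (i + 1 for i, r in enumerate(ordered) if r["id"] == after),
                len(ordered),
            )
        return ordered[start:start + limit]

    return fetch_page
```

Because the loop stops on the first empty page, an empty collection simply yields nothing, which is the case the third key change is concerned with.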

## WEAVIATE_MIGRATION_GUIDE.md

      
Weaviate 1.19 to 1.27+ Migration Guide for Dify


⚠️ This guide is not officially supported by the Dify Team.
⚠️ This is a community-edited, simplified version of the official migration guide presented by the Dify Team.

A complete guide to safely migrating Dify knowledge bases from Weaviate 1.19 to 1.27/1.33.

✅ NOTE: BEFORE PROCEEDING FURTHER

If your environment contains only a small number of knowledge bases, you may be able to resolve the issue with the following much simpler steps instead of the full procedure on this page.

1. Open the Settings page for your knowledge base.
2. Change the Embedding Model to a different one.
3. On the Documents page, wait until all documents become Available.
4. Open the Settings page again and change the Embedding Model back to the original.
5. On the Documents page, wait again until all documents become Available.
6. Repeat these steps for each knowledge base.

The steps described in the following sections are aimed at large environments, where manually editing every knowledge base is not feasible.

📝 Outline

This guide covers the following two cases.

While Case A is recommended for a safer migration, this guide can also be applied to Case B:

Case A

- You are currently running Dify 1.9.1 or earlier, which includes Weaviate 1.19.
- All knowledge bases are functioning properly.

Case B

- You have already upgraded to Weaviate 1.27+ and are running Dify 1.9.2 or later.
- The knowledge bases created with the previous version are corrupted, and you have no backup to revert to the earlier version.

The procedure in this guide is as follows:

1. Take a complete backup of your current Dify environment.
2. If your Dify version is 1.9.1 or earlier, upgrade Dify.
3. In the weaviate container, restructure the directories that hold the LSM data.
4. In the worker container, run the migration script.
5. Perform cleanup.


📝 Migration Procedure

Note:

This procedure cannot be rolled back by any means other than a restore. Attempting to roll back using anything other than a restore may make things worse.

We recommend that you follow the steps to take a full backup first, in preparation for a possible restore.

Step 1: Backup Your Environment

Stop your Dify services:

    cd /path/to/dify/docker
    docker compose down

Then make a full copy or archive of your entire docker directory (for example, /path/to/dify/docker) as a safety measure.
If you encounter issues later, you can restore this backup to revert to the original state.
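
One way to take that archive is sketched below in Python, assuming the services are already stopped. The function name `archive_directory` and the timestamped file name are illustrative choices, not part of any Dify tooling. Files under the volumes directory are often owned by root, so the script may need elevated privileges to read them.

```python
import tarfile
from datetime import datetime
from pathlib import Path


def archive_directory(source_dir, backup_root) -> Path:
    """Write <backup_root>/<name>-backup-<timestamp>.tar.gz and return its path."""
    source = Path(source_dir).resolve()
    if not source.is_dir():
        raise NotADirectoryError(f"{source} is not a directory")

    # Keep the archive outside the directory being archived.
    backup_root = Path(backup_root)
    backup_root.mkdir(parents=True, exist_ok=True)

    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    archive_path = backup_root / f"{source.name}-backup-{stamp}.tar.gz"

    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps paths in the archive relative to the directory name.
        tar.add(source, arcname=source.name)
    return archive_path
```

For example, `archive_directory("/path/to/dify/docker", "/path/to/backups")` would produce an archive whose entries all start with `docker/`. Restoring means extracting it back into `/path/to/dify` after removing or renaming the existing directory.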

Step 2: Upgrade to Weaviate 1.27+ (Only for Case A)

This step is only for Case A - users currently on Dify 1.9.1 or earlier with Weaviate 1.19.

If you are already running Weaviate 1.27+ (Case B), you can skip this step.
Follow the upgrade guide to move to the latest (or a specific) Dify version that uses Weaviate 1.27+.

Upgrade guide: https://github.com/langgenius/dify/releases
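
Deciding whether a deployment is still on Weaviate 1.19 or already on 1.27+ comes down to comparing version numbers, which is easy to get wrong with plain string comparison ("1.9" sorts after "1.27" as text). The sketch below compares the numeric parts of an image reference such as `semitechnologies/weaviate:1.27.0`. The helper names `parse_image_version` and `meets_minimum` are illustrative, and the code assumes the tag is a plain `major.minor.patch` version.

```python
import re

MIN_VERSION = (1, 27, 0)  # minimum Weaviate version this guide targets


def parse_image_version(image: str) -> tuple:
    """Extract (major, minor, patch) from an image reference like 'repo/name:1.27.0'."""
    match = re.search(r":v?(\d+)\.(\d+)\.(\d+)", image)
    if match is None:
        raise ValueError(f"no major.minor.patch tag found in {image!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(image: str, minimum: tuple = MIN_VERSION) -> bool:
    """Return True when the image tag is at least the minimum version."""
    # Tuples of ints compare numerically, element by element.
    return parse_image_version(image) >= minimum
```

With these helpers, `meets_minimum("semitechnologies/weaviate:1.19.0")` is false and `meets_minimum("semitechnologies/weaviate:1.27.0")` is true.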


Step 3: Fix Orphaned LSM Data

If Dify is stopped, start it and wait until it has fully launched:

    cd /path/to/dify/docker
    docker compose up -d

Ensure that Weaviate is using image version 1.27.0 or higher:

    cd /path/to/dify/docker
    docker compose ps weaviate  # The "IMAGE" column should show "semitechnologies/weaviate:1.27.0" or higher

Enter the shell of your weaviate container:

    cd /path/to/dify/docker
    docker compose exec -it weaviate /bin/sh

Then run the following commands inside the container to fix the LSM data:

    cd /var/lib/weaviate
    for dir in vector_index_*_node_*_lsm; do
      [ -d "$dir" ] || continue

      # Extract index ID and shard ID
      index_id=$(echo "$dir" | sed -n 's/vector_index_\([^_]*_[^_]*_[^_]*_[^_]*_[^_]*\)_node_.*/\1/p')
      shard_id=$(echo "$dir" | sed -n 's/.*_node_\([^_]*\)_lsm/\1/p')

      # Create target directory and copy
      mkdir -p "vector_index_${index_id}_node/$shard_id/lsm"
      cp -a "$dir/"* "vector_index_${index_id}_node/$shard_id/lsm/"

      echo "✓ Copied $dir"
    done
    exit

Then restart the weaviate container to ensure the changes are recognized:

    cd /path/to/dify/docker
    docker compose restart weaviate
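
The renaming rule applied by the loop above can be read as a pure function from the old flat directory name to the new nested path. The Python sketch below mirrors the two `sed` expressions with a single regular expression. It is only an illustration of the mapping, not a replacement for the shell commands, and `nested_lsm_path` is a hypothetical name. It is also slightly stricter than the `sed` version, since it requires every underscore-separated part to be non-empty.

```python
import re
from pathlib import PurePosixPath
from typing import Optional

# vector_index_<five underscore-separated parts>_node_<shard>_lsm
_FLAT_LSM_DIR = re.compile(
    r"vector_index_(?P<index_id>[^_]+(?:_[^_]+){4})_node_(?P<shard_id>[^_]+)_lsm"
)


def nested_lsm_path(flat_dir_name: str) -> Optional[PurePosixPath]:
    """Map a flat '<...>_node_<shard>_lsm' name to '<...>_node/<shard>/lsm'.

    Returns None when the name does not follow the flat layout.
    """
    match = _FLAT_LSM_DIR.fullmatch(flat_dir_name)
    if match is None:
        return None
    target_root = f"vector_index_{match['index_id']}_node"
    return PurePosixPath(target_root) / match["shard_id"] / "lsm"
```

Printing the result for each directory name before copying anything gives a quick dry run of which paths the loop would create.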

Step 4: Migrate Schema

Place the migrate_weaviate_collections.py script in your /path/to/dify/docker/volumes/app/storage/ directory, then enter the shell of your worker container:

    cp /path/to/migrate_weaviate_collections.py /path/to/dify/docker/volumes/app/storage/
    cd /path/to/dify/docker
    docker compose exec -it worker /bin/bash

Then run the following commands inside the container to execute the migration script:

    uv run --no-cache /app/api/storage/migrate_weaviate_collections.py
    exit

Restart Dify services:

    docker compose down
    docker compose up -d

Verify in the Dify UI:

- Go to your Dify console.
- Open your knowledge bases.
- Try "Retrieval Testing".

Retrieval should work without errors.

Step 5: Cleanup (Optional)

After a successful migration, you can delete the orphaned files to free up space.

Enter the shell of your weaviate container:

    cd /path/to/dify/docker
    docker compose exec -it weaviate /bin/sh

Then run the following commands inside the container to delete the orphaned files:

    cd /var/lib/weaviate
    rm -rf vector_index_*_node_*
    exit

You can also delete the migration script from your storage volume:

    rm /path/to/dify/docker/volumes/app/storage/migrate_weaviate_collections.py
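
Since `rm -rf` is irreversible, it can help to preview what the pattern `vector_index_*_node_*` would match. The sketch below uses Python's `fnmatch` as an approximation of shell globbing, and `cleanup_candidates` is a hypothetical helper name. It shows that the old flat names, which contain `_node_`, match the pattern, while a migrated directory whose name ends in `_node` does not.

```python
from fnmatch import fnmatchcase

CLEANUP_PATTERN = "vector_index_*_node_*"  # same pattern as the rm command


def cleanup_candidates(entries) -> list:
    """Return the entry names that the cleanup pattern would match, sorted."""
    return sorted(name for name in entries if fnmatchcase(name, CLEANUP_PATTERN))
```

Running it requires Python and access to the data directory, for example `cleanup_candidates(os.listdir(data_dir))` on the host, where `data_dir` is whatever host path your compose file mounts at /var/lib/weaviate. That host path is an assumption that depends on your setup.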

📝 Files Needed


migrate_weaviate_collections.py - Schema migration script

📝 Credits


Original migration approach: Dify team
LSM recovery method: Chinese Dify community user
Combined solution: Community effort