@ireun
Last active January 28, 2025 12:10

Hello!

Here is a simple script that helps you scan game cards or photos, several at once. I made it as a university project to preserve old photos and cards I had.

It finds the individual cards/photos on each scanned page, extracts them, and rotates them upright.
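
As a rough illustration of the rotation step: the script reads the tilt angle from cv.minAreaRect and subtracts 90° when it exceeds 45°, so that the smallest possible correction is applied. The sketch below is a slightly more general variant, not the script's exact code. The name fold_angle is hypothetical, and the assumption that the reported angle can fall outside 0° to 90° (for example with a different OpenCV version) is worth checking against your installation.

```python
def fold_angle(angle):
    """Fold any rectangle angle into the range [-45, 45) degrees.

    A rectangle looks the same after a 90-degree turn, so every
    reported angle has an equivalent within 45 degrees of upright.
    """
    return (angle + 45) % 90 - 45


# A card tilted by 50 degrees is straightened with a -40 degree turn
# instead of a 50 degree one; small angles are left unchanged.
for reported in (10, 50, 90, -80):
    print(reported, "->", fold_angle(reported))
```

Note one difference from the script: an angle of exactly 45° stays 45° there, while this sketch maps it to -45°. Both describe the same rectangle.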

I share it with you under a Creative Commons licence. Please respect that.

A few assumptions:

  • Place game cards/photos more or less straight and avoid 45° rotation (0° = best quality, 45° = worst quality)
  • Scan to a multi-page PDF and put its file name in the file variable near the end of the script (line 93 in the original gist)
  • DO NOT overlap images; leave a margin of about 3-5 mm between them, otherwise the script may detect two images as one!
  • Remember: scanners usually do not capture anything within about 3 mm of the edges
  • To scan cards with white borders, do not close the lid and set white_cards to True
  • For photos you may need to change the cv.threshold(...) call in create_mask (line 17 in the original gist) to something like cv.threshold(gray, 240, 255, cv.THRESH_BINARY_INV)
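
To make the last tip more concrete, here is a minimal NumPy-only sketch of what a fixed inverse binary threshold at 240 does to pixel values. The helper name binary_inv_threshold and the sample pixel row are made up for illustration; in the script itself you would call cv.threshold directly. Keep in mind that when cv.THRESH_OTSU is passed, OpenCV picks the threshold automatically and ignores the number you give it, so the tip above also means dropping that flag.

```python
import numpy as np


def binary_inv_threshold(gray, thresh=240, maxval=255):
    """Mimic cv.threshold(gray, thresh, maxval, cv.THRESH_BINARY_INV).

    Pixels brighter than thresh become 0 (background),
    everything else becomes maxval (part of a photo).
    """
    return np.where(gray > thresh, 0, maxval).astype(np.uint8)


# Hypothetical row of a scan: white lid, light photo area, dark photo area.
row = np.array([[255, 250, 200, 30, 245]], dtype=np.uint8)
mask = binary_inv_threshold(row)
print(mask.tolist())  # [[0, 0, 255, 255, 0]]
```

With a high fixed threshold, even the light photo pixel (200) is still counted as part of the photo, which an automatically chosen threshold might not guarantee.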

Needed packages:

  • pip install opencv-python PyMuPDF numpy
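
The pip package names differ from the names used in the import statements: opencv-python is imported as cv2 and PyMuPDF as fitz. The small check below is only a convenience sketch (the function name missing_packages is hypothetical) that reports which of the three packages still need installing.

```python
from importlib.util import find_spec

# import name -> pip package name
REQUIRED = {"cv2": "opencv-python", "fitz": "PyMuPDF", "numpy": "numpy"}


def missing_packages(required=REQUIRED):
    """Return the pip names of required modules that cannot be imported."""
    return [pip_name for module, pip_name in required.items()
            if find_spec(module) is None]


missing = missing_packages()
if missing:
    print("Please run: pip install " + " ".join(missing))
else:
    print("All required packages are installed.")
```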

Showcase (these are Munchkin 3 cards, blurred due to copyright)

Before (original scan)

image

After (extracted images)

image

import cv2 as cv
import numpy as np
import fitz
import os
from datetime import datetime
from pathlib import Path

white_cards = True  # DO NOT CLOSE THE LID WHEN SCANNING WHITE CARDS!


def create_mask(img):
    """Return a binary mask in which every card/photo is a filled white shape."""
    gray = cv.cvtColor(img, cv.COLOR_BGR2GRAY)
    if white_cards:
        gray = cv.bitwise_not(gray)
    # With THRESH_OTSU the threshold is chosen automatically; the 250 is ignored.
    _, threshold = cv.threshold(gray, 250, 255, cv.THRESH_BINARY_INV + cv.THRESH_OTSU)
    contours, _ = cv.findContours(threshold, cv.RETR_CCOMP, cv.CHAIN_APPROX_SIMPLE)
    for cnt in contours:
        cv.drawContours(threshold, [cnt], 0, 255, -1)  # fill each shape completely
    threshold = cv.bitwise_not(threshold)
    kernel = cv.getStructuringElement(cv.MORPH_RECT, (5, 5))
    mask = cv.bitwise_not(cv.morphologyEx(threshold, cv.MORPH_CLOSE, kernel, iterations=3))
    return mask


def trim_border(array):
    """Return True if this edge row/column is background and should be trimmed."""
    return np.mean(array) > 127.5 if not white_cards else np.mean(array) < 127.5


def detect_and_cut(mask, img):
    """Cut every shape found in mask out of img, deskew it, and trim its borders."""
    contours, _ = cv.findContours(mask, cv.RETR_LIST, cv.CHAIN_APPROX_SIMPLE)
    img_return = []  # list of processed images
    fill = (0,) * 3 if white_cards else (255,) * 3  # background colour
    for cnt in contours:
        # get boundingRect (w/o rotation)
        x, y, w, h = cv.boundingRect(cnt)
        # get minAreaRect (w/ rotation)
        rect = cv.minAreaRect(cnt)
        angle = rect[2]
        if angle > 45:
            angle -= 90
        image = img[y:y + h, x:x + w]  # crop image by boundingRect
        image_center = tuple(np.array(image.shape[1::-1]) / 2)  # get center point of img
        rot_mat = cv.getRotationMatrix2D(image_center, angle, 1.0)  # get rotation matrix from angle
        result = cv.warpAffine(image, rot_mat, image.shape[1::-1], flags=cv.INTER_LINEAR)  # warp by rotation matrix
        # fill corners after rotation
        height, width, _ = result.shape
        for point in [(0, 0), (width - 1, 0), (0, height - 1), (width - 1, height - 1)]:
            for _ in range(2):
                cv.floodFill(result, None, seedPoint=point, newVal=fill,
                             loDiff=(0,) * 4, upDiff=(10,) * 4)
        # final cropping: keep only the largest shape found after rotation
        mask2 = create_mask(result)
        contours2, _ = cv.findContours(mask2, cv.RETR_LIST, cv.CHAIN_APPROX_SIMPLE)
        if contours2:
            x2, y2, w2, h2 = cv.boundingRect(max(contours2, key=cv.contourArea))  # get final boundingRect
            result = result[y2:y2 + h2, x2:x2 + w2]
        # trim leftover background rows/columns, one pixel at a time
        untrimmed = True
        while untrimmed and result.size:
            untrimmed = False
            if result.size and trim_border(result[:1]):
                result = result[1:, :]
                untrimmed = True
            if result.size and trim_border(result[-1:]):
                result = result[:-1]
                untrimmed = True
            if result.size and trim_border(result[:, :1]):
                result = result[:, 1:]
                untrimmed = True
            if result.size and trim_border(result[:, -1:]):
                result = result[:, :-1]
                untrimmed = True
        if result.size:  # skip shapes that were trimmed away entirely
            img_return.append(result)
    return img_return


file = "filename.pdf"  # name of the scanned multi-page PDF
# open the file
pdf_file = fitz.Document(file)
image_count = 0
folder_name = datetime.now().strftime("extracted %m-%d-%y %H-%M-%S")
for page_index in range(len(pdf_file)):
    # get the page itself
    page = pdf_file[page_index]
    for img_info in page.get_images():
        # get the XREF of the image
        xref = img_info[0]
        # extract the image bytes
        base_image = pdf_file.extract_image(xref)
        image_bytes = base_image["image"]
        nparr = np.frombuffer(image_bytes, np.uint8)
        img = cv.imdecode(nparr, cv.IMREAD_COLOR)
        if img is None:  # image format that OpenCV could not decode
            continue
        mask = create_mask(img)  # create a mask
        images = detect_and_cut(mask, img)  # frame images
        for card in images:
            Path(folder_name).mkdir(exist_ok=True)
            # build the path with pathlib so it works on every OS, not only Windows
            cv.imwrite(str(Path(folder_name) / f"{image_count}.jpg"), card)
            image_count += 1
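
To see what the border-trimming loop at the end of detect_and_cut does in isolation, here is a minimal NumPy-only sketch that runs the same idea on a tiny synthetic image. The names is_background and trim_edges are hypothetical, and white_cards is passed as a parameter here rather than read from the global flag, purely to keep the example self-contained.

```python
import numpy as np


def is_background(strip, white_cards):
    """True if the strip is mostly dark (lid open) or mostly light (lid closed)."""
    mean = np.mean(strip)
    return mean < 127.5 if white_cards else mean > 127.5


def trim_edges(img, white_cards=False):
    """Peel background rows/columns off the four edges, one pixel at a time."""
    changed = True
    while changed and img.size:
        changed = False
        if img.size and is_background(img[:1], white_cards):      # top row
            img, changed = img[1:], True
        if img.size and is_background(img[-1:], white_cards):     # bottom row
            img, changed = img[:-1], True
        if img.size and is_background(img[:, :1], white_cards):   # left column
            img, changed = img[:, 1:], True
        if img.size and is_background(img[:, -1:], white_cards):  # right column
            img, changed = img[:, :-1], True
    return img


# 5x5 white "scan" with a 3x3 dark card in the middle.
scan = np.full((5, 5), 255, dtype=np.uint8)
scan[1:4, 1:4] = 0
card = trim_edges(scan)
print(card.shape)  # (3, 3)
```

An image that consists of background only is trimmed down to an empty array, which is why it is worth checking the size before saving a result.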
@Maneleki

Fixed now!! Thanks for this! :) <3

@BurzumBlacker

Hi! Newbie here...
The script runs and generates a folder but... it creates a single JPG from a PDF with 4 cards.
That image is also reversed and mirrored.
I'm using Python 3.11 from CMD.

Thanks in advance.

@ireun
Author

ireun commented Nov 15, 2024

Hi @BurzumBlacker, can you share that PDF with me? It's hard to tell what is happening without an example to test. Maybe message me on Discord? My username is ireun

@BurzumBlacker

BurzumBlacker commented Nov 15, 2024

Hi again.
The material is copyrighted... I'm trying to translate it into my language (the game was never translated and has only been available in English since it came out 13 years ago).
It's only a test with 4 cards.

Thank you!

PS: I have all the originals, but my little nephews don't understand English yet.

@ireun
Author

ireun commented Nov 15, 2024

Can you maybe blur it out and show me a blurred version of the before/after?

@BurzumBlacker

BurzumBlacker commented Nov 15, 2024

0
That is the output JPG.

@ireun
Author

ireun commented Nov 15, 2024

And what is the input?

@BurzumBlacker

BurzumBlacker commented Nov 15, 2024

It seems that the chat doesn't support PDFs.

@ireun
Author

ireun commented Nov 15, 2024

You can probably create a secret gist here and share the link with me, or contact me via Discord :)
