@wu-lee
Last active January 27, 2026 18:39
Importing content into Ghost CMS

This conversation was generated with Duck.ai (https://duck.ai) using OpenAI's GPT-4o mini Model. AI chats may display inaccurate or offensive information (see https://duckduckgo.com/duckai/privacy-terms for more info).

====================

User prompt 1 of 11: how to import media to ghost

Assist: To import media to Ghost, you can use the built-in importer found in the Settings → Advanced → Import/Export area of Ghost Admin. You can upload images and content by adding them to a zip file, ensuring they follow the correct file path structure for successful import.


User prompt 2 of 11 - 25/01/2026, 13:45:36: What is the file structure required?

GPT-4o mini: To import media into Ghost effectively, you need to follow a specific file structure when creating your ZIP file. Here's a breakdown of the required structure:

Required File Structure

  1. Root Folder

    • This should be the main folder that you zip.
  2. Subfolders

    • Create subfolders based on the content type. For example:
      • /images: Store all your image files here.
      • /videos: Store any video files here (if applicable).
      • /audio: For audio files, if needed.
  3. File Naming

    • Each media file should have a unique name to avoid conflicts. Use simple, descriptive names without special characters.

Example Structure

/MyMediaImport
    /images
        - image1.jpg
        - image2.png
    /videos
        - video1.mp4
    /audio
        - audio1.mp3

Additional Tips

  • File Types: Ensure you are using supported file types (e.g., JPG, PNG for images; MP4 for videos).
  • Compression: Once your folders and files are organized, compress the root folder into a ZIP file for upload.
  • Paths: Maintain the file path structure to ensure Ghost can locate your files during the import process.

Following this structure will help ensure a smooth import process into Ghost.


User prompt 3 of 11 - 25/01/2026, 13:46:27: What about the post and page content? Where does that go?

GPT-4o mini: When preparing to import post and page content into Ghost along with your media, you'll need to include specific structures for your content as well. Here's how to organize that:

Content Structure for Importing Posts and Pages

  1. Root Folder

    • This is the main folder containing everything you wish to import.
  2. Subfolders

    • Similar to media, create a subfolder for your content:
      • /content: This will hold your Markdown or JSON files representing your posts and pages.
  3. Content Files

    • Each post or page should be saved as a separate Markdown (.md) or JSON (.json) file.
    • Use descriptive filenames, possibly including the date for easier identification.

Example Structure

/MyMediaImport
    /images
        - image1.jpg
        - image2.png
    /videos
        - video1.mp4
    /audio
        - audio1.mp3
    /content
        - post1.md
        - post2.md
        - page1.md

Markdown File Structure

  • Each Markdown file should start with a YAML front matter block that contains metadata for posts and pages, such as:
---
title: "Your Post Title"
date: "2026-01-25"
tags: ["tag1", "tag2"]
status: "published"  # or "draft"
---
Your post content goes here.

JSON Format

If using JSON, the general structure might resemble:

{
  "posts": [
    {
      "title": "Your Post Title",
      "slug": "your-post-title",
      "date": "2026-01-25",
      "tags": ["tag1", "tag2"],
      "content": "Your post content goes here.",
      "status": "published" // or "draft"
    }
  ]
}

Summary

Ensure your files are organized within these structured folders to facilitate a seamless import process for both media and content into Ghost. By adhering to this organization, Ghost can accurately process and recognize your posts, pages, and accompanying media.


User prompt 4 of 11 - 25/01/2026, 13:47:42: The documents for ghost refer to a specific JSON file schema for imports, and don't mention importing posts as markdown files. is this definitely possible?

GPT-4o mini: You are correct; Ghost primarily supports the import of content through a specific JSON schema, not Markdown files. The JSON format allows for a structured and organized approach to importing posts and pages, ensuring compatibility with Ghost's framework.

Correct Import Structure for Ghost

Main File Structure

  1. Root Folder

    • Create a main folder to contain your import files.
  2. Content Folder

    • Inside this root folder, you should have a /content folder containing the JSON file.

JSON File Schema

The JSON file must adhere to Ghost's specific schema for posts and pages. Here’s a basic outline of the expected format:

Example JSON Structure

{
  "posts": [
    {
      "title": "Your Post Title",
      "slug": "your-post-title",
      "markdown": "Your post content goes here.",
      "status": "published",
      "published_at": "2026-01-25T00:00:00.000Z",
      "tags": [{
        "name": "tag1"
      }]
    }
  ],
  "pages": [
    {
      "title": "Your Page Title",
      "slug": "your-page-title",
      "markdown": "Your page content goes here.",
      "status": "published",
      "published_at": "2026-01-25T00:00:00.000Z"
    }
  ]
}

Additional Notes

  • Markdown Field: The content of posts and pages should be included in the markdown field.
  • Content Types: Ensure you distinguish between posts and pages in your JSON file as shown above.
  • Media References: If images or media are included within your post content, ensure that those files are either uploaded separately or linked correctly in the content.

Conclusion

To import content into Ghost, use the JSON file following the specified schema. Markdown files are not supported for direct import. Structuring your content as per Ghost's guidelines will facilitate a successful import process.


User prompt 5 of 11 - 25/01/2026, 13:48:48: So can the JSON schema file and the media files be imported in a single step using this zip archive method?

GPT-4o mini: Yes, you can import both the JSON schema file for your posts and pages, along with your media files, in a single step using a ZIP archive. Here’s how to do it:

Steps for Creating the ZIP Archive

  1. Create the Main Folder

    • This will be the root folder for your import, which you will later compress into a ZIP file.
  2. Organize Your Content

    • Inside this main folder, you will have the following structure:
      • /content: This should contain your import.json file, which follows the required JSON schema.
      • /images (or /media): Store any images or media files you want to include.

Example Folder Structure

/MyMediaImport
    /content
        - import.json
    /images
        - image1.jpg
        - image2.png

Importing Process

  1. Create your import.json file in the /content folder.
  2. Place all related media files in the designated /images (or /media) folder.
  3. Compress the main folder (MyMediaImport) into a ZIP file.

Import Steps in Ghost

  • Navigate to the Settings in Ghost Admin.
  • Go to Advanced, and select Import/Export.
  • Upload your ZIP file containing the content folder with the import.json and your media files.

Conclusion

This method allows you to import both your content structured in JSON and any associated media files in one step, streamlining the process and ensuring that everything is properly organized and accessible within Ghost.


User prompt 6 of 11 - 25/01/2026, 13:49:32: Where post content references an image, how should that be represented in the JSON?

GPT-4o mini: When including image references in your JSON for Ghost, you need to ensure that the images are correctly linked in the post content. Here’s how to represent images in the JSON file:

Referencing Images in JSON

  1. Upload Media: Ensure that the images are uploaded and available in the Ghost Admin or included in your ZIP under the /images folder.

  2. Markdown Syntax: Use Markdown syntax to reference the images directly in the markdown field of your post or page. The link should point to the correct URL where the image is stored.

Example JSON Structure

Here’s how you can structure a post with an image reference:

{
  "posts": [
    {
      "title": "Your Post Title",
      "slug": "your-post-title",
      "markdown": "Here is an image of a beautiful landscape:\n![Beautiful Landscape](/images/landscape.jpg)",
      "status": "published",
      "published_at": "2026-01-25T00:00:00.000Z",
      "tags": [{
        "name": "Nature"
      }]
    }
  ]
}

Key Points

  • Markdown Syntax: Use the ![Alt Text](Image URL) format to include images in the content.
  • Relative Path: If you are importing the images as part of the ZIP file, the path (e.g., /images/landscape.jpg) should correctly point to the location within the Ghost installation where images can be accessed.
  • Alt Text: Always include descriptive alt text for better accessibility and SEO.

Conclusion

By using the Markdown syntax, you can seamlessly include image references in your JSON content for Ghost, ensuring they are displayed correctly in your posts. Make sure the paths are correct to enable the images to render properly on the front end.


User prompt 7 of 11 - 25/01/2026, 13:51:55: So just to confirm, the URLs used in ghost content export JSON files should not be used here? Specifically, they look like this: __GHOST_URL__/content/images/2026/01/a-picture.png

GPT-4o mini: Yes, you are correct. When creating the import JSON file for Ghost, you should NOT use the URLs formatted as __GHOST_URL__/content/images/2026/01/a-picture.png.

Proper URL Usage

  1. Local Reference vs. Absolute URLs:

    • In your import JSON, it is more appropriate to use relative paths instead of the absolute URL format that might appear in exported content or public-facing URLs.
  2. Relative Path Workaround:

    • Reference the images as relative paths in the Markdown content:
      • For example: ![Alt Text](/content/images/2026/01/a-picture.png)
    • This assumes that, after the import, the images would be accessible under the Ghost installation's structure.

Example Structure

If your image is included in the ZIP under the /content/images folder, use:

{
  "posts": [
    {
      "title": "Sample Post",
      "slug": "sample-post",
      "markdown": "Check out this image:\n![Sample Image](/content/images/sample-image.jpg)",
      "status": "published",
      "published_at": "2026-01-25T00:00:00.000Z"
    }
  ]
}

Conclusion

In summary, use relative paths for your image references in the import JSON. This approach ensures that Ghost can correctly interpret and render the images after the import process. Avoid using the __GHOST_URL__ placeholder or absolute URLs.


User prompt 8 of 11 - 25/01/2026, 13:53:04: The path you gave as an example in point two is not a relative path?

GPT-4o mini: You're right, and I appreciate your attention to detail. The example path I provided is indeed not a true relative path as it assumes a specific directory for the Ghost installation.

Clarification on URL Usage

Relative Path

  • A relative path should refer to the location of files in relation to the current directory without including the base URL or domain.
  • For example:
    • If the images are in a folder within the content directory of your Ghost setup after import, that folder structure might allow you to reference files simply by their name or using paths that don't specify the entire URL structure.

Example of Correct Path Usage

If your images are intended to be uploaded to /content/images/, you would typically reference them just by:

![Sample Image](images/sample-image.jpg)

JSON Example

Here’s how you should represent it in your JSON:

{
  "posts": [
    {
      "title": "Sample Post",
      "slug": "sample-post",
      "markdown": "Check out this image:\n![Sample Image](images/sample-image.jpg)",
      "status": "published",
      "published_at": "2026-01-25T00:00:00.000Z"
    }
  ]
}

Conclusion

Using a path like images/sample-image.jpg assumes that the images directory is accessible relative to the installation's document root or the current working directory of the import. I appreciate your patience and clarification on this topic!


User prompt 9 of 11 - 25/01/2026, 13:53:55: Also, is it possible to import markdown via a "markdown" field as in this example? I've not seen this in the documentation

GPT-4o mini: You’re correct; the Ghost import functionality primarily utilizes JSON for content and does not support importing posts via a "markdown" field in the JSON schema.

Supported Import Formats

Key Points

  1. JSON Schema: Ghost requires all post and page content to be formatted according to a specific JSON schema. Each post’s content must be included in the html or mobiledoc field.

  2. Markdown vs. HTML: While you can write your content in Markdown, when importing, it should ideally be converted to HTML format for compatibility. Ghost does not have a designated markdown field for imports in the JSON schema.

Example Structure

Here’s how to structure a post with content in HTML format in your JSON:

{
  "posts": [
    {
      "title": "Sample Post",
      "slug": "sample-post",
      "html": "<p>Check out this image:</p><img src=\"/content/images/sample-image.jpg\" alt=\"Sample Image\" />",
      "status": "published",
      "published_at": "2026-01-25T00:00:00.000Z"
    }
  ]
}

Conclusion

In summary, for importing posts into Ghost, use the html or mobiledoc field in the JSON schema to include your content. The use of a markdown field is not supported in the import process, ensuring that all content is properly formatted and rendered. Thank you for checking, and I hope this clears up any confusion!


User prompt 10 of 11 - 25/01/2026, 13:54:26: Is it possible to use the "lexical" field to import content instead of "html"?

GPT-4o mini: Yes, you can use the "lexical" field in the JSON schema to import content into Ghost, which allows for a more advanced content structure. The lexical format is a part of Ghost's efforts to modernize content handling and supports richer text editing.

Using the Lexical Field

Key Points

  1. Field Availability: The lexical field is designed to accommodate structured content created with Ghost's new rich text editor. This format captures more than just plain text, allowing for inline elements, formatting, and media.

  2. Content Format: When using the lexical field, content is represented in a manner tailored for Ghost's rich text capabilities, making it easier to manipulate and render complex content.

Example Structure

Here’s how you can structure a post using the lexical field in your JSON:

{
  "posts": [
    {
      "title": "Sample Post",
      "slug": "sample-post",
      "lexical": {
        "root": {
          "children": [
            {
              "type": "paragraph",
              "children": [
                {
                  "text": "Check out this image:"
                }
              ]
            },
            {
              "type": "image",
              "src": "/content/images/sample-image.jpg",
              "alt": "Sample Image"
            }
          ]
        }
      },
      "status": "published",
      "published_at": "2026-01-25T00:00:00.000Z"
    }
  ]
}

Conclusion

Using the lexical field is a valid approach for importing content into Ghost, especially if you want to take advantage of richer text formatting and structure. It allows for more flexibility compared to the html field, making it suitable for modern content requirements.


User prompt 11 of 11 - 25/01/2026, 13:55:14: Have you got a reference for this?

GPT-4o mini: You can reference the following sources for detailed information regarding the use of the lexical field in Ghost's JSON imports:

  1. Importing Content Documentation:

    • Ghost provides documentation on how to import content, including the structure of the JSON file. You can access the official guide here.
  2. Developer Migration Docs:

    • This section outlines the JSON structure you should use, including various fields such as html and lexical. The information is available in the Ghost Developer Documentation.
  3. Walking through Example Imports:

    • A user blog post that details various import scenarios, including working with the HTML or lexical formats, can be seen in this article: Importing posts into Ghost.

These resources should help clarify how to properly use the lexical field for content imports into Ghost. If you need more specifics, feel free to ask!

Copilot Chat Conversation Export: Importing content into Ghost CMS

User: @wu-lee Thread URL: https://github.com/copilot/c/445591fc-22a2-4657-8633-38a63b6ce706

@wu-lee

Can you tell me the details of the zip file format required to import posts, pages and image media into this project (Ghost). Please include details of:

  • the zip archive layout
  • the format required for images, and whether they will be optimised post-import, or if I need to do that myself
  • the location and format of the json file defining the content, including
    • how to use the "lexical" attribute to define images and image galleries in a post,
    • whether and how markdown can be supplied instead of the html attribute,
    • how to reference images included in the zipfile from post content, whether in images or gallery components

Copilot

Short answer up front

  • Ghost accepts a ZIP that contains one data file (JSON export or Markdown files) plus optional content folders (images, media, files). The importer looks for data (.json/.md/.markdown/.csv etc.) and for content files under directories like content/images, images, content/media, media, content/files, files and maps paths inside your post HTML/markdown/lexical to the stored file URLs.
  • Images must match your Ghost image upload extensions (configured via uploads.images.extensions). Ghost will store the raw files and — if the image transform tooling (sharp / @tryghost/image-transform) is installed and your images storage adapter implements saveRaw — Ghost can generate resized/optimized variants on demand. If you require pre-optimized files in a particular set of sizes/formats, optimize before zipping.

Note about the source of this answer

  1. ZIP archive layout (what Ghost expects)
  • Top-level: Ghost will accept either:
    • a single base directory inside the zip (e.g. my-export/...), or
    • files at the root of the zip. The importer detects and normalizes a single base directory if present.
  • Required/expected pieces:
    • A data file containing post/page/site export information:
      • JSON (.json) — the standard Ghost export format (see JSON section below).
      • OR Markdown files (.md/.markdown) — the Markdown importer will read each file and create posts from them.
      • OR CSV (for special handlers like Revue) — imported by specific handlers.
    • Optional content folders (any of these common layouts are supported):
      • content/images/... or images/... — image files
      • content/media/... or media/... — video/audio (media) files
      • content/files/... or files/... — generic files / attachments
    • The importer has handlers that look for files by extension and by these directory names. Files in those folders are picked up and processed by the corresponding handler (images, media, files).
  • Root vs nested: If your zip has one top-level directory (e.g. my-export/...), the importer will treat that as the base and look inside it. If no single base directory exists it will scan the archive directly.
  2. JSON / data file: location and format
  • Where to put it:
    • Place the JSON export file at the root of the archive (or inside the single base directory). The importer’s JSON handler will pick up the first .json it finds matching the handler patterns.
  • Allowed JSON shapes:
    • Two supported shapes:
      1. The Ghost JSON export shape directly (an object representing the exported DB/data). Example shape: { data: { posts: [...], users: [...], tags: [...], ... }, meta: { ... } } (older/newer variants exist).
      2. The JSON-API wrapper form { "db": [ exportedData ] } — the JSON handler accepts this and will take the first element of db[].
    • The JSON handler will parse the file and return the imported object to the importer which then expects .data.posts, .data.tags, .data.users, etc.
  • What a post object may contain (fields the importer looks at):
    • id, title, slug, feature_image, markdown, html, mobiledoc, lexical, authors, status, published_at, etc.
    • Important precedence:
      • If a post has lexical content set, the importer prefers lexical.
      • Otherwise, if mobiledoc is present it is used.
      • Otherwise, when html is present and lexical is not, the importer converts html -> mobiledoc.
      • Markdown: if posts are supplied as markdown (either as post.markdown in the JSON or as .md files in the ZIP) the Markdown importer will feed that through the import pipeline (post.markdown is used and content replacements are applied).
  • Example minimal JSON post (JSON export)
{
  "data": {
    "posts": [
      {
        "id": "abcd",
        "title": "My post",
        "slug": "my-post",
        "feature_image": "content/images/2020/01/feature.jpg",
        "html": "<p>Hello <img src=\"content/images/2020/01/inline.jpg\"></p>",
        "mobiledoc": null,
        "lexical": null
      }
    ]
  }
}
  • Example of JSON with lexical (string or object):
{
  "data": {
    "posts": [
      {
        "title": "Lexical post",
        "slug": "lexical-post",
        "lexical": {
          "namespace": "KoenigEditor",
          "nodes": [
            // Koenig Lexical nodes here (object form)...
          ]
        }
      }
    ]
  }
}

Notes:

  • The importer’s JSON handler will throw if JSON is malformed or not in the expected shapes.
  • You can also add posts via .md files (one file per post) — MarkdownHandler will read them and return posts with a post.markdown property.
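The two accepted JSON shapes can be normalised with very little code. The following is a minimal sketch in Python (not Ghost's own JavaScript handler), assuming only what is stated above: a direct export carries a top-level data object, and the wrapper form carries it inside the first element of db. The function name load_export and the error message are hypothetical.

```python
import json


def load_export(text):
    """Parse an export file and return the object that carries 'data'."""
    parsed = json.loads(text)  # raises json.JSONDecodeError on malformed input
    # Unwrap the { "db": [ exportedData ] } form by taking the first element.
    if isinstance(parsed, dict) and isinstance(parsed.get("db"), list) and parsed["db"]:
        parsed = parsed["db"][0]
    if not isinstance(parsed, dict) or "data" not in parsed:
        raise ValueError("expected an export with a top-level 'data' object")
    return parsed


direct = load_export('{"data": {"posts": [{"title": "My post"}]}}')
wrapped = load_export('{"db": [{"data": {"posts": [{"title": "My post"}]}}]}')
```

Both calls yield the same object, so the rest of a pre-import check can read data.posts without caring which shape the file used.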
  3. How to include and reference images in the ZIP so they are linked into posts/pages
  • Where to put images in the ZIP:
    • Put them under content/images/... (or images/...), or keep the same relative paths used by your exported HTML/markdown. The image handler accepts both images and content as search directories, and strips Ghost storage folder prefixes when computing target paths.
  • How references are resolved:
    • The importer builds a mapping for every included content file: it records the file.originalPath (path inside the zip) and computes a file.newPath (the URL path where Ghost will store / serve it, e.g. /content/images/2023/01/your-file.jpg).
    • For posts that contain HTML or Markdown, Ghost runs a pre-processing step: it replaces occurrences of the original path (with or without a leading slash) with the new stored path. It applies this replacement to:
      • post.markdown (if present)
      • post.html (if present)
      • post.feature_image
    • Example: if the zip contained content/images/2020/01/inline.jpg, and the image handler assigns it newPath "/content/images/2020/01/inline-unique.jpg", any HTML or markdown that referenced "content/images/2020/01/inline.jpg" will be rewritten to use the new path.
  • For content stored as mobiledoc or lexical:
    • mobiledoc: image and gallery cards in mobiledoc contain src fields. The importer will use the stored src value when rendering; tests and code show gallery/image cards use src entries like __GHOST_URL__/content/images/.... If your mobiledoc src values match the originalPath inside the zip, the ContentFileImporter mapping will be used to update paths (the importer normalizes and replaces paths).
    • lexical: the Koenig Lexical document is the preferred editor format. If you supply lexical content, it should include image node(s) whose src value references the same original path (e.g. "content/images/..."). Ghost will transform lexical image URLs to the final stored URLs during the import/serialization pipeline (the system has node-renderers and url-transformation code that resolves lexical node URLs to site/CDN URLs).
    • Practically: reference images in lexical/mobiledoc using the same relative paths you put into the ZIP (content/images/..., images/...), and Ghost will update those to the correct site/CDN paths during import.
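The path replacement described for post.markdown, post.html and post.feature_image can be sketched as follows. This is an illustrative Python approximation, not Ghost's code: the originalPath and newPath keys come from the description above, while the function name replace_paths and the dict-based post are assumptions. The optional leading slash is matched in the same pattern so that a reference written as /content/... does not end up with a doubled slash.

```python
import re


def replace_paths(post, files):
    """Rewrite references to each file's originalPath so they use its newPath."""
    for entry in files:
        # Match the original path with or without a leading slash.
        pattern = re.compile(r"/?" + re.escape(entry["originalPath"]))
        for field in ("markdown", "html", "feature_image"):
            if post.get(field):
                # A function replacement keeps newPath literal (no backslash handling).
                post[field] = pattern.sub(lambda _m, new=entry["newPath"]: new, post[field])
    return post


post = {
    "html": '<p>Hello <img src="content/images/2020/01/inline.jpg"></p>',
    "feature_image": "/content/images/2020/01/inline.jpg",
    "markdown": None,
}
files = [{
    "originalPath": "content/images/2020/01/inline.jpg",
    "newPath": "/content/images/2020/01/inline-unique.jpg",
}]
replace_paths(post, files)
```

After the call, both the inline img src and feature_image point at the new stored path; fields that are absent or empty are left untouched.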
  4. Using lexical to define images and galleries
  • What lexical is:
    • Lexical is the Koenig editor’s serialized format. It is stored as a JSON object (or a stringified JSON) in the post row as lexical.
    • Ghost accepts a lexical document in the export; when lexical is present, it will be used in preference to html -> mobiledoc conversion.
  • How to include images/galleries in lexical:
    • Provide lexical nodes that represent image and gallery nodes. The node(s) should include src properties pointing to the path present in your ZIP (for example "content/images/2020/01/inline.jpg"), and gallery nodes should include an images array with image entries containing src/fileName/width/height as required by the node renderer.
    • Example (conceptual, simplified lexical snippet):
{
  "namespace": "KoenigEditor",
  "nodes": [
    {
      "type": "image",
      "src": "content/images/2020/01/inline.jpg",
      "alt": "Inline image",
      "caption": "An inline image"
    },
    {
      "type": "gallery",
      "images": [
        {"src": "content/images/2020/01/gallery-1.jpg", "fileName": "gallery-1.jpg", "width": 1200, "height": 800},
        {"src": "content/images/2020/01/gallery-2.jpg", "fileName": "gallery-2.jpg", "width": 1200, "height": 800}
      ],
      "caption": "Gallery caption"
    }
  ]
}
  • Note: the exact Lexical node schema comes from the Koenig lexical node definitions (@tryghost/kg-default-nodes); the important part for import is that src values point to the original file path included in the zip. The importer / node renderers will convert those src values to the site or CDN URL when building the imported post.
  • Behavior on import:
    • If lexical is present as a JSON object (or as a stringified JSON), Ghost will accept it; if it’s a string Ghost will parse it during import. Lexical URLs are transformed by Ghost’s lexical rendering pipeline so images/galleries become correct site/CDN URLs in the imported content.
  5. Markdown vs HTML in the JSON export
  • You can supply content as:
    • post.markdown (either as a field in JSON posts[] or as separate .md files in the zip): MarkdownHandler will create posts with a markdown field; pre-processing will update image paths in markdown.
    • post.html — if post.html is present and lexical is not present, the importer will convert html -> mobiledoc (via html-to-mobiledoc) and import that.
    • post.mobiledoc — if provided, mobiledoc will be used directly (with Koenig image/gallery cards supported).
    • post.lexical — if provided, lexical is preferred and used directly.
  • Summary of precedence:
    • lexical (if present) → mobiledoc (if present) → html (if present, converted to mobiledoc) → markdown (if you supplied markdown files or post.markdown).
  • If you prefer to supply Markdown directly, either:
    • Put post.markdown fields inside the JSON for posts, or
    • Include .md / .markdown files in the zip and let the MarkdownHandler create posts.
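The precedence summarised above amounts to a first-match lookup over the four content fields. A minimal Python sketch, assuming a post is a plain dict and that a missing, null or empty field counts as "not present"; the function name pick_content_source is hypothetical and this is not how Ghost itself structures the check.

```python
def pick_content_source(post):
    """Return the name of the field the import would draw content from, or None."""
    # Order follows the stated precedence: lexical, mobiledoc, html, markdown.
    for field in ("lexical", "mobiledoc", "html", "markdown"):
        if post.get(field):
            return field
    return None


html_only = pick_content_source({"html": "<p>Hello</p>", "mobiledoc": None, "lexical": None})
both = pick_content_source({"html": "<p>Hello</p>", "lexical": '{"root": {}}'})
```

A check like this is useful before building an import file: if a post carries both html and lexical, edits made only to the html field would be ignored under the stated precedence.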
  6. Image formats and optimization details
  • Which formats: Ghost uses the configured upload image extensions and content types:
    • The importer’s ImageHandler looks up allowed extensions/content-types from configuration (uploads.images.extensions and uploads.images.contentTypes). Typical default extensions are common web image types (.jpg/.jpeg/.png/.gif/.svg/.webp) but you should check your Ghost configuration to be sure.
  • Will Ghost optimize images during import?
    • Not automatically in all setups. Ghost stores the raw image files via the configured storage adapter.
    • Image resizing/format conversion (creating optimized variants / srcsets) is performed on demand by the image transform pipeline (requires the image transform library, e.g. sharp, and the image-transform package) and requires that your image storage adapter implements saveRaw. The middleware that creates resized images reads the original file, resizes it and then saves the generated variant back to storage.
    • Consequence:
      • If your server has the image transform tooling installed and your storage supports saveRaw, Ghost will generate resized formats/srcsets when pages request them (or when the rendering pipeline populates image sizes). Those resized files may be saved back to storage.
      • If you need fully pre-optimized images in a particular set of sizes/formats immediately after import (so that no on-demand resize work happens on first page view), you should optimize/resize/convert them yourself before placing them in the ZIP.
  • If you rely on the on-demand pipeline: ensure sharp and @tryghost/image-transform are installed and your storage adapter supports saveRaw.
  7. Practical examples and tips
  • Typical simple ZIP layout:
    • my-export.zip
      • content/
        • images/
          • 2020/01/inline.jpg
          • 2020/01/feature.jpg
        • media/
          • videos/...
        • files/
          • docs/...
      • data.json (the Ghost JSON export, or put .md files instead)
      • posts/
        • my-post.md (optional - MarkdownHandler will import)
  • In your post content (HTML/markdown/mobiledoc/lexical) reference images with the same path you put into the zip (e.g. content/images/2020/01/inline.jpg). The importer will map that originalPath to the stored newPath (e.g. /content/images/2020/01/inline-unique.jpg), replacing references in post.html and post.markdown and also transforming lexical/mobiledoc srcs during import/render.
  • If you need CDN URLs: Ghost’s storage/url pipeline will rewrite stored image paths to CDN URLs (or site URLs) per your storage settings when rendering/serializing posts.
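The simple layout above (a data.json at the archive root plus files under content/images/) can be produced with Python's standard zipfile module. This is a sketch under the assumptions already stated in this answer: the data file uses the { "data": { "posts": [...] } } shape and images are referenced by their in-archive path. The function name build_import_zip and the placeholder image bytes are hypothetical.

```python
import io
import json
import zipfile


def build_import_zip(posts, images):
    """Return the bytes of a ZIP holding data.json plus the given image files.

    posts  -- list of post dicts to place under data.posts
    images -- mapping of archive path to file content (bytes)
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # Data file at the archive root.
        archive.writestr("data.json", json.dumps({"data": {"posts": posts}}, indent=2))
        for path, payload in images.items():
            archive.writestr(path, payload)
    return buffer.getvalue()


post = {
    "title": "My post",
    "slug": "my-post",
    # Reference the image by the same path it has inside the archive.
    "html": '<p>Hello <img src="content/images/2020/01/inline.jpg"></p>',
}
zip_bytes = build_import_zip(
    [post],
    {"content/images/2020/01/inline.jpg": b"<image bytes go here>"},
)
```

Writing zip_bytes to a file such as my-export.zip gives an archive to try against a test site first; whether a given Ghost version accepts it should be verified rather than assumed.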
  8. Known caveats / gotchas
  • Make sure the JSON is valid and either matches Ghost’s export shape (data.*) or the JSON-API wrapper { db: [ ... ] }.
  • If you include multiple data formats inside the same zip (e.g. JSON export plus a separate Markdown import) the ImportManager may reject the zip (the importer enforces single data format per zip in some cases).
  • Filenames with special characters or extremely long names can cause import failures — keep names simple (ASCII-safe) if possible.
  • Image URL replacement for lexical is handled by Ghost’s lexical rendering/transformation pipeline; tests in Ghost show lexical URLs are converted to site/CDN URLs during import/serialization, but if you have a very custom lexical node setup verify expected structure with Koenig/lexical node definitions.

Where to look in the code (if you want to verify / extend)

  • importer core: ghost/core/core/server/data/importer/import-manager.js
  • handlers:
    • JSON handler: ghost/core/core/server/data/importer/handlers/json.js
    • Markdown handler: ghost/core/core/server/data/importer/handlers/markdown.js
    • Image handler: ghost/core/core/server/data/importer/handlers/image.js
    • Generic content-file handler: ghost/core/core/server/data/importer/handlers/importer-content-file-handler.js
  • posts importer and lexical/mobiledoc conversion: ghost/core/core/server/data/importer/importers/data/posts-importer.js and ghost/core/core/server/lib/lexical.js
  • image transform / on-demand resizing: ghost/core/core/frontend/web/middleware/handle-image-sizes.js and code referencing @tryghost/image-transform

If you want, I can:

  • produce a minimal example ZIP file layout (list of files and JSON) you can use as a template,
  • or draft a sample JSON post with a lexical snippet for an inline image and a gallery (using canonical lexical node fields) so you can experiment with a dry-run import.

(Again: these details were extracted from the repository code; you can browse more code search results here: https://github.com/TryGhost/Ghost/search?q=import&type=code)

@wu-lee

How does the MarkdownHandler deal with front matter in markdown files?

Copilot

Short answer

  • The MarkdownHandler does not parse or strip YAML front matter. It treats the whole file as the post body, then looks for a featured-image (only if it appears immediately before a H1) and a first H1 to use as the post title. Any YAML front matter will remain in post.markdown (and can therefore interfere with title/featured-image detection).

What the code actually does (behaviour)

  • Normalises CRLFs to LF.
  • Looks for a featured image only if it appears immediately before a H1 using this regex:
    • featuredImageRegex = /^(!\[]\(([^)]*?)\)\s+)(?=#)/
    • That matches a markdown image token like ![](url) immediately followed by a heading start # (so ![](url)\n# Title).
    • If matched it removes that image token from the content and sets post.image = matchedUrl.
  • Looks for a first H1 as the title using:
    • titleRegex = /^#\s?([\w\W]*?)(?=\n)/
    • If found it strips that H1 from the content and sets post.title to the H1 text.
  • Removes leading blank lines and returns the remainder as post.markdown.
  • Files whose filename starts with deleted are ignored by the handler.
  • The handler returns posts with post.markdown (and possibly post.title and post.image) — it does not extract YAML keys (date, tags, custom fields, etc.) from front matter.
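
To make the steps above concrete, here is a rough Python re-statement of them. It is a sketch of the described behaviour, not Ghost's code: the function name and the dictionary keys are illustrative, and the two patterns are direct translations of the regexes quoted above.

```python
import re

# Python equivalents of the two regexes quoted above (no multiline flag).
FEATURED_IMAGE_RE = re.compile(r"^(!\[\]\(([^)]*?)\)\s+)(?=#)")
TITLE_RE = re.compile(r"^#\s?([\w\W]*?)(?=\n)")


def process_markdown(content):
    """Mimic the described steps: normalise newlines, pull out image and title."""
    post = {}
    content = content.replace("\r\n", "\n")

    # An image token is only taken when a heading follows it directly.
    match = FEATURED_IMAGE_RE.match(content)
    if match:
        post["image"] = match.group(2)
        content = content[match.end():]

    # The first H1 at the start of the remaining text becomes the title.
    match = TITLE_RE.match(content)
    if match:
        post["title"] = match.group(1)
        content = content[match.end():]

    # Whatever is left, minus leading blank lines, is the body.
    post["markdown"] = content.lstrip("\n")
    return post


print(process_markdown("![](/images/cover.jpg)\r\n# My title\r\n\r\nBody text\r\n"))
```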

Implications for front matter

  • If your markdown file starts with YAML front matter (---\nkey: value\n---), that front matter:
    • Is not interpreted by the MarkdownHandler (not converted to Ghost fields).
    • Remains part of post.markdown, so your post body will contain the YAML block unless you remove it beforehand.
    • Prevents the title and featured-image heuristics from working. Neither regex quoted above carries the m (multiline) flag, so ^ anchors at the very start of the file: with a front matter block in the way, an H1 that follows it is not picked up as the title, and an image placed before that H1 is not picked up as the featured image.
  • Metadata you commonly put in front matter (published date, tags, custom fields) will not be imported from front matter. To import that metadata, include it in the Ghost JSON export shape or pre-process your markdown into Ghost-compatible JSON before zipping.

Recommendation / options

  • Best options if you have markdown with front matter:
    • Strip front matter before importing (tools like front-matter, gray-matter, or a simple script) and:
      • put metadata into the Ghost JSON export format (preferred), or
      • map front-matter fields into filename conventions or into a JSON sidecar that the importer can consume.
    • Or convert the markdown folder into a Ghost JSON export (with posts[].markdown and metadata fields) and put that JSON in the ZIP.
  • If you only want the post body imported and you don’t need front-matter metadata, remove the front matter from the files (or let a pre-processing step strip it) so the MarkdownHandler imports clean markdown and the title/featured-image heuristics work as expected.
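
As an illustration of the "strip front matter first" option, here is a minimal sketch that separates a leading --- block from the body using only the standard library. It handles flat key: value lines only (no nested YAML, lists or quoting), and the function name is made up for this example; for real content a proper YAML/front-matter parser is the safer choice.

```python
import re

# A front matter block is a leading '---' line, some lines, and a closing '---' line.
FRONT_MATTER_RE = re.compile(r"\A---\n(.*?)\n---\n?", re.DOTALL)


def split_front_matter(text):
    """Return (metadata dict, body) for a markdown file with optional front matter."""
    text = text.replace("\r\n", "\n")
    match = FRONT_MATTER_RE.match(text)
    if not match:
        return {}, text

    meta = {}
    for line in match.group(1).splitlines():
        # Only simple 'key: value' lines are understood by this sketch.
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta, text[match.end():]


meta, body = split_front_matter("---\ntitle: Hello\ndate: 2020-01-02\n---\n# Hello\n\nBody\n")
print(meta)
print(body)
```

The metadata dictionary can then be mapped onto whichever fields you put in the Ghost JSON, while the cleaned body goes through the markdown route or into the JSON post.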

Where this logic lives in the repo

  • Markdown handler: ghost/core/core/server/data/importer/handlers/markdown.js

@wu-lee

Is the nested folder structure in the images/ folder (e.g. 2020/01/...) actually required?

Copilot

Short answer

  • No — nested year/month folders (e.g. 2020/01/...) are not strictly required by the importer. The importer will accept images in any subfolder under images/ or content/images/ (including files directly in images/), and it will preserve the relative path you provide (or compute a target path for you).
  • That said, it’s recommended to mirror the site’s content/images structure (year/month) because it matches Ghost’s usual URL layout, reduces filename collisions and avoids having to re-reference images after import.

Details and behaviour (what the code does)

  • Where the importer looks:
    • The ImageHandler searches for files under directories named "images" or "content" inside the ZIP (see handlers/image.js). It will normalize the path relative to the base dir and the storage static prefix.
  • How the importer stores files:
    • For each found image the importer computes:
      • file.originalPath — the path as found in the zip (relative to baseDir)
      • file.name / file.targetDir — a normalized name and target directory derived from the zip path
    • It calls the storage adapter’s getUniqueFileName(targetDir) to compute a final filesystem/storage path (so Ghost will avoid collisions and may rename files if needed).
    • It computes file.newPath as the final public URL (joined with the storage adapter staticFileURLPrefix), and returns this mapping to the import pipeline.
  • Path replacement:
    • The ContentFileImporter (used by images/media/files) runs a pre-process step that replaces occurrences of file.originalPath in post.markdown and post.html with file.newPath. That replacement is a plain regex match of the originalPath (it allows an optional leading slash).
    • Therefore, for automatic replacement to work your post content must reference the same path/string that you included in the ZIP (for example content/images/2020/01/foo.jpg or images/foo.jpg). If the content references a different path, the importer will not replace it.
  • Root vs nested:
    • ImageHandler does not set ignoreRootFolderFiles, so images placed directly in the images/ folder (or directly under the base directory if named accordingly) are accepted.
    • Note: the media and files handlers in ImportManager are created with ignoreRootFolderFiles: true — those handlers will ignore files that are located in the archive root (they require a folder path). This does not apply to images.
  • Optimization / URL concerns:
    • Because getUniqueFileName may alter filenames and storage may map paths to CDN/site URLs, keeping the same folder layout that your content references avoids surprises and extra path rewriting.
    • Ghost’s image transform / srcset generation works from the stored original path; using the conventional date-based layout avoids unexpected path differences in themes/tests that assume year/month structure.
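
The path replacement step described above can be pictured roughly like this. It is a sketch of the idea (match originalPath with an optional leading slash, substitute newPath), not the importer's actual code; the function name is illustrative, and escaping the path before building the pattern is my own precaution.

```python
import re


def rewrite_paths(content, files):
    """Replace each file's originalPath (optionally prefixed by '/') with its newPath."""
    for file in files:
        pattern = re.compile("/?" + re.escape(file["originalPath"]))
        # A function is used as the replacement so newPath is inserted literally.
        content = pattern.sub(lambda _match, new=file["newPath"]: new, content)
    return content


files = [{
    "originalPath": "content/images/2020/01/foo.jpg",
    "newPath": "/content/images/2020/01/foo-1.jpg",
}]
markdown = '![](content/images/2020/01/foo.jpg) <img src="/content/images/2020/01/foo.jpg">'
print(rewrite_paths(markdown, files))
```

Note how a reference that does not contain the exact originalPath string is simply left alone, which is why the paths in the ZIP and in the content have to agree.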

Practical recommendations

  • If your post HTML/markdown/lexical references content/images/YYYY/MM/filename.jpg then include the same nested path in the zip so the importer can match originalPath -> newPath and rewrite references automatically.
  • If you don’t want nested folders, you may:
    • Put images in images/ (flat) and ensure post content references images/filename.jpg (with or without a leading slash), or
    • Put them at any path but ensure the references in your content match that path.
  • Prefer mirroring Ghost’s content/images/YYYY/MM structure for compatibility and to avoid filename collisions.
  • If you need predictable final URLs, generate the JSON export with the correct feature_image/src values rather than relying on ad-hoc paths in markdown.
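
Since everything hinges on the references matching the paths in the ZIP, a quick pre-flight check can catch mismatches before you import. The sketch below is a hypothetical helper, not part of Ghost: it scans post text for anything that looks like an images/ or content/images/ path and reports those not present in the archive listing. The pattern is deliberately crude and will also pick up absolute URLs that happen to contain /images/.

```python
import re

# Anything that looks like [content/]images/<something>, up to whitespace or a delimiter.
IMAGE_REF_RE = re.compile(r"""/?((?:content/)?images/[^\s"'()<>]+)""")


def missing_images(content, zip_paths):
    """Return referenced image paths that are absent from the list of paths in the ZIP."""
    referenced = {match.group(1) for match in IMAGE_REF_RE.finditer(content)}
    return sorted(referenced - set(zip_paths))


content = '![](content/images/2020/01/a.jpg) <img src="/images/b.png">'
print(missing_images(content, ["content/images/2020/01/a.jpg"]))
```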

Where this is implemented (for reference)

  • Image handling & path mapping: ghost/core/core/server/data/importer/handlers/image.js
  • Generic content-file normalization and getUniqueFileName behavior: ghost/core/core/server/data/importer/handlers/importer-content-file-handler.js
  • Import manager setup (which handlers set ignoreRootFolderFiles): ghost/core/core/server/data/importer/import-manager.js

@wu-lee

What does the importer do with class or style attributes in imported HTML attributes? More generally, what aspects of the HTML are retained, and what is discarded?

Copilot

Short answer

  • The importer does not blindly keep arbitrary HTML attributes like class or style. When you import HTML (via the JSON html field or via HTML in markdown), Ghost runs the HTML through its HTML→Mobiledoc and HTML→Lexical converters and then re-renders via Ghost’s Mobiledoc/Lexical renderers. Those converters/renderers normalise markup to the editor’s supported node/card model and emit a standard, predictable HTML output. Arbitrary inline class and style attributes are not preserved except where the editor/card model explicitly supports and stores equivalent properties (e.g. image caption/alt/src, image width/height, card-specific payloads).
  • In short: structural/semantic content (headings, paragraphs, lists, links, bold/italic, code blocks, images, galleries, files, video/audio cards, captions, alt text, src, width/height where detected) is retained; arbitrary attributes, presentational classes/styles and unknown element attributes are generally discarded.

Why (what the code does)

  • Import path:
    • When importing a post with html (or when the importer converts markdown/html to a source format), the code calls:
      • HTML→Mobiledoc: mobiledoc.htmlToMobiledocConverter(html)
      • HTML→Lexical: lexical.htmlToLexicalConverter(html)
      • (See: ghost/core/core/server/api/endpoints/utils/serializers/input/posts.js and pages.js where these converters are used.)
  • Conversion + rendering pipeline:
    • The converters (external packages required by Ghost) parse the HTML and produce a document in the editor model (Mobiledoc cards/sections or Lexical node tree). These models only store the semantic pieces and the card/node payloads they know about (image card payloads, gallery card payloads, file/video card payloads, link hrefs, inline formatting, etc.). Arbitrary class/style attributes are not mapped to model fields and are therefore lost in conversion.
    • After conversion the importer will re-render the canonical HTML from the stored model (mobiledocLib.render or lexical render path). The renderer injects the classes and attributes it expects (for example the image renderer emits class="kg-image" and gallery markup). Those are Ghost’s canonical classes, not the original arbitrary classes/styles from the imported HTML.
  • Relevant code locations you can inspect:
    • HTML→Mobiledoc converter usage: ghost/core/core/server/lib/mobiledoc.js (requires @tryghost/html-to-mobiledoc)
    • HTML→Lexical converter usage: ghost/core/core/server/lib/lexical.js (requires @tryghost/kg-html-to-lexical)
    • Converters are invoked in the input serializers for posts/pages: ghost/core/core/server/api/endpoints/utils/serializers/input/posts.js and pages.js
    • Mobiledoc / lexical rendering output is produced by kg-mobiledoc-html-renderer / kg-lexical-html-renderer (used by mobiledocLib and lexical.render).

What HTML aspects are retained (typical)

  • Retained (converted to editor model / preserved in cards/nodes):
    • Structural elements: headings (h1..h6), paragraphs, ordered/unordered lists, blockquotes, pre/code blocks.
    • Inline formatting: strong/bold, em/italic, links (href), inline code.
    • Images: src, alt, caption, sometimes width/height if the converter extracts them into the image card payload.
    • Galleries: recognized as gallery cards with image entries (src/fileName/width/height/caption) when convertible.
    • Files/media: file/video/audio nodes/cards with src, fileName, captions and thumbnails where the converter recognizes them.
    • Captions and other card-specific payload properties if the converter maps them.
  • Possibly preserved (depends on converter behavior and the HTML):
    • width/height attributes on images (converters often capture numeric width/height when apparent).
    • link titles (sometimes captured, depending on converter).
  • Discarded or normalised:
    • Arbitrary classes on most tags (e.g. the class on <p class="fancy"> is not carried through).

    • Inline style attributes (e.g., style="color: red;") — not preserved.
    • Custom data-* attributes that are not explicitly mapped by the converters.
    • Non-standard or unknown elements that the converter cannot map to the editor model — they may be dropped, converted to text, or wrapped in plain blocks depending on the converter.
    • Presentation-only attributes that have no mapping in the editor cards/nodes.
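
If you want to know in advance how much of your HTML relies on attributes in the "discarded" group, you can audit it before importing. This is a sketch using Python's standard html.parser; it has nothing to do with Ghost's converters and simply lists every class, style and data-* attribute it finds (the class and function names are made up for the example).

```python
from html.parser import HTMLParser


class AttributeAudit(HTMLParser):
    """Collect (tag, attribute, value) for attributes likely to be lost on import."""

    def __init__(self):
        super().__init__()
        self.at_risk = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("class", "style") or name.startswith("data-"):
                self.at_risk.append((tag, name, value))


def audit(html):
    parser = AttributeAudit()
    parser.feed(html)
    parser.close()
    return parser.at_risk


for tag, name, value in audit('<p class="fancy" style="color:red">Hi <a href="/x" data-id="1">x</a></p>'):
    print(f"<{tag}> {name}={value!r}")
```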

Practical implications and recommendations

  • If your theme or content relies on custom class attributes for styling, do not expect those classes to survive import. After import, the HTML will be Ghost’s canonical output; you’ll need to re-apply any custom classes in the editor or via theme templates/styles.
  • If you need to preserve specific presentational metadata, consider:
    • Converting that metadata into fields the importer understands (e.g., put caption, alt, or width/height into the JSON card payload or into lexical nodes), or
    • Re-applying classes/styles manually after import in the admin editor, or
    • Importing content as lexical JSON already prepared with the node payloads you need (the lexical format supports node payloads that the editor/renderer will respect).
  • Test risky HTML snippets: because conversion is performed by external converter packages, behavior can vary slightly based on HTML authoring. Best practice is to try importing a sample to confirm how your specific constructs are handled.

Notes about uncertainty

  • The actual HTML→Mobiledoc and HTML→Lexical conversion is done by external converter packages (@tryghost/html-to-mobiledoc and @tryghost/kg-html-to-lexical). Their precise mapping rules determine whether some attributes (e.g., image width/height, link titles) are picked up. The Ghost code wraps and uses those converters and then renders a canonical HTML via its card/node renderers — so the importer’s end result is consistent and predictable even if converters differ in edge cases.
  • You can inspect the relevant converter calls and renderers in the repo:
    • mobiledoc usage: ghost/core/core/server/lib/mobiledoc.js
    • lexical usage: ghost/core/core/server/lib/lexical.js
    • converter calls in serializers: ghost/core/core/server/api/endpoints/utils/serializers/input/posts.js and pages.js
  • Code-search results I used may be incomplete; you can view the converter usage here for more context: https://github.com/TryGhost/Ghost/search?q=htmlToMobiledoc&type=code

If you want I can:

  • give a short test matrix showing how a few specific HTML examples (p with class/style, img with style/class/width/height, figure with caption, custom attributes) will be converted by the import pipeline (based on the code paths), or
  • produce a small experiment you can run locally: a minimal JSON file + zip with an HTML snippet so you can observe the imported result and confirm which attributes survive.

@wu-lee

Please show me the test matrix.

Copilot

Below is a practical test matrix you can use to predict how Ghost’s import pipeline will treat common HTML patterns. Each row shows: input HTML → what the converters keep in the editor model (high-level), → the canonical output you should expect from Ghost after import (or whether something is dropped). I’ve marked items that are strongly supported by Ghost code/tests and items that depend on the external converters (@tryghost/html-to-mobiledoc or @tryghost/kg-html-to-lexical) with “(may vary)”.

Important caveats

  • During import Ghost runs HTML → Mobiledoc and HTML → Lexical converters, then renders canonical HTML from those models. The converters determine what gets mapped into cards/nodes; the renderer emits Ghost’s canonical markup (kg-* classes, image cards, gallery markup, srcset generation etc.).
  • The importer preserves semantic content (text, structure, image src/alt/caption, file/media src) but generally discards arbitrary presentational attributes (class/style) unless they map to known model fields.
  • Exact behaviour in edge cases depends on the converters. Treat this matrix as expected/typical behaviour, not guaranteed for every possible HTML fragment.
  • Relevant code: posts/pages serializers call the converters (ghost/core/core/server/api/endpoints/utils/serializers/input/posts.js and pages.js), rendering code is in mobiledoc (ghost/core/core/server/lib/mobiledoc.js) and lexical (ghost/core/core/server/lib/lexical.js). Image/gallery rendering examples are shown in tests under ghost/core/test/.

Test matrix

  • Column headings: Input HTML → Model fields retained (mobiledoc/lexical payload) → Final Ghost HTML after import → Notes (retained/discarded)
  1. Paragraph with class and inline style
  • Input: a <p class="fancy" style="color:red"> paragraph with the text "Hello world", part of it wrapped in <strong>.
  • Model: paragraph with inline bold text (no class/style).
  • Final HTML: a plain <p> containing "Hello world", with the <strong> formatting kept.
  • Notes: class="fancy" and style="color:red" are discarded / normalized. Semantic content and formatting (strong) are preserved.
  2. <span> with inline style around text
  • Input: a <span> with an inline style attribute wrapped around the text "Important note".
  • Model: inline formatting (bold) if the converter recognises it; otherwise plain text.
  • Final HTML: "Important note" wrapped in <strong>, or as plain paragraph text if the converter didn’t map style→format.
  • Notes: Presentational styles on spans are not preserved as style attributes; the converter may map obvious semantic styles (font-weight) to <strong>/<em>, but this is converter-dependent.
  3. Image tag with class/style/width/height/alt/caption present in HTML
  • Input: a <figure> holding an <img> with class, style, width, height and alt="Bird", plus a <figcaption> reading "A bird".
  • Model: image card (src, alt, caption) + width/height captured (often mapped into the card payload).
  • Final HTML (typical Ghost): a canonical image card, i.e. a <figure> holding <img class="kg-image" alt="Bird"> (with src, width/height and, where available, srcset) and <figcaption>A bird</figcaption>.
  • Notes: Original figure/img classes and inline style are discarded. src, alt, caption, and width/height are retained and used to render canonical img markup and srcset (if image transform is available). The img gets Ghost’s kg-image class.
  4. Plain <img> with data-* attributes or custom attributes
  • Input: an <img> with a src plus data-* or other custom attributes.
  • Model: src and (maybe) filename; custom data-* attributes typically not mapped.
  • Final HTML: canonical image markup without the data-* / custom attributes.
  • Notes: data-* and custom attributes are dropped unless the converter explicitly maps them into a card/node payload (unlikely).
  5. Anchor with class, title, and target
  • Input: an <a> with href, class, title and target="_blank" around the text "Visit".
  • Model: link href and link text preserved; title may be preserved in some conversions; class/target usually not preserved in the model.
  • Final HTML (typical): an <a> with the original href around "Visit".
  • Notes: Href and text are preserved. Presentational class discarded. target="_blank" may be normalised or removed; Ghost’s renderers add security attrs like rel="noopener noreferrer" when needed (behaviour can vary).
  6. Inline styles on block elements (e.g. text-align on a <p>)
  • Input: a block element with an inline text-align style around the text "Centered".
  • Model: likely paragraph text preserved; text-align not preserved except if the converter maps alignment to a node/card property.
  • Final HTML: <p>Centered</p> (alignment/style discarded), or canonical editor alignment classes if mapped (may vary).
  • Notes: Text alignment via inline style is generally lost. If the HTML represents an image with alignment that maps to image card width (e.g., classes that the converter recognizes), then alignment can be preserved via card payload; for arbitrary inline CSS, it’s discarded.
  7. Gallery HTML (multiple images inside gallery container / figure grid)
  • Input: gallery markup as output by an editor, or custom gallery HTML.
  • Model: if the converter recognises the gallery structure, a gallery card with an images[] payload is created; otherwise multiple image cards will be created.
  • Final HTML (if recognised as gallery): canonical gallery card markup (kg-gallery-card).
  • Notes: Ghost tests show gallery cards render canonical gallery markup with srcset per image; arbitrary classes on the original gallery container are discarded. Whether input HTML is recognised as a gallery depends on converter heuristics.
  8. Video element with poster/thumbnail
  • Input: a <video> element with a src and a poster image.
  • Model: video card with src, thumbnail poster mapped to the video card payload.
  • Final HTML: card-based canonical markup (video card rendered with src pointing to stored media and thumbnail referencing the site URL).
  • Notes: Ghost maps media into file/media cards where the converter recognizes the markup.
  9. <iframe> embeds or third-party embed code
  • Input: <iframe src="https://www.youtube.com/embed/xyz" width="560" height="315"></iframe>
  • Model: the converter may convert to an embed/video card payload (src + title) or treat it as a generic HTML block (may vary).
  • Final HTML: if converted, canonical embed/video card markup with payload; otherwise the converter might strip the iframe and leave fallback text or a card, depending on converter capabilities.
  • Notes: Embeds are handled by converter heuristics; some embed HTML may not survive if the converter doesn’t recognise it.
  10. Custom/unknown elements / tags not in editor model
  • Input: a custom (non-standard) element wrapping the text "Hello".
  • Model: the converter will usually drop the unknown tag but retain the inner text (Hello), or convert to text/plain, or wrap as HTML if the converter specially handles unknown tags (rare).
  • Final HTML: <p>Hello</p>
  • Notes: Unknown custom tags and attributes are normally discarded. If you need to preserve a custom widget, you should represent it as a known card/node payload (lexical card) in lexical JSON prior to import.
  11. Pre / code block with class (e.g., language-*)
  • Input: a <pre><code> block with class="language-js" containing const x = 1;
  • Model: code block payload with language possibly detected (converters commonly map language-* to code block language).
  • Final HTML: a <pre><code> block containing const x = 1; (or a theme-wrapped code block with appropriate classes depending on renderer).
  • Notes: Syntax class mapping (language-js) may be recognised and stored as code block metadata; this is converter-dependent but commonly supported.

Summary (short)

  • Kept reliably: semantic structure (headings, lists, paragraphs), text, links (href), images src/alt/caption, gallery/image/video/file card payload data that the converters map, and numeric width/height on images when detected.
  • Normalised / replaced: Ghost will emit canonical markup and classes (kg-image, kg-gallery-card, figure/card HTML). Ghost will also generate srcsets on demand.
  • Discarded / not preserved: arbitrary class attributes on arbitrary elements, inline style attributes, custom data-* attributes (unless explicitly mapped), unknown HTML elements.
  • Converter-dependent: subtle mapping like style→semantic format, link title preservation, gallery recognition from arbitrary HTML, code language detection.

How to be certain for your HTML

  • The fastest way to confirm behaviour for a particular fragment is to:
    1. Put a small JSON post with that HTML into a zip (post.html) or create an html import (source='html' used by import endpoint), and import into a dev Ghost instance; then view the result in the Admin or query the post HTML.
    2. If you need to preserve presentational attributes exactly, you should either:
      • Author the post as lexical JSON where node payloads store the data you need (the importer will accept lexical and Ghost will render node payloads), or
      • Post-process content after import in the editor or via a script.
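
For step 1, a throwaway script can produce the test ZIP for you. The sketch below wraps one HTML snippet in the { db: [ ... ] } shape mentioned earlier. Several things here are assumptions to adjust for your site: the JSON file name, the meta.version value, the draft status, and the idea that title/html/status is a sufficient set of post fields.

```python
import io
import json
import time
import zipfile


def build_test_zip(title, html, json_name="test-import.json"):
    """Return the bytes of a ZIP holding a one-post JSON file for a trial import."""
    export = {
        "db": [{
            # meta values are placeholders; use the version of your target site.
            "meta": {"exported_on": int(time.time() * 1000), "version": "5.0.0"},
            "data": {
                "posts": [{"title": title, "html": html, "status": "draft"}],
            },
        }]
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr(json_name, json.dumps(export, indent=2))
    return buffer.getvalue()


if __name__ == "__main__":
    snippet = '<p class="fancy" style="color:red">Hello <strong>world</strong></p>'
    with open("matrix-row-1.zip", "wb") as handle:
        handle.write(build_test_zip("Matrix row 1", snippet))
```

Import the resulting file into a dev instance, then compare the stored post HTML with the snippet you put in.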

References / where this happens in the code

  • HTML → Mobiledoc converter usage: ghost/core/core/server/api/endpoints/utils/serializers/input/posts.js and pages.js (calls mobiledoc.htmlToMobiledocConverter)
  • HTML → Lexical converter usage: ghost/core/core/server/lib/lexical.js and serializers (calls kg-html-to-lexical)
  • Mobiledoc / lexical renderers output canonical HTML (kg-* classes and card rendering) — see ghost/core/core/server/lib/mobiledoc.js and ghost/core/core/server/lib/lexical.js
  • Image/gallery rendering examples are visible in tests under ghost/core/test/* (see tests that expect canonical kg-image / kg-gallery markup and srcset behaviour)

If you want, I can:

  • produce concrete example input files and the exact expected HTML output snippets based on Ghost’s canonical renderers (I’ll include both mobiledoc and lexical results),
  • or prepare a minimal import ZIP for you to run locally (one post per test case) so you can see exact outcomes in your environment.

@wu-lee

Can you give me an example of an image gallery in HTML format which will be recognised as such by the importer?

Copilot

Yes — the safest way to ensure the importer recognises a gallery is to provide HTML that matches Ghost’s canonical gallery structure (the markup the editor/renderer emits). The HTML-to-Mobiledoc / HTML-to-Lexical converters are written to recognise that structure and will convert it into a gallery card/node during import.

Example HTML gallery (recommended)

<figure class="kg-card kg-gallery-card kg-width-wide">
  <div class="kg-gallery-container">
    <div class="kg-gallery-row">
      <div class="kg-gallery-image">
        <img src="content/images/2020/01/gallery-1.jpg" width="1200" height="800" loading="lazy" alt="Caption 1">
      </div>
      <div class="kg-gallery-image">
        <img src="content/images/2020/01/gallery-2.jpg" width="1200" height="800" loading="lazy" alt="Caption 2">
      </div>
      <div class="kg-gallery-image">
        <img src="content/images/2020/01/gallery-3.jpg" width="1200" height="800" loading="lazy" alt="Caption 3">
      </div>
    </div>
    <!-- more .kg-gallery-row blocks allowed for additional rows -->
  </div>
  <figcaption>A gallery caption (optional)</figcaption>
</figure>

How to package this so the importer recognises it

  • Put the image files in your ZIP at the same paths used in the HTML (e.g. content/images/2020/01/gallery-1.jpg etc.). The importer builds a mapping from the file originalPath -> stored newPath and will rewrite references in post HTML/markdown accordingly.
  • Put the HTML above in the post HTML that you include in your JSON export (post.html) or in a post that you upload via the import endpoint using source='html'. When the importer converts HTML → editor model it will detect the gallery structure and create a gallery card/node.

Notes and caveats

  • The canonical class names and structure (kg-card kg-gallery-card → .kg-gallery-container → .kg-gallery-row → .kg-gallery-image → img) are what the converters are most likely to recognise reliably. If you use arbitrary custom gallery markup (e.g., a generic <div> with images), the converter may or may not recognise it as a gallery; it might instead import separate image cards.
  • Include width/height and sensible alt text where possible. Converters often capture width/height and will include them in the gallery payload; Ghost’s renderer can then generate srcsets on demand.
  • If you prefer to avoid relying on converter heuristics, supply lexical JSON for the post with an explicit gallery node (lexical/generic card payload). That guarantees the gallery payload and avoids HTML parsing ambiguity.
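
If you are generating posts from another system, you can emit the structure shown above programmatically. This sketch mirrors the example markup; the function name, the image dictionary keys and the default of three images per row are assumptions for illustration, so check the output against what your Ghost version renders for a real gallery.

```python
from html import escape


def gallery_html(images, per_row=3, caption=None):
    """Render image dicts (src, width, height, alt) in the gallery structure shown above."""
    parts = [
        '<figure class="kg-card kg-gallery-card kg-width-wide">',
        '  <div class="kg-gallery-container">',
    ]
    # Split the images into rows of at most per_row entries.
    for start in range(0, len(images), per_row):
        parts.append('    <div class="kg-gallery-row">')
        for img in images[start:start + per_row]:
            parts.append(
                '      <div class="kg-gallery-image">'
                '<img src="{src}" width="{width}" height="{height}" loading="lazy" alt="{alt}">'
                '</div>'.format(
                    src=escape(img["src"], quote=True),
                    width=int(img["width"]),
                    height=int(img["height"]),
                    alt=escape(img.get("alt", ""), quote=True),
                )
            )
        parts.append('    </div>')
    parts.append('  </div>')
    if caption:
        parts.append('  <figcaption>{}</figcaption>'.format(escape(caption)))
    parts.append('</figure>')
    return "\n".join(parts)


images = [
    {"src": f"content/images/2020/01/gallery-{n}.jpg", "width": 1200, "height": 800, "alt": f"Caption {n}"}
    for n in range(1, 5)
]
print(gallery_html(images, caption="A gallery caption (optional)"))
```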

If you'd like, I can:

  • produce a ready-to-zip example (JSON post + image files) you can drop into Ghost to test, or
  • show an equivalent lexical or mobiledoc gallery payload that the importer will create/expect.

@wu-lee

How would I link a post to an existing author?

Copilot

There are a few ways to link a post to an existing author depending on how you are creating/importing the post (Admin API vs. JSON/ZIP import). Below I show the supported approaches, give examples, and note how Ghost resolves author references during import.

Quick summary

  • Admin API / creating a post: include an authors array on the post object. Each author may be specified by id (best) or by email (if the user exists with that email). Example: authors: [{id: "..."}] or authors: [{email: "user@example.com"}].
  • Importing via Ghost JSON ZIP: either (a) include post objects with authors referencing existing user ids, or (b) include posts_authors mapping rows in the export file. The importer will try to resolve IDs against imported data first, then against existing DB users; if no match it will remove the relation (and a post with no authors will get the Owner user as fallback).

How Ghost resolves author references during import

  • The posts importer runs a replaceIdentifiers step:
    • If the author reference is present in the import file (e.g. users table in the export and posts_authors rows), the importer maps original IDs to the newly created IDs.
    • If an author id in a post references an author not present in the file but there is a user with that id in the existing DB, the importer keeps that DB id.
    • If an author can't be matched, the importer will remove that author relation. If a post ends up with zero authors, Ghost will assign the site owner user as a fallback. (See core/server/data/importer/importers/data/posts-importer.js replaceIdentifiers / beforeImport logic.)
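
The resolution order described above boils down to something like the following. This is a paraphrase of the behaviour in Python, not the importer's code; the function and parameter names are invented for the example.

```python
def resolve_authors(author_ids, imported_id_map, existing_ids, owner_id):
    """Apply the described order: imported mapping, then existing users, else drop."""
    resolved = []
    for author_id in author_ids:
        if author_id in imported_id_map:
            # The author came with the import file: use the newly created id.
            resolved.append(imported_id_map[author_id])
        elif author_id in existing_ids:
            # The author already exists on the target site: keep the id as-is.
            resolved.append(author_id)
        # Otherwise the relation is dropped.
    # A post left with no authors falls back to the site owner.
    return resolved or [owner_id]


print(resolve_authors(["2", "unknown", "db-user"], {"2": "new-2"}, {"db-user"}, "owner"))
print(resolve_authors(["unknown"], {}, set(), "owner"))
```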

Examples

  1. Creating a post via Admin API (recommended)
  • Link by author id: POST /posts (admin API). Body (the id is an existing user id from the DB):

    {
      "posts": [{
        "title": "My post",
        "html": "<p>Content</p>",
        "authors": [
          { "id": "f7a1e3b0-...." }
        ]
      }]
    }

  • Link by author email (if that user already exists):

    {
      "posts": [{
        "title": "My post",
        "html": "<p>Content</p>",
        "authors": [
          { "email": "author@example.com" }
        ]
      }]
    }

    Note: if you pass a string in place of an author (e.g. "author@example.com") Ghost’s input serializer normalizes strings to {email: '...'}.

  2. Including authors in a JSON import (ZIP)
  • Option A: reference an existing DB user id directly in the post object (data.posts example; the importer will accept and keep the id if it exists):

    {
      "posts": [
        {
          "id": "post-1-in-export",
          "title": "Imported post",
          "authors": [
            { "id": "existing-db-user-id" }
          ],
          ...
        }
      ],
      "users": [ ...optional user objects included in the export... ],
      "posts_authors": [ ...optional mapping rows ... ]
    }

  • Option B: include posts_authors mapping rows (classic export shape), using data.posts and data.posts_authors:

    {
      "posts": [
        { "id": "1", "title": "Imported post", ... }
      ],
      "users": [
        { "id": "2", "email": "author@example.com", "slug": "author-slug", ... }
      ],
      "posts_authors": [
        { "post_id": "1", "author_id": "2", "sort_order": 0 }
      ]
    }

    The importer will create users (if included), create the post, then attach authors using the posts_authors mapping (and will remap original ids to new DB ids).

Notes, gotchas and recommendations

  • Use author id when possible — it’s the most direct and predictable method.
  • When importing via JSON, include the users table if you want to import authors along with posts; otherwise use existing DB user ids or emails that already exist in the target site.
  • If an author reference cannot be resolved, the importer will drop the relation; Ghost will then ensure the post has at least one author by assigning the site Owner user if necessary.
  • For multiple authors, the posts_authors.sort_order or the order of authors in authors is used to determine primary author (primary author is normally the first).
  • If you are creating posts programmatically through the Admin API, include authors in the create/update payload (the Admin API supports authors via id or email).
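
For the posts_authors route, the mapping rows are easy to generate from whatever per-post author lists you already have. A small sketch, assuming a plain dict of post id → ordered author ids (the function name and input shape are hypothetical); the first author in each list gets sort_order 0 and is therefore treated as primary.

```python
def posts_authors_rows(post_authors):
    """Build posts_authors rows from {post_id: [author_id, ...]}."""
    rows = []
    for post_id, author_ids in post_authors.items():
        # The position in the list becomes sort_order, so the first author is primary.
        for order, author_id in enumerate(author_ids):
            rows.append({"post_id": post_id, "author_id": author_id, "sort_order": order})
    return rows


print(posts_authors_rows({"1": ["2", "7"], "3": ["2"]}))
```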

Where this is implemented (references in Ghost code)

  • posts importer / identifier replacement: ghost/core/core/server/data/importer/importers/data/posts-importer.js (replaceIdentifiers / beforeImport)
  • API input normalization that accepts string->email for authors: ghost/core/core/server/api/endpoints/utils/serializers/input/posts.js (the add helper normalizes short formats into full objects)

If you want I can:

  • Give a complete importable JSON example (full minimal data object with posts, users, posts_authors) you can drop into a ZIP and import, or
  • Show a curl+JSON example for creating a post with an existing author via the Admin API (with exact endpoint and required auth headers). Which would you prefer?

@wu-lee

How can I find out the ID of an author?

Copilot

You have several easy ways to find an author/user id in Ghost. Pick whichever suits your access level (admin UI, exported JSON, DB, or API). Below are step-by-step options and concrete examples.

  1. From the Admin UI (quick, no DB access)
  • Go to Ghost Admin and sign in as an administrator.
  • Open Staff/People and click the user you want to inspect.
  • Open your browser DevTools → Network tab and refresh or perform an action that loads the user. The Admin UI requests the user via the Admin API (e.g. a request to /ghost/api/admin/users/…).
  • Inspect the JSON response for that request: the user object will contain an id field. Example snippet from the API response:

    { "users": [ { "id": "b3f2e0d0-....", "name": "Alice Example", "email": "alice@example.com", ... } ] }

  • Note: you must be signed in with admin privileges for the Admin UI to call the Admin API.
  2. From an export (no DB or console access needed)
  • In Ghost Admin, open the export tool (Settings → Advanced → Import/Export; the exact label may vary by Ghost version).
  • Download the JSON export from Ghost.
  • Open the JSON and look at data.users (or data.users[]). Each user object contains an id field. Example path inside export JSON: data.users[0].id
  3. From the database (requires DB access)
  • Query the users table directly. Examples:

    MySQL / MariaDB: mysql -u <user> -p -h <host> <ghost_db> -e "SELECT id, name, email FROM users LIMIT 100;"

    PostgreSQL (legacy installs only; Ghost has not supported PostgreSQL since 1.0): psql -h <host> -U <user> -d <ghost_db> -c "SELECT id, name, email FROM users LIMIT 100;"

    SQLite (if Ghost uses SQLite and content/data/ghost.db is accessible): sqlite3 content/data/ghost.db "SELECT id, name, email FROM users LIMIT 100;"

  • The id column is the identifier used by the importer/API. (Warning: id format can differ by Ghost version/legacy export — it may be numeric, a hex/objectid string, or a UUID.)

  4. Via the Admin API (programmatic; requires admin credentials/API access)
  • If you can call the Admin API, fetch the users endpoint: GET /ghost/api/admin/users/
  • The endpoint requires admin authentication (session cookie or admin-api auth). The response contains user objects with id fields.
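  • Example (conceptual, not runnable without a valid token; the Admin API expects a token in the Authorization header rather than basic auth): curl -H "Authorization: Ghost <admin_jwt>" "https://your-site/ghost/api/admin/users/?limit=all" | jq '.users[] | {id,name,email}'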
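
For the Admin API route, a token has to be generated from an Admin API key first. The following is a minimal Python sketch using only the standard library, assuming the key has the form id:secret (secret hex-encoded) and that Ghost expects a short-lived HS256 JWT with kid set to the key id and aud set to /admin/. The function names make_admin_token and build_users_request are hypothetical, and whether an integration key is allowed to read the users endpoint may depend on your Ghost version.

```python
import base64
import hashlib
import hmac
import json
import time
import urllib.request
from typing import Optional


def _b64url(raw: bytes) -> str:
    """Base64url-encode without padding, as JWTs require."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_admin_token(admin_api_key: str, now: Optional[int] = None) -> str:
    """Build an HS256 JWT from an 'id:secret' Admin API key (assumed format)."""
    key_id, secret_hex = admin_api_key.split(":")
    iat = int(time.time()) if now is None else now
    header = {"alg": "HS256", "typ": "JWT", "kid": key_id}
    payload = {"iat": iat, "exp": iat + 5 * 60, "aud": "/admin/"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    # The secret is hex-decoded to bytes before being used as the HMAC key.
    signature = hmac.new(
        bytes.fromhex(secret_hex), signing_input.encode("ascii"), hashlib.sha256
    ).digest()
    return signing_input + "." + _b64url(signature)


def build_users_request(site_url: str, token: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for the users endpoint."""
    url = site_url.rstrip("/") + "/ghost/api/admin/users/?limit=all"
    return urllib.request.Request(url, headers={"Authorization": "Ghost " + token})


# Usage with a made-up key; substitute a real Admin API key from your site.
token = make_admin_token("0123456789abcdef01234567:00ff" * 1, now=1700000000)
request = build_users_request("https://your-site", token)
# To actually send it: json.load(urllib.request.urlopen(request))["users"]
```

The request is only constructed here, not sent, so the sketch can be inspected without a live site.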

Note about id formats

  • Different Ghost versions and export methods might use different id types: legacy numeric ids, bson-objectid-like hex ids, or UUIDs. The importer and API will accept whichever ID format your installation uses; make sure you reference the ids that exist on the target site.
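
If you are unsure which id style a given site or export uses, a quick check like the following Python sketch can help. The three categories mirror the formats named above; the exact patterns (24 hex characters for an ObjectID-like id, the 8-4-4-4-12 layout for a UUID) and the function name classify_id are assumptions for illustration.

```python
import re

# Checked in order; a 24-character all-digit string is treated as an
# ObjectID-like id rather than a numeric one.
_ID_PATTERNS = [
    ("objectid", re.compile(r"[0-9a-f]{24}", re.IGNORECASE)),
    ("numeric", re.compile(r"\d+")),
    (
        "uuid",
        re.compile(
            r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}",
            re.IGNORECASE,
        ),
    ),
]


def classify_id(value) -> str:
    """Return 'objectid', 'numeric', 'uuid' or 'unknown' for an id value."""
    text = str(value)
    for name, pattern in _ID_PATTERNS:
        if pattern.fullmatch(text):
            return name
    return "unknown"


for sample in (1, "5951f5fca366002ebd5dbef7", "123e4567-e89b-12d3-a456-426614174000"):
    print(sample, "->", classify_id(sample))
```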

Which method to use

  • No access to DB or API: use Admin UI + DevTools or the site export JSON.
  • Scripting / automation: use the Admin API (requires you to generate/authenticate properly) or query the DB directly.
  • For import mapping: if you’re preparing a JSON import and want to reference an existing author reliably, prefer using the user's id from the target site (DB or Admin API) or the user’s email (the importer will match emails to existing users in some cases).
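
For the import-mapping case, a small Python sketch like the one below can build an email-to-id lookup from a site export. It assumes the export is either a bare {"data": {"users": [...]}} object or the same thing wrapped in {"db": [...]}; the function names load_users and email_to_id and the sample data are hypothetical.

```python
import json


def load_users(export: dict) -> list:
    """Return the users table from an export, with or without a 'db' wrapper."""
    root = export["db"][0] if "db" in export else export
    return root.get("data", {}).get("users", [])


def email_to_id(export: dict) -> dict:
    """Map lower-cased email addresses to user ids, skipping users without one."""
    return {
        user["email"].lower(): user["id"]
        for user in load_users(export)
        if user.get("email")
    }


# In practice: export = json.load(open("your-export.json", encoding="utf-8"))
sample_export = json.loads(
    '{"db": [{"data": {"users": ['
    '{"id": "5951f5fca366002ebd5dbef7", "name": "Alice Example", "email": "Alice@Example.com"},'
    '{"id": "5951f5fca366002ebd5dbef8", "name": "No Email"}'
    "]}}]}"
)

lookup = email_to_id(sample_export)
print(lookup.get("alice@example.com"))
```

Lower-casing the keys is a design choice for the lookup only; the ids are passed through unchanged so they can be copied straight into an import file.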

If you want, I can:

  • show an exact curl example for the Admin API including how to create the JWT token for Admin API keys, or
  • show how to extract user ids from an export file with a short jq command.

Which would you like?