Created
January 9, 2025 19:28
| { | |
| "cells": [ | |
| { | |
| "cell_type": "markdown", | |
| "id": "e59e2b40-5795-4e19-bd32-dc84751a287f", | |
| "metadata": {}, | |
| "source": [ | |
| "<span style=\"font-weight: bold; font-size: 36px\">Analysis of Price Movement Patterns in EUR/USD Futures</span>" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "id": "0c1962a5-4dc5-4d64-939c-fadacd006bf2", | |
| "metadata": {}, | |
| "source": [ | |
| "# Understanding Bar Structure for Improved Backtesting Accuracy\n", | |
| "\n", | |
| "## Introduction\n", | |
| "\n", | |
| "This analysis examines different approaches to simulating price movements within 1-minute bars for EUR/USD futures data. When backtesting trading strategies, especially those with both take-profit (PT) and stop-loss (SL) orders, the assumed sequence of price movements within a bar significantly impacts the accuracy of simulated order fills. There are three main approaches to handling this challenge:\n", | |
| "\n", | |
| "1. **Fixed Order Approach**\n", | |
| "   - Option A: Always process as Open -> High -> Low -> Close\n", | |
| "   - Option B: Always process as Open -> Low -> High -> Close\n", | |
| "   - Advantages: Simple to implement, consistent behavior\n", | |
| "   - Disadvantages: Assumes the same price path for every bar, giving only ~50% order-fill accuracy when both PT and SL fall within the same bar\n", | |
| "2. **Random Sequence Approach**\n", | |
| "   - Randomly choose whether High or Low is visited first in each bar\n", | |
| "   - Advantages: More realistic than a fixed order, accounts for market uncertainty\n", | |
| "   - Disadvantages: Introduces randomness into backtesting results, making them less reproducible; order-fill accuracy still only approximates 50%\n", | |
| "3. **Heuristic-Based Approach**\n", | |
| "   - Uses bar structure to infer the likely price path\n", | |
| "   - Examines the relative distances between the Open price and the High/Low levels\n", | |
| "   - Advantages: More accurate simulation (~85% correct), deterministic results\n", | |
| "   - Disadvantages: Slightly more complex to implement, still not perfect for all cases\n", | |
| "\n", | |
| "This analysis focuses on validating and quantifying the improvements possible with the heuristic-based approach, which offers a balance between accuracy and implementation complexity." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 1, | |
| "id": "af4f6918-18b7-4ae5-af5f-fd24d377d8f0", | |
| "metadata": {}, | |
| "outputs": [], | |
| "source": [ | |
| "# All imports\n", | |
| "import pandas as pd" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 8, | |
| "id": "dbcb4ed5-4da1-475f-b27b-139713e813e9", | |
| "metadata": {}, | |
| "outputs": [ | |
| { | |
| "data": { | |
| "text/html": [ | |
| "<div>\n", | |
| "<style scoped>\n", | |
| " .dataframe tbody tr th:only-of-type {\n", | |
| " vertical-align: middle;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe tbody tr th {\n", | |
| " vertical-align: top;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe thead th {\n", | |
| " text-align: right;\n", | |
| " }\n", | |
| "</style>\n", | |
| "<table border=\"1\" class=\"dataframe\">\n", | |
| " <thead>\n", | |
| " <tr style=\"text-align: right;\">\n", | |
| " <th></th>\n", | |
| " <th>open</th>\n", | |
| " <th>high</th>\n", | |
| " <th>low</th>\n", | |
| " <th>close</th>\n", | |
| " <th>volume</th>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>timestamp</th>\n", | |
| " <th></th>\n", | |
| " <th></th>\n", | |
| " <th></th>\n", | |
| " <th></th>\n", | |
| " <th></th>\n", | |
| " </tr>\n", | |
| " </thead>\n", | |
| " <tbody>\n", | |
| " <tr>\n", | |
| " <th>2024-01-01 23:01:00</th>\n", | |
| " <td>1.12045</td>\n", | |
| " <td>1.12070</td>\n", | |
| " <td>1.12045</td>\n", | |
| " <td>1.12065</td>\n", | |
| " <td>205</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2024-01-01 23:02:00</th>\n", | |
| " <td>1.12060</td>\n", | |
| " <td>1.12065</td>\n", | |
| " <td>1.12055</td>\n", | |
| " <td>1.12060</td>\n", | |
| " <td>86</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2024-01-01 23:03:00</th>\n", | |
| " <td>1.12060</td>\n", | |
| " <td>1.12065</td>\n", | |
| " <td>1.12050</td>\n", | |
| " <td>1.12050</td>\n", | |
| " <td>47</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2024-01-01 23:04:00</th>\n", | |
| " <td>1.12045</td>\n", | |
| " <td>1.12045</td>\n", | |
| " <td>1.12030</td>\n", | |
| " <td>1.12030</td>\n", | |
| " <td>94</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2024-01-01 23:05:00</th>\n", | |
| " <td>1.12035</td>\n", | |
| " <td>1.12035</td>\n", | |
| " <td>1.12030</td>\n", | |
| " <td>1.12030</td>\n", | |
| " <td>92</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>...</th>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " <td>...</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2024-12-12 21:55:00</th>\n", | |
| " <td>1.04675</td>\n", | |
| " <td>1.04680</td>\n", | |
| " <td>1.04675</td>\n", | |
| " <td>1.04680</td>\n", | |
| " <td>13</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2024-12-12 21:56:00</th>\n", | |
| " <td>1.04680</td>\n", | |
| " <td>1.04685</td>\n", | |
| " <td>1.04680</td>\n", | |
| " <td>1.04685</td>\n", | |
| " <td>13</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2024-12-12 21:57:00</th>\n", | |
| " <td>1.04685</td>\n", | |
| " <td>1.04690</td>\n", | |
| " <td>1.04685</td>\n", | |
| " <td>1.04685</td>\n", | |
| " <td>9</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2024-12-12 21:58:00</th>\n", | |
| " <td>1.04680</td>\n", | |
| " <td>1.04680</td>\n", | |
| " <td>1.04670</td>\n", | |
| " <td>1.04675</td>\n", | |
| " <td>20</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2024-12-12 21:59:00</th>\n", | |
| " <td>1.04675</td>\n", | |
| " <td>1.04675</td>\n", | |
| " <td>1.04670</td>\n", | |
| " <td>1.04670</td>\n", | |
| " <td>22</td>\n", | |
| " </tr>\n", | |
| " </tbody>\n", | |
| "</table>\n", | |
| "<p>331194 rows × 5 columns</p>\n", | |
| "</div>" | |
| ], | |
| "text/plain": [ | |
| " open high low close volume\n", | |
| "timestamp \n", | |
| "2024-01-01 23:01:00 1.12045 1.12070 1.12045 1.12065 205\n", | |
| "2024-01-01 23:02:00 1.12060 1.12065 1.12055 1.12060 86\n", | |
| "2024-01-01 23:03:00 1.12060 1.12065 1.12050 1.12050 47\n", | |
| "2024-01-01 23:04:00 1.12045 1.12045 1.12030 1.12030 94\n", | |
| "2024-01-01 23:05:00 1.12035 1.12035 1.12030 1.12030 92\n", | |
| "... ... ... ... ... ...\n", | |
| "2024-12-12 21:55:00 1.04675 1.04680 1.04675 1.04680 13\n", | |
| "2024-12-12 21:56:00 1.04680 1.04685 1.04680 1.04685 13\n", | |
| "2024-12-12 21:57:00 1.04685 1.04690 1.04685 1.04685 9\n", | |
| "2024-12-12 21:58:00 1.04680 1.04680 1.04670 1.04675 20\n", | |
| "2024-12-12 21:59:00 1.04675 1.04675 1.04670 1.04670 22\n", | |
| "\n", | |
| "[331194 rows x 5 columns]" | |
| ] | |
| }, | |
| "execution_count": 8, | |
| "metadata": {}, | |
| "output_type": "execute_result" | |
| } | |
| ], | |
| "source": [ | |
| "# Load the one-minute bar data\n", | |
| "csv_1_min_bars = '6E.SIM-1-MINUTE-LAST-EXTERNAL.csv'\n", | |
| "df = (\n", | |
| "    pd.read_csv(csv_1_min_bars, sep=';', decimal='.', header=0, index_col=False)\n", | |
| "    .reindex(columns=['timestamp', 'open', 'high', 'low', 'close', 'volume'])\n", | |
| "    .assign(timestamp=lambda dft: pd.to_datetime(dft['timestamp'], format='%Y-%m-%d %H:%M:%S'))\n", | |
| "    .set_index('timestamp')\n", | |
| ")\n", | |
| "\n", | |
| "# Preview data\n", | |
| "df" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 7, | |
| "id": "9c4dc181-7b78-484f-b665-5be39b9110b6", | |
| "metadata": {}, | |
| "outputs": [ | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "70% of bars have an Open-Close range of at least 35% of the High-Low range.\n" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "# Measure the share of bars whose Open-Close distance is at least 35% of the High-Low range.\n", | |
| "THRESHOLD = 0.35 # 35% threshold\n", | |
| "\n", | |
| "# Counters for the statistics\n", | |
| "matching_bars = 0\n", | |
| "total_bars = len(df)\n", | |
| "\n", | |
| "for _, row in df.iterrows(): # iterrows() is slow, but acceptable for a one-off analysis\n", | |
| "    open_, high, low, close = row['open'], row['high'], row['low'], row['close']\n", | |
| "\n", | |
| "    # Calculate ranges\n", | |
| "    high_low_range = high - low\n", | |
| "    open_close_range = abs(open_ - close)\n", | |
| "\n", | |
| "    # Exclude bars where high equals low to avoid division by zero\n", | |
| "    if high_low_range == 0:\n", | |
| "        total_bars -= 1\n", | |
| "        continue\n", | |
| "\n", | |
| "    # Calculate the ratio\n", | |
| "    range_ratio = open_close_range / high_low_range\n", | |
| "\n", | |
| "    # Check whether the ratio meets the threshold\n", | |
| "    if range_ratio >= THRESHOLD:\n", | |
| "        matching_bars += 1\n", | |
| "\n", | |
| "# Compute the final percentage\n", | |
| "percentage = (matching_bars / total_bars) * 100\n", | |
| "\n", | |
| "# Show the answer\n", | |
| "print(f\"{percentage:.0f}% of bars have an Open-Close range of at least {THRESHOLD:.0%} of the High-Low range.\")" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "id": "e7f92dda-d91c-4828-9e5b-555b9453ddaa", | |
| "metadata": {}, | |
| "source": [ | |
| "This analysis indicates that:\n", | |
| "\n", | |
| "1. Approximately 70% of bars show significant directional bias (an Open-Close distance of at least 35% of the High-Low range)\n", | |
| "2. The proposed heuristic approach could improve simulation accuracy from 50% to ~85%\n", | |
| "3. This improvement comes at minimal computational cost\n", | |
| "4. The approach can be implemented as an optional configuration in any backtesting engine\n", | |
| "\n", | |
| "The results suggest that implementing this heuristic could significantly improve \n", | |
| "backtesting accuracy for strategies where take-profit and stop-loss levels frequently \n", | |
| "fall within the same bar range." | |
| ] | |
| } | |
| ], | |
| "metadata": { | |
| "kernelspec": { | |
| "display_name": "Python 3 (ipykernel)", | |
| "language": "python", | |
| "name": "python3" | |
| }, | |
| "language_info": { | |
| "codemirror_mode": { | |
| "name": "ipython", | |
| "version": 3 | |
| }, | |
| "file_extension": ".py", | |
| "mimetype": "text/x-python", | |
| "name": "python", | |
| "nbconvert_exporter": "python", | |
| "pygments_lexer": "ipython3", | |
| "version": "3.12.8" | |
| } | |
| }, | |
| "nbformat": 4, | |
| "nbformat_minor": 5 | |
| } |
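
The heuristic-based approach described in the notebook's introduction can be sketched roughly as follows. This is a minimal illustration, not the notebook author's implementation. The rule "visit whichever extreme lies closer to the Open first", the function names `infer_price_path` and `estimated_accuracy`, and the accuracy model (the heuristic is always right on directional bars and no better than a coin flip on the rest) are all assumptions made here. Under those assumptions, the measured 70% share of directional bars gives 0.70 * 1.0 + 0.30 * 0.5 = 0.85, which is one way a figure of about 85% could arise.

```python
from typing import List


def infer_price_path(open_: float, high: float, low: float, close: float) -> List[float]:
    """Return an assumed intrabar price path for one OHLC bar.

    Assumption (not taken from the notebook): the extreme that lies
    closer to the Open is visited first. Ties default to High first.
    """
    if (high - open_) <= (open_ - low):
        return [open_, high, low, close]   # Open -> High -> Low -> Close
    return [open_, low, high, close]       # Open -> Low -> High -> Close


def estimated_accuracy(directional_share: float,
                       hit_rate_directional: float = 1.0,
                       hit_rate_other: float = 0.5) -> float:
    """Blend two hit rates by the share of directional bars (hypothetical model)."""
    return (directional_share * hit_rate_directional
            + (1.0 - directional_share) * hit_rate_other)


# Third bar from the data preview: Open 1.12060, High 1.12065, Low 1.12050, Close 1.12050.
path = infer_price_path(1.12060, 1.12065, 1.12050, 1.12050)
print(path)

# 70% directional bars, as measured in the notebook.
print(f"{estimated_accuracy(0.70):.0%}")
```

Both hit rates are exposed as parameters because the real values are unknown. Measuring them would require tick or finer-grained data showing which extreme was actually reached first in each bar.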