@nilp0inter
Created March 9, 2026 15:42
Stochastic Gherkin Extension

Stochastic Testing Framework: Roulette Implementation

This repository contains a reference implementation of a domain-specific extension to Gherkin (BDD), designed to automate the testing of non-deterministic and stochastic systems.

Problem Statement

Standard BDD frameworks (such as Behave, Cucumber, or Pytest-BDD) are designed for deterministic testing. They evaluate a single execution path and return a tri-state result (Pass/Fail/Error).

However, when testing stochastic systems (e.g., Random Number Generators, Machine Learning models, or complex behavioral economies), a single execution is insufficient to determine system correctness. System validation requires evaluating the statistical distribution of outcomes over $N$ iterations.
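A quick back-of-the-envelope sketch of why $N$ matters (the numbers assume a fair European wheel, i.e. $P(\text{Red}) = 18/37$; they are illustrative, not part of the framework):

```python
import math

# For a fair European wheel, P(Red) = 18/37 ~= 0.4865.
p = 18 / 37
n = 50_000

# Binomial standard error of the observed Red ratio over n samples.
stderr = math.sqrt(p * (1 - p) / n)

# With n = 50,000 the observed ratio concentrates within a fraction of a
# percentage point of 0.4865, so band-style assertions such as
# [0.470, 0.495] become statistically meaningful; a single run tells us
# almost nothing.
print(round(p, 4), round(stderr, 4))
```
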

Attempting to force iterative statistical evaluation into standard BDD typically results in semantic overloading: while loops, data aggregation, and statistical math hidden inside a single Then step. This destroys test readability, traceability, and state isolation.
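For illustration, the anti-pattern looks roughly like this (a hypothetical overloaded step, not part of this framework; the crude color mapping is a stand-in, not the real wheel layout):

```python
import random

# Hypothetical anti-pattern: one "Then" step body hiding the loop, the
# aggregation, and the statistical math all at once.
def then_red_occurs_roughly_half_the_time(samples=10_000):
    reds = 0
    for _ in range(samples):                 # hidden iteration loop
        pocket = random.randint(0, 36)       # hidden re-execution of the SUT
        if pocket != 0 and pocket % 2 == 1:  # crude color stand-in
            reds += 1
    ratio = reds / samples                   # hidden aggregation
    assert 0.45 <= ratio <= 0.53             # hidden statistical assertion
    return ratio
```

Everything the feature file should declare (sample count, schema, thresholds) is buried in step code, invisible to non-developers.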

Architectural Solution

This framework resolves the limitation by introducing a Meta-Testing Architecture via a custom Gherkin superset. It explicitly separates the orchestrator (Macro) from the payload (Micro).

  1. The Micro-Domain (Atomic Behavior): A standard, deterministic BDD scenario representing a single system interaction.
  2. The Macro-Domain (Stochastic Scenario): A wrapper that declares configuration limits, defines a strict data schema, iteratively executes the Micro-Domain, and queries the aggregated results.

Framework Semantics and Syntax

The extension introduces specialized keywords and blocks to handle the macro-execution lifecycle logically from top to bottom:

1. Top-Level Domain Boundaries

To prevent the standard BDD runner from attempting to execute a simulation sequentially, the framework introduces domain-specific root keywords: Stochastic Feature and Stochastic Scenario.

These act as routing directives. When the custom parser reads these keywords, it delegates the entire block to the Stochastic Orchestration Engine instead of the standard task runner.
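A minimal routing sketch of that dispatch (the engine and runner classes are illustrative assumptions; the gist does not publish the parser's real API):

```python
# Hypothetical dispatch: top-level blocks whose first keyword is stochastic
# are handed to the orchestration engine; everything else runs normally.
STOCHASTIC_KEYWORDS = ("Stochastic Feature:", "Stochastic Scenario:")

def route_block(block_text, stochastic_engine, standard_runner):
    first_line = block_text.lstrip().splitlines()[0]
    if first_line.startswith(STOCHASTIC_KEYWORDS):
        # Delegate the entire block to the Stochastic Orchestration Engine
        return stochastic_engine.run(block_text)
    # Otherwise fall through to the standard Gherkin task runner
    return standard_runner.run(block_text)
```
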

    Stochastic Feature: Roulette Fair Play Validation

      Stochastic Scenario: Verify the average house edge on a single number bet

2. Engine Configuration and Contract Definition

Inside the Stochastic Scenario, the environment is initialized by defining the execution boundaries and the strict shape of the data that will be collected during the iterations.

    Given the following Execution Strategy:  
      | Setting         | Value |  
      | Maximum Samples | 50000 |  

    And the following Sample Schema:  
      | Observation     | Type    | Description                               |  
      | color_result    | String  | The color evaluated by the engine         |  

Note: Declaring the schema in the feature file allows the framework to enforce strict type validation during runtime and establishes a single source of truth for downstream data-analysis tools.
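A sketch of what that strict enforcement could look like; the real `build_schema` internals are not shown in the gist, so these helper names and the type mapping are assumptions:

```python
# Hypothetical schema enforcement: map the Gherkin table's Type column to
# Python types and reject malformed observations at runtime.
TYPE_MAP = {"Integer": int, "Float": float, "String": str}

def build_schema(rows):
    """rows: (Observation, Type) pairs taken from the Sample Schema table."""
    return {name: TYPE_MAP[type_name] for name, type_name in rows}

def validate_observation(schema, observation):
    for key, value in observation.items():
        if key not in schema:
            raise KeyError(f"unknown observation field: {key}")
        if not isinstance(value, schema[key]):
            raise TypeError(
                f"{key}: expected {schema[key].__name__}, "
                f"got {type(value).__name__}"
            )
```
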

3. Execution Payload (Atomic Behavior)

This block represents the Micro-Domain. It is written in standard Gherkin. The Stochastic runner parses this block, treats it as an independent test suite, and executes it iteratively based on the Execution Strategy.

    When the following Atomic Behavior is executed iteratively:  
      Scenario Outline: Processing 1:1 payouts for color bets
        Given a new roulette game with a starting balance of 100 chips
        When the player bets 10 chips on "<color_choice>"
        Then the system identifies the winning color

Implementation Note: Inside the Python step definition for the core Then step, the developer calls context.sample.observe(color_result="Red"). This yields the iteration's data point back to the orchestration engine without breaking the deterministic test flow.
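The collector behind `context.sample` might be as simple as the following sketch; the gist only shows the `observe()` call site, so this class shape is an assumption:

```python
import pandas as pd

# Illustrative collector: each observe() call appends one row per iteration,
# and the engine compiles the rows into a DataFrame after the loop finishes.
class SampleCollector:
    def __init__(self, columns):
        self._columns = list(columns)
        self._rows = []

    def observe(self, **observation):
        # One data point per iteration, yielded back to the engine without
        # the atomic test ever leaving its deterministic flow.
        self._rows.append({c: observation.get(c) for c in self._columns})

    def to_dataframe(self):
        return pd.DataFrame(self._rows, columns=self._columns)
```
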

4. Statistical Aggregation and Assertion

Once the engine completes the specified iterations, it compiles the yielded observations into a localized, queryable dataset (e.g., a Pandas DataFrame). The final assertions act as declarative data queries against this exact dataset.

    Then the statistical assertion "Red_Occurrence" is met:  
      | Observation  | Filter | Operator | Value |  
      | color_result | Red    | >=       | 0.470 |  
      | color_result | Red    | <=       | 0.495 |  
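Conceptually, each Filter-style assertion row reduces to a one-line pandas query. The tiny DataFrame below is fabricated for illustration only; real runs hold the full 50,000 samples:

```python
import pandas as pd

# 100 fake samples standing in for 50,000 real observations.
df = pd.DataFrame({"color_result": ["Red"] * 48 + ["Black"] * 49 + ["Green"] * 3})

# occurrences / total samples, exactly what the Filter row expresses
red_ratio = (df["color_result"] == "Red").mean()

assert 0.470 <= red_ratio <= 0.495  # mirrors the table's bounds
```
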
Complete Example 1: Color Distribution Feature File

    Stochastic Feature: Roulette Fair Play Validation

      Stochastic Scenario: Verify the RNG color distribution conforms to European Roulette probabilities

        Given the following Execution Strategy:
          | Setting         | Value |
          | Maximum Samples | 50000 |
          | Warmup Samples  | 500   |
          | Fail Fast       | false |

        And the following Sample Schema:
          | Observation    | Type    | Description                             |
          | winning_number | Integer | The exact pocket the ball landed in     |
          | color_result   | String  | The color evaluated by the engine       |
          | player_payout  | Float   | The net chip fluctuation                |
          | rng_seed       | String  | The randomness seed for reproducibility |

        When the following Atomic Behavior is executed iteratively:
          Scenario Outline: Processing 1:1 payouts for color bets
            Given a new roulette game with a starting balance of 100 chips
            When the player bets 10 chips on "<color_choice>"
            And the wheel is spun
            Then the system identifies the winning color
            And pays the player if the winning color is Red and the bet was "Red"
            And pays the player if the winning color is Black and the bet was "Black"
            And awards the bet to the house in all other cases

          Examples:
            | color_choice |
            | Red          |
            | Black        |

        Then the statistical assertion "Green_Occurrence" is met:
          | Observation  | Filter | Operator | Value |
          | color_result | Green  | >=       | 0.025 |
          | color_result | Green  | <=       | 0.029 |

        And the statistical assertion "Red_Occurrence" is met:
          | Observation  | Filter | Operator | Value |
          | color_result | Red    | >=       | 0.470 |
          | color_result | Red    | <=       | 0.495 |

        And the statistical assertion "Black_Occurrence" is met:
          | Observation  | Filter | Operator | Value |
          | color_result | Black  | >=       | 0.470 |
          | color_result | Black  | <=       | 0.495 |
Complete Example 2: House Edge Feature File

    Stochastic Feature: Roulette Fair Play Validation

      Stochastic Scenario: Verify the average house edge on a single number bet

        Given the following Execution Strategy:
          | Setting         | Value |
          | Maximum Samples | 50000 |
          | Warmup Samples  | 500   |
          | Fail Fast       | false |

        And the following Sample Schema:
          | Observation    | Type    | Description                             |
          | winning_number | Integer | The exact pocket the ball landed in     |
          | color_result   | String  | The color evaluated by the engine       |
          | player_payout  | Float   | The net chip fluctuation                |
          | rng_seed       | String  | The randomness seed for reproducibility |

        When the following Atomic Behavior is executed iteratively:
          Scenario: Processing a 35:1 payout for a single number bet
            Given a new roulette game with a starting balance of 100 chips
            When the player bets 10 chips on "17"
            And the wheel is spun
            Then the system identifies the winning number
            And pays the player if the winning number is 17 and the bet was "17"
            And awards the bet to the house in all other cases

        Then the statistical assertion "Expected_House_Edge" is met:
          | Observation   | Aggregation | Operator | Value |
          | player_payout | Average     | >=       | -0.40 |
          | player_payout | Average     | <=       | -0.15 |
Reference Step Definitions (Python)

    from behave import given, when, then
    import pandas as pd  # Used by the framework to process observation metadata
    import random

    # ========================================================================
    # 1. PRETEND API (The System Under Test)
    # ========================================================================
    class RouletteEngine:
        """Mock of the casino's production roulette system."""
        COLORS = (
            {0: "Green"}
            | {i: "Red" for i in [1, 3, 5, 7, 9, 12, 14, 16, 18, 19, 21, 23, 25, 27, 30, 32, 34, 36]}
            | {i: "Black" for i in [2, 4, 6, 8, 10, 11, 13, 15, 17, 20, 22, 24, 26, 28, 29, 31, 33, 35]}
        )

        def __init__(self, start_balance):
            self.balance = start_balance
            self.bet_amount = 0
            self.bet_choice = None
            self.winning_number = None
            self.payout = 0

        def place_bet(self, amount, choice):
            self.balance -= amount
            self.bet_amount = amount
            self.bet_choice = choice

        def spin(self):
            # The RNG generates a pseudo-random number (0-36)
            self.winning_number = random.randint(0, 36)
            self.seed_used = hex(random.getrandbits(16))

        def resolve_bet(self):
            winning_color = self.COLORS[self.winning_number]
            # 35:1 payout for a single number
            if str(self.winning_number) == self.bet_choice:
                self.payout = self.bet_amount * 36
            # 1:1 payout for colors
            elif winning_color == self.bet_choice:
                self.payout = self.bet_amount * 2
            else:
                self.payout = 0
            self.balance += self.payout
            # Returns the financial delta (net win/loss)
            return self.payout - self.bet_amount

    # ========================================================================
    # 2. ATOMIC BEHAVIOR STEPS (Developer's Deterministic Tests)
    # ========================================================================
    @given('a new roulette game with a starting balance of {start_balance:d} chips')
    def step_start_game(context, start_balance):
        context.game = RouletteEngine(start_balance)
        context.initial_balance = start_balance
        context.net_payout = None  # reset between iterations

    @when('the player bets {amount:d} chips on "{bet_choice}"')
    def step_place_bet(context, amount, bet_choice):
        context.game.place_bet(amount, bet_choice)

    @when('the wheel is spun')
    def step_spin_wheel(context):
        context.game.spin()

    @then('the system identifies the winning {entity}')
    def step_identify_winner(context, entity):
        # Deterministic assertion ensures the API generated a valid state before continuing
        assert context.game.winning_number is not None

    @then('pays the player if the winning {entity} is {target} and the bet was "{bet_choice}"')
    def step_pay_winner(context, entity, target, bet_choice):
        # Resolve the bet exactly once, even though this step pattern matches
        # several "pays the player ..." lines (Red, Black) in the same scenario.
        if context.net_payout is None:
            context.net_payout = context.game.resolve_bet()
            # --- FRAMEWORK MAGIC: YIELDING THE OBSERVATION ---
            # The developer feeds the stochastic engine from inside the atomic test!
            context.sample.observe(
                winning_number=context.game.winning_number,
                color_result=context.game.COLORS[context.game.winning_number],
                player_payout=float(context.net_payout),
                rng_seed=context.game.seed_used,
            )
        # Deterministic assertion: did the API pay correctly for a win?
        if (entity == "color" and context.game.COLORS[context.game.winning_number] == target) or \
           (entity == "number" and str(context.game.winning_number) == target):
            assert context.game.balance > context.initial_balance

    @then('awards the bet to the house in all other cases')
    def step_house_wins(context):
        # A simple deterministic UI/state check could go here
        pass

    # ========================================================================
    # 3. STOCHASTIC FRAMEWORK STEPS (The Engine's Meta-Steps)
    # ========================================================================
    @given('the following Execution Strategy:')
    def step_execution_strategy(context):
        # The framework parses the table into a configuration object
        context.stochastic_engine.config = {row['Setting']: row['Value'] for row in context.table}

    @given('the following Sample Schema:')
    def step_sample_schema(context):
        # The framework prepares the strict schema validation for the observations
        context.stochastic_engine.build_schema(context.table)

    @when('the following Atomic Behavior is executed iteratively:')
    def step_execute_iterations(context):
        """
        Note: Because we are using the Custom Parser approach, this step acts as
        the trigger. Inside, the framework takes the child AST (the inner Scenarios),
        loops them up to 'Maximum Samples', collects the 'context.sample.observe()'
        calls, and compiles them into a Pandas DataFrame.
        """
        # Pretend API triggering the engine loop
        context.stochastic_engine.run_atomic_loop(
            max_iterations=int(context.stochastic_engine.config['Maximum Samples'])
        )
        # Once finished, the data is available as a DataFrame for the Then steps
        # context.samples_df = pd.DataFrame([...50,000 rows of observations...])

    @then('the statistical assertion "{assertion_name}" is met:')
    def step_statistical_assertion(context, assertion_name):
        # Retrieve the dataset of the 50,000 runs
        df = context.stochastic_engine.samples_df
        total_samples = len(df)

        for row in context.table:
            observation = row['Observation']
            operator = row['Operator']
            target_value = float(row['Value'])

            # ----------------------------------------------------
            # Logic A: Data Filtering (e.g., Occurrences of "Red")
            # ----------------------------------------------------
            if 'Filter' in row.headings:
                filter_val = row['Filter']
                # Pandas counts how many rows match the string (e.g., 'Red')
                occurrences = len(df[df[observation] == filter_val])
                actual_value = occurrences / total_samples
            # ----------------------------------------------------
            # Logic B: Data Aggregation (e.g., Average Payout)
            # ----------------------------------------------------
            elif 'Aggregation' in row.headings:
                aggregation = row['Aggregation']
                if aggregation == "Average":
                    # Pandas computes the mean of the float column
                    actual_value = df[observation].mean()
            else:
                raise ValueError("Assertion table needs a 'Filter' or 'Aggregation' column")

            # ----------------------------------------------------
            # Dynamic Evaluation
            # ----------------------------------------------------
            if operator == ">=":
                assert actual_value >= target_value, \
                    f"[{assertion_name}] FAIL: {observation} was {actual_value:.4f}, expected >= {target_value}"
            elif operator == "<=":
                assert actual_value <= target_value, \
                    f"[{assertion_name}] FAIL: {observation} was {actual_value:.4f}, expected <= {target_value}"