Direct Streaming

Great for: development workflows that need live, real-time visual feedback while you debug.

Simple to set up and gives immediate visual feedback, but it is limited to a single viewer and constrained by Wi-Fi bandwidth.

On the Laptop (Viewer)

First, spawn the Rerun viewer:
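The command itself is missing from the capture. Assuming the standard Rerun SDK (an assumption; the gist does not show it), the viewer can be started with the rerun binary from a shell, or equivalently from Python:

import rerun as rr  # assumption: the rerun-sdk package is installed (pip install rerun-sdk)

rr.spawn()  # launches the native viewer process; it then listens for incoming SDK connections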

bllchmbrs / monte_carlo.py
Created September 2, 2021 19:06
Monte Carlo
import argparse
import time
import random
import math

from ray.util.multiprocessing import Pool

parser = argparse.ArgumentParser(description="Approximate digits of Pi using Monte Carlo simulation.")
parser.add_argument("--num-samples", type=int, default=1000000)
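The gist preview is truncated after the argument parser. A minimal sketch of how the estimate likely proceeds, using only the imports shown above; the sample() helper and the batch count are illustrative, not from the gist. The estimate rests on the fact that a uniform point in the unit square lands inside the quarter circle with probability pi/4.

def sample(num_samples):
    # Count points drawn uniformly in the unit square that land inside the quarter circle.
    return sum(1 for _ in range(num_samples)
               if random.random() ** 2 + random.random() ** 2 <= 1)

if __name__ == "__main__":
    args = parser.parse_args()
    batches = 10  # illustrative split of the work
    per_batch = args.num_samples // batches
    start = time.time()
    pool = Pool()  # ray.util.multiprocessing.Pool fans the batches out over Ray workers
    inside = sum(pool.map(sample, [per_batch] * batches))
    print(f"pi ~= {4 * inside / (per_batch * batches)} ({time.time() - start:.2f}s)")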
# Driver script (the filename did not survive the capture):
import ray
from utils import adder, timer

if __name__ == '__main__':
    ray.init(address='auto')  # attach to an already-running Ray cluster
    # ray.init(num_cpus=2)    # alternative: start a local two-CPU Ray instance instead
    values = range(10)
    new_values = [adder.remote(x) for x in values]  # ten futures, one remote task per input
    timer(new_values)
# utils.py (module name inferred from the import above):
import ray
import time
from datetime import datetime as dt

@ray.remote
def adder(x):
    return x + 1

def timer(values):
    start = dt.now()
    # (assumed continuation; the preview cuts off here) block on the futures and report wall-clock time:
    results = ray.get(values)
    print(f"{len(results)} results in {(dt.now() - start).total_seconds():.2f}s")
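Because the driver calls ray.init(address='auto'), a Ray runtime must already be up before it runs. A typical local session looks like this (commands assumed; the filename is hypothetical since the gist headers failed to load):

ray start --head   # start a one-node Ray cluster on this machine
python driver.py   # hypothetical name for the driver script above
ray stop           # shut the local cluster down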
bllchmbrs / cluster.yaml
Created April 14, 2020 15:54
Running your first Distributed Python Application
# A unique identifier for the head node and workers of this cluster.
cluster_name: basic-ray

# The maximum number of worker nodes to launch in addition to the head
# node. This takes precedence over min_workers. min_workers defaults to 0.
max_workers: 0  # this means zero workers

# Cloud-provider specific configuration.
provider:
    type: aws
    region: us-west-2
    availability_zone: us-west-2a
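The preview stops at the provider block; a full config typically also includes auth and node-setup sections not shown here. A file like this is driven with the Ray cluster launcher CLI:

ray up cluster.yaml      # provision the head node on AWS using this config
ray attach cluster.yaml  # open an SSH session on the head node
ray down cluster.yaml    # tear everything down when finished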
bllchmbrs / step_1.py
Last active April 17, 2020 18:00
Running your First Distributed Python Application
import ray
import time
from datetime import datetime as dt

@ray.remote
def adder(input_value):
    time.sleep(1)  # simulate one second of work per task
    return input_value + 1

if __name__ == '__main__':
    # (assumed continuation; the gist preview cuts off at the main block)
    ray.init()
    futures = [adder.remote(x) for x in range(10)]
    print(ray.get(futures))  # tasks run concurrently, so this takes far less than 10 seconds
bllchmbrs / spark-summit-eu-2019.py
Last active October 24, 2019 20:58
Spark Summit EU 2019
# Databricks notebook source
# MAGIC %md
# MAGIC
# MAGIC # RDDs
# COMMAND ----------
rdd = sc.parallelize(range(1000), 5)
print(rdd.take(10))
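A natural follow-on cell, assuming the same sc SparkContext that Databricks provides (illustrative, not part of the original notebook excerpt):

# COMMAND ----------

# Transformations are lazy; nothing executes until an action such as take() runs.
evens_squared = rdd.map(lambda x: x * x).filter(lambda x: x % 2 == 0)
print(evens_squared.take(10))   # action: triggers the computation
print(rdd.getNumPartitions())   # 5, from the parallelize call above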
$ ./bin/spark-shell
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
16/12/11 13:43:58 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/12/11 13:43:58 WARN Utils: Your hostname, bill-ubuntu resolves to a loopback address: 127.0.1.1; using 192.168.42.75 instead (on interface wlp2s0)
16/12/11 13:43:58 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Spark context Web UI available at http://192.168.42.75:4040
Spark context available as 'sc' (master = local[*], app id = local-1481492639112).
Spark session available as 'spark'.
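The startup banner shows both entry points. A quick smoke test of each, written in PySpark for consistency with the rest of the gist (illustrative, not from the capture):

print(sc.parallelize([1, 2, 3]).sum())  # RDD API through the SparkContext 'sc'
spark.range(10).show()                  # DataFrame API through the SparkSession 'spark'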