brew install git bash-completion
Basic config:
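The config itself is missing here, so this is a minimal sketch of a typical first-time setup; the name, email, and branch values are placeholders, not from this document.

```shell
# Placeholder identity values — substitute your own.
git config --global user.name "Your Name"
git config --global user.email "you@example.com"
git config --global init.defaultBranch main

# Homebrew's bash-completion must be sourced from your shell profile:
echo '[[ -r "$(brew --prefix)/etc/profile.d/bash_completion.sh" ]] && . "$(brew --prefix)/etc/profile.d/bash_completion.sh"' >> ~/.bash_profile
```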
```sql
WITH anon_5 AS (
    SELECT
        task_instance.dag_id AS dag_id,
        task_instance.run_id AS run_id,
        count(:count_1) AS now_running
    FROM task_instance
    WHERE
        task_instance.state IN (__[postcompile_state_4])
    GROUP BY task_instance.dag_id, task_instance.run_id
```
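The fragment above is SQLAlchemy-compiled output: `:count_1` and `__[postcompile_state_4]` are bind-parameter placeholders that get filled in at execution time. A runnable sketch of the same aggregation, with concrete values substituted for the placeholders and an in-memory SQLite stand-in for the `task_instance` table (the sample rows and state values are made up for the demo):

```python
import sqlite3

# Minimal stand-in for Airflow's task_instance table — only the three
# columns the query touches.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE task_instance (dag_id TEXT, run_id TEXT, state TEXT)")
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?, ?)",
    [
        ("dag_a", "run_1", "running"),
        ("dag_a", "run_1", "running"),
        ("dag_a", "run_2", "success"),
        ("dag_b", "run_1", "queued"),
    ],
)

# Placeholders filled in: count(:count_1) -> count(*), and the postcompile
# IN list expanded to explicit states.
rows = conn.execute(
    """
    SELECT dag_id, run_id, count(*) AS now_running
    FROM task_instance
    WHERE state IN ('running', 'queued')
    GROUP BY dag_id, run_id
    """
).fetchall()
print(sorted(rows))  # [('dag_a', 'run_1', 2), ('dag_b', 'run_1', 1)]
```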
| """ | |
| Airflow task instance data generator. | |
| Takes two arguments: | |
| * unit: days / weeks / hours / minutes etc | |
| * num: just to do the same thing with different name dag | |
| Example: | |
| python /Users/dstandish/code/async-ssh-operator/synthesize_data.py days 1 | |
| * This will create one dag run per day for the time period. (hardcoded at 2.5 yrs) |
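The script itself isn't shown, so here is a hedged sketch of the date-generation step the docstring describes — one logical date per `unit` over the hardcoded 2.5-year window. The function name, the window end date, and the mapping of `unit` to a `timedelta` keyword are all my assumptions.

```python
from datetime import datetime, timedelta


def run_dates(unit, start=None):
    """Yield one logical date per `unit` over a 2.5-year window (assumed shape
    of the generator; names and the end date are placeholders)."""
    step = timedelta(**{unit: 1})  # unit must be a timedelta kwarg: days, weeks, hours, minutes...
    end = start or datetime(2023, 1, 1)
    current = end - timedelta(days=int(365 * 2.5))  # the hardcoded 2.5-yr window
    while current < end:
        yield current
        current += step


dates = list(run_dates("days"))
print(len(dates))  # 912 daily run dates across the window
```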
```python
from __future__ import annotations

from aocd import lines, numbers


class Node:
    def __init__(self, name, type, size=0):
        self.name = name
        self.type = type
        self.size = size
```
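A small usage sketch of the class above (repeated here so the example is self-contained). The snippet only defines `name`/`type`/`size`, so the adjacency map wiring the nodes into a tree is my addition, labeled hypothetical:

```python
class Node:
    def __init__(self, name, type, size=0):
        self.name = name
        self.type = type
        self.size = size


# Hypothetical tree wiring — the original class carries no children attribute.
root = Node("/", "dir")
f1 = Node("a.txt", "file", size=100)
f2 = Node("b.txt", "file", size=250)
children = {root.name: [f1, f2]}

# Sum the sizes of the files directly under root.
total = sum(c.size for c in children[root.name] if c.type == "file")
print(total)  # 350
```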
```bash
a=$'\001'
b=$'\002'
c=$'\003'
d=$'\004'
ft="$a$b$c$c"   # odd field separator
rt="$a$b$c$d"   # odd row separator
# Quote $rt so the control characters survive word splitting intact.
zcat very_large_file.csv.gz | awk -v rt="$rt" 'BEGIN { RS = rt } NR>=946241736 && NR<=946241740' > out.txt
```
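A toy run of the same record-splitting trick on a small in-line input, so the mechanism is visible without the multi-gigabyte file (the row contents and record range here are made up; note that multi-character `RS` requires gawk or mawk, not the old one-true-awk):

```shell
a=$'\001'; b=$'\002'; c=$'\003'; d=$'\004'
rt="$a$b$c$d"
# Build four records joined by the odd separator, then select records 2-3 by number.
out=$(printf 'row1%srow2%srow3%srow4%s' "$rt" "$rt" "$rt" "$rt" \
  | awk -v rt="$rt" 'BEGIN { RS = rt } NR>=2 && NR<=3')
printf '%s\n' "$out"
```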
```python
class WrappedStreamingBody:
    """
    Wrap boto3's StreamingBody object to provide enough
    fileobj functionality so that GzipFile is
    satisfied. Sometimes duck typing is awesome.
    """
```
This module provides a boto3 S3 client factory, get_client(), which returns an S3 client augmented with additional functionality defined in the ClientWrap class, also present in this module.
ClientWrap adds a few wrapper methods that simplify common list / delete / copy operations by (1) handling paging and batching and (2) dealing only with keys instead of more detailed object metadata.
get_client() also makes it easy to specify a default bucket for the client, so that you don't need to pass the bucket in each call.
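ClientWrap's code isn't included, so here is a hedged sketch of the batching half of what it describes: `delete_objects` accepts at most 1000 keys per call, so a key-only delete helper has to chunk its input. The helper names are assumptions; only the `delete_objects(Bucket=..., Delete={"Objects": [...]})` call shape comes from the real boto3 API.

```python
from itertools import islice


def chunks(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while batch := list(islice(it, size)):
        yield batch


def delete_keys(client, keys, bucket):
    """ClientWrap-style sketch: delete by key only, batched under the
    1000-key limit of delete_objects. `client` is a boto3 S3 client."""
    for batch in chunks(keys, 1000):
        client.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": k} for k in batch]},
        )


# The batching helper is pure Python and verifiable without AWS:
batches = list(chunks(range(2500), 1000))
print([len(b) for b in batches])  # [1000, 1000, 500]
```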