538's data on 45 years of Scrabble games turned into an Arrow file
$ python scrabble.py https://media.githubusercontent.com/media/fivethirtyeight/data/master/scrabble-games/scrabble_games.csv scrabble.arrow
| > Task :sdks:java:io:hadoop-file-system:compileJava | |
| /beam/sdks/java/io/hadoop-file-system/src/main/java/org/apache/beam/sdk/io/hdfs/HadoopFileSystemRegistrar.java:60: error: An unhandled exception was thrown by the Error Prone static analysis plugin. | |
| checkArgument( | |
| ^ | |
| Please report this at https://github.com/google/error-prone/issues/new and include the following: | |
| error-prone version: 2.10.0 | |
| BugPattern: ArgumentSelectionDefectChecker | |
| Stack Trace: | |
| java.lang.NoSuchMethodError: 'java.util.stream.Stream com.google.common.base.Splitter.splitToStream(java.lang.CharSequence)' |
| Linkage Check difference on beam-sdks-java-extensions-sql between master(00ed8a87) and datacatalog-client(9bd21c3a): | |
| Lines starting with '<' mean the branch remedies the errors (good) | |
| Lines starting with '>' mean the branch introduces new errors (bad) | |
| 9022a9023,9028 | |
| > Class com.fasterxml.jackson.core.TSFBuilder is not found; | |
| > referenced by 1 class file | |
| > com.fasterxml.jackson.dataformat.csv.CsvFactoryBuilder (jackson-dataformat-csv-2.10.0.jar) | |
| > Class com.fasterxml.jackson.databind.cfg.MapperBuilder is not found; | |
| > referenced by 1 class file | |
| > com.fasterxml.jackson.dataformat.csv.CsvMapper (jackson-dataformat-csv-2.10.0.jar) |
| name: projects/apache-beam-testing/topics/java_mobile_gaming_topic | |
| name: projects/apache-beam-testing/topics/testpipeline-jenkins-0208193512-b2c6d3ca | |
| name: projects/apache-beam-testing/topics/testpipeline-jenkins-0208192737-e75f3cf5 | |
| name: projects/apache-beam-testing/topics/testpipeline-jenkins-0210041931-7dbd3392 | |
| name: projects/apache-beam-testing/topics/testpipeline-jenkins-0210041202-e68aa32b | |
| name: projects/apache-beam-testing/topics/wc_topic_input1f7fc593-1fb1-4590-b806-c373d1f4d9fa | |
| name: projects/apache-beam-testing/topics/wc_topic_output1f7fc593-1fb1-4590-b806-c373d1f4d9fa | |
| name: projects/apache-beam-testing/topics/game_stats_it_input_topic3f311c11-e954-4628-8889-f8dac2c855e7 | |
| name: projects/apache-beam-testing/topics/game_stats_it_input_topiccb2205dd-2d68-4b55-a0dd-e8e72df6182f | |
| name: projects/apache-beam-testing/topics/testpipeline-ajamato-0220012558-66bf781b |
| ❯ cat /tmp/topics | grep PubsubJsonIT | cut -d'-' -f7-9 | sort | uniq -c | |
| 14 2019-10-03 | |
| 32 2019-10-04 | |
| 12 2019-10-05 | |
| 8 2019-10-06 | |
| 28 2019-10-07 | |
| 16 2019-10-08 | |
| 22 2019-10-09 | |
| 20 2019-10-10 | |
| 20 2019-10-11 |
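The shell pipeline above (grep, cut on `-`, sort, uniq -c) can be sketched in Python as follows. This is a minimal sketch, not the original author's tooling: the function name `count_by_date` and the sample topic names are hypothetical, chosen only so that fields 7 through 9 of each `-`-separated line form a date.

```python
from collections import Counter


def count_by_date(lines, needle="PubsubJsonIT", first=7, last=9):
    """Count matching lines per key built from '-'-separated fields first..last (1-based, inclusive)."""
    counts = Counter()
    for line in lines:
        if needle not in line:  # grep PubsubJsonIT
            continue
        fields = line.rstrip("\n").split("-")  # cut -d'-'
        counts["-".join(fields[first - 1:last])] += 1  # -f7-9
    return sorted(counts.items())  # sort | uniq -c


# Hypothetical topic names (assumption: the real names carry a date in fields 7-9).
sample = [
    "name: projects/apache-beam-testing/topics/integ-test-PubsubJsonIT-testA-2019-10-03-01-02-03-events",
    "name: projects/apache-beam-testing/topics/integ-test-PubsubJsonIT-testB-2019-10-03-04-05-06-events",
    "name: projects/apache-beam-testing/topics/integ-test-PubsubJsonIT-testA-2019-10-04-01-02-03-events",
    "name: projects/apache-beam-testing/topics/java_mobile_gaming_topic",
]

for date, n in count_by_date(sample):
    print(f"{n:7d} {date}")
```

To run it against the real listing, pass an open file instead of `sample`, for example `count_by_date(open("/tmp/topics"))`.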
| from timeit import timeit | |
| N = int(1E6) | |
| def bench_conversion(int_size): | |
|     # Time N conversions in each direction; each timeit() call returns total seconds. | |
|     np_to_int = timeit('int(i)', setup='import numpy as np; i=np.int%d(4528)' % int_size, number=N) | |
|     int_to_np = timeit('np.int%d(i)' % int_size, setup='import numpy as np; i=int(4528)', number=N) | |
|     np_to_np = timeit('np.int%d(i)' % int_size, setup='import numpy as np; i=np.int%d(4528)' % int_size, number=N) | |
|     print("np.int%d to int:\t%.3f ns/op" % (int_size, np_to_int*1E9/N)) | |
|     print("int to np.int%d:\t%.3f ns/op" % (int_size, int_to_np*1E9/N)) |
| > apache-arrow@0.3.0 perf /home/hulettbh/working_dir/arrow/js | |
| > node ./perf/index.js | |
| Running apache-arrow performance tests... | |
| Parse "tracks": | |
| Table.from | |
| x 6,199 ops/sec ±2.83% (81 runs sampled) | |
| avg: 0.16ms |
| """ | |
| =================== | |
| Label image regions | |
| =================== | |
| This example shows how to segment an image with image labelling. The following | |
| steps are applied: | |
| 1. Thresholding with automatic Otsu method | |
| 2. Close small holes with binary closing |