# A Collection of NLP notes
## N-grams
### Calculating unigram probabilities:
P( w_i ) = count( w_i ) / count( total number of words )
In English..
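
The formula above can be sketched in a few lines of Python. This is only an illustration: the function name `unigram_probabilities` and the toy sentence are assumptions, and tokenization is reduced to whitespace splitting.

```python
from collections import Counter


def unigram_probabilities(tokens):
    """Return P(w_i) = count(w_i) / (total number of words) for each distinct token."""
    counts = Counter(tokens)   # count(w_i) for every distinct word
    total = len(tokens)        # total number of words in the corpus
    return {word: count / total for word, count in counts.items()}


# Toy corpus (hypothetical), split on whitespace.
tokens = "the cat sat on the mat".split()
probs = unigram_probabilities(tokens)
print(probs["the"])  # "the" accounts for 2 of the 6 tokens
```

Because every token is counted exactly once, the probabilities over the vocabulary sum to 1.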

    import torch

    class BiaffineAttention(torch.nn.Module):
        """Implements a biaffine attention operator for binary relation classification.

        PyTorch implementation of the biaffine attention operator from "End-to-end neural relation
        extraction using deep biaffine attention" (https://arxiv.org/abs/1812.11275) which can be used
        as a classifier for binary relation classification.

    #!/bin/bash
    # seeding adopted from https://stackoverflow.com/a/41962458/7820599
    get_seeded_random()
    {
      seed="$1";
      openssl enc -aes-256-ctr -pass pass:"$seed" -nosalt \
        </dev/zero 2>/dev/null;
    }

    ssh() {
        if [ "$(ps -p $(ps -p $$ -o ppid=) -o comm=)" = "tmux" ]; then
            tmux rename-window "$(echo $* | cut -d . -f 1)"
            command ssh "$@"
            tmux set-window-option automatic-rename "on" 1>/dev/null
        else
            command ssh "$@"
        fi
    }

    """
    License: BSD
    Author: Mathieu Blondel
    Implements three algorithms for projecting a vector onto the simplex: sort, pivot and bisection.
    For details and references, see the following paper:
    Large-scale Multiclass Support Vector Machine Training via Euclidean Projection onto the Simplex
    Mathieu Blondel, Akinori Fujino, and Naonori Ueda.
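
Of the three algorithms named in the docstring above, the sort-based one is the shortest to write down. The following is a minimal pure-Python sketch of that general technique, not the author's implementation: the function name `project_simplex_sort` and the parameter `z` (the value the projected entries sum to) are assumptions made for illustration.

```python
def project_simplex_sort(v, z=1.0):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = z}, sort-based variant."""
    u = sorted(v, reverse=True)      # entries in decreasing order
    cumulative = 0.0
    theta = 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        candidate = (cumulative - z) / k
        # The condition holds for k = 1..rho and fails afterwards,
        # so the last candidate that passes is the threshold.
        if u_k - candidate > 0:
            theta = candidate
    # Shift every entry by the threshold and clip negatives to zero.
    return [max(x - theta, 0.0) for x in v]


print(project_simplex_sort([2.0, 0.0]))
```

Sorting dominates the cost, so this variant runs in O(n log n) time for a vector of length n.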

    /*
     * I add this to html files generated with pandoc.
     */
    html {
      font-size: 100%;
      overflow-y: scroll;
      -webkit-text-size-adjust: 100%;
      -ms-text-size-adjust: 100%;
    }

    struct OBJECT{ // The object to be serialized / deserialized
    public:
        // Members are serialized / deserialized in the order they are declared. Can use bitpacking as well.
        DATATYPE member1;
        DATATYPE member2;
        DATATYPE member3;
        DATATYPE member4;
    };

    void write(const std::string& file_name, OBJECT& data) // Writes the given OBJECT data to the given file name.

    #!/bin/bash
    # bash generate random alphanumeric string
    #
    # bash generate random 32 character alphanumeric string (upper and lowercase) and
    NEW_UUID=$(cat /dev/urandom | tr -dc 'a-zA-Z0-9' | fold -w 32 | head -n 1)
    # bash generate random 32 character alphanumeric string (lowercase only)
    cat /dev/urandom | tr -dc 'a-z0-9' | fold -w 32 | head -n 1