Automated Audio Transcription (AAT) / Automated Music Transcription (AMT) (aka: converting audio to MIDI)
Some notes on Automated Audio Transcription (AAT) / Automated Music Transcription (AMT) (aka: converting audio to MIDI)
TL;DR:
ChatGPT appeared like an explosion on all my social media timelines in early December 2022. While I keep up with machine learning as an industry, I wasn't focused so much on this particular corner, and all the screenshots seemed like they came out of nowhere. What was this model? How did the chat prompting work? What was the context of OpenAI doing this work and collecting my prompts for training data?
I decided to do a quick investigation. Here's all the information I've found so far. I'm aggregating and synthesizing it as I go, so it's currently changing pretty frequently.
| import torch | |
| import torch.nn as nn | |
| import torch.nn.functional as F | |
| from typing import Optional | |
| class WeightedKappaLoss(nn.Module): | |
|     """ | |
|     Implements Weighted Kappa Loss. Weighted Kappa Loss was introduced in the | |
|     [Weighted kappa loss function for multi-class classification |
| from datetime import datetime | |
| from typing import Optional | |
| import pandas as pd | |
| from rich import box | |
| from rich.console import Console | |
| from rich.table import Table | |
| console = Console() |
| # DFS | |
| def preorder(self, root): | |
|     if not root: | |
|         return [] | |
|     ret = [] | |
|     stack = [root] | |
|     while stack: | |
|         node = stack.pop() | |
|         ret.append(node.val) | |
|         if node.right: |
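The preview above cuts off partway through the loop. As a minimal, self-contained sketch of the same stack-based preorder idea (not necessarily how the original gist continues), the version below assumes a hypothetical `TreeNode` type with `val`, `left` and `right` fields and is written as a free function rather than a method:

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TreeNode:
    # Hypothetical node type: the preview only implies .val, .left and .right.
    val: int
    left: Optional["TreeNode"] = None
    right: Optional["TreeNode"] = None


def preorder(root: Optional[TreeNode]) -> List[int]:
    """Iterative preorder traversal: node, then left subtree, then right subtree."""
    if not root:
        return []
    ret = []
    stack = [root]
    while stack:
        node = stack.pop()
        ret.append(node.val)
        # Push the right child first so the left child is popped (visited) first.
        if node.right:
            stack.append(node.right)
        if node.left:
            stack.append(node.left)
    return ret


tree = TreeNode(1, TreeNode(2, TreeNode(4), TreeNode(5)), TreeNode(3))
print(preorder(tree))  # prints [1, 2, 4, 5, 3]
```

Pushing the right child before the left is what makes the LIFO stack reproduce the recursive visiting order.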
| """ | |
| A collection of helper functions for optimization with JAX. | |
| UPDATE: This is obsolete now that `jax.scipy.optimize.minimize` exists! | |
| """ | |
| import numpy as onp | |
| import scipy.optimize | |
| from jax import grad, jit | |
| from jax.tree_util import tree_flatten, tree_unflatten | |
| from jax.flatten_util import ravel_pytree |
| ## Function to reduce the DF size | |
| def reduce_mem_usage(df, verbose=True): | |
|     numerics = ['int16', 'int32', 'int64', 'float16', 'float32', 'float64'] | |
|     start_mem = df.memory_usage().sum() / 1024**2 | |
|     for col in df.columns: | |
|         col_type = df[col].dtypes | |
|         if col_type in numerics: | |
|             c_min = df[col].min() | |
|             c_max = df[col].max() | |
|             if str(col_type)[:3] == 'int': |