🌲 Invert a binary tree! 🌲
Except with 3 catches:
- It must invert the keys ("bit-reversal permutation")
- It must be a dependency-free, pure recursive function
- It must have type Bit -> Tree -> Tree (i.e., a direct recursion with max 1 bit of state)
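For reference, here is a minimal Python sketch of the permutation being asked for, assuming a perfect binary tree with values at the leaves. Note that it leans on a two-way helper rather than threading a single bit of state, so it only illustrates what "invert the keys" means; it does not satisfy the Bit -> Tree -> Tree constraint of the challenge.

class Leaf:
    def __init__(self, value):
        self.value = value

class Node:
    def __init__(self, left, right):
        self.left = left
        self.right = right

def half(bit, tree):
    # Keep every other leaf of a perfect tree: bit=0 keeps the
    # even-indexed leaves, bit=1 keeps the odd-indexed leaves.
    if isinstance(tree.left, Leaf):
        return tree.left if bit == 0 else tree.right
    return Node(half(bit, tree.left), half(bit, tree.right))

def invert(tree):
    # Bit-reversal permutation: the leaf at index i (reading the
    # root-to-leaf path as binary) ends up at index reverse_bits(i).
    if isinstance(tree, Leaf):
        return tree
    return Node(invert(half(0, tree)), invert(half(1, tree)))

def leaves(t):
    return [t.value] if isinstance(t, Leaf) else leaves(t.left) + leaves(t.right)

t = Node(Node(Node(Leaf(0), Leaf(1)), Node(Leaf(2), Leaf(3))),
         Node(Node(Leaf(4), Leaf(5)), Node(Leaf(6), Leaf(7))))
print(leaves(invert(t)))  # [0, 4, 2, 6, 1, 5, 3, 7]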
<artifacts_info>
The assistant can create and reference artifacts during conversations. Artifacts are for substantial, self-contained content that users might modify or reuse, displayed in a separate UI window for clarity.

# Good artifacts are...
- Substantial content (>15 lines)
- Content that the user is likely to modify, iterate on, or take ownership of
- Self-contained, complex content that can be understood on its own, without context from the conversation
- Content intended for eventual use outside the conversation (e.g., reports, emails, presentations)
- Content likely to be referenced or reused multiple times
Diffusion text-to-image models take a short text prompt and turn it into an image. Here are some prompts I've written that worked well:
{"prompts":["scientific rendering of a black hole whose accretion disk is a spiders web, a consciousness holographically projected in 1D space from the bulk of the void", "a tesseract hypercube in an illuminated glow, a tesseract suspended above the dint of reality", "russian cosmonauts driving a rover on the lunar surface in the style of Lucien Rudaux", "symbol of the phoenix, a phoenix rising over all the sentences that have ever been written", "a yin yang symbol where each half is a black snake and a white snake devouring each others tails"]}
Your task is to write 5 more prompts in the way you infer I'd write them from these examples, but based on a combination of subject, style, and setting. For example: a subject (a phoenix), a style (ukiyo-e woodblock print), and a setting (an abandoned observatory) might combine into "a paper phoenix rising over an abandoned observatory, in the style of a ukiyo-e woodblock print".
# Machine Intelligence Made to Impersonate Characteristics: MIMIC
# NOTE: to get MPI working, run: $ conda install -c conda-forge mpi4py mpich
# accelerate launch --use_deepspeed -m axolotl.cli.train ./config_name_here
base_model: alpindale/Mistral-7B-v0.2-hf
base_model_config: alpindale/Mistral-7B-v0.2-hf
model_type: MistralForCausalLM
tokenizer_type: LlamaTokenizer
is_mistral_derived_model: true
from collections import defaultdict

import numpy as np
import pandas as pd
import torch
import torch.nn as nn
from datasets import load_dataset
from rich.console import Console
from rich.table import Table
from transformers import (
    AutoTokenizer,
)
Yoav Goldberg, February 2024.
Researchers at Google DeepMind released a paper about a learned system that is able to play blitz-chess at a grandmaster level, without using search. This is interesting and imagination-capturing, because up to now, computer-chess systems that play at this level, whether based on machine learning or not, have used a search component.[^1]
Indeed, my first reaction when reading the paper was to tweet "wow, crazy and interesting". I still find it crazy and interesting, but upon a closer read, it may not be as crazy and as interesting as I initially thought. Many reactions on twitter, reddit, etc., were super-impressed, going into implications about the projected learning abilities of AI systems, the ability of neural networks to learn semantics from observations, etc., which are really over-the-top. The paper does not claim any of them, but they are still perceived
MemBlock is a writing format for large language models that helps them overcome their context window limitations by annotating pieces of text in a document with metadata and positional information. Breaking the document up into chunks lets it be rearranged into whatever pattern is most helpful for remembering the contextually relevant information, even if that information wouldn't 'naturally' appear close together in a document. MemBlocks also allow for different views on the same document by letting the user filter for only the information they need to see. Each MemBlock is written in JSON format, and the document of MemBlocks is in JSON lines format, which means that each JSON block is separated by a newline.
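A small sketch of what this could look like in Python. The field names here (pos, tags, text) are assumptions for illustration, not a documented MemBlock schema; the point is just that each block is one JSON object on its own line, carrying its text plus the metadata used to reorder and filter.

import json

# Hypothetical MemBlock fields: 'pos' orders blocks within the source
# document, 'tags' support filtered views, 'text' is the chunk itself.
blocks = [
    {"pos": 0, "tags": ["overview"], "text": "MemBlock is a writing format for LLMs."},
    {"pos": 1, "tags": ["format"], "text": "Each block is a JSON object; the document is JSON lines."},
]

# Serialize as JSON lines: one JSON object per line, newline-separated.
document = "\n".join(json.dumps(block) for block in blocks)

# A 'view' is just a filter over block metadata.
format_view = [block for block in blocks if "format" in block["tags"]]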
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import SFTTrainer

def train():
    train_dataset = load_dataset("tatsu-lab/alpaca", split="train")
    tokenizer = AutoTokenizer.from_pretrained("Salesforce/xgen-7b-8k-base", trust_remote_code=True)
The target audience is people who are familiar with Urbit's architecture, though not necessarily much of its code.
As some of you already know, I recently left my job as a core dev for the Urbit Foundation to work on a similar system called Plunder. Plunder was created in 2020 by two former Tlon employees, after their proposal for a new version of Nock was rejected. They have since reworked that design significantly and built a reference implementation of their own system. You can follow its continued development on its mailing list.
I've known about Plunder for quite some time now, but their recently released demo -- in which the system is used to serve a 70 GB dataset, complete with metadata and searchable -- made me feel the need to explore it again and in greater detail. Doing this with my personal server doesn't feel like a big ask, but there is currently