Daniel Burkhardt (dburkhardt)
dburkhardt / DETECTION_ONLY_VS_MARKDOWN_BBOX.md
Last active March 11, 2026 20:09
Nemotron Parse: detection_only vs markdown_bbox — issue reproduction

Nemotron Parse: detection_only vs markdown_bbox

Issue

When calling nvidia/nemotron-parse with tool_choice: detection_only, the API returns bounding boxes but no text content. This causes two problems:

  1. Bounding box geometry differs from what the NVIDIA Build demo produces on the same image.
  2. Page text is always empty, breaking any downstream text extraction.

The correct tool is markdown_bbox, which returns both bounding boxes and inline text for every detected element.
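The difference between the two tool choices shows up directly in the request payload. A minimal sketch of building that payload in Python — the body shape below is an assumption modeled on typical NVIDIA Build chat-completions conventions, not confirmed nemotron-parse documentation:

```python
# Sketch: building a nemotron-parse request with the working tool choice.
# The payload schema here is an assumption; check the nemotron-parse docs
# for the exact field names before relying on it.

def build_parse_payload(image_b64: str, tool: str = "markdown_bbox") -> dict:
    """Build a request body for nvidia/nemotron-parse.

    tool="detection_only" returns bounding boxes with empty text;
    tool="markdown_bbox" returns boxes plus inline text per element.
    """
    return {
        "model": "nvidia/nemotron-parse",
        "messages": [{
            "role": "user",
            "content": f'<img src="data:image/png;base64,{image_b64}" />',
        }],
        "tools": [{"type": "function", "function": {"name": tool}}],
    }

payload = build_parse_payload("iVBORw0KGgo=", tool="markdown_bbox")
```

Swapping `tool="markdown_bbox"` for `"detection_only"` is the one-line change that reproduces the empty-text behavior described above.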

dburkhardt / slack-mcp-client.md
Last active February 26, 2026 16:52
Python Slack client + MCP server — fetch messages, threads, normalize markup

Slack MCP Client (Quickstart + Auto Channel Discovery)

Copy/paste implementation for a Slack MCP server where channel discovery is the default path.

If no channel IDs are provided, it will:

  1. list all channels the bot can see,
  2. keep only channels the bot is a member of,
  3. pull messages from those channels.

This is the intended onboarding behavior: no manual channel IDs needed.
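The three steps above reduce to a pure filter over Slack's `conversations.list` response. A sketch, assuming the real Slack Web API response shape (`{"channels": [{"id": ..., "is_member": ...}]}`); the helper name is mine, not the gist's:

```python
# Sketch of the default channel-discovery path (steps 1-2).
# Channel dict fields follow the Slack Web API conversations.list schema.

def discover_channel_ids(channels: list[dict]) -> list[str]:
    """Keep only channels the bot is actually a member of."""
    return [ch["id"] for ch in channels if ch.get("is_member")]

# Example response fragment:
resp = {
    "channels": [
        {"id": "C001", "name": "general", "is_member": True},
        {"id": "C002", "name": "random", "is_member": False},
    ]
}
ids = discover_channel_ids(resp["channels"])  # -> ["C001"]
```

Step 3 would then call `conversations.history` once per surviving ID.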

dburkhardt / test_nvidia_opus_context.py
Last active February 16, 2026 15:22
NVIDIA inference API advertises 1M context for Opus 4.6 but backend rejects >200K tokens (both aws/ and us/aws/ model IDs)
#!/usr/bin/env python3
"""Opus 4.6 advertises 1M context on inference.nvidia.com but rejects >200K."""
import json, os, sys, urllib.request

API_KEY = os.environ.get("NVIDIA_API_KEY") or os.environ.get("NVIDIA_INFERENCE_KEY")
if not API_KEY:
    sys.exit("Set NVIDIA_API_KEY env var (from https://inference.nvidia.com)")

# ~950K tokens of filler (well under advertised 1M limit)
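The snippet is truncated at the filler step. A sketch of how that step might continue, under stated assumptions: roughly one token per short word, an OpenAI-compatible chat body, and a hypothetical model ID standing in for the gist's elided `aws/`-prefixed one:

```python
# Continuation sketch (assumptions, not the gist's actual code):
# build a ~950K-token prompt and wrap it in a chat-completions body.
filler = "word " * 950_000  # roughly one token per short word

body = {
    "model": "aws/claude-opus-4-6",  # hypothetical ID; the gist tests aws/ and us/aws/ variants
    "messages": [{"role": "user", "content": filler}],
    "max_tokens": 64,
}
```

If the backend truly caps at 200K tokens, POSTing this body should return a context-length error despite the advertised 1M limit.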
dburkhardt / issue.md
Last active April 2, 2020 16:18
Issue

Hi, I just wanted to bring this back up again because I've been logging some of the issues I've encountered. It seems we're at a bit of a philosophical divide, so perhaps it's best for me to simply register the use cases where AnnData / scanpy cause me friction:

Instead of pasting all the errors, I'm going to paste code blocks I wish worked. Note: these are real use cases I hit regularly.

1. Cannot pass AnnData to numpy or sklearn operators

import scanpy as sc
import numpy as np
import pandas as pd
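The friction can be illustrated with a plain NumPy stand-in: ufuncs and sklearn estimators accept a bare ndarray directly, while (as argued above) an AnnData object needs manual unwrapping first. The AnnData lines are the wish-list, shown only as comments:

```python
import numpy as np

X = np.random.default_rng(0).normal(size=(50, 10))

# Works today on a bare ndarray: operators apply directly.
Y = np.sqrt(np.abs(X))

# Wish-list from this issue: the same calls on an AnnData object, e.g.
#   np.sqrt(adata)                                    # today: np.sqrt(adata.X)
#   sklearn.decomposition.PCA().fit_transform(adata)  # today: ...fit_transform(adata.X)
assert Y.shape == X.shape
```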
| tissue | subtissue | gene |
| --- | --- | --- |
| Endoderm | Endoderm | PRDX5 |
| Endoderm | Endoderm | FOXA1 |
| Endoderm | Endoderm | FOXA2 |
| Endoderm | Pharyngeal endoderm | NKX2.7 |
| Endoderm | Pharyngeal endoderm | IRX7 |
| Endoderm | Posterior (pancreatic and intestinal endoderm) | CDX4 |
| Axial Mesoderm | Prechordal plate | HE1A |
| Axial Mesoderm | Prechordal plate | HE1B |
| Axial Mesoderm | Prechordal plate | CTSLB |

GENE 760 – Problem Set 3

The purpose of this problem set is to familiarize students with the analysis of mRNA-seq data. By the end of this problem set, you will have learned how to use:

  • STAR to map spliced RNA-seq reads to a genome
  • HTSeq to quantify gene expression as read counts
  • DESeq2 to perform differential expression analysis
  • DAVID and EnrichR to perform gene ontology analysis on a list of genes
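The four tools chain into a standard pipeline. A sketch of the command-line steps, built as Python argument lists — sample names, index paths, and the GTF filename are placeholders, the flags follow the standard STAR and htseq-count CLIs, and DAVID/EnrichR are web tools with no CLI step:

```python
# Sketch of the PS3 pipeline as command lists (placeholder file names).

# 1. Map spliced reads to the genome with STAR.
star = [
    "STAR",
    "--genomeDir", "genome_index/",
    "--readFilesIn", "sample_R1.fastq.gz", "sample_R2.fastq.gz",
    "--readFilesCommand", "zcat",
    "--outSAMtype", "BAM", "SortedByCoordinate",
]

# 2. Quantify gene-level read counts with HTSeq.
htseq = [
    "htseq-count",
    "--format", "bam",
    "--stranded", "no",
    "Aligned.sortedByCoord.out.bam",
    "annotation.gtf",
]

# 3. Differential expression then runs in R with DESeq2, roughly:
#      dds <- DESeqDataSetFromMatrix(countData, colData, design = ~condition)
#      res <- results(DESeq(dds))
# 4. The significant gene list goes to DAVID / EnrichR for GO analysis.
```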

Students are to submit a gzipped tarball called _PS3.tar.gz containing:

dburkhardt / parquet-cpp_install.log
Created August 27, 2018 18:20
Parquet build failure
==> Making package: parquet-cpp 1.4.0-1 (Mon 27 Aug 2018 02:17:39 PM EDT)
==> Checking runtime dependencies...
==> Checking buildtime dependencies...
==> Retrieving sources...
-> Found apache-parquet-cpp-1.4.0.tar.gz
==> Validating source files with sha256sums...
apache-parquet-cpp-1.4.0.tar.gz ... Passed
==> Extracting sources...
-> Extracting apache-parquet-cpp-1.4.0.tar.gz with bsdtar
==> Removing existing $pkgdir/ directory...