will thompson will-thompson-k

## nvidia_install.sh
sudo apt-get remove --purge '^nvidia-.*'
sudo ubuntu-drivers autoinstall
sudo reboot

## normcore-llm.md

      
veekaybee / normcore-llm.md
Last active March 9, 2026 02:47 · 1 file · 339 forks · 67 comments · 3507 stars

Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.
Foundational Concepts


Pre-Transformer Models


## pl_multigpu_infer.py
  import torch
  import pytorch_lightning as pl
  ...
  # Override this hook on the pytorch-lightning model (LightningModule).
  def on_predict_epoch_end(self, results):
      # Gather all results onto each device.
      # WORLD_SIZE is the world size created by pl.Trainer (e.g. self.trainer.world_size);
      # all_gather is not defined in this snippet.
      results = all_gather(results[0], WORLD_SIZE, self._device)
      # Concatenate on the CPU.
      results = torch.concat([x.cpu() for x in results], dim=1)
      # Note: the gathered output will not preserve input order.
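
The snippet above notes that the gathered output does not preserve input order. One common workaround, sketched here as an assumption and not as part of the original gist, is to carry each sample's dataset index alongside its prediction and sort after gathering. The sketch simulates the ranks with plain Python lists so it runs without GPUs. The names `shard_indices`, `predict_on_rank` and `gather_and_restore_order` are hypothetical, and the strided split mirrors how torch's `DistributedSampler` assigns indices when shuffling is off.

```python
from typing import List, Tuple


def shard_indices(num_samples: int, world_size: int) -> List[List[int]]:
    """Split dataset indices across ranks with a strided pattern (assumed)."""
    return [list(range(rank, num_samples, world_size)) for rank in range(world_size)]


def predict_on_rank(indices: List[int]) -> List[Tuple[int, int]]:
    """Hypothetical per-rank predict step returning (dataset_index, prediction) pairs."""
    # Stand-in "model": the prediction is simply 10 * index.
    return [(i, i * 10) for i in indices]


def gather_and_restore_order(per_rank: List[List[Tuple[int, int]]]) -> List[int]:
    """Concatenate every rank's pairs, then sort by dataset index."""
    gathered = [pair for rank_out in per_rank for pair in rank_out]
    gathered.sort(key=lambda pair: pair[0])
    return [pred for _, pred in gathered]


# Simulate 3 ranks predicting over 7 samples.
per_rank = [predict_on_rank(idx) for idx in shard_indices(num_samples=7, world_size=3)]

# Naive concatenation is rank-major, so it does not match the input order.
naive = [pred for rank_out in per_rank for _, pred in rank_out]
restored = gather_and_restore_order(per_rank)

print(naive)     # [0, 30, 60, 10, 40, 20, 50]
print(restored)  # [0, 10, 20, 30, 40, 50, 60]
```

In a real multi-GPU run the index would have to come from the dataset or dataloader, and any padding samples added to even out the shards would need to be dropped before sorting.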

## dispatch_openai_requests.py
# NOTE:
# You can find an updated, more robust and feature-rich implementation
# in Zeno Build
# - Zeno Build: https://github.com/zeno-ml/zeno-build/
# - Implementation: https://github.com/zeno-ml/zeno-build/blob/main/zeno_build/models/providers/openai_utils.py

import openai
import asyncio
from typing import Any