Flávia Eichemberger Rius flaviaerius

docker pull ghcr.io/rocker-org/verse:4.3.3
docker run --rm -ti ghcr.io/rocker-org/verse:4.3.3  bash
 
quarto create project book abook

cd abook
## I turned off PDF rendering in _quarto.yml because lots of deps
quarto render
elowy01 / BCFtools cheat sheet
Last active December 3, 2025 20:24
BCFtools cheat sheet
*bcftools filter
*Filter variants per region (in this example, print out only variants mapped to chr1 and chr2)
bcftools filter -r 1,2 ALL.chip.omni_broad_sanger_combined.20140818.snps.genotypes.hg38.vcf.gz
*printing out info for only 2 samples:
bcftools view -s NA20818,NA20819 filename.vcf.gz
*printing only the variants that pass the filter (FILTER column is PASS):
bcftools view -f PASS filename.vcf.gz
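*combining the two options above (a minimal sketch, not part of the original cheat sheet): the helper name pass_variants_for_samples is hypothetical, and the function only echoes the command when bcftools is not installed or the input file is missing, so it can be tried safely.

```shell
# Hypothetical helper: restrict the output to the given samples AND to
# variants whose FILTER column is PASS, using the -s and -f options shown above.
pass_variants_for_samples() {
    vcf="$1"        # input VCF/BCF file, e.g. filename.vcf.gz
    samples="$2"    # comma-separated sample names, e.g. NA20818,NA20819
    if command -v bcftools >/dev/null 2>&1 && [ -f "$vcf" ]; then
        # bcftools is available and the file exists: run the real command
        bcftools view -s "$samples" -f PASS "$vcf"
    else
        # Otherwise just show the command that would have been run
        echo "bcftools view -s $samples -f PASS $vcf"
    fi
}
```

Usage: pass_variants_for_samples filename.vcf.gz NA20818,NA20819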
ericmjl / ds-project-organization.md
Last active November 29, 2025 20:16
How to organize your Python data science project

UPDATE: I have baked the ideas in this file inside a Python CLI tool called pyds-cli. Please find it here: https://github.com/ericmjl/pyds-cli

Having done a number of data projects over the years, and having seen a number of them up on GitHub, I've come to see that there's a wide range in terms of how "readable" a project is. I'd like to share some practices that I have come to adopt in my projects, which I hope will bring some organization to your projects.

Disclaimer: I'm hoping nobody takes this to be "the definitive guide" to organizing a data project; rather, I hope you, the reader, find useful tips that you can adapt to your own projects.

Disclaimer 2: What I’m writing below is primarily geared towards Python language users. Some ideas may be transferable to other languages; others may not be so. Please feel free to remix whatever you see here!

adefelicibus / install-samtools-bcftools-and-htslib.md
Last active April 3, 2024 20:49
Install samtools, bcftools and htslib on linux

Install Samtools, BCFTools and htslib on linux

Install some build dependencies

sudo apt-get install autoconf automake make gcc perl zlib1g-dev libbz2-dev liblzma-dev libcurl4-gnutls-dev libssl-dev libncurses5-dev

[samtools]

wojteklu / clean_code.md
Last active December 6, 2025 13:31
Summary of 'Clean code' by Robert C. Martin

Code is clean if it can be understood easily – by everyone on the team. Clean code can be read and enhanced by a developer other than its original author. With understandability comes readability, changeability, extensibility and maintainability.


General rules

  1. Follow standard conventions.
  2. Keep it simple stupid. Simpler is always better. Reduce complexity as much as possible.
  3. Boy scout rule. Leave the campground cleaner than you found it.
  4. Always find root cause. Always look for the root cause of a problem.

Design rules

rich-iannone / subset.POSIXct.R
Last active April 8, 2022 08:48
Several examples in R on creating subsets of data via a POSIXct time object.
# Create a simple data frame for testing
df <- data.frame(POSIXtime = seq(as.POSIXct('2013-08-02 12:00'),
                                 as.POSIXct('2013-08-06 05:00'),
                                 length.out = 45),
                 x = seq(45))
# The Subset Examples
#
# All data on 2013-08-06
sub.1 <- subset(df, format(POSIXtime,'%d')=='06')