Flávia Eichemberger Rius flaviaerius

## qutest.md

      
mdsumner / qutest.md (last active April 20, 2024 11:15)

docker pull ghcr.io/rocker-org/verse:4.3.3
docker run --rm -ti ghcr.io/rocker-org/verse:4.3.3 bash

quarto create project book abook

cd abook
## I turned off PDF rendering in _quarto.yml because it pulls in a lot of dependencies
quarto render


## BCFtools cheat sheet
*bcftools filter
*Filter variants per region (in this example, print out only variants mapped to chr1 and chr2). The file must be indexed for -r to work, and the region names must match the contig names in the file's header (for example chr1,chr2 if the file uses the chr prefix).
bcftools filter -r 1,2 ALL.chip.omni_broad_sanger_combined.20140818.snps.genotypes.hg38.vcf.gz

*printing out info for only 2 samples:
bcftools view -s NA20818,NA20819 filename.vcf.gz

*printing only the variants that pass the filter (FILTER column is PASS):
bcftools view -f PASS filename.vcf.gz
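
To make the `-f PASS` selection concrete, here is a minimal sketch that emulates it with awk on a toy VCF, so it runs without bcftools installed. The records and the helper names `toy_vcf` and `pass_only` are made up for illustration, and the one-liner only covers the simple case of a single FILTER value per record; bcftools itself handles more than this.

```shell
#!/bin/sh
# Emit a three-record toy VCF (hypothetical values, tab-separated columns).
toy_vcf() {
  printf '##fileformat=VCFv4.2\n'
  printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n'
  printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\n'
  printf 'chr1\t200\t.\tC\tT\t10\tLowQual\t.\n'
  printf 'chr2\t300\t.\tG\tA\t60\tPASS\t.\n'
}

# Keep header lines plus records whose 7th column (FILTER) is exactly PASS.
# This is an awk stand-in for illustration, not how bcftools is implemented.
pass_only() {
  awk -F '\t' '/^#/ || $7 == "PASS"'
}

toy_vcf | pass_only
```

Running it prints the two header lines followed by the chr1:100 and chr2:300 records; the LowQual record is dropped.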

## ds-project-organization.md

      
              1 file
            
          
              55 forks
            
          
                51 comments
              
            
              347 stars
            
          
                ericmjl
                / ds-project-organization.md
            
            
              Last active
              January 16, 2026 15:19
            
              
                How to organize your Python data science project
              
          
UPDATE: I have baked the ideas in this file into a Python CLI tool called pyds-cli. Please find it here: https://github.com/ericmjl/pyds-cli
How to organize your Python data science project

Having done a number of data projects over the years, and having seen a number of them up on GitHub, I've come to see that there's a wide range in terms of how "readable" a project is. I'd like to share some practices that I have come to adopt in my projects, which I hope will bring some organization to your projects.
Disclaimer: I'm hoping nobody takes this to be "the definitive guide" to organizing a data project; rather, I hope you, the reader, find useful tips that you can adapt to your own projects.
Disclaimer 2: What I’m writing below is primarily geared towards Python users. Some ideas may transfer to other languages; others may not. Please feel free to remix whatever you see here!

  
## install-samtools-bcftools-and-htslib.md

      
adefelicibus / install-samtools-bcftools-and-htslib.md (last active April 3, 2024 20:49)

Install Samtools, BCFtools and HTSlib on Linux

Install some build dependencies

sudo apt-get install autoconf automake make gcc perl zlib1g-dev libbz2-dev liblzma-dev libcurl4-gnutls-dev libssl-dev libncurses5-dev
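
After the install, a quick check that the build tools are actually on PATH can save a failed configure step later. The sketch below is only an illustration: the function name `missing_tools` is hypothetical, and it checks just the packages from the line above that install a command of the same name (the `-dev` library packages are not covered).

```shell
#!/bin/sh
# Print each named command that is not found on PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
  return 0
}

# Check the command-providing packages from the apt-get line above.
missing=$(missing_tools autoconf automake make gcc perl)
if [ -n "$missing" ]; then
  printf 'Missing build tools:\n%s\n' "$missing"
else
  printf 'All checked build tools are on PATH.\n'
fi
```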
[samtools]


## clean_code.md

      
wojteklu / clean_code.md: Summary of 'Clean code' by Robert C. Martin (last active January 28, 2026 14:55)

Code is clean if it can be understood easily – by everyone on the team. Clean code can be read and enhanced by a developer other than its original author. With understandability comes readability, changeability, extensibility and maintainability.

General rules


1. Follow standard conventions.
2. Keep it simple stupid. Simpler is always better. Reduce complexity as much as possible.
3. Boy scout rule. Leave the campground cleaner than you found it.
4. Always find root cause. Always look for the root cause of a problem.

Design rules


## subset.POSIXct.R
# Create a simple data frame for testing
df <- data.frame(POSIXtime = seq(as.POSIXct('2013-08-02 12:00'),
                                 as.POSIXct('2013-08-06 05:00'),
                                 length.out = 45),
                 x = seq_len(45))

# The Subset Examples
#
# All data on 2013-08-06 (match the full date, not just the day of the month)
sub.1 <- subset(df, format(POSIXtime, '%Y-%m-%d') == '2013-08-06')