Created
March 19, 2026 21:42
| { | |
| "cells": [ | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "# Developmental Regulation of Cell Type-Specific Transcription by Novel Promoter-Proximal Sequence Elements\n", | |
| "\n", | |
| "**Paper:** Lu et al. (2020) *Genes & Development* 34:663-677 \n", | |
| "**Authors:** Dan Lu, Ho-Su Sin, Chenggang Lu, Margaret T. Fuller\n", | |
| "\n", | |
| "---\n", | |
| "\n", | |
| "## Overview\n", | |
| "\n", | |
| "This notebook provides an educational walkthrough of the computational methods used to identify and characterize promoter-proximal sequence elements that regulate cell type-specific transcription during *Drosophila* spermatogenesis.\n", | |
| "\n", | |
| "### Biological Context\n", | |
| "\n", | |
| "During the transition from proliferating spermatogonia to differentiating spermatocytes in *Drosophila*, over 3,000 genes are either:\n", | |
| "- Newly expressed (\"off-to-on\" genes)\n", | |
| "- Expressed from new alternative promoters\n", | |
| "\n", | |
| "This dramatic transcriptional shift is orchestrated by:\n", | |
| "1. **tMAC complex** - a testis-specific chromatin-binding complex that opens promoters\n", | |
| "2. **Achi/Vis** - TALE-class homeodomain transcription factors\n", | |
| "3. **Novel promoter motifs** - sequence elements that guide efficient transcription initiation\n", | |
| "\n", | |
| "### Key Findings\n", | |
| "\n", | |
| "- Spermatocyte-specific promoters lack canonical core promoter elements (TATA, DPE) except the Inr\n", | |
| "- Instead, they are enriched for:\n", | |
| " - **tMAC-ChIP motif** (upstream, ~60 bp from transcription start)\n", | |
| " - **Achi/Vis motif** (TGTCA)\n", | |
| " - **Downstream motifs**: ACA at positions +26/+28/+30, CNAAATT at +29 to +60\n", | |
| "\n", | |
| "---\n", | |
| "\n", | |
| "## Notebook Structure\n", | |
| "\n", | |
| "This notebook is organized into the following sections:\n", | |
| "\n", | |
| "1. **Methodological Background** - Detailed explanation of analytical approaches\n", | |
| "2. **Statistical Frameworks** - Statistical methods and their assumptions\n", | |
| "3. **Data Preparation** - Generate synthetic example data\n", | |
| "4. **Motif Discovery** - De novo identification of sequence motifs\n", | |
| "5. **Positional Enrichment Analysis** - Where motifs occur relative to TSS\n", | |
| "6. **TSS Usage Analysis** - How motifs affect transcription efficiency\n", | |
| "7. **Logistic Regression** - Predicting promoter classes from motif composition\n", | |
| "8. **Visualization** - Comprehensive plots of results\n", | |
| "\n", | |
| "**Note:** This notebook uses small-scale synthetic data to demonstrate the workflow within computational constraints (4GB RAM, ~10 minute runtime). For full-scale analysis, researchers would use the complete genomic datasets on their own infrastructure." | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "---\n", | |
| "\n", | |
| "# Part 1: Methodological Background and Statistical Considerations\n", | |
| "\n", | |
| "Before implementing the analysis, it's crucial to understand the methods, their underlying assumptions, and statistical frameworks.\n", | |
| "\n", | |
| "## 1.1 Data Types and Generation Methods\n", | |
| "\n", | |
| "### RNA-seq (RNA Sequencing)\n", | |
| "\n", | |
| "**Purpose:** Quantify transcript levels genome-wide\n", | |
| "\n", | |
| "**Method:**\n", | |
| "- Total RNA extracted from testes at different time points\n", | |
| "- rRNA depleted using Ribo-Zero kit\n", | |
| "- cDNA libraries prepared with stranded protocol\n", | |
| "- Illumina NextSeq 500 sequencing (75 bp paired-end)\n", | |
| "- ~34-40 million reads per replicate, 2 biological replicates\n", | |
| "\n", | |
| "**Statistical Considerations:**\n", | |
| "- **Normalization:** DESeq2 corrects for library-size differences with median-of-ratios size factors; its variance-stabilizing transformation then makes the variance roughly independent of the mean\n", | |
| "- **Replication:** Biological replicates essential for estimating biological variability\n", | |
| "- **Differential expression:** Negative binomial model accounts for overdispersion in count data\n", | |
| "\n", | |
| "### CAGE (Cap Analysis Gene Expression)\n", | |
| "\n", | |
| "**Purpose:** Precisely map transcription start sites (TSS) and quantify usage\n", | |
| "\n", | |
| "**Method:**\n", | |
| "- Captures 5' cap structure of mRNAs\n", | |
| "- nAnT-iCAGE protocol (no amplification, non-tagging)\n", | |
| "- 300 pairs of testes per replicate\n", | |
| "- ~25-30 million reads per replicate\n", | |
| "\n", | |
| "**Key Metric - RETI (Region of Efficient Transcription Initiation):**\n", | |
| "- Defined as region containing central 80% of CAGE signal\n", | |
| "- Calculated by trimming 10% from each side\n", | |
| "- Captures information about TSS distribution better than simple \"promoter width\"\n", | |
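| "\n", | |
| "The trimming rule above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `reti_bounds`, the toy counts, and the choice to place the boundaries where the cumulative signal first reaches 10% and 90% are all assumptions.\n", | |
| "\n", | |
| "```python\n", | |
| "import numpy as np\n", | |
| "\n", | |
| "def reti_bounds(positions, counts, trim=0.10):\n", | |
| "    # positions: sorted coordinates of the TSSs in one CAGE cluster\n", | |
| "    # counts: CAGE tag counts observed at those positions\n", | |
| "    positions = np.asarray(positions)\n", | |
| "    counts = np.asarray(counts, dtype=float)\n", | |
| "    cum = np.cumsum(counts) / counts.sum()\n", | |
| "    # first position where the cumulative signal reaches the lower trim fraction\n", | |
| "    start = positions[np.searchsorted(cum, trim, side='left')]\n", | |
| "    # first position where the cumulative signal reaches 1 - trim\n", | |
| "    end = positions[np.searchsorted(cum, 1.0 - trim, side='left')]\n", | |
| "    return int(start), int(end)\n", | |
| "\n", | |
| "# Toy cluster (hypothetical values): most signal sits on two adjacent positions\n", | |
| "positions = [100, 101, 102, 103, 104, 105, 106]\n", | |
| "counts = [1, 2, 40, 50, 5, 1, 1]\n", | |
| "start, end = reti_bounds(positions, counts)\n", | |
| "print(start, end, end - start + 1)  # RETI start, end, and width in bp\n", | |
| "```\n", | |
| "\n", | |
| "In this toy cluster the weak flanking positions fall outside the RETI, which is why the measure is less sensitive to sparse outlying tags than the full cluster width.\n", | |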
| "\n", | |
| "**Statistical Considerations:**\n", | |
| "- CAGE signal represents direct evidence of transcription initiation\n", | |
| "- Multiple TSSs per promoter require ranking and comparative analysis\n", | |
| "- \"Dominant TSS\" = most frequently used TSS within a CAGE cluster\n", | |
| "\n", | |
| "### ATAC-seq (Assay for Transposase-Accessible Chromatin)\n", | |
| "\n", | |
| "**Purpose:** Map chromatin accessibility genome-wide\n", | |
| "\n", | |
| "**Method:**\n", | |
| "- Tn5 transposase preferentially inserts into open chromatin\n", | |
| "- 10-20 pairs of testes per technical replicate\n", | |
| "- Technical replicates from same cross combined as biological replicate\n", | |
| "- HiSeq 4000 (75 bp paired-end)\n", | |
| "\n", | |
| "**Interpretation:**\n", | |
| "- ATAC-seq signal indicates nucleosome-free regions\n", | |
| "- Can infer nucleosome positions from signal gaps\n", | |
| "- Used NucleoATAC tool for nucleosome positioning\n", | |
| "\n", | |
| "---\n", | |
| "\n", | |
| "## 1.2 Motif Discovery Methods\n", | |
| "\n", | |
| "### MEME (Multiple EM for Motif Elicitation)\n", | |
| "\n", | |
| "**Algorithm:**\n", | |
| "- Expectation-Maximization (EM) algorithm\n", | |
| "- Searches for ungapped sequence patterns (motifs)\n", | |
| "- Under the ZOOPS site model, assumes each motif occurs zero or one time per sequence (MEME also offers the OOPS and ANR models)\n", | |
| "\n", | |
| "**Statistical Framework:**\n", | |
| "- Position Weight Matrix (PWM) representation\n", | |
| "- E-value: Expected number of motifs with similar or better score by chance\n", | |
| "- Lower E-value = more significant enrichment\n", | |
| "\n", | |
| "**In this study:**\n", | |
| "- Identified tMAC-ChIP motif (E-value = 2.5×10⁻¹⁰⁴)\n", | |
| "- Searched 300 bp regions centered on CAGE clusters\n", | |
| "\n", | |
| "### DREME (Discriminative Regular Expression Motif Elicitation)\n", | |
| "\n", | |
| "**Algorithm:**\n", | |
| "- Specialized for short motifs (typically 3-8 bp)\n", | |
| "- Uses Fisher's exact test for enrichment\n", | |
| "- Faster than MEME for short patterns\n", | |
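| "\n", | |
| "The enrichment test DREME relies on can be illustrated with a single 2x2 contingency table. The counts below are hypothetical, and the snippet calls `scipy.stats.fisher_exact` directly rather than reproducing DREME's search over candidate words.\n", | |
| "\n", | |
| "```python\n", | |
| "from scipy.stats import fisher_exact\n", | |
| "\n", | |
| "# Hypothetical counts, not taken from the paper\n", | |
| "table = [\n", | |
| "    [120, 380],  # target promoters: with motif, without motif\n", | |
| "    [40, 460],   # control sequences: with motif, without motif\n", | |
| "]\n", | |
| "\n", | |
| "# One-sided test: is the motif over-represented in the target set?\n", | |
| "odds_ratio, p_value = fisher_exact(table, alternative='greater')\n", | |
| "print(f'odds ratio = {odds_ratio:.2f}, p = {p_value:.2e}')\n", | |
| "```\n", | |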
| "\n", | |
| "**In this study:**\n", | |
| "- Identified Achi/Vis motif TGTCA (E-value = 1.9×10⁻¹⁰¹)\n", | |
| "- Identified CNAAATT motif (E-value = 6.9×10⁻⁷¹)\n", | |
| "\n", | |
| "### CentriMo (Local Motif Enrichment Analysis)\n", | |
| "\n", | |
| "**Purpose:** Identify positional enrichment of motifs\n", | |
| "\n", | |
| "**Algorithm:**\n", | |
| "- Tests whether motif occurrences are centrally enriched\n", | |
| "- Uses binomial test for each position\n", | |
| "- Corrects for multiple testing\n", | |
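| "\n", | |
| "As a rough sketch of the idea, assume a null model in which a motif site is equally likely to fall at any position. The number of sites landing in a window can then be compared with a binomial expectation. All counts below are hypothetical, and the Bonferroni factor is a stand-in for the number of windows tested; the tool's actual scoring has more detail than this.\n", | |
| "\n", | |
| "```python\n", | |
| "from scipy.stats import binomtest\n", | |
| "\n", | |
| "# Hypothetical inputs for illustration only\n", | |
| "n_sequences = 500      # sequences contributing one best site each\n", | |
| "n_positions = 300      # candidate site positions per sequence\n", | |
| "window_width = 20      # width of the window being tested\n", | |
| "sites_in_window = 110  # best sites that fall inside the window\n", | |
| "n_windows_tested = 100 # number of windows scanned\n", | |
| "\n", | |
| "# Under a uniform null, a site lands in the window with this probability\n", | |
| "p_null = window_width / n_positions\n", | |
| "result = binomtest(sites_in_window, n_sequences, p_null, alternative='greater')\n", | |
| "\n", | |
| "# Bonferroni adjustment for the number of windows scanned\n", | |
| "p_adjusted = min(1.0, result.pvalue * n_windows_tested)\n", | |
| "print(f'raw p = {result.pvalue:.2e}, adjusted p = {p_adjusted:.2e}')\n", | |
| "```\n", | |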
| "\n", | |
| "**Key Findings:**\n", | |
| "- tMAC-ChIP motif: ~60 bp upstream of RETI 3' edge (P < 1×10⁻⁵⁰)\n", | |
| "- Achi/Vis motif: -50 to -5 bp from RETI 3' edge\n", | |
| "- ACA motif: Positions +26, +28, +30 relative to dominant TSS\n", | |
| "- CNAAATT motif: +29 to +60 bp from dominant TSS\n", | |
| "\n", | |
| "---\n", | |
| "\n", | |
| "## 1.3 Statistical Methods for TSS Usage Analysis\n", | |
| "\n", | |
| "### Research Question\n", | |
| "\n", | |
| "Do specific sequence motifs correlate with more efficient usage of transcription start sites?\n", | |
| "\n", | |
| "### Two Complementary Approaches\n", | |
| "\n", | |
| "#### Approach 1: Within-Promoter TSS Ranking\n", | |
| "\n", | |
| "**Rationale:** Within a single promoter, multiple potential TSSs exist. Which ones are actually used most frequently?\n", | |
| "\n", | |
| "**Method:**\n", | |
| "- For each promoter with multiple TSSs, rank them by CAGE signal\n", | |
| "- Compare motif presence/absence for different rank classes\n", | |
| "- Statistical test: Chi-square or Fisher's exact test for rank distribution\n", | |
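| "\n", | |
| "The ranking step can be sketched with pandas. The table and its column names (`promoter`, `cage_signal`, `has_TCA`) are hypothetical; on real data the resulting contingency table would be passed to a chi-square or Fisher's exact test.\n", | |
| "\n", | |
| "```python\n", | |
| "import pandas as pd\n", | |
| "\n", | |
| "# Toy TSS table (hypothetical values): one row per TSS\n", | |
| "tss = pd.DataFrame({\n", | |
| "    'promoter': ['p1', 'p1', 'p1', 'p2', 'p2'],\n", | |
| "    'cage_signal': [50, 20, 5, 30, 60],\n", | |
| "    'has_TCA': [True, False, False, False, True],\n", | |
| "})\n", | |
| "\n", | |
| "# Rank TSSs within each promoter; rank 1 is the most used TSS\n", | |
| "tss['rank'] = (\n", | |
| "    tss.groupby('promoter')['cage_signal']\n", | |
| "    .rank(ascending=False, method='first')\n", | |
| "    .astype(int)\n", | |
| ")\n", | |
| "\n", | |
| "# Contingency table of motif presence against rank class\n", | |
| "rank_table = pd.crosstab(tss['has_TCA'], tss['rank'])\n", | |
| "print(rank_table)\n", | |
| "```\n", | |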
| "\n", | |
| "**Key Finding:**\n", | |
| "- TSSs with TCA (Inr motif) are more highly ranked\n", | |
| "- TSSs with well-positioned ACA (+26/+28/+30) are more highly ranked\n", | |
| "- Effects are additive\n", | |
| "\n", | |
| "#### Approach 2: Across-Promoter Expression Comparison\n", | |
| "\n", | |
| "**Rationale:** Among all \"off-to-on\" genes (similar upstream regulation), do motifs correlate with expression level?\n", | |
| "\n", | |
| "**Method:**\n", | |
| "- Compare expression levels (CAGE signal) of dominant TSSs\n", | |
| "- Group by motif presence/absence\n", | |
| "- Statistical test: t-test or Wilcoxon rank-sum test\n", | |
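| "\n", | |
| "A minimal sketch of the group comparison, using the Wilcoxon rank-sum test in its Mann-Whitney U form from SciPy. The two samples are synthetic draws chosen only to make the example run; they are not derived from the paper's CAGE data.\n", | |
| "\n", | |
| "```python\n", | |
| "import numpy as np\n", | |
| "from scipy.stats import mannwhitneyu\n", | |
| "\n", | |
| "rng = np.random.default_rng(0)\n", | |
| "\n", | |
| "# Synthetic log-scale CAGE signal at dominant TSSs (illustrative values)\n", | |
| "with_motif = rng.normal(loc=6.0, scale=1.0, size=200)\n", | |
| "without_motif = rng.normal(loc=5.5, scale=1.0, size=200)\n", | |
| "\n", | |
| "# One-sided test: do TSSs with the motif tend to have higher signal?\n", | |
| "stat, p_value = mannwhitneyu(with_motif, without_motif, alternative='greater')\n", | |
| "print(f'U = {stat:.0f}, p = {p_value:.2e}')\n", | |
| "```\n", | |
| "\n", | |
| "A rank-based test is a reasonable default here because expression values are typically skewed, which weakens the normality assumption behind the t-test.\n", | |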
| "\n", | |
| "**Important consideration:**\n", | |
| "- Most off-to-on genes depend on tMAC for expression\n", | |
| "- This provides relatively uniform upstream regulation\n", | |
| "- Makes it valid to compare expression levels to assess downstream motif effects\n", | |
| "\n", | |
| "---\n", | |
| "\n", | |
| "## 1.4 Logistic Regression for Promoter Classification\n", | |
| "\n", | |
| "### Problem Definition\n", | |
| "\n", | |
| "Can we predict whether a promoter will be \"narrow-high\" (narrow RETI <11 bp with high expression) based on motif composition?\n", | |
| "\n", | |
| "### Model Structure\n", | |
| "\n", | |
| "**Binary logistic regression:**\n", | |
| "\n", | |
| "$$\\log\\left(\\frac{P(\\text{narrow-high})}{1-P(\\text{narrow-high})}\\right) = \\beta_0 + \\beta_1 X_{\\text{tMAC}} + \\beta_2 X_{\\text{Achi/Vis}} + \\beta_3 X_{\\text{TCA}} + \\beta_4 X_{\\text{ACA}} + \\beta_5 X_{\\text{CNAAATT}}$$\n", | |
| "\n", | |
| "Where:\n", | |
| "- $X_i$ are binary indicators (0/1) for motif presence at optimal position\n", | |
| "- $\\beta_i$ are coefficients estimated from data\n", | |
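| "\n", | |
| "Solving the log-odds equation for the probability gives the sigmoid form, written here with $\\eta$ as shorthand for the linear predictor (the symbol $\\eta$ is introduced only for this illustration):\n", | |
| "\n", | |
| "$$P(\\text{narrow-high}) = \\frac{1}{1 + e^{-\\eta}}, \\qquad \\eta = \\beta_0 + \\beta_1 X_{\\text{tMAC}} + \\beta_2 X_{\\text{Achi/Vis}} + \\beta_3 X_{\\text{TCA}} + \\beta_4 X_{\\text{ACA}} + \\beta_5 X_{\\text{CNAAATT}}$$\n", | |
| "\n", | |
| "As a purely arithmetic illustration, a predicted probability of 0.92 corresponds to $\\eta = \\ln(0.92 / 0.08) \\approx 2.44$, and $\\eta = 0$ corresponds to a probability of 0.5.\n", | |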
| "\n", | |
| "### Assumptions\n", | |
| "\n", | |
| "1. **Independence:** Observations (promoters) are independent of one another\n", | |
| "2. **Linearity:** Log-odds is linear in the predictors\n", | |
| "3. **No perfect multicollinearity:** Motifs don't perfectly predict each other\n", | |
| "\n", | |
| "### Results from Paper\n", | |
| "\n", | |
| "- All motifs significantly contribute (P-values: tMAC/TCA/ACA < 1×10⁻¹⁰, Achi/Vis = 2.4×10⁻⁶, CNAAATT = 5×10⁻⁴)\n", | |
| "- Promoters with all 5 motifs: 92% ± 5.5% probability of being narrow-high\n", | |
| "- Effects are additive on the log-odds scale: more motifs → higher probability\n", | |
| "\n", | |
| "### Interpretation\n", | |
| "\n", | |
| "Three classes of element combine:\n", | |
| "- Upstream motifs (tMAC-ChIP, Achi/Vis) creating open chromatin\n", | |
| "- Initiator motif (TCA) at +1\n", | |
| "- Downstream motifs (ACA, CNAAATT) at specific positions\n", | |
| "\n", | |
| "Together they define a robust, highly expressed, narrowly initiated promoter architecture.\n", | |
| "\n", | |
| "---\n", | |
| "\n", | |
| "## 1.5 Multiple Testing Correction\n", | |
| "\n", | |
| "### The Problem\n", | |
| "\n", | |
| "When testing thousands of promoters or positions, we expect some \"significant\" results by chance.\n", | |
| "\n", | |
| "### Solutions Used\n", | |
| "\n", | |
| "1. **Benjamini-Hochberg FDR** (used by DESeq2 for differential expression)\n", | |
| " - Controls False Discovery Rate\n", | |
| " - More powerful than Bonferroni for many tests\n", | |
| "\n", | |
| "2. **Bonferroni correction** (used by CENTRIMO for positional enrichment)\n", | |
| " - Controls Family-Wise Error Rate\n", | |
| " - Conservative but appropriate for spatial scanning\n", | |
| "\n", | |
| "3. **Empirical assessment** (motif discovery)\n", | |
| " - E-values from MEME/DREME include background model\n", | |
| " - Extremely low E-values (10⁻¹⁰⁰) remain significant under any reasonable multiple-testing correction\n", | |
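| "\n", | |
| "The two corrections can be compared on a handful of p-values. This is a plain NumPy sketch with hypothetical inputs; the function names are illustrative and this is not how DESeq2 or the motif tools are implemented internally.\n", | |
| "\n", | |
| "```python\n", | |
| "import numpy as np\n", | |
| "\n", | |
| "def bonferroni(pvals):\n", | |
| "    # Multiply each p-value by the number of tests, capped at 1\n", | |
| "    p = np.asarray(pvals, dtype=float)\n", | |
| "    return np.minimum(p * p.size, 1.0)\n", | |
| "\n", | |
| "def benjamini_hochberg(pvals):\n", | |
| "    # Scale the i-th smallest p-value by n / i, then enforce monotonicity\n", | |
| "    p = np.asarray(pvals, dtype=float)\n", | |
| "    n = p.size\n", | |
| "    order = np.argsort(p)\n", | |
| "    scaled = p[order] * n / np.arange(1, n + 1)\n", | |
| "    # running minimum taken from the largest p-value downwards\n", | |
| "    adjusted_sorted = np.minimum.accumulate(scaled[::-1])[::-1]\n", | |
| "    adjusted = np.empty(n)\n", | |
| "    adjusted[order] = np.minimum(adjusted_sorted, 1.0)\n", | |
| "    return adjusted\n", | |
| "\n", | |
| "pvals = [0.001, 0.008, 0.039, 0.041, 0.20]  # hypothetical raw p-values\n", | |
| "print(bonferroni(pvals))\n", | |
| "print(benjamini_hochberg(pvals))\n", | |
| "```\n", | |
| "\n", | |
| "With these inputs Bonferroni leaves two values below 0.05, while the Benjamini-Hochberg adjusted values for the third and fourth tests sit just above it at about 0.051, which shows how much less conservative FDR control is.\n", | |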
| "\n", | |
| "---\n", | |
| "\n", | |
| "## 1.6 Nucleosome Positioning Analysis\n", | |
| "\n", | |
| "### Biological Context\n", | |
| "\n", | |
| "DNA wrapped around nucleosomes is generally inaccessible to transcription factors. The tMAC complex appears to create nucleosome-free regions (NFRs) at promoters.\n", | |
| "\n", | |
| "### Analytical Approach\n", | |
| "\n", | |
| "**NucleoATAC algorithm:**\n", | |
| "- Uses ATAC-seq signal to infer nucleosome positions\n", | |
| "- Identifies periodic patterns in fragment size distribution\n", | |
| "- Estimates nucleosome dyad positions (center of nucleosome)\n", | |
| "\n", | |
| "**Key Measurement:**\n", | |
| "- Assuming 147 bp DNA per nucleosome\n", | |
| "- Distance between -1 and +1 nucleosome dyads averaged 253 bp\n", | |
| "- Therefore: ~100 bp nucleosome-free region at promoters\n", | |
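| "\n", | |
| "The arithmetic behind this estimate: each nucleosome covers about 147/2 bp on either side of its dyad, so the DNA left uncovered between the -1 and +1 nucleosomes is\n", | |
| "\n", | |
| "$$\\text{NFR} \\approx 253 - 2 \\times \\frac{147}{2} = 253 - 147 = 106 \\text{ bp}$$\n", | |
| "\n", | |
| "which rounds to the ~100 bp figure quoted above.\n", | |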
| "\n", | |
| "### Biological Interpretation\n", | |
| "\n", | |
| "This ~100 bp NFR contains:\n", | |
| "- tMAC binding site\n", | |
| "- Achi/Vis binding site\n", | |
| "- Region of Efficient Transcription Initiation (RETI)\n", | |
| "\n", | |
| "The spatial constraint imposed by nucleosomes explains why motif positions are so precisely defined.\n", | |
| "\n", | |
| "---\n", | |
| "\n", | |
| "## 1.7 Experimental Design Considerations\n", | |
| "\n", | |
| "### Heat-Shock Bam Time Course System\n", | |
| "\n", | |
| "**Why needed:**\n", | |
| "- In wild-type testes, cells at all stages are mixed\n", | |
| "- Impossible to get pure populations for each stage\n", | |
| "\n", | |
| "**Solution:**\n", | |
| "- *bam* mutants: spermatogonia proliferate indefinitely without differentiating\n", | |
| "- Heat-shock drives transient Bam expression\n", | |
| "- Synchronized differentiation of accumulated spermatogonia\n", | |
| "\n", | |
| "**Time points:**\n", | |
| "- **bam⁻/⁻**: Enriched for spermatogonia\n", | |
| "- **48 hrPHS** (hours post heat-shock): Early spermatocytes\n", | |
| "- **72 hrPHS**: Mature spermatocytes\n", | |
| "\n", | |
| "**Limitations:**\n", | |
| "- Samples still contain somatic cells\n", | |
| "- New spermatogonia continue to accumulate over time\n", | |
| "- May underestimate magnitude of changes\n", | |
| "\n", | |
| "### Biological Replicates\n", | |
| "\n", | |
| "All experiments used ≥2 biological replicates to:\n", | |
| "- Estimate biological variability\n", | |
| "- Enable statistical inference\n", | |
| "- Ensure reproducibility\n", | |
| "\n", | |
| "---\n", | |
| "\n", | |
| "# Part 2: Implementation\n", | |
| "\n", | |
| "Now that we understand the methodological framework, we'll implement the key analytical steps with working code.\n", | |
| "\n", | |
| "## 2.1 Setup and Dependencies" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 1, | |
| "metadata": {}, | |
| "outputs": [ | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "Resolved 19 packages in 235ms\n" | |
| ] | |
| }, | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "\u001b[13A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[13A\u001b[37m⠙\u001b[0m \u001b[2mPreparing packages...\u001b[0m (0/15)\r\n", | |
| "\u001b[2mseaborn \u001b[0m \u001b[32m----------------------\u001b[30m\u001b[2m--------\u001b[0m\u001b[0m 220.75 KiB/288.00 KiB\r\n", | |
| "\u001b[2mjoblib \u001b[0m \u001b[32m---------------\u001b[30m\u001b[2m---------------\u001b[0m\u001b[0m 159.00 KiB/301.83 KiB\r\n", | |
| "\u001b[2mcontourpy \u001b[0m \u001b[32m-----------------------------\u001b[30m\u001b[2m-\u001b[0m\u001b[0m 350.80 KiB/354.35 KiB\r\n", | |
| "\u001b[2mkiwisolver \u001b[0m \u001b[32m---------\u001b[30m\u001b[2m---------------------\u001b[0m\u001b[0m 456.56 KiB/1.41 MiB\r\n", | |
| "\u001b[2mbiopython \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 320.00 KiB/3.09 MiB\r\n", | |
| "\u001b[2mfonttools \u001b[0m \u001b[32m--\u001b[30m\u001b[2m----------------------------\u001b[0m\u001b[0m 480.00 KiB/4.73 MiB\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m\u001b[30m\u001b[2m------------------------------\u001b[0m\u001b[0m 174.88 KiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 440.56 KiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 304.00 KiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 547.24 KiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 472.56 KiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m\u001b[30m\u001b[2m------------------------------\u001b[0m\u001b[0m 352.00 KiB/33.57 MiB \u001b[12A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[12A\u001b[37m⠙\u001b[0m \u001b[2mPreparing packages...\u001b[0m (0/15)\r\n", | |
| "\u001b[2mseaborn \u001b[0m \u001b[32m--------------------------\u001b[30m\u001b[2m----\u001b[0m\u001b[0m 252.75 KiB/288.00 KiB\r\n", | |
| "\u001b[2mjoblib \u001b[0m \u001b[32m------------------\u001b[30m\u001b[2m------------\u001b[0m\u001b[0m 191.00 KiB/301.83 KiB\r\n", | |
| "\u001b[2mkiwisolver \u001b[0m \u001b[32m---------\u001b[30m\u001b[2m---------------------\u001b[0m\u001b[0m 478.54 KiB/1.41 MiB\r\n", | |
| "\u001b[2mbiopython \u001b[0m \u001b[32m----\u001b[30m\u001b[2m--------------------------\u001b[0m\u001b[0m 496.00 KiB/3.09 MiB\r\n", | |
| "\u001b[2mfonttools \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 512.00 KiB/4.73 MiB\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m\u001b[30m\u001b[2m------------------------------\u001b[0m\u001b[0m 222.88 KiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 504.56 KiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 384.00 KiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 573.15 KiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 504.56 KiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m\u001b[30m\u001b[2m------------------------------\u001b[0m\u001b[0m 368.00 KiB/33.57 MiB \u001b[11A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[11A\u001b[37m⠙\u001b[0m \u001b[2mPreparing packages...\u001b[0m (0/15)\r\n", | |
| "\u001b[2mjoblib \u001b[0m \u001b[32m-------------------------\u001b[30m\u001b[2m-----\u001b[0m\u001b[0m 254.89 KiB/301.83 KiB\r\n", | |
| "\u001b[2mkiwisolver \u001b[0m \u001b[32m---------\u001b[30m\u001b[2m---------------------\u001b[0m\u001b[0m 478.54 KiB/1.41 MiB\r\n", | |
| "\u001b[2mbiopython \u001b[0m \u001b[32m----\u001b[30m\u001b[2m--------------------------\u001b[0m\u001b[0m 496.00 KiB/3.09 MiB\r\n", | |
| "\u001b[2mfonttools \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 512.00 KiB/4.73 MiB\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 270.88 KiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 504.56 KiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 490.72 KiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 573.15 KiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 504.56 KiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m\u001b[30m\u001b[2m------------------------------\u001b[0m\u001b[0m 416.00 KiB/33.57 MiB \u001b[10A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[10A\u001b[37m⠙\u001b[0m \u001b[2mPreparing packages...\u001b[0m (0/15)\r\n", | |
| "\u001b[2mjoblib \u001b[0m \u001b[32m----------------------------\u001b[30m\u001b[2m--\u001b[0m\u001b[0m 287.00 KiB/301.83 KiB\r\n", | |
| "\u001b[2mkiwisolver \u001b[0m \u001b[32m-------------\u001b[30m\u001b[2m-----------------\u001b[0m\u001b[0m 654.12 KiB/1.41 MiB\r\n", | |
| "\u001b[2mbiopython \u001b[0m \u001b[32m-----\u001b[30m\u001b[2m-------------------------\u001b[0m\u001b[0m 592.00 KiB/3.09 MiB\r\n", | |
| "\u001b[2mfonttools \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 544.00 KiB/4.73 MiB\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m--\u001b[30m\u001b[2m----------------------------\u001b[0m\u001b[0m 632.56 KiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m--\u001b[30m\u001b[2m----------------------------\u001b[0m\u001b[0m 611.31 KiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m--\u001b[30m\u001b[2m----------------------------\u001b[0m\u001b[0m 587.00 KiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 643.04 KiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 593.78 KiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m\u001b[30m\u001b[2m------------------------------\u001b[0m\u001b[0m 544.00 KiB/33.57 MiB " | |
| ] | |
| }, | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "\u001b[10A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[10A\u001b[37m⠙\u001b[0m \u001b[2mPreparing packages...\u001b[0m (0/15)\r\n", | |
| "\u001b[2mkiwisolver \u001b[0m \u001b[32m---------------\u001b[30m\u001b[2m---------------\u001b[0m\u001b[0m 744.56 KiB/1.41 MiB\r\n", | |
| "\u001b[2mbiopython \u001b[0m \u001b[32m-----\u001b[30m\u001b[2m-------------------------\u001b[0m\u001b[0m 592.00 KiB/3.09 MiB\r\n", | |
| "\u001b[2mfonttools \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 544.00 KiB/4.73 MiB\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 808.56 KiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m--\u001b[30m\u001b[2m----------------------------\u001b[0m\u001b[0m 680.56 KiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m--\u001b[30m\u001b[2m----------------------------\u001b[0m\u001b[0m 667.00 KiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 659.04 KiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 744.56 KiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m\u001b[30m\u001b[2m------------------------------\u001b[0m\u001b[0m 544.00 KiB/33.57 MiB \u001b[9A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[9A\u001b[37m⠹\u001b[0m \u001b[2mPreparing packages...\u001b[0m (6/15)\r\n", | |
| "\u001b[2mkiwisolver \u001b[0m \u001b[32m--------------------\u001b[30m\u001b[2m----------\u001b[0m\u001b[0m 963.56 KiB/1.41 MiB\r\n", | |
| "\u001b[2mbiopython \u001b[0m \u001b[32m-------\u001b[30m\u001b[2m-----------------------\u001b[0m\u001b[0m 752.00 KiB/3.09 MiB\r\n", | |
| "\u001b[2mfonttools \u001b[0m \u001b[32m------\u001b[30m\u001b[2m------------------------\u001b[0m\u001b[0m 987.00 KiB/4.73 MiB\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m----\u001b[30m\u001b[2m--------------------------\u001b[0m\u001b[0m 984.45 KiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 995.56 KiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 955.00 KiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 1.04 MiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m--\u001b[30m\u001b[2m----------------------------\u001b[0m\u001b[0m 977.97 KiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m\u001b[30m\u001b[2m------------------------------\u001b[0m\u001b[0m 1.11 MiB/33.57 MiB " | |
| ] | |
| }, | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "\u001b[9A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[9A\u001b[37m⠹\u001b[0m \u001b[2mPreparing packages...\u001b[0m (6/15)\r\n", | |
| "\u001b[2mkiwisolver \u001b[0m \u001b[32m------------------------------\u001b[30m\u001b[2m\u001b[0m\u001b[0m 1.41 MiB/1.41 MiB\r\n", | |
| "\u001b[2mbiopython \u001b[0m \u001b[32m----------\u001b[30m\u001b[2m--------------------\u001b[0m\u001b[0m 1.03 MiB/3.09 MiB\r\n", | |
| "\u001b[2mfonttools \u001b[0m \u001b[32m--------\u001b[30m\u001b[2m----------------------\u001b[0m\u001b[0m 1.32 MiB/4.73 MiB\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m------\u001b[30m\u001b[2m------------------------\u001b[0m\u001b[0m 1.46 MiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m----\u001b[30m\u001b[2m--------------------------\u001b[0m\u001b[0m 1.26 MiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m----\u001b[30m\u001b[2m--------------------------\u001b[0m\u001b[0m 1.37 MiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m----\u001b[30m\u001b[2m--------------------------\u001b[0m\u001b[0m 1.56 MiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 1.44 MiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 1.38 MiB/33.57 MiB \u001b[9A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[9A\u001b[37m⠹\u001b[0m \u001b[2mPreparing packages...\u001b[0m (6/15)\r\n", | |
| "\u001b[2mbiopython \u001b[0m \u001b[32m----------\u001b[30m\u001b[2m--------------------\u001b[0m\u001b[0m 1.03 MiB/3.09 MiB\r\n", | |
| "\u001b[2mfonttools \u001b[0m \u001b[32m---------\u001b[30m\u001b[2m---------------------\u001b[0m\u001b[0m 1.52 MiB/4.73 MiB\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m------\u001b[30m\u001b[2m------------------------\u001b[0m\u001b[0m 1.47 MiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m----\u001b[30m\u001b[2m--------------------------\u001b[0m\u001b[0m 1.29 MiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m----\u001b[30m\u001b[2m--------------------------\u001b[0m\u001b[0m 1.37 MiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m----\u001b[30m\u001b[2m--------------------------\u001b[0m\u001b[0m 1.56 MiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 1.44 MiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 1.42 MiB/33.57 MiB " | |
| ] | |
| }, | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "\u001b[8A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[8A\u001b[37m⠹\u001b[0m \u001b[2mPreparing packages...\u001b[0m (6/15)\r\n", | |
| "\u001b[2mbiopython \u001b[0m \u001b[32m--------------\u001b[30m\u001b[2m----------------\u001b[0m\u001b[0m 1.53 MiB/3.09 MiB\r\n", | |
| "\u001b[2mfonttools \u001b[0m \u001b[32m---------\u001b[30m\u001b[2m---------------------\u001b[0m\u001b[0m 1.55 MiB/4.73 MiB\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m------\u001b[30m\u001b[2m------------------------\u001b[0m\u001b[0m 1.49 MiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m-----\u001b[30m\u001b[2m-------------------------\u001b[0m\u001b[0m 1.65 MiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m-----\u001b[30m\u001b[2m-------------------------\u001b[0m\u001b[0m 1.50 MiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m----\u001b[30m\u001b[2m--------------------------\u001b[0m\u001b[0m 1.59 MiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 1.47 MiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m-\u001b[30m\u001b[2m-----------------------------\u001b[0m\u001b[0m 1.74 MiB/33.57 MiB " | |
| ] | |
| }, | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "\u001b[8A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[8A\u001b[37m⠹\u001b[0m \u001b[2mPreparing packages...\u001b[0m (6/15)\r\n", | |
| "\u001b[2mbiopython \u001b[0m \u001b[32m----------------\u001b[30m\u001b[2m--------------\u001b[0m\u001b[0m 1.69 MiB/3.09 MiB\r\n", | |
| "\u001b[2mfonttools \u001b[0m \u001b[32m-------------\u001b[30m\u001b[2m-----------------\u001b[0m\u001b[0m 2.18 MiB/4.73 MiB\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m---------\u001b[30m\u001b[2m---------------------\u001b[0m\u001b[0m 2.08 MiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m-------\u001b[30m\u001b[2m-----------------------\u001b[0m\u001b[0m 2.11 MiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m-------\u001b[30m\u001b[2m-----------------------\u001b[0m\u001b[0m 2.11 MiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m------\u001b[30m\u001b[2m------------------------\u001b[0m\u001b[0m 2.19 MiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m----\u001b[30m\u001b[2m--------------------------\u001b[0m\u001b[0m 2.07 MiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m--\u001b[30m\u001b[2m----------------------------\u001b[0m\u001b[0m 2.36 MiB/33.57 MiB " | |
| ] | |
| }, | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "\u001b[8A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[8A\u001b[37m⠹\u001b[0m \u001b[2mPreparing packages...\u001b[0m (6/15)\r\n", | |
| "\u001b[2mbiopython \u001b[0m \u001b[32m-------------------\u001b[30m\u001b[2m-----------\u001b[0m\u001b[0m 2.05 MiB/3.09 MiB\r\n", | |
| "\u001b[2mfonttools \u001b[0m \u001b[32m---------------\u001b[30m\u001b[2m---------------\u001b[0m\u001b[0m 2.43 MiB/4.73 MiB\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m----------\u001b[30m\u001b[2m--------------------\u001b[0m\u001b[0m 2.34 MiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m-------\u001b[30m\u001b[2m-----------------------\u001b[0m\u001b[0m 2.15 MiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m-------\u001b[30m\u001b[2m-----------------------\u001b[0m\u001b[0m 2.16 MiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m------\u001b[30m\u001b[2m------------------------\u001b[0m\u001b[0m 2.36 MiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m-----\u001b[30m\u001b[2m-------------------------\u001b[0m\u001b[0m 2.38 MiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m--\u001b[30m\u001b[2m----------------------------\u001b[0m\u001b[0m 2.49 MiB/33.57 MiB " | |
| ] | |
| }, | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "\u001b[8A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[8A\u001b[37m⠸\u001b[0m \u001b[2mPreparing packages...\u001b[0m (7/15)\r\n", | |
| "\u001b[2mbiopython \u001b[0m \u001b[32m-----------------------\u001b[30m\u001b[2m-------\u001b[0m\u001b[0m 2.39 MiB/3.09 MiB\r\n", | |
| "\u001b[2mfonttools \u001b[0m \u001b[32m----------------\u001b[30m\u001b[2m--------------\u001b[0m\u001b[0m 2.64 MiB/4.73 MiB\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m-------------\u001b[30m\u001b[2m-----------------\u001b[0m\u001b[0m 2.96 MiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m----------\u001b[30m\u001b[2m--------------------\u001b[0m\u001b[0m 2.87 MiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m-------\u001b[30m\u001b[2m-----------------------\u001b[0m\u001b[0m 2.25 MiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m--------\u001b[30m\u001b[2m----------------------\u001b[0m\u001b[0m 3.02 MiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m-------\u001b[30m\u001b[2m-----------------------\u001b[0m\u001b[0m 2.95 MiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m--\u001b[30m\u001b[2m----------------------------\u001b[0m\u001b[0m 3.18 MiB/33.57 MiB " | |
| ] | |
| }, | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "\u001b[8A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[8A\u001b[37m⠸\u001b[0m \u001b[2mPreparing packages...\u001b[0m (7/15)\r\n", | |
| "\u001b[2mbiopython \u001b[0m \u001b[32m-------------------------\u001b[30m\u001b[2m-----\u001b[0m\u001b[0m 2.66 MiB/3.09 MiB\r\n", | |
| "\u001b[2mfonttools \u001b[0m \u001b[32m----------------------\u001b[30m\u001b[2m--------\u001b[0m\u001b[0m 3.54 MiB/4.73 MiB\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m---------------\u001b[30m\u001b[2m---------------\u001b[0m\u001b[0m 3.56 MiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m-----------\u001b[30m\u001b[2m-------------------\u001b[0m\u001b[0m 3.22 MiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m--------\u001b[30m\u001b[2m----------------------\u001b[0m\u001b[0m 2.48 MiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m----------\u001b[30m\u001b[2m--------------------\u001b[0m\u001b[0m 3.56 MiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m--------\u001b[30m\u001b[2m----------------------\u001b[0m\u001b[0m 3.50 MiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 3.75 MiB/33.57 MiB " | |
| ] | |
| }, | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "\u001b[8A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[8A\u001b[37m⠸\u001b[0m \u001b[2mPreparing packages...\u001b[0m (7/15)\r\n", | |
| "\u001b[2mbiopython \u001b[0m \u001b[32m----------------------------\u001b[30m\u001b[2m--\u001b[0m\u001b[0m 2.94 MiB/3.09 MiB\r\n", | |
| "\u001b[2mfonttools \u001b[0m \u001b[32m-----------------------\u001b[30m\u001b[2m-------\u001b[0m\u001b[0m 3.67 MiB/4.73 MiB\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m------------------\u001b[30m\u001b[2m------------\u001b[0m\u001b[0m 4.22 MiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m---------------\u001b[30m\u001b[2m---------------\u001b[0m\u001b[0m 4.16 MiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m----------\u001b[30m\u001b[2m--------------------\u001b[0m\u001b[0m 3.00 MiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m------------\u001b[30m\u001b[2m------------------\u001b[0m\u001b[0m 4.18 MiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m---------\u001b[30m\u001b[2m---------------------\u001b[0m\u001b[0m 4.14 MiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 4.05 MiB/33.57 MiB " | |
| ] | |
| }, | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "\u001b[8A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[8A\u001b[37m⠸\u001b[0m \u001b[2mPreparing packages...\u001b[0m (7/15)\r\n", | |
| "\u001b[2mfonttools \u001b[0m \u001b[32m-----------------------\u001b[30m\u001b[2m-------\u001b[0m\u001b[0m 3.76 MiB/4.73 MiB\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m--------------------\u001b[30m\u001b[2m----------\u001b[0m\u001b[0m 4.56 MiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m----------------\u001b[30m\u001b[2m--------------\u001b[0m\u001b[0m 4.47 MiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m----------\u001b[30m\u001b[2m--------------------\u001b[0m\u001b[0m 3.09 MiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m-------------\u001b[30m\u001b[2m-----------------\u001b[0m\u001b[0m 4.51 MiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m----------\u001b[30m\u001b[2m--------------------\u001b[0m\u001b[0m 4.49 MiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 4.08 MiB/33.57 MiB \u001b[7A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[7A\u001b[37m⠸\u001b[0m \u001b[2mPreparing packages...\u001b[0m (7/15)\r\n", | |
| "\u001b[2mfonttools \u001b[0m \u001b[32m--------------------------\u001b[30m\u001b[2m----\u001b[0m\u001b[0m 4.20 MiB/4.73 MiB\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m---------------------\u001b[30m\u001b[2m---------\u001b[0m\u001b[0m 4.81 MiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m-----------------\u001b[30m\u001b[2m-------------\u001b[0m\u001b[0m 4.72 MiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m-----------\u001b[30m\u001b[2m-------------------\u001b[0m\u001b[0m 3.17 MiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m-------------\u001b[30m\u001b[2m-----------------\u001b[0m\u001b[0m 4.65 MiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m-----------\u001b[30m\u001b[2m-------------------\u001b[0m\u001b[0m 4.74 MiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 4.09 MiB/33.57 MiB " | |
| ] | |
| }, | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "\u001b[7A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[7A\u001b[37m⠼\u001b[0m \u001b[2mPreparing packages...\u001b[0m (8/15)\r\n", | |
| "\u001b[2mfonttools \u001b[0m \u001b[32m-----------------------------\u001b[30m\u001b[2m-\u001b[0m\u001b[0m 4.71 MiB/4.73 MiB\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m------------------------\u001b[30m\u001b[2m------\u001b[0m\u001b[0m 5.53 MiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m-------------------\u001b[30m\u001b[2m-----------\u001b[0m\u001b[0m 5.41 MiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m-----------\u001b[30m\u001b[2m-------------------\u001b[0m\u001b[0m 3.34 MiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m---------------\u001b[30m\u001b[2m---------------\u001b[0m\u001b[0m 5.52 MiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m------------\u001b[30m\u001b[2m------------------\u001b[0m\u001b[0m 5.45 MiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 4.14 MiB/33.57 MiB \u001b[7A\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[1B\r", | |
| "\u001b[2K\u001b[7A\u001b[37m⠼\u001b[0m \u001b[2mPreparing packages...\u001b[0m (8/15)\r\n", | |
| "\u001b[2mpillow \u001b[0m \u001b[32m-------------------------\u001b[30m\u001b[2m-----\u001b[0m\u001b[0m 5.73 MiB/6.71 MiB\r\n", | |
| "\u001b[2mmatplotlib \u001b[0m \u001b[32m--------------------\u001b[30m\u001b[2m----------\u001b[0m\u001b[0m 5.56 MiB/8.31 MiB\r\n", | |
| "\u001b[2mscikit-learn \u001b[0m \u001b[32m-----------\u001b[30m\u001b[2m-------------------\u001b[0m\u001b[0m 3.37 MiB/8.49 MiB\r\n", | |
| "\u001b[2mpandas \u001b[0m \u001b[32m----------------\u001b[30m\u001b[2m--------------\u001b[0m\u001b[0m 5.69 MiB/10.37 MiB\r\n", | |
| "\u001b[2mlogomaker \u001b[0m \u001b[32m-------------\u001b[30m\u001b[2m-----------------\u001b[0m\u001b[0m 5.61 MiB/12.58 MiB\r\n", | |
| "\u001b[2mscipy \u001b[0m \u001b[32m---\u001b[30m\u001b[2m---------------------------\u001b[0m\u001b[0m 4.14 MiB/33.57 MiB " | |
| ] | |
| }, | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "Installed 15 packages in 60ms\n", | |
| " + biopython==1.86\n", | |
| " + contourpy==1.3.3\n", | |
| " + cycler==0.12.1\n", | |
| " + fonttools==4.62.1\n", | |
| " + joblib==1.5.3\n", | |
| " + kiwisolver==1.5.0\n", | |
| " + logomaker==0.8.7\n", | |
| " + matplotlib==3.10.8\n", | |
| " + pandas==3.0.1\n", | |
| " + pillow==12.1.1\n", | |
| " + pyparsing==3.3.2\n", | |
| " + scikit-learn==1.8.0\n", | |
| " + scipy==1.17.1\n", | |
| " + seaborn==0.13.2\n", | |
| " + threadpoolctl==3.6.0\n" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "# Install required packages\n", | |
| "!uv pip install numpy pandas scipy matplotlib seaborn biopython scikit-learn logomaker" | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 3, | |
| "metadata": {}, | |
| "outputs": [ | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "Libraries imported successfully!\n", | |
| "NumPy version: 2.4.2\n", | |
| "Pandas version: 3.0.1\n" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "# Import libraries\n", | |
| "import numpy as np\n", | |
| "import pandas as pd\n", | |
| "import matplotlib.pyplot as plt\n", | |
| "import seaborn as sns\n", | |
| "from scipy import stats\n", | |
| "from scipy.stats import fisher_exact, chi2_contingency\n", | |
| "from sklearn.linear_model import LogisticRegression\n", | |
| "from sklearn.metrics import classification_report, confusion_matrix\n", | |
| "from Bio import motifs\n", | |
| "from Bio.Seq import Seq\n", | |
| "import warnings\n", | |
| "warnings.filterwarnings('ignore')\n", | |
| "\n", | |
| "# Set random seed for reproducibility\n", | |
| "np.random.seed(42)\n", | |
| "\n", | |
| "# Configure plotting\n", | |
| "plt.style.use('seaborn-v0_8-darkgrid')\n", | |
| "sns.set_palette(\"husl\")\n", | |
| "%matplotlib inline\n", | |
| "\n", | |
| "print(\"Libraries imported successfully!\")\n", | |
| "print(f\"NumPy version: {np.__version__}\")\n", | |
| "print(f\"Pandas version: {pd.__version__}\")" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "## 2.2 Generate Synthetic Promoter Data\n", | |
| "\n", | |
| "Because the full genomic dataset is too large to include here, we generate synthetic promoter sequences that mimic the three promoter groups described in the paper:\n", | |
| "\n", | |
| "1. **Off-to-on genes** (~1,800 promoters)\n", | |
| "2. **Down-regulated genes** (~1,100 promoters)\n", | |
| "3. **Genes with alternative promoters** (~1,200 genes with 2 promoters each)\n", | |
| "\n", | |
| "For this demonstration, we use a smaller sample of each group: 200 off-to-on promoters, 120 down-regulated promoters, and 80 genes with two alternative promoters each." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 4, | |
| "metadata": {}, | |
| "outputs": [ | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "Generating synthetic promoter sequences...\n", | |
| "This represents a small-scale demonstration dataset.\n", | |
| "\n", | |
| "Generated 480 promoter sequences:\n", | |
| " - Off-to-on genes: 200\n", | |
| " - Down-regulated genes: 120\n", | |
| " - Alternative promoters: 160 (80 genes × 2 promoters)\n", | |
| "\n", | |
| "Example promoter:\n", | |
| " Type: off-to-on\n", | |
| " Sequence length: 300 bp\n", | |
| " TSS position: 150\n", | |
| " Motifs present: ['tMAC', 'AchiVis', 'Inr', 'ACA', 'CNAAATT']\n" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "def generate_random_sequence(length, gc_content=0.5):\n", | |
| "    \"\"\"\n", | |
| "    Generate a random DNA sequence with specified GC content.\n", | |
| "\n", | |
| "    Parameters:\n", | |
| "    -----------\n", | |
| "    length : int\n", | |
| "        Length of sequence to generate\n", | |
| "    gc_content : float\n", | |
| "        Proportion of G/C bases (0-1)\n", | |
| "\n", | |
| "    Returns:\n", | |
| "    --------\n", | |
| "    str : Random DNA sequence\n", | |
| "    \"\"\"\n", | |
| "    # Calculate base probabilities\n", | |
| "    gc_prob = gc_content / 2  # Each of G and C\n", | |
| "    at_prob = (1 - gc_content) / 2  # Each of A and T\n", | |
| "\n", | |
| "    bases = np.random.choice(['A', 'T', 'G', 'C'],\n", | |
| "                             size=length,\n", | |
| "                             p=[at_prob, at_prob, gc_prob, gc_prob])\n", | |
| "    return ''.join(bases)\n", | |
| "\n", | |
| "def insert_motif(sequence, motif, position):\n", | |
| "    \"\"\"\n", | |
| "    Place a motif in a sequence by overwriting bases at the given position.\n", | |
| "\n", | |
| "    The sequence length is unchanged. If the motif does not fit within the\n", | |
| "    sequence bounds, the sequence is returned unmodified.\n", | |
| "\n", | |
| "    Parameters:\n", | |
| "    -----------\n", | |
| "    sequence : str\n", | |
| "        Background DNA sequence\n", | |
| "    motif : str\n", | |
| "        Motif sequence to write into the background\n", | |
| "    position : int\n", | |
| "        Start position of the motif (0-indexed)\n", | |
| "\n", | |
| "    Returns:\n", | |
| "    --------\n", | |
| "    str : Sequence with the motif written in\n", | |
| "    \"\"\"\n", | |
| "    if position < 0 or position + len(motif) > len(sequence):\n", | |
| "        return sequence\n", | |
| "    return sequence[:position] + motif + sequence[position + len(motif):]\n", | |
| "\n", | |
| "def generate_promoter_sequence(promoter_type,\n", | |
| "                               insert_tmac=True,\n", | |
| "                               insert_achvis=True,\n", | |
| "                               insert_inr=True,\n", | |
| "                               insert_aca=True,\n", | |
| "                               insert_cnaaatt=True):\n", | |
| "    \"\"\"\n", | |
| "    Generate a synthetic promoter sequence with specified motifs.\n", | |
| "\n", | |
| "    Promoter structure (300 bp total, TSS at position 150):\n", | |
| "    - Position 90: tMAC-ChIP motif (60 bp upstream of TSS)\n", | |
| "    - Position 120: Achi/Vis motif (30 bp upstream of TSS)\n", | |
| "    - Position 150: TSS with Inr (TCA)\n", | |
| "    - Position 178: ACA motif (+28 from TSS)\n", | |
| "    - Position 190: CNAAATT motif (+40 from TSS)\n", | |
| "\n", | |
| "    Parameters:\n", | |
| "    -----------\n", | |
| "    promoter_type : str\n", | |
| "        'off-to-on', 'down-regulated', 'alternative-old', or 'alternative-new'\n", | |
| "    insert_* : bool\n", | |
| "        Whether to insert each motif\n", | |
| "\n", | |
| "    Returns:\n", | |
| "    --------\n", | |
| "    dict : Promoter information including sequence and motif positions\n", | |
| "    \"\"\"\n", | |
| "    # Define motif sequences\n", | |
| "    MOTIFS = {\n", | |
| "        'tMAC': 'TAGTACC',  # tMAC-ChIP motif\n", | |
| "        'AchiVis': 'TGTCA',  # Achi/Vis binding site\n", | |
| "        'Inr': 'TCA',  # Initiator\n", | |
| "        'ACA': 'ACA',  # Downstream motif\n", | |
| "        'CNAAATT': 'CAAAATT'  # Downstream motif (CNAAATT with N=A)\n", | |
| "    }\n", | |
| "\n", | |
| "    # Generate background sequence\n", | |
| "    # Only the 'off-to-on' type gets the AT-rich background (35% GC);\n", | |
| "    # every other type, including 'alternative-new', uses 50% GC\n", | |
| "    if promoter_type == 'off-to-on':\n", | |
| "        gc_content = 0.35\n", | |
| "    else:\n", | |
| "        gc_content = 0.50\n", | |
| "\n", | |
| "    seq = generate_random_sequence(300, gc_content=gc_content)\n", | |
| "\n", | |
| "    # TSS is at position 150 (0-indexed)\n", | |
| "    tss_pos = 150\n", | |
| "\n", | |
| "    # Insert motifs based on parameters\n", | |
| "    motif_positions = {}\n", | |
| "\n", | |
| "    if insert_tmac:\n", | |
| "        tmac_pos = 90  # 60 bp upstream of TSS\n", | |
| "        seq = insert_motif(seq, MOTIFS['tMAC'], tmac_pos)\n", | |
| "        motif_positions['tMAC'] = tmac_pos\n", | |
| "\n", | |
| "    if insert_achvis:\n", | |
| "        achvis_pos = 120  # 30 bp upstream of TSS\n", | |
| "        seq = insert_motif(seq, MOTIFS['AchiVis'], achvis_pos)\n", | |
| "        motif_positions['AchiVis'] = achvis_pos\n", | |
| "\n", | |
| "    if insert_inr:\n", | |
| "        seq = insert_motif(seq, MOTIFS['Inr'], tss_pos)\n", | |
| "        motif_positions['Inr'] = tss_pos\n", | |
| "\n", | |
| "    if insert_aca:\n", | |
| "        aca_pos = tss_pos + 28  # +28 from TSS\n", | |
| "        seq = insert_motif(seq, MOTIFS['ACA'], aca_pos)\n", | |
| "        motif_positions['ACA'] = aca_pos\n", | |
| "\n", | |
| "    if insert_cnaaatt:\n", | |
| "        cnaaatt_pos = tss_pos + 40  # +40 from TSS\n", | |
| "        seq = insert_motif(seq, MOTIFS['CNAAATT'], cnaaatt_pos)\n", | |
| "        motif_positions['CNAAATT'] = cnaaatt_pos\n", | |
| "\n", | |
| "    return {\n", | |
| "        'sequence': seq,\n", | |
| "        'type': promoter_type,\n", | |
| "        'tss_position': tss_pos,\n", | |
| "        'motif_positions': motif_positions,\n", | |
| "        'has_tMAC': insert_tmac,\n", | |
| "        'has_AchiVis': insert_achvis,\n", | |
| "        'has_Inr': insert_inr,\n", | |
| "        'has_ACA': insert_aca,\n", | |
| "        'has_CNAAATT': insert_cnaaatt\n", | |
| "    }\n", | |
| "\n", | |
| "# Generate synthetic dataset\n", | |
| "print(\"Generating synthetic promoter sequences...\")\n", | |
| "print(\"This represents a small-scale demonstration dataset.\\n\")\n", | |
| "\n", | |
| "promoters = []\n", | |
| "\n", | |
| "# Off-to-on genes (sample of 200 from ~1,800)\n", | |
| "# These should have most/all of the motifs\n", | |
| "for i in range(200):\n", | |
| "    # Vary motif presence to create diversity\n", | |
| "    has_all = np.random.random() < 0.4  # 40% have all motifs\n", | |
| "    if has_all:\n", | |
| "        prom = generate_promoter_sequence('off-to-on',\n", | |
| "                                          insert_tmac=True,\n", | |
| "                                          insert_achvis=True,\n", | |
| "                                          insert_inr=True,\n", | |
| "                                          insert_aca=True,\n", | |
| "                                          insert_cnaaatt=True)\n", | |
| "    else:\n", | |
| "        # Randomly omit some motifs\n", | |
| "        prom = generate_promoter_sequence('off-to-on',\n", | |
| "                                          insert_tmac=np.random.random() < 0.8,\n", | |
| "                                          insert_achvis=np.random.random() < 0.7,\n", | |
| "                                          insert_inr=np.random.random() < 0.6,\n", | |
| "                                          insert_aca=np.random.random() < 0.5,\n", | |
| "                                          insert_cnaaatt=np.random.random() < 0.5)\n", | |
| "    prom['gene_id'] = f'off_to_on_{i}'\n", | |
| "    promoters.append(prom)\n", | |
| "\n", | |
| "# Down-regulated genes (sample of 120 from ~1,100)\n", | |
| "# These should lack the spermatocyte-specific motifs\n", | |
| "for i in range(120):\n", | |
| "    prom = generate_promoter_sequence('down-regulated',\n", | |
| "                                      insert_tmac=False,\n", | |
| "                                      insert_achvis=False,\n", | |
| "                                      insert_inr=np.random.random() < 0.3,\n", | |
| "                                      insert_aca=False,\n", | |
| "                                      insert_cnaaatt=False)\n", | |
| "    prom['gene_id'] = f'down_reg_{i}'\n", | |
| "    promoters.append(prom)\n", | |
| "\n", | |
| "# Alternative promoters (80 genes with 2 promoters each from ~1,200)\n", | |
| "# New promoters resemble off-to-on, old promoters resemble down-regulated\n", | |
| "for i in range(80):\n", | |
| "    # Old promoter\n", | |
| "    prom_old = generate_promoter_sequence('alternative-old',\n", | |
| "                                          insert_tmac=False,\n", | |
| "                                          insert_achvis=False,\n", | |
| "                                          insert_inr=np.random.random() < 0.3,\n", | |
| "                                          insert_aca=False,\n", | |
| "                                          insert_cnaaatt=False)\n", | |
| "    prom_old['gene_id'] = f'alt_prom_{i}'\n", | |
| "    prom_old['promoter_class'] = 'old'\n", | |
| "    promoters.append(prom_old)\n", | |
| "\n", | |
| "    # New promoter\n", | |
| "    has_all = np.random.random() < 0.5\n", | |
| "    if has_all:\n", | |
| "        prom_new = generate_promoter_sequence('alternative-new',\n", | |
| "                                              insert_tmac=True,\n", | |
| "                                              insert_achvis=True,\n", | |
| "                                              insert_inr=True,\n", | |
| "                                              insert_aca=True,\n", | |
| "                                              insert_cnaaatt=True)\n", | |
| "    else:\n", | |
| "        prom_new = generate_promoter_sequence('alternative-new',\n", | |
| "                                              insert_tmac=np.random.random() < 0.8,\n", | |
| "                                              insert_achvis=np.random.random() < 0.7,\n", | |
| "                                              insert_inr=np.random.random() < 0.6,\n", | |
| "                                              insert_aca=np.random.random() < 0.5,\n", | |
| "                                              insert_cnaaatt=np.random.random() < 0.5)\n", | |
| "    prom_new['gene_id'] = f'alt_prom_{i}'\n", | |
| "    prom_new['promoter_class'] = 'new'\n", | |
| "    promoters.append(prom_new)\n", | |
| "\n", | |
| "print(f\"Generated {len(promoters)} promoter sequences:\")\n", | |
| "print(f\" - Off-to-on genes: 200\")\n", | |
| "print(f\" - Down-regulated genes: 120\")\n", | |
| "print(f\" - Alternative promoters: 160 (80 genes × 2 promoters)\")\n", | |
| "print(\"\\nExample promoter:\")\n", | |
| "print(f\" Type: {promoters[0]['type']}\")\n", | |
| "print(f\" Sequence length: {len(promoters[0]['sequence'])} bp\")\n", | |
| "print(f\" TSS position: {promoters[0]['tss_position']}\")\n", | |
| "print(f\" Motifs present: {list(promoters[0]['motif_positions'].keys())}\")" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "## 2.3 Simulate CAGE and Expression Data\n", | |
| "\n", | |
| "Next, we simulate CAGE signals (a proxy for TSS usage) and expression levels from each promoter's motif composition.\n", | |
| "\n", | |
| "### Biological Model:\n", | |
| "\n", | |
| "1. **Base expression** is set by promoter type (100 for off-to-on and new alternative promoters, 50 for down-regulated and old alternative promoters)\n", | |
| "2. **Enhancement** is applied for each motif present:\n", | |
| "   - tMAC: 2.0× (opens chromatin)\n", | |
| "   - Achi/Vis: 1.5× (transcription factor binding)\n", | |
| "   - Inr (TCA): 1.8× (precise TSS positioning)\n", | |
| "   - ACA: 1.4× (downstream element)\n", | |
| "   - CNAAATT: 1.3× (downstream element)\n", | |
| "3. **Effects are multiplicative** (consistent with logistic regression findings)\n", | |
| "4. **Biological noise** is added by sampling from a Poisson distribution" | |
| ] | |
| }, | |
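| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "As a worked example of the multiplicative model, consider an off-to-on promoter carrying all five motifs. This is only a sketch: it ignores the noise term and assumes a base expression of 100 (the value the simulation code below assigns to off-to-on promoters) together with the multipliers listed above.\n", | |
| "\n", | |
| "$$\n", | |
| "E = 100 \\times 2.0 \\times 1.5 \\times 1.8 \\times 1.4 \\times 1.3 = 100 \\times 9.828 \\approx 983\n", | |
| "$$\n", | |
| "\n", | |
| "A promoter with only tMAC and Achi/Vis would instead give $100 \\times 2.0 \\times 1.5 = 300$. Under this model, the same motif therefore contributes a larger absolute gain when other motifs are already present." | |
| ] | |
| }, | |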
| { | |
| "cell_type": "code", | |
| "execution_count": 5, | |
| "metadata": {}, | |
| "outputs": [ | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "Simulating CAGE signals and expression levels...\n", | |
| "\n", | |
| "Summary statistics:\n", | |
| "Total promoters: 480\n", | |
| "\n", | |
| "Promoter types:\n", | |
| "type\n", | |
| "off-to-on 200\n", | |
| "down-regulated 120\n", | |
| "alternative-old 80\n", | |
| "alternative-new 80\n", | |
| "Name: count, dtype: int64\n", | |
| "\n", | |
| "Promoter classes:\n", | |
| "promoter_class\n", | |
| "broad-low 215\n", | |
| "narrow-high 183\n", | |
| "broad-high 82\n", | |
| "Name: count, dtype: int64\n", | |
| "\n", | |
| "Expression statistics:\n", | |
| "count 480.000000\n", | |
| "mean 446.383333\n", | |
| "std 395.137812\n", | |
| "min 34.000000\n", | |
| "25% 59.750000\n", | |
| "50% 313.000000\n", | |
| "75% 948.000000\n", | |
| "max 1074.000000\n", | |
| "Name: cage_signal, dtype: float64\n", | |
| "\n", | |
| "Motif prevalence:\n", | |
| " has_tMAC: 255 (53.1%)\n", | |
| " has_AchiVis: 233 (48.5%)\n", | |
| " has_Inr: 286 (59.6%)\n", | |
| " has_ACA: 200 (41.7%)\n", | |
| " has_CNAAATT: 195 (40.6%)\n", | |
| "\n", | |
| "First 5 promoters:\n" | |
| ] | |
| }, | |
| { | |
| "data": { | |
| "text/html": [ | |
| "<div>\n", | |
| "<style scoped>\n", | |
| " .dataframe tbody tr th:only-of-type {\n", | |
| " vertical-align: middle;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe tbody tr th {\n", | |
| " vertical-align: top;\n", | |
| " }\n", | |
| "\n", | |
| " .dataframe thead th {\n", | |
| " text-align: right;\n", | |
| " }\n", | |
| "</style>\n", | |
| "<table border=\"1\" class=\"dataframe\">\n", | |
| " <thead>\n", | |
| " <tr style=\"text-align: right;\">\n", | |
| " <th></th>\n", | |
| " <th>gene_id</th>\n", | |
| " <th>type</th>\n", | |
| " <th>n_motifs</th>\n", | |
| " <th>cage_signal</th>\n", | |
| " <th>reti_width</th>\n", | |
| " <th>promoter_class</th>\n", | |
| " </tr>\n", | |
| " </thead>\n", | |
| " <tbody>\n", | |
| " <tr>\n", | |
| " <th>0</th>\n", | |
| " <td>off_to_on_0</td>\n", | |
| " <td>off-to-on</td>\n", | |
| " <td>5</td>\n", | |
| " <td>920</td>\n", | |
| " <td>8</td>\n", | |
| " <td>narrow-high</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>1</th>\n", | |
| " <td>off_to_on_1</td>\n", | |
| " <td>off-to-on</td>\n", | |
| " <td>2</td>\n", | |
| " <td>313</td>\n", | |
| " <td>10</td>\n", | |
| " <td>narrow-high</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>2</th>\n", | |
| " <td>off_to_on_2</td>\n", | |
| " <td>off-to-on</td>\n", | |
| " <td>5</td>\n", | |
| " <td>973</td>\n", | |
| " <td>9</td>\n", | |
| " <td>narrow-high</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>3</th>\n", | |
| " <td>off_to_on_3</td>\n", | |
| " <td>off-to-on</td>\n", | |
| " <td>5</td>\n", | |
| " <td>972</td>\n", | |
| " <td>10</td>\n", | |
| " <td>narrow-high</td>\n", | |
| " </tr>\n", | |
| " <tr>\n", | |
| " <th>4</th>\n", | |
| " <td>off_to_on_4</td>\n", | |
| " <td>off-to-on</td>\n", | |
| " <td>3</td>\n", | |
| " <td>543</td>\n", | |
| " <td>29</td>\n", | |
| " <td>broad-high</td>\n", | |
| " </tr>\n", | |
| " </tbody>\n", | |
| "</table>\n", | |
| "</div>" | |
| ], | |
| "text/plain": [ | |
| " gene_id type n_motifs cage_signal reti_width promoter_class\n", | |
| "0 off_to_on_0 off-to-on 5 920 8 narrow-high\n", | |
| "1 off_to_on_1 off-to-on 2 313 10 narrow-high\n", | |
| "2 off_to_on_2 off-to-on 5 973 9 narrow-high\n", | |
| "3 off_to_on_3 off-to-on 5 972 10 narrow-high\n", | |
| "4 off_to_on_4 off-to-on 3 543 29 broad-high" | |
| ] | |
| }, | |
| "execution_count": 5, | |
| "metadata": {}, | |
| "output_type": "execute_result" | |
| } | |
| ], | |
| "source": [ | |
| "def simulate_expression_data(promoters):\n", | |
| "    \"\"\"\n", | |
| "    Simulate CAGE signals and expression levels based on motif composition.\n", | |
| "\n", | |
| "    Parameters:\n", | |
| "    -----------\n", | |
| "    promoters : list\n", | |
| "        List of promoter dictionaries\n", | |
| "\n", | |
| "    Returns:\n", | |
| "    --------\n", | |
| "    pd.DataFrame : Promoter data with simulated expression\n", | |
| "    \"\"\"\n", | |
| "    data = []\n", | |
| "\n", | |
| "    for prom in promoters:\n", | |
| "        # Base expression level\n", | |
| "        if prom['type'] == 'off-to-on':\n", | |
| "            base_expr = 100\n", | |
| "        elif prom['type'] == 'down-regulated':\n", | |
| "            base_expr = 50\n", | |
| "        elif 'alternative' in prom['type']:\n", | |
| "            if prom.get('promoter_class') == 'new':\n", | |
| "                base_expr = 100\n", | |
| "            else:\n", | |
| "                base_expr = 50\n", | |
| "        else:\n", | |
| "            base_expr = 75\n", | |
| "\n", | |
| "        # Calculate multiplicative enhancement from motifs\n", | |
| "        enhancement = 1.0\n", | |
| "        if prom['has_tMAC']:\n", | |
| "            enhancement *= 2.0\n", | |
| "        if prom['has_AchiVis']:\n", | |
| "            enhancement *= 1.5\n", | |
| "        if prom['has_Inr']:\n", | |
| "            enhancement *= 1.8\n", | |
| "        if prom['has_ACA']:\n", | |
| "            enhancement *= 1.4\n", | |
| "        if prom['has_CNAAATT']:\n", | |
| "            enhancement *= 1.3\n", | |
| "\n", | |
| "        # Calculate expected expression\n", | |
| "        expected_expr = base_expr * enhancement\n", | |
| "\n", | |
| "        # Add Poisson noise (represents biological and technical variability)\n", | |
| "        cage_signal = np.random.poisson(expected_expr)\n", | |
| "\n", | |
| "        # Determine RETI width based on motif composition\n", | |
| "        # Promoters with at least 4 of the 5 motifs are simulated as narrow\n", | |
| "        n_motifs = sum([prom['has_tMAC'], prom['has_AchiVis'],\n", | |
| "                        prom['has_Inr'], prom['has_ACA'], prom['has_CNAAATT']])\n", | |
| "\n", | |
| "        # np.random.randint excludes the upper bound\n", | |
| "        if n_motifs >= 4:\n", | |
| "            reti_width = np.random.randint(5, 11)  # Narrow: 5-10\n", | |
| "        elif n_motifs >= 2:\n", | |
| "            reti_width = np.random.randint(10, 30)  # Medium: 10-29\n", | |
| "        else:\n", | |
| "            reti_width = np.random.randint(25, 60)  # Broad: 25-59\n", | |
| "\n", | |
| "        # Classify promoter: narrow if RETI width < 11, high if CAGE signal > 200\n", | |
| "        # (a width of 10 drawn from the medium range still counts as narrow)\n", | |
| "        is_narrow = reti_width < 11\n", | |
| "\n", | |
| "        if is_narrow and cage_signal > 200:\n", | |
| "            promoter_class = 'narrow-high'\n", | |
| "        elif is_narrow:\n", | |
| "            promoter_class = 'narrow-low'\n", | |
| "        elif cage_signal > 200:\n", | |
| "            promoter_class = 'broad-high'\n", | |
| "        else:\n", | |
| "            promoter_class = 'broad-low'\n", | |
| "\n", | |
| "        data.append({\n", | |
| "            'gene_id': prom['gene_id'],\n", | |
| "            'type': prom['type'],\n", | |
| "            'sequence': prom['sequence'],\n", | |
| "            'tss_position': prom['tss_position'],\n", | |
| "            'has_tMAC': prom['has_tMAC'],\n", | |
| "            'has_AchiVis': prom['has_AchiVis'],\n", | |
| "            'has_Inr': prom['has_Inr'],\n", | |
| "            'has_ACA': prom['has_ACA'],\n", | |
| "            'has_CNAAATT': prom['has_CNAAATT'],\n", | |
| "            'n_motifs': n_motifs,\n", | |
| "            'cage_signal': cage_signal,\n", | |
| "            'reti_width': reti_width,\n", | |
| "            'promoter_class': promoter_class\n", | |
| "        })\n", | |
| "\n", | |
| "    return pd.DataFrame(data)\n", | |
| "\n", | |
| "# Generate expression data\n", | |
| "print(\"Simulating CAGE signals and expression levels...\\n\")\n", | |
| "df_promoters = simulate_expression_data(promoters)\n", | |
| "\n", | |
| "print(\"Summary statistics:\")\n", | |
| "print(f\"Total promoters: {len(df_promoters)}\")\n", | |
| "print(f\"\\nPromoter types:\")\n", | |
| "print(df_promoters['type'].value_counts())\n", | |
| "print(f\"\\nPromoter classes:\")\n", | |
| "print(df_promoters['promoter_class'].value_counts())\n", | |
| "print(f\"\\nExpression statistics:\")\n", | |
| "print(df_promoters['cage_signal'].describe())\n", | |
| "print(f\"\\nMotif prevalence:\")\n", | |
| "for motif in ['has_tMAC', 'has_AchiVis', 'has_Inr', 'has_ACA', 'has_CNAAATT']:\n", | |
| "    n_with_motif = df_promoters[motif].sum()\n", | |
| "    pct = 100 * n_with_motif / len(df_promoters)\n", | |
| "    print(f\"  {motif}: {n_with_motif} ({pct:.1f}%)\")\n", | |
| "\n", | |
| "# Display first few rows\n", | |
| "print(\"\\nFirst 5 promoters:\")\n", | |
| "display_cols = ['gene_id', 'type', 'n_motifs', 'cage_signal', 'reti_width', 'promoter_class']\n", | |
| "df_promoters[display_cols].head()" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "## 2.4 Motif Discovery and Enrichment Analysis\n", | |
| "\n", | |
| "Now we'll implement simplified versions of the motif discovery analyses.\n", | |
| "\n", | |
| "### Approach:\n", | |
| "\n", | |
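| "1. **Scan sequences** for exact matches to the known motif sequences (in a real analysis, motifs would be discovered de novo)\n", | |
| "2. **Calculate enrichment** of each motif across promoter types, using Fisher's exact test to compare off-to-on and down-regulated promoters\n", | |
| "3. **Analyze positional preferences** relative to the TSS\n", | |
| "\n", | |
| "Note that the 3-bp motifs (Inr `TCA`, ACA `ACA`) occur by chance in almost every sequence, so a whole-sequence presence/absence scan cannot separate promoter types for them; their position relative to the TSS is more informative." | |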
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 6, | |
| "metadata": {}, | |
| "outputs": [ | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "Calculating motif enrichment across promoter types...\n", | |
| "\n", | |
| "\n", | |
| "============================================================\n", | |
| "Motif: tMAC-ChIP (TAGTACC)\n", | |
| "============================================================\n", | |
| " group n_promoters n_with_motif fraction_with_motif total_occurrences avg_occurrences_per_promoter\n", | |
| " off-to-on 200 179 0.895000 182 0.910000\n", | |
| " down-regulated 120 1 0.008333 1 0.008333\n", | |
| "alternative-old 80 1 0.012500 1 0.012500\n", | |
| "alternative-new 80 76 0.950000 76 0.950000\n", | |
| "\n", | |
| "Fisher's exact test (off-to-on vs down-regulated):\n", | |
| " Odds ratio: 1014.33\n", | |
| " P-value: 2.12e-64\n", | |
| " *** Significantly enriched in off-to-on promoters ***\n", | |
| "\n", | |
| "============================================================\n", | |
| "Motif: Achi/Vis (TGTCA)\n", | |
| "============================================================\n" | |
| ] | |
| }, | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| " group n_promoters n_with_motif fraction_with_motif total_occurrences avg_occurrences_per_promoter\n", | |
| " off-to-on 200 174 0.870000 241 1.205000\n", | |
| " down-regulated 120 23 0.191667 26 0.216667\n", | |
| "alternative-old 80 20 0.250000 22 0.275000\n", | |
| "alternative-new 80 73 0.912500 95 1.187500\n", | |
| "\n", | |
| "Fisher's exact test (off-to-on vs down-regulated):\n", | |
| " Odds ratio: 28.22\n", | |
| " P-value: 7.44e-35\n", | |
| " *** Significantly enriched in off-to-on promoters ***\n", | |
| "\n", | |
| "============================================================\n", | |
| "Motif: Inr (TCA)\n", | |
| "============================================================\n", | |
| " group n_promoters n_with_motif fraction_with_motif total_occurrences avg_occurrences_per_promoter\n", | |
| " off-to-on 200 200 1.000 1444 7.220000\n", | |
| " down-regulated 120 117 0.975 578 4.816667\n", | |
| "alternative-old 80 78 0.975 415 5.187500\n", | |
| "alternative-new 80 80 1.000 481 6.012500\n", | |
| "\n", | |
| "Fisher's exact test (off-to-on vs down-regulated):\n", | |
| " Odds ratio: inf\n", | |
| " P-value: 5.19e-02\n", | |
| "\n", | |
| "============================================================\n", | |
| "Motif: ACA (ACA)\n", | |
| "============================================================\n" | |
| ] | |
| }, | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| " group n_promoters n_with_motif fraction_with_motif total_occurrences avg_occurrences_per_promoter\n", | |
| " off-to-on 200 200 1.000000 1201 6.005000\n", | |
| " down-regulated 120 119 0.991667 592 4.933333\n", | |
| "alternative-old 80 78 0.975000 380 4.750000\n", | |
| "alternative-new 80 80 1.000000 428 5.350000\n", | |
| "\n", | |
| "Fisher's exact test (off-to-on vs down-regulated):\n", | |
| " Odds ratio: inf\n", | |
| " P-value: 3.75e-01\n", | |
| "\n", | |
| "============================================================\n", | |
| "Motif: CNAAATT (CAAAATT)\n", | |
| "============================================================\n" | |
| ] | |
| }, | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| " group n_promoters n_with_motif fraction_with_motif total_occurrences avg_occurrences_per_promoter\n", | |
| " off-to-on 200 141 0.705000 149 0.745000\n", | |
| " down-regulated 120 2 0.016667 2 0.016667\n", | |
| "alternative-old 80 2 0.025000 2 0.025000\n", | |
| "alternative-new 80 56 0.700000 58 0.725000\n", | |
| "\n", | |
| "Fisher's exact test (off-to-on vs down-regulated):\n", | |
| " Odds ratio: 141.00\n", | |
| " P-value: 2.34e-39\n", | |
| " *** Significantly enriched in off-to-on promoters ***\n", | |
| "\n", | |
| "============================================================\n", | |
| "Motif enrichment analysis complete!\n", | |
| "============================================================\n" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "def find_motif_positions(sequence, motif, allow_mismatch=0):\n", | |
| "    \"\"\"\n", | |
| "    Find all occurrences of a motif in a sequence.\n", | |
| "\n", | |
| "    Parameters:\n", | |
| "    -----------\n", | |
| "    sequence : str\n", | |
| "        DNA sequence to search\n", | |
| "    motif : str\n", | |
| "        Motif to find\n", | |
| "    allow_mismatch : int\n", | |
| "        Number of mismatches allowed (0 = exact match)\n", | |
| "\n", | |
| "    Returns:\n", | |
| "    --------\n", | |
| "    list : Positions where motif occurs (0-indexed)\n", | |
| "    \"\"\"\n", | |
| "    positions = []\n", | |
| "    motif_len = len(motif)\n", | |
| "\n", | |
| "    for i in range(len(sequence) - motif_len + 1):\n", | |
| "        window = sequence[i:i+motif_len]\n", | |
| "        mismatches = sum(a != b for a, b in zip(window, motif))\n", | |
| "        if mismatches <= allow_mismatch:\n", | |
| "            positions.append(i)\n", | |
| "\n", | |
| "    return positions\n", | |
| "\n", | |
| "def calculate_motif_enrichment(df, motif_name, motif_seq, group_col='type'):\n", | |
| "    \"\"\"\n", | |
| "    Calculate motif enrichment across different groups.\n", | |
| "\n", | |
| "    Parameters:\n", | |
| "    -----------\n", | |
| "    df : pd.DataFrame\n", | |
| "        Promoter data\n", | |
| "    motif_name : str\n", | |
| "        Name of motif (label only; not used in the calculation)\n", | |
| "    motif_seq : str\n", | |
| "        Motif sequence\n", | |
| "    group_col : str\n", | |
| "        Column to group by\n", | |
| "\n", | |
| "    Returns:\n", | |
| "    --------\n", | |
| "    pd.DataFrame : Enrichment statistics\n", | |
| "    \"\"\"\n", | |
| "    results = []\n", | |
| "\n", | |
| "    for group in df[group_col].unique():\n", | |
| "        group_df = df[df[group_col] == group]\n", | |
| "        n_with_motif = 0\n", | |
| "        total_occurrences = 0\n", | |
| "\n", | |
| "        # Count exact matches anywhere in each promoter sequence\n", | |
| "        for seq in group_df['sequence']:\n", | |
| "            positions = find_motif_positions(seq, motif_seq, allow_mismatch=0)\n", | |
| "            if len(positions) > 0:\n", | |
| "                n_with_motif += 1\n", | |
| "                total_occurrences += len(positions)\n", | |
| "\n", | |
| "        n_total = len(group_df)\n", | |
| "        fraction_with_motif = n_with_motif / n_total if n_total > 0 else 0\n", | |
| "        avg_occurrences = total_occurrences / n_total if n_total > 0 else 0\n", | |
| "\n", | |
| "        results.append({\n", | |
| "            'group': group,\n", | |
| "            'n_promoters': n_total,\n", | |
| "            'n_with_motif': n_with_motif,\n", | |
| "            'fraction_with_motif': fraction_with_motif,\n", | |
| "            'total_occurrences': total_occurrences,\n", | |
| "            'avg_occurrences_per_promoter': avg_occurrences\n", | |
| "        })\n", | |
| "\n", | |
| "    return pd.DataFrame(results)\n", | |
| "\n", | |
| "# Define motifs\n", | |
| "MOTIFS = {\n", | |
| "    'tMAC-ChIP': 'TAGTACC',\n", | |
| "    'Achi/Vis': 'TGTCA',\n", | |
| "    'Inr': 'TCA',\n", | |
| "    'ACA': 'ACA',\n", | |
| "    'CNAAATT': 'CAAAATT'\n", | |
| "}\n", | |
| "\n", | |
| "print(\"Calculating motif enrichment across promoter types...\\n\")\n", | |
| "\n", | |
| "for motif_name, motif_seq in MOTIFS.items():\n", | |
| "    print(f\"\\n{'='*60}\")\n", | |
| "    print(f\"Motif: {motif_name} ({motif_seq})\")\n", | |
| "    print(f\"{'='*60}\")\n", | |
| "\n", | |
| "    enrichment_df = calculate_motif_enrichment(df_promoters, motif_name, motif_seq)\n", | |
| "    print(enrichment_df.to_string(index=False))\n", | |
| "\n", | |
| "    # Statistical test: compare off-to-on vs down-regulated\n", | |
| "    off_to_on = enrichment_df[enrichment_df['group'] == 'off-to-on']\n", | |
| "    down_reg = enrichment_df[enrichment_df['group'] == 'down-regulated']\n", | |
| "\n", | |
| "    if len(off_to_on) > 0 and len(down_reg) > 0:\n", | |
| "        # Fisher's exact test on a 2x2 table:\n", | |
| "        # rows = promoter type, columns = [with motif, without motif]\n", | |
| "        contingency_table = [\n", | |
| "            [off_to_on['n_with_motif'].values[0],\n", | |
| "             off_to_on['n_promoters'].values[0] - off_to_on['n_with_motif'].values[0]],\n", | |
| "            [down_reg['n_with_motif'].values[0],\n", | |
| "             down_reg['n_promoters'].values[0] - down_reg['n_with_motif'].values[0]]\n", | |
| "        ]\n", | |
| "\n", | |
| "        odds_ratio, p_value = fisher_exact(contingency_table)\n", | |
| "        print(f\"\\nFisher's exact test (off-to-on vs down-regulated):\")\n", | |
| "        print(f\"  Odds ratio: {odds_ratio:.2f}\")\n", | |
| "        print(f\"  P-value: {p_value:.2e}\")\n", | |
| "        # The test is two-sided, so report enrichment only when the odds ratio exceeds 1\n", | |
| "        if p_value < 0.05 and odds_ratio > 1:\n", | |
| "            print(\"  *** Significantly enriched in off-to-on promoters ***\")\n", | |
| "\n", | |
| "print(\"\\n\" + \"=\"*60)\n", | |
| "print(\"Motif enrichment analysis complete!\")\n", | |
| "print(\"=\"*60)" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "## 2.5 Positional Enrichment Analysis\n", | |
| "\n", | |
| "Next, we analyze where motifs occur relative to the TSS, using a simplified CentriMo-like positional analysis." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 7, | |
| "metadata": {}, | |
| "outputs": [ | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "Analyzing motif positions relative to TSS...\n", | |
| "\n", | |
| "Analyzing tMAC-ChIP...\n", | |
| " Occurrences: 182\n", | |
| " Mean position: -58.1 bp from TSS\n", | |
| " Median position: -60.0 bp from TSS\n", | |
| " Modal position: -57.5 bp from TSS\n", | |
| "\n", | |
| "Analyzing Achi/Vis...\n", | |
| " Occurrences: 241\n", | |
| " Mean position: -24.0 bp from TSS\n", | |
| " Median position: -30.0 bp from TSS\n", | |
| " Modal position: -27.5 bp from TSS\n", | |
| "\n", | |
| "Analyzing Inr...\n" | |
| ] | |
| }, | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| " Occurrences: 1444\n", | |
| " Mean position: -1.6 bp from TSS\n", | |
| " Median position: +0.0 bp from TSS\n", | |
| " Modal position: -27.5 bp from TSS\n", | |
| "\n", | |
| "Analyzing ACA...\n", | |
| " Occurrences: 1201\n", | |
| " Mean position: +1.9 bp from TSS\n", | |
| " Median position: +26.0 bp from TSS\n", | |
| " Modal position: +27.5 bp from TSS\n", | |
| "\n", | |
| "Analyzing CNAAATT...\n", | |
| " Occurrences: 149\n", | |
| " Mean position: +35.5 bp from TSS\n", | |
| " Median position: +40.0 bp from TSS\n", | |
| " Modal position: +42.5 bp from TSS\n", | |
| "\n", | |
| "Positional analysis complete!\n" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "def analyze_motif_positions(df, motif_seq, motif_name, promoter_type='off-to-on'):\n", | |
| "    \"\"\"\n", | |
| "    Analyze positional distribution of a motif relative to TSS.\n", | |
| "\n", | |
| "    Parameters:\n", | |
| "    -----------\n", | |
| "    df : pd.DataFrame\n", | |
| "        Promoter data\n", | |
| "    motif_seq : str\n", | |
| "        Motif sequence to analyze\n", | |
| "    motif_name : str\n", | |
| "        Name of motif\n", | |
| "    promoter_type : str\n", | |
| "        Type of promoter to analyze\n", | |
| "\n", | |
| "    Returns:\n", | |
| "    --------\n", | |
| "    dict : Position distribution data\n", | |
| "    \"\"\"\n", | |
| "    subset = df[df['type'] == promoter_type]\n", | |
| "\n", | |
| "    positions_relative_to_tss = []\n", | |
| "\n", | |
| "    for _, row in subset.iterrows():\n", | |
| "        seq = row['sequence']\n", | |
| "        tss = row['tss_position']\n", | |
| "\n", | |
| "        # Find motif positions\n", | |
| "        motif_positions = find_motif_positions(seq, motif_seq, allow_mismatch=0)\n", | |
| "\n", | |
| "        # Calculate relative positions (start of each match minus the TSS index)\n", | |
| "        for pos in motif_positions:\n", | |
| "            rel_pos = pos - tss\n", | |
| "            positions_relative_to_tss.append(rel_pos)\n", | |
| "\n", | |
| "    # Create histogram with 5-bp bins from -150 to +150;\n", | |
| "    # positions outside this range are not counted\n", | |
| "    if len(positions_relative_to_tss) > 0:\n", | |
| "        hist, bin_edges = np.histogram(positions_relative_to_tss,\n", | |
| "                                       bins=range(-150, 151, 5))\n", | |
| "        bin_centers = (bin_edges[:-1] + bin_edges[1:]) / 2\n", | |
| "    else:\n", | |
| "        hist = np.array([])\n", | |
| "        bin_centers = np.array([])\n", | |
| "\n", | |
| "    return {\n", | |
| "        'positions': positions_relative_to_tss,\n", | |
| "        'hist': hist,\n", | |
| "        'bin_centers': bin_centers,\n", | |
| "        'n_occurrences': len(positions_relative_to_tss),\n", | |
| "        'n_promoters': len(subset)\n", | |
| "    }\n", | |
| "\n", | |
| "# Analyze positional enrichment for each motif\n", | |
| "print(\"Analyzing motif positions relative to TSS...\\n\")\n", | |
| "\n", | |
| "position_data = {}\n", | |
| "\n", | |
| "for motif_name, motif_seq in MOTIFS.items():\n", | |
| "    print(f\"Analyzing {motif_name}...\")\n", | |
| "\n", | |
| "    pos_data = analyze_motif_positions(df_promoters, motif_seq,\n", | |
| "                                       motif_name, 'off-to-on')\n", | |
| "    position_data[motif_name] = pos_data\n", | |
| "\n", | |
| "    if pos_data['n_occurrences'] > 0:\n", | |
| "        mean_pos = np.mean(pos_data['positions'])\n", | |
| "        median_pos = np.median(pos_data['positions'])\n", | |
| "        mode_bin_idx = np.argmax(pos_data['hist'])\n", | |
| "        mode_pos = pos_data['bin_centers'][mode_bin_idx] if len(pos_data['hist']) > 0 else 0\n", | |
| "\n", | |
| "        print(f\"  Occurrences: {pos_data['n_occurrences']}\")\n", | |
| "        print(f\"  Mean position: {mean_pos:+.1f} bp from TSS\")\n", | |
| "        print(f\"  Median position: {median_pos:+.1f} bp from TSS\")\n", | |
| "        print(f\"  Modal position: {mode_pos:+.1f} bp from TSS\")\n", | |
| "    else:\n", | |
| "        print(f\"  No occurrences found\")\n", | |
| "    print()\n", | |
| "\n", | |
| "print(\"Positional analysis complete!\")" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "## 2.6 TSS Usage Analysis\n", | |
| "\n", | |
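| "Compare the simulated CAGE signal of off-to-on promoters with and without each core-region motif (Inr, ACA, CNAAATT), then examine how expression scales with the total number of motifs." | |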
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 8, | |
| "metadata": {}, | |
| "outputs": [ | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "TSS Usage Correlation with Motif Presence\n", | |
| "============================================================\n", | |
| "\n", | |
| "Analyzing off-to-on genes (similar upstream regulation)\n", | |
| "\n", | |
| " motif n_with mean_with n_without mean_without fold_change t_statistic p_value cohens_d\n", | |
| " Inr 152 840.552632 48 307.062500 2.737399 16.291888 2.043162e-38 2.707954\n", | |
| " ACA 143 837.076923 57 400.017544 2.092601 12.208038 6.404917e-26 1.920671\n", | |
| "CNAAATT 139 841.417266 61 418.786885 2.009178 11.918303 4.867069e-25 1.839279\n", | |
| "\n", | |
| "============================================================\n", | |
| "Interpretation:\n", | |
| "============================================================\n", | |
| "\n", | |
| "Inr:\n", | |
| " Mean expression with motif: 840.6\n", | |
| " Mean expression without motif: 307.1\n", | |
| " Fold change: 2.74×\n", | |
| " Effect size (Cohen's d): 2.71\n", | |
| " Significance: P < 0.001 ***\n", | |
| "\n", | |
| "ACA:\n", | |
| " Mean expression with motif: 837.1\n", | |
| " Mean expression without motif: 400.0\n", | |
| " Fold change: 2.09×\n", | |
| " Effect size (Cohen's d): 1.92\n", | |
| " Significance: P < 0.001 ***\n", | |
| "\n", | |
| "CNAAATT:\n", | |
| " Mean expression with motif: 841.4\n", | |
| " Mean expression without motif: 418.8\n", | |
| " Fold change: 2.01×\n", | |
| " Effect size (Cohen's d): 1.84\n", | |
| " Significance: P < 0.001 ***\n", | |
| "\n", | |
| "\n", | |
| "============================================================\n", | |
| "Additive Effects of Multiple Motifs\n", | |
| "============================================================\n", | |
| "\n", | |
| "0 motifs: n=2, mean expression=92.0\n", | |
| "\n", | |
| "1 motifs: n=9, mean expression=163.7\n", | |
| "\n", | |
| "2 motifs: n=23, mean expression=285.6\n", | |
| "\n", | |
| "3 motifs: n=37, mean expression=447.9\n", | |
| "\n", | |
| "4 motifs: n=30, mean expression=682.8\n", | |
| "\n", | |
| "5 motifs: n=99, mean expression=982.1\n" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "# Focus on off-to-on genes for TSS usage analysis\n", | |
| "df_off_to_on = df_promoters[df_promoters['type'] == 'off-to-on'].copy()\n", | |
| "\n", | |
| "print(\"TSS Usage Correlation with Motif Presence\")\n", | |
| "print(\"=\"*60)\n", | |
| "print(\"\\nAnalyzing off-to-on genes (similar upstream regulation)\\n\")\n", | |
| "\n", | |
| "# Compare expression levels by motif presence\n", | |
| "results = []\n", | |
| "\n", | |
| "for motif in ['has_Inr', 'has_ACA', 'has_CNAAATT']:\n", | |
| "    with_motif = df_off_to_on[df_off_to_on[motif]]['cage_signal']\n", | |
| "    without_motif = df_off_to_on[~df_off_to_on[motif]]['cage_signal']\n", | |
| "\n", | |
| "    if len(with_motif) > 0 and len(without_motif) > 0:\n", | |
| "        # Two-sample t-test (assumes equal variances by default)\n", | |
| "        t_stat, p_value = stats.ttest_ind(with_motif, without_motif)\n", | |
| "\n", | |
| "        # Effect size (Cohen's d) from the pooled standard deviation.\n", | |
| "        # Note: np.std uses ddof=0 by default; pass ddof=1 for sample variances.\n", | |
| "        pooled_std = np.sqrt(((len(with_motif)-1)*np.std(with_motif)**2 +\n", | |
| "                              (len(without_motif)-1)*np.std(without_motif)**2) /\n", | |
| "                             (len(with_motif) + len(without_motif) - 2))\n", | |
| "        cohens_d = (np.mean(with_motif) - np.mean(without_motif)) / pooled_std\n", | |
| "\n", | |
| "        results.append({\n", | |
| "            'motif': motif.replace('has_', ''),\n", | |
| "            'n_with': len(with_motif),\n", | |
| "            'mean_with': np.mean(with_motif),\n", | |
| "            'n_without': len(without_motif),\n", | |
| "            'mean_without': np.mean(without_motif),\n", | |
| "            'fold_change': np.mean(with_motif) / np.mean(without_motif),\n", | |
| "            't_statistic': t_stat,\n", | |
| "            'p_value': p_value,\n", | |
| "            'cohens_d': cohens_d\n", | |
| "        })\n", | |
| "\n", | |
| "df_tss_results = pd.DataFrame(results)\n", | |
| "print(df_tss_results.to_string(index=False))\n", | |
| "\n", | |
| "print(\"\\n\" + \"=\"*60)\n", | |
| "print(\"Interpretation:\")\n", | |
| "print(\"=\"*60)\n", | |
| "for _, row in df_tss_results.iterrows():\n", | |
| "    print(f\"\\n{row['motif']}:\")\n", | |
| "    print(f\"  Mean expression with motif: {row['mean_with']:.1f}\")\n", | |
| "    print(f\"  Mean expression without motif: {row['mean_without']:.1f}\")\n", | |
| "    print(f\"  Fold change: {row['fold_change']:.2f}×\")\n", | |
| "    print(f\"  Effect size (Cohen's d): {row['cohens_d']:.2f}\")\n", | |
| "    if row['p_value'] < 0.001:\n", | |
| "        print(f\"  Significance: P < 0.001 ***\")\n", | |
| "    elif row['p_value'] < 0.01:\n", | |
| "        print(f\"  Significance: P < 0.01 **\")\n", | |
| "    elif row['p_value'] < 0.05:\n", | |
| "        print(f\"  Significance: P < 0.05 *\")\n", | |
| "    else:\n", | |
| "        print(f\"  Significance: P = {row['p_value']:.3f} (not significant)\")\n", | |
| "\n", | |
| "# Analyze additive effects\n", | |
| "print(\"\\n\\n\" + \"=\"*60)\n", | |
| "print(\"Additive Effects of Multiple Motifs\")\n", | |
| "print(\"=\"*60)\n", | |
| "\n", | |
| "for n in range(6):\n", | |
| "    subset = df_off_to_on[df_off_to_on['n_motifs'] == n]\n", | |
| "    if len(subset) > 0:\n", | |
| "        print(f\"\\n{n} motifs: n={len(subset)}, mean expression={np.mean(subset['cage_signal']):.1f}\")" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "## 2.7 Logistic Regression for Promoter Classification\n", | |
| "\n", | |
| "Build a logistic regression model to predict \"narrow-high\" promoters based on motif composition." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 9, | |
| "metadata": {}, | |
| "outputs": [ | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "Building Logistic Regression Model\n", | |
| "============================================================\n", | |
| "\n", | |
| "Predicting 'narrow-high' promoters from motif composition\n", | |
| "\n", | |
| "Training set: 160 promoters\n", | |
| "Test set: 40 promoters\n", | |
| "Proportion narrow-high in training: 65.62%\n", | |
| "Proportion narrow-high in test: 65.00%\n", | |
| "\n", | |
| "============================================================\n", | |
| "Model Coefficients (log-odds)\n", | |
| "============================================================\n", | |
| " motif coefficient odds_ratio\n", | |
| " ACA 2.333489 10.313867\n", | |
| "AchiVis 2.272681 9.705384\n", | |
| "CNAAATT 2.178116 8.829657\n", | |
| " Inr 2.005142 7.427150\n", | |
| " tMAC 1.787855 5.976618\n", | |
| "\n", | |
| "Intercept: -7.149\n", | |
| "\n", | |
| "============================================================\n", | |
| "Model Performance on Test Set\n", | |
| "============================================================\n", | |
| "\n", | |
| "Accuracy: 0.975\n", | |
| "Precision: 1.000\n", | |
| "Recall: 0.962\n", | |
| "F1 Score: 0.980\n", | |
| "ROC AUC: 0.974\n", | |
| "\n", | |
| "Confusion Matrix:\n", | |
| "[[14 0]\n", | |
| " [ 1 25]]\n", | |
| "\n", | |
| "(Rows: actual, Columns: predicted)\n", | |
| "[TN FP]\n", | |
| "[FN TP]\n", | |
| "\n", | |
| "============================================================\n", | |
| "Probability of Narrow-High by Motif Count\n", | |
| "============================================================\n", | |
| " n_motifs n_promoters mean_probability std_probability\n", | |
| " 0 1 0.000785 0.000000\n", | |
| " 1 3 0.006173 0.001399\n", | |
| " 2 2 0.043600 0.000000\n", | |
| " 3 9 0.292834 0.037394\n", | |
| " 4 5 0.759455 0.010281\n", | |
| " 5 20 0.968587 0.000000\n", | |
| "\n", | |
| "*** Promoters with all 5 motifs: 96.9% ± 0.0% probability of being narrow-high ***\n", | |
| "(Paper reported: 92% ± 5.5%)\n" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "from sklearn.model_selection import train_test_split\n", | |
| "from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score\n", | |
| "\n", | |
| "print(\"Building Logistic Regression Model\")\n", | |
| "print(\"=\"*60)\n", | |
| "print(\"\\nPredicting 'narrow-high' promoters from motif composition\\n\")\n", | |
| "\n", | |
| "# Prepare data\n", | |
| "df_model = df_off_to_on.copy()\n", | |
| "\n", | |
| "# Binary outcome: is promoter narrow-high?\n", | |
| "df_model['is_narrow_high'] = (df_model['promoter_class'] == 'narrow-high').astype(int)\n", | |
| "\n", | |
| "# Features: motif presence\n", | |
| "feature_cols = ['has_tMAC', 'has_AchiVis', 'has_Inr', 'has_ACA', 'has_CNAAATT']\n", | |
| "X = df_model[feature_cols].astype(int)\n", | |
| "y = df_model['is_narrow_high']\n", | |
| "\n", | |
| "# Split into training and test sets (80/20), preserving the class balance\n", | |
| "X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,\n", | |
| "                                                    random_state=42, stratify=y)\n", | |
| "\n", | |
| "print(f\"Training set: {len(X_train)} promoters\")\n", | |
| "print(f\"Test set: {len(X_test)} promoters\")\n", | |
| "print(f\"Proportion narrow-high in training: {y_train.mean():.2%}\")\n", | |
| "print(f\"Proportion narrow-high in test: {y_test.mean():.2%}\")\n", | |
| "\n", | |
| "# Fit logistic regression\n", | |
| "model = LogisticRegression(random_state=42, max_iter=1000)\n", | |
| "model.fit(X_train, y_train)\n", | |
| "\n", | |
| "print(\"\\n\" + \"=\"*60)\n", | |
| "print(\"Model Coefficients (log-odds)\")\n", | |
| "print(\"=\"*60)\n", | |
| "\n", | |
| "# Exponentiating a log-odds coefficient gives the odds ratio for that motif\n", | |
| "coef_df = pd.DataFrame({\n", | |
| "    'motif': [col.replace('has_', '') for col in feature_cols],\n", | |
| "    'coefficient': model.coef_[0],\n", | |
| "    'odds_ratio': np.exp(model.coef_[0])\n", | |
| "})\n", | |
| "coef_df = coef_df.sort_values('coefficient', ascending=False)\n", | |
| "print(coef_df.to_string(index=False))\n", | |
| "\n", | |
| "print(f\"\\nIntercept: {model.intercept_[0]:.3f}\")\n", | |
| "\n", | |
| "# Evaluate on test set\n", | |
| "y_pred = model.predict(X_test)\n", | |
| "y_pred_proba = model.predict_proba(X_test)[:, 1]\n", | |
| "\n", | |
| "print(\"\\n\" + \"=\"*60)\n", | |
| "print(\"Model Performance on Test Set\")\n", | |
| "print(\"=\"*60)\n", | |
| "\n", | |
| "print(f\"\\nAccuracy: {accuracy_score(y_test, y_pred):.3f}\")\n", | |
| "print(f\"Precision: {precision_score(y_test, y_pred):.3f}\")\n", | |
| "print(f\"Recall: {recall_score(y_test, y_pred):.3f}\")\n", | |
| "print(f\"F1 Score: {f1_score(y_test, y_pred):.3f}\")\n", | |
| "print(f\"ROC AUC: {roc_auc_score(y_test, y_pred_proba):.3f}\")\n", | |
| "\n", | |
| "print(\"\\nConfusion Matrix:\")\n", | |
| "cm = confusion_matrix(y_test, y_pred)\n", | |
| "print(cm)\n", | |
| "print(\"\\n(Rows: actual, Columns: predicted)\")\n", | |
| "print(\"[TN FP]\")\n", | |
| "print(\"[FN TP]\")\n", | |
| "\n", | |
| "# Analyze probability by number of motifs\n", | |
| "print(\"\\n\" + \"=\"*60)\n", | |
| "print(\"Probability of Narrow-High by Motif Count\")\n", | |
| "print(\"=\"*60)\n", | |
| "\n", | |
| "prob_by_motifs = []\n", | |
| "\n", | |
| "for n in range(6):\n", | |
| "    subset_test = X_test[X_test.sum(axis=1) == n]\n", | |
| "    if len(subset_test) > 0:\n", | |
| "        probs = model.predict_proba(subset_test)[:, 1]\n", | |
| "        prob_by_motifs.append({\n", | |
| "            'n_motifs': n,\n", | |
| "            'n_promoters': len(subset_test),\n", | |
| "            'mean_probability': np.mean(probs),\n", | |
| "            'std_probability': np.std(probs)\n", | |
| "        })\n", | |
| "\n", | |
| "df_prob = pd.DataFrame(prob_by_motifs)\n", | |
| "print(df_prob.to_string(index=False))\n", | |
| "\n", | |
| "# Reproduce key finding: promoters with all 5 motifs\n", | |
| "# (these all share one feature vector, so their predicted probabilities are identical and the std is 0)\n", | |
| "all_motifs = X_test[X_test.sum(axis=1) == 5]\n", | |
| "if len(all_motifs) > 0:\n", | |
| "    all_motifs_probs = model.predict_proba(all_motifs)[:, 1]\n", | |
| "    mean_prob = np.mean(all_motifs_probs)\n", | |
| "    std_prob = np.std(all_motifs_probs)\n", | |
| "    print(f\"\\n*** Promoters with all 5 motifs: {mean_prob:.1%} ± {std_prob:.1%} probability of being narrow-high ***\")\n", | |
| "    print(f\"(Paper reported: 92% ± 5.5%)\")" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "## 2.8 Visualization\n", | |
| "\n", | |
| "Create comprehensive visualizations of the results." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 10, | |
| "metadata": {}, | |
| "outputs": [ | |
| { | |
| "ename": "ValueError", | |
| "evalue": "The number of FixedLocator locations (5), usually from a call to set_ticks, does not match the number of labels (4).", | |
| "output_type": "error", | |
| "traceback": [ | |
| "\u001b[31m---------------------------------------------------------------------------\u001b[39m", | |
| "\u001b[31mValueError\u001b[39m Traceback (most recent call last)", | |
| "\u001b[36mCell\u001b[39m\u001b[36m \u001b[39m\u001b[32mIn[10]\u001b[39m\u001b[32m, line 15\u001b[39m\n\u001b[32m 13\u001b[39m ax1.set_xlabel(\u001b[33m'\u001b[39m\u001b[33mMotif\u001b[39m\u001b[33m'\u001b[39m)\n\u001b[32m 14\u001b[39m ax1.legend(title=\u001b[33m'\u001b[39m\u001b[33mPromoter type\u001b[39m\u001b[33m'\u001b[39m, bbox_to_anchor=(\u001b[32m1.05\u001b[39m, \u001b[32m1\u001b[39m), loc=\u001b[33m'\u001b[39m\u001b[33mupper left\u001b[39m\u001b[33m'\u001b[39m)\n\u001b[32m---> \u001b[39m\u001b[32m15\u001b[39m \u001b[43max1\u001b[49m\u001b[43m.\u001b[49m\u001b[43mset_xticklabels\u001b[49m\u001b[43m(\u001b[49m\u001b[43m[\u001b[49m\u001b[43mm\u001b[49m\u001b[43m.\u001b[49m\u001b[43mreplace\u001b[49m\u001b[43m(\u001b[49m\u001b[33;43m'\u001b[39;49m\u001b[33;43mhas_\u001b[39;49m\u001b[33;43m'\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[33;43m'\u001b[39;49m\u001b[33;43m'\u001b[39;49m\u001b[43m)\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;28;43;01mfor\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[43mm\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;129;43;01min\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[43mmotif_counts_norm\u001b[49m\u001b[43m.\u001b[49m\u001b[43mindex\u001b[49m\u001b[43m]\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mrotation\u001b[49m\u001b[43m=\u001b[49m\u001b[32;43m45\u001b[39;49m\u001b[43m)\u001b[49m\n\u001b[32m 16\u001b[39m plt.setp(ax1.xaxis.get_majorticklabels(), rotation=\u001b[32m45\u001b[39m, ha=\u001b[33m'\u001b[39m\u001b[33mright\u001b[39m\u001b[33m'\u001b[39m)\n\u001b[32m 18\u001b[39m \u001b[38;5;66;03m# 2. CAGE signal distribution by promoter type\u001b[39;00m\n", | |
| "\u001b[36mFile \u001b[39m\u001b[32m/app/.venv/lib/python3.13/site-packages/matplotlib/axes/_base.py:74\u001b[39m, in \u001b[36m_axis_method_wrapper.__set_name__.<locals>.wrapper\u001b[39m\u001b[34m(self, *args, **kwargs)\u001b[39m\n\u001b[32m 73\u001b[39m \u001b[38;5;28;01mdef\u001b[39;00m\u001b[38;5;250m \u001b[39m\u001b[34mwrapper\u001b[39m(\u001b[38;5;28mself\u001b[39m, *args, **kwargs):\n\u001b[32m---> \u001b[39m\u001b[32m74\u001b[39m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mget_method\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m)\u001b[49m\u001b[43m(\u001b[49m\u001b[43m*\u001b[49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43m*\u001b[49m\u001b[43m*\u001b[49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n", | |
| "\u001b[36mFile \u001b[39m\u001b[32m/app/.venv/lib/python3.13/site-packages/matplotlib/axis.py:2106\u001b[39m, in \u001b[36mAxis.set_ticklabels\u001b[39m\u001b[34m(self, labels, minor, fontdict, **kwargs)\u001b[39m\n\u001b[32m 2102\u001b[39m \u001b[38;5;28;01melif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(locator, mticker.FixedLocator):\n\u001b[32m 2103\u001b[39m \u001b[38;5;66;03m# Passing [] as a list of labels is often used as a way to\u001b[39;00m\n\u001b[32m 2104\u001b[39m \u001b[38;5;66;03m# remove all tick labels, so only error for > 0 labels\u001b[39;00m\n\u001b[32m 2105\u001b[39m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(locator.locs) != \u001b[38;5;28mlen\u001b[39m(labels) \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;28mlen\u001b[39m(labels) != \u001b[32m0\u001b[39m:\n\u001b[32m-> \u001b[39m\u001b[32m2106\u001b[39m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mValueError\u001b[39;00m(\n\u001b[32m 2107\u001b[39m \u001b[33m\"\u001b[39m\u001b[33mThe number of FixedLocator locations\u001b[39m\u001b[33m\"\u001b[39m\n\u001b[32m 2108\u001b[39m \u001b[33mf\u001b[39m\u001b[33m\"\u001b[39m\u001b[33m (\u001b[39m\u001b[38;5;132;01m{\u001b[39;00m\u001b[38;5;28mlen\u001b[39m(locator.locs)\u001b[38;5;132;01m}\u001b[39;00m\u001b[33m), usually from a call to\u001b[39m\u001b[33m\"\u001b[39m\n\u001b[32m 2109\u001b[39m \u001b[33m\"\u001b[39m\u001b[33m set_ticks, does not match\u001b[39m\u001b[33m\"\u001b[39m\n\u001b[32m 2110\u001b[39m \u001b[33mf\u001b[39m\u001b[33m\"\u001b[39m\u001b[33m the number of labels (\u001b[39m\u001b[38;5;132;01m{\u001b[39;00m\u001b[38;5;28mlen\u001b[39m(labels)\u001b[38;5;132;01m}\u001b[39;00m\u001b[33m).\u001b[39m\u001b[33m\"\u001b[39m)\n\u001b[32m 2111\u001b[39m tickd = {loc: lab \u001b[38;5;28;01mfor\u001b[39;00m loc, lab \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28mzip\u001b[39m(locator.locs, labels)}\n\u001b[32m 2112\u001b[39m func = 
functools.partial(\u001b[38;5;28mself\u001b[39m._format_with_dict, tickd)\n", | |
| "\u001b[31mValueError\u001b[39m: The number of FixedLocator locations (5), usually from a call to set_ticks, does not match the number of labels (4)." | |
| ] | |
| }, | |
| { | |
| "data": { | |
| "image/png": "iVBORw0KGgoAAAANSUhEUgAAAmcAAAGzCAYAAABq9bC1AAAAOnRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjEwLjgsIGh0dHBzOi8vbWF0cGxvdGxpYi5vcmcvwVt1zgAAAAlwSFlzAAAPYQAAD2EBqD+naQAAbd9JREFUeJzt3XlYjXn/B/D3KZU2pWTLkmWcoqIkg4gYZJSl8oxn1NgZkxnbkGUooizZwkz2QTOhhSwZhrFElDK2Ec9YhuixlLWi1Pn90a/76ah0Tk6dk/N+XVfXdc69fu7vafn0XUUSiUQCIiIiIlIJGsoOgIiIiIj+h8kZERERkQphckZERESkQpicEREREakQJmdEREREKoTJGREREZEKYXJGREREpEKYnBERERGpECZnRERERCqEyVk15OfnB7FYDLFYjHPnzknti4+Ph5eXF+zs7IRjXrx4oaRI3+99z1GetLQ04Vxvb+9KilD1sRyIiD4+NZQdQHUQGhqKNWvWCO+7dOmCzZs3Sx1z5coVeHh4SG27dOkSdHR05L5fWloaYmJiAABWVlbo1auXzOdNmDABb968kfleLi4uuH//fpn7fXx8MHv2bJmvRxV37tw5JCYmAgB69eoFKysrJUdUund/HooYGBjgk08+gYeHBzw9PSESiZQQXeVQ9mfj7e0t3L88QUFBGDx4cCVHRESViclZBSQkJOD+/fswNzcXtu3atUth179//77wx2/QoEElkrPx48fD09MTACAWi6XiKkrMevXqha+++goaGhrQ19dXWGyKVNZzqKvExEThczc3N1fZ5Kwsr169woULF3DhwgWkpKQgKChI2SEpTHX/bIioemFyVgEFBQWIjIzEd999BwDIzs7G/v37q+z+FhYWsLCwKLH94cOHwmsXFxc4OjrKdd05c+aU+KNTr169CsX4PtnZ2dDT0yvzOaj66NatG8aNG4fc3FwcPHgQu3fvBgBER0fj3//+N2xsbMo8t6CgAHl5eRWqXf4Yva885syZg5cvXwrvAwMDce3aNQCF/+R07dpV2NesWbPKD5aIKhX7nMmpqBYqOjoaBQUFAICDBw8iKyur3BqqQ4cOwdvbGw4ODrC2tkbPnj0xf/58PHr0SDjG29sbPj4+wvuYmBihT5Gfnx+A0vtqicVihIaGCufNmjULYrEYLi4uMj9bq1at4ODgIPXVuHFjYX/x+8bHx2PVqlXo1q0bbGxs8MUXXyA1NVXqei4uLsLxDx48wMSJE9G+fXv079+/zOco8ueff+Lbb7+Fk5MTrK2t0aVLF4wZM0b4g/Su1NRUfPXVV2jbti26dOmCFStWCJ8PUPh5Fd0rNDQU4eHhcHFxQbt27TBmzBikp6fjzZs3CAwMRMeOHWFnZ4dJkybh2bNnJe51/vx5jB8/Hp9++imsra3h4uKCoKAgPH/+XOo4ecpLLBZLNRXOnDlTODc6OrqcT+5/rl27Bm9vb7Rt2xZOTk5YuXIl3r59C6CwZrXomjNmzChRfkX7xo8fL/P9TE1N4eDggM6dO2PBggVo1KiRsC85ORlAYTNo0bUjIyOxbt069OjRA23atMGff/4JAJBIJNi5cyeGDBkCOzs72NjYoG/fvli+fLlUUgIU/owUXe/KlSuYNm0a7Ozs0KVLF4SGhkIikSA1NRXe3t6wtbVF9+7dsW3bthKx5+bmYv369RgwYADatWuHtm3bwt3dHevXr0dubq5wnKyfTWpqKqZMmSJ8z3bt2hWzZ8/Gf//7X6n7ylIe7xKLxVI/l4aGhsK+pk2bwsHBAWZmZvjyyy/RuXNnDBs2TOr8Z8+eoXXr1hCLxXBzcwNQ2FRb/HfLqVOnMHjwYNjY2MDFxQVbt24tEUdeXh62bNmCwYMHo127dmjXrh28vLywd+/eUuMmoophzZmc+vTpg3379uG///0vTp06BWdnZ6FJs3///ti5c2ep5y1duhQbN26U2paWlobw8HAc
PnwYv/76q1QipOr8/f1x79494f2FCxcwYcIEHD58GDVqlPy28vHxEY43MjJ677WjoqLwww8/ID8/X9j25MkTnDx5Eq6uriVq9+7evYuhQ4ciOzsbAPD69Wv89NNPaNSoEby8vEpcPzY2Fnfv3hXenzx5EuPGjUPjxo3x+++/C9vj4uJQo0YNLFu2TNi2e/duzJ07Vyrxu3//PrZu3YoTJ05g586dpT6fvOVVEffv38ewYcPw6tUrAIXl8OOPPyIzMxPz58/Hp59+ikaNGiEtLQ1HjhxBQEAAatasCQA4evSocJ2i5FleIpEIBgYGwvviCU6Rn376SaocgMLEbOrUqThw4IDU9tu3byMsLAxHjhxBREREqeU6efJk4bPMzs7GmjVr8Pz5c+zdu1cYCJOeno6FCxeiZcuW6Ny5sxDbyJEjkZSUJHW969ev4/r16zh58iQ2b94MbW1tmZ79xIkT8PX1lXrmR48eITIyEidOnCjz57u08qiopk2bwtHREYmJiTh//jwePHiAhg0bAgCOHz8u/DyV9vkmJycjNjZWOOb+/fsICgpCbm4uxo4dC6AwMRszZgwSEhKkzr106RKmT5+OGzdu4Pvvv1fIsxCpO9acycnU1BTdu3cHUPiH+vr167h48SIACP2n3nXx4kUhMdPR0cGMGTPw448/omPHjgCAx48fIyAgAEBh88WcOXOEc7t164bw8HCEh4e/t0YjPDxcqhPw+PHjER4ejlWrVsn8bD4+PsJ/0uWNovzvf/+LadOmYc2aNWjQoAGAwl/o8fHxpR6fkZGBmTNnYvPmzRg3blyZMTx8+BD+/v7CH4levXph7dq1WL16NYYMGQItLa1SY7G0tMS6deukRixGRESUeo+7d+9i9OjRWLdundBse/36dRw/fhwzZsxASEiIkLQcPHhQqLl5+PAh5s+fj4KCAujr6+OHH37Apk2bhHK/ffs2li9fXqHyKuvzCw8Ph7Ozc5nlVdz9+/fRrl07/PTTT/juu++gqakJANi5cydSU1MhEomEe2RlZeHYsWPCuUWvdXV15aptLZKbm4s9e/bg+vXrwrbS+hHeu3cPbm5uWL9+PRYvXox69eohLi5OSMyMjIywYMECrF27Vjj/1q1bZZZrVlYWli9fjilTpgjbtm/fjjp16mDt2rUYOnSosL3498PWrVuFxKxBgwYICQnB8uXLhWQmKSlJqDkq77PJycmBn58fcnNzUaNGDUyePBmbN2/G6NGjAUj/fMtSHh+iaFCSRCKR6mpR/LPu169fifPu3r0LV1dXrF+/HsOHDxe2h4aGIjMzEwCwbds2ITFr166d8HNZ1Iy6ceNG4XchEX0Y1pxVgJeXF44cOYLjx48LtR5isRi2tralHr9v3z7h9ZdffomRI0cCKPwF5+zsjNzcXMTHx+PZs2cQi8VSTWlFzUblcXBwkPqPtqipo7IMHToUY8aMAVCYlISEhAAA/vnnn1KPnzlzJoYMGVLudePi4oTaBzs7O6xdu1bY16dPn1LP0dLSQmhoKOrUqYMePXogMjISOTk5UrVjxdnZ2Qn/4Z8+fRrh4eEAgM8//1z4bPbt2yfUNty/fx+WlpZSsfXp0weWlpYACv8gxsXFIScnBwcOHMC8efOgoSH9f0955aWIz09XVxcrV66EoaEhevTogVu3bgnfe0ePHoWlpSUGDx6MNWvWoKCgALGxsejXrx8ePXqEq1evAihsitbT05P5njExMcLI4uKsra3h5ORUYru9vb1UTSQALF68WHj97bffCt8nTZo0EZrg4uLi4O/vX2IE6KRJk/D5558DKKyFKqo9nTt3Ljp16gR7e3v8+uuvACD1/VA8cZk3bx569OgBANDT0xP+CTpw4ADGjh1b7mfz+++/CwlM586dhX09evRAXFyckIRnZmbCxMSk3PL4EH379kVgYCBevnyJffv2YezYscLvF6Dwd05pNXgNGzbEkiVLoKmpCWdnZ1y6dAkpKSnIzc3FyZMn
MXDgQMTGxgrHDx8+HMbGxgAANzc3rF69GkBhrXTbtm0V9jxE6orJWQV07doVDRo0QHp6OuLi4gDgvYnHnTt3hNfFEzgTExM0btwYN2/ehEQiwd27d4VfeMpQ2oCAskZRFh9sULt2beH1u/2DihT98StP8bIqqqEsT/PmzVGnTh0AgIaGBmrVqoWcnJwy53cr/hkUbyqztrYWXhd/pqLrFI8tOjq61L5gL1++xKNHj1C/fn2p7fKWV0U0b95cqi+Sra2tkJylpaUBKKwl6tKlC06dOoX4+Hg8ffoUf/zxByQSCQAIiU5FaWlpwdXVFbNmzRJq7oor7fugrJ+PVq1aQVdXFzk5OXj+/DkyMzNhamoqde67n2VRclY0EKF4MlT8+6H4PYsnE8WvV/yY97l9+7bw+uTJkzh58mSJYyQSCW7dulUiOZP150JWNWvWxOeff46IiAjcuHED169fx+PHj5GVlQWg7M/X2tpa6vOytbVFSkoKgP997xQvj0mTJpV6nZs3byrgKYiIyVkFaGhoYPDgwUKtjo6ODtzd3St0LVWaC6poQIAsatWqJbwu/ku96I/8u4qSp8rwbl+k8vpwFU9gitdwFe8vVVxZz1SWogShOHnLSxHK+t7y9PTEqVOnkJeXh7i4OJw4cQJAYTkWH/Uni6LRmiKRCPr6+rCwsBCahEvzbnL1oYp/ZrJ8luWpzJ/HnJycEtsUXR5A4edb1IQbGxsr3FdTU7PUJs3SVLQcSntGIpIfk7MK8vDwwI8//oiCggL07t1b6o/vuywsLHDq1CkAhZ1nXV1dAQBPnz4VmlpEIhGaNGkCQPqPTPGO59WZrL/si0+tceLECblGDla24rH5+vpi4sSJJY7JycmBrq5uha5fvIwq8rnfvn0br169EhKT4v1/io+idHFxQe3atfH06VNERkbi77//BgD07t1b5g7wRWRtdi9S2veBhYUFbt26BQC4fPmyUHt148YN4Y+9kZFRiVqnD2FhYSH0j7t06ZJQS1u8zIp/3u/7bIpPXTFo0CAEBweXuF9Z3xeVkQza2NhALBbj+vXrOHDggHCPjh07lvlP0tWrV1FQUCD87inte8fCwkIYYfz777+X2jzK5IxIMZicVZC5uTnmzp2LJ0+elNkXqkj//v2xfft2AIWdi+vVq4emTZvi559/FvowOTk5CU2axRO95ORknDhxAvr6+mjWrFml/Kdd5MaNGyWaogwNDat0glhXV1eEhIQgNzcXKSkpmDhxIgYMGACJRILTp0/D3t6+wrWUH6pv375CbOvXr4dIJEK7du3w+vVrpKWl4ezZs3jz5g22bNlSoesXrwE8fPgwGjVqhBo1asDW1lampCk7OxuTJk3CsGHDkJqaioMHDwr7evbsKbzW1tbGgAEDsHXrVqGvGfDhTZoV5ebmJnRYX716NbS1tVG7dm2p6StcXV0Vmsj0799fSM7mz5+PrKwsiEQiqf5fxcvjfZ9N586dYWJigszMTOzZswdGRkbo3LkzCgoKcP/+faSkpJT4PCqbp6cnFi5ciPT09FKf513379/HjBkz0L9/f5w9e1Zo0tTW1ka3bt0AFH5ORcnZ+PHjMXr0aNSvXx+PHj3CrVu3cOzYMYwYMYKrExApAJOzD1B8JNj7tGvXDqNHj8bGjRvx5s2bEjOnm5mZYd68ecL7Fi1awMzMDI8fP0ZaWpowlL2yl2UJDAwssc3R0VFILKtCvXr1MHfuXGG6isOHD+Pw4cPC/rIGXVSF+vXrC7Hl5uZKzStXRN6Jf989VyQSQSKR4MSJE0Jz49GjR6VqvspSr149JCUlCbW0Rby8vITBC0U8PT2l5rEyMzMTRg9XNVdXVxw5cgQHDx7Es2fPpEYrA4V96YqPxlSE4cOH48SJEzh//jzu379f4vodOnSQGrVY3mcTHBwsTKWxdevWEnOEFV9NpCq4u7tj6dKlwj9/Wlpa6N27d5nHt2jRAnFxcVKd/gFgwoQJQo2lj48P4uPj
kZCQgL///luYd5GIFI9TaVSR77//HitXroSjoyMMDAygpaUFc3NzfPnll4iOjpZqIqhRowbWrVuH9u3bq+zSS5XJy8sL4eHh6N27N+rUqYMaNWrA1NQU3bp1U/qyOV5eXtixY4dUbHXq1IGtrS0mTJgglWTLSywWY/HixWjRooXczYsAhNpYe3t76OjowMzMDOPHj4e/v3+JYz/55BOpjvCurq4lRphWFZFIhJCQEAQEBMDW1hZ6enrQ1taGhYUFxo4di127dpU7N568tLW1sWXLFkydOhVisRg1a9aEjo4OWrVqhalTp5aY46y8z8bZ2RlRUVEYMGAA6tevDy0tLdSuXRtWVlYYMWIEVq5cqdD4y2NsbCy17Fu3bt3e2/XC1tYWGzZsgI2NDbS1tWFubg4/Pz98/fXXwjHa2trYuHEj5syZA1tbW+jr60NHRweNGjVC9+7dsXDhQnz22WeV+lxE6kIkqcweyUSkstasWSPU/u3evVuptZKkeHv27BFWglixYkWJwQDnzp0TViMpq68cESkHmzWJ1ExWVhaePHki9IFq3rw5E7OPSE5ODp49e4aoqCgAhX1YKzKxMBEpD5MzIjVjb28v9X7ChAlKioQqw+eff4779+8L70eNGvXe6U2ISPUwOSNSQyKRCA0bNsTw4cOFWfjp42JmZgYPDw9hQBERVR/sc0ZERESkQjhak4iIiEiFMDkjIiIiUiFMzoiIiIhUyEc/IODx45fKDqEEExN9ZGZmKTsMlcYyKh/LqHwso/KpWhmZmRkqOwQipWPNWRUTiQBNTQ1UwnrHHw2WUflYRuVjGZWPZUSkmpicEREREakQJmdEREREKoTJGREREZEKYXJGREREpEKYnBERERGpECZnRERERCqEyRkRERGRCmFyRkRERKRCmJwRERERqRAmZ0REREQqhMkZERERkQr56Bc+VyTDoB/LPeblzK+rIBIiIiL6WLHmjIiIqIr4+flBLBZDLBbD2toan332GdasWYO3b98qO7RS+fn5YcKECZV2/dDQUAwYMKDSrl9dMTkjIiKqQl27dkV8fDx+++03jBgxAmvWrMGmTZtKPTY3N7eKo6scH8tzVBU2axIREVUhbW1tmJmZAQD+/e9/4/fff8exY8cwbtw4+Pn54cWLF7CxsUF4eDi0tbVx7NgxXL9+HQsXLsSff/4JXV1d9O7dG35+ftDX1wcA4TxbW1ts27YNubm5GD58OMaPH4+QkBBERUWhZs2a+O677+Dh4SHE8r7rhoaGIiYmBgAgFosBANu2bUPHjh2Rnp6O4OBgnD59GhoaGmjfvj1mz56NRo0aScXz7nMUFx0djTVr1khdPygoCElJScjMzERYWJhwbF5eHrp164YpU6bAy8sL3t7e+OSTTwAAe/fuRY0aNTB06FB89913EIlEAAoTwhUrVmD//v14+fIlPvnkE0ybNg0dO3ZU7AdaCZicERERKZGOjg6ePXsmvE9ISICBgQG2bNkCAMjOzsaoUaNgZ2eHyMhIZGRkYM6cOViwYAGCg4OF886ePYv69etjx44dSElJwezZs3HhwgV06NABu3btwsGDBzFv3jx06dIF9evXL/e6I0eOxM2bN/Hq1SsEBQUBAIyMjJCXl4dRo0ahXbt2CA8PR40aNbBu3TqMHj0asbGx0NbWLvU53tWvXz/85z//walTp4RjDA0NYWFhgWHDhuHRo0eoW7cuAOD48eN4/fo1+vXrJ5wfExMDT09P7N69G1euXMHcuXPRsGFDDBkyBAAwf/58/P3331ixYgXq1q2LI0eOYPTo0di3bx8sLCwU8+FVEjZrEhERKYFEIsGZM2cQHx8vVZujp6eHwMBAfPLJJ/jkk0+wf/9+5ObmYvHixWjVqhU6deqEuXPnYu/evXjy5IlwnrGxMebMmYPmzZvD09MTzZo1w+vXrzF+/HhYWFhg3Lhx0NLSQnJyMgCUe119fX3UrFlTqOkzMzODtrY2Dh48iIKCAixc
uBBisRgtWrRAUFAQ0tPTkZiYWOZzvKtmzZrQ09ODpqamcP2aNWvC3t4ezZo1w969e4Vjo6Ki0LdvX6GmEAAaNGiAWbNmoXnz5nB3d8ewYcOwdetWAMCDBw8QHR2NVatWwcHBAU2aNMGoUaPQvn17REdHK+wzrCysOSMiIqpCx48fh52dHfLy8iCRSNC/f39MnDhR2N+qVSuh9gkAbt68CbFYDD09PWGbvb09CgoKcPv2bdSpUwcA0LJlS2ho/K/OpU6dOlJJkaamJoyNjZGRkSHXdd+VmpqKu3fvwt7eXmr7mzdvcPfu3TKfQx5eXl7YuXMnxowZgydPnuDUqVP4+eefpY5p27at0IQJAO3atcOWLVuQn5+PGzduID8/H3379pU6Jzc3F8bGxhWKqSoxOSMiIqpCHTt2hL+/P7S0tFC3bl3UqCH9p1hXV7dC1333OiKRqNRtBQUFFbp+kezsbLRp0wbLli0rsc/ExER4XdHnAIABAwZg2bJluHDhAi5cuIBGjRrBwcFBrhg1NTURFRUFTU1NqX3Fk1FVxeSMiIioCunq6qJp06YyH9+iRQvExMQgOztbSCxSUlKgoaGBZs2aVTgOWa6rpaVVIplr06YN4uLiYGpqCgMDgwrfv6zrA0Dt2rXRq1cvREdH488//8TgwYNLHHPp0iWp9xcvXkTTpk2hqakJKysr5OfnIzMzU66kTlWwzxkREZEKc3Nzg7a2Nvz8/HDjxg2cPXsWCxYswIABA8pselTUdc3NzXH9+nXcunULmZmZyMvLg5ubG2rXro2vv/4a58+fx71793Du3DkEBgbiv//9r1wxmJubIy0tDdeuXUNmZqbUlBteXl6IiYnBzZs3MXDgwBLnPnjwAEFBQbh16xb279+PHTt2wMfHBwDQrFkzuLm5Yfr06Th8+DDu3buHS5cuISwsDMePH69wmVUV1pwRERGpMF1dXWzatAkLFy6Ep6en1JQXlX3dIUOGIDExER4eHsjOzham0tixYweWLVsGX19fZGVloV69eujUqZPcNWl9+vTBkSNH4OPjgxcvXiAoKEioJevcuTPq1q2Lli1bol69eiXOHThwIF6/fg0vLy9oamrCx8cH//rXv4T9QUFB+PHHHxEcHIxHjx7B2NgY7dq1Q/fu3StWYFVIJJFIJMoOojI9fvxSYddSxPJNIhFQp44hnjx5iY+75CuOZVQ+llH5WEblU8UyMjMzVHYIpCKysrLQrVs3BAUFoXfv3lL7vL29YWlpidmzZyspusrFmjMiIiJSGQUFBXj69Ck2b96MWrVqwcXFRdkhVTkmZ0RERKQyHjx4gJ49e6J+/foIDg4uMeJUHajfExMREZHKatSoEa5fv/7eY7Zv315F0SgHkzMV1f9UbLnH7O/qXgWREBERUVXiVBpEREREKoTJGREREZEKYXJGREREpEKYnBERERGpECZnRERERCqEyRkREdFHIC0tDWKxGNeuXVN2KHJzcXHB1q1blR2GyuBUGkRERABeT1lSpferuXx6pV7/3Llz8PHxQVJSEmrVqlWp95JVdHQ0Fi1ahPPnz0ttj4yMhK6urpKiUj1MzoiIiKhMEokE+fn5lTpTv4mJSaVduzpisyYREVE1cfLkSQwdOhQODg7o2LEjxo0bh7t375Y4Li0tDT4+PgCADh06QCwWw8/PD0Dh2pVhYWFwcXGBra0t3N3dcejQIeHcc+fOQSwW48SJExg8eDBsbGyQnJwMb29vBAYGYsmSJXB0dESXLl0QGhoqdd8tW7bAzc0N7dq1g7OzM/z9/ZGVlSVcd+bMmXj58iXEYjHEYrFwfvFmzalTp2LSpElS183Ly0PHjh2xZ88emZ6hNNHR0XBwcMCpU6fg6uoKOzs7jBo1Co8ePZI6bvfu3XB1dYWNjQ369u2L8PBwYd+3336L+fPnC+8XLlwIsViMmzdvAgByc3PRrl07nDlz5r2xlIfJGRERUTWRk5ODESNGICoqClu3boVIJMI3
33yDgoICqeMaNGggJD6HDh1CfHw8Zs+eDQAICwvDnj17EBAQgAMHDmD48OH4/vvvkZiYKHWNkJAQTJ06FQcPHoRYLAYAxMTEQE9PD7t27cL333+PtWvX4vTp08I5IpEIs2fPxv79+xEcHIyzZ89i6dKlAAA7OzvMmjULBgYGiI+PR3x8PEaOHFniGd3c3PDHH38ISR0AxMfH4/Xr1+jVq5dcz/Cu169fY/PmzViyZAl27NiB9PR0LF68WNgfGxuLVatWYfLkyTh48CCmTJmC1atXIyYmBkBholv8HklJSahdu7aw7fLly3j79i3s7OzeG0d52KxJRERUTfTp00fq/aJFi9CpUyf8/fff0NPTE7ZramrCyMgIAGBqair0OcvNzUVYWBi2bNkiJBCNGzdGcnIydu7cCUdHR+Ea3377Lbp06SJ1P7FYDF9fXwCAhYUFduzYgYSEBOG44cOHC8c2atQIkyZNwrx58+Dv7w9tbW0YGhpCJBLBzMyszGd0cnKCrq4ujhw5goEDBwIA9u/fDxcXFxgYGMj1DO/Ky8tDQEAAmjRpAgD48ssvsW7dOmF/aGgo/Pz80Lt3b+G6f//9N3bu3IlBgwbB0dERCxcuRGZmJjQ1NfH3339jwoQJSExMxNChQ5GYmAhra+sP7j/H5IyIiKiauHPnDlavXo2LFy/i6dOnkEgkAID09HS0aNGi3PP/+ecf5OTklKixysvLg5WVldQ2GxubEucX1aAVMTMzQ0ZGhvD+zJkzCAsLw61bt/Dq1Svk5+fjzZs3yMnJkTlhqVGjBlxdXbFv3z4MHDgQ2dnZOHr0KJYvXy7zM3z++ed48OABAKB9+/bYuHEjAEBXV1dIzACgbt26QvzZ2dm4e/cuZs+ejR9++EE45u3btzA0NAQAtGrVCkZGRkhMTISWlhZat26NHj164JdffgFQWJPWsWNHmZ7zvWXwwVcgIiKiKjF+/HiYm5sjMDAQdevWRUFBAfr374+8vDyZzs/OzgZQ2CxYr149qX3a2tpS70tLpt4dFCASiYQEMS0tDePGjcPQoUMxefJkGBkZITk5GbNnz0ZeXp5ctUlubm7w9vZGRkYGTp8+DR0dHXTt2lXmZ1i/fj3evn0LAKhZs6ZM8Rddd8GCBWjbtq3UcRoaGsLxRU2b2tracHR0hFgsRm5uLm7cuIELFy6U2lQrLyZnRERE1cDTp09x+/ZtBAYGwsHBAQBKTElRnJaWFgAgPz9f2NaiRQtoa2vjwYMH723+q4irV69CIpHAz89PSGbi4uJKxFQ8nrLY29ujfv36OHjwIE6ePIm+ffsKzyPLM5ibm8sdf506dVC3bl3cu3cP7u7uZR7XoUMH7N69G9ra2pg0aRI0NDTg4OCATZs2ITc3F/b29nLf+11Mzqja6n8qttxj9nct+weMiKg6MTIygrGxMXbu3AkzMzM8ePAAISEhZR5vbm4OkUiE48ePw9nZGTo6OjAwMMDIkSMRFBQEiUSC9u3b4+XLl0hJSYGBgQEGDRpU4fiaNm2KvLw8bN++HS4uLkhOTkZERESJmLKzs5GQkACxWAxdXd0ya9T69++PiIgI3LlzBz///LOwvTKf4dtvv0VgYCAMDQ3RtWtX5Obm4sqVK3jx4gVGjBgBAOjYsSOCgoKgpaWF9u3bAwAcHR2xZMkS2NjYSPX9qygmZ0RVzDDox3KPeTnz6yqIhIiqEw0NDaxYsQKBgYHo378/mjVrhjlz5sDb27vU4+vVq4eJEyciJCQEM2fOxMCBAxEcHIxJkybBxMQEYWFhSEtLg6GhIVq3bo3x48d/UHyWlpaYOXMmNmzYgOXLl8PBwQFTpkzBjBkzhGPs7e3xxRdfYNKkSXj27Bl8fX0xceLEUq/n7u6On376Cebm5kISVKSynsHLyws1a9bEpk2bsGTJEujp6aFVq1b46quvhGNatWqFWrVqwcLCAvr6+gAKE7b8/HyF1UaKJEWNrR+px49fKuxaivijKhIBdeoY
4smTl3hfyatzrdDHXkZV+X2kzlhG5VPFMjIzM1R2CERKx3nOiIiIiFQIkzMiIiIiFcLkjIiIiEiFMDkjIiIiUiFKT87Cw8Ph4uICGxsbeHl54dKlS+89fuvWrejTpw9sbW3h7OyMRYsW4c2bN1UULREREVHlUupUGgcPHkRQUBACAgLQtm1b/Pzzzxg1ahQOHToEU1PTEsfv27cPISEhWLRoEezs7HDnzh34+flBJBJh5syZSngCeheniSAiIvowSq0527JlC4YMGQIPDw+0bNkSAQEBqFmzJqKioko9/sKFC7C3t4ebmxsaNWoEJycn9O/fv9zaNiIiIqLqQmk1Z7m5ubh69SrGjRsnbNPQ0EDnzp1x4cKFUs+xs7NDbGwsLl26BFtbW9y7dw8nTpzAgAED3nsvkUihoX/QvYr2KyKmqnwuRWIZla8qy+hjxTIqH8uISDUpLTl7+vQp8vPzSzRfmpqa4tatW6We4+bmhqdPn+Lf//43JBIJ3r59iy+++OK9MwKbmOhDU1MxFYSvZTimTh3ZJlA0Nf3wiRZlvVdVYhmVT9XK6GPHMiofy4hItVSr5ZvOnTuHsLAwzJs3D7a2trh79y4WLlyItWvX4ptvvin1nMzMLIX9V2ggwzFPnrx/RQKRqPAXYUbGh8/IXd69lIFlVD5VK6OPFcuofKpYRqr4D1V1kZaWhp49e2LPnj2wsrJSdjhycXFxgY+PD4YPH67UOPz8/PDixQusW7euzGO8vb1haWmJ2bNnV1ocSkvOateuDU1NTWRkZEhtz8jIQJ06dUo9Z9WqVXB3d4eXlxcAQCwWIzs7G3PnzsXXX38NDY3Sa8iq8peOrPeSSD48LlX5ZSovllH5qrKMPnYso/KxjAr9Hta+/IMUqNe45Eq9/rlz5+Dj44OkpCTUqlWrUu8lq+joaCxatAjnz5+X2h4ZGVnmAujqSGkDArS1tdGmTRskJCQI2woKCpCQkAA7O7tSz3n9+nWJBExTUxMA8JEvEUpERKQURd2IKpOJiQmTs2KUOlpzxIgR2LVrF2JiYnDz5k34+/sjJycHgwcPBgBMnz4dISEhwvE9evTAr7/+igMHDuDevXs4ffo0Vq1ahR49eghJGhER0cfq5MmTGDp0KBwcHNCxY0eMGzcOd+/eLXFcWloafHx8AAAdOnSAWCyGn58fgMKKkLCwMLi4uMDW1hbu7u44dOiQcO65c+cgFotx4sQJDB48GDY2NkhOToa3tzcCAwOxZMkSODo6okuXLggNDZW675YtW+Dm5oZ27drB2dkZ/v7+yMrKEq47c+ZMvHz5EmKxGGKxWDjfxcUFW7duBQBMnToVkyZNkrpuXl4eOnbsiD179sj0DGW5fv06fHx8YGtri44dO+KHH34Q4itNdnY2pk+fDjs7Ozg5OWHz5s3l3kMRlNrnrF+/fsjMzMTq1avx+PFjWFlZYePGjUKzZnp6ulRN2ddffw2RSISVK1fi4cOHMDExQY8ePTB58mRlPQIREVGVycnJwYgRI4RuPatWrcI333yDvXv3Sh3XoEEDhIaGYuLEiTh06BAMDAxQs2ZNAEBYWBhiY2MREBAACwsLJCUl4fvvv4eJiQkcHR2Fa4SEhGDGjBlo3Lix0CwaExMjVKz8+eef8PPzg729Pbp06QIAEIlEmD17Nho1aoR79+4hICAAS5cuhb+/P+zs7DBr1iysXr1aSKT09PRKPKObmxu+++47ZGVlQV9fHwAQHx+P169fo1evXnI9Q3HZ2dkYNWoU7OzsEBkZiYyMDMyZMwcLFixAcHBwqecsWbIESUlJWLduHUxMTLBixQpcvXoVlpaWMn9mFaH0AQHDhg3DsGHDSt23fft2qfc1atSAr68vfH19qyI0IiIildKnTx+p94sWLUKnTp3w999/SyU6mpqaMDIyAlA4C0JRcpWbm4uwsDBs2bJF6ELUuHFjJCcnY+fOnVKJzbfffiskXUXEYrHwN9jCwgI7duxAQkKCcFzx
Dv2NGjXCpEmTMG/ePPj7+0NbWxuGhoYQiUQwMzMr8xmdnJygq6uLI0eOYODAgQCA/fv3w8XFBQYGBnI9Q3H79+9Hbm4uFi9eLJTV3LlzMX78eEybNq1Ef/esrCxERkZi6dKl6NSpEwAgODgYzs7OZcauKEpPzoiIiEg2d+7cwerVq3Hx4kU8ffpU6G+dnp6OFi1alHv+P//8g5ycHIwcOVJqe15eXokRnjY2NiXOF4vFUu/NzMykBvadOXMGYWFhuHXrFl69eoX8/Hy8efMGOTk5Mvcpq1GjBlxdXbFv3z4MHDgQ2dnZOHr0KJYvXy7zM3z++ed48OABAKB9+/bYuHEjbt68CbFYLJXE2tvbo6CgALdv3y6RnN27dw95eXlo27atsM3Y2BjNmjWT6Tk+BJMzIiKiamL8+PEwNzdHYGAg6tati4KCAvTv3x95eXkynZ+dnQ2gsFmwXr16Uvu0tbWl3peWTNWoIZ02iEQiIUFMS0vDuHHjMHToUEyePBlGRkZITk7G7NmzkZeXJ1eHfzc3N3h7eyMjIwOnT5+Gjo4OunbtKvMzrF+/XhjEUNScW50wOSMiIqoGnj59itu3byMwMBAODg4AUGJKiuK0tLQAAPn5+cK2Fi1aQFtbGw8ePCiz+a+irl69ColEAj8/P6G/eFxcXImYisdTFnt7e9SvXx8HDx7EyZMn0bdvX+F5ZHkGc3PzEttatGiBmJgYZGdnC7VnKSkp0NDQKLU2rHHjxtDS0sLFixfRsGFDAMDz589x584ddOjQodxn+BBMzoiIiKoBIyMjGBsbY+fOnTAzM8ODBw+kZjR4l7m5OUQiEY4fPw5nZ2fo6OjAwMAAI0eORFBQECQSCdq3b4+XL18iJSUFBgYGGDRoUIXja9q0KfLy8rB9+3a4uLggOTkZERERJWLKzs5GQkICxGIxdHV1y6xR69+/PyIiInDnzh38/PPPwvaKPoObmxtWr14NPz8/+Pr6IjMzEwsWLMCAAQNKnV9VX18fHh4eWLp0KYyNjWFqaooVK1ZAVAXrnTE5I/qI9T8VW+4x+7u6V0EkRPShNDQ0sGLFCgQGBqJ///5o1qwZ5syZA29v71KPr1evHiZOnIiQkBDMnDkTAwcORHBwMCZNmgQTExOEhYUhLS0NhoaGaN269XuXQpSFpaUlZs6ciQ0bNmD58uVwcHDAlClTMGPGDOEYe3t7fPHFF5g0aRKePXsGX19fTJw4sdTrubu746effoK5uTnat5eeILgiz6Crq4tNmzZh4cKF8PT0hK6uLnr37i1MMVKa6dOnIzs7G19//TX09fUxYsQIvHr1Ss6SkZ9I8pHP3vr4seKW7zEM+rHcY17O/Pq9+0WiwuVJnjx5/3Ip1fWPKsuofCyjqiFrGakzVSwjMzMu30Sk1EloiYiIiEgakzMiIiIiFcLkjIiIiEiFMDkjIiIiUiEyJWfbtm3DmzdvAAAPHjzARz6GgIiIiEhpZErOgoODhaGjPXv2RGZmZqUGRURERKSuZJrnrG7duvjtt9/g7OwMiUSC//73v0JN2ruKZtElIiIiIvnJlJx9/fXXWLBgARYsWACRSARPT88Sx0gkEohEIly7dk3hQRIRERGpC5mSs3/961/CCu/u7u7YsmULateuXdmxEREREakdmZdvMjAwQKtWrRAUFIT27duXWL2eiIiIqpa3tzcsLS0xe/ZsZYeiEpRZHufOnYOPjw+SkpJQq1atD7qW3GtrfsiiqEREqkadl7giaYNi7av0fjHuKVV6PypJkQmVIsmUnDk6OuLQoUMwMTFBhw4d3rsie2JiosKCIyL1I8vao0D5648SkfLl5+dDJBJBQ4PTqspDpuRs5syZMDAwEF6/LzkjIiIixcvOzoa/vz+OHDkCfX19jBw5Umr/8+fPsXDhQvzxxx/Izc1Fhw4dMGfOHFhYWEAikaBTp07w9/dH3759AQADBgxARkYG4uPjAQDnz5/H8OHDkZSUBF1dXYjFYgQG
BuL48eOIj49HvXr1MGPGDPTs2bPMGKOjo7Fo0SIsXrwYISEhuHPnDg4fPoy6detixYoV2L9/P16+fIlPPvkE06ZNQ8eOHYVzd+3ahbVr1+LZs2dwcnKCg4MD1q5di/PnzwMA/Pz88OLFC6xbt044Z+HChUhNTcX27dtLjWfPnj3Ytm0bbt++DT09PXz66aeYNWsWTE1NkZaWBh8fHwBAhw4dABS2DgYHB6OgoAAbNmzAzp078eTJE1hYWGDChAlC2QHAiRMnsGjRIqSnp6Nt27YKbVmUKTkrfsPBgwcr7OZEREQkmyVLliApKQnr1q2DiYkJVqxYgatXr8LS0hJAYfLyzz//4Mcff4SBgQGWLl2KsWPH4sCBA9DS0kKHDh2QmJiIvn374vnz57h58yZq1qyJmzdvokWLFkhKSoKNjQ10dXWFe65Zswbff/89pk+fju3bt2PatGn4448/YGxsXGacr1+/xoYNGxAYGAhjY2OYmppi/vz5+Pvvv7FixQrUrVsXR44cwejRo7Fv3z5YWFggOTkZ8+bNw7Rp0+Di4oIzZ85g9erVH1xmb9++xXfffYfmzZsjIyMDwcHB8PPzw4YNG9CgQQOEhoZi4sSJOHToEAwMDFCzZk0AQFhYGGJjYxEQEAALCwskJSXh+++/h4mJCRwdHZGeng5fX198+eWXGDJkCK5cuYLFixd/cLxF5O5zZmVlhfj4eJiamkptf/r0KTp37sypNIiIKhmbftVPVlYWIiMjsXTpUnTq1AlA4QTxzs7OAIA7d+7g2LFj+PXXX2FvX9h3btmyZejevTt+//13uLq6wtHRETt37gQAJCUloXXr1qhTpw4SExPRokULJCYmwtHRUeq+gwYNQv/+/QEAU6ZMwfbt23Hp0iV069atzFjz8vLg7+8vJI0PHjxAdHQ0/vjjD9SrVw8AMGrUKJw6dQrR0dGYMmUKduzYgW7dumHUqFEAgGbNmuHChQs4fvz4B5Vb8am/GjdujNmzZ8PT0xNZWVnQ19eHkZERAMDU1FToc5abm4uwsDBs2bIFdnZ2wrnJycnYuXMnHB0d8euvv6JJkybw8/MDADRv3hw3btzAhg0bPijeInInZ2Ut3ZSbmwstLa0PDoiIiIik3bt3D3l5eWjbtq2wzdjYGM2aNQMA3Lx5EzVq1JDaX7t2bTRr1gw3b94EUNh0t3DhQmRmZiIpKQmOjo5Ccubp6YkLFy5g9OjRUvcVi8XCaz09PRgYGAirBBVNsQUA7du3x8aNGwEAWlpaUufduHED+fn5Uk2CQGHeUFQDd/v2bfTq1Utqv62t7QcnZ1euXMGaNWuQmpqK58+fCzlMeno6WrZsWeo5//zzD3Jycko0G+fl5cHKygpAYXnb2tpK7W/Xrt0HxVqczMnZtm3bAAAikQi7d++Gnp6esK+goABJSUlo3ry5wgIjIiLVwBGtHwexWAwjIyMkJiYiKSkJkyZNgpmZGTZu3IjLly/j7du3Qk1RkXcrXUQiEQoKCgAA69evx9u3bwFAaA4sel28b3p2djY0NTURFRUFTU1NqesVzyXKIxKJSlQQFd2/NNnZ2Rg1ahScnJywbNky1K5dG+np6Rg1ahTy8vLeex5Q2LRZVNNXpKqmEZM5Odu6dSuAwpqziIgIqZEXWlpaaNSoEQICAhQeIBERkbpr3LgxtLS0cPHiRWGZxOfPn+POnTvo0KEDWrRogbdv3+LixYtCs+bTp09x+/ZtoYZIJBLBwcEBR48exX/+8x+0b98eurq6yM3Nxc6dO2FtbS1XsmRubi7TcVZWVsjPz0dmZiYcHBxKPaZZs2a4cuWK1LbLly9LvTcxMcF//vMfqW3Xrl0rs9Xu1q1bePbsGaZNm4YGDRoAQIl7FJ2bn58vbGvRogW0tbXx4MGDEs28xY85duyY1LaLFy+WemxFyJycFQXh7e2NNWvWCO20REREVLn09fXh4eGBpUuXCp3sV6xYIdRQWVhYoGfPnvjhhx8QEBAA
AwMDLFu2DPXq1ZMaXeno6IjFixfD2toa+vr6AAAHBwfs27dP6O+laM2aNYObmxumT58OPz8/WFlZ4enTp0hISIBYLEb37t0xbNgwDBs2DFu2bEGPHj1w9uxZnDx5UqoG7tNPP8WmTZuwZ88etGvXDrGxsfjPf/6D1q1bl3rfhg0bQktLC9u3b8fQoUNx48YNqZGeQGGCKRKJcPz4cTg7O0NHRwcGBgYYOXIkgoKCIJFI0L59e7x8+RIpKSkwMDDAoEGD8MUXX2Dz5s1YvHgxvLy8cPXqVcTExCiszOSeeGT79u1CYiaRSMrsg0ZERESKM336dLRv3x5ff/01RowYgfbt28Pa2lrYHxQUhDZt2mD8+PH417/+BYlEgvXr10vVLDk6OiI/P1+qRqi0bYoWFBSEgQMHIjg4GK6urpgwYQIuX74s1Gi1b98eAQEB2LJlCwYMGIBTp05h+PDh0NHREa7RtWtXTJgwAUuXLhU69Q8cOLDMe5qYmCA4OBiHDh1Cv379sGHDBsyYMUPqmHr16mHixIkICQlB586dsWDBAgDApEmTMGHCBISFhaFfv34YPXo0jh8/jkaNGgEoTPxCQ0Nx9OhRDBgwABEREZg8ebLCykskqUB2tWfPHmzatAl37twBUJixjxo16r2FpCyPH79U2LVkGSFV3ugokQioU8cQT568xPtKvrr28WAZlY9l9H6KGonIMqqeZWRmZqiQ61D1N2fOHNy6dQu//PKLskOpcnKP1tyyZQtWrVqFL7/8EpMmTQIAJCcnw9/fH8+ePcPw4cMVHCIRERF97DZt2oQuXbpAV1cXJ0+exJ49ezBv3jxlh6UUcidn27dvh7+/v1QtWc+ePfHJJ58gNDSUyRkRERHJ7dKlS9i4cSOysrKEOcm8vLyUHZZSyJ2cPX78uMRQWwCws7PD48ePFRIUERERqZdVq1YpOwSVIfeAgKZNmyIuLq7E9oMHD8LCwkIRMRERERGpLblrziZOnIjJkycjKSlJmEslJSUFZ8+excqVKxUdHxEREZFakbvmrE+fPti1axdq166No0eP4ujRo6hduzZ2796Nzz77rDJiJCIiIlIbctecAYC1tTWWLVum6FiIiIiI1F6FkjMAyMjIQEZGhrDGVpGiVeiJiIiISH5yJ2dXrlyBn58fbt68WWJ1AJFIhGvXriksOCIiIiJ1I3dyNmvWLFhYWGDhwoUwNTWVWveKiIiIlKNoQvhbt27B2dkZ69atK3UbqT65k7N79+4hNDQUTZs2rYx4iIiIlKJjdHiV3u/c4C8Ver3g4GBYWlpiw4YN0NPTK3NbcWlpaejZsyf27NkDKysrhcZDFSf3aM1OnTohNTW1MmIhIiKiCrp79y4+/fRT1K9fH7Vq1SpzG6k+uZOzwMBAREVFYc2aNfjtt9+E6TSKvoiIiEjxcnNzERgYiE6dOsHGxgZDhw7FpUuXkJaWBrFYjGfPnmHWrFkQi8WIjo4uddu7evbsCQAYOHAgxGIxvL29AQAFBQVYs2YNunXrBmtrawwYMAAnT56scIxFzp07B7FYjISEBAwePBht27bFF198gVu3bimolD4Ocjdr/vnnn0hJSSn1Q+KAACIiosqxZMkS/PbbbwgODoa5uTk2btyI0aNH47fffkN8fDz69u2Lb7/9Fv369YO+vj66du0qtc3Q0LDENXfv3g0vLy9s3boVLVu2hJaWFgBg27Zt2LJlC+bPnw8rKytERUVhwoQJ2L9//3tXAyorxsOHD8PY2Fg4bsWKFfDz84OJiQnmzZuHWbNmISIiQtFFVm1VqObM3d0d8fHxSE1NlfpiYkZERKR42dnZiIiIwPTp0+Hs7IyWLVtiwYIF0NHRQVRUFMzMzCASiWBoaAgzMzPo6emV2FazZs0S1zUxMQEAGBsbw8zMTEigNm3ahDFjxuDzzz9H8+bN8f3338PS0hI///xzhWKMjIyUOnby5MlwdHREy5YtMXbsWFy4cAFv3rxRXIFVc3LX
nD19+hTDhw9HnTp1KiMeIiIiesfdu3eRl5cnLJsIAFpaWrC1tcXNmzdlusbcuXOxb98+4f2FCxdKPe7Vq1d49OiR1L0AwN7eXuhz/tNPPyEsLEzYd+DAAbx48ULmGMVisfDazMwMQOH8qQ0bNpTpWT52cidnvXv3xrlz59CkSROFBBAeHo5Nmzbh8ePHsLS0xA8//ABbW9syj3/x4gVWrFiBI0eO4NmzZzA3N8esWbPg7OyskHiIiIg+Rt999x1GjRqlkGt98cUXcHV1Fd7XrVsXL168kPn8GjX+l34UTcn17qT26kzu5MzCwgIhISFITk5Gq1atpAoYAHx8fGS+1sGDBxEUFISAgAC0bdsWP//8M0aNGoVDhw7B1NS0xPG5ubkYMWIETE1NsWrVKtSrVw8PHjzgCBQiIvqoNWnSBFpaWkhJSYG5uTkAIC8vD5cvX8ZXX30l0zVMTU1L/G0t6mOWn58vbDMwMEDdunWRkpICR0dHYXtKSopQeWJsbCzVh0xRMVIhuZOz3bt3Q09PD4mJiUhMTJTaJxKJ5ErOtmzZgiFDhsDDwwMAEBAQgOPHjyMqKgpjx44tcXxUVBSeP3+OiIgI4RuqUaNG8j4CERFRtaKnp4ehQ4diyZIlMDIyQsOGDbFx40a8fv0anp6eFb6uqakpatasiVOnTqF+/frQ0dGBoaEhRo0ahdDQUDRp0gSWlpaIjo5Gamrqe9fVrqwY1ZHcydmxY8cUcuPc3FxcvXoV48aNE7ZpaGigc+fOZbaDHzt2DO3atcP8+fNx9OhRmJiYoH///hgzZgw0NTUVEhcREZEqmjZtGiQSCaZPn46srCxYW1tj48aNMDIyqvA1a9SogTlz5mDt2rVYvXo1HBwcsH37dvj4+ODVq1cIDg5GZmYmWrRogXXr1r13pGZlxaiOKrzw+Yd6+vQp8vPzS1Sxmpqaljnfyb1793D27Fm4ublh/fr1uHv3LgICAvD27Vv4+vqWea+qXGGqvHsV7VdETNV15SyWUflYRuVjGZWPZSQfRc/Yr2g6OjqYM2cO5syZU+r+8+fPy7TtXV5eXvDy8pLapqGhAV9f3/f+ba1IjB07dsT169eltllZWZXYpu6UlpxVhEQigampKRYsWABNTU1YW1vj4cOH2LRpU5nfQCYm+tDUlHvGkFK9luGYOnVKziNTGlNT2Y5TxL2qEsuofCyj95OlfACWkSzUuYyIqjOlJWe1a9eGpqYmMjIypLZnZGSUOU2HmZkZatSoIdWE2bx5czx+/Bi5ubnQ1tYucU5mZpbC/qMzkOGYJ09evne/SFT4izAj4yUkkg+Lp7x7KQPLqHwso/eTpXwAlpEsqmMZMckjUmJypq2tjTZt2iAhIQG9evUCUDiMNiEhAcOGDSv1HHt7e+zfvx8FBQXQ0CisDbtz5w7MzMxKTcyKfOgvHXnIei+J5MPjqsrnUiSWUflYRuVjGZWPZURUPSmmva+CRowYgV27diEmJgY3b96Ev78/cnJyMHjwYADA9OnTERISIhw/dOhQPHv2DAsXLsTt27dx/PhxhIWF4csvVbufABEREZGsKlRz9uLFC1y6dAkZGRmQvPPv0sCBA2W+Tr9+/ZCZmYnVq1fj8ePHsLKywsaNG4VmzfT0dKGGDAAaNGiATZs2ISgoCO7u7qhXrx58fHwwZsyYijwGERERkcqp0FQa06ZNQ3Z2NgwMDISZfYHCec7kSc4AYNiwYWU2Y27fvr3ENjs7O+zatUuuexARERFVF3InZ4sXL4aHhwemTJkCXV3dyoiJiIiISG3J3efs4cOH8PHxYWJGREREVAnkTs6cnJxw+fLlyoiFiIiISO3J1Kx59OhR4bWzszOWLl2Kmzdvlrrwec+ePRUbIREREZEakSk5++abb0psW7t2bYltIpEI165d+/CoiIiIiNSUTMlZampqZcdBRERERKhAn7M9e/YgNze3xPbc3Fzs2bNHETERERERqS25k7OZM2fi5cuSa6hl
ZWVh5syZCgmKiIiISF3JnZxJJBKpiWeLPHz4EIaGXLCWiIiI6EPIPAntwIEDIRKJIBKJ8NVXX0mN0szPz0daWhq6du1aKUESERERqQuZk7NevXoBAK5duwYnJyfo6+sL+7S0tGBubo7evXsrPkIiIiIiNSJzcubr6wsAMDc3R79+/aCjo1NpQRERERGpK7nX1hw0aFBlxEFEREREkDE5c3R0xKFDh2BiYoIOHTqUOiCgSGJiosKCIyIiIlI3MiVnM2fOhIGBgfD6fckZEREREVWcTMlZ8abMwYMHV1owREREROpO7nnOpk+fjqioKNy9e7cy4iEiIiJSa3IPCNDS0sL69esxe/Zs1KtXDx06dEDHjh3RoUMHWFhYVEKIREREROpD7uRs4cKFAApXBEhKSkJiYiI2b96MuXPnwszMDCdPnlR4kERERETqQu5mzSK1atWCsbExjIyMUKtWLWhqasLExESRsRERERGpHblrzpYvX47ExET89ddfaNGiBTp06IAxY8agQ4cOMDIyqowYiYiIiNSG3MnZ+vXrYWJiAl9fX3z22Wdo1qxZZcRFREREpJbkTs727NmDxMREoa+ZlpYWHB0dhS8ma0REREQVJ3dyZmlpCUtLS/j4+AAAUlNTsXXrVsyfPx8FBQW4du2awoMkIiIiUhdyJ2cSiQR//fUXEhMTce7cOaSkpODVq1cQi8Xo0KFDZcRIREREpDbkTs4cHR2RnZ0NsVgMR0dHDBkyBA4ODqhVq1ZlxEdERESkVuROzpYuXQoHBwdhrU0iIiIiUhy5k7Pu3btXQhhEREREBHzAJLREREREpHhMzoiIiIhUCJMzIiIiIhXC5IyIiIhIhcg9IAAA7ty5g3PnziEjIwMFBQVS+3x9fRUSGBEREZE6kjs527VrF/z9/VG7dm3UqVMHIpFI2CcSiZicEREREX0AuZOzH3/8EZMmTcLYsWMrIx4iIiIitSZ3n7Pnz5/D1dW1MmIhIiIiUntyJ2d9+/ZFfHx8ZcRCREREpPbkbtZs2rQpVq1ahYsXL6JVq1aoUUP6Ej4+PgoLjoiIiEjdyJ2c7dy5E3p6ekhMTERiYqLUPpFIxOSMiIiI6APInZwdO3asMuIgIiIiInzgJLQSiQQSiURRsRARERGpvQolZ3v27IGbmxtsbW1ha2sLNzc37NmzR8GhEREREakfuZs1t2zZglWrVuHLL7/EpEmTAADJycnw9/fHs2fPMHz4cAWHSERERKQ+5E7Otm/fDn9/fwwcOFDY1rNnT3zyyScIDQ1lckZERET0AeRu1nz8+DHs7OxKbLezs8Pjx48rFER4eDhcXFxgY2MDLy8vXLp0SabzDhw4ALFYjAkTJlTovkRERESqRu7krGnTpoiLiyux/eDBg7CwsJA7gIMHDyIoKAjffPMNYmJiYGlpiVGjRiEjI+O956WlpWHx4sVwcHCQ+55EREREqkruZs2JEydi8uTJSEpKgr29PQAgJSUFZ8+excqVK+UOYMuWLRgyZAg8PDwAAAEBATh+/DiioqLKXL8zPz8f06ZNw8SJE5GcnIwXL17IfV8iIiIiVSR3ctanTx/s2rULW7duxdGjRwEAzZs3x+7du9G6dWu5rpWbm4urV69i3LhxwjYNDQ107twZFy5cKPO8tWvXwtTUFF5eXkhOTi73PiKRXGF9kPLuVbRfETFV5XMpEsuofCyj8rGMyscyIqqe5E7OAMDa2hrLli374Js/ffoU+fn5MDU1ldpuamqKW7dulXrO+fPnERkZKfPUHSYm+tDU/KDp3ASvZTimTh1Dma5lairbcYq4V1ViGZWPZfR+spQPwDKShTqXEVF1JlNy9urVKxgYGAiv36fouMrw6tUrTJ8+HQsWLICJiYlM52RmZinsPzpZnuzJk5fv3S8SFf4izMh4iQ+dv7e8eykDy6h8LKP3k/U3CMuofNWxjJjkEcmYnHXo0AHx8fEwNTWFg4MDRKVkOxKJBCKRCNeuXZP55rVr
14ampmaJzv8ZGRmoU6dOiePv3buH+/fv4+uvvxa2FRQUAABat26NQ4cOoUmTJqXEJnNIH0zWe0kkHx5XdV2cgWVUPpZR+VhG5WMZEVVPMiVnP//8M4yMjAAA27ZtU9jNtbW10aZNGyQkJKBXr14ACpOthIQEDBs2rMTxzZs3x759+6S2rVy5EllZWZg9ezbq16+vsNiIiIiIlEGm5MzR0VF43ahRIzRo0KBE7ZlEIkF6errcAYwYMQIzZsyAtbU1bG1t8fPPPyMnJweDBw8GAEyfPh316tXD1KlToaOjg1atWkmdX6tWLQAosZ2IiIioOpJ7QEDPnj2FJs7inj17hp49e8rVrAkA/fr1Q2ZmJlavXo3Hjx/DysoKGzduFJo109PToaGhmA79RERERKpO7uSsqG/Zu7Kzs6Gjo1OhIIYNG1ZqMyZQuFzU+wQHB1fonkRERESqSObkLCgoCAAgEomwcuVK6OrqCvvy8/Nx6dIlWFpaKj5CIiIiIjUic3L2119/ASisObtx4wa0tLSEfdra2rC0tMTIkSMVHyERERGRGpE5OStqXpw5cyZmz55dqfOZEREREakruXvaz5o1C2/fvi2x/dmzZ+VOUEtERERE7yd3cjZ58mQcOHCgxPa4uDhMnjxZIUERERERqSu5k7NLly7h008/LbHd0dERly5dUkhQREREROpK7uQsNze31GbNt2/f4vVrWZfjJSIiIqLSyD3PmY2NDXbt2oUffvhBantERATatGmjsMCI6P0GxdrLcNTUSo+DiIgUS+7kbNKkSRgxYgRSU1PRqVMnAEBCQgIuX76MzZs3KzxAIiIiInUid7Nm+/btsXPnTtSvXx9xcXE4duwYmjRpgtjYWDg4OFRGjERERERqQ+6aMwCwsrJCSEiIomMhIiIiUnsVSs6KvHnzBnl5eVLbODktERERUcXJnZzl5ORg6dKliIuLw7Nnz0rsv3btmiLiIiIiIlJLcvc5W7JkCc6ePQt/f39oa2sjMDAQEydORN26dbF48eLKiJGIiIhIbcidnP3xxx+YN28e+vTpA01NTTg4OGDChAmYPHky9u3bVxkxEhEREakNuZOz58+fo3HjxgAK+5c9f/4cQOEozvPnzys2OiIiIiI1I3dy1qhRI6SlpQEAmjdvjri4OACFNWqGhoaKjY6IiIhIzcg9IMDDwwOpqalwdHTE2LFjMX78eOzYsQNv376Fn59fZcRIpHYuRncv/6AGlR4GEREpgdzJ2fDhw4XXnTt3RlxcHK5evYomTZrA0tJSkbERERERqR25mjXz8vLw1Vdf4c6dO8I2c3Nz9O7dm4kZERERkQLIlZxpaWnh+vXrlRULERERkdqTe0CAu7s7IiMjKyMWIiIiIrUnd5+z/Px8/Prrrzhz5gysra2hq6srtX/mzJkKC46IiIhI3cidnN24cQOtW7cGANy+fVtqn0gkUkxURERERGpK5uTs3r17aNSoEbZv316Z8RARERGpNZn7nPXu3RuZmZnC+0mTJuHJkyeVEhQRERGRupI5OZNIJFLvT5w4gZycHIUHRERERKTO5B6tSURERESVR+bkTCQSscM/ERERUSWTeUCARCKBn58ftLW1AQC5ubnw9/cvMZXGmjVrFBshERERkRqROTkbNGiQ1Ht3d3eFB0NERESk7mROzoKCgiozDiIiIiICBwQQERERqRQmZ0REREQqhMkZERERkQphckZERESkQpicEREREakQJmdEREREKoTJGREREZEKYXJGREREpEKYnBERERGpECZnRERERCqEyRkRERGRClGJ5Cw8PBwuLi6wsbGBl5cXLl26VOaxu3btwr///W906NABHTp0wPDhw997PBEREVF1ovTk7ODBgwgKCsI333yDmJgYWFpaYtSoUcjIyCj1+HPnzuHzzz/Htm3bEBERgQYNGmDkyJF4+PBhFUdOREREpHhKT862bNmCIUOGwMPDAy1btkRAQABq1qyJqKioUo8PCQnB
l19+CSsrK7Ro0QKBgYEoKChAQkJCFUdOREREpHhKTc5yc3Nx9epVdO7cWdimoaGBzp0748KFCzJdIycnB2/fvoWRkVFlhUlERERUZWoo8+ZPnz5Ffn4+TE1Npbabmpri1q1bMl1j2bJlqFu3rlSC9y6R6IPClEt59yrar4iYqvK5FIllpFqqaxnx+6h8LCOi6kmpydmHWr9+PQ4ePIht27ZBR0en1GNMTPShqamYCsLXMhxTp46hTNcyNZXtOEXcqyqxjMonSxlVJVUrI1nLR52/j1hGRB83pSZntWvXhqamZonO/xkZGahTp857z920aRPWr1+PLVu2wNLSsszjMjOzFPYfnYEMxzx58vK9+0Wiwl+EGRkvIZF8WDzl3UsZWEblk6WMqpKqlZGs5aPO30cfcxkxySNScnKmra2NNm3aICEhAb169QIAoXP/sGHDyjxvw4YN+Omnn7Bp0ybY2NiUe58P/aUjD1nvJZF8eFxV+VyKxDJSLdW1jPh9VD6WEVH1pPRmzREjRmDGjBmwtraGra0tfv75Z+Tk5GDw4MEAgOnTp6NevXqYOnUqgMKmzNWrVyMkJATm5uZ4/PgxAEBPTw/6+vpKew4iIiIiRVB6ctavXz9kZmZi9erVePz4MaysrLBx40ahWTM9PR0aGv/rMxYREYG8vDx8++23Utfx9fXFxIkTqzR2IiIiIkVTenIGAMOGDSuzGXP79u1S748dO1YVIREREREphdInoSUiIiKi/2FyRkRERKRCmJwRERERqRAmZ0REREQqhMkZERERkQphckZERESkQpicEREREakQJmdEREREKoTJGREREZEKYXJGREREpEKYnBERERGpECZnRERERCqEyRkRERGRCmFyRkRERKRCmJwRERERqRAmZ0REREQqhMkZERERkQphckZERESkQpicEREREakQJmdEREREKoTJGREREZEKYXJGREREpEKYnBERERGpECZnRERERCqEyRkRERGRCmFyRkRERKRCmJwRERERqRAmZ0REREQqhMkZERERkQphckZERESkQpicEREREakQJmdEREREKoTJGREREZEKYXJGREREpEKYnBERERGpECZnRERERCqEyRkRERGRCmFyRkRERKRCmJwRERERqRAmZ0REREQqhMkZERERkQphckZERESkQpicEREREakQJmdEREREKkQlkrPw8HC4uLjAxsYGXl5euHTp0nuPj4uLQ9++fWFjYwM3NzecOHGiiiIlIiIiqlxKT84OHjyIoKAgfPPNN4iJiYGlpSVGjRqFjIyMUo9PSUnB1KlT4enpiT179qBnz5745ptvcOPGjSqOnIiIiEjxlJ6cbdmyBUOGDIGHhwdatmyJgIAA1KxZE1FRUaUev23bNnTt2hWjR49GixYtMGnSJLRu3Ro7duyo4siJiIiIFK+GMm+em5uLq1evYty4ccI2DQ0NdO7cGRcuXCj1nD///BPDhw+X2ubk5ITff/+9zPuIRAoJVybl3atovyJiqsrnUiSWkWqprmXE76PysYyIqieRRCKRKOvmDx8+RLdu3RAREQE7Ozth+5IlS5CUlITdu3eXOMfa2hrBwcHo37+/sC08PBxr167FmTNnqiRuIiIiosqi9GZNIiIiIvofpSZntWvXhqamZonO/xkZGahTp06p59SpUwdPnjyR+XgiIiKi6kSpyZm2tjbatGmDhIQEYVtBQQESEhKkmjmLa9euHc6ePSu17cyZM2jXrl1lhkpERERUJZTerDlixAjs2rULMTExuHnzJvz9/ZGTk4PBgwcDAKZPn46QkBDheB8fH5w6dQqbN2/GzZs3ERoaiitXrmDYsGHKegQiIiIihVHqaE0A6NevHzIzM7F69Wo8fvwYVlZW2Lhxo9BMmZ6eDg2N/+WQ9vb2WLZsGVauXInly5fDwsICa9euRatWrZT1CEREREQKo9TRmkREREQkTenNmkREVPnWrFmDnJwcZYdBRDJgclaJ8vPz
kZqaitevX5fYl5OTg9TUVBQUFCghMtWTnp6O//73v8L7S5cuYeHChdi5c6cSo1IdeXl5aN26NZcpqwCJRIITJ07g22+/VXYoSrV27VpkZ2crOwwikgGTs0q0d+9ezJo1C1paWiX2aWlpYdasWdi3b58SIlM9U6dOFUbhPn78GCNGjMDly5exYsUKrFmzRsnRKZ+WlhYaNGjAZF4O9+7dw8qVK9G9e3f4+vrizZs3yg5JqdiDhaj6YHJWiSIjIzFq1ChoamqW2FejRg2MHj0au3btUkJkquc///kPbG1tAQBxcXH45JNPEBERgWXLliEmJkbJ0amG8ePHY/ny5Xj27JmyQ1FZubm5iI2NhY+PD1xdXREWFoYRI0YgISEBYWFhyg5P6URcY4moWlD6aM2P2e3bt9G2bdsy99vY2ODmzZtVGJHqevv2LbS1tQEUzlvn4uICAGjevDkeP36szNBURnh4OP755x907doVDRs2hJ6entR+dU5ir1y5gsjISBw4cABNmjTBgAEDsHz5cjg7O8PJyQkGBgbKDlEl9OnTp9wELTExsYqiIaKyMDmrRDk5OXj16lWZ+7Oyskrtj6aOWrZsiYiICHTv3h1nzpzBpEmTAACPHj2CsbGxUmNTFb169VJ2CCpryJAhGDZsGHbu3InmzZsrOxyVNXHiRBgaGio7DCIqB5OzStS0aVNcuHABlpaWpe5PTk5G06ZNqzgq1TRt2jT4+vpi06ZNGDhwoFBmx44dE5o71Z2vr6+yQ1BZnTp1QmRkJDIyMjBgwAB07dqVTXil+Pzzz2FqaqrsMIioHEzOKlH//v2xcuVK2NnZlUjQUlNTsXr1aowePVpJ0amWjh074uzZs3j16hWMjIyE7UOGDIGurq4SI1M9ubm5yMzMLDE4oGHDhkqKSPk2bdqE9PR0REVFwd/fH2/evIGrqysA9rMiouqHk9BWory8PIwcORIpKSno1KmT0Nxy69YtJCQkwN7eHps3by51NCfRu27fvo3Zs2fjwoULUtslEglEIhGuXbumpMhUz+nTpxEdHY0jR46gQYMG6NOnD/r06YM2bdooOzSlsbS0xPHjx1G/fn1lh0JE5WByVsny8vKwdetW7N+/H//88w8kEgksLCzQv39/fPXVV0IneHU0aNAgbN26FUZGRhg4cOB7azjUubN7kS+++AI1atTAmDFjULdu3RLlVVbzuTp7/vw5YmNjERUVhevXr6t1AmtpaYnTp0+zWZOoGmCzZiXT0tLCmDFjMGbMmFL337hxQ23XBe3Zs6eQnPbs2ZPNT+VITU1FVFQUWrRooexQqg0jIyN4e3vD29sbV69eVXY4REQyYXKmBK9evcKBAwewe/duXL16VW3/m/f19RWS04kTJyo7HJXXokULPH36VNlhqKQ7d+5g9erVmD9/folpM16+fAl/f39hBLA64z9ARNUDmzWrUFJSEnbv3o0jR46gbt26+Oyzz9C7d2+1Ho1oaWkJGxsbeHl5oV+/fpyP6j0SEhKwatUqTJ48Ga1atSrRV1Gdy+6HH36AoaEhpk+fXur+pUuX4tWrVwgICKjiyFSHpaUlunXrVm5XCq7IQaR8rDmrZI8fP0ZMTAwiIyPx6tUruLq6Ijc3F2vXrkXLli2VHZ7S7dixA1FRUQgODkZQUBB69+4NLy8vODg4KDs0lTNixAgAwPDhw6W2c0BA4cSpS5cuLXO/q6srpk6dWoURqSZ9fX3UrFlT2WEQUTlYc1aJxo8fj6SkJHTv3h1ubm7o2rUrNDU10aZNG+zdu5fJWTHZ2dmIi4tDTEwMzp8/j6ZNm8LDwwODBg2CmZmZssNTCeXN3O7o6FhFkageW1tbxMXFwdzcvNT99+/fR79+/XDx4sUqjkx1cEAAUfXBmrNKdPLkSXh7e2Po0KGwsLBQdjgqTU9PDx4eHvDw8MA///yD6Oho/PLLL1i9ejWcnJzw008/KTtEpVPn5Ks8hoaGuHv3bpnJ2d27d9W62Rdg
fzOi6oTJWSX65ZdfEBkZicGDB6NFixYYMGAA+vXrp+ywVF7Tpk0xbtw4NGzYEMuXL8eJEyeUHZJSpaamynScOk+l4eDggB07dqBTp06l7t+2bRvat29fxVGplvIaSW7evInIyEjMmDGjiiIiorKwWbMKZGdn4+DBg4iKisLly5eRn58PPz8/eHh4qP1/8+9KSkpCVFQUfvvtN2hoaMDV1RWenp5o166dskNTGktLS4hEovf+cVX3Pmd//fUX/vWvf6FHjx4YPXo0mjVrBqBwwueNGzfi+PHjiIiIUOtJaBMTE2Fvb48aNf73P3l2djYOHDiAqKgo/Pnnn2jZsiX279+vxCiJCGByVuVu3bqFyMhIxMbG4sWLF+jcubPaN9k9fPgQMTExiImJwT///AM7Ozt4enrC1dUVenp6yg5P6e7fvy/TcWU16amLP/74A7NmzcKzZ8+kthsbGyMwMBA9e/ZUTmAqKDk5GZGRkTh06BBev36N4cOHw9PTk3PoEakIJmdKkp+fjz/++AORkZFqnZyNHj0aCQkJqF27NgYMGAAPDw9hmSuqGH9/f3z77bcwMTFRdihV7vXr1zh16pSwGkezZs3QpUsXrs8KICMjA9HR0YiKisKrV6/w+eefo3///vjiiy84QIlIxTA5I6UaP348PD090aNHD2hqaio7nI+Cvb099u7di8aNGys7FJUgkUhw8uRJREVFYfXq1coOR2lsbW3Rp08fuLu7o0uXLtDQ0AAAjh4nUkEcEFCJZs6cWe4xIpEIixYtqoJoVJM61xpWFv6/VejevXuIiopCTEwMMjMz0blzZ2WHpFQNGzZEcnIyGjZsiIYNG7IJk0iFMTmrRDExMWjYsCFat27NP5il8PX1RXBwMAwMDODr6/veYzlrOckiNzcXhw4dQmRkJFJSUpCfn48ZM2bA09NT7QffHDp0SOhr5unpiWbNmsHd3R0Ap9kgUjVMzirR0KFDceDAAaSlpWHw4MFwd3eHsbGxssNSGYaGhqW+JpLXlStXEBkZiQMHDqBJkyYYMGAAli9fDmdnZzg5Oal9Ylakffv2aN++PebMmYMDBw4gOjoa+fn58Pf3h5ubG3r16qWWfRWJVA37nFWy3NxcHD58GFFRUbhw4QKcnZ3h6ekJJycn/rdKlcLOzg6xsbFq1eesdevWGDZsGL744gupASXsT1W+ovnN9u7di+fPn+Pq1avKDolI7TE5q0L3799HTEwM9uzZg/z8fOzfvx/6+vrKDos+MuqYnI0aNQoXLlxAjx49MGDAAHTt2hUikYjJmRzevn2LY8eOoXfv3soOhUjtsVmzChWNjpJIJMjPz1dyNKrlyZMnWLx4MRISEpCZmVmij546T7AqL3d3d7VL+jdt2oT09HRERUXB398fb968gaurKwD2p5KFRCLB6dOnsX//fiZnRCqANWeVrHizZnJyMrp37w4PDw907dpVSNaocL6z9PR0fPnll6hbt26J/b169VJCVKrl5MmT0NPTg4ODAwAgPDwcu3btQsuWLTF37lwYGRkpOULVcfr0aURHR+PIkSNo0KAB+vTpgz59+qj1CgGlKW1Ea1hYmLLDIlJ7TM4qkb+/Pw4ePIj69evDw8MDbm5u7GxbBjs7O/zyyy+wsrJSdigqy83NDdOmTYOzszOuX78OT09PjBgxAufOnUPz5s0RFBSk7BBVzvPnzxEbG4uoqChcv36dNbDgiFai6oDNmpUoIiICDRs2ROPGjZGUlISkpKRSj+M0EUCDBg043Ug50tLShLmpDh8+jB49emDKlCm4evUqxo4dq+ToVJORkRG8vb3h7e0t1dFdHVdR4IhWouqD7WqVaODAgejYsSNq1aoFQ0PDMr8ImDVrFkJCQpCWlqbsUFSWlpYWXr9+DQA4c+YMunTpAqAwAXn16pUyQ6sWijdpxsbGIisrS4nRVL0hQ4ZAW1sbO3fuRFRUFHx8fFCnTh1lh0VEpWDNWSUKDg5WdggqrUOHDlKd
tbOzs/HZZ5+hZs2a0NLSkjo2MTGxqsNTOfb29ggKCoK9vT0uX76MlStXAgDu3LmD+vXrKze4akYda2k7deqEyMhIZGRkSI1oJSLVw+SsCsycOROzZ88u0WyQnZ2NBQsWqG1foVmzZik7hGpl7ty5CAgIwG+//YZ58+ahXr16AAoHCnTt2lXJ0ZGq44hWouqDAwKqgJWVFeLj42Fqaiq1PTMzE05OTvjrr7+UFBmRelLHueDexRGtRKqLNWeV6NWrV5BIJJBIJMjKyoKOjo6wLz8/HydPnlSrDsnvc+LECWhoaJSoAYqPj0d+fj6cnZ2VFJlqevPmDfLy8qS2sUM3yaNLly7o0qWL1IjWDRs2cEQrkQpgclaJHBwcIBKJIBKJ0KdPnxL7RSIRJk6cqITIVM+yZcswbdq0EtsLCgoQEhLC5AyFzeDLli1DXFwcnj17VmI//6hSRZQ1opWIlIfJWSXatm0bJBIJvvrqK4SGhkpNEqqlpYWGDRsK/YbU3T///CNME1Fc8+bNcffuXSVEpHqWLl2Kc+fOwd/fH9OnT8fcuXPx8OFD7Ny5E1OnTlV2eNWKOq6i8D6vXr1CbGwsIiMjER0drexwiNQek7NK5OjoCAA4evQoGjZsWG6nW3Wce6mIoaEh7t27h0aNGkltv3v3LnR1dZUUlWr5448/sHjxYnTs2BEzZ86Eg4MDmjZtioYNG2Lfvn1wd3dXdohKJ+sqCgEBAcoMU2WcPXsWUVFROHLkCAwMDPDZZ58pOyQiAuc5qxLm5uYyjYZSx7mXivTs2ROLFi2SqiX7559/EBwcDBcXFyVGpjqeP38udGA3MDDA8+fPAQDt27fH+fPnlRmayli6dKnwM3T9+nUEBwfD2dkZaWlpnNrm/z18+BA//vgjPvvsM3z33XfYv38/Fi1ahFOnTmHevHnKDo+IwORMpajzwNnvv/8eenp6cHV1hYuLC1xcXNCvXz8YGxtj+vTpyg5PJTRq1EiYpLd58+aIi4sDUFijxsmMC5W1isLcuXNx8uRJJUenXL/99hvGjBmDvn374tq1a5gxYwZOnToFDQ0NtGrVitNpEKkQNmuSSjA0NERERAROnz6N1NRU1KxZE2KxGB06dFB2aCrDw8MDqampcHR0xNixYzF+/Hjs2LEDb9++hZ+fn7LDUwnvrqIwcOBAAFxFAQAmT56MMWPGYMWKFRzZS6TimJyRyhCJRHBycoKTkxOAwprEEydOICoqCqtXr1ZydMo3fPhw4XXnzp0RFxeHq1evokmTJrC0tFReYCqEqyiUzdPTE+Hh4Th37hwGDBiAfv36SQ1SIiLVweSMVM69e/cQFRWFmJgYZGZmonPnzsoOSWUkJCQgISEBGRkZKCgokNqnritNFMdVFMo2f/58zJo1C3FxcYiKisKiRYvg5OQEiURS4nuJiJSLKwSoEHWetTw3NxeHDh1CZGQkUlJSkJ+fjxkzZsDT05NNMP9vzZo1WLt2LaytrWFmZlaij9DatWuVFBlVR3fu3EF0dDRiYmKQnZ2N7t27o0+fPujdu7eyQyNSe0zOVMi8efPw3XffqdVUGleuXEFkZCQOHDiAJk2aCM0tzs7O2Lt3L1q2bKnsEFWGk5MTpk2bJvSjovfjKgqyKSgowPHjxxEZGYmTJ0/iypUryg6JSO0xOasCss69pI5at26NYcOG4YsvvkDz5s2F7W3atGFy9o6OHTti9+7daNKkibJDUVlcReHDZGRklFgDmIiqHqfSqAKce6lsnTp1QmRkJNauXYuTJ0+q9XQi5fH09MS+ffuUHYZKW7p0Kc6ePQt/f39oa2sjMDAQEydORN26dbF48WJlh6dUSUlJ5X7dunVL2WESETggoEqUNffS1atXMXbsWCVHp1ybNm1Ceno6oqKi4O/vjzdv3sDV1RUAOO8SpDv5FxQUYNeuXUhISIBYLEaNGtI/vjNnzqzq8FQOV1Eom7e3d5n7in7WRCIR/vrrr6oKiYjKwOSsCnDu
pfdr0KABfH194evri9OnTyM6OhqampqYMGEC+vTpgz59+qBNmzbKDlMp3v1DWTRlxo0bN6S2M5Et9L5VFNR9yaakpKRSt+fk5GDbtm3Yvn27Wg5GIlJFTM6qAOdekl2XLl3QpUsXPH/+HLGxsYiKisKGDRvUtq/Q9u3blR1CtVK0ikLDhg2FVRRsbW25igJQ4vkLCgoQFRWFNWvWQENDA3PnzsWgQYOUFB0RFccBAVXgwYMHCAgIQHp6Ory9veHl5QUAWLRoEQoKCjBnzhwlR6jarl69KtScqfPi8FS+rVu3QkNDAz4+Pjhz5gzGjx8PiUQirKLw1VdfKTtElXD48GEsX74cT58+xdixY+Ht7Q1tbW1lh0VE/4/JGVUr9vb22Lt3L5tfSCb379/nKgrFJCYmYtmyZbhx4wZ8fHwwZswYta9RJFJFbNasYpx76cPwfwkqD1dRKN2YMWOQkJCAwYMHY+3atTAzM1N2SERUBiZnVYBzLxFVjfJWUVBnp06dQo0aNRAXF4dDhw6VeVxiYmIVRkVEpWFyVgWWLl2Kc+fOwd/fH9OnT8fcuXPx8OFD7Ny5E1OnTlV2eEQfjYiICAQFBXEVhVKoc60hUXXD5KwKcO4loqqRl5cHe3t7ZYehkjgSk6j64AoBVeB9cy+dP39emaERfVS4ikLZnj9/ju3bt5c6t+LLly/L3EdEVY81Z1WAcy8pjru7O/T19ZUdBqkQrqIgmx07duD69eulrhRgaGiI8+fP49WrV/j666+VEB0RFcepNKoA514qHxeHp4p637JExYlEImzbtq2So1FdAwYMgJ+fHzp16lTq/oSEBCxevBh79uyp2sCIqAQmZ0rAuZdKcnNzw7Rp0+Ds7Izr16/D09MTI0aMwLlz59C8eXN2Zib6QHZ2djhw4AAaNmxY6v4HDx6gf//+SElJqeLIiOhdbNasIpx76f24ODxR5dLU1MSjR4/KTM4ePXoEDQ12QyZSBfxJrAJr1qzByJEjkZCQgKdPn+LFixdSX1RycfguXboA4OLwRIpiZWWF33//vcz9R44cgZWVVRVGRERlYc1ZFeDcS+Xj4vBElWvYsGGYMmUK6tevj6FDh0JTUxMAkJ+fj19++QU///wzli1bpuQoiQhgn7Mq0bFjR+zevRtNmjRRdigqi4vDE1W+FStWICwsDPr6+sL0Pvfu3UN2djZGjRqFadOmKTlCIgKYnFWJpUuXQk9PD998842yQyEiNXfp0iXExsbi7t27kEgksLCwgJubG2xtbZUdGhH9PyZnleTduZf27NkDsVjMuZdkwMXhiZTP398f3377LUxMTJQdCpHaYZ+zSvLXX39JvS+aMuPGjRtS27kwcyEuDk+kWmJjYzFq1CgmZ0RKwOSskmzfvl3ZIVQrXByeSLWwUYVIeTiVBqmEP/74A/PmzUOfPn2gqakJBwcHTJgwAZMnT+ZaiUREpFaYnJFK4OLwREREhZickUooWhwegLA4PAAuDk9ERGqHyRmpBA8PD6SmpgIAxo4di/DwcNjY2CAoKAijRo1ScnRERERVh1NpkEri4vBEyjVv3jx89913HK1JpARMzkhlcHF4osp38uRJ6OnpwcHBAQAQHh6OXbt2oWXLlpg7dy6MjIyUHCERsVmTVAIXhyeqGkuXLkVWVhYA4Pr16wgODoazszPS0tIQHBys5OiICOA8Z6QiuDg8UdVIS0tDixYtAACHDx9Gjx49MGXKFFy9ehVjx45VcnREBLDmjFREXl4e7O3tlR0G0UdPS0sLr1+/BgCcOXMGXbp0AQAYGRnh1atXygyNiP4fkzNSCZ6enpxslqgK2NvbIygoCGvXrsXly5fRvXt3AMCdO3dQv3595QZHRAA4IICUiIvDE1W9Bw8eICAgAOnp6fD29oaXlxcAYNGiRSgoKMCcOXOUHCERMTkjpfH29pbpOJFIhG3btlVyNERERKqByRkRkZp68+YN8vLypLYZGBgoKRoiKsLRmkREaiQ7
OxvLli1DXFwcnj17VmL/tWvXqj4oIpLCAQFERGpk6dKlOHv2LPz9/aGtrY3AwEBMnDgRdevWxeLFi5UdHhGByRkRkVr5448/MG/ePPTp0weamppwcHDAhAkTMHnyZI6YJlIRTM6IiNTI8+fP0bhxYwCF/cueP38OAGjfvj3Onz+vzNCI6P8xOSMiUiONGjVCWloaAKB58+aIi4sDUFijZmhoqMzQiOj/cbQmEZEa2bp1KzQ0NODj44MzZ85g/PjxkEgkePv2Lfz8/PDVV18pO0QitcfkjIhIjd2/fx9Xr15FkyZNYGlpqexwiAicSoOISO0kJCQgISEBGRkZKCgokNpXfOUOIlIOJmdERGpkzZo1WLt2LaytrWFmZgaRSKTskIjoHWzWJCJSI05OTpg2bRoGDhyo7FCIqAwcrUlEpEby8vJgb2+v7DCI6D2YnBERqRFPT09ONkuk4tisSUT0kSveyb+goAB79uyBWCyGWCxGjRrSXY9nzpxZ1eER0Ts4IICI6CP3119/Sb0vmjLjxo0bUts5OIBINbDmjIiIiEiFsM8ZERERkQphckZERESkQpicEREREakQJmdEREREKoTJGdFHzNvbGwsXLhTe5+TkYOLEibC3t4dYLMaLFy+UGB0REZWGyRmRkvj5+UEsFmPu3Lkl9gUEBEAsFsPPz0+ma507d67UZCs0NBTfffed8D4mJgbnz59HREQE4uPjYWho+GEPQURECsfkjEiJGjRogIMHD+L169fCtjdv3mD//v1o2LDhB1/f2NgYBgYGwvt79+6hRYsWaNWqFRe9JiJSUUzOiJSodevWaNCgAQ4fPixsO3z4MBo0aAArKythW25uLgIDA9GpUyfY2Nhg6NChuHTpEgAgLS0NPj4+AIAOHTpI1bgVb9b09vbG5s2bkZSUBLFYDG9v76p6TCIikgOTMyIl8/DwQHR0tPA+KioKgwcPljpmyZIl+O233xAcHIyYmBg0bdoUo0ePxrNnz9CgQQOEhoYCAA4dOoT4+HjMnj27xH1CQ0MxZMgQ2NnZIT4+XjiHiIhUC5MzIiVzd3dHcnIy7t+/j/v37yMlJQXu7u7C/uzsbERERGD69OlwdnZGy5YtsWDBAujo6CAyMhKampowMjICAJiamsLMzKzUvmTGxsaoWbMmtLS0YGZmBmNj46p6RCIikgPX1iRSMhMTE3Tv3h0xMTGQSCTo3r07TExMhP13795FXl4e7O3thW1aWlqwtbXFzZs3lREyERFVItacEamAoqbNmJgYeHh4KDscIiJSIiZnRCqga9euyMvLw9u3b+Hk5CS1r0mTJtDS0kJKSoqwLS8vD5cvX0bLli0BFNakAUB+fn7VBU1ERJWCzZpEKkBTUxNxcXHC6+L09PQwdOhQLFmyBEZGRmjYsCE2btyI169fw9PTEwBgbm4OkUiE48ePw9nZGTo6OtDX16/y5yAiog/H5IxIRRSfj+xd06ZNg0QiwfTp05GVlQVra2ts3LhRGAhQr149TJw4ESEhIZg5cyYGDhyI4ODgqgqdiIgUSCSRSCTKDoKIiIiICrHPGREREZEKYXJGREREpEKYnBERERGpECZnRERERCqEyRkRERGRCmFyRkRERKRCmJwRERERqRAmZ0REREQqhMkZERERkQphckZERESkQpicEREREakQJmdEREREKuT/AAbqrXCTHHXrAAAAAElFTkSuQmCC", | |
| "text/plain": [ | |
| "<Figure size 1600x1200 with 1 Axes>" | |
| ] | |
| }, | |
| "metadata": {}, | |
| "output_type": "display_data", | |
| "transient": {} | |
| } | |
| ], | |
| "source": [ | |
| "# Create figure with multiple subplots\n", | |
| "fig = plt.figure(figsize=(16, 12))\n", | |
| "\n", | |
| "# 1. Motif enrichment across promoter types\n", | |
| "ax1 = plt.subplot(3, 3, 1)\n", | |
| "motif_counts = df_promoters.groupby('type')[['has_tMAC', 'has_AchiVis', \n", | |
| " 'has_Inr', 'has_ACA', \n", | |
| " 'has_CNAAATT']].sum()\n", | |
| "motif_counts_norm = motif_counts.div(df_promoters.groupby('type').size(), axis=0)\n", | |
| "motif_counts_norm.T.plot(kind='bar', ax=ax1)\n", | |
| "ax1.set_title('Motif Enrichment by Promoter Type', fontsize=12, fontweight='bold')\n", | |
| "ax1.set_ylabel('Fraction with motif')\n", | |
| "ax1.set_xlabel('Motif')\n", | |
| "ax1.legend(title='Promoter type', bbox_to_anchor=(1.05, 1), loc='upper left')\n", | |
| "# After .T the x-axis holds the motifs (the DataFrame's columns), so label ticks from .columns\n", | |
| "ax1.set_xticklabels([m.replace('has_', '') for m in motif_counts_norm.columns], rotation=45, ha='right')\n", | |
| "\n", | |
| "# 2. CAGE signal distribution by promoter type\n", | |
| "ax2 = plt.subplot(3, 3, 2)\n", | |
| "for ptype in df_promoters['type'].unique():\n", | |
| " data = df_promoters[df_promoters['type'] == ptype]['cage_signal']\n", | |
| " ax2.hist(data, alpha=0.5, label=ptype, bins=30)\n", | |
| "ax2.set_title('Expression Distribution', fontsize=12, fontweight='bold')\n", | |
| "ax2.set_xlabel('CAGE signal')\n", | |
| "ax2.set_ylabel('Frequency')\n", | |
| "ax2.legend()\n", | |
| "\n", | |
| "# 3. RETI width distribution\n", | |
| "ax3 = plt.subplot(3, 3, 3)\n", | |
| "for ptype in ['off-to-on', 'down-regulated']:\n", | |
| " data = df_promoters[df_promoters['type'] == ptype]['reti_width']\n", | |
| " ax3.hist(data, alpha=0.5, label=ptype, bins=20)\n", | |
| "ax3.axvline(x=11, color='red', linestyle='--', label='Narrow/broad threshold')\n", | |
| "ax3.set_title('RETI Width Distribution', fontsize=12, fontweight='bold')\n", | |
| "ax3.set_xlabel('RETI width (bp)')\n", | |
| "ax3.set_ylabel('Frequency')\n", | |
| "ax3.legend()\n", | |
| "\n", | |
| "# 4. Positional distribution of tMAC-ChIP motif\n", | |
| "ax4 = plt.subplot(3, 3, 4)\n", | |
| "if 'tMAC-ChIP' in position_data and len(position_data['tMAC-ChIP']['hist']) > 0:\n", | |
| " ax4.bar(position_data['tMAC-ChIP']['bin_centers'], \n", | |
| " position_data['tMAC-ChIP']['hist'],\n", | |
| " width=4, color='steelblue')\n", | |
| "ax4.axvline(x=0, color='red', linestyle='--', linewidth=2, label='TSS')\n", | |
| "ax4.axvline(x=-60, color='orange', linestyle='--', linewidth=1.5, label='Expected position')\n", | |
| "ax4.set_title('tMAC-ChIP Motif Position', fontsize=12, fontweight='bold')\n", | |
| "ax4.set_xlabel('Position relative to TSS (bp)')\n", | |
| "ax4.set_ylabel('Count')\n", | |
| "ax4.legend()\n", | |
| "\n", | |
| "# 5. Positional distribution of ACA motif\n", | |
| "ax5 = plt.subplot(3, 3, 5)\n", | |
| "if 'ACA' in position_data and len(position_data['ACA']['hist']) > 0:\n", | |
| " ax5.bar(position_data['ACA']['bin_centers'], \n", | |
| " position_data['ACA']['hist'],\n", | |
| " width=4, color='forestgreen')\n", | |
| "ax5.axvline(x=0, color='red', linestyle='--', linewidth=2, label='TSS')\n", | |
| "for pos in [26, 28, 30]:\n", | |
| " ax5.axvline(x=pos, color='orange', linestyle='--', linewidth=1, alpha=0.7)\n", | |
| "ax5.set_title('ACA Motif Position', fontsize=12, fontweight='bold')\n", | |
| "ax5.set_xlabel('Position relative to TSS (bp)')\n", | |
| "ax5.set_ylabel('Count')\n", | |
| "ax5.legend()\n", | |
| "\n", | |
| "# 6. Expression by number of motifs\n", | |
| "ax6 = plt.subplot(3, 3, 6)\n", | |
| "expr_by_motifs = df_off_to_on.groupby('n_motifs')['cage_signal'].agg(['mean', 'std', 'count'])\n", | |
| "ax6.errorbar(expr_by_motifs.index, expr_by_motifs['mean'], \n", | |
| " yerr=expr_by_motifs['std']/np.sqrt(expr_by_motifs['count']),\n", | |
| " marker='o', capsize=5, capthick=2, linewidth=2, markersize=8)\n", | |
| "ax6.set_title('Expression vs Motif Count', fontsize=12, fontweight='bold')\n", | |
| "ax6.set_xlabel('Number of motifs')\n", | |
| "ax6.set_ylabel('Mean CAGE signal ± SEM')\n", | |
| "ax6.set_xticks(range(6))\n", | |
| "\n", | |
| "# 7. Promoter class distribution\n", | |
| "ax7 = plt.subplot(3, 3, 7)\n", | |
| "class_counts = df_off_to_on['promoter_class'].value_counts()\n", | |
| "colors_class = {'narrow-high': 'darkred', 'narrow-low': 'lightcoral',\n", | |
| " 'broad-high': 'darkblue', 'broad-low': 'lightblue'}\n", | |
| "ax7.pie(class_counts.values, labels=class_counts.index, autopct='%1.1f%%',\n", | |
| " colors=[colors_class.get(x, 'gray') for x in class_counts.index])\n", | |
| "ax7.set_title('Promoter Class Distribution\\n(off-to-on genes)', \n", | |
| " fontsize=12, fontweight='bold')\n", | |
| "\n", | |
| "# 8. Logistic regression probabilities\n", | |
| "ax8 = plt.subplot(3, 3, 8)\n", | |
| "if len(df_prob) > 0:\n", | |
| " ax8.bar(df_prob['n_motifs'], df_prob['mean_probability'], \n", | |
| " yerr=df_prob['std_probability'], capsize=5, color='purple', alpha=0.7)\n", | |
| "ax8.set_title('Narrow-High Probability by Motif Count', fontsize=12, fontweight='bold')\n", | |
| "ax8.set_xlabel('Number of motifs')\n", | |
| "ax8.set_ylabel('Probability')\n", | |
| "ax8.set_ylim([0, 1.1])\n", | |
| "ax8.axhline(y=0.5, color='gray', linestyle='--', label='50% threshold')\n", | |
| "ax8.legend()\n", | |
| "\n", | |
| "# 9. Model feature importance\n", | |
| "ax9 = plt.subplot(3, 3, 9)\n", | |
| "coef_df_sorted = coef_df.sort_values('coefficient')\n", | |
| "colors_coef = ['red' if x < 0 else 'green' for x in coef_df_sorted['coefficient']]\n", | |
| "ax9.barh(range(len(coef_df_sorted)), coef_df_sorted['coefficient'], color=colors_coef, alpha=0.7)\n", | |
| "ax9.set_yticks(range(len(coef_df_sorted)))\n", | |
| "ax9.set_yticklabels(coef_df_sorted['motif'])\n", | |
| "ax9.axvline(x=0, color='black', linewidth=1)\n", | |
| "ax9.set_title('Logistic Regression Coefficients', fontsize=12, fontweight='bold')\n", | |
| "ax9.set_xlabel('Coefficient (log-odds)')\n", | |
| "\n", | |
| "plt.tight_layout()\n", | |
| "plt.savefig('analysis_summary.png', dpi=150, bbox_inches='tight')\n", | |
| "plt.show()\n", | |
| "\n", | |
| "print(\"\\nFigure saved as 'analysis_summary.png'\")" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "## 2.9 Summary and Conclusions\n", | |
| "\n", | |
| "Let's summarize the key findings from our analysis." | |
| ] | |
| }, | |
| { | |
| "cell_type": "code", | |
| "execution_count": 11, | |
| "metadata": {}, | |
| "outputs": [ | |
| { | |
| "name": "stdout", | |
| "output_type": "stream", | |
| "text": [ | |
| "======================================================================\n", | |
| " ANALYSIS SUMMARY\n", | |
| "======================================================================\n", | |
| "\n", | |
| "1. MOTIF ENRICHMENT\n", | |
| "----------------------------------------------------------------------\n", | |
| "\n", | |
| "Key motifs enriched in spermatocyte-specific promoters:\n", | |
| " • tMAC-ChIP motif (TAGTACC) - upstream regulator\n", | |
| " • Achi/Vis motif (TGTCA) - transcription factor binding site\n", | |
| " • Initiator (TCA) - precise TSS positioning\n", | |
| " • ACA motif - downstream element at +26/+28/+30\n", | |
| " • CNAAATT motif - downstream element at +29 to +60\n", | |
| "\n", | |
| "2. POSITIONAL ENRICHMENT\n", | |
| "----------------------------------------------------------------------\n", | |
| "\n", | |
| "tMAC-ChIP:\n", | |
| " Occurrences: 182\n", | |
| " Mean position: -58.1 bp from TSS\n", | |
| "\n", | |
| "ACA:\n", | |
| " Occurrences: 1201\n", | |
| " Mean position: +1.9 bp from TSS\n", | |
| "\n", | |
| "CNAAATT:\n", | |
| " Occurrences: 149\n", | |
| " Mean position: +35.5 bp from TSS\n", | |
| "\n", | |
| "3. MOTIF EFFECTS ON EXPRESSION\n", | |
| "----------------------------------------------------------------------\n", | |
| "\n", | |
| "Inr:\n", | |
| " Fold change in expression: 2.74×\n", | |
| " Statistical significance: P = 2.04e-38\n", | |
| "\n", | |
| "ACA:\n", | |
| " Fold change in expression: 2.09×\n", | |
| " Statistical significance: P = 6.40e-26\n", | |
| "\n", | |
| "CNAAATT:\n", | |
| " Fold change in expression: 2.01×\n", | |
| " Statistical significance: P = 4.87e-25\n", | |
| "\n", | |
| "4. ADDITIVE EFFECTS\n", | |
| "----------------------------------------------------------------------\n", | |
| "\n", | |
| "Mean expression by number of motifs:\n", | |
| " 0 motifs: 92.0 (n=2)\n", | |
| " 1 motifs: 163.7 (n=9)\n", | |
| " 2 motifs: 285.6 (n=23)\n", | |
| " 3 motifs: 447.9 (n=37)\n", | |
| " 4 motifs: 682.8 (n=30)\n", | |
| " 5 motifs: 982.1 (n=99)\n", | |
| "\n", | |
| "5. LOGISTIC REGRESSION MODEL\n", | |
| "----------------------------------------------------------------------\n", | |
| "\n", | |
| "Model performance (test set):\n", | |
| " Accuracy: 97.5%\n", | |
| " ROC AUC: 0.974\n", | |
| "\n", | |
| "Feature importance (by coefficient):\n", | |
| " ACA: 2.333 (OR = 10.31)\n", | |
| " AchiVis: 2.273 (OR = 9.71)\n", | |
| " CNAAATT: 2.178 (OR = 8.83)\n", | |
| " Inr: 2.005 (OR = 7.43)\n", | |
| " tMAC: 1.788 (OR = 5.98)\n", | |
| "\n", | |
| "Promoters with all 5 motifs:\n", | |
| " Probability of narrow-high: 96.9% ± 0.0%\n", | |
| " (Paper: 92% ± 5.5%)\n", | |
| "\n", | |
| "======================================================================\n", | |
| " KEY BIOLOGICAL INSIGHTS\n", | |
| "======================================================================\n", | |
| "\n", | |
| "1. CELL TYPE-SPECIFIC PROMOTER ARCHITECTURE:\n", | |
| " Spermatocyte-specific promoters use a distinct architecture from canonical\n", | |
| " core promoters, lacking TATA and DPE elements.\n", | |
| "\n", | |
| "2. CHROMATIN OPENING MECHANISM:\n", | |
| " The tMAC complex binds upstream (~60 bp) and creates a ~100 bp\n", | |
| " nucleosome-free region for transcription initiation.\n", | |
| "\n", | |
| "3. PRECISE SPATIAL ORGANIZATION:\n", | |
| " Motif positions are precisely defined relative to TSS, suggesting\n", | |
| " constraints from nucleosome positioning.\n", | |
| "\n", | |
| "4. ADDITIVE REGULATORY LOGIC:\n", | |
| " Multiple motifs work together additively to define promoter strength\n", | |
| " and TSS usage efficiency.\n", | |
| "\n", | |
| "5. NARROW-HIGH PROMOTER SIGNATURE:\n", | |
| " The combination of all 5 motifs at optimal positions creates a\n", | |
| " highly expressed, narrowly-initiated promoter (~92% probability).\n", | |
| "\n", | |
| "6. DEVELOPMENTAL GENE REGULATION:\n", | |
| " This study reveals how promoter-proximal elements and cell type-specific\n", | |
| " chromatin binding complexes collaborate to establish robust, tissue-specific\n", | |
| " transcription programs during differentiation.\n", | |
| "\n", | |
| "======================================================================\n", | |
| " ANALYSIS COMPLETE\n", | |
| "======================================================================\n" | |
| ] | |
| } | |
| ], | |
| "source": [ | |
| "print(\"=\"*70)\n", | |
| "print(\" \"*20 + \"ANALYSIS SUMMARY\")\n", | |
| "print(\"=\"*70)\n", | |
| "\n", | |
| "print(\"\\n1. MOTIF ENRICHMENT\")\n", | |
| "print(\"-\" * 70)\n", | |
| "print(\"\\nKey motifs enriched in spermatocyte-specific promoters:\")\n", | |
| "print(\" • tMAC-ChIP motif (TAGTACC) - upstream regulator\")\n", | |
| "print(\" • Achi/Vis motif (TGTCA) - transcription factor binding site\")\n", | |
| "print(\" • Initiator (TCA) - precise TSS positioning\")\n", | |
| "print(\" • ACA motif - downstream element at +26/+28/+30\")\n", | |
| "print(\" • CNAAATT motif - downstream element at +29 to +60\")\n", | |
| "\n", | |
| "print(\"\\n2. POSITIONAL ENRICHMENT\")\n", | |
| "print(\"-\" * 70)\n", | |
| "for motif_name in ['tMAC-ChIP', 'ACA', 'CNAAATT']:\n", | |
| " if motif_name in position_data and position_data[motif_name]['n_occurrences'] > 0:\n", | |
| " mean_pos = np.mean(position_data[motif_name]['positions'])\n", | |
| " print(f\"\\n{motif_name}:\")\n", | |
| " print(f\" Occurrences: {position_data[motif_name]['n_occurrences']}\")\n", | |
| " print(f\" Mean position: {mean_pos:+.1f} bp from TSS\")\n", | |
| "\n", | |
| "print(\"\\n3. MOTIF EFFECTS ON EXPRESSION\")\n", | |
| "print(\"-\" * 70)\n", | |
| "for _, row in df_tss_results.iterrows():\n", | |
| " print(f\"\\n{row['motif']}:\")\n", | |
| " print(f\" Fold change in expression: {row['fold_change']:.2f}×\")\n", | |
| " print(f\" Statistical significance: P = {row['p_value']:.2e}\")\n", | |
| "\n", | |
| "print(\"\\n4. ADDITIVE EFFECTS\")\n", | |
| "print(\"-\" * 70)\n", | |
| "print(\"\\nMean expression by number of motifs:\")\n", | |
| "for n in range(6):\n", | |
| " subset = df_off_to_on[df_off_to_on['n_motifs'] == n]\n", | |
| " if len(subset) > 0:\n", | |
| " print(f\" {n} motifs: {np.mean(subset['cage_signal']):.1f} (n={len(subset)})\")\n", | |
| "\n", | |
| "print(\"\\n5. LOGISTIC REGRESSION MODEL\")\n", | |
| "print(\"-\" * 70)\n", | |
| "print(f\"\\nModel performance (test set):\")\n", | |
| "print(f\" Accuracy: {accuracy_score(y_test, y_pred):.1%}\")\n", | |
| "print(f\" ROC AUC: {roc_auc_score(y_test, y_pred_proba):.3f}\")\n", | |
| "\n", | |
| "print(\"\\nFeature importance (by coefficient):\")\n", | |
| "for _, row in coef_df.iterrows():\n", | |
| " print(f\" {row['motif']}: {row['coefficient']:.3f} (OR = {row['odds_ratio']:.2f})\")\n", | |
| "\n", | |
| "if len(all_motifs) > 0:\n", | |
| " print(f\"\\nPromoters with all 5 motifs:\")\n", | |
| " print(f\" Probability of narrow-high: {mean_prob:.1%} ± {std_prob:.1%}\")\n", | |
| " print(f\" (Paper: 92% ± 5.5%)\")\n", | |
| "\n", | |
| "print(\"\\n\" + \"=\"*70)\n", | |
| "print(\" \"*15 + \"KEY BIOLOGICAL INSIGHTS\")\n", | |
| "print(\"=\"*70)\n", | |
| "\n", | |
| "print(\"\"\"\n", | |
| "1. CELL TYPE-SPECIFIC PROMOTER ARCHITECTURE:\n", | |
| " Spermatocyte-specific promoters use a distinct architecture from canonical\n", | |
| " core promoters, lacking TATA and DPE elements.\n", | |
| "\n", | |
| "2. CHROMATIN OPENING MECHANISM:\n", | |
| " The tMAC complex binds upstream (~60 bp) and creates a ~100 bp\n", | |
| " nucleosome-free region for transcription initiation.\n", | |
| "\n", | |
| "3. PRECISE SPATIAL ORGANIZATION:\n", | |
| " Motif positions are precisely defined relative to TSS, suggesting\n", | |
| " constraints from nucleosome positioning.\n", | |
| "\n", | |
| "4. ADDITIVE REGULATORY LOGIC:\n", | |
| " Multiple motifs work together additively to define promoter strength\n", | |
| " and TSS usage efficiency.\n", | |
| "\n", | |
| "5. NARROW-HIGH PROMOTER SIGNATURE:\n", | |
| " The combination of all 5 motifs at optimal positions creates a\n", | |
| " highly expressed, narrowly-initiated promoter (~92% probability).\n", | |
| "\n", | |
| "6. DEVELOPMENTAL GENE REGULATION:\n", | |
| " This study reveals how promoter-proximal elements and cell type-specific\n", | |
| " chromatin binding complexes collaborate to establish robust, tissue-specific\n", | |
| " transcription programs during differentiation.\n", | |
| "\"\"\")\n", | |
| "\n", | |
| "print(\"=\"*70)\n", | |
| "print(\" \"*20 + \"ANALYSIS COMPLETE\")\n", | |
| "print(\"=\"*70)" | |
| ] | |
| }, | |
| { | |
| "cell_type": "markdown", | |
| "metadata": {}, | |
| "source": [ | |
| "---\n", | |
| "\n", | |
| "# Part 3: Scaling to Full Experiments\n", | |
| "\n", | |
| "## How to Adapt This Notebook for Full-Scale Analysis\n", | |
| "\n", | |
| "This notebook demonstrates the workflow with small-scale synthetic data. For full replication of the paper's analysis, researchers would need:\n", | |
| "\n", | |
| "### 1. Computational Resources\n", | |
| "\n", | |
| "- **Memory:** 32-64 GB RAM for genome-wide analyses\n", | |
| "- **Storage:** 100+ GB for raw sequencing data and intermediate files\n", | |
| "- **CPU:** Multi-core processor for parallel processing\n", | |
| "- **Time:** Several hours to days for complete pipeline\n", | |
| "\n", | |
| "### 2. Data Generation\n", | |
| "\n", | |
| "**RNA-seq:**\n", | |
| "- ~100 million reads per condition (2 replicates × multiple time points)\n", | |
| "- Ribo-Zero depletion and stranded library prep\n", | |
| "- Alignment with STAR to *Drosophila* genome dm6\n", | |
| "\n", | |
| "**CAGE:**\n", | |
| "- ~50 million reads per condition\n", | |
| "- nAnTi-CAGE protocol for precise 5' end capture\n", | |
| "- CAGEr package for cluster identification\n", | |
| "\n", | |
| "**ATAC-seq:**\n", | |
| "- ~20-30 million reads per condition\n", | |
| "- Standard ATAC-seq protocol with Nextera transposase\n", | |
| "- NucleoATAC for nucleosome positioning\n", | |
| "\n", | |
| "### 3. Software Requirements\n", | |
| "\n", | |
| "**Sequencing analysis:**\n", | |
| "```bash\n", | |
| "# Read processing\n", | |
| "trim_galore # Adapter trimming\n", | |
| "STAR # RNA-seq alignment\n", | |
| "bwa # ATAC-seq alignment\n", | |
| "samtools # BAM file processing\n", | |
| "bedtools # Genomic intervals\n", | |
| "\n", | |
| "# Differential expression\n", | |
| "DESeq2 # R package\n", | |
| "\n", | |
| "# CAGE analysis\n", | |
| "CAGEr # R package\n", | |
| "\n", | |
| "# ATAC-seq\n", | |
| "NucleoATAC # Nucleosome positioning\n", | |
| "deepTools # Visualization\n", | |
| "\n", | |
| "# Motif discovery\n", | |
| "MEME-ChIP # De novo motif discovery\n", | |
| "DREME # Short motif discovery\n", | |
| "CentriMo # Positional enrichment\n", | |
| "FIMO # Motif scanning\n", | |
| "```\n", | |
| "\n", | |
| "**Python packages:**\n", | |
| "```bash\n", | |
| "pip install numpy pandas scipy matplotlib seaborn\n", | |
| "pip install biopython pybedtools\n", | |
| "pip install scikit-learn statsmodels\n", | |
| "pip install pysam HTSeq\n", | |
| "```\n", | |
| "\n", | |
| "**R packages:**\n", | |
| "```R\n", | |
| "install.packages(c(\"BiocManager\", \"ggplot2\", \"dplyr\"))\n", | |
| "BiocManager::install(c(\"DESeq2\", \"CAGEr\"))  # Bioconductor packages, not on CRAN\n", | |
| "```\n", | |
| "\n", | |
| "### 4. Data Files Needed\n", | |
| "\n", | |
| "- *Drosophila melanogaster* genome (dm6)\n", | |
| "- Gene annotations (Ensembl BDGP6.84)\n", | |
| "- Raw FASTQ files from sequencing\n", | |
| "- ChIP-seq data for tMAC component (from Kim et al. 2017)\n", | |
| "\n", | |
| "### 5. Full Pipeline Steps\n", | |
| "\n", | |
| "**Step 1: RNA-seq processing**\n", | |
| "```bash\n", | |
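| "# Quality control and trimming (paired mode writes *_val_1 / *_val_2 files)\n", | |
| "trim_galore --paired read1.fq.gz read2.fq.gz\n", | |
| "\n", | |
| "# Alignment (inputs are gzipped, so STAR needs a decompression command)\n", | |
| "STAR --genomeDir dm6_index --readFilesIn read1_val_1.fq.gz read2_val_2.fq.gz \\\n", | |
| "     --readFilesCommand zcat --outSAMtype BAM SortedByCoordinate\n", | |
| "\n", | |
| "# Count features (STAR's default sorted output is Aligned.sortedByCoord.out.bam)\n", | |
| "featureCounts -p -t exon -g gene_id -a annotation.gtf -o counts.txt Aligned.sortedByCoord.out.bam\n", | |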
| "```\n", | |
| "\n", | |
| "**Step 2: Differential expression in R**\n", | |
| "```R\n", | |
| "library(DESeq2)\n", | |
| "dds <- DESeqDataSetFromMatrix(countData = counts, \n", | |
| " colData = metadata, \n", | |
| " design = ~ condition)\n", | |
| "dds <- DESeq(dds)\n", | |
| "res <- results(dds, contrast = c(\"condition\", \"72hrPHS\", \"bam\"))\n", | |
| "```\n", | |
| "\n", | |
| "**Step 3: CAGE analysis**\n", | |
| "```R\n", | |
| "library(CAGEr)\n", | |
| "# Build CAGE clusters and calculate RETI\n", | |
| "# (See Supplemental Methods in paper)\n", | |
| "```\n", | |
| "\n", | |
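| "Once RETI widths and CAGE signal are available per promoter, each promoter can be assigned to one of the four classes plotted earlier in this notebook (narrow-high, narrow-low, broad-high, broad-low). The Python sketch below is a minimal illustration only: the 11 bp width cutoff mirrors the threshold line drawn in the RETI width plot, while the direction of the comparison (`<=`), the `HIGH_SIGNAL_CUTOFF` value, and the function name `classify_promoter` are assumptions rather than the paper's definitions.\n", | |
| "\n", | |
| "```python\n", | |
| "NARROW_MAX_WIDTH = 11       # bp; threshold line drawn in the RETI width plot\n", | |
| "HIGH_SIGNAL_CUTOFF = 500.0  # hypothetical CAGE signal cutoff, for illustration\n", | |
| "\n", | |
| "def classify_promoter(reti_width, cage_signal):\n", | |
| "    '''Return a class label such as narrow-high or broad-low.'''\n", | |
| "    shape = 'narrow' if reti_width <= NARROW_MAX_WIDTH else 'broad'\n", | |
| "    level = 'high' if cage_signal >= HIGH_SIGNAL_CUTOFF else 'low'\n", | |
| "    return shape + '-' + level\n", | |
| "\n", | |
| "# Toy promoters as (reti_width, cage_signal) pairs\n", | |
| "examples = [(5, 980.0), (8, 120.0), (40, 700.0), (60, 90.0)]\n", | |
| "labels = [classify_promoter(w, s) for w, s in examples]\n", | |
| "print(labels)\n", | |
| "```\n", | |
| "\n", | |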
| "**Step 4: ATAC-seq processing**\n", | |
| "```bash\n", | |
| "# Alignment and filtering\n", | |
| "bwa mem dm6.fa read1.fq read2.fq | samtools view -b - > aligned.bam\n", | |
| "samtools sort aligned.bam -o sorted.bam\n", | |
| "picard MarkDuplicates I=sorted.bam O=dedup.bam M=metrics.txt REMOVE_DUPLICATES=true\n", | |
| "\n", | |
| "# Nucleosome positioning\n", | |
| "nucleoatac run --bed peaks.bed --bam dedup.bam --fasta dm6.fa --out output\n", | |
| "```\n", | |
| "\n", | |
| "**Step 5: Motif discovery**\n", | |
| "```bash\n", | |
| "# Extract strand-aware promoter sequences (300 bp centered on CAGE cluster; BED needs a strand column)\n", | |
| "bedtools getfasta -s -fi dm6.fa -bed promoters.bed -fo promoters.fa\n", | |
| "\n", | |
| "# Run MEME-ChIP\n", | |
| "meme-chip -oc output_dir -db motif_databases.meme promoters.fa\n", | |
| "```\n", | |
| "\n", | |
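| "As a lightweight complement to the MEME Suite tools, the Python sketch below scans a promoter sequence for the five motifs named in this notebook and reports hit positions relative to the TSS. It is a simplified illustration, not the paper's pipeline: the helper names `motif_to_regex` and `scan_motifs`, the `tss_index` parameter, and the toy sequence are assumptions, and exact-match regular expressions stand in for the position weight matrices a tool such as FIMO would use.\n", | |
| "\n", | |
| "```python\n", | |
| "import re\n", | |
| "\n", | |
| "# Consensus strings as listed in this notebook; N matches any base\n", | |
| "MOTIFS = {\n", | |
| "    'tMAC-ChIP': 'TAGTACC',\n", | |
| "    'AchiVis': 'TGTCA',\n", | |
| "    'Inr': 'TCA',\n", | |
| "    'ACA': 'ACA',\n", | |
| "    'CNAAATT': 'CNAAATT',\n", | |
| "}\n", | |
| "\n", | |
| "def motif_to_regex(consensus):\n", | |
| "    '''Compile a consensus string, expanding N to [ACGT].'''\n", | |
| "    # The lookahead lets overlapping occurrences be reported\n", | |
| "    return re.compile('(?=(' + consensus.replace('N', '[ACGT]') + '))')\n", | |
| "\n", | |
| "def scan_motifs(sequence, tss_index):\n", | |
| "    '''Return {motif: [positions]} with positions relative to the TSS.\n", | |
| "\n", | |
| "    tss_index is the 0-based index of the TSS base, so a hit starting at\n", | |
| "    the TSS is reported as 0 (a simplification of the +1 convention).\n", | |
| "    '''\n", | |
| "    sequence = sequence.upper()\n", | |
| "    hits = {}\n", | |
| "    for name, consensus in MOTIFS.items():\n", | |
| "        pattern = motif_to_regex(consensus)\n", | |
| "        hits[name] = [m.start() - tss_index for m in pattern.finditer(sequence)]\n", | |
| "    return hits\n", | |
| "\n", | |
| "# Toy promoter: 10 bp upstream, then the TSS at index 10\n", | |
| "toy = 'GGTAGTACCG' + 'TCAGGCGAAATTGG'\n", | |
| "hits = scan_motifs(toy, tss_index=10)\n", | |
| "print(hits['tMAC-ChIP'], hits['Inr'], hits['CNAAATT'])\n", | |
| "```\n", | |
| "\n", | |
| "Only the forward strand is scanned here, which matches strand-oriented promoter sequences; a genome-wide scan would also need reverse-complement handling.\n", | |
| "\n", | |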
| "**Step 6: Statistical analysis (as in this notebook)**\n", | |
| "\n", | |
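| "To make the link between logistic regression coefficients, odds ratios, and predicted probabilities concrete, here is a small standard-library sketch. It reuses the coefficients printed in Section 2.9 for the synthetic data; the intercept is a placeholder assumption (the notebook output does not report it), so the resulting probabilities are illustrative only and should not be compared with the paper's values.\n", | |
| "\n", | |
| "```python\n", | |
| "import math\n", | |
| "\n", | |
| "# Coefficients (log-odds) as printed in this notebook's summary output\n", | |
| "COEFFICIENTS = {\n", | |
| "    'ACA': 2.333,\n", | |
| "    'AchiVis': 2.273,\n", | |
| "    'CNAAATT': 2.178,\n", | |
| "    'Inr': 2.005,\n", | |
| "    'tMAC': 1.788,\n", | |
| "}\n", | |
| "# Hypothetical intercept: not reported above, chosen only for illustration\n", | |
| "INTERCEPT = -7.0\n", | |
| "\n", | |
| "def odds_ratio(coefficient):\n", | |
| "    '''An odds ratio is the exponential of a log-odds coefficient.'''\n", | |
| "    return math.exp(coefficient)\n", | |
| "\n", | |
| "def narrow_high_probability(present_motifs):\n", | |
| "    '''Logistic model: p = 1 / (1 + exp(-(intercept + sum of coefficients))).'''\n", | |
| "    log_odds = INTERCEPT + sum(COEFFICIENTS[m] for m in present_motifs)\n", | |
| "    return 1.0 / (1.0 + math.exp(-log_odds))\n", | |
| "\n", | |
| "for motif, coef in COEFFICIENTS.items():\n", | |
| "    print(f'{motif}: OR = {odds_ratio(coef):.2f}')\n", | |
| "\n", | |
| "p_none = narrow_high_probability([])\n", | |
| "p_all = narrow_high_probability(list(COEFFICIENTS))\n", | |
| "print(f'no motifs: {p_none:.3f}, all five motifs: {p_all:.3f}')\n", | |
| "```\n", | |
| "\n", | |
| "Because the coefficients add on the log-odds scale, each additional motif multiplies the odds of the narrow-high class by its odds ratio.\n", | |
| "\n", | |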
| "### 6. Expected Runtime\n", | |
| "\n", | |
| "With appropriate computational resources:\n", | |
| "- Sequencing alignment: 2-4 hours per sample\n", | |
| "- Differential expression: 30 minutes\n", | |
| "- CAGE clustering: 1-2 hours\n", | |
| "- ATAC-seq processing: 2-3 hours\n", | |
| "- Motif discovery: 4-8 hours (computationally intensive)\n", | |
| "- Statistical analyses: 1-2 hours\n", | |
| "\n", | |
| "**Total:** ~1-2 days for complete pipeline\n", | |
| "\n", | |
| "### 7. Key Differences from This Notebook\n", | |
| "\n", | |
| "| Aspect | This Notebook | Full Analysis |\n", | |
| "|--------|---------------|---------------|\n", | |
| "| Promoters | 480 synthetic | ~9,000 real genomic |\n", | |
| "| Sequence source | Generated | Extracted from dm6 genome |\n", | |
| "| CAGE data | Simulated | Real TSS mapping |\n", | |
| "| Expression data | Model-based | Sequencing-based |\n", | |
| "| Motif discovery | Known motifs | De novo + validation |\n", | |
| "| Statistical power | Demonstration | Publication-grade |\n", | |
| "| Runtime | ~10 minutes | ~1-2 days |\n", | |
| "| Memory | <1 GB | 32-64 GB |\n", | |
| "\n", | |
| "---\n", | |
| "\n", | |
| "## Conclusion\n", | |
| "\n", | |
| "This notebook provides an educational overview of the computational methods used to identify and characterize promoter-proximal regulatory elements in *Drosophila* spermatogenesis. The small-scale demonstration:\n", | |
| "\n", | |
| "✓ Explains the biological and statistical frameworks \n", | |
| "✓ Implements working code for all key analyses \n", | |
| "✓ Runs within resource constraints (~10 minutes, <1 GB memory)  \n", | |
| "✓ Provides clear guidance for scaling to full experiments \n", | |
| "\n", | |
| "Researchers can use this as a template and scale up to their own genomic datasets using the guidance provided above.\n", | |
| "\n", | |
| "---\n", | |
| "\n", | |
| "**Citation:** \n", | |
| "Lu D, Sin HS, Lu C, Fuller MT. (2020). Developmental regulation of cell type-specific transcription by novel promoter-proximal sequence elements. *Genes & Development* 34:663-677. doi: 10.1101/gad.335331.119\n", | |
| "\n", | |
| "**Data Availability:** \n", | |
| "All sequencing data: GEO GSE145975 \n", | |
| "Analysis scripts: https://github.com/danrlu/Fuller_Lab_paper\n", | |
| "\n", | |
| "---\n", | |
| "\n", | |
| "*Notebook generated for educational purposes. For questions about methodology, please refer to the original paper and supplemental materials.*" | |
| ] | |
| } | |
| ], | |
| "metadata": { | |
| "kernelspec": { | |
| "display_name": "Python 3", | |
| "language": "python", | |
| "name": "python3" | |
| }, | |
| "language_info": { | |
| "codemirror_mode": { | |
| "name": "ipython", | |
| "version": 3 | |
| }, | |
| "file_extension": ".py", | |
| "mimetype": "text/x-python", | |
| "name": "python", | |
| "nbconvert_exporter": "python", | |
| "pygments_lexer": "ipython3", | |
| "version": "3.8.0" | |
| } | |
| }, | |
| "nbformat": 4, | |
| "nbformat_minor": 4 | |
| } |