Jan du Plessis janduplessis883

  • London
@janduplessis883
janduplessis883 / theme.toml
Last active November 18, 2025 06:22
streamlit_custom_theme
[theme]
# The theme that your custom theme inherits from.
#
# This can be one of the following:
# - "light": Streamlit's default light theme.
# - "dark": Streamlit's default dark theme.
# - A local file path to a TOML theme file: A local custom theme, like
# "themes/custom.toml".
# - A URL to a TOML theme file: An externally hosted custom theme, like
@janduplessis883
janduplessis883 / dataset.py
Created November 11, 2025 13:56
SetFit Synthetic Dataset
# Healthcare Feedback Dataset for SetFit
# 90 examples (15 per class) - suitable for few-shot learning
# Optimized label names for ML
LABEL_MAPPING = {
0: "access_availability",
1: "information_provision",
2: "privacy_confidentiality",
3: "continuity_care",
4: "clinical_communication",
import spacy
from spacy.matcher import Matcher
import math
import pandas as pd
# ============================================================================
# Installation Requirements:
# 1. pip install spacy
# 2. python -m spacy download en_core_web_sm
# ============================================================================
@janduplessis883
janduplessis883 / data.py
Created November 5, 2025 20:59
Project Noema
import math
import os
import re
from nltk import pos_tag, word_tokenize
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import pairwise_distances
import warnings
warnings.filterwarnings("ignore")
import matplotlib.pyplot as plt
@janduplessis883
janduplessis883 / custom_tools.py
Created January 7, 2025 02:14
crewAI Notion Integration Tools
import toml
from crewai_tools import BaseTool
from typing import ClassVar, Union, Dict, Any, List
import requests
# Load the TOML file
with open("notioncrew/config_secrets.toml", "r") as f:
config_secrets = toml.load(f)
# Load environment variables from streamlit secrets
@janduplessis883
janduplessis883 / Association Rule Mining in Python Tutorial.ipynb
Last active May 8, 2024 14:26
Association Rule Mining in Python Tutorial
@janduplessis883
janduplessis883 / DataPreprocessingTool.py
Created May 6, 2024 21:16 — forked from Cdaprod/DataPreprocessingTool.py
Langchain tool for preprocessing text data. Version one million nine-hundred and fifty two 😂 jk version 1
import spacy
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from langchain.tools import BaseTool
from typing import Optional, Union, List
from langchain.callbacks.manager import CallbackManagerForToolRun, AsyncCallbackManagerForToolRun
class DataPreprocessingTool(BaseTool):
name = "DataPreprocessingTool"
description = "A tool for preprocessing and structuring unstructured data."
@janduplessis883
janduplessis883 / README.txt
Last active May 2, 2024 02:51
Pinecone Preprocessing Data for Vector Database
In this walkthrough we will see how to use Pinecone for semantic search.
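As a rough illustration of what semantic search means here, the sketch below ranks stored vectors by cosine similarity to a query vector. It is plain Python and does not use the Pinecone client; the function names, the toy index, and the three-dimensional vectors are all hypothetical stand-ins for real embeddings held in a vector database.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


def top_k(query, index, k=2):
    """Return the k (doc_id, score) pairs most similar to the query."""
    scored = [(doc_id, cosine_similarity(query, vec)) for doc_id, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy "index": in practice these vectors come from an embedding model.
toy_index = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}

results = top_k([1.0, 0.0, 0.0], toy_index, k=2)
print(results)
```

A hosted vector database performs the same kind of nearest-neighbour ranking, typically with approximate search so that it scales beyond what a linear scan like this can handle.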
@janduplessis883
janduplessis883 / 01_Embedding_Data_From_A_Pandas_DataFrame_Chroma_LangChain_Ollama.py
Last active July 16, 2025 00:18
Embedding Data from a Pandas DataFrame into a Chroma Vector Database using LangChain and Ollama
import pandas as pd
from langchain.schema import Document
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import Chroma
from tqdm import tqdm
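The preview above stops at the imports. As a hedged sketch of the step the gist title describes, turning DataFrame rows into documents before embedding, the example below uses plain dictionaries in place of LangChain's `Document` (which carries `page_content` and `metadata`). The helper name `rows_to_documents`, the column names, and the sample rows are hypothetical; the row dictionaries mimic what `df.to_dict("records")` would yield.

```python
def rows_to_documents(rows, text_column):
    """Build one document-like dict per row: text to embed plus the remaining columns as metadata."""
    documents = []
    for i, row in enumerate(rows):
        # Everything except the text column is kept as metadata for later filtering.
        metadata = {key: value for key, value in row.items() if key != text_column}
        metadata["row"] = i  # keep the source row index for traceability
        documents.append({"page_content": str(row[text_column]), "metadata": metadata})
    return documents


# Hypothetical sample rows, shaped like the output of df.to_dict("records").
rows = [
    {"review": "Easy to book an appointment.", "rating": 5},
    {"review": "Long wait at reception.", "rating": 2},
]

docs = rows_to_documents(rows, text_column="review")
for doc in docs:
    print(doc)
```

Keeping the non-text columns as metadata means a later similarity query can still be filtered or traced back to the original row.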