- Use DSPy https://dspy.ai/ for automatic prompt optimisation.
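
  A minimal sketch of how DSPy prompt optimisation looks, assuming an OpenAI-compatible key is configured in the environment; the model name, metric, and tiny trainset below are illustrative only:

  ```python
  import dspy

  # Any LiteLLM-style model string works here; credentials come from the environment.
  dspy.configure(lm=dspy.LM("openai/gpt-4o-mini"))

  qa = dspy.Predict("question -> answer")  # the program to be optimised

  def exact_match(example, pred, trace=None):
      return example.answer.lower() == pred.answer.lower()

  trainset = [
      dspy.Example(question="2 + 2?", answer="4").with_inputs("question"),
      dspy.Example(question="Capital of France?", answer="Paris").with_inputs("question"),
  ]

  # BootstrapFewShot searches for few-shot demos that maximise the metric.
  optimizer = dspy.BootstrapFewShot(metric=exact_match)
  optimized_qa = optimizer.compile(qa, trainset=trainset)
  print(optimized_qa(question="Capital of Italy?").answer)
  ```
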
- Microsoft's attempt to fix prompt management: Prompty https://github.com/microsoft/prompty. Supported by LangChain.
- Use DLPack https://dmlc.github.io/dlpack/latest/ for zero-data-copy exchange of tensor data between different libs (PyTorch, NumPy, etc.).
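
  A tiny sketch of the zero-copy hand-off, assuming reasonably recent NumPy and PyTorch releases:

  ```python
  import numpy as np
  import torch

  a = np.arange(6, dtype=np.float32)
  t = torch.from_dlpack(a)   # wraps the same buffer via DLPack, no copy
  t[0] = 42.0
  print(a[0])                # 42.0 -> both objects share the memory
  b = np.from_dlpack(t)      # and back to NumPy, still zero-copy
  ```
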
- Use Apache Arrow https://arrow.apache.org/ for zero-data-copy of tabular data between different libs (Pandas, Polars, DuckDB, etc.).
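
  A small sketch of sharing one Arrow table across libraries without copying; the toy data is illustrative:

  ```python
  import duckdb
  import polars as pl
  import pyarrow as pa

  tbl = pa.table({"x": [1, 2, 3], "y": ["a", "b", "c"]})

  pl_df = pl.from_arrow(tbl)  # Polars reuses the Arrow buffers where it can
  # DuckDB's replacement scan picks up the local `tbl` variable and queries it in place.
  out = duckdb.sql("SELECT y, x * 2 AS x2 FROM tbl WHERE x > 1").arrow()
  print(pl_df, out, sep="\n")
  ```
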
- If you're using NVIDIA GPUs, use cuDF (the cudf-cu12 package) https://docs.rapids.ai/api/cudf/stable/ for GPU-accelerated DataFrames.
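
  A minimal sketch, assuming an NVIDIA GPU, a CUDA 12 driver, and a hypothetical Parquet file:

  ```python
  # pip install cudf-cu12 --extra-index-url=https://pypi.nvidia.com
  import cudf

  gdf = cudf.read_parquet("events.parquet")        # hypothetical input
  top = (
      gdf.groupby("user_id")["amount"].sum()
         .sort_values(ascending=False)
         .head(10)
  )
  print(top.to_pandas())                           # move the small result back to the CPU
  ```

  There is also a `cudf.pandas` accelerator mode that can speed up existing pandas scripts without code changes.
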
- Qdrant https://github.com/qdrant/qdrant is generally loved by many as a vector DB.
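
  A minimal sketch with the official qdrant-client (recent versions), using the in-process mode so no server is needed; the vectors and payload are toy data:

  ```python
  from qdrant_client import QdrantClient
  from qdrant_client.models import Distance, PointStruct, VectorParams

  client = QdrantClient(":memory:")  # or QdrantClient(url="http://localhost:6333") for a real server

  client.create_collection(
      collection_name="demo",
      vectors_config=VectorParams(size=4, distance=Distance.COSINE),
  )
  client.upsert(
      collection_name="demo",
      points=[PointStruct(id=1, vector=[0.1, 0.2, 0.3, 0.4], payload={"doc": "hello"})],
  )
  hits = client.query_points(collection_name="demo", query=[0.1, 0.2, 0.3, 0.4], limit=1)
  print(hits.points)
  ```
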
- ArcFace https://github.com/deepinsight/insightface and SigLIP 2 https://huggingface.co/blog/siglip2 are your best friends here: ArcFace for face embeddings, SigLIP 2 for general image/text embeddings.
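
  A hedged sketch of getting an image embedding out of SigLIP 2 via transformers; the checkpoint name is an assumed example and a recent transformers release is required:

  ```python
  import torch
  from PIL import Image
  from transformers import AutoModel, AutoProcessor

  model_id = "google/siglip2-base-patch16-224"   # any SigLIP 2 variant should work similarly
  model = AutoModel.from_pretrained(model_id)
  processor = AutoProcessor.from_pretrained(model_id)

  image = Image.open("photo.jpg")                # hypothetical input image
  inputs = processor(images=image, return_tensors="pt")
  with torch.no_grad():
      emb = model.get_image_features(**inputs)
  emb = torch.nn.functional.normalize(emb, dim=-1)  # unit-length vector, ready for cosine search
  ```
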
- Marimo https://marimo.io/ is emerging as a Jupyter notebook replacement.
- Some suggest using universal LLM proxies for corporate use; LiteLLM https://docs.litellm.ai/ is one such proxy. The same folks suggest Open WebUI as a generally available (self-hostable) chat UI for corporate use.
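
  One common pattern is to point the standard OpenAI SDK at the LiteLLM proxy, which holds the real provider keys; the port and model alias below are assumptions that depend on the proxy config:

  ```python
  from openai import OpenAI

  client = OpenAI(base_url="http://localhost:4000", api_key="sk-local-placeholder")
  resp = client.chat.completions.create(
      model="gpt-4o-mini",  # whatever alias the proxy config exposes
      messages=[{"role": "user", "content": "Hello through the proxy"}],
  )
  print(resp.choices[0].message.content)
  ```
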
- MCP (Model Context Protocol) is getting a tremendous amount of attention.
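
  For orientation, a minimal MCP server using the official Python SDK's FastMCP helper (pip install "mcp[cli]"); the tool itself is just an illustration:

  ```python
  from mcp.server.fastmcp import FastMCP

  mcp = FastMCP("demo-tools")

  @mcp.tool()
  def add(a: int, b: int) -> int:
      """Add two numbers."""
      return a + b

  if __name__ == "__main__":
      mcp.run()  # speaks MCP over stdio so clients (e.g. Claude Desktop) can attach
  ```
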
- Proper (optimal) implementation of RAG gets a lot of attention too. Make sure you're familiar with pre-, mid-, and post-retrieval optimisations.
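
  A rough sketch of where the three kinds of optimisation slot in; the retriever, reranker, and LLM calls are hypothetical placeholders, not any specific library's API:

  ```python
  from typing import List

  def rewrite_query(q: str) -> str:                 # pre-retrieval: expand/decompose the query
      return q                                      # e.g. LLM-based rewriting or HyDE

  def retrieve(q: str, k: int = 20) -> List[str]:   # mid-retrieval: hybrid dense + keyword search
      return [f"chunk mentioning {q}"] * k          # placeholder for a vector-DB / BM25 call

  def rerank(q: str, chunks: List[str], top_n: int = 5) -> List[str]:  # post-retrieval
      return chunks[:top_n]                         # e.g. cross-encoder reranking + compression

  def answer(question: str) -> str:
      q = rewrite_query(question)                   # pre
      chunks = retrieve(q)                          # mid
      context = "\n".join(rerank(q, chunks))        # post
      return f"<LLM answer grounded in {len(context)} chars of context>"  # placeholder generation

  print(answer("How do I rotate API keys?"))
  ```
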
- Agents get a tremendous amount of attention, and new frameworks like Dapr https://github.com/dapr/dapr keep emerging.
- Dutch local governments are actively integrating LLMs into their processes.
- Some suggest using Outlines https://github.com/dottxt-ai/outlines for structured (constrained) output.
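
  A hedged sketch in the Outlines 0.x style (newer releases expose a slightly different interface); the checkpoint is just an example of a small local model:

  ```python
  from pydantic import BaseModel
  import outlines

  class Ticket(BaseModel):
      title: str
      priority: int

  model = outlines.models.transformers("HuggingFaceTB/SmolLM2-360M-Instruct")
  generator = outlines.generate.json(model, Ticket)  # constrains generation to valid Ticket JSON
  ticket = generator("Return a JSON ticket for: 'Login page crashes on submit, fix ASAP'")
  print(ticket)  # a validated Ticket instance
  ```
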
- Some suggest SmolLM https://github.com/huggingface/smollm (or https://ollama.com/library/smollm) for development/testing, as it's extremely small and fast. There's also SmolVLM if you need a multimodal version.
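
  A quick local smoke test via the Ollama Python client, assuming a running Ollama server and `ollama pull smollm` done beforehand:

  ```python
  import ollama

  resp = ollama.chat(
      model="smollm",
      messages=[{"role": "user", "content": "Reply with exactly five words."}],
  )
  print(resp["message"]["content"])
  ```
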
- Some suggest using DuckDB https://duckdb.org/why_duckdb.html for querying extremely large datasets.
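
  A small sketch of querying Parquet files in place; DuckDB only reads the columns and row groups it needs, and the path here is hypothetical:

  ```python
  import duckdb

  top_users = duckdb.sql("""
      SELECT user_id, count(*) AS n
      FROM read_parquet('events/*.parquet')
      GROUP BY user_id
      ORDER BY n DESC
      LIMIT 10
  """).df()
  print(top_users)
  ```
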
- Docling https://github.com/docling-project/docling is generally loved by many for document format conversion (PDF, DOCX, etc. to Markdown or JSON). Check docling-langchain https://github.com/docling-project/docling-langchain/ for LangChain integration.
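
  A minimal conversion sketch with Docling itself (the LangChain package wraps the same pipeline behind a document loader); the input file is hypothetical:

  ```python
  from docling.document_converter import DocumentConverter

  converter = DocumentConverter()
  result = converter.convert("paper.pdf")            # local paths and URLs both work
  print(result.document.export_to_markdown()[:500])  # Markdown view of the parsed document
  ```
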