DALL·E 3, AlphaMissense, and LongLoRA for fine-tuning LLMs with longer context windows
Here are your weekly articles, guides, and news about NLP and AI chosen for you by NLPlanet!
😎 News From The Web
- OpenAI announces DALL·E 3. OpenAI is launching DALL·E 3, an improved version of its text-to-image model that follows prompts more faithfully, requires less prompt engineering, and integrates with ChatGPT. The integration lets users refine DALL·E 3 prompts by describing their ideas to ChatGPT in conversation. Starting in October, DALL·E 3 will be available to ChatGPT Plus and Enterprise customers.
- DeepMind announces AlphaMissense, a catalogue of genetic mutations to help pinpoint the cause of diseases. DeepMind has released AlphaMissense, a model that builds on AlphaFold’s protein structure prediction to classify missense genetic mutations as likely benign or likely pathogenic. It confidently classifies 89% of all 71 million possible missense variants, far more than the small fraction confirmed by human experts so far.
- Announcing Microsoft Copilot, your everyday AI companion. Microsoft Copilot will provide tailored assistance based on workplace data and web context, enhancing productivity and creativity across Windows 11, Microsoft 365, Edge, and Bing while prioritizing privacy. Bing and Edge users will also get a more personalized experience with OpenAI’s DALL·E 3 model, including AI-assisted shopping and image creation.
- Bard can now connect to your Google apps and services. The new Bard Extensions feature integrates Bard with various Google tools, letting it fetch and display relevant information from Gmail, Docs, Drive, Maps, YouTube, and Google Flights and hotels, even when that information is scattered across services.
- How California is using AI to snuff out wildfires before they explode. The California Department of Forestry and Fire Protection is utilizing AI technology to improve wildfire detection and response. Through the Alert California program, AI scans the wilderness for anomalies like smoke, alerting officials when fires are detected.
📚 Guides From The Web
- Adept.ai shares curious errors occurring during their large training runs. Adept.ai, an AI company, shares insights on errors that can occur during large training runs. These errors can corrupt learning curves: a model may appear to train fine while small errors accumulate over time and quietly break it.
- DeepMind’s cofounder: Generative AI is just a phase. What’s next is interactive AI. Mustafa Suleyman, co-founder of DeepMind who later led AI policy work at Google, emphasizes the positive impact of technology on healthcare. Backed by influential figures and companies, Suleyman introduces Pi, a friendly AI from his new company Inflection, and argues that generative AI is only a phase: interactive AI is the next step for connecting technology with societal impact.
- GPT 3.5 vs Llama 2 fine-tuning: A Comprehensive Comparison. A comparison of fine-tuned GPT-3.5 and Llama 2 on an SQL task and a functional-representation task finds that GPT-3.5 performs slightly better, but training and deploying it costs 4–6 times more than Llama 2.
- Object Detection Leaderboard. Hugging Face has introduced the Object Detection Leaderboard, featuring top-performing models based on the DETA and DETR architectures.
- Generative AI for time-series. Generative Adversarial Networks (GANs) show promise for generating high-quality synthetic time series data, but struggle to preserve temporal relationships and map complex connections. An architecture called DoppelGANger addresses these challenges with batch generation, auto-normalization, and joint distribution modeling, allowing it to generate realistic temporal patterns.
🔬 Interesting Papers and Repositories
- LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models. LongLoRA is a method for efficiently extending the context size of pre-trained large language models (LLMs). By using sparse local attention during fine-tuning and dense global attention at inference, it keeps fine-tuning cost low while maintaining performance. LongLoRA demonstrates strong results on various tasks and extends context up to 100k tokens.
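The core trick, shifted sparse attention, can be illustrated with a toy attention mask: tokens attend only within local groups, and half the heads use groups shifted by half the group size so information still flows between neighbors. This is a minimal sketch under simplifying assumptions (non-causal attention, a tiny sequence), not the paper's implementation; all names are illustrative.

```python
import numpy as np

def local_attention_mask(seq_len, group_size, shift=0):
    """Boolean mask letting token i attend to token j only within its
    (optionally shifted) local group -- a toy version of LongLoRA's
    shifted sparse attention. Names are illustrative."""
    # Shift positions, assign each token to a group, and allow
    # attention only within the same group (wrapping at the ends).
    pos = (np.arange(seq_len) + shift) % seq_len
    groups = pos // group_size
    return groups[:, None] == groups[None, :]

# Half the heads use unshifted groups; the other half shift by half a
# group size, so neighboring groups exchange information across layers.
mask_a = local_attention_mask(8, group_size=4, shift=0)
mask_b = local_attention_mask(8, group_size=4, shift=2)
```

Each token still attends to only `group_size` positions, so attention cost stays linear in sequence length during fine-tuning.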
- vllm-project/vllm: A high-throughput and memory-efficient inference and serving engine for LLMs. vLLM is an open-source serving engine that offers exceptional speed and improved efficiency for LLMs. It integrates seamlessly with Hugging Face, supporting high-throughput serving with advanced decoding algorithms. With its impressive performance, vLLM outperforms Hugging Face Transformers and Text Generation Inference in terms of throughput.
- Chain-of-Verification Reduces Hallucination in Large Language Models. Chain-of-Verification (CoVe) is a straightforward approach that effectively minimizes hallucinations in Language Model-based systems. Through its systematic process of generating, verifying, and delivering responses, CoVe has proven its success in reducing hallucinations across various tasks, including question answering and text generation.
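The generate-verify-revise loop can be sketched in a few lines. Here `llm` is a hypothetical callable mapping a prompt string to a response, and the prompt wording is illustrative, not the paper's:

```python
def chain_of_verification(question, llm):
    """Sketch of the Chain-of-Verification (CoVe) loop; `llm` is a
    hypothetical prompt -> response callable."""
    # 1. Draft a baseline answer.
    baseline = llm(f"Answer: {question}")
    # 2. Plan verification questions probing the facts in the draft.
    plan = llm(f"List verification questions for: {baseline}")
    checks = [q.strip() for q in plan.splitlines() if q.strip()]
    # 3. Answer each verification question independently, so the
    #    checks are not biased by the draft they are meant to audit.
    evidence = [(q, llm(q)) for q in checks]
    # 4. Produce a final answer revised against the evidence.
    notes = "\n".join(f"{q} -> {a}" for q, a in evidence)
    return llm(f"Revise '{baseline}' given:\n{notes}")
```

The key design choice is step 3: answering verification questions in isolation, rather than alongside the draft, is what keeps the model from simply repeating its own hallucinations.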
- Fast Feedforward Networks. Fast Feedforward Networks (FFF) are binary tree structures with smaller neural networks as leaves, offering significantly faster performance compared to Mixture-of-Experts networks. Despite challenges like fragmentation due to an overly deep tree, FFF networks hold great promise for scenarios requiring fast inference and encoding of minor details.
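The tree-routing idea can be sketched as follows: descend a binary tree of learned scalar gates and evaluate only the selected leaf network, so inference touches O(depth) gates instead of every leaf. This is a toy sketch with illustrative names, not the paper's implementation:

```python
import numpy as np

def fff_forward(x, gates, leaves, depth):
    """Toy forward pass through a fast feedforward network: route the
    input down a binary tree of gates, then apply only one leaf MLP."""
    node = 0
    for _ in range(depth):
        # A scalar gate decides whether to go left or right.
        go_right = float(gates[node] @ x) > 0.0
        node = 2 * node + (2 if go_right else 1)  # heap-style indexing
    leaf = node - (2 ** depth - 1)  # index among the 2**depth leaves
    W, b = leaves[leaf]
    return np.maximum(W @ x + b, 0.0)  # small ReLU leaf network

rng = np.random.default_rng(0)
depth = 3
gates = rng.normal(size=(2 ** depth - 1, 4))   # one gate per internal node
leaves = [(rng.normal(size=(4, 4)), np.zeros(4)) for _ in range(2 ** depth)]
y = fff_forward(rng.normal(size=4), gates, leaves, depth)
```

With 2**depth leaves available but only `depth` gate evaluations plus one leaf per input, capacity grows exponentially while per-token inference cost grows linearly in depth.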
- Contrastive Decoding Improves Reasoning in Large Language Models. Contrastive decoding is a powerful decoding method for reasoning tasks in LLMs. It surpasses greedy decoding and nucleus sampling, excelling on benchmarks like HellaSwag and GSM8K.
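The idea behind contrastive decoding is to prefer tokens that a strong "expert" model likes much more than a weak "amateur" model, restricted to a plausibility set. A minimal sketch on toy next-token distributions, with illustrative parameter names:

```python
import numpy as np

def contrastive_scores(expert_logp, amateur_logp, alpha=0.1):
    """Score tokens by expert-minus-amateur log-probability, masking
    tokens whose expert probability falls below alpha times the
    expert's top probability (the plausibility constraint)."""
    cutoff = np.log(alpha) + expert_logp.max()   # plausibility threshold
    scores = expert_logp - amateur_logp          # contrastive objective
    scores[expert_logp < cutoff] = -np.inf       # mask implausible tokens
    return scores

# Toy distributions: greedy decoding on the expert alone would pick
# token 0, but contrastive decoding prefers token 2, which the
# amateur model underrates.
expert = np.log(np.array([0.50, 0.30, 0.16, 0.04]))
amateur = np.log(np.array([0.40, 0.45, 0.10, 0.05]))
best = int(np.argmax(contrastive_scores(expert, amateur)))
```

The plausibility cutoff matters: without it, the objective would reward very rare tokens that the amateur happens to dislike even more than the expert does.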
- PDFTriage: Question Answering over Long, Structured Documents. Researchers have developed PDFTriage, a solution that enhances the performance of LLM-based question answering systems on structured documents like PDFs. By incorporating document structure and content, PDFTriage outperforms existing models in answering complex questions across various categories.
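The triage step can be sketched as a two-stage prompt: show the model the document's structure (here, just section titles) as metadata, let it choose which sections to retrieve, then answer from the retrieved text only. `llm` is a hypothetical callable and the prompt formats are illustrative, not the paper's:

```python
def pdftriage_answer(question, sections, llm):
    """Sketch of the PDFTriage idea; `sections` maps section titles
    to their text, `llm` maps a prompt string to a response."""
    outline = "\n".join(sections)  # titles stand in for PDF structure
    # Stage 1: ask the model which sections it needs.
    picks = llm(f"Question: {question}\nSections:\n{outline}\nPick titles:")
    wanted = {t.strip() for t in picks.splitlines()}
    # Stage 2: answer using only the retrieved sections.
    context = "\n".join(f"{t}: {body}" for t, body in sections.items()
                        if t in wanted)
    return llm(f"Context:\n{context}\nQuestion: {question}\nAnswer:")
```

Because only the chosen sections enter the final prompt, long documents fit in the context window while structural cues (which section a fact lives in) are preserved.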
- CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages. CulturaX is a curated multilingual dataset containing 6T tokens, designed for language models in 167 languages. The dataset undergoes thorough cleaning stages to ensure top-quality training data for AI language models.
- An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models. Researchers have found that improving image resolution and mixing multimodal-language data during training can enhance the performance of multimodal models like LLaVA and MiniGPT-4. Additionally, they have discovered that tuning visual instructions can further improve the language capabilities of these models.
- Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers. EvoPrompt, a new framework using evolutionary algorithms, optimizes prompt generation for language models like GPT-3.5 and Alpaca. It surpasses human-engineered prompts and current methods, demonstrating its effectiveness for language tasks.
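The evolutionary loop itself is simple: maintain a population of prompts, create children via a mutation/crossover operator, and keep the best scorers on a dev set. In the paper both the operator and sometimes the scoring involve an LLM; here they are hypothetical callables supplied by the user, and the toy run below stands in for them:

```python
import random

def evoprompt(seed_prompts, score, mutate, generations=5, pop_size=8):
    """Sketch of EvoPrompt's loop; `score` (dev-set fitness) and
    `mutate` (LLM-driven crossover in the paper) are hypothetical
    callables with illustrative names."""
    population = list(seed_prompts)
    for _ in range(generations):
        # Recombine two random parents into each child prompt.
        children = [mutate(*random.sample(population, 2))
                    for _ in range(pop_size)]
        # Selection: keep the pop_size best prompts by fitness.
        population = sorted(population + children,
                            key=score, reverse=True)[:pop_size]
    return population[0]

# Toy run: fitness is prompt length and "mutation" just concatenates
# the parents, standing in for the LLM operators in the paper.
best = evoprompt(["Answer briefly:", "Think step by step:"],
                 score=len, mutate=lambda a, b: a + " " + b)
```

Swapping in a real dev-set metric for `score` and an LLM rewriting call for `mutate` recovers the structure the paper describes.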
- Scaling Laws for Sparsely-Connected Foundation Models. Researchers have derived a scaling law relating weight sparsity, the number of non-zero parameters, and training data volume in foundation models. They also find that the optimal sparsity level increases with the amount of training data.