
↗️ AI/ML Research Updates: NVIDIA AI Researchers Propose Tied-Lora; Microsoft Research Introduces Florence-2; LLMWare Launches RAG-Specialized 7B Parameter LLMs; .. and many more research trends

This newsletter delivers AI research news that is more technical than most sources, yet still digestible and applicable.

Hey Folks!

This newsletter will discuss some cool AI research papers. Happy learning!

👉 What is Trending in AI/ML Research?

How can the efficiency of the Low-rank Adaptation (LoRA) method be enhanced for language models? This paper introduces "Tied-LoRA," a paradigm that combines weight tying with selective training to boost LoRA's parameter efficiency. The research explores various combinations of parameter training and freezing, alongside weight tying, to find an optimal balance between performance and the number of trainable parameters. This exploration, conducted across different tasks and two base language models, reveals the trade-offs between efficiency and performance. Notably, a specific Tied-LoRA configuration is highlighted for its exceptional performance, achieving results comparable to standard LoRA while using only about 13% of its parameters.
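The core savings come from sharing one low-rank pair across layers instead of giving each layer its own. Here is a minimal parameter-count sketch; the per-layer scaling vectors and the `diag(u) @ B @ diag(v) @ A` parameterization are illustrative assumptions (the paper explores several tying/freezing configurations), not the paper's exact recipe:

```python
import numpy as np

rng = np.random.default_rng(0)

d, r, n_layers = 64, 4, 6  # hidden size, LoRA rank, number of layers

# Standard LoRA: each layer gets its own low-rank pair (A, B).
standard_params = n_layers * (d * r + r * d)

# Tied-LoRA (sketch): one (A, B) pair shared across all layers,
# plus small per-layer scaling vectors that stay trainable (assumption).
A = rng.normal(size=(r, d))  # shared down-projection
B = np.zeros((d, r))         # shared up-projection (zero-init, as in LoRA)
u = [np.ones(d) for _ in range(n_layers)]  # per-layer output scales
v = [np.ones(r) for _ in range(n_layers)]  # per-layer rank scales

tied_params = d * r + r * d + n_layers * (d + r)

def delta_w(layer):
    # Per-layer weight update: diag(u_l) @ B @ diag(v_l) @ A
    return (u[layer][:, None] * B) @ (v[layer][:, None] * A)

print(f"standard LoRA params: {standard_params}")
print(f"tied-LoRA params:     {tied_params} "
      f"({100 * tied_params / standard_params:.1f}% of standard)")
```

Even in this toy setting the tied variant needs well under a third of standard LoRA's adapter parameters, and the gap widens as the layer count grows, since the shared pair's cost is paid once.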

How can a unified model handle the complexity of various computer vision and vision-language tasks with simple text prompts? Florence-2 addresses this by introducing a vision foundation model capable of understanding and executing a wide range of tasks based on text-prompt instructions. It operates across diverse domains such as captioning, object detection, grounding, and segmentation. This versatility is powered by the extensive FLD-5B dataset, comprising 5.4 billion annotations on 126 million images. Florence-2 employs a sequence-to-sequence structure, enabling it to translate textual prompts into accurate visual task outputs. Its performance, tested extensively, showcases robust zero-shot learning and fine-tuning abilities, positioning it as a formidable contender among vision foundation models.

How can enterprises efficiently implement Retrieval-Augmented Generation (RAG) systems using Large Language Models (LLMs) for complex workflows? Ai Bloks addresses this with the launch of "llmware", an open-source framework designed for constructing enterprise-grade LLM-based workflow applications. The latest addition to this suite is the DRAGON series, featuring 7-billion-parameter LLMs optimized for business workflows, with a particular focus on fact-based question-answering over intricate business and legal documents. LLMWare caters to the enterprise demand for a unified framework that integrates LLMs with workflow tools, provides high-quality, specialized LLMs for enterprise tasks, and allows private, customizable, and cost-effective deployment. The DRAGON models, available on Hugging Face, are fine-tuned for RAG tasks, ensuring production-grade readiness for diverse enterprise applications.
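The retrieve-then-prompt pattern these models are tuned for can be sketched in a few lines. This is a generic illustration, not llmware's actual API; the toy corpus, bag-of-words retriever, and prompt template are all assumptions:

```python
import math
from collections import Counter

# Toy document store standing in for an enterprise legal corpus.
docs = [
    "The master services agreement terminates on December 31, 2025.",
    "Payment is due within 30 days of invoice receipt.",
    "Either party may terminate with 60 days written notice.",
]

def bow(text):
    # Bag-of-words term counts; a real system would use dense embeddings.
    return Counter(text.lower().split())

def cosine(a, b):
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def retrieve(query, k=2):
    q = bow(query)
    return sorted(docs, key=lambda d: cosine(q, bow(d)), reverse=True)[:k]

def build_prompt(query):
    # Ground the model strictly in retrieved passages (the RAG step).
    context = "\n".join(retrieve(query))
    return (f"Answer strictly from the context below.\n\n"
            f"Context:\n{context}\n\nQuestion: {query}")

prompt = build_prompt("When can a party terminate the agreement?")
print(prompt)
```

The fact-based question-answering behavior the DRAGON models target corresponds to the generation step that would consume this prompt; the retrieval half shown here is model-agnostic.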

How can LLMs bridge the gap between code generation benchmarks and practical programming involving pre-existing libraries? This paper introduces "ML-Bench", a benchmark designed to evaluate the effectiveness of LLMs in leveraging open-source libraries for machine learning tasks. It comprises 10,044 samples across 130 tasks from 14 major GitHub repositories. LLMs are tested on generating code for a given machine learning task, requiring them to interpret complex, language-code mixed documents and multi-file code structures. Despite its advanced capabilities, GPT-4 achieves only 39.73% task completion, highlighting significant room for improvement. To address this, "ML-Agent" is proposed, enhancing GPT-4's ability to navigate codebases, locate documentation, retrieve relevant code, and generate executable solutions. This approach marks a substantial improvement over traditional LLM applications in coding.
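Execution-based task completion, the metric behind the 39.73% figure, boils down to running each generated program and counting the ones that succeed. A minimal sketch of that scoring loop (the task names and pass/fail checker here are illustrative, not ML-Bench's actual harness):

```python
# Each candidate is a generated program; it passes if it executes cleanly.
def run_generated_code(code):
    """Execute a candidate solution in an empty namespace; report success."""
    try:
        exec(compile(code, "<candidate>", "exec"), {})
        return True
    except Exception:
        return False

candidates = {
    "train_classifier": "acc = 0.91",                      # runs -> pass
    "export_onnx":      "import nonexistent_pkg",          # ImportError -> fail
    "resize_dataset":   "x = [i * 2 for i in range(10)]",  # runs -> pass
}

passed = sum(run_generated_code(c) for c in candidates.values())
rate = 100 * passed / len(candidates)
print(f"task completion: {passed}/{len(candidates)} = {rate:.2f}%")
```

A real harness would additionally sandbox execution and verify task-specific outputs (arguments, files produced), not just that the code ran without raising.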

How can we create photorealistic, controllable 3D avatars in real-time without the need for dense input images or accurate 3D registrations? This paper introduces Drivable 3D Gaussian Avatars (D3GA), a novel approach utilizing 3D Gaussian Splatting (3DGS) for rendering realistic human figures. Unlike traditional methods that rely on neural radiance fields and suffer from slow performance, D3GA uses dense calibrated multi-view videos for input, rendering at real-time framerates. The framework employs cage deformations instead of linear blend skinning (LBS) for a more efficient deformation of 3D Gaussian primitives. These deformations are driven by joint angles and keypoints, making D3GA ideal for telepresence applications. Tested on nine diverse subjects, D3GA demonstrates superior quality compared to existing methods, using identical training and test datasets.
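The cage idea is that each Gaussian primitive's position is expressed as a weighted combination of a few cage vertices, so posing the cage moves the primitives. A 2D toy version using barycentric coordinates (D3GA works in 3D with pose-driven cages; this simplified setup is only for intuition):

```python
import numpy as np

# Rest cage: a single triangular cell with three control vertices.
cage_rest = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

def barycentric(p, tri):
    # Solve for weights w such that w @ tri == p and w.sum() == 1.
    T = np.vstack([tri.T, np.ones(3)])
    return np.linalg.solve(T, np.append(p, 1.0))

mean_rest = np.array([0.25, 0.25])     # a Gaussian's mean inside the cage
w = barycentric(mean_rest, cage_rest)  # weights computed once, at rest

# Drive the cage (in the real system, by joint angles / keypoints)
# and re-evaluate the point: it follows the cage, no LBS weights needed.
cage_posed = cage_rest + np.array([[0.0, 0.0], [0.2, 0.0], [0.0, 0.3]])
mean_posed = w @ cage_posed

print("weights:", np.round(w, 3))      # -> [0.5 0.25 0.25]
print("deformed mean:", mean_posed)    # -> [0.3 0.325]
```

Because the weights are fixed and the per-frame work is a small matrix product, this style of deformation is cheap enough to feed a real-time renderer, which is part of why cages pair well with 3D Gaussian Splatting.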

Check Out These FREE Tutorials and Notebooks from Our Partners