site:www.marktechpost.com

NVIDIA AI Releases Eagle2 Series Vision-Language Model: Achieving SOTA Results Across Various Multimodal Benchmarks

Vision-Language Models (VLMs) have significantly expanded AI’s ability to process multimodal information, yet they face persistent challenges. Proprietary models such as GPT-4V and Gemini-1.5-Pro ...

marktechpost3d

TensorLLM: Enhancing Reasoning and Efficiency in Large Language Models through Multi-Head Attention Compression and Tensorisation

LLMs based on transformer architectures, such as GPT and LLaMA series, have excelled in NLP tasks due to their extensive parameterization and large training datasets. However, research indicates that ...

marktechpost3d

Meta AI Introduces MR.Q: A Model-Free Reinforcement Learning Algorithm with Model-Based Representations for Enhanced Generalization

Reinforcement learning (RL) trains agents to make sequential decisions by maximizing cumulative rewards. It has diverse applications, including robotics, gaming, and automation, where agents interact ...

marktechpost5d

This AI Paper Introduces IXC-2.5-Reward: A Multi-Modal Reward Model for Enhanced LVLM Alignment and Performance

Artificial intelligence has grown significantly with the integration of vision and language, allowing systems to interpret and generate information across multiple data modalities. This capability ...

marktechpost5d

Test-Time Preference Optimization: A Novel AI Framework that Optimizes LLM Outputs During Inference with an Iterative Textual Reward Policy

Large Language Models (LLMs) have become an indispensable part of contemporary life, shaping the future of nearly every conceivable domain. They are widely acknowledged for their impressive ...

marktechpost5d

DeepSeek-AI Releases Janus-Pro 7B: An Open-Source multimodal AI that Beats DALL-E 3 and Stable Diffusion

Multimodal AI integrates diverse data formats, such as text and images, to create systems capable of accurately understanding and generating content. By bridging textual and visual data, these models ...

marktechpost5d

Building a Retrieval-Augmented Generation (RAG) System with DeepSeek R1: A Step-by-Step Guide

With the release of DeepSeek R1, there is a buzz in the AI community. The open-source model offers some best-in-class performance across many metrics, even at par with state-of-the-art proprietary ...

marktechpost4d

Qwen AI Releases Qwen2.5-VL: A Powerful Vision-Language Model for Seamless Computer Interaction

In the evolving landscape of artificial intelligence, integrating vision and language capabilities remains a complex challenge. Traditional models often struggle with tasks requiring a nuanced ...

marktechpost5d

Advancing Single-Cell Genomics with Self-Supervised Learning: Techniques, Applications, and Insights

SSL is a powerful technique for extracting meaningful patterns from large, unlabelled datasets, proving transformative in fields like computer vision and NLP. In single-cell genomics (SCG), SSL offers ...

marktechpost4d

Qwen AI Introduces Qwen2.5-Max: A large MoE LLM Pretrained on Massive Data and Post-Trained with Curated SFT and RLHF Recipes

The field of artificial intelligence is evolving rapidly, with increasing efforts to develop more capable and efficient language models. However, scaling these models comes with challenges, ...

marktechpost3d

Optimization Using FP4 Quantization For Ultra-Low Precision Language Model Training

Large Language Models (LLMs) have emerged as transformative tools in research and industry, with their performance directly correlating to model size. However, training these massive models presents ...

marktechpost6d

Meet Open R1: The Full Open Reproduction of DeepSeek-R1, Challenging the Status Quo of Existing Proprietary LLMs

Open Source LLM development is going through great change through fully reproducing and open-sourcing DeepSeek-R1, including training data, scripts, etc. Hosted on Hugging Face’s platform, this ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results