Alibaba Launches Qwen2-VL, Surpasses GPT-4o & Claude 3.5 Sonnet

The advanced 72-billion-parameter version of Qwen2-VL surpasses GPT-4o and Claude 3.5 Sonnet. The post Alibaba Launches Qwen2-VL, Surpasses GPT-4o & Claude 3.5 Sonnet appeared first on AIM.

Alibaba recently released Qwen2-VL, the latest model in its vision-language series. The new model can chat via camera, play card games, and control mobile phones and robots by acting as an agent. It is available in three versions: the open-source 2-billion and 7-billion-parameter models, and the more advanced 72-billion-parameter model, accessible via API.

The advanced 72-billion-parameter model of Qwen2-VL achieved state-of-the-art (SOTA) visual understanding across 20 benchmarks. “Overall, our 72B model showcases top-tier performance across most metrics, often surpassing even closed-source models like GPT-4o and Claude 3.5-Sonnet,” the blog post read, adding that the model demonstrates a significant edge in document understanding.

Qwen2-VL performs exceptionally well in benchmarks like MathVista (for math reasoning), DocVQA (for document understanding), and RealWorldQA (for answering real-world questions using visual information). 

The model can analyse videos longer than 20 minutes, provide detailed summaries, and answer questions about the content. Qwen2-VL can also function as a control agent, operating devices like mobile phones and robots using visual cues and text commands. 

Interestingly, it can recognise and understand text in images across multiple languages, including European languages, Japanese, Korean, and Arabic, making it accessible to a global audience.


Key highlights: 

One of the key architectural improvements in Qwen2-VL is Naive Dynamic Resolution support, which lets the model adapt to and process images of different sizes and clarity.

“Unlike its predecessor, Qwen2-VL can handle arbitrary image resolutions, mapping them into a dynamic number of visual tokens, thereby ensuring consistency between the model input and the inherent information in images,” said Binyuan Hui, the creator of OpenDevin and core maintainer at Qwen. 

He said that this approach more closely mimics human visual perception, allowing the model to process images of any clarity or size. 
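To make the idea of a "dynamic number of visual tokens" concrete, here is a minimal sketch of how a token count could scale with image size. The function name `visual_token_count`, the 14-pixel patch size and the 2×2 merge factor are illustrative assumptions, not details given in this article, and the rounding is simplified.

```python
import math


def visual_token_count(height: int, width: int,
                       patch_size: int = 14, merge: int = 2) -> int:
    """Estimate how many visual tokens an image of the given size yields.

    Hypothetical helper: patch_size and merge are assumed values chosen
    for illustration, not figures taken from the article.
    """
    # Side length, in pixels, covered by one token after merging patches.
    unit = patch_size * merge
    # Round each side up to a whole number of merged patches.
    rows = math.ceil(height / unit)
    cols = math.ceil(width / unit)
    return rows * cols


# A small image and a larger one map to different token counts,
# instead of both being resized to one fixed resolution.
small = visual_token_count(224, 224)
large = visual_token_count(448, 896)
print(small, large)
```

Under these assumptions, a larger image simply produces more tokens, so the amount of model input tracks the amount of information in the picture.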

Another key architectural upgrade, according to Hui, is Multimodal Rotary Position Embedding (M-ROPE). “By deconstructing the original rotary embedding into three parts representing temporal and spatial (height and width) information, M-ROPE enables LLM to concurrently capture and integrate 1D textual, 2D visual, and 3D video positional information,” he added.

In other words, this technique enables the model to understand and integrate text, image and video data. “Data is All You Need!” said Hui. 
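The three-part position scheme can be illustrated with a small sketch that assigns each token a (temporal, height, width) position. The helper names `text_positions` and `visual_positions`, and the choice to give text tokens the same value in all three components, are assumptions made for illustration; the sketch covers only the position IDs, not the rotary embedding itself.

```python
from typing import List, Tuple

# One position per token: (temporal, height, width).
Position = Tuple[int, int, int]


def text_positions(n_tokens: int, start: int = 0) -> List[Position]:
    """1D text: all three components share the same running index (assumed)."""
    return [(start + i, start + i, start + i) for i in range(n_tokens)]


def visual_positions(frames: int, rows: int, cols: int,
                     start: int = 0) -> List[Position]:
    """Image (frames=1) or video (frames>1) laid out on a token grid.

    The temporal component changes per frame, while the height and width
    components follow the token's row and column in the grid.
    """
    return [
        (start + t, start + r, start + c)
        for t in range(frames)
        for r in range(rows)
        for c in range(cols)
    ]


# Three text tokens, a 2x2 image, and a 2-frame video on the same grid.
print(text_positions(3))
print(visual_positions(frames=1, rows=2, cols=2))
print(visual_positions(frames=2, rows=2, cols=2))
```

For a single image the temporal component stays constant, so only height and width vary; for video, the temporal component also advances with each frame.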

The use cases are plenty. William J.B. Mattingly, a digital nomad on X, recently praised the model, calling it his new favorite Handwritten Text Recognition (HTR) model after using it to convert handwritten text into digital format.

Ashutosh Shrivastava, a user on X, used the model to solve a calculus problem and reported successful results, suggesting its usefulness in problem solving.
