How to Compare AI Models from OpenAI, Google, and More

Tanya Rai
LastMile AI blog
May 31, 2023

In today’s fast-moving artificial intelligence (AI) landscape, big tech companies and the open-source community are releasing new models at a rapid pace, each promising better performance and accuracy.

Unfortunately, it has not been easy to evaluate and compare these models against each other, especially for people not well-versed in AI technology.

We’re here to help! LastMile AI is a platform where you can easily experiment with and compare models from various providers like OpenAI, Google, and the open-source community.

Models Available on LastMile AI

Today on LastMile AI you can compare large language models (ChatGPT vs PaLM 2) and image generation models (DALL-E 2 vs Stable Diffusion v1.4). We are working hard to add new models, so stay tuned!

The table below provides a quick overview of the different models available today:

1/ Compare StableDiffusion with DALL-E

Diffusion models are a type of machine learning model that generates high-quality images by starting from random noise and progressively removing that noise over many steps. Over the last decade, researchers around the world have continued to refine the approach. A major breakthrough was the 2021 work on Improved Denoising Diffusion Probabilistic Models, which introduced changes that let the model produce good images in fewer sampling steps, and therefore more quickly.
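To make the "progressive refinement" idea a little more concrete, here is a minimal toy sketch in plain Python. It is not how DALL-E or Stable Diffusion are implemented. It only shows the standard noising formula from the diffusion literature and how a noise prediction can be used to estimate the clean image. The four-number "image", the linear schedule values, and all function names are assumptions made for illustration.

```python
import math
import random


def alpha_bar_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule."""
    alpha_bars = []
    running = 1.0
    for t in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * t / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, alpha_bar, noise):
    """Forward process: blend the clean signal with Gaussian noise."""
    a = math.sqrt(alpha_bar)
    b = math.sqrt(1.0 - alpha_bar)
    return [a * x + b * n for x, n in zip(x0, noise)]


def estimate_clean(xt, alpha_bar, predicted_noise):
    """Invert the forward step, given a prediction of the added noise."""
    a = math.sqrt(alpha_bar)
    b = math.sqrt(1.0 - alpha_bar)
    return [(x - b * n) / a for x, n in zip(xt, predicted_noise)]


rng = random.Random(0)
pixels = [0.2, -0.5, 0.9, 0.0]  # stand-in for a tiny "image"
alpha_bars = alpha_bar_schedule(1000)
noise = [rng.gauss(0.0, 1.0) for _ in pixels]

noisy = add_noise(pixels, alpha_bars[500], noise)

# A trained network would predict `noise` from `noisy`. Here we pass the
# true noise to show that a perfect prediction recovers the original.
recovered = estimate_clean(noisy, alpha_bars[500], noise)
```

In a real model the noise prediction is imperfect, so the image is cleaned up a little at a time over many steps rather than in one jump.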

Two popular diffusion-based image generation models are DALL-E 2 and Stable Diffusion.

DALL-E offers an easy-to-use interface where simple, short prompts can get you far. Stable Diffusion exposes many more parameters, so you can create much more specific images. For example, negative prompts in Stable Diffusion let you tell the model what you don’t want to see in the generated images.
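For readers curious how a negative prompt can steer an image, here is a simplified sketch. As far as we understand, common open-source Stable Diffusion implementations apply negative prompts through classifier-free guidance: at each denoising step the model makes one noise prediction for the prompt and one for the negative prompt, then moves away from the negative one. The function name, the three-number vectors, and the scale value below are assumptions for illustration only, not the platform's or any library's actual code.

```python
def guided_noise(eps_prompt, eps_negative, guidance_scale=7.5):
    """Combine two noise predictions, classifier-free-guidance style.

    Starts from the negative-prompt prediction and pushes toward the
    prompt prediction. Scales above 1 push further from the negative.
    """
    return [
        neg + guidance_scale * (pos - neg)
        for pos, neg in zip(eps_prompt, eps_negative)
    ]


# Toy noise predictions (hypothetical numbers, three "pixels" each).
eps_prompt = [0.30, -0.10, 0.50]
eps_negative = [0.10, -0.10, 0.90]

combined = guided_noise(eps_prompt, eps_negative, guidance_scale=2.0)
```

Where the two predictions agree (the middle value), guidance changes nothing. Where they differ, the result is pushed past the prompt prediction and away from the negative one.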

Let’s try out a couple of examples!

Landscape with an Aesthetic: DALL-E 2 vs StableDiffusion v1.4

The DALL-E landscape emulates Salvador Dalí’s aesthetic and style more closely than the Stable Diffusion one, although the Stable Diffusion image shows more characteristics of San Francisco as a city. With Stable Diffusion, we can also customize our images further using negative prompts.

Character with an Aesthetic: DALL-E 2 vs StableDiffusion v1.4

Link to Comparison: https://lastmileai.dev/reviews/clic29d8z0057pian5mj5r8a6

Stable Diffusion excels at the Studio Ghibli aesthetic for Batman, whereas the DALL-E output bears little resemblance to the requested style.

2/ Compare ChatGPT with PaLM

LLMs have become a powerful tool for interactive communication experiences such as chatbots.

Two of the most popular AI chatbots available today are ChatGPT from OpenAI and Bard from Google.

Because chat is such a common use case, many top models have variants focused on chat experiences. ChatGPT is built on the foundation model GPT-3.5, and Bard is built on the foundation model PaLM 2. Note that PaLM 2 includes two models: text-bison for single-turn input/output scenarios and chat-bison for chat-like scenarios. Both are available on LastMile AI, labeled as PaLM (text-bison) and PaLM Chat (chat-bison).
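The difference between a single-turn model and a chat model mostly comes down to the shape of the input. The sketch below illustrates that difference with plain Python data structures. The field names ("prompt", "messages", "author", "content") and helper names are hypothetical and are not the actual PaLM or LastMile AI request format.

```python
from dataclasses import dataclass


@dataclass
class Message:
    author: str   # "user" or "model" (hypothetical role labels)
    content: str


def text_request(prompt):
    """Single-turn style: one prompt in, one completion out."""
    return {"prompt": prompt}


def chat_request(history, new_user_message):
    """Chat style: the full conversation so far is sent on every turn."""
    messages = history + [Message("user", new_user_message)]
    return {
        "messages": [
            {"author": m.author, "content": m.content} for m in messages
        ]
    }


history = [
    Message("user", "Recommend a book about space."),
    Message("model", "You might enjoy a popular science title on astronomy."),
]

single = text_request("Summarize this paragraph in one sentence.")
chat = chat_request(history, "Something shorter, please.")
```

The chat request carries the earlier turns, which is what lets a follow-up like "Something shorter, please." make sense to the model.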

Generally, ChatGPT has a reputation for a human-like conversational tone, while PaLM is said to be better at coding tasks. Let’s test this out through some examples!

Conversation Tone: ChatGPT vs PaLM

Link to Comparison: https://lastmileai.dev/reviews/clic07sn3004fpiandcvw1bag

ChatGPT provides a much more casual, human-like tone, while PaLM sounds robotic and unnecessarily wordy. In addition, PaLM included information that was never provided in the prompt and that was incorrect. In the AI world, we call this tendency “hallucination,” and it is one of the major challenges in the industry right now.

Coding: ChatGPT vs PaLM

Link to Comparison: https://lastmileai.dev/reviews/clibue0ra001ipian3s5bil1y

Surprisingly, ChatGPT provides a better coding response than PaLM in this example. The PaLM solution does not capitalize ‘wrapper’ and adds an unnecessary state value and useEffect, which hurts render performance. However, PaLM was faster to respond. We would love to know if you’ve found code examples where PaLM outperforms ChatGPT!
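If you want to reproduce this kind of side-by-side check outside the platform, including response time, a small harness along the following lines could work. This is only a sketch: the two "models" are stand-in functions invented for illustration, and in practice each would wrap a call to the provider you are testing.

```python
import time


def compare(models, prompt):
    """Run the same prompt through each model and record output and latency."""
    results = {}
    for name, generate in models.items():
        start = time.perf_counter()
        output = generate(prompt)
        elapsed = time.perf_counter() - start
        results[name] = {"output": output, "seconds": elapsed}
    return results


# Hypothetical stand-ins for real model calls.
def fake_model_a(prompt):
    return prompt.upper()


def fake_model_b(prompt):
    return prompt[::-1]


results = compare(
    {"model_a": fake_model_a, "model_b": fake_model_b},
    "write a react wrapper component",
)

for name, result in results.items():
    print(f"{name}: {result['seconds']:.6f}s -> {result['output']}")
```

A single run is noisy, so repeating each prompt several times and comparing the typical latency gives a fairer picture.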

Reach out!

We would love to see what insights you find from comparing models from different providers! Please share your work and tag us on our socials. We are excited to see what you learn.

Our vision at LastMile AI is to make it easy for you to work with and experiment with AI all on one platform. We’re just getting started and would greatly appreciate any feedback!

Check us out: lastmileai.dev

Follow us on Twitter: @lastmileaidev

Join our Discord: LastMile AI
