New Alternatives to AI Models - How to Choose the Right One (Part 2)
The development of AI-driven models is radically changing the way we interact with machines, tackle large volumes of data and see the future of creativity and problem solving. What started as simplified text prediction evolved into the sophisticated capabilities of today's models. This journey has been marked by a series of important milestones, each introducing deeper understanding and more nuanced communication capabilities.
As we observe the transformation of the digital landscape powered by AI, it is crucial to understand the challenges and hurdles faced by these cutting-edge models. In this instalment of my blog series, I will introduce some lesser-known models that are sure to gain a more prominent place among the large language models.
Grok - unconventionality and boldness from xAI
The Grok chatbot is powered by Grok-1, the LLM developed by xAI, the AI company founded by Elon Musk. It is distinguished by its ability to tackle bold questions that other models might refuse to answer and to add humour to its responses. While this is a fresh approach to AI interactions, it can present challenges in maintaining relevance and avoiding misunderstandings. The company claims Grok-1 exceeds the performance of GPT-3.5 and comparable AI models. Grok supports branching conversations, allowing users to move between different topics. In addition, users have the unique ability to modify any part of Grok's output as though it were the original answer.

MGIE - Apple's open-source artificial intelligence model
MGIE, which is short for MLLM-Guided Image Editing, was developed by Apple in collaboration with the University of California, Santa Barbara, and specialises in executing text-based image editing commands. Based on text prompts from users, MGIE can perform a variety of image editing tasks such as cropping, resizing and rotating, as well as adjusting brightness, colour balance and contrast. It significantly improves the efficiency of image editing across different metrics and maintains competitive inference performance.
The technology is used to perform Photoshop-style alterations, photo optimisation and local editing. Online media outlets predict that Apple, in partnership with Google, will showcase a revolution in the AI market at its Worldwide Developers Conference (WWDC 2024), with Siri positioned at the core of the solution.

Mistral AI - the French breakthrough for developers
Mistral.ai is breaking new ground in artificial intelligence with its Mistral Large, Mistral Small and Mistral Embeddings models, which highlight outstanding performance and versatility. The models are designed to reduce bias and provide modular control, answering a wide range of user needs. The technology is open and portable with flexible deployment and customisation options. This approach not only drives innovation in AI but also enables the integration of advanced AI solutions into business systems, making Mistral.ai a top choice for developers and enterprises looking to integrate AI into their applications or services.

Ernie - the Chinese alternative to ChatGPT
Ernie, short for Enhanced Representation through Knowledge Integration, is an AI-powered conversational chatbot created by the Chinese company Baidu and is the Chinese answer to OpenAI's ChatGPT. It is based on Baidu's in-house large-scale language model (LLM), Ernie 3.0-Titan, and a pre-trained conversation generation model known as PLATO. Ernie boasts multi-modal capabilities, allowing users to work with both text and images in prompts and responses. Its main advantage is its integration of diverse data types, while its specificity and localisation for the Chinese market may pose challenges when it comes to global use.

Comparing large language models
LLM | Company | Model name | Chatbot | Free version | Language support |
---|---|---|---|---|---|
GPT | OpenAI | Free: GPT 3.5, Paid: GPT 4.0 | ChatGPT | Yes | 95, including Slovenian |
Gemini | Google | Free: Gemini Pro, Paid: Gemini Ultra 1.0 | Google Gemini (previously Bard) | Yes | 40, including Slovenian |
Claude | Anthropic | Free: Claude Sonnet, Paid: Claude Opus | Claude | Yes | English, Japanese, Spanish, French |
Grok | xAI | Grok-1 | Grok anything | No, enterprise | 200+ |
Llama | Meta | Llama 2-70B | Llama2.ai | Yes | English |
Mistral | Mistral AI | Mistral Small, Mistral Large, Mistral Embedding | Le Chat Mistral | No, enterprise | English, French, Italian, German, Spanish and excellent in coding |
Ernie | Baidu | Ernie 3.0-Titan | Ernie bot | No, enterprise | Chinese |

Each LLM has its strengths and weaknesses. How effective a model is depends on the context: the specific user needs and usage scenarios. This two-part series of blog posts is just a brief introduction to some of the world's best-known models, with new ones emerging seemingly every day. Some of the more well-known development projects include BloombergGPT, AlexaTM, Bloom and Koala.
Then there are also text-to-video models (Sora) and text-to-image models (DALL-E 3, Adobe Sensei) that focus on video and image creation. Artificial intelligence is an industry with huge growth potential, in terms of both innovation and investment. As such, we can expect many more innovations in the future, including new large-scale language models, or rather large-scale action models (LAMs), which represent the next step in the evolution of modelling. Stay tuned for more insights on this topic in a future blog! 😊