Clicking the links below will open the listed AI tools in new tabs in your browser. The list includes the frontier LLM models (proprietary, open-source, regional and reasoning) as well as consolidators (like Poe) and a few fine-tuned models (like Latimer). Agents and browser extensions are below, but API tools are now on a separate page.
PROPRIETARY FRONTIER MODELS: Here are some AI models you should know. They are from different companies, using different neural networks and with different personalities and abilities. The paid versions are often substantially better and smarter.
THE BIG THREE
There are now 1000s of GenAI products, models and tools. Some models are more creative writers while others are better at math or video. Specialized models (below) offer more realistic conversation (Sesame), better infographics (Napkin) or scientific research tools (Future House), but if you want to use one model: the big three (ChatGPT from OpenAI, Gemini from Google & Claude from Anthropic) still provide the best collection of power and features in one place. Ethan Mollick has great advice on which model you should use and on why/how to give your AI a job interview.
All of them have slow, medium and fast models (slow for analysis, medium for most tasks, and fast chat when you just need a quick idea from your creative Uncle Claude). ChatGPT 5.2 will even select the right mode for your prompt (but this means that if you are using the free version it will send you to the cheapest and worst model, 5.2 Instant. You need to select 5.2 THINKING to get a substantially better model!!) The big three also offer the most standard features in one place: reasoning and deep research models, voice mode, visual mode (to see images and documents), the ability to create images, documents and code (and run the code) and a mobile app. There are specialty tools (on the API page) for video (Hailuo, Kling and Google’s own Veo 3) but you can also make video natively in ChatGPT and Gemini; making video within a smarter multimodal model allows for easier prompting but offers fewer cinematic tools. Even these general tools are much better at math now.
What you get for free changes constantly and the names and placements of models are a confusing mess. You will need to hunt for buttons and drop-down menus to find what you want (especially important to make sure you get an internet search when you want it) while you ponder the difference between Deep Research (Gemini) and Extended Thinking (Claude). Also try Google AI Studio which has the latest betas and cool new tools.
You will need to log in, even for the free versions, but this also allows you to adjust your settings to turn off training features. You may or may not want ChatGPT to remember everything about you in order to respond better. A month of a paid model will give you a better sense of what these models can really do, but at the beginning of a new release you can sometimes try the best tools a few times a day for free. (Poe, BoodleBox and other consolidators allow you to try a variety of models through one platform for one fee.) Also try the app (and voice mode!!): it will give you a better idea of how AI is about to change everything. They get news from different places: Google also has a deal with the Associated Press to get real-time news updates. (Meta uses Fox and CNN and Grok uses X.)
Copilot is just another version of ChatGPT (Microsoft is a major investor in OpenAI) that integrates with Microsoft products–so it should be better with Excel and PowerPoint. If your organization gives you access to this in MS Office you should also be FERPA and HIPAA secure.
OTHER MODELS
- Grok 4.1 (released Nov 17, 2025) is once again at the top of many leaderboards. Yes, it is designed with “extreme personalities” and a political bias, but you should also try it. It can analyze images and has real-time access to the internet and social network X. Grok Studio is a collaborative space and also has direct Google Drive integration.
- Sonus is a new “family” of proprietary models (Pro, Air, Mini and Pro with Reasoning, see below) that is already competitive with the very best existing models. For the moment, this is the best (only) way you can try the new reasoning models for free: here.
- WolframAlpha combines the computational powers of Wolfram|Alpha with ChatGPT. Google’s AlphaGeometry 2 now competes at the level of gold-medal students in the International Mathematical Olympiad, but is not generally available. Even the general models are much better at math than earlier AI. Read this about AI and Math.
- Ernie from Baidu (the Chinese Google search engine) is another multimodal frontier model. Ernie X1 is a reasoning model.
- Thaura is a free model from two Syrian engineers that is dedicated to global justice and guarantees never to train on your data and never to take military contracts. It is designed to comply with all European privacy rules.
- Pi is focused on emotional intelligence, dialogue and role-playing. At the moment, it can talk, but can’t hear, so you have to text it.
- You.com is set up to be a search engine competitor to Google but with more privacy and easier customization (so it now faces competition from Gemini, but also ChatGPT Search).
- Amazon Nova is a family of good Class 2 models (so on a par with GPT-4).
- Ethan Mollick has written an excellent summary (Jan 26, 2025) of the differences and how to pick which model to use.
OPEN SOURCE MODELS: There are now open source models that are just as good as the best proprietary frontier models, and even better in some specialized areas. You can download most of these models from GitHub, Azure (Microsoft) or Hugging Face. You can then fine-tune and run them on your laptop, which deals with most privacy issues but also transfers the security risk to you.
- Chinese DeepSeek is strong with text, can also search the web and has a very strong 685B math model that won IMO gold (Math-V2, which verifies step-by-step proof reasoning). It is a cheaper API option and it was built for a fraction of the price/chips/energy of the big models through the clever use of Multi-head Latent Attention (MLA), which compresses what the model has to keep in memory for each token (so less precision, but it turns out that was not needed), and a design in which not all of the model’s parameters are active all the time, for huge energy, cost and time savings. Here is a great non-technical summary of how DeepSeek is important or you could read this tech paper. All free to download.
- Kimi 2 is an excellent free multimodal open source Chinese AI that has a particularly large context window (good for long papers, prompts, and conversations – you can upload up to 50 files of 100MB EACH), does very well in math and coding (beating GPT-4o and Claude Sonnet 3.5 on Codeforces), searches the web, can analyze charts and also has reasoning.
- Qwen 3 from Alibaba has a range of models that can do all of the usual things and allows you to determine the reasoning level with a slider. You can also use Qwen Chat in guest mode without a login (although you have to log in to use voice and some other features). Qwen2.5 beat GPT-4o, Claude-3.5 Sonnet, and DeepSeek-V3 while Qwen2-Math does very well at math.
- Meta AI is now Llama 4 (and now a family of models), which is a huge Class 3 model (which means it can remember more pages than others), but Meta also seems to have fudged the benchmarks. It does not require a login.
- Mistral (available as an API and Le Chat and also in a reasoning version called Magistral) is an open source LLM from France that has real-time internet search (with press wire access for news) and is very fast and more multilingual than the big four. It also creates great images using Black Forest Labs Flux Ultra. Mistral Medium outperformed Claude 3.7 and 4o in many benchmarks.
- MiniMax is another excellent Chinese AI company that has open source reasoning models (M1) and other tools including video and agents.
- Deep Cognito also has a family of open-source models in a variety of sizes.
- MiMo from Xiaomi is an open-source reasoning model that outperforms o1-mini.
- Hugging Face offers a chatbot running on Llama. Start here to get a sense of what open source can do. No login is required.
REGIONAL and CULTURALLY-SPECIFIC MODELS: Since people and cultures think differently, we are starting to see LLMs that are trained on culturally specific data sets. Here is an MIT paper on the problems and the process of creating cultural and regional LLMs. Note that if you want a culturally specific answer, you can and should still try this with the frontier models (try asking Claude to respond as a Black professor and compare the response to Latimer).
- Latimer (named after African-American engineer Lewis Latimer) aims to better represent diverse communities by adding further training from (verified and licensed) books, oral histories and sources from Black and Brown communities. (Latimer is a fine-tuned version of LLAMA.)
- LatamGPT from Chile’s National Center for Artificial Intelligence (Cenia) is an open source model trained on “characteristic data from Latin America.” You can try its chatbot here.
- Brazil’s Maritaca AI has a Portuguese chatbot and a set of fine-tuned LLaMA models trained in Portuguese called Sabia.
- Fanar is a “culturally and regionally aware” Arabic LLM fluent in Arabic dialects from the Qatar Ministry of Communications and Information Technology (MCIT) and the Qatar Computing Research Institute of Hamad Bin Khalifa University (HBKU). You can find a summary of the state of Arabic models here. Dallah was an early fine-tuned version of LLAMA-2 trained on six Arabic dialects, but the models listed here are now all native foundational LLMs.
- Falcon is an open source LLM from the UAE that uses a new “state space” architecture (SSLM) instead of the transformer architecture and says it is the top performing Arabic model. It is also designed to handle a range of dialects.
- Mistral Saba is a 24B parameter version of the French Mistral model trained on curated datasets from across the Middle East and South Asia. It supports Arabic and many Indian-origin languages like Tamil.
- Jais is “trained on the largest Arabic dataset ever used to train an open-source foundational model, ensuring linguistic accuracy and cultural sensitivity across standard Arabic and its dialects.”
- Qalam focuses on translation in writing and seems a bit like an Arabic Grammarly. It also has a plagiarism detector and has sentiment analysis as a feature.
- EthioLLM is trained on five Ethiopian languages (Amharic, Ge’ez, Afan Oromo, Somali, and Tigrinya) and English. More on the dataset here.
- Humain (also in both Arabic and English) is the Saudi version (working with Qualcomm) which has also launched an AI laptop, the Humain Horizon Pro. Nvidia is partnering with Humain to build “AI factories” in Saudi Arabia.
- Apertus from the Swiss National Supercomputing Centre (CSCS) focuses on transparency and diversity: it is trained on 1000 languages and 40% of the data is non-English (including Swiss German and Romansh). Apertus is Latin for “open” and the entire development process, including its architecture, model weights, and training data and recipes, is openly accessible and fully documented.
- OpenLLM-Ro is building a set of LLaMA fine-tuned Romanian models.
- Bielik is a family of open source models with deep integration of Polish language and cultural context.
- Nanda is trained on a dataset containing 65 billion Hindi tokens.
- SEA-Lion is an open source LLM designed to understand Southeast Asia’s diverse languages, cultures, and contexts including Indonesian, Thai, Vietnamese, Filipino, Burmese, Khmer, English, Mandarin, Malay, Tamil, and Lao.
- Doubao (from ByteDance) also has voice mode and is one of the most popular AI in China. (It is in Chinese, but can be used in English with Google translation in your browser.)
- Hunyuan from Chinese Tencent is a 13B parameter open-source model that scores on par with DeepSeek despite being 15% of the size–so more efficient–and another native Chinese speaker.
- TAIDE is a 7B model grounded in Taiwanese culture, “incorporating unique linguistic elements, values, and customs.”
- Komodo (from yellow.ai) is the first LLM built on Indonesian languages including Banjarese, Sundanese, Balinese and Toba Batak.
- HyperCLOVA (from Naver) is designed to understand the Korean language in the larger societal context.
- Typhoon is a set of open source AI models “optimized for the Thai language.”
- PersianLLaMA is a 13B parameter model fine-tuned on Persian Wikipedia.
- GigaChat is an open source Russian language model. Claude, Gemini and DeepSeek all do better on MERA (the leaderboard for testing how models do with Russian tasks) but it is still useful to have a cultural Russian model.
- Sherkala is pre-trained on Kazakh and English sources with some Russian and Turkish sources.
- UCCIX is an Irish language LLM.
- Latxa is a family of LLaMA-based Basque models.
- There is also a group of behavioral scientists exploring the training of LLMs on historic texts (Viking, Latin, Medieval Arabic, etc.). You can read about Latin BERT or try it here (on GitHub).
REASONING MODELS: In 2025 models started to process through problems before answering. They do NOT actually reason (although it appears that way) but they have internal instructions that break problems down into steps which (especially when combined with web searching) improves accuracy and allows much more complicated problem solving. You need to use them a little differently (more here): give the model something hard to do and note (or ask) how it describes its reasoning. Look at this example. The progress here has been rapid and substantial (read this report about the new o3 from Dec 2024). Most of the above models now have free reasoning (but with different names, so look for buttons: Kimi calls this “researcher”).
CONSOLIDATORS: Poe (currently $5/month!!), ChatPlayground ($17/month) and ChatHub are consolidators that provide access to multiple AI models through one interface. BoodleBox also offers a wide range of models and lots of educational tools and controls.
CUSTOM BOTS: Each of the big models also has a way to build and then distribute your own fine-tuned applications with your own prompt instructions. There are also GPTs (from OpenAI), Assistants (from HuggingFace), Bots (from Poe). There are also educational platforms, like BoodleBox, which allows the teacher to see everything students do–and has lots of other faculty features like “coach mode,” which is the chat default (and won’t provide students with direct answers). Much more (including how to build them) on the Custom Bot page.
MINI MODELS and EDGE AI: These are smaller, faster and more specialized (often) OPEN SOURCE tools that you customize to live and run on your phone. Note that the ways to make an LLM better are model size (see Frontier models above), data set size and the amount of training. Since it is not clear that larger more capable models will be cost effective, these faster smaller models (with more training) may end up being more useful. Apple Intelligence will test this idea. More smaller models are coming.
- Phi-3.5 from Microsoft comes in three sizes Mini, Small and Medium (3.8-41B parameters)
- OpenELM is the Apple version that comes in four sizes (270M-3B parameters)
- Gemma is the open source smaller model from Google also in several sizes
AGENTS
A chatbot can only chat with you, but an “agent” can plan and execute a series of tasks, like building you a website or finding information on your computer. Agents code, but that is essentially everything you do on a computer, so don’t think of these just as coding tools. Agents can use multiple tools and know when to switch, so an AI agent can manage a workflow. It is like a contractor rather than a chatbot. Here are details about the “Agent2Agent” (or A2A) or “Model Context Protocol” that create these two-way connections between data sources and AI-powered tools. The distinction between agents and vibe-coding apps is narrowing–the difference is partly workflow and that many of the apps use other foundation models. Stay tuned. There are now lots of demos of agents doing students’ homework.
- Start by watching the demos from Genspark or Manus. Here are some use case examples. Then ask it to build you an interactive course website using the best research and including links to video and with interactive learning activities (or just a new episode of a TV show you like). ChatGPT has agentic capabilities but I think it is behind these at the moment.
- Gemini 3 Antigravity runs on your laptop and interacts with everything Google, and has an “inbox” that allows it to do things and asks you to intervene only when necessary.
- I have also built simulations with Macaly which pitches itself as more of a vibe-coding app.
- Here is a website created with MiniMax by Marc Watkins.
- Claude Computer Use (now Claude Code) can also do tasks like create and debug code. Codex (also open source!) from OpenAI is similar, but both seem more focused on actual coding to me.
- Kimi-Researcher, Asana, Swarm and Devin are other early tools.
- OpenAI has introduced Operator (Jan 23, 2025)–it is called “computer use” in Copilot. Here is a demo (from Graham Clay) where Operator has been asked to write an essay in a Google Doc at human speed with edits.
- Another use of agents (that is also about growing use of synthetic data) is this simulated hospital with AI agents as both patients and doctors, which allowed the AI doctors to gain experience (treating 10,000 patients) and “evolve” to become better.
- LinkedIn has an agent that helps recruit job seekers (and also an AI jobs match tool).
- Zoom has also given its AI companion some agentic capability.
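For readers who want to see the idea behind the AGENTS section in miniature, here is a rough Python sketch of the “plan and execute a series of tasks” loop, where each step picks a tool and can reuse the result of the step before it. Everything here is made up for illustration: the tool names (search_notes, word_count), the tiny notes dictionary and the "$prev" placeholder are assumptions, and a real agent would have an LLM write the plan and would call real tools (a browser, files, an API) instead.

```python
from typing import Callable

# Hypothetical toy tools; a real agent would call a browser, file system or API here.
def search_notes(query: str) -> str:
    notes = {
        "syllabus": "Week 1: intro. Week 2: prompts.",
        "rubric": "Clarity 50%, evidence 50%.",
    }
    return notes.get(query, "nothing found")

def word_count(text: str) -> str:
    return str(len(text.split()))

# The agent's toolbox: tool name -> function that takes and returns text.
TOOLS: dict[str, Callable[[str], str]] = {
    "search_notes": search_notes,
    "word_count": word_count,
}

def run_agent(plan: list[tuple[str, str]]) -> list[str]:
    """Run each (tool_name, argument) step in order.

    The argument "$prev" is replaced by the previous step's result,
    which is how one tool's output feeds the next tool.
    """
    results: list[str] = []
    for tool_name, argument in plan:
        if argument == "$prev" and results:
            argument = results[-1]
        tool = TOOLS.get(tool_name)
        if tool is None:
            # Record the problem and keep going instead of crashing.
            results.append(f"unknown tool: {tool_name}")
            continue
        results.append(tool(argument))
    return results

# In a real agent an LLM would write this plan; here it is hard-coded.
plan = [("search_notes", "syllabus"), ("word_count", "$prev")]
print(run_agent(plan))
```

The point of the sketch is the contractor-like workflow: the plan is a list of jobs, the loop switches tools between jobs, and results are handed from one job to the next.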
AGENTIC BROWSERS
First came the browser extensions (with a large cohort of fill in the answer cheating extensions). Google now has its Gemini AI built into its browser (although AI mode is substantially better than the default AI summaries). Microsoft Edge now has Copilot integrated. Chrome and other browsers are also starting to integrate, but we now have new ground-up AI browsers with the ability to both ask questions and do things built in.
- Dia was one of the first and I like how minimalistic it is while also coming with a way to keep common tasks/prompts available. I used it to plan a trip and it was great at summarizing ideas from different websites it found.
- Comet from Perplexity is more for research and integrates with Google calendar etc but has heavily pitched itself to students to do school work…
- Atlas is the powerful OpenAI entry (not to be confused with AtlasAI geospatial) but early days.
- Opera Neon has tasks to organize workflow but is still early access.
- Disco from Google has a feature called GenTabs that creates interactive web applications that help answer questions or complete tasks.
- Neo from Norton bills itself as “the world’s first AI Native browser that doesn’t compromise power and privacy.”
ROBOTS & MORE
AI is also propelling an advance in robotics, including new home robots like Neo, Figure 03 and the Walker S2 (from UBTech, now actually mass delivered), and robot dogs. Look at the ridable hydrogen-powered Corleo from Kawasaki. There is also AlterEgo, which allows you to have silent (“almost telepathic”) conversations with AI or another human.
EpochAI is an important independent organization that is keeping track of these models, how they compare and where we might be going. They maintain a great dashboard comparing capabilities of the best models (against their own benchmarks) and also this larger data set of virtually all models. They produce excellent reports about trends including a recent prediction that AI will continue to improve rapidly.
You can find a complete list of AI products (tracked by Ithaka S+R) here.
Here is a great AI guide for students.