There are now thousands of GenAI products, models and tools. Some models are more creative writers while others are better at math or video. Specialized models (below) offer more realistic conversation (Sesame), better infographics (Napkin) or scientific research tools (FutureHouse).
Clicking the links will open the listed AI tools in new tabs in your browser. Agents and browser extensions are below, but API tools are on a separate page.
You will need to log in, even for the free versions, but this also allows you to adjust your settings to turn off training features!!! You may or may not want ChatGPT to remember everything about you in order to respond better.
Also try the apps (and voice mode!!).
The models get news from different places: Google has a deal with the Associated Press to get real-time news updates, while Meta uses Fox and CNN and Grok uses X.
A month of a paid model will give you a better sense of what these models can really do, but at the beginning of a new release you can sometimes try the best tools a few times a day for free.

THE BIG THREE PROPRIETARY FRONTIER MODELS

If you want to use one model: the big three (ChatGPT from OpenAI, Gemini from Google & Claude from Anthropic) still provide the best combination of power and features in one place. These are from different companies, using different neural networks and with different personalities and abilities. The paid versions are often substantially better and smarter. Ethan Mollick has great advice on why/how to give your AI a job interview.
All of them have slow, medium and fast models (slow for analysis, medium for most tasks, and fast chat when you just need a quick idea from your creative Uncle Claude). You will need to hunt for buttons and drop-down menus to find what you want. You can select which version of the model to use before you start your chat. (ChatGPT now selects the mode for your prompt, which usually means the cheapest and weakest model, so you need to ASK for deeper thinking.)
The big three also offer the most standard features in one place: voice mode, visual mode (to see images and documents), the ability to create images, documents and code (and run the code) and a mobile app. There are specialty tools (on the API page) for images, video, slides, and voice etc, but you can also make these natively in the big three; making video within a smarter multimodal model allows for easier prompting but offers fewer cinematic tools.
All three offer some version of “study” or “learning” mode. (And Google has lots of tools for education like Learn About and NotebookLM.)
The biggest and smartest models will probably continue to be ChatGPT and Claude, as Google has pivoted (May 2026) to models that are faster, more integrated (with Google products) and good enough.

OTHER PROPRIETARY MODELS

Grok 4.1 is designed with “extreme personalities” and a political bias, but you should also try it. It can analyze images and has real-time access to the internet and social network X. Grok Studio is a collaborative space and also has direct Google Drive integration.
Copilot is just another version of ChatGPT (Microsoft is a major investor in OpenAI) that integrates with Microsoft products, but often seems less good than ChatGPT. If your organization gives you access to this in MS Office you should also be FERPA and HIPAA secure.
MIA-Thinking-1 is the first in-house model from Microsoft. It is a medium-size model (similar to Sonnet) built for math, coding, and “real-world enterprise” and Microsoft claims it is focused on practical, human use.
Euria from Swiss Infomaniak promises privacy and security, is powered by renewable energy, and uses the recovered energy to heat homes. Its kSuite combines messaging, calendar, transfer and storage tools, all with the same environmental and security promises. It also integrates with Microsoft tools.
Ernie from Baidu (the Chinese Google search engine) is another multimodal frontier model. Ernie X1 is a reasoning model.
Thaura is a free model from two Syrian engineers that is dedicated to global justice and guarantees never to train on your data and never to take military contracts. It is designed to comply with all European privacy rules.
Sonus is a family of proprietary models (Pro, Air, Mini and Pro with Reasoning).
Pi is focused on emotional intelligence, dialogue and role-playing. At the moment, it can talk, but can’t hear, so you have to text it.
You.com was set up to be a search engine competitor to Google, but with more privacy and easier customization, and is now focused on specialized APIs.
Amazon Nova are both good Class 2 models (so on a par with GPT 4).

SCIENCE and MATH MODELS

OlmoEarth is a family of pre-trained and fine-tuned foundation models for Earth observation and remote sensing. Examples range from tracking mangrove change to classifying drivers of forest loss to producing country-scale crop-type maps.
WolframAlpha combines the computational powers of Wolfram|Alpha with ChatGPT.
Edison Platform is a single environment for scientific R&D. (Edison is the commercial spinout of FutureHouse.) It has multiple tools:
- Kosmos is a complete lab-in-the-loop AI Scientist. Given a research objective and one or more datasets, Kosmos autonomously reads literature, writes and executes analysis code, generates hypotheses, and produces a comprehensive cited report. A single run involves reading ~1,500 papers and executing ~42,000 lines of code. Every conclusion is fully auditable. You can trace any finding back to the specific code or literature passage that produced it.
- Literature handles literature search and synthesis. It accesses 175M+ papers, trials, and patents with native understanding of citation graphs, journal quality, and clinical trial data. You can ask it a complex scientific question and get a high-accuracy, cited response, or task it with a deep literature review synthesizing conflicting evidence across hundreds of papers.
- Analysis specializes in processing complex experimental data, including flow cytometry, RNA-seq, and other biological datasets. It turns raw data into detailed analyses, statistical results, and publication-ready figures.
- Precedent determines whether a research idea has been tried before. It searches across fields to assess novelty and identify gaps, helping you avoid duplicating existing work and focus on what’s actually new.
- Molecules is a chemistry-focused agent for molecular design and analysis.
Lila is a scientific AI that is connected to fully autonomous laboratories where AI systems generate hypotheses, design experiments, operate lab equipment, analyze results, and iterate at machine speed with minimal human intervention. It is an AI trained on 10 trillion tokens of scientific reasoning data from experimental results, but that dataset is set to double this year. (The usable internet training data for LLMs is about 15 trillion tokens.) Lila trains across scientific areas from life sciences to energy and outperforms other frontier models on scientific tasks.
Walrus and AION-1, from members of the Polymathic AI collaboration and researchers from the University of Cambridge, are trained using real scientific datasets to tackle problems in astronomy and fluid-like systems, rather than text and images. Walrus’ domain is fluids and fluid-like systems while AION-1 is trained on data from astronomical surveys.
Ether0 is a 24B-parameter reasoning model built on Mistral-Small-24B and post-trained for chemistry.
MatterChat is a new AI “framework” or “bridge” model from Lawrence Berkeley National Laboratory (Berkeley Lab) that connects the conversational power of a Large Language Model (LLM) with a physics-based AI that models “interatomic potentials”: the complex physical forces between atoms.
SCIGEN (short for structural constraint integration in a GENerative model for the discovery of quantum materials) from MIT (led by Mingda Li) steers models toward producing materials with structures known to host quantum phenomena, such as Kagome and Archimedean lattices.
Google’s AlphaGeometry 2 now competes at the level of gold-medal students in the International Mathematical Olympiad, but is not generally available.
Numina-Lean-Agent (read the paper here) uses standard models to do math reasoning. Even the general models are much better at math than earlier AI.
Mathstral is built on Mistral 7B and designed for scientific discovery and mathematical reasoning. It is an open source tool for complex derivations and scientific tasks.
Read this about AI and Math and this about AxiomProver.

OPEN SOURCE MODELS

These open source models are often very close to the best proprietary frontier models, and better in some specialized areas. You can download most of these models from GitHub, Azure (Microsoft) or Hugging Face. You can then fine-tune and run them on your laptop, which deals with most privacy issues but also transfers the security risk to you.
Chinese DeepSeek is strong with text, can also search the web and has a very strong 685B math model that won IMO gold (Math-V2, which verifies step-by-step proof reasoning). It is a cheaper API option and it was built for a fraction of the price/chips/energy of the big models through clever engineering. Multi-head Latent Attention (MLA) compresses the information the model keeps about earlier tokens (so a little less precision, but it turns out the extra was not needed), and not all of the model's parameters are active all the time, for a huge energy, cost and time savings. Here is a great non-technical summary of how DeepSeek is important or you could read this tech paper. All free to download.
Kimi 2 is an excellent free multimodal open source Chinese AI that has a particularly large context window (good for long papers, prompts, and conversations – you can upload 50 files of 100MB EACH), does very well in math and coding (beating GPT-4o and Claude Sonnet 3.5 on Codeforces), searches the web, can analyze charts and also has reasoning.
Qwen 3 from Alibaba has a range of models that can do all of the usual things and allows you to determine the reasoning level with a slider. You can also use Qwen Chat in guest mode without a login (although you have to log in to use voice and some other features). Qwen2.5 beat GPT-4o, Claude-3.5 Sonnet, and DeepSeek-V3 while Qwen2-Math does very well at math.
Meta AI now runs on Llama 4 (now a family of models), which is a huge Class 3 model (which means it can remember more pages than others), but Meta also seems to have fudged the benchmarks. It does not require a login.
Mistral (available as an API and Le Chat and also in a reasoning version called Magistral) is an open source LLM from France that has real-time internet search (with press wire access for news) and is very fast and more multilingual than the big four. It also creates great images using Black Forest Labs Flux Ultra. Mistral Medium outperformed Claude 3.7 and 4o in many benchmarks.
MiniMax is another excellent Chinese AI company that has open source reasoning models (M1) and other tools including video and agents.
Deep Cognito also has a family of open-source models in a variety of sizes.
MiMo from Xiaomi is an open-source reasoning model that outperforms o1-mini.
Hugging Face has a chatbot running on Llama. Start here to get a sense of what open source can do. No login is required.
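The DeepSeek entry above mentions that not everything in the model is active all the time. That idea is usually called a "mixture of experts": a small router picks a few specialist sub-networks for each token and skips the rest. Here is a toy sketch of that routing step in Python. It is not DeepSeek's actual code; the sizes (8 experts, 2 active), the random weights and the function names `route` and `moe_forward` are all assumptions made up for illustration.

```python
import math
import random

random.seed(0)  # fixed seed so the toy weights are reproducible

NUM_EXPERTS = 8   # hypothetical: total experts in one layer
TOP_K = 2         # hypothetical: experts actually used per token
DIM = 4           # hypothetical: size of a token vector

# Each "expert" and each router row is just a small random weight vector here.
experts = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
router = [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]


def dot(a, b):
    """Plain dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def route(token):
    """Score every expert, keep the TOP_K best, and softmax their scores."""
    scores = [dot(row, token) for row in router]
    chosen = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]
    top = max(scores[i] for i in chosen)
    weights = {i: math.exp(scores[i] - top) for i in chosen}
    total = sum(weights.values())
    return {i: w / total for i, w in weights.items()}


def moe_forward(token):
    """Run only the chosen experts and blend their outputs by gate weight."""
    gates = route(token)
    output = sum(gate * dot(experts[i], token) for i, gate in gates.items())
    return output, sorted(gates)


token = [0.5, -0.2, 0.1, 0.9]
output, used = moe_forward(token)
print(f"experts used: {used} ({len(used)} of {NUM_EXPERTS})")
```

Only 2 of the 8 toy experts do any work for this token, which is where the energy, cost and time savings come from: the model can be very large overall while each token only pays for a small slice of it.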

REGIONAL and CULTURALLY-SPECIFIC MODELS

Since people and cultures think differently, we are starting to see LLMs that are trained on culturally specific data sets. Here is an MIT paper on the problems and the process of creating cultural and regional LLMs. Note that if you want a culturally specific answer, you can and should still try this with the frontier models (try asking Claude to respond as a Black professor and compare the response to Latimer). Some of this is also the desire for Sovereign AI, and many governments are putting money here so they do not have to rely on US models.
Latimer (named after African-American engineer Lewis Latimer) aims to better represent diverse communities by adding further training from (verified and licensed) books, oral histories and sources from Black and Brown communities. (Latimer is a fine-tuned version of Llama.)
LatamGPT from Chile’s National Center for Artificial Intelligence (Cenia) is an open source model trained on “characteristic data from Latin America.” You can try its chatbot here.
Fanar is a “culturally and regionally aware” Arabic LLM fluent in Arabic dialects from the Qatar Ministry of Communications and Information Technology (MCIT) and the Qatar Computing Research Institute of Hamad Bin Khalifa University (HBKU). You can find a summary of the state of Arabic models here. Dallah was an early fine-tuned version of Llama 2 trained on six Arabic dialects, but the models listed here are now all native foundational LLMs.
Falcon is an open source LLM from the UAE that uses a new “state space” architecture (SSLM) instead of the transformer architecture and says it is the top performing Arabic model. It is also designed to handle a range of dialects.
Mistral Saba is a 24B parameter version of the French Mistral model trained on curated datasets from across the Middle East and South Asia. It supports Arabic and many Indian-origin languages like Tamil.
Vulavula speaks, transcribes and translates a wide range of low-resource languages across Africa and the broader Global South like isiZulu, Sesotho, Afrikaans, African French and more.
Brazil’s Maritaca AI has a Portuguese chatbot and a set of fine-tuned LLaMA models trained in Portuguese called Sabiá.
Jais is “trained on the largest Arabic dataset ever used to train an open-source foundational model, ensuring linguistic accuracy and cultural sensitivity across standard Arabic and its dialects.”
Qalam focuses on translation in writing and seems a bit like an Arabic Grammarly. It also has a plagiarism detector and has sentiment analysis as a feature.
EthioLLM is trained on five Ethiopian languages (Amharic, Ge’ez, Afan Oromo, Somali, and Tigrinya) and English. More on the dataset here.
Humain (also in both Arabic and English) is the Saudi version (working with Qualcomm), which has also launched an AI laptop, the Humain Horizon Pro. Nvidia is partnering with Humain to build “AI factories” in Saudi Arabia.
Apertus from the Swiss National Supercomputing Centre (CSCS) focuses on transparency and diversity: it is trained on 1000 languages, with 40% of the data non-English (including Swiss German and Romansh). Apertus is Latin for “open” and the entire development process, including its architecture, model weights, and training data and recipes, is openly accessible and fully documented.
OpenLLM-Ro is building a set of LLaMA fine-tuned Romanian models.
Bielik is a family of open source models with deep integration of Polish language and cultural context.
Airavata is a 7B OpenHathi model optimized for the Hindi language and specific Indian natural language processing needs, fine-tuned using IndicTrans2. Developed by AI4Bharat.
Nanda is trained on a dataset containing 65 billion Hindi tokens.
SEA-Lion is an open source LLM designed to understand Southeast Asia’s diverse languages, cultures, and contexts including Indonesian, Thai, Vietnamese, Filipino, Burmese, Khmer, English, Mandarin, Malay, Tamil, and Lao.
Doubao (from ByteDance) also has voice mode and is one of the most popular AI tools in China. (It is in Chinese, but can be used in English with Google translation in your browser.)
Hunyuan from Chinese Tencent is a 13B parameter open-source model that scores on par with DeepSeek despite being 15% of the size–so more efficient–and another native Chinese speaker.
TAIDE is a 7B model grounded in Taiwanese culture, “incorporating unique linguistic elements, values, and customs.”
Komodo (from yellow.ai) is the first LLM built on Indonesian languages including Banjarese, Sundanese, Balinese and Toba Batak.
Fugaku-LLM is a large language model with enhanced Japanese language capability, using the RIKEN supercomputer Fugaku.
HyperCLOVA (from Naver) is designed to understand the Korean language in the larger societal context.
Typhoon is a set of open source AI models “optimized for the Thai language.”
PersianLLaMA is a 13B parameter model fine-tuned on Persian Wikipedia.
GigaChat is an open source Russian language model. Claude, Gemini and DeepSeek all do better on MERA (the leaderboard for testing how models do with Russian tasks), but it is still useful to have a cultural Russian model.
Sherkala is pre-trained on Kazakh and English sources with some Russian and Turkish sources.
UCCIX is an Irish language LLM.
Latxa is a family of LLaMA-based Basque models.

HISTORICAL LLMS

This is a Victorian model trained entirely from scratch on 28,000 Victorian-era British texts published between 1837 and 1899, drawn from a dataset made available by the British Library. Here is another student-produced TimeCapsuleLLM, trained on texts from 1800–1875 London to “speak Victorian.”
Talkie is a 13-billion-parameter language model trained on pre-1931 (public domain) text. You can read about lots of interesting experiments or watch it talk to Claude in this introductory blog post. I asked “What would happen to Germany if in 1933 a democratic socialist named Adolf Hitler was elected to power on a platform of making Germany great again” and it predicted Germany would build up its armies and soon conquer all of Europe and eventually “establish herself all over the world.”
A group of behavioral scientists started training LLMs on historic texts (Viking, Latin or Medieval Arabic etc) in 2024. You can read about Latin BERT or try it here (on GitHub).
MonadGPT was trained on 11,000 texts from 1400 to 1700 CE using 17th-century knowledge frameworks.
XunziALLM uses the formal rules of classical Chinese poetry.

CONSOLIDATORS

Perplexity.ai is an AI-powered chatbot search engine designed to answer questions with sources cited using multiple frontier models. (Pro users get a choice of which model.) It also has an Internal Knowledge Search that will search your files for info and many other useful tools.
Poe, ChatPlayground, and ChatHub all provide access to multiple AI models through one interface.
BoodleBox also offers a wide range of models along with lots of educational tools and controls.
Google AI Studio has the latest Google betas and cool new tools but also allows you to try multiple tools side-by-side.

REASONING MODELS: In 2025 models started to process through problems before answering. They do NOT actually reason (although it appears that way) but they have internal instructions that break problems down into steps which (especially when combined with web searching) improves accuracy and allows much more complicated problem solving. You need to use them a little differently (more here): give it something hard to do and note (or ask) how it describes its reasoning. Look at this example. The progress here has been rapid and substantial (read this report about the new o3 from Dec 2024). Most of the above models now have free reasoning (but with different names so look for buttons: Kimi calls this “researcher.”)

CUSTOM BOTS: Each of the big models also has a way to build and then distribute your own fine-tuned applications with your own prompt instructions. There are also GPTs (from OpenAI), Assistants (from Hugging Face) and Bots (from Poe). There are also educational platforms, like BoodleBox, which allows the teacher to see everything students do–and has lots of other faculty features like “coach mode,” which is the chat default (and won’t provide students with direct answers). Much more (including how to build them) on the Custom Bot page.

MINI MODELS and EDGE AI: These are smaller, faster and more specialized (often) OPEN SOURCE tools that you customize to live and run on your phone. Note that the ways to make an LLM better are model size (see Frontier models above), data set size and the amount of training. Since it is not clear that larger, more capable models will be cost effective, these faster, smaller models (with more training) may end up being more useful. Apple Intelligence will test this idea. More smaller models are coming.

Phi-3.5 from Microsoft comes in three sizes: Mini, Small and Medium (3.8-41B parameters).
OpenELM is the Apple version that comes in four sizes (270M-3B parameters).
Gemma is the open source smaller model from Google, also in several sizes.
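To get a feel for why these parameter counts matter on a phone or laptop, here is a rough back-of-the-envelope sketch in Python. It assumes the weights take about (parameters × bits per weight ÷ 8) bytes and ignores the extra working memory a model needs while running, so treat the numbers as a floor rather than a spec. The function name `weight_memory_gb` is made up for illustration; the parameter counts come from the list above.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: int) -> float:
    """Rough size of the model weights alone, in gigabytes (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Parameter counts (in billions) quoted in the list above.
models = {
    "OpenELM (smallest)": 0.27,
    "OpenELM (largest)": 3.0,
    "Phi-3.5 Mini": 3.8,
}

for name, size in models.items():
    for bits in (16, 4):  # 16-bit is a common default; 4-bit is a common compressed format
        gb = weight_memory_gb(size, bits)
        print(f"{name}: {size}B parameters at {bits}-bit is about {gb:.2f} GB")
```

By this estimate a 3.8B model needs about 7.6 GB for its weights at 16 bits but only about 1.9 GB at 4 bits, which is roughly the difference between "needs a good laptop" and "fits on a phone."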

AGENTS

A chatbot can only chat with you, but an “agent” can plan and execute a series of tasks, like building you a website or finding information on your computer. GenAI can create content, but Agentic AI can work autonomously. Agents can also respond to a trigger–without you having to initiate. Agents code, but that is essentially everything you do on a computer, so don’t think of these just as coding tools. Agents can use multiple tools and know when to switch, so an AI agent can manage a workflow. It is like a contractor rather than a chatbot. Here are details about the “Agent2Agent” (or A2A) or “Model Context Protocol.” MCP is an open-source universal interface for connecting AI to external systems and data. MCP is like a port on your computer (or like being connected to the internet)–you don’t need to know how it works, just that it connects files and tools (like AI and your calendar). MCP was released by Anthropic in Nov 2024 and rapidly adopted. Where RAG is static and one-directional (AI can look for answers in a file), MCP is bi-directional, but they are both ways to give AI context. The distinction between agents and vibe-coding apps is narrowing–the difference is partly workflow and that many of the apps use other foundation models. Stay tuned. There are now lots of demos of agents doing students’ homework.
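If you are curious what that “port” looks like underneath, here is a toy sketch in Python. MCP messages are JSON-RPC 2.0, and the method names `tools/list` and `tools/call` come from the MCP specification; everything else here is an assumption for illustration. The calendar tool `find_free_slot` and its data are invented, the server runs inside the same script rather than over a real connection, and the setup handshake and tool input schemas that a real server needs are left out.

```python
import json


def find_free_slot(day: str) -> str:
    """Hypothetical tool: a pretend calendar lookup with made-up data."""
    calendar = {"Monday": "2pm", "Tuesday": "10am"}
    return calendar.get(day, "no free slot")


# Registry of tools this toy server offers to an AI client.
TOOLS = {
    "find_free_slot": {
        "description": "Return the first free meeting slot on a given day.",
        "handler": find_free_slot,
    }
}


def handle_request(raw: str) -> str:
    """Answer one JSON-RPC 2.0 request the way a toy MCP-style server might."""
    request = json.loads(raw)
    method = request["method"]
    if method == "tools/list":
        # The AI asks: what can you do?
        result = {"tools": [{"name": name, "description": tool["description"]}
                            for name, tool in TOOLS.items()]}
    elif method == "tools/call":
        # The AI asks: please do this, with these arguments.
        params = request["params"]
        text = TOOLS[params["name"]]["handler"](**params["arguments"])
        result = {"content": [{"type": "text", "text": text}]}
    else:
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": request["id"], "error": error})
    return json.dumps({"jsonrpc": "2.0", "id": request["id"], "result": result})


# Client side: first discover the tools, then call one.
listing = json.loads(handle_request(json.dumps(
    {"jsonrpc": "2.0", "id": 1, "method": "tools/list"})))
reply = json.loads(handle_request(json.dumps({
    "jsonrpc": "2.0", "id": 2, "method": "tools/call",
    "params": {"name": "find_free_slot", "arguments": {"day": "Tuesday"}},
})))

print(listing["result"]["tools"][0]["name"])
print(reply["result"]["content"][0]["text"])
```

The two-way part is the point: the AI first asks the server what it can do, then asks it to act and gets a result back, instead of only reading a fixed file as in RAG.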

Here is a great summary with demos and further resources from a great AAC&U panel. And here is a Guide to Agentic AI from Stanford.
Start by watching the demos from Genspark or Manus (which was bought by Meta in Dec 2025). Here are some use case examples. Then ask it to build you an interactive course website using the best research and including links to video and with interactive learning activities (or just a new episode of a TV show you like). ChatGPT has agentic capabilities but I think it is behind these at the moment.
Gemini 3 Antigravity runs on your laptop and interacts with everything Google, but has an “inbox” that allows it to do things and asks you to intervene only when necessary.
Claude Computer Use (now Claude Code) is an agentic coding tool that can create and debug code, but also read your files. Claude Cowork is the desktop app version (only Mac so far) with a much easier interface.
OpenAI introduced Operator (Jan 23, 2025)–called “computer use” in Copilot. Here is a demo (from Graham Clay) where Operator has been asked to write an essay in a Google Doc at human speed with edits. Codex lives on your computer (which means it can do things with your files, but it can only work when your computer is running–which I find less useful when I am doing a massive search). It is temporarily free at launch (January 2026) so you can try GPT-5.2 Thinking here too.
Zapier connects to other tools you already have (Calendar, Google Drive, Slack, LMS).
You text Lindy (like an assistant) to ask it to do things (like schedule a meeting).
Chinese Z.ai has buttons that allow you to switch from chat to agent.
Kimi has modes for both Agent and an Agent Swarm (which allows for up to 300 sub-agents running 4,000 coordinated steps in parallel).
Kimi-Researcher, Asana, Swarm and Devin were other early tools, but there are many more like Gumloop, Relay, Cofounder, Make, HockeyStack and Stack AI. Composer is another agentic coding model.
I have also built simulations with Macaly which pitches itself as more of a vibe-coding app.
Here is a website created with MiniMax by Marc Watkins.
Another use of agents (that is also about the growing use of synthetic data) is this simulated hospital with AI agents as both patients and doctors, which allowed the AI doctors to gain experience (treating 10,000 patients) and “evolve” to become better.
LinkedIn has an agent that helps recruit job seekers (and also an AI jobs match tool).
Zoom has also given its AI companion some agency capability.

AGENTIC BROWSERS

First came the browser extensions (with a large cohort of fill-in-the-answer cheating extensions). Google now has its Gemini AI built into its browser (although AI mode is substantially better than the default AI summaries). Microsoft Edge now has Copilot integrated. Chrome and other browsers are also starting to integrate, but we now have new ground-up AI browsers with the ability to both ask questions and do things built in.

Dia was one of the first and I like how minimalistic it is while also coming with a way to keep common tasks/prompts available. I used it to plan a trip and it was great at summarizing ideas from different websites it found.
Comet from Perplexity is more for research and integrates with Google calendar etc but has heavily pitched itself to students to do school work…
Atlas is the powerful OpenAI entry (not to be confused with AtlasAI geospatial), but it is early days.
Opera Neon has tasks to organize workflow but is still early access.
Disco from Google has a feature called GenTabs that create interactive web applications that help answer questions or complete tasks.
Neo from Norton bills itself as “the world’s first AI Native browser that doesn’t compromise power and privacy.”

ROBOTS & MORE

AI is also propelling an advance in robotics, including new home robots like Neo, Figure 03, the Walker S2 (from UBTech, now actually mass delivered) and robot dogs. Look at the ridable hydrogen-powered Corleo from Kawasaki. There is also AlterEgo, which allows you to have silent (“almost telepathic”) conversations with AI or another human.

EpochAI is an important independent organization that is keeping track of these models, how they compare and where we might be going. They maintain a great dashboard comparing capabilities of the best models (against their own benchmarks) and also this larger data set of virtually all models. They produce excellent reports about trends including a recent prediction that AI will continue to improve rapidly.

You can find a complete list of AI products (tracked by Ithaka S+R) here.

Here is a great AI guide for students.
