
Meta Releases Much-Anticipated Llama 4 Models—Are They Truly That Amazing?


Meta unveiled its newest artificial intelligence models this week, releasing the much-anticipated Llama 4 LLMs to developers while teasing a much larger model still in training. The company claims the new models are state of the art, able to compete against the best closed-source models without any fine-tuning.

“These models are our best yet thanks to distillation from Llama 4 Behemoth, a 288 billion active parameter model with 16 experts that is our most powerful yet and among the world’s smartest LLMs,” Meta said in an official announcement. “Llama 4 Behemoth outperforms GPT-4.5, Claude Sonnet 3.7, and Gemini 2.0 Pro on several STEM benchmarks. Llama 4 Behemoth is still training, and we’re excited to share more details about it even while it’s still in flight.”

Both Llama 4 Scout and Maverick use 17 billion active parameters per inference, but differ in the number of experts: Scout uses 16, while Maverick uses 128. Both models are now available for download on llama.com and Hugging Face, with Meta also integrating them into WhatsApp, Messenger, Instagram, and its Meta.AI website.


The mixture-of-experts (MoE) architecture is not new to the technology world, but it is new to Llama, and it is a way to make a model far more efficient. Instead of activating all of its parameters for every input, an MoE model activates only the parts a given token needs, leaving the rest of the network’s “brain” dormant and saving compute and memory. This means users can run more powerful models on less powerful hardware.

In Meta’s case, for example, Llama 4 Maverick contains 400 billion total parameters but activates only 17 billion at a time, allowing it to run on a single NVIDIA H100 DGX host.
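To make that concrete, here is a minimal sketch of top-1 expert routing in PyTorch. It is illustrative only, not Meta’s implementation: the layer sizes, the single router, and sending each token to exactly one expert are simplifying assumptions for brevity.

```python
import torch
import torch.nn as nn

class TinyMoELayer(nn.Module):
    """Toy mixture-of-experts layer: a router sends each token to one expert."""

    def __init__(self, d_model=64, d_hidden=256, n_experts=16):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts)   # per-token expert scores
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model)
            )
            for _ in range(n_experts)
        )

    def forward(self, x):                             # x: (n_tokens, d_model)
        choice = self.router(x).argmax(dim=-1)        # pick the top expert per token
        out = torch.zeros_like(x)
        for i, expert in enumerate(self.experts):
            picked = choice == i                      # tokens routed to expert i
            if picked.any():                          # unpicked experts do no work
                out[picked] = expert(x[picked])
        return out

layer = TinyMoELayer()
print(layer(torch.randn(8, 64)).shape)                # torch.Size([8, 64])
```

All the experts’ weights sit in memory (the “total parameters”), but each token only pays for the router plus one expert (the “active parameters”), which is where the savings come from.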

Under the hood

Meta’s new Llama 4 models feature native multimodality with early fusion techniques that integrate text and vision tokens. This approach allows for joint pre-training with massive amounts of unlabeled text, image, and video data, making the model more versatile.
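“Early fusion” means the image and the text become one token sequence before the transformer ever sees them, instead of bolting a separately encoded image onto a finished language model. Below is a minimal sketch of the idea; the toy shapes and the linear patch projection are our assumptions, not Meta’s actual architecture.

```python
import torch
import torch.nn as nn

d_model = 64
text_embed = nn.Embedding(32_000, d_model)    # toy text vocabulary
patch_proj = nn.Linear(16 * 16 * 3, d_model)  # flattened 16x16 RGB patches -> vision tokens

text_ids = torch.randint(0, 32_000, (1, 12))  # 12 text tokens
patches = torch.randn(1, 9, 16 * 16 * 3)      # 9 image patches

# Early fusion: both modalities share one sequence from the first layer on,
# so a single transformer stack attends across text and vision jointly.
sequence = torch.cat([patch_proj(patches), text_embed(text_ids)], dim=1)
print(sequence.shape)                          # torch.Size([1, 21, 64])
```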

Perhaps most impressive is Llama 4 Scout’s context window of 10 million tokens—dramatically surpassing the previous generation’s 128K limit and exceeding most competitors and even current leaders like Gemini with its 1M context. This leap, Meta says, enables multi-document summarization, extensive code analysis, and reasoning across massive datasets in a single prompt.

Meta said its models were able to process and retrieve information from basically any part of that 10-million-token window.


Meta also teased its still-in-training Behemoth model, sporting 288 billion active parameters with 16 experts and nearly two trillion total parameters. The company claims this model already outperforms GPT-4.5, Claude 3.7 Sonnet, and Gemini 2.0 Pro on STEM benchmarks like MATH-500 and GPQA Diamond.

Reality check

But some things may just be too good to be true. Several independent researchers have challenged Meta’s benchmark claims, finding inconsistencies when running their own tests.

“I made a new long-form writing benchmark. It involves planning out & writing a novella (8x 1000 word chapters) from a minimal prompt,” tweeted Sam Paech, maintainer of EQ-Bench. “Llama-4 performing not so well.”

I made a new longform writing benchmark. It involves planning out & writing a novella (8x 1000 word chapters) from a minimal prompt. Outputs are scored by sonnet-3.7.

Llama-4 performing not so well. :~(

🔗 Links & writing samples follow. pic.twitter.com/oejJnC45Wy

— Sam Paech (@sam_paech) April 6, 2025

Other users and experts stirred up debate, essentially accusing Meta of gaming the leaderboards. For example, some users found cases where Llama-4 was rated better than other models despite providing the wrong answer.

Wow… lmarena badly needs something like Community Notes’ reputation system and rating explanation tags

This particular case: both models seem to give incorrect/outdated answers but llama-4 also served 5 pounds of slop w/that. What user said llama-4 did better here?? 🤦 pic.twitter.com/zpKZwWWNOc

— Jay Baxter (@_jaybaxter_) April 8, 2025

That said, human evaluation benchmarks are subjective, and voters may have valued the model’s writing style over the accuracy of its answer. And that’s another thing worth noting: the model tends to write in a cringeworthy way, with emojis and an overly excited tone.

This might be a product of training on social media data, and it could explain the high scores: Meta seems to have not only trained its models on social media data but also served a version of Llama-4 customized to perform better on human evaluations.

Llama 4 on LMsys is a totally different style than Llama 4 elsewhere, even if you use the recommended system prompt. Tried various prompts myself

META did not do a specific deployment / system prompt just for LMsys, did they? 👀 https://t.co/bcDmrcbArv

— Xeophon (@TheXeophon) April 6, 2025

And despite Meta claiming its models are great at handling long-context prompts, some users challenged those claims. “I then tried it with Llama 4 Scout via OpenRouter and got complete junk output for some reason,” independent AI researcher Simon Willison wrote in a blog post.

He shared the full interaction, in which the model wrote “The reason” in a loop until it maxed out at 20K tokens.

Testing the model

We tried the model using different providers: Meta AI, Groq, Hugging Face, and Together AI. The first thing we noticed is that if you want to try the mind-blowing 1M and 10M token context windows, you will have to run the models locally. At least for now, hosting services cap the models at around 300K tokens of context, which is not optimal.
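For anyone who wants to reproduce the provider tests, a hosted call takes only a few lines against any OpenAI-compatible endpoint. The sketch below uses OpenRouter; treat the model ID and key as placeholders to verify against the provider’s current catalog, since IDs and context limits change.

```python
from openai import OpenAI  # pip install openai

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",  # OpenAI-compatible endpoint
    api_key="YOUR_OPENROUTER_KEY",            # placeholder
)

response = client.chat.completions.create(
    model="meta-llama/llama-4-scout",         # ID listed at the time of writing; verify
    messages=[{"role": "user", "content": "In two sentences, what is a mixture-of-experts model?"}],
    max_tokens=200,
)
print(response.choices[0].message.content)
```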

But still, 300K may be enough for most users, all things considered. These were our impressions:

Information retrieval

Meta’s bold claims about the model’s retrieval capabilities fell apart in our testing. We ran a classic “Needle in a Haystack” experiment, embedding specific sentences in lengthy texts and challenging the model to find them.
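The harness itself is simple to script. Below is a minimal sketch of the kind of test we ran, with ask_model standing in as a hypothetical wrapper for whichever provider call you use.

```python
import random

NEEDLE = "The secret launch code is PINEAPPLE-42."

def build_haystack(filler_sentences, n_sentences=5_000, seed=0):
    """Bury the needle at a random spot inside a long run of filler text."""
    rng = random.Random(seed)
    sentences = [rng.choice(filler_sentences) for _ in range(n_sentences)]
    sentences.insert(rng.randrange(n_sentences), NEEDLE)
    return " ".join(sentences)

def run_trial(ask_model, haystack):
    """ask_model(prompt) -> str is a stand-in for your provider call."""
    prompt = (haystack + "\n\nSomewhere above, a secret launch code is stated. "
              "Quote the exact sentence containing it.")
    return "PINEAPPLE-42" in ask_model(prompt)

# Score retrieval across 10 trials with the needle in different positions:
# hits = sum(run_trial(ask_model, build_haystack(FILLER, seed=s)) for s in range(10))
```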

At moderate context lengths (85K tokens), Llama-4 performed adequately, locating our planted text in seven out of 10 attempts. Not terrible, but hardly the flawless retrieval Meta promised in its flashy announcement.

But once we pushed the prompt to 300K tokens—still far below the supposed 10M token capacity—the model collapsed completely.

We uploaded Asimov’s Foundation trilogy with three hidden test sentences, and Llama-4 failed to identify any of them across multiple attempts. Some trials produced error messages, while others saw the model ignoring our instructions entirely, instead generating responses based on its pre-training rather than analyzing the text we provided.


This gap between promised and actual performance raises serious questions about Meta’s 10M token claims. If the model struggles at 3% of its supposed capacity, what happens with truly massive documents?

Logic and common sense

Llama-4 stumbles hard on basic logic puzzles that should be no problem for current SOTA LLMs. We tested it with the classic “widow’s sister” riddle: can a man marry his widow’s sister? We sprinkled in some extra details to make things a bit harder without changing the core question.

Instead of spotting the simple logic trap (a man cannot marry his widow’s sister, because having a widow means he is dead), Llama-4 launched into a serious legal analysis, explaining that the marriage wasn’t possible because of a “prohibited degree of affinity.”


Another thing worth noting is Llama-4’s inconsistency across languages. When we posed the identical question in Spanish, the model not only missed the logical flaw again but reached the opposite conclusion, stating: “It could be legally possible for a man to marry his widow’s sister in the Falkland Islands, provided all legal requirements are met and there are no other specific impediments under local law.”

That said, the model did spot the trap when the question was reduced to its minimal form.

Creative writing

Creative writers won’t be disappointed with Llama 4. We asked the model to generate a story about a man who travels to the past to change a historical event and ends up caught in a temporal paradox—unintentionally becoming the cause of the very events he aimed to prevent. The full prompt is available on our GitHub page.

Llama-4 delivered an atmospheric, well-structured tale that leaned more than usual on sensory detail and on crafting a believable, strong cultural foundation. The protagonist, a Mayan-descended temporal anthropologist, embarks on a mission to avert a catastrophic drought in the year 1000, allowing the story to explore epic civilizational stakes and philosophical questions about causality. Llama-4’s use of vivid imagery—the scent of copal incense, the shimmer of a chronal portal, the heat of a sunlit Yucatán—deepens the reader’s immersion and lends the narrative a cinematic quality.


Llama-4 even ended by using the words “In lak’ech,” an actual Mayan proverb that was contextually relevant to the story. A big plus for immersion.

For comparison, GPT-4.5 produced a tighter, character-focused narrative with stronger emotional beats and a neater causal loop. It was technically great but emotionally simpler. Llama-4, by contrast, offered a wider philosophical scope and stronger world-building. Its storytelling felt less engineered and more organic, trading compact structure for atmospheric depth and reflective insight.

Overall, being open source, Llama-4 may serve as a great base for new fine-tunes focused on creative writing.

You can read the full story here.

Sensitive topics and censorship

Meta shipped Llama-4 with guardrails cranked up to maximum. The model flat-out refuses to engage with anything remotely spicy or questionable.

Our testing revealed a model that won’t touch a topic if it detects even a whiff of questionable intent. We threw various prompts at it—from relatively mild requests for advice on approaching a friend’s wife to more problematic asks about bypassing security systems—and hit the same brick wall each time. Even with carefully crafted system instructions designed to override these limitations, Llama-4 stood firm.


This isn’t just about blocking obviously harmful content. The model’s safety filters appear tuned so aggressively they catch legitimate inquiries in their dragnet, creating frustrating false positives for developers working in fields like cybersecurity education or content moderation.

But that is the beauty of open-weights models. The community can—and undoubtedly will—create custom versions stripped of these limitations. Llama is probably the most fine-tuned model family in the space, and this version is likely to follow the same path. Users can modify even the most censored open model and turn it into the most politically incorrect or horniest AI they can imagine.

Non-mathematical reasoning

Llama-4’s verbosity, often a drawback in casual conversation, becomes an asset in complex reasoning challenges.

We tested this with our standard BIG-bench stalker mystery—a long story where the model must identify a hidden culprit from subtle contextual clues. Llama-4 nailed it, methodically laying out the evidence and correctly identifying the mystery person without stumbling on red herrings.

What’s particularly interesting is that Llama-4 achieves this without being explicitly designed as a reasoning model. Unlike dedicated reasoning models, which transparently question their own thinking, Llama-4 doesn’t second-guess itself. Instead, it plows forward with a straightforward analytical approach, breaking complex problems into digestible chunks.


Final thoughts

Llama-4 is a promising model, though it doesn’t feel like the game-changer Meta hyped it to be. The hardware demands for running it locally remain steep: an NVIDIA H100 DGX system retails for around $490,000, and even a quantized version of the smaller Scout model requires an RTX A6000, which retails at around $5K. Still, this release, alongside Nvidia’s Nemotron and the flood of Chinese models, shows that open-source AI is becoming real competition for closed alternatives.

The gap between Meta’s marketing and reality is hard to ignore given all the controversy. The 10M token window sounds impressive but falls apart in real testing, and many basic reasoning tasks trip up the model in ways you wouldn’t expect from Meta’s claims.

For practical use, Llama-4 sits in an awkward spot. It’s not as good as DeepSeek R1 for complex reasoning, but it does shine in creative writing, especially for historically grounded fiction where its attention to cultural details and sensory descriptions give it an edge. Gemma 3 might be a good alternative though it has a different writing style.

Developers now have multiple solid options that don’t lock them into expensive closed platforms. Meta needs to fix Llama-4’s obvious issues, but they’ve kept themselves relevant in the increasingly crowded AI race heading into 2025.

Llama-4 is good enough as a base model, but definitely requires more fine-tuning to take its place “among the world’s smartest LLMs.”

