☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 2 months ago · How ‘Embeddings’ Encode What Words Mean · www.quantamagazine.org
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 3 months ago · New AI model “learns” how to simulate Super Mario Bros. from video footage · arstechnica.com
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 3 months ago · Reflection 70B holds its own against even the top closed-source models (Claude 3.5 Sonnet, GPT-4o) · huggingface.co
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 3 months ago · It’s Not Intelligent If It Always Halts: A Critical Perspective on Current Approaches to AGI · www.lifeiscomputation.com
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 3 months ago · The Difference Between Speaking and Thinking · www.theatlantic.com
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 3 months ago · Diffusion Models Are Real-Time Game Engines · gamengen.github.io
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 3 months ago · Liger Kernel is a collection of Triton kernels designed specifically for LLM training. It can effectively increase multi-GPU training throughput by 20% and reduce memory usage by 60%. · github.com
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 3 months ago · Transformer Explainer · poloclub.github.io
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 3 months ago · Alibaba claims no. 1 spot in AI math models with Qwen2-Math · venturebeat.com
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 4 months ago · New Open-Source AI Image Generator Beats Midjourney, SD3 and Auraflow · decrypt.co
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 4 months ago · AI models collapse when trained on recursively generated data · www.nature.com
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 5 months ago · RouteLLM: An Open-Source Framework for Cost-Effective LLM Routing · lmsys.org
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 5 months ago · Alibaba's Qwen LLM model leading open source rankings · huggingface.co
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 5 months ago (edited) · By using the same techniques Google used to solve Go (MCTS and backprop), Llama8B gets 96.7% on math benchmark GSM8K. That’s better than GPT-4, Claude and Gemini, with 200x fewer parameters! · arxiv.org
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 5 months ago · Mixture of Agents (MoA) leverages several open-source LLM agents to achieve a score of 65.1% on AlpacaEval 2.0 · www.together.ai
ylai@lemmy.ml · English · 5 months ago · From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate · huggingface.co
keepthepace@slrpnk.net · 6 months ago · Torrent tracker for open models · aitracker.art
wargreymon@sh.itjust.works · 6 months ago · Can gpt generate a gpt model?
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 6 months ago · Sakuga-42M Dataset: Scaling Up Cartoon Research · arxiv.org
☆ Yσɠƚԋσʂ ☆@lemmy.ml · English · 7 months ago · How AI 'Understands' Images (CLIP) · www.youtube.com