SegMoE: Segmind Mixture of Diffusion Experts

SegMoE is an exciting framework for creating Mixture-of-Experts diffusion models from scratch! SegMoE is comprehensively integrated within the Hugging Face ecosystem and comes supported in diffusers 🔥! SegMoE models follow the same architecture as Stable Diffusion. Like Mixtral 8x7b, a SegMoE model […]
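The Mixtral-style idea referenced above is that a mixture-of-experts layer routes each input through only a small top-k subset of expert sub-networks, weighted by a learned gate. A toy top-2 routing sketch in plain Python (the dimensions, random weights, and function names are illustrative, not SegMoE's actual configuration):

```python
import math
import random

random.seed(0)

NUM_EXPERTS, TOP_K, DIM = 4, 2, 8

# Each "expert" is a random linear map; the gate scores experts per input.
# Real MoE layers learn both; random weights keep this sketch self-contained.
experts = [[[random.gauss(0, 1) for _ in range(DIM)] for _ in range(DIM)]
           for _ in range(NUM_EXPERTS)]
gate_w = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]

def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]

def moe_forward(x):
    # Gate: score every expert, keep the top-k, softmax over kept scores.
    scores = [sum(w, 0.0) for w in ([gw_i * x_i for gw_i, x_i in zip(gw, x)] for gw in gate_w)]
    top = sorted(range(NUM_EXPERTS), key=lambda i: -scores[i])[:TOP_K]
    exps = [math.exp(scores[i]) for i in top]
    probs = [e / sum(exps) for e in exps]
    # Output: probability-weighted sum of only the selected experts' outputs.
    out = [0.0] * DIM
    for p, i in zip(probs, top):
        for d, y in enumerate(matvec(experts[i], x)):
            out[d] += p * y
    return out, top

y, used = moe_forward([1.0] * DIM)
print(len(y), len(used))
```

Only `TOP_K` of the `NUM_EXPERTS` experts run per input, which is why MoE models can grow total parameter count without a proportional increase in per-token compute.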

Read more

From OpenAI to Open LLMs with Messages API on Hugging Face

We are excited to introduce the Messages API to provide OpenAI compatibility with Text Generation Inference (TGI) and Inference Endpoints. Starting with version 1.4.0, TGI offers an API compatible with the OpenAI Chat Completion API. The new Messages API allows customers and users to transition seamlessly from OpenAI models to open LLMs. The API can be directly used with OpenAI’s client libraries or third-party tools, like LangChain or LlamaIndex. “The new Messages API with OpenAI compatibility makes it easy […]
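Because the Messages API mirrors the OpenAI Chat Completion request shape, an existing OpenAI client can typically be repointed at a TGI deployment by swapping the base URL. A minimal sketch of the request body that shape implies, built with the standard library only (the model name and any endpoint you would send this to are placeholders, not real deployments):

```python
import json

# Chat Completion-style payload accepted by TGI's OpenAI-compatible route
# (available from TGI v1.4.0). "tgi" is a placeholder model name here.
payload = {
    "model": "tgi",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Why are open LLMs important?"},
    ],
    "stream": False,
    "max_tokens": 256,
}

body = json.dumps(payload)
print(len(json.loads(body)["messages"]))
```

The same dictionary is what OpenAI's client libraries serialize under the hood, which is why tools like LangChain or LlamaIndex can target TGI with only a base-URL change.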

Read more

Introducing the Open Ko-LLM Leaderboard: Leading the Korean LLM Evaluation Ecosystem

In the fast-evolving landscape of Large Language Models (LLMs), building an “ecosystem” has never been more important. This trend is evident in several major developments, like Hugging Face’s democratization of NLP and Upstage’s building of a Generative AI ecosystem. Inspired by these industry milestones, we at Upstage initiated the Open Ko-LLM Leaderboard in September 2023. Our goal was to quickly develop and introduce an evaluation ecosystem for Korean LLM data, aligning with the global movement towards open and collaborative AI development. […]

Read more

Welcome Gemma – Google’s new open LLM

An update to the Gemma models was released two months after this post; see the latest versions in this collection. Gemma, a new family of state-of-the-art open LLMs, was released today by Google! It’s great to see Google reinforcing its commitment to open-source AI, and we’re excited to fully support the launch with comprehensive integration in Hugging Face. Gemma comes in two sizes: 7B parameters, for efficient deployment and development on consumer-size GPUs and TPUs, and 2B versions for CPU […]

Read more

🪆 Introduction to Matryoshka Embedding Models

In this blogpost, we will introduce you to the concept of Matryoshka Embeddings and explain why they are useful. We will discuss how these models are theoretically trained and how you can train them using Sentence Transformers. Additionally, we will provide practical guidance on how to use Matryoshka Embedding models and share a comparison between a Matryoshka embedding model and a regular embedding model. Finally, we invite you to check out our interactive demo that showcases the power of these […]
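The core trick behind Matryoshka embeddings is that the first k dimensions of a full embedding are trained to be a usable embedding on their own: you simply truncate and re-normalize. A toy sketch in plain Python (random vectors stand in for real model outputs; the function names are ours, not Sentence Transformers API):

```python
import math
import random

def truncate_embedding(vec, dim):
    """Keep the first `dim` dimensions and re-normalize to unit length,
    as done when shrinking a Matryoshka embedding."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

random.seed(0)
# A stand-in for a 768-dimensional model output, normalized to unit length.
full = truncate_embedding([random.gauss(0, 1) for _ in range(768)], 768)

# A 12x smaller embedding: same leading dimensions, re-normalized.
small = truncate_embedding(full, 64)
print(len(small))
```

With a regular embedding model this truncation degrades quality sharply; Matryoshka training front-loads the most important information into the leading dimensions so the shrunk vector remains useful for retrieval at a fraction of the storage and compute cost.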

Read more

Introducing the Red-Teaming Resistance Leaderboard

Content warning: since this blog post is about a red-teaming leaderboard (testing elicitation of harmful behavior in LLMs), some users might find the content of the related datasets or examples unsettling. LLM research is moving fast. Indeed, some might say too fast. While researchers in the field continue to rapidly expand and improve LLM performance, there is growing concern over whether these models are capable of increasingly undesired and unsafe behaviors. In recent months, there has been no […]

Read more

Fine-Tuning Gemma Models in Hugging Face

We recently announced that Gemma, the open weights language model from Google DeepMind, is available for the broader open-source community via Hugging Face. It’s available in 2 billion and 7 billion parameter sizes with pretrained and instruction-tuned flavors. It’s available on Hugging Face, supported in TGI, and easily accessible for deployment and fine-tuning in the Vertex Model Garden and Google Kubernetes Engine. The Gemma family of models also happens to be well suited for prototyping and experimentation using the free […]

Read more