An Idiot's Guide to Hugging Face
And why it's important to dip a toe into the developer community
Hugging Face is the self-described ‘home for all Machine Learning tasks’, a first stop for any developer trying to build something using AI or machine learning. So it’s understandably an incredibly intimidating place to land if you know nothing about coding. But it’s where most of the interesting AI developments are taking place, so here is a short guide to gingerly stepping into Hugging Face (h/t to Saurabh Bhambry, who did a similar guide on Twitter/X recently).
This is the page you should bookmark. If for no other reason than it can give you a breakdown of what constitutes Natural Language Processing models versus Computer Vision versus Audio versus Multimodal. Some categories (eg speech recognition) have thousands of models; some have only two (eg image-to-3D).
You can also browse datasets and licenses (think Meta’s not-quite open-sourced Llama2), but the Tasks page shows things a little further down the production line. If you want to go all-in and teach yourself this start to finish, then ChatGPT or equivalent can be great for walkthroughs and troubleshooting. Years ago, before GPT, I did an incredibly basic self-taught coding task and it took a few weeks. With GPT or equivalent, I think I could cut that down to a few days (still very slow and very basic, but progress!).
If you prefer video, then the Hugging Face YouTube channel also carries some relatively straightforward explainers of basic concepts.
Once you click on a model category, say text-to-image, you can browse the models sorted by the always opaque ‘trending’; you can also sort by ‘most likes’, ‘most downloads’, ‘recently created’, and ‘recently updated’. If you are just looking for a basic understanding of all of this, then ‘most downloads’ is probably the best option.
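(If you do eventually dip a toe into code, the same browsing can be done programmatically. Here is a minimal sketch using the official `huggingface_hub` Python library, which you would install separately with `pip install huggingface_hub`; the task name `"text-to-image"` matches the category used on the site, and the exact download counts returned will of course change over time.)

```python
# Sketch: list the most-downloaded text-to-image models on the Hub,
# the programmatic equivalent of sorting the website by 'most downloads'.
# Assumes `huggingface_hub` is installed and you have network access.
from huggingface_hub import list_models

# Ask the Hub for the top five text-to-image models by download count.
top_models = list_models(
    task="text-to-image",   # same category name as on the website
    sort="downloads",       # order by download count
    direction=-1,           # descending (most downloads first)
    limit=5,                # just the top five
)

for model in top_models:
    print(model.id, model.downloads)
```

This is the same data the website shows, just without the interface, which is one reason the site itself is the friendlier place to start.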
At this point you may, completely understandably, be lost. There are a lot of sections that look like this:
When you see this, just treat it like a different language. It’s cool that some people speak it, maybe one day you will learn, but for the moment it’s not something you understand and Google Translate cannot help (yet!). But if most of the site is written in a language you don’t read then what is the point of being on there?
Well, that brings us to:
Spaces
This is where working examples live. Here is AI Tube, where all videos are generated using AI:
Here is TryEmoji, where emojis are rendered into AI artwork.
And (a lot more techy, this one) here is a leaderboard for open-sourced LLMs (with an equivalent leaderboard for chatbots here).
You get the idea. There’s nothing incredible in there; most of the demos are stripped back and will perform better when given a full release. But it does showcase lots of cool, imaginative projects or very inside-baseball geekery, often from independent developers. A good place to see the AI gimmicks coming down the line that will often be repurposed by profit-driven companies in the future.
Messing around on Hugging Face for an hour will not suddenly make you understand the concepts behind artificial intelligence or machine learning, but it will give you a good sense of what’s going on in those communities, and how that will affect the average consumer. At the very least, a better sense than the one given by social media.
Small bits #1: The consistent face problem
A constant complaint over the past year with text-to-image has been the difficulty of rendering the same face consistently. I messed around with RenderNet this morning and its Facelock feature seems to be a good fix.
Small bits #2: ChatGPT, lazy at Christmas
We are all perhaps overfond of projecting human attributes onto artificial intelligence, but the AI winter break hypothesis is gaining traction thanks to the above compelling test. Tl;dr, when December comes GPT has one eye on the Christmas party and the other on getting a final few shopping bits done. And, in that spirit, there will only be two Small Bits today 🎄.