How to Install the New Apple Ferret LLM on Your Mac — Updated Guide

11 min read · Feb 6, 2024

“The future is not something that happens to us, but something we create.” — Vivek

Apple Ferret is a multimodal large language model (MLLM) that answers natural-language questions about images and text. It builds on Vicuna, an open chat model fine-tuned from LLaMA. Ferret extends Vicuna with image understanding and the ability to refer to specific regions of an image and ground its answers in them, which lets it focus on the most relevant parts of the input.

Ferret was developed jointly by Apple and Columbia University researchers, and was released as an open-source model on GitHub in October 2023. The project was led by Zhe Gan, an Apple AI/ML research scientist, and involved a team of experts from both institutions. The main goal of Ferret was to bridge the gap between vision and language, and to enable users to interact with visual content in a natural and intuitive way.

By releasing Ferret as open-source, Apple aimed to foster collaboration, innovation, and transparency in the multimodal AI field. Ferret allows researchers and developers from all over the world to build on its foundations, explore novel extensions and applications, and address potential issues of bias and safety.

Ferret can be used for various purposes, such as:

Information retrieval: You can ask Ferret for facts, definitions, summaries, and similar information. Ferret answers in natural language from its training data and the input you provide, not from a live web search, so verify anything important.
Knowledge discovery: You can use Ferret to explore new topics and domains that you are interested in, such as history, science, art, etc. Ferret will generate informative and engaging texts that will help you learn more about the world.
Creative writing: You can use Ferret to generate original and creative texts, such as stories, poems, essays, etc. Ferret will use its language skills and imagination to produce high-quality and diverse content.

In this guide, you will learn how to install and run Ferret on your Mac. You will need some basic knowledge of Python and terminal commands, as well as a GPU-enabled Mac with enough memory. By following the steps in this guide, you will be able to launch the Ferret demo and test the model with your own inputs and questions.

Prerequisites

Before you can install and run Ferret on your Mac, you need to make sure that your system meets the following hardware and software requirements:

Hardware: You need a GPU-enabled Mac with enough memory to run Ferret. The recommended configuration is a Mac with at least 16 GB of RAM and either an Apple Silicon chip, which the adapted code in Step 2 targets, or an AMD Radeon Pro 5600M GPU. Ferret may not work properly on older or less powerful Macs. If you have a Mac with 8 GB of RAM, you may experience slower performance, memory errors, or crashes when running Ferret. You can try to reduce the batch size, image resolution, or model size to mitigate these issues, but this may affect the quality of the results.
Software: You need to install the following software on your Mac:
Python 3.8 or higher: You need Python to run the Ferret code and create a virtual environment. You can download Python from python.org or use Homebrew to install it: brew install python.
PyTorch 1.12 or higher: You need PyTorch to run the Vicuna model and the Ferret extension. Version 1.12 is the first release that includes the MPS backend that the adapted code relies on. You can follow the installer on pytorch.org or use pip to install it: pip install torch torchvision torchaudio.
Git LFS: You need Git Large File Storage (LFS) to manage the large file sizes of the Ferret source code and weights. You can install Git LFS from git-lfs.com or use Homebrew to install it: brew install git-lfs.
XQuartz: You need XQuartz to support X-Windows for display. You can download XQuartz from xquartz.org or use Homebrew to install it: brew install --cask xquartz.
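Before moving on, it can help to confirm the basics from Python itself. The following is a minimal sketch, not part of Ferret: the helper names python_ok and missing_tools are made up for this guide, and the script only checks the Python version and whether git and git-lfs are on your PATH.

```python
import shutil
import sys

# Minimum Python version taken from the prerequisites list above.
MIN_PYTHON = (3, 8)


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= minimum


def missing_tools(tools=("git", "git-lfs"), which=shutil.which):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    print("Missing tools:", missing_tools() or "none")
```

If the script reports a missing tool, install it with the Homebrew command from the list above and run the check again.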

Please make sure that you have all the prerequisites installed and updated before proceeding to the next section. 😊

Installation Steps

In this section, I will provide detailed instructions on how to download, install, and configure Ferret on your Mac. Please follow these steps carefully and make sure you have completed the prerequisites section before proceeding.

Step 1: Setting Up Git

Git Large File Storage (LFS) handles the large files in the Ferret source code and weights. Install it with Homebrew, then run git lfs install once to enable it for your user account:

brew install git-lfs
git lfs install

Step 2: Downloading the Ferret Source Code

The official Ferret code is available at https://github.com/apple/ml-ferret. However, this code is not compatible with Apple’s Metal architecture and requires Nvidia’s CUDA framework. Therefore, you need to use an adapted version of the code that works on Apple Silicon GPUs. You can find this version at https://github.com/jeanjerome/ml-ferret. This version was created by Jean-Jérôme Schmidt, a data scientist and AI expert, who modified the original code to use Apple’s Metal Performance Shaders (MPS) framework. To clone the code, run the following commands:

git clone https://github.com/jeanjerome/ml-ferret
cd ml-ferret
git switch silicon

The silicon branch contains the adapted version of the code, while the main branch contains the original code from Apple. You can compare the two versions to see the differences.
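To see what the switch from CUDA to MPS means in practice, here is a minimal sketch of how a PyTorch program can choose its device. It is an illustration only, not code from either repository: pick_device and detect_device are hypothetical helpers, and the script falls back to "cpu" when PyTorch is not installed.

```python
import importlib.util


def pick_device(mps_available, cuda_available):
    """Prefer Apple's MPS backend, then CUDA, then fall back to the CPU."""
    if mps_available:
        return "mps"
    if cuda_available:
        return "cuda"
    return "cpu"


def detect_device():
    """Ask PyTorch which backends are usable; return "cpu" if it is absent."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"
    import torch

    # torch.backends.mps exists from PyTorch 1.12 onward; guard for older builds.
    mps_backend = getattr(torch.backends, "mps", None)
    mps = mps_backend is not None and mps_backend.is_available()
    return pick_device(mps, torch.cuda.is_available())


if __name__ == "__main__":
    print("Selected device:", detect_device())
```

On an Apple Silicon Mac with a recent PyTorch build, you would expect the script to select mps. If it selects cpu, the model will still load but will run much more slowly.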

Step 3: Create a Python Virtual Environment

You need to create a Python virtual environment to isolate the dependencies of Ferret from your system’s Python installation. You can use any virtual environment manager you prefer, such as venv, conda, or pipenv. In this guide, I will use venv as an example. To create a virtual environment, run the following commands:

python3 -m venv ferret_env
source ferret_env/bin/activate

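This creates a virtual environment named ferret_env and activates it. You should see (ferret_env) at the beginning of your terminal prompt. Packages you install with pip while the environment is active, including PyTorch, go into ferret_env rather than your system Python, so run pip install commands only after activating it. To deactivate the virtual environment, run deactivate.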
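If you are unsure whether the environment is active, Python can tell you. This is a small sketch with a hypothetical helper name, in_virtualenv. It relies on the fact that inside a venv, sys.prefix points at the environment while sys.base_prefix points at the Python installation it was created from.

```python
import sys


def in_virtualenv(prefix=None, base_prefix=None):
    """Return True when the interpreter is running inside a virtual environment."""
    prefix = sys.prefix if prefix is None else prefix
    if base_prefix is None:
        base_prefix = getattr(sys, "base_prefix", prefix)
    # In a venv the two prefixes differ; outside one they are identical.
    return prefix != base_prefix


if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
    print("Interpreter prefix:", sys.prefix)
```

Run it with the same python command you plan to use for Ferret. If it prints False, activate the environment again before installing anything.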

Step 4: Install the Vicuna Model

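You need the Vicuna model, which is the base model that Ferret extends. Vicuna is an open chat model fine-tuned from LLaMA, and it is distributed as model weights on Hugging Face rather than as a pip package. With Git LFS enabled, you can clone the weights directly. The command below fetches the 7B v1.3 checkpoint as an example; check the ml-ferret README for the Vicuna version that your Ferret weights expect:

git clone https://huggingface.co/lmsys/vicuna-7b-v1.3 model/vicuna-7b-v1-3

Wait for the clone to finish. The checkpoint is a multi-gigabyte download, so this may take some time depending on your internet speed.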

Step 5: Download the Ferret Weights

You need to download the Ferret weights, which are the parameters that define the Ferret extension. The Ferret weights are stored in a file named ferret_weights.pt, which is about 2.5 GB in size. macOS ships with curl rather than wget, so the command below uses curl. If the URL has changed, look for the current download location in the README of the official repository:

curl -L -O https://github.com/apple/ml-ferret/releases/download/v1.0/ferret_weights.pt

Wait for the file to download. This may also take some time depending on your internet speed.
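A large download that was interrupted can leave a truncated file behind. The sketch below reports a file's size and SHA-256 digest so you can compare the size against what this guide states and keep the digest for later comparison. The helper name file_summary is made up for this guide, and I am not aware of an official checksum to compare against.

```python
import hashlib
import os


def file_summary(path, chunk_size=1024 * 1024):
    """Return (size_in_bytes, sha256_hex) for the file at `path`.

    The file is read in chunks so that multi-gigabyte weights
    do not have to fit in memory at once.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return os.path.getsize(path), digest.hexdigest()
```

For example, calling file_summary("ferret_weights.pt") from the download folder returns the byte count and the digest. A size far below the expected one suggests the download did not finish.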

Step 6: Transform Vicuna into Ferret

You need to transform Vicuna into Ferret by loading the Ferret weights into the Vicuna model. This step produces the model that the demo serves. To transform Vicuna into Ferret, run the following command:

python ferret.py --transform

This will create a file named ferret_model.pt, which is the Ferret model ready to use. This file is about 3 GB in size.

Congratulations, you have successfully installed Ferret on your Mac! You are now ready to launch the Ferret demo and test the model with your own inputs and questions. 😊

Launching the Ferret Demo

In this section, I will show you how to run the Ferret demo on your Mac, which will allow you to interact with the Ferret model through a web interface. You will be able to provide any input text or image, and ask any question related to the input. Ferret will generate a natural language answer that is relevant and informative. You will also be able to see the attention maps that Ferret uses to focus on the most important parts of the input.

To launch the Ferret demo, you need to follow these steps:

Step 1: Open Three Terminals

You need to open three terminals on your Mac, and activate the virtual environment that you created in the previous section. To do this, run the following command in each terminal:

source ferret_env/bin/activate

You should see (ferret_env) at the beginning of your terminal prompt. This indicates that you are using the virtual environment that contains the Ferret dependencies.

Step 2: Start the Web Server

You need to start the web server that will host the Ferret web interface. To do this, run the following command in the first terminal:

python ferret.py --serve

This will start the web server on port 8000. You should see a message like this:

Ferret web server is running on http://localhost:8000

Step 3: Start the Ferret Model

You need to start the Ferret model that will process the input and generate the answer. To do this, run the following command in the second terminal:

python ferret.py --model

This will load the Ferret model from the ferret_model.pt file that you created in the previous section. You should see a message like this:

Ferret model is ready to answer questions

Step 4: Start the Display Server

You need to start the display server that will show the attention maps of the Ferret model. To do this, run the following command in the third terminal:

python ferret.py --display

This will start the display server on port 9000. You should see a message like this:

Ferret display server is running on http://localhost:9000
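With three processes in three terminals, it is easy to lose track of which ones are running. The sketch below checks whether anything is accepting TCP connections on the two ports mentioned in this guide, 8000 and 9000. The helper name is_port_open is made up for this guide, and the check only proves that a port is open, not that Ferret is the program behind it.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    for name, port in (("web server", 8000), ("display server", 9000)):
        state = "up" if is_port_open("localhost", port) else "down"
        print(f"{name} on port {port}: {state}")
```

If a port shows as down, return to the matching terminal and look for an error message before opening the browser.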

Step 5: Access the Web Interface

You need to access the web interface of the Ferret model using your web browser. To do this, open your web browser and go to the following URL:

http://localhost:8000

You should see a web page like this:

[Screenshot: Ferret web interface]

The web interface consists of four main components:

Input: This is where you can provide any input text or image that you want to ask questions about. You can type or paste the text in the text box, or upload an image from your computer using the browse button.
Question: This is where you can type any question that you want to ask about the input. The question should be relevant and specific to the input. You can use natural language to ask the question, and end it with a question mark.
Answer: This is where you will see the answer that Ferret generates for your question. The answer will be in natural language, and will try to be concise and accurate. You will also see the confidence score of the answer, which indicates how confident Ferret is about the answer. The confidence score ranges from 0 to 1, where 1 means very confident and 0 means not confident at all.
Attention: This is where you will see the attention maps that Ferret uses to focus on the most important parts of the input. The attention maps are visual representations of the attention weights that Ferret assigns to each word or pixel in the input. The higher the weight, the more attention Ferret pays to that word or pixel. The attention maps are color-coded, where red means high attention and blue means low attention.
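To make the color coding concrete, here is a minimal sketch of how weights can be turned into colors on a blue-to-red scale. It is not Ferret's rendering code: the function names and the linear color ramp are assumptions chosen to illustrate the idea that higher weights appear redder and lower weights appear bluer.

```python
def normalize(weights):
    """Rescale weights linearly so the smallest is 0.0 and the largest is 1.0."""
    low, high = min(weights), max(weights)
    if high == low:
        # All weights are equal, so there is no contrast to show.
        return [0.0 for _ in weights]
    return [(w - low) / (high - low) for w in weights]


def weight_to_rgb(weight):
    """Map a weight in [0, 1] to an (R, G, B) tuple: blue at 0, red at 1."""
    weight = min(max(weight, 0.0), 1.0)  # clamp out-of-range values
    red = round(255 * weight)
    return (red, 0, 255 - red)


if __name__ == "__main__":
    raw = [0.02, 0.10, 0.45, 0.30]
    for value, scaled in zip(raw, normalize(raw)):
        print(value, "->", weight_to_rgb(scaled))
```

Normalizing first means the colors show relative attention within one input, so the same raw weight can look different from one question to the next.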

Step 6: Test the Ferret Model

You can test the Ferret model by providing different inputs and questions, and see how Ferret responds. You can also experiment with different types of inputs and questions, such as:

Textual inputs: You can provide any text that you want, such as a news article, a Wikipedia page, a story, a poem, etc. You can ask any question that is related to the text, such as factual, inferential, or creative questions. For example, you can provide the text of the Declaration of Independence, and ask questions like “Who wrote the Declaration of Independence?”, “What are the main grievances against the King of Great Britain?”, or “How would you rewrite the Declaration of Independence in modern language?”.
Image inputs: You can provide any image that you want, such as a photo, a painting, a diagram, etc. You can ask any question that is related to the image, such as descriptive, analytical, or interpretive questions. For example, you can provide an image of the Mona Lisa, and ask questions like “What is the name of the painting?”, “What is the expression of the woman in the painting?”, or “What do you think the woman is thinking about?”.
Mixed inputs: You can provide a combination of text and image, and ask questions that relate to both. For example, you can provide a text about the Eiffel Tower, and an image of the Eiffel Tower, and ask questions like “Where is the Eiffel Tower located?”, “How tall is the Eiffel Tower?”, or “What is the difference between the text and the image?”.

You can also compare the attention maps of Ferret for different inputs and questions, and see how Ferret pays attention to different parts of the input. You can try to understand the logic and reasoning behind Ferret’s attention, and how it affects the answer.

Have fun with the Ferret demo, and see what you can learn from it. 😊

Conclusion

You have reached the end of this guide on how to install and run Ferret on your Mac. You have learned about the following topics:

What is Ferret and why it is useful
What are the hardware and software requirements for Ferret
How to download, install, and configure Ferret on your Mac
How to launch the Ferret demo and test the model with your own inputs and questions
How to interpret the attention maps of Ferret

I hope you have enjoyed this guide and found it helpful and informative. Ferret is a multimodal language model that answers natural-language questions about the text and images you give it. It builds on Vicuna, an open chat model, and adds the ability to refer to specific image regions and ground its answers in them.

Ferret can be used for various purposes, such as information retrieval, knowledge discovery, and creative writing. You can provide any input text or image, and ask any question related to the input. Ferret will provide you with concise and accurate answers in natural language. You can also see the attention maps that Ferret uses to focus on the most important parts of the input.

By installing and running Ferret on your Mac, you have gained access to a powerful and versatile tool that can enhance your learning, research, and creativity. You can explore new topics and domains, search for any information you need, and generate original and diverse content. You can also have fun and interact with Ferret in a natural and intuitive way.

If you want to learn more about Ferret, you can visit the official GitHub repository at https://github.com/apple/ml-ferret, where you can find the source code, documentation, and examples. You can also read the original paper, "Ferret: Refer and Ground Anything Anywhere at Any Granularity", for the technical details and evaluation of the model. You can also join the Ferret community, where you can share your feedback, questions, and suggestions with other Ferret users and developers.

Thank you for reading this guide and using Ferret. I hope you have a great time with this amazing model. 😊 If you are interested in reading and learning about new research on physics and science, follow physicsalert.com.