How to vibe code for free: Running Qwen3 on your Mac, using MLX

Original link: https://localforge.dev/blog/running-qwen3-macbook-mlx

Published on 1 May 2025, this article explains how to run the Qwen3 large language model locally on a Mac using MLX and integrate it with the Localforge agent framework for autonomous code generation. The process involves installing MLX and mlx-lm, then running a server on a specified port to serve a Qwen3 model downloaded from the MLX community. Localforge is then configured with two providers: one for Ollama (using Gemma3 for simpler tasks) and one for the locally hosted Qwen3 model via OpenAI API v1 compatibility, with the API URL pointing at the local server. A custom agent is created in Localforge with Qwen3 as the "Main" model and Gemma3 as the "Auxiliary" model. After saving the agent settings, the user can interact with it through the Localforge chat interface. A successful run of the LS tool demonstrates Qwen3's ability to execute commands, and a simple code generation task demonstrates its coding ability. The author encourages further experimentation with system prompts and model settings to optimize performance.

This Hacker News thread discusses using the Qwen3 large language model on Apple Silicon Macs for "vibe coding," i.e. generating code with AI and minimal human intervention. Users report impressive performance, especially from the 30-billion-parameter model running on the MLX framework. The discussion covers practical concerns such as RAM requirements (ranging from 16GB to 32GB+), token generation speed, and the models' ability to handle tool calls, file navigation, and code refactoring. Some users find the smaller models (0.6B, 4B) useful for specific tasks such as data extraction or speculative decoding. The thread also mentions alternative local LLM tools such as Ollama and LM Studio, as well as ways to combine local models with cloud services for collaborative coding workflows. Concerns are raised about output accuracy and a tendency to loop on complex coding tasks. The role of prompting and model configuration is also discussed, and it is noted that the author of Localforge is the submitter.

Original article

Published: 1 May 2025

Today I wanted to test running Qwen3's latest models locally on my Mac, and putting them in an agentic loop using Localforge.

(or how to Vibe code for free!)

Qwen3 turns out to be quite a capable model, available on Ollama:

https://ollama.com/library/qwen3

And also in the mlx-community collection on Hugging Face: https://huggingface.co/collections/mlx-community/qwen3-680ff3bcb446bdba2c45c7c4

Feel free to grab a model of your choice depending on your Mac's hardware, and let's dive in.

Here is what I did step by step:

Step 1: Install the core MLX library

pip install mlx

Step 2: Install the LLM helper library

pip install mlx-lm
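
A quick sanity check that both packages installed cleanly (just a bare import test; the printed message is all you need to see):

python -c "import mlx.core, mlx_lm; print('mlx and mlx-lm import OK')"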

Step 3: Run the model server

mlx_lm.server --model mlx-community/Qwen3-30B-A3B-8bit --trust-remote-code --port 8082

This command will both download the model and serve it (change the port to whatever you want, and be ready to download tens of gigabytes).
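
If your Mac has less RAM, the same command works with any of the smaller variants from the mlx-community collection; the 4B model name below is just an example, so check the collection page for exactly what's published:

mlx_lm.server --model mlx-community/Qwen3-4B-8bit --trust-remote-code --port 8082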

After the download is done, you should see something like:

2025-05-01 13:56:26,964 - INFO - Starting httpd at 127.0.0.1 on port 8082...

Meaning your model is ready to receive requests.
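
As a quick optional check, you can hit the server directly from another terminal; mlx_lm.server speaks the standard OpenAI chat completions protocol, so plain curl works (the prompt here is just an example):

curl http://127.0.0.1:8082/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "mlx-community/Qwen3-30B-A3B-8bit", "messages": [{"role": "user", "content": "Say hello in five words."}], "max_tokens": 64}'

A JSON response with a generated message means everything works. Time to configure it in Localforge!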

Configure Localforge

Get the latest Localforge at https://localforge.dev (either via npm install on any platform, or, if you prefer, there are DMG and ZIP files available for macOS and Windows).

Once it's running, open Settings and set it up like this:

1) Add providers to the provider list

I added two providers: one Ollama provider for a weak helper model, and another for the MLX-served Qwen3.

a) Ollama provider settings:

  • Choose name: LocalOllama
  • Choose ollama from provider types
  • No settings required
  • Important prerequisite: You need Ollama installed on your machine and already serving some model, preferably gemma3:latest (pull commands just below this list)
  • Install instructions are here: https://ollama.com/library/gemma3
  • This model handles the simple background and auxiliary interactions, such as helping the agent figure out what is going on, but not the serious work.
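
If you haven't pulled the model yet, doing it ahead of time avoids a stall on the agent's first request (standard Ollama CLI commands):

# download the auxiliary model, then confirm it shows up locally
ollama pull gemma3:latest
ollama list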

b) Qwen provider settings:

  • Choose any provider name such as qwen3:mlx:30b
  • Choose openai as the provider type, because we are going to use the OpenAI v1 API
  • For the API key, put something like "not-needed" (the local server doesn't check it)
  • For the API URL, put: http://127.0.0.1:8082/v1/ (note the port you used in the previous steps; a quick check is shown below)
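
To double-check that this URL points at the right place, recent versions of mlx_lm.server also answer a plain model-listing request; if yours doesn't, the chat completions check from Step 3 works just as well:

# should list the model the server is currently serving
curl http://127.0.0.1:8082/v1/models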

2) Create a custom agent

After you've made your providers, make a custom agent! Go to the Agents tab in Settings, click +Add Agent, and type in some name, like qwen3-agent.

Then click the pencil icon to edit your agent. This opens a huge window; in it, you care about the Main and Auxiliary cards at the top (ignore the Expert card, which can be anything or empty).

  • For Main, put in your Qwen provider, and as the model name type: mlx-community/Qwen3-30B-A3B-8bit (or whatever you downloaded from the mlx community)
  • For Auxiliary, choose your LocalOllama provider, and for the model put in gemma3:latest

You can leave the agent prompt as-is for now, although it may make sense to simplify it for Qwen. In the tools section you can unselect the browser tools to keep things simpler, though this is optional.

Using Your New Agent

Now that this is done, press Command+S to save, close the agent editor, and then close Settings.

You should land in the main chat window; at the very top there is a select box saying "Select agent". Choose your new agent (qwen3-agent).

Your agent is ready to use tools!

I typed in something simple like:

"use LS tool to show me files in this folder"

And it did!

[Screenshot: Qwen3 successfully running the LS tool through Localforge]

And here's a website created by Qwen3:

[Screenshot: A website created by Qwen3 using Localforge]

Conclusion

This may require a bit more experimenting, such as simplifying the system prompt or tinkering with MLX settings and model choices, but I think you can definitely use this to get some autonomous code generation on YOUR MAC, totally free of charge!

Happy tinkering!

