--- library_name: mlx license: apache-2.0 license_link: https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Instruct/blob/main/LICENSE pipeline_tag: text-generation tags: - mlx base_model: Qwen/Qwen3-Next-80B-A3B-Instruct --- # Qwen3-Next-80B-A3B-Instruct-qx64-mlx Test prompt: ```bash mlx_lm.generate --model release/Qwen3-Next-80B-A3B-Instruct-qx64-mlx --max-tokens 20000 \ --prompt "If you were to write an Agnostic Agent as a CLI for a user to interact with Ollama API, and want to use FP, what language would you prefer" ``` If I were to write an **Agnostic Agent** as a CLI to interact with the **Ollama API** using **Functional Programming (FP)** principles, I would prefer: > **Haskell** --- ### ✅ Why Haskell? #### 1. **Pure Functional Programming First-Class** Haskell enforces immutability, pure functions, and explicit side-effect handling via `IO` and monads — perfect for modeling an agent that: - Receives user input (side effect) - Makes HTTP requests to Ollama (side effect) - Processes responses (pure logic) - Maintains state (e.g., conversation history) via pure state transformers (`StateT`, `ReaderT`) This aligns with the “Agnostic Agent” ideal: logic is decoupled from I/O, making it testable, composable, and reusable. #### 2. **Strong Type System + Algebraic Data Types (ADTs)** You can model the Ollama API interactions with precise types: ```haskell data Model = Model { name :: String, description :: Maybe String } data Request = Request { model :: Model , prompt :: String , stream :: Bool , options :: Maybe Options } data Response = Response { completion :: String , done :: Bool , context :: [Int] } data Error = ApiError String | JsonParseError String | TimeoutError ``` This makes invalid states unrepresentable — a core FP strength. #### 3. **Excellent HTTP & JSON Libraries** - [`http-client`](https://hackage.haskell.org/package/http-client) — robust, streaming, async-capable - [`aeson`](https://hackage.haskell.org/package/aeson) — fast, composable JSON encoding/decoding - [`swagger`](https://hackage.haskell.org/package/swagger) (optional) — if you want to generate clients from OpenAPI specs You can write clean, type-safe wrappers around Ollama’s `/api/generate`, `/api/tags`, etc. #### 4. **CLI Tooling: `optparse-applicative`** A mature, composable, FP-style CLI parser: ```haskell main :: IO () main = do opts <- execParser optsParser result <- runAgent opts either (putStrLn . show) putStrLn result ``` No mutable state. All logic flows through pure functions. #### 5. **Agnostic by Design** - The core agent logic (`runAgent :: AgentConfig -> IO AgentResult`) is pure and testable. - I/O is isolated at the edges (CLI input, HTTP calls). - You could swap Ollama for Llama.cpp, vLLM, or OpenAI with a different `Backend` module — no logic changes. #### 6. **Concurrency & Streaming** Ollama supports streaming responses (`stream: true`). Haskell’s lazy evaluation and `conduit`/`pipes` libraries make streaming response processing elegant and memory-safe. #### 7. **No Runtime Overhead / Native Binaries** `stack build --executable` produces a single static binary — perfect for CLI tools. --- ### 🆚 Alternatives (and why they’re secondary) | Language | Pros | Cons | |---------|------|------| | **Rust** | Fast, safe, great CLI tools (`clap`) | Not purely functional; FP possible but not enforced | | **Elm** | Pure FP, great for UI | No CLI/HTTP support; not suited for backend/CLI | | **PureScript** | Haskell-like, compiles to JS | Weak CLI tooling; Ollama API needs HTTP, not browser | | **F#** | Great FP + .NET | Less mature for CLI/HTTP; ecosystem smaller than Haskell | | **Scala** | FP + JVM | Heavy, slow startup, overkill for CLI | | **Clojure** | Functional, Lisp | Dynamic typing, less type safety; harder to reason about API contracts | --- ### 🧠 Example Flow (Haskell-style) ```haskell main :: IO () main = do input <- getLine let request = buildRequest "llama3" input response <- sendToOllama request putStrLn $ formatResponse response ``` - `buildRequest` — pure function - `sendToOllama` — `IO Response` - `formatResponse` — pure function All logic is composable. You can unit test `buildRequest` and `formatResponse` without touching the network. --- ### ✅ Conclusion > **Haskell** is the ideal language to build an **Agnostic Agent CLI** for Ollama using **Functional Programming** because it: > - Enforces purity and immutability > - Provides strong typing for API contracts > - Offers excellent tooling for HTTP, JSON, and CLI > - Enables true separation of concerns > - Produces fast, reliable, testable binaries It turns the “Agnostic Agent” from a buzzword into a **practically elegant, mathematically sound system**. If you want to go even further: add **ZIO**-style effect systems via `freer-simple` or `extensible-effects` — but even vanilla Haskell is more than sufficient. ```bash Prompt: 42 tokens, 163.629 tokens-per-sec Generation: 1125 tokens, 54.763 tokens-per-sec Peak memory: 53.458 GB ``` This model [Qwen3-Next-80B-A3B-Instruct-qx64-mlx](https://huggingface.co/Qwen3-Next-80B-A3B-Instruct-qx64-mlx) was converted to MLX format from [Qwen/Qwen3-Next-80B-A3B-Instruct](https://huggingface.co/Qwen/Qwen3-Next-80B-A3B-Instruct) using mlx-lm version **0.27.1**. ## Use with mlx ```bash pip install mlx-lm ``` ```python from mlx_lm import load, generate model, tokenizer = load("Qwen3-Next-80B-A3B-Instruct-qx64-mlx") prompt = "hello" if tokenizer.chat_template is not None: messages = [{"role": "user", "content": prompt}] prompt = tokenizer.apply_chat_template( messages, add_generation_prompt=True ) response = generate(model, tokenizer, prompt=prompt, verbose=True) ```