Overview
MLX is Apple’s machine learning framework optimized for Apple Silicon. Stenox uses MLX to run small, efficient language models locally on your Mac for completely private text enhancement.

Key Benefits:
- 🔒 100% Private - Text never leaves your Mac
- 🌐 Offline - Works without internet connection
- 💰 Free - No API keys, no usage costs
- 🚀 Optimized - Built for Apple Silicon
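Under the hood, MLX-format models run with only a few lines of code. As a rough illustration (not Stenox’s internal implementation), here is a minimal sketch using the open-source mlx-lm Python package; the model ID and prompt are examples:

```python
# Illustrative sketch only: Stenox handles this internally.
# Assumes `pip install mlx-lm` on an Apple Silicon Mac; the model ID below
# is an example quantized model from the mlx-community Hugging Face org.
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen2.5-1.5B-Instruct-4bit")

raw = "me and sarah was going to store"
messages = [
    {"role": "user",
     "content": f"Fix grammar, capitalization, and punctuation only: {raw}"},
]
prompt = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, tokenize=False
)

# Runs entirely on-device via Metal; nothing leaves the Mac.
print(generate(model, tokenizer, prompt=prompt, max_tokens=64))
```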
Privacy & Security
MLX processes everything locally on your Mac:
- Transcribed text is enhanced on-device
- No data sent to cloud services
- No internet connection required
- No API keys or accounts needed
- Perfect for sensitive or confidential content
No API Key Required
Unlike cloud providers, MLX requires no setup:
- Select MLX in Stenox Settings
- Download your preferred model
- Start enhancing immediately
Available Models
MLX offers several small language models optimized for Apple Silicon:

| Model | Size | Speed | Quality | Recommended For |
|---|---|---|---|---|
| Qwen 2.5 1.5B | ~1 GB | Fastest | Good | Quick enhancement, limited RAM |
| Gemma 3n 2B | ~1.2 GB | Fast | Excellent | Best accuracy, recommended |
| Phi-3 Mini | ~2.4 GB | Medium | Very Good | Balanced quality and speed |
| Gemma 3n 4B | ~2.5 GB | Slower | Very Good | Higher quality, more RAM |
| Qwen 3 4B | ~2.5 GB | Slower | Good | Alternative to Gemma 3n 4B |
Recommended for beginners: Gemma 3n 2B - Best accuracy with fast processing and reasonable size.
Model Selection Guide
Qwen 2.5 1.5B - Fastest and smallest
Best for:
- Quick enhancement (2-3 seconds)
- 8GB RAM Macs
- Everyday grammar correction
- Speed over maximum quality
Gemma 3n 2B - Best accuracy (Recommended)
Best for:
- Best accuracy for its size
- 8GB+ RAM Macs
- Professional writing
- Best balance of speed, size, and quality
Phi-3 Mini - Balanced quality
Best for:
- Higher quality enhancement
- 12GB+ RAM Macs
- More complex edits
- Good balance of speed and quality
Gemma 3n 4B / Qwen 3 4B - Larger models
Best for:
- Maximum local quality
- 16GB+ RAM Macs
- Critical writing
- Quality over speed
Setup Instructions
1. Open Stenox Settings: click the Stenox icon in your menu bar and select Settings.
2. Navigate to the Models tab in the Settings window.
3. Select MLX: under AI Enhancement Provider, choose MLX from the dropdown.
4. Choose a model: select one of the following:
- Gemma 3n 2B (recommended - best accuracy)
- Qwen 2.5 1.5B (fastest, smallest)
- Phi-3 Mini (balanced quality)
- Gemma 3n 4B or Qwen 3 4B (maximum quality)
5. Download the model: click Download and wait for the download to finish.

Download sizes:
- Qwen 2.5 1.5B: ~1 GB
- Gemma 3n 2B: ~1.2 GB
- Phi-3 Mini: ~2.4 GB
- Gemma 3n 4B / Qwen 3 4B: ~2.5 GB
Models are stored in ~/stenox-models/mlx/ by default.

6. Start enhancing: once the download completes, your transcriptions will be enhanced automatically.
Performance
MLX performance depends on your Mac’s hardware and chosen model.

Apple Silicon Macs
Excellent performance:
- Qwen 2.5 1.5B: 2-3 seconds per paragraph
- Gemma 3n 2B: 3-4 seconds per paragraph
- Phi-3 Mini: 5-7 seconds per paragraph

MLX takes advantage of:
- Metal GPU acceleration
- Apple Neural Engine (ANE)
- Unified memory architecture
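To get a feel for these numbers on your own hardware, you can time a single enhancement with the open-source mlx-lm package. This is a rough, illustrative sketch (example model ID; Stenox’s internal pipeline may differ):

```python
# Rough, illustrative benchmark using mlx-lm; not Stenox's internal pipeline.
import time
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/Qwen2.5-1.5B-Instruct-4bit")  # example model

paragraph = "hello how are you doing today its nice to meet you"
prompt = f"Fix grammar, capitalization, and punctuation only: {paragraph}"

# Warm-up run; the first call is typically slower than steady state.
generate(model, tokenizer, prompt=prompt, max_tokens=64)

start = time.perf_counter()
generate(model, tokenizer, prompt=prompt, max_tokens=64)
print(f"Enhancement took {time.perf_counter() - start:.1f} s")
```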
Intel Macs
Not supported.

Apple Silicon Required
MLX models require Apple Silicon (M1/M2/M3/M4). Intel Mac users should use cloud providers like Gemini or Groq.

What MLX Enhancement Does
Grammar Correction
- Fixes grammatical errors
- Subject-verb agreement
- Tense corrections
- Article usage (a, an, the)
- Before: “me and sarah was going to store”
- After: “Sarah and I were going to the store.”
Capitalization
- Sentence beginnings
- Proper nouns (names, places)
- Acronyms and abbreviations
- Before: “john lives in new york city and works for ibm”
- After: “John lives in New York City and works for IBM.”
Punctuation
- Adds missing commas, periods
- Fixes run-on sentences
- Improves readability
- Before: “hello how are you doing today its nice to meet you”
- After: “Hello, how are you doing today? It’s nice to meet you.”
Light Formatting
- Number formatting (spelled out → digits)
- Date and time formatting
- Basic list formatting
MLX models provide basic to good enhancement. For maximum quality, consider cloud providers like Google Gemini.
When to Use MLX
Privacy is critical
Healthcare, legal, financial, or sensitive content.
Working offline
Airplanes, remote locations, or unreliable internet.
No API costs
No per-use charges. Processing happens on your Mac.
100% local setup
Combine with WhisperKit for completely offline dictation.
When to Use Cloud Instead
Consider cloud providers if you need:
- Faster processing - Cloud is 2-3x faster (< 1 second vs 2-5 seconds)
- Better quality - Gemini 2.5 Flash and Groq LLMs often outperform small MLX models
- Intel Mac - MLX requires Apple Silicon
- Complex edits - Tone adjustment, style changes, formatting
Custom Prompts
Configure how MLX enhances your text in Profile settings:
- Default
- Professional
- Minimal
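The exact built-in prompt text lives inside Stenox. Purely as an illustration of what a Minimal-style instruction might look like (hypothetical example, not the shipped prompt):

```python
# Hypothetical example of a Minimal-style custom prompt; the actual
# built-in prompts are defined inside Stenox and may differ.
MINIMAL_PROMPT = (
    "Correct grammar, capitalization, and punctuation in the transcription. "
    "Do not change wording, tone, or meaning. Return only the corrected text."
)
```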
Troubleshooting
Model download fails
- Check internet connection
- Ensure sufficient disk space (~2-4 GB per model)
- Try downloading again
- Check ~/stenox-models/mlx/ for partial downloads and delete them
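If you are unsure whether a download finished, you can list each model folder and its size under the default location (the directory layout shown is an assumption and may differ from Stenox’s actual layout):

```python
# List each downloaded model folder and its approximate size under the
# default Stenox model directory. Unusually small folders are likely
# partial downloads and can be deleted in Finder before retrying.
from pathlib import Path

models_dir = Path.home() / "stenox-models" / "mlx"
for folder in sorted(p for p in models_dir.iterdir() if p.is_dir()):
    size_gb = sum(f.stat().st_size for f in folder.rglob("*") if f.is_file()) / 1e9
    print(f"{folder.name}: {size_gb:.2f} GB")
```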
Enhancement is very slow (> 10 seconds)
- Try a smaller model (Qwen 2.5 1.5B or Gemma 3n 2B instead of larger models)
- Close other intensive applications
- Check Activity Monitor for available RAM
- Your Mac may not have enough RAM for larger models
MLX option not available / grayed out
- You’re on an Intel Mac (MLX requires Apple Silicon)
- Use Google Gemini or Groq LLMs instead
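To confirm which chip your Mac has, check About This Mac, or run this quick check (a native Python build on Apple Silicon reports arm64; under Rosetta it may report x86_64):

```python
import platform

# 'arm64' means Apple Silicon; 'x86_64' means an Intel Mac (or Rosetta).
print(platform.machine())
```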
Enhancement quality is poor
- Try Gemma 3n 2B (best accuracy) or a larger model
- Adjust custom prompt in Profile settings
- For maximum quality, use cloud providers (Gemini)
- Add specific formatting instructions to your custom prompt
App crashes or runs out of memory
- Your model is too large for available RAM
- Use Qwen 2.5 1.5B or Gemma 3n 2B on 8GB Macs
- Upgrade to 16GB+ RAM for larger models
- Close other applications to free memory
Model Comparison
| Model | RAM Required | Speed | Quality | Best Use Case |
|---|---|---|---|---|
| Qwen 2.5 1.5B | 8GB+ | Fastest | Good | Quick tasks, 8GB Macs |
| Gemma 3n 2B | 8GB+ | Fast | Excellent | Recommended for most users |
| Phi-3 Mini | 12GB+ | Medium | Very Good | Balanced quality and speed |
| Gemma 3n 4B | 16GB+ | Slower | Very Good | Higher quality, 16GB Macs |
| Qwen 3 4B | 16GB+ | Slower | Good | Alternative larger model |
Next Steps
WhisperKit (Local)
Combine with WhisperKit for 100% private, offline dictation.
Create Profiles
Different providers for different scenarios.
Privacy-First Setup
Complete guide to local-only Stenox configuration.
Cloud Enhancement
Try cloud enhancement for comparison.
Perfect for privacy: MLX + WhisperKit = No cloud providers, no API keys, no internet required, complete privacy.

