Fal
This page adapts the original AI SDK documentation: Fal.
Fal AI provides a generative media platform for developers with lightning-fast inference capabilities. Their platform offers optimized performance for running diffusion models, with speeds up to 4x faster than alternatives.
The Fal provider is available in the FalProvider module. Add it to your Swift package:
```swift
// Package.swift (excerpt)
dependencies: [
    .package(url: "https://github.com/teunlao/swift-ai-sdk", from: "0.14.1")
],
targets: [
    .target(
        name: "YourTarget",
        dependencies: [
            .product(name: "SwiftAISDK", package: "swift-ai-sdk"),
            .product(name: "FalProvider", package: "swift-ai-sdk")
        ]
    )
]
```

Provider Instance
You can import the default provider instance `fal`:
```swift
import SwiftAISDK
import FalProvider

let imageModel = fal.image("fal-ai/flux/dev")
```

If you need a customized setup, use `createFal` to create a provider instance with your settings:
```swift
import FalProvider

let fal = createFal(settings: FalProviderSettings(
    apiKey: "your-api-key",               // optional, defaults to FAL_API_KEY, falling back to FAL_KEY
    baseURL: "https://fal.run",           // optional
    headers: ["X-Custom-Header": "value"] // optional
))
```

You can use the following optional settings to customize the Fal provider instance (see the sketch after this list):
- `baseURL` (String): Use a different URL prefix for API calls, e.g. to use proxy servers. The default prefix is `https://fal.run`.
- `apiKey` (String): API key that is sent using the `Authorization` header. It defaults to the `FAL_API_KEY` environment variable, falling back to `FAL_KEY`.
- `headers` ([String: String]): Custom headers to include in the requests.
- `fetch` (FetchFunction): Custom fetch implementation. You can use it as middleware to intercept requests, or to provide a custom fetch implementation, e.g. for testing.
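For example, here is a minimal sketch of a customized instance that resolves the API key from the environment explicitly and adds a custom header. It reuses the `createFal`/`FalProviderSettings` shape shown above; the explicit lookup is only illustrative, since the provider already falls back to `FAL_API_KEY`/`FAL_KEY` on its own.

```swift
import FalProvider
import Foundation

// Illustrative only: resolve the key at runtime instead of hard-coding it.
// The provider performs this FAL_API_KEY / FAL_KEY fallback itself when no key is passed.
let env = ProcessInfo.processInfo.environment
let resolvedKey = env["FAL_API_KEY"] ?? env["FAL_KEY"] ?? ""

let customFal = createFal(settings: FalProviderSettings(
    apiKey: resolvedKey,
    baseURL: "https://fal.run",
    headers: ["X-Request-Source": "docs-example"] // any extra headers you need
))

let model = customFal.image("fal-ai/flux/dev")
```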
Image Models
You can create Fal image models using the `.image()` factory method.
For more on image generation with the Swift AI SDK see Image Generation.
Basic Usage
```swift
import SwiftAISDK
import FalProvider
import Foundation

let result = try await generateImage(
    model: fal.image("fal-ai/flux/dev"),
    prompt: "A serene mountain landscape at sunset"
)

try result.image.data.write(to: URL(fileURLWithPath: "image.png"))
```

Fal image models may return additional information for the images and the request.
Here is an example of accessing the provider-specific properties that may be set for each image:

```swift
if let first = result.providerMetadata["fal"]?.images.first {
    print(first) // JSONValue (provider-specific)
}
```

Model Capabilities
Fal offers many models optimized for different use cases. Here are a few popular examples. For a full list of models, see the Fal AI Search Page.
| Model | Description |
|---|---|
| `fal-ai/flux/dev` | FLUX.1 [dev] model for high-quality image generation |
| `fal-ai/flux-pro/kontext` | FLUX.1 Kontext [pro] handles both text and reference images as inputs, enabling targeted edits and complex transformations |
| `fal-ai/flux-pro/kontext/max` | FLUX.1 Kontext [max] with improved prompt adherence and typography generation |
| `fal-ai/flux-lora` | Super fast endpoint for FLUX.1 with LoRA support |
| `fal-ai/ideogram/character` | Generate consistent character appearances across multiple images. Maintains facial features, proportions, and distinctive traits |
| `fal-ai/qwen-image` | Qwen-Image foundation model with significant advances in complex text rendering and precise image editing |
| `fal-ai/omnigen-v2` | Unified image generation model for image editing, personalized image generation, virtual try-on, multi-person generation, and more |
| `fal-ai/bytedance/dreamina/v3.1/text-to-image` | Dreamina showcases superior picture effects with improvements in aesthetics, precise and diverse styles, and rich details |
| `fal-ai/recraft/v3/text-to-image` | SOTA in image generation with vector art and brand style capabilities |
| `fal-ai/wan/v2.2-a14b/text-to-image` | High-resolution, photorealistic images with fine-grained detail |
Fal models support the following aspect ratios (see the usage sketch after this list):
- 1:1 (square HD)
- 16:9 (landscape)
- 9:16 (portrait)
- 4:3 (landscape)
- 3:4 (portrait)
- 16:10 (1280x800)
- 10:16 (800x1280)
- 21:9 (2560x1080)
- 9:21 (1080x2560)
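As an illustrative sketch only: assuming the Swift port mirrors the upstream AI SDK's `aspectRatio` parameter on `generateImage` (this parameter is an assumption here; check the Image Generation guide for the exact signature), an aspect ratio could be requested like this:

```swift
// Hypothetical usage: `aspectRatio` is assumed to mirror the upstream AI SDK parameter.
let landscape = try await generateImage(
    model: fal.image("fal-ai/flux/dev"),
    prompt: "A serene mountain landscape at sunset",
    aspectRatio: "16:9" // one of the ratios listed above
)
```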
Key features of Fal models include:
- Up to 4x faster inference speeds compared to alternatives
- Optimized by the Fal Inference Engine™
- Support for real-time infrastructure
- Cost-effective scaling with pay-per-use pricing
- LoRA training capabilities for model personalization
Modify Image
Transform existing images using text prompts.
```swift
import SwiftAISDK
import FalProvider

let result = try await generateImage(
    model: fal.image("fal-ai/flux-pro/kontext/max"),
    prompt: .imageEditing(
        images: [
            .string("https://v3.fal.media/files/rabbit/rmgBxhwGYb2d3pl3x9sKf_output.png")
        ],
        text: "Put a donut next to the flour."
    )
)
```

Images can also be passed as raw `Data` or a base64-encoded string.
A mask can be passed as well:
```swift
let imageData = Data() // your image bytes
let maskData = Data()  // your mask bytes

let result = try await generateImage(
    model: fal.image("fal-ai/flux-pro/kontext/max"),
    prompt: .imageEditing(
        images: [.data(imageData)],
        text: "Put a donut next to the flour.",
        mask: .data(maskData)
    )
)
```

Provider Options
Fal image models support flexible provider options through `providerOptions["fal"]`. You can pass any parameters supported by the specific Fal model’s API. Common options include (see the sketch after this list):
- `imageUrl` - Reference image URL for image-to-image generation (deprecated, use `prompt.images` instead)
- `strength` - Controls how much the output differs from the input image
- `guidanceScale` - Controls adherence to the prompt (range: 1-20)
- `numInferenceSteps` - Number of denoising steps (range: 1-50)
- `enableSafetyChecker` - Enable/disable safety filtering
- `outputFormat` - Output format: ‘jpeg’ or ‘png’
- `syncMode` - Wait for completion before returning the response
- `acceleration` - Speed of generation: ‘none’, ‘regular’, or ‘high’
- `safetyTolerance` - Content safety filtering level (1-6, where 1 is strictest)
- `useMultipleImages` - When true, converts multiple input images to an `image_urls` array for models that support multiple images (e.g., `fal-ai/flux-2/edit`)
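A hedged sketch of passing a few of these options, assuming `generateImage` accepts the same `providerOptions` argument shown for `transcribe` and `experimental_generateVideo` below; the values are illustrative:

```swift
// Illustrative values only; availability of each option depends on the model.
let tuned = try await generateImage(
    model: fal.image("fal-ai/flux/dev"),
    prompt: "A serene mountain landscape at sunset",
    providerOptions: ["fal": [
        "guidanceScale": 7.5,      // prompt adherence (1-20)
        "numInferenceSteps": 28,   // denoising steps (1-50)
        "outputFormat": "png",
        "enableSafetyChecker": true
    ]]
)
```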
Warning (Deprecation Notice): snake_case parameter names (e.g., `image_url`, `guidance_scale`) are deprecated and will be removed in a future version. Please use camelCase names (e.g., `imageUrl`, `guidanceScale`) instead.
Refer to the Fal AI model documentation for model-specific parameters.
Advanced Features
Fal’s platform offers several advanced capabilities:
- Private Model Inference: Run your own diffusion transformer models with up to 50% faster inference
- LoRA Training: Train and personalize models in under 5 minutes
- Real-time Infrastructure: Enable new user experiences with fast inference times
- Scalable Architecture: Scale to thousands of GPUs when needed
For more details about Fal’s capabilities and features, visit the Fal AI documentation.
Transcription Models
You can create models that call the Fal transcription API using the `.transcription()` factory method.
The first argument is the model id without the `fal-ai/` prefix, e.g. `wizper`.
```swift
let model = fal.transcription("wizper")
```

You can also pass additional provider-specific options using the `providerOptions` argument. For example, supplying the `batchSize` option will increase the number of audio chunks processed in parallel.
```swift
import SwiftAISDK
import FalProvider
import Foundation

let audioData = try Data(contentsOf: URL(fileURLWithPath: "audio.mp3"))

let result = try await transcribe(
    model: fal.transcription("wizper"),
    audio: .data(audioData),
    providerOptions: ["fal": ["batchSize": 10]]
)
```

The following provider options are available (the sketch after this list combines several of them):
- `language` (String): Language of the audio file. Defaults to ‘en’. If set to nil, the language will be automatically detected. Accepts ISO language codes like ‘en’, ‘fr’, ‘zh’, etc. Optional.
- `diarize` (Bool): Whether to diarize the audio file (identify different speakers). Defaults to true. Optional.
- `chunkLevel` (String): Level of the chunks to return. Either ‘segment’ or ‘word’. Default value: “segment”. Optional.
- `version` (String): Version of the model to use. All models are Whisper large variants. Default value: “3”. Optional.
- `batchSize` (Int): Batch size for processing. Default value: 64. Optional.
- `numSpeakers` (Int): Number of speakers in the audio file. If not provided, the number of speakers will be automatically detected. Optional.
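For example, a sketch combining several of these options in one `transcribe` call (same call shape as above; the values are illustrative):

```swift
// Illustrative values only.
let detailed = try await transcribe(
    model: fal.transcription("wizper"),
    audio: .data(audioData),
    providerOptions: ["fal": [
        "language": "en",     // or omit to auto-detect
        "diarize": true,      // identify different speakers
        "chunkLevel": "word", // 'segment' or 'word'
        "numSpeakers": 2
    ]]
)
```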
Model Capabilities
| Model | Transcription | Duration | Segments | Language |
|---|---|---|---|---|
| `whisper` | ✓ | ✓ | ✓ | ✓ |
| `wizper` | ✓ | ✓ | ✓ | ✓ |
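Assuming the transcription result mirrors these capabilities with fields like those in the upstream AI SDK (`text`, `durationInSeconds`, `segments`, `language` are assumed names here; check the SDK's transcription result type), reading it might look like:

```swift
// Field names below are assumptions mirroring the upstream AI SDK result shape.
print(result.text)              // full transcription
print(result.durationInSeconds) // audio duration
print(result.language)          // detected or supplied language
for segment in result.segments { // per-segment timing and text
    print(segment)
}
```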
Speech Models
You can create models that call Fal text-to-speech endpoints using the `.speech()` factory method.
Basic Usage
```swift
import SwiftAISDK
import FalProvider
import Foundation

let result = try await experimental_generateSpeech(
    model: fal.speech("fal-ai/minimax/speech-02-hd"),
    text: "Hello from the Swift AI SDK!"
)

try result.audio.data.write(to: URL(fileURLWithPath: "speech.\(result.audio.format)"))
```

Model Capabilities
| Model | Description |
|---|---|
| `fal-ai/minimax/voice-clone` | Clone a voice from a sample audio and generate speech from text prompts |
| `fal-ai/minimax/voice-design` | Design a personalized voice from a text description and generate speech from text prompts |
| `fal-ai/dia-tts/voice-clone` | Clone dialog voices from a sample audio and generate dialogs from text prompts |
| `fal-ai/minimax/speech-02-hd` | Generate speech from text prompts and different voices |
| `fal-ai/minimax/speech-02-turbo` | Generate fast speech from text prompts and different voices |
| `fal-ai/dia-tts` | Directly generates realistic dialogue from transcripts with audio conditioning for emotion control. Produces natural nonverbals like laughter and throat clearing |
| `resemble-ai/chatterboxhd/text-to-speech` | Generate expressive, natural speech with Resemble AI’s Chatterbox. Features unique emotion control, instant voice cloning from short audio, and built-in watermarking |
Provider Options
Pass provider-specific options via `providerOptions["fal"]` depending on the model:
- `voice_setting` (object):
  - `voice_id` (string): predefined voice ID
  - `speed` (number): 0.5–2.0
  - `vol` (number): 0–10
  - `pitch` (number): -12 to 12
  - `emotion` (enum): happy | sad | angry | fearful | disgusted | surprised | neutral
  - `english_normalization` (boolean)
- `audio_setting` (object): Audio configuration settings specific to the model.
- `language_boost` (enum): Chinese | Chinese,Yue | English | Arabic | Russian | Spanish | French | Portuguese | German | Turkish | Dutch | Ukrainian | Vietnamese | Indonesian | Japanese | Italian | Korean | Thai | Polish | Romanian | Greek | Czech | Finnish | Hindi | auto
- `pronunciation_dict` (object): Custom pronunciation dictionary for specific words.

Model-specific parameters (e.g., `audio_url`, `prompt`, `preview_text`, `ref_audio_url`, `ref_text`) can be passed directly under `providerOptions["fal"]` and will be forwarded to the Fal API.
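A hedged sketch combining a nested `voice_setting` with the `experimental_generateSpeech` call shown above, assuming it accepts the same `providerOptions` argument as the other model types; the values are illustrative:

```swift
// Illustrative values only; supported settings depend on the chosen model.
let speech = try await experimental_generateSpeech(
    model: fal.speech("fal-ai/minimax/speech-02-hd"),
    text: "Hello from the Swift AI SDK!",
    providerOptions: ["fal": [
        "voice_setting": [
            "speed": 1.1,
            "emotion": "happy"
        ],
        "language_boost": "English"
    ]]
)
```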
Video Models
You can create models that call Fal video generation endpoints using the `.video()` factory method.
For more on video generation with the Swift AI SDK see Video Generation.
Basic Usage
```swift
import SwiftAISDK
import FalProvider
import Foundation

let result = try await experimental_generateVideo(
    model: fal.video("luma-dream-machine/ray-2"),
    prompt: "A cat walking on a treadmill",
    providerOptions: ["fal": ["resolution": "1080p"]]
)

try result.video.data.write(to: URL(fileURLWithPath: "video.mp4"))
```

Model Capabilities
Fal supports multiple video model IDs. Here are a few popular examples:
| Model | Description |
|---|---|
| `luma-dream-machine/ray-2` | High-quality text-to-video model (Ray 2) |
| `hunyuan-video` | Image-to-video capable model (supports an input image) |
| `minimax-video` | Text-to-video model optimized for speed |
Provider Options
Fal video models support additional settings via `providerOptions["fal"]` (see the sketch after this list):
- `resolution` (String): e.g. “720p”, “1080p” (Fal-specific; recommended over the top-level `resolution` parameter).
- `loop` (Bool): whether the generated video should loop seamlessly.
- `motionStrength` (Double): 0…1, controls motion intensity.
- `negativePrompt` (String): negative prompt.
- `promptOptimizer` (Bool): enable Fal prompt optimization.
- `pollIntervalMs` (Int): polling interval for queue status checks.
- `pollTimeoutMs` (Int): polling timeout in milliseconds.
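As a sketch, several of these options could be combined in one call (same `experimental_generateVideo` shape as above; the values are illustrative):

```swift
// Illustrative values only.
let video = try await experimental_generateVideo(
    model: fal.video("luma-dream-machine/ray-2"),
    prompt: "A cat walking on a treadmill",
    providerOptions: ["fal": [
        "resolution": "1080p",
        "loop": true,
        "motionStrength": 0.7,
        "negativePrompt": "blurry, low quality",
        "pollTimeoutMs": 300_000 // give long renders up to 5 minutes
    ]]
)
```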