Generate captivating and unique art from conversations


Phrame generates captivating and unique art by listening to conversations around it, transforming spoken words and emotions into visually stunning masterpieces. Unleash your creativity and transform the soundscape around you.


Phrame relies on the SpeechRecognition interface of the Web Speech API to transform audio into text. The text is sent to OpenAI, which produces a condensed summary. The summary is then passed to each configured generative AI image service, and the resulting images are saved.


If you would like to make a donation to support development, please use GitHub Sponsors.

Minimum Requirements

Phrame can be used without a microphone and any modern browser will work. However, if you would like to use speech recognition, you will need a compatible browser.


Supported Architecture

  • amd64
  • arm64
  • arm/v7

Supported AIs

Voice Commands

Interact with Phrame by using the following voice commands.

Command          Action
Hey Phrame       Wake word to generate images on demand
Next Image       Advance to next image
Previous Image   Advance to previous image
Last Image       Advance to previous image


Docker Compose

version: '3.9'

services:
  phrame:
    container_name: phrame
    image: jakowenko/phrame
    restart: unless-stopped
    volumes:
      - phrame:/.storage
    ports:
      - 3000:3000

volumes:
  phrame:


Configurable options are saved to /.storage/config/config.yml and are editable via the UI at http://localhost:3000/config.

Note: Default values do not need to be specified in configuration unless they need to be overwritten.
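The note above implies that values in config.yml are layered on top of built-in defaults. A minimal sketch of that idea follows, assuming a recursive merge; the function name `merge_config` and the `image` section name are illustrative assumptions, not Phrame's actual implementation.

```python
from copy import deepcopy


def merge_config(defaults: dict, overrides: dict) -> dict:
    """Return a copy of defaults with overrides applied; nested dicts merge key by key."""
    merged = deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Hypothetical defaults and user config (section name "image" is assumed).
DEFAULTS = {"image": {"interval": 60, "order": "recent"}}
user_config = {"image": {"order": "random"}}

effective = merge_config(DEFAULTS, user_config)
print(effective)
```

Only `order` is overridden here; `interval` keeps its default because it was never specified.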


# image settings (default: shown below)
image:
  # time in seconds between image animation
  interval: 60
  # order of images to display: random, recent
  order: recent
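The two `order` values can be illustrated with a short sketch. It assumes each image carries a creation timestamp; the `Image` class and `order_images` function are hypothetical names, not part of Phrame.

```python
import random
from dataclasses import dataclass


@dataclass
class Image:
    name: str
    created: float  # Unix timestamp (assumed field)


def order_images(images, order="recent", rng=None):
    """Return images newest-first for "recent", or shuffled for "random"."""
    if order == "recent":
        return sorted(images, key=lambda image: image.created, reverse=True)
    if order == "random":
        shuffled = list(images)
        (rng or random).shuffle(shuffled)
        return shuffled
    raise ValueError(f"unknown order: {order!r}")


gallery = [Image("a.png", 100.0), Image("b.png", 300.0), Image("c.png", 200.0)]
recent_names = [image.name for image in order_images(gallery, "recent")]
print(recent_names)
```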


Images are generated by processing transcripts. Processing can be scheduled with a cron expression. Each time the schedule fires, all transcripts from the last X minutes are sent to OpenAI to be summarized.

# transcript settings (default: shown below)
transcript:
  # schedule as a cron expression for processing transcripts (at every 30th minute)
  cron: '*/30 * * * *'
  # how many minutes of files to look back for (process the last 30 minutes of transcripts)
  minutes: 30
  # minimum number of transcripts required to process
  minimum: 5
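The interaction between `minutes` and `minimum` can be sketched as follows. This is an illustration under stated assumptions: transcripts are modeled as hypothetical `(timestamp, text)` pairs, and `select_transcripts` is an invented helper, not Phrame's code.

```python
from datetime import datetime, timedelta, timezone


def select_transcripts(transcripts, now, minutes=30, minimum=5):
    """Return texts from the last `minutes`, or an empty list if fewer than `minimum`."""
    cutoff = now - timedelta(minutes=minutes)
    recent = [text for stamp, text in transcripts if cutoff <= stamp <= now]
    return recent if len(recent) >= minimum else []


now = datetime(2023, 1, 1, 12, 0, tzinfo=timezone.utc)
# Six transcripts; the last one is 45 minutes old and falls outside the window.
ages = [2, 8, 14, 20, 26, 45]
transcripts = [(now - timedelta(minutes=age), f"transcript {age}") for age in ages]

selected = select_transcripts(transcripts, now)
print(len(selected))
```

With the defaults, five transcripts fall inside the 30 minute window, so the batch is processed. Raising `minimum` to 6 would skip this run.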


To configure OpenAI, obtain an API key and add it to your config as shown below. All other default settings listed below are applied automatically. You can override them by updating your config.yml file.

# openai settings (default: shown below)
openai:
  key: sk-XXXXXXX

    # model name
    model: gpt-3.5-turbo
    # the prompt used to generate a summary
    prompt: You are a helpful assistant that will take a string of random conversations and pull out a few keywords and topics that were talked about. You will then turn this into a short description to describe a picture, painting, or artwork. It should be no more than two or three sentences and be something that DALL·E can use. Make sure it doesn't contain words that would be rejected by your safety system.
    # size of the generated images: 256x256, 512x512, or 1024x1024
    size: 512x512
    # number of images to generate for each style
    n: 1
    # used with summary to generate image (summary, style)
    style:
      - cinematic
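The comment "(summary, style)" suggests that each style is appended to the summary to form the image prompt. A minimal sketch of that reading follows; the exact joining format, the function name `build_image_prompts`, and the `watercolor` style are assumptions for illustration.

```python
def build_image_prompts(summary, styles):
    """Pair the summary with each style, producing one prompt per style."""
    return [f"{summary.strip()}, {style}" for style in styles]


summary = "A quiet harbor at sunrise with fishing boats"
styles = ["cinematic", "watercolor"]  # "watercolor" is a hypothetical extra style
n = 1  # images per style, as in the config above

prompts = build_image_prompts(summary, styles)
total_images = len(prompts) * n
for prompt in prompts:
    print(prompt)
```

Under this reading, the number of generated images grows with both the number of styles and `n`.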


To configure Stability AI, obtain an API key and add it to your config as shown below. All other default settings listed below are applied automatically. You can override them by updating your config.yml file.

# stabilityai settings (default: shown below)
stabilityai:
  key: sk-XXXXXXX

    # number of seconds before the request times out and is aborted
    timeout: 30
    # engine used for image generation
    engine_id: stable-diffusion-512-v2-1
    # width of the image in pixels, must be in increments of 64
    width: 512
    # height of the image in pixels, must be in increments of 64
    height: 512
    # how strictly the diffusion process adheres to the prompt text (higher values keep your image closer to your prompt)
    cfg_scale: 7
    # number of images to generate
    samples: 1
    # number of diffusion steps to run
    steps: 50
    style:
      - cinematic
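Because `width` and `height` must be multiples of 64, it can help to check values before saving the config. The helpers below are a hypothetical sketch and are not part of Phrame or the Stability AI API.

```python
def snap_to_increment(value, increment=64):
    """Round down to the nearest multiple of `increment`, with a floor of one increment."""
    return max(increment, (value // increment) * increment)


def validate_dimensions(width, height, increment=64):
    """Raise ValueError if either dimension is not a positive multiple of `increment`."""
    for name, value in (("width", width), ("height", height)):
        if value <= 0 or value % increment != 0:
            raise ValueError(f"{name} must be a positive multiple of {increment}, got {value}")


validate_dimensions(512, 512)   # the defaults pass
snapped = snap_to_increment(500)  # 500 is not valid, so round down
print(snapped)
```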


# time settings (default: shown below)
time:
  # defaults to iso 8601 format with support for token-based formatting
  # time zone used in logs
  timezone: UTC


# log settings (default: shown below)
# options: silent, error, warn, info, http, verbose, debug, silly
  level: info
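The listed options match the npm-style severity ordering used by loggers such as winston, where a configured level also lets through every more severe level. Whether Phrame uses that exact scheme is an assumption; the sketch below only illustrates how such filtering typically behaves.

```python
# Most severe first, following the order of the options listed above.
LEVELS = ["error", "warn", "info", "http", "verbose", "debug", "silly"]


def should_log(message_level, configured_level="info"):
    """Return True if a message at `message_level` passes the configured threshold."""
    if configured_level == "silent":
        return False
    return LEVELS.index(message_level) <= LEVELS.index(configured_level)


print(should_log("warn"))   # more severe than info
print(should_log("debug"))  # less severe than info
```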


# telemetry settings (default: shown below)
# self hosted version of
# 100% anonymous, used to help improve project
# no cookies and fully compliant with GDPR, CCPA and PECR
telemetry: true


Run Local Services

Service   Command                  URL
UI        npm run local:frontend   localhost:8080
API       npm run local:api        localhost:3000

Build Local Docker Image




