๐Ÿ–ฅ๏ธ GPU Scaling Filter

This is a simple filter that reduces the number of GPU layers Ollama
uses when it detects that Ollama has crashed, which it infers from an
empty response arriving in OpenWebUI. Right now the logic is very
basic: it lowers the GPU layer count by static amounts. It does not
take into account how many layers a model has, and it does not
dynamically monitor VRAM use.
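
The static reduction described above might look roughly like the sketch below. This is only an illustration, not the filter's actual code: the function names, the `step` and `minimum` parameters, and their default values are all hypothetical, and the Open WebUI filter plumbing is left out entirely.

```python
def is_crash(response_text):
    """Treat a missing, empty, or whitespace-only reply as a sign that Ollama crashed."""
    return response_text is None or response_text.strip() == ""


def next_gpu_layers(current_layers, response_text, step=5, minimum=0):
    """Return the GPU layer count to use for the next request.

    `step` and `minimum` are hypothetical static numbers chosen for this
    sketch; nothing here inspects the model's layer count or VRAM use.
    """
    if not is_crash(response_text):
        # Normal reply: leave the layer count alone.
        return current_layers
    # Empty reply: back off by a fixed amount, never going below the floor.
    return max(minimum, current_layers - step)
```

For example, `next_gpu_layers(40, "")` steps down from 40 to 35 layers, while any non-empty reply leaves the count unchanged. The result would presumably be passed to Ollama as the `num_gpu` option on the next request, though how the filter really applies it is not shown here.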

There are three settings:
