ModelService

The ModelService class is the central utility for managing various models, including Large Language Models (LLMs), Vision-Language Models (VLMs), and Text-to-Image (T2I) models. It abstracts the complexity of model initialization, request handling, and response generation, providing a unified interface for both text and image-based tasks.

Quick Start

The ModelService is included in the trustgen package. To get started:

from trusteval import ModelService

# Initialize ModelService
model_service = ModelService(
    request_type="llm",
    handler_type="api",
    model_name="gpt-4o",
    config_path="/path/to/config.yaml"
)

# Process a single prompt
response = model_service.process("Your prompt here.")

Initialization

To create an instance of the ModelService class, you need to specify the following parameters:

Constructor Parameters

Parameter	Type	Description
request_type	str	The type of request: “llm” (Language Model), “vlm” (Vision Model), or “t2i” (Text-to-Image).
handler_type	str	The handler type: “api” (remote API-based inference) or “local” (local inference).
model_name	str	The name of the model to use, mapped internally.
config_path	str	Path to the YAML configuration file.
save_folder	str	(Optional) Folder to save the generated images (for t2i requests).
file_name	str	(Optional) Name of the file to save the generated image as (for t2i requests).
image_urls	List[str]	(Optional) List of image URLs or local image paths (for vlm requests).
**kwargs	dict	Additional parameters to customize behavior (e.g., temperature).

Supported Models

The ModelService supports a wide range of models for text, vision, and text-to-image tasks. Below is a comprehensive table of the supported models, categorized by request type (llm, vlm, t2i), model name, and whether the model uses api or local inference.

Request Type	Model Name	Handler Type
LLM	gpt-4o	api
LLM	gpt-4o-mini	api
LLM	gpt-3.5-turbo	api
LLM	text-embedding-ada-002	api
LLM	glm-4	api
LLM	glm-4-plus	api
LLM	llama-3-8B	api/local
LLM	llama-3.1-70B	api/local
LLM	llama-3.1-8B	api/local
LLM	qwen-2.5-72B	api/local
LLM	mistral-7B	api/local
LLM	mistral-8x7B	api/local
LLM	claude-3.5-sonnet	api
LLM	claude-3-haiku	api
LLM	gemini-1.5-pro	api
LLM	gemini-1.5-flash	api
LLM	command-r-plus	api
LLM	command-r	api
LLM	gemma-2-27B	api/local
LLM	deepseek-chat	api/local
LLM	yi-lightning	api
VLM	glm-4v	api
VLM	glm-4v-plus	api
VLM	llama-3.2-90B-V	api/local
VLM	llama-3.2-11B-V	api/local
VLM	qwen-vl-max-0809	api
VLM	qwen-2-vl-72B	api/local
VLM	internLM-72B	api/local
VLM	claude-3-haiku	api
VLM	gemini-1.5-pro	api
VLM	gemini-1.5-flash	api
T2I	dall-e-3	api
T2I	flux-1.1-pro	api
T2I	flux_schnell	api
T2I	cogview-3-plus	api
T2I	sd-3.5-large	local
T2I	sd-3.5-large-turbo	local
T2I	HunyuanDiT	local
T2I	kolors	local
T2I	playground-v2.5	local

Pipeline Initialization

The _initialize_pipeline method sets up the appropriate pipeline based on the provided parameters. It automatically configures the model, handler, and other runtime options.

Example: Initialize a GPT-4o Pipeline for API Use

model_service = ModelService(
    request_type="llm",
    handler_type="api",
    model_name="gpt-4o",
    config_path="/path/to/config.yaml"
)

Methods

process

Definition: process(prompt: Union[str, List[str]], **kwargs) -> str

Processes a single prompt or a list of prompts synchronously. It supports both one-off interactions and multi-turn conversations.

Parameters:

prompt (str or List[str]): The input prompt(s).
kwargs: Additional parameters for model customization.

Returns: Model-generated responses as a string.

Example:

# Single prompt
response = model_service.process("Your prompt here.")

# Multi-turn interaction
prompts = [
    "What is the capital of France?",
    "What is the population of Paris?"
]
responses = model_service.process(prompts)

process_async

Definition: process_async(prompt: Union[str, List[str]], **kwargs) -> str

Handles requests asynchronously, enabling high concurrency for demanding applications.

Parameters:

prompt (str or List[str]): The input prompt(s).
kwargs: Additional parameters for model customization.

Returns: Model-generated responses as a string.

Example:

# Asynchronous prompt
response = await model_service.process_async("Your prompt here.")