OpenAI #Chat: Answer Prompt with Images
Creates a response for a prompt with images. The MODEL (GPT-4 with Vision) takes in images and answers questions about them, such as what an image represents or what it contains (e.g., dinner ideas based on what is in the fridge).
Configs for this Auto Step
- AuthzConfU1
- U1: Select HTTP_Authz Setting (Secret API Key as “Fixed Value”) *
- StrConfA1
- A1: Set Text PROMPT *#{EL}
- StrConfA2
- A2: Set Image URLs on each line *#{EL}
- SelectConfC1
- C1: Select STRING that stores Generated Text (update)
- StrConfM
- M: Set MODEL Name (default “gpt-4-vision-preview”)#{EL}
- StrConfU2
- U2: Set OpenAI Organization ID (“org-xxxx”)#{EL}
- StrConfB1
- B1: Set DETAIL parameter (“high” or “low”; default “low”)#{EL}
- StrConfB2
- B2: Set MaxTokens#{EL}
- SelectConfC2
- C2: Select NUMERIC that stores PROMPT Tokens (update)
- SelectConfC3
- C3: Select NUMERIC that stores Total Tokens (update)
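As a reference, the request body these configs produce can be sketched as follows. This is a minimal Node.js sketch, independent of the Questetra engine; `buildVisionPayload` and all sample values are illustrative, not part of the add-on:

```javascript
// Sketch of the JSON payload this step posts to the Chat Completions endpoint.
// buildVisionPayload and the sample values below are illustrative only.
function buildVisionPayload(model, prompt, imageUrls, detail, maxTokens) {
  // A1 (prompt) becomes the first content part; A2 URLs follow as image parts
  const content = [{ type: "text", text: prompt }];
  for (const url of imageUrls) {
    content.push({ type: "image_url", image_url: { url: url, detail: detail } });
  }
  const payload = { model: model, messages: [{ role: "user", content: content }] };
  if (!isNaN(maxTokens)) {
    payload.max_tokens = maxTokens; // omitted when B2 is left empty
  }
  return payload;
}

// Example: one prompt (A1) plus two image URLs (A2), low-detail mode (B1)
const body = buildVisionPayload(
  "gpt-4-vision-preview",
  "What do these images have in common?",
  ["https://example.com/a.png", "https://example.com/b.jpg"],
  "low",
  300
);
console.log(JSON.stringify(body, null, 2));
```

The full script below builds the same structure step by step from the config values.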
Script
// Script Example of Business Process Automation
// for 'engine type: 3' ("GraalJS standard mode")
// cf. 'engine type: 2' ("GraalJS Nashorn compatible mode") (renamed from "GraalJS" at 20230526)
//////// START "main()" /////////////////////////////////////////////////////////////////
main();
function main(){
////// == Config Retrieving / 工程コンフィグの参照 ==
const strAuthzSetting = configs.get ( "AuthzConfU1" ); /// REQUIRED
engine.log( " AutomatedTask Config: Authz Setting: " + strAuthzSetting );
const strOrgId = configs.get ( "StrConfU2" ); // NotRequired
engine.log( " AutomatedTask Config: OpenAI-Organization: " + strOrgId );
const strModel = configs.get ( "StrConfM" ) !== "" ? // NotRequired
configs.get ( "StrConfM" ) : "gpt-4-vision-preview"; // (default)
engine.log( " AutomatedTask Config: OpenAI Model: " + strModel );
const strTextPrompt = configs.get ( "StrConfA1" ); /// REQUIRED
if( strTextPrompt === "" ){
throw new Error( "\n AutomatedTask ConfigError:" +
" Config {A1: Prompt} must be non-empty \n" );
}
const strImageUrls = configs.get ( "StrConfA2" ); /// REQUIRED
if( strImageUrls === "" ){
throw new Error( "\n AutomatedTask ConfigError:" +
" Config {A2: ImageUrls} must be non-empty \n" );
}
const arrImageUrls = strImageUrls.split("\n");
const strDetail = configs.get ( "StrConfB1" ) !== "" ? // NotRequired
configs.get ( "StrConfB1" ) : "low"; // (default)
const strMaxTokens = configs.get ( "StrConfB2" ); // NotRequired
const numMaxTokens = parseInt ( strMaxTokens, 10 );
engine.log( " AutomatedTask Config: Max Tokens: " + numMaxTokens );
const strPocketGenerated = configs.getObject ( "SelectConfC1" ); // NotRequired
const numPocketPrompt = configs.getObject ( "SelectConfC2" ); // NotRequired
const numPocketTotal = configs.getObject ( "SelectConfC3" ); // NotRequired
////// == Data Retrieving / ワークフローデータの参照 ==
// (Nothing. Retrieved via Expression Language in Config Retrieving)
////// == Calculating / 演算 ==
//// OpenAI API > Documentation > API REFERENCE > CHAT
//// https://platform.openai.com/docs/api-reference/chat/create (not updated)
//// https://platform.openai.com/docs/guides/vision
/// prepare json
let strJson = {};
strJson.model = strModel;
if ( ! isNaN(numMaxTokens) ) {
strJson.max_tokens = numMaxTokens;
}
// strJson.response_format = {};
// strJson.response_format.type = "json_object";
strJson.messages = [];
strJson.messages[0] = {};
strJson.messages[0].role = "user";
strJson.messages[0].content = [];
strJson.messages[0].content[0] = {};
strJson.messages[0].content[0].type = "text";
strJson.messages[0].content[0].text = strTextPrompt;
for ( let i = 0; i < arrImageUrls.length; i++ ) {
const objTmp = {};
objTmp.type = "image_url";
objTmp.image_url = {}; // the Vision API expects an object: { url, detail }
objTmp.image_url.url = arrImageUrls[i];
objTmp.image_url.detail = strDetail; // "detail" is a per-image body field, not a query parameter
strJson.messages[0].content.push ( objTmp );
}
/// prepare request1
let request1Uri = "https://api.openai.com/v1/chat/completions";
let request1 = httpClient.begin(); // HttpRequestWrapper
request1 = request1.authSetting( strAuthzSetting ); // with "Authorization: Bearer XX"
request1 = request1.body( JSON.stringify( strJson ), "application/json" );
if ( strOrgId !== "" ){
request1 = request1.header( "OpenAI-Organization", strOrgId );
}
/// try request1
const response1 = request1.post( request1Uri ); // HttpResponseWrapper
engine.log( " AutomatedTask ApiRequest1 Start: " + request1Uri );
const response1Code = response1.getStatusCode() + ""; // JavaNum to string
const response1Body = response1.getResponseAsString();
engine.log( " AutomatedTask ApiResponse1 Status: " + response1Code );
if( response1Code !== "200"){
throw new Error( "\n AutomatedTask UnexpectedResponseError: " +
response1Code + "\n" + response1Body + "\n" );
}
/// parse response1
/* engine.log( response1Body ); // debug
{
"id": "chatcmpl-8I9gORGEvsLaVE10Dir0pOMZI5I37",
"object": "chat.completion",
"created": 1699337816,
"model": "gpt-4-1106-vision-preview",
"usage": {
"prompt_tokens": 1887,
"completion_tokens": 16,
"total_tokens": 1903
},
"choices": [{
"message": {
"role": "assistant",
"content": "\u6700\u521d\u306e\u753b\u50cf\u306b\u306f\u3001\"Fine tuning workflow\" \u3068\u3044\u3046"
},
"finish_details": {
"type": "max_tokens"
},
"index": 0
}]
}
*/
const response1Obj = JSON.parse( response1Body );
engine.log( " AutomatedTask ApiResponse1 finish_details: " + response1Obj.choices[0].finish_details.type );
////// == Data Updating / ワークフローデータへの代入 ==
if( strPocketGenerated !== null ){
engine.setData( strPocketGenerated, response1Obj.choices[0].message.content );
}
if( numPocketPrompt !== null ){
engine.setData( numPocketPrompt, new java.math.BigDecimal( response1Obj.usage.prompt_tokens ) );
}
if( numPocketTotal !== null ){
engine.setData( numPocketTotal, new java.math.BigDecimal( response1Obj.usage.total_tokens ) );
}
} //////// END "main()" /////////////////////////////////////////////////////////////////
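Outside the Questetra engine, the response handling at the end of main() can be sketched as follows. This is a minimal sketch; the sample object mirrors the debug dump in the script's comments, and the variable names are illustrative:

```javascript
// Minimal, self-contained sketch of the response parsing above.
// The sample object mirrors the debug dump shown in the script comments.
const response1Obj = {
  model: "gpt-4-1106-vision-preview",
  usage: { prompt_tokens: 1887, completion_tokens: 16, total_tokens: 1903 },
  choices: [{
    message: { role: "assistant", content: "The first image shows ..." },
    finish_details: { type: "max_tokens" },
    index: 0
  }]
};

// C1: generated answer text
const generatedText = response1Obj.choices[0].message.content;
// C2 / C3: token counts (stored as NUMERIC workflow data items)
const promptTokens = response1Obj.usage.prompt_tokens;
const totalTokens = response1Obj.usage.total_tokens;

// finish_details.type === "max_tokens" means the answer was cut off;
// raising B2 (MaxTokens) allows longer answers.
console.log(generatedText, promptTokens, totalTokens);
```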
/*
Notes:
- This [Automated Step] obtains the answer text via OpenAI API (Chat endpoint).
- Specify the instruction (prompt) using Text and Image-Url.
- Multiple images (Image-Urls) can also be specified.
- Compatible with GPT-4V preview version (gpt-4-vision-preview).
- Specifications are subject to change.
- https://platform.openai.com/docs/guides/vision
- If you place this [Automated Step] in the workflow diagram, communication will occur every time a process arrives.
- Request from the Questetra BPM Suite server to the OpenAI server.
- Analyzes the response from the OpenAI server and stores the necessary information.
- [HTTP Authz Settings] is required for workflow apps that include this [Automated Step].
- An API key is required to use OpenAI API. Please obtain an API key in advance.
- https://platform.openai.com/api-keys
- Set 'Secret API Key' as communication token. [HTTP Authz Settings] > [Token Fixed Value]
APPENDIX
- For low res mode, a 512px x 512px image is expected.
- For high res mode,
- the short side of the image should be less than 768px and
- the long side of the image should be less than 2,000px.
- Supported file types
- PNG (.png), JPEG (.jpeg .jpg), WEBP (.webp), and non-animated GIF (.gif)
- An error may occur depending on the timing.
- 429 error ('Too Many Requests')
- Timeout script error `java.util.concurrent.TimeoutException`
*/
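The size and file-type constraints listed in the APPENDIX can be pre-checked before sending a request. Below is a minimal sketch assuming the limits quoted above; `checkImage` and its arguments are hypothetical helpers, not part of the add-on:

```javascript
// Hypothetical pre-flight check for the APPENDIX constraints.
// Extension list and pixel limits follow the notes above; checkImage is illustrative.
const SUPPORTED = [".png", ".jpeg", ".jpg", ".webp", ".gif"];

function checkImage(url, widthPx, heightPx, detail) {
  const lower = url.toLowerCase().split("?")[0]; // ignore any query string
  if (!SUPPORTED.some(function (ext) { return lower.endsWith(ext); })) {
    return "unsupported file type";
  }
  const shortSide = Math.min(widthPx, heightPx);
  const longSide = Math.max(widthPx, heightPx);
  if (detail === "high") {
    // high res mode: short side < 768px, long side < 2,000px
    if (shortSide >= 768 || longSide >= 2000) {
      return "too large for high res mode";
    }
  }
  return "ok"; // low res mode expects roughly a 512px x 512px image
}
```

Such a check could run in an earlier workflow step so that oversized or unsupported images fail fast instead of producing an API error.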
Download
- openai-chat-answer-prompt-with-images-202311.xml
- 2023-11-07 (C) Questetra, Inc. (MIT License)
(Installing Addon Auto-Steps is available only on the Professional edition.)