Download Process Files Using the Workflow API

In this article, I will introduce a script that uses the Workflow API of Questetra BPM Suite to download all the files uploaded in Processes that meet certain conditions. For example, you can implement download conditions such as the following:

  • Files uploaded in Processes of a workflow application that were started this term
  • Files uploaded in Processes started by a specific User

The general procedure is as follows:

  • Prepare XML data (criteria) which represents the search conditions
  • Use the Process retrieval API endpoint to retrieve Process data that matches the conditions
  • Extract relevant data (e.g. File ID) about files from the acquired Process data
  • Download the target file using the API endpoint for file downloading

Required Environment

When Using a Shell Script

  • jq command
  • On Windows, run the curl command in Windows Subsystem for Linux

When Using Python

  • Python (3.x series)
  • Requests library (the linked documentation is in Japanese)

Search for Processes That Match Criteria

For details on how to search for Processes using the Workflow API, refer to Downloading Process and Task Lists Using the Workflow API. That article also explains how to prepare the criteria XML data that defines the search conditions, how to use Basic Authentication with the curl command, and how to access the API.

The API endpoint that I will use this time is /API/OR/ProcessInstance/list (JSON, UTF-8).

Download Files

The API endpoint for downloading files is /API/OR/ProcessInstance/File/download. This endpoint takes two parameters.

  • id: the ID assigned to the file
  • processDataInstanceId: the ID assigned to each data item of each process

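These two values can be extracted from the JSON of the Process search results. An example of the result is shown below, so let's take a look. For clarity, only the array of Processes is shown (the value that the code examples later in this article read from the processInstances key), and only one Process is written out.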

Process Search API Response Example

[
 {
   "activeTokenNodeName": null,
   "data": {
     "1": { ← Data Item number (as displayed in Modeler)
       "dataType": "STRING", ← character Data Item
       ...(omitted)...
     },
     "2": { ← Data item number (as displayed in Modeler)
       "dataType": "FILE2", ← File-type Data Item
       "id": 9143, ← same as processDataInstanceId
       "processDataDefinitionNumber": 2,
       "subType": null,
       "value": [
         {
           "contentType": "image/png",
           "id": 9149, ← ID assigned to the file
           "image": true,
           "length": 71550,
           "lengthText": "69.9 KB",
           "name": "ques-kun-01.png", ← File name
           "processDataInstanceId": 9143 ← ID assigned to each Data Item of each Process
         },
         {
           "contentType": "text/plain",
           "id": 9150, ← ID assigned to the file
           "image": false,
           "length": 1056,
           "lengthText": "1.1 KB",
           "name": "README.txt", ← File name
           "processDataInstanceId": 9143 ← ID assigned to each Data Item of each Process
         }
       ],
       "viewOrder": 2
     }
   },
   ...,
   "processInstanceId": 183,
   ...,
   "processInstanceTitle": "Test",
   ...,
   "processModelInfoName": "For test: file download",
   ...
 },
 ...
]

Data Item objects are stored in a map where the Data Item number is a key.

An object whose dataType is FILE2 represents a File-type Data Item. You can see that ques-kun-01.png and README.txt have been uploaded to this Process. Each entry in the value array of this object also holds the id and processDataInstanceId of one file. By parsing the JSON and extracting these two values, you have all the parameters needed for the download API.
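As a minimal sketch of this extraction step, the following Python function walks the data map of one Process and collects the two parameters (plus the file name) for every uploaded file. The function name collect_file_refs and the trimmed sample dictionary are illustrative assumptions; the field names follow the response example above.

```python
def collect_file_refs(process):
    """Return (id, processDataInstanceId, name) for every uploaded file in one Process."""
    refs = []
    for item in process.get("data", {}).values():
        # Skip other data types and File-type items with nothing uploaded
        if item.get("dataType") != "FILE2" or not item.get("value"):
            continue
        for file_info in item["value"]:
            refs.append((file_info["id"],
                         file_info["processDataInstanceId"],
                         file_info["name"]))
    return refs


# Trimmed copy of the response example (hypothetical test data)
sample_process = {
    "processInstanceId": 183,
    "data": {
        "1": {"dataType": "STRING", "value": "abc"},
        "2": {
            "dataType": "FILE2",
            "id": 9143,
            "value": [
                {"id": 9149, "name": "ques-kun-01.png", "processDataInstanceId": 9143},
                {"id": 9150, "name": "README.txt", "processDataInstanceId": 9143},
            ],
        },
    },
}

for file_id, data_id, name in collect_file_refs(sample_process):
    print(file_id, data_id, name)
```

Returning plain tuples keeps the parsing separate from the HTTP access, so the same function can feed either a download loop or a dry run that only lists the files.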

Access the File Download API

By sending the id and processDataInstanceId to the API endpoint, you can download the corresponding file. The endpoint is accessed in the same way as the Process search API; only the required parameters differ. The API response is the binary data of the file, so save it to disk as appropriate.
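The download step on its own might look like the sketch below. It assumes the Requests library and writes the response to disk in chunks instead of holding the whole file in memory. The function name download_file, the post parameter (a hypothetical injection point so the function can be exercised without network access), the 30-second timeout, and the chunk size are all assumptions made for illustration.

```python
def download_file(url, file_id, process_data_id, dest_path, auth, post=None):
    """POST the two IDs to the download endpoint and write the body to dest_path."""
    if post is None:
        import requests  # imported lazily so a stub can replace it in tests
        post = requests.post
    params = {"id": file_id, "processDataInstanceId": process_data_id}
    # stream=True defers reading the body until iter_content is called
    response = post(url, data=params, auth=auth, stream=True, timeout=30)
    response.raise_for_status()
    # The response body is binary data, so open the file in binary mode
    with open(dest_path, "wb") as f:
        for chunk in response.iter_content(chunk_size=8192):
            f.write(chunk)
    return dest_path
```

A call such as download_file(file_download_url, 9149, 9143, "ques-kun-01.png", ("{email address}", "{Basic password}")) would then save one file, using the IDs from the response example.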

Below is sample code that covers everything from Process retrieval to file downloading. There are two examples: a shell script and a Python script. Each script downloads all the files that have been uploaded in Processes matching the search conditions. The files are saved in a separate directory for each Process.

The directory name has the form {process ID}_{app name}_({subject}).

If you need to download attachments from Processes in bulk, please refer to the code below!
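The directory naming rule can be isolated in a small helper, sketched here in Python. The function name build_directory_name is hypothetical, and replacing characters that many file systems reject (such as ":" or "/") with "_" is an extra precaution assumed for this sketch; the sample scripts use the names as they are.

```python
import re


def build_directory_name(process):
    """Build '{process ID}_{app name}_({subject})' for one Process object."""
    name = "{0}_{1}_({2})".format(process["processInstanceId"],
                                  process["processModelInfoName"],
                                  process["processInstanceTitle"])
    # Assumption: replace characters that are invalid in common file systems
    return re.sub(r'[\\/:*?"<>|]', "_", name)


print(build_directory_name({
    "processInstanceId": 183,
    "processModelInfoName": "For test: file download",
    "processInstanceTitle": "Test",
}))
# prints: 183_For test_ file download_(Test)
```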

Shell script code example

#!/bin/sh
# Script for Process search
# -u: Basic Authentication option
# Parameter...criteria: search condition XML, start: search result acquisition start position, limit: maximum number of records
process_search=$(curl -u '{email address}:{Basic password}' \
                'https://example.questetra.net/API/OR/ProcessInstance/list'\
                --data-urlencode criteria@process-criteria.txt\
                --data-urlencode 'start=0'\
                --data-urlencode 'limit=10')
 
# From the JSON returned by the Process search API, extract only the array of Process data
# tr -d '[:cntrl:]' removes control characters
processes=$(echo "$process_search" | tr -d '[:cntrl:]' | jq '.processInstances')
# get number of elements for FOR loop
processes_len=$(echo $processes | jq length)
 
# About each process
for i in $(seq 0 $(($processes_len-1)))
do
   # Check whether any files are attached
   # From the Data Items, keep those whose dataType is FILE2 and whose value is not null (File-type items with uploaded files)
   data=$(echo $processes | jq .[$i].data[] | jq 'select(.dataType == "FILE2" and .value != null)' | jq -s)
   # get number of elements for FOR loop
   data_len=$(echo $data | jq length)
   # Element count is 0
   if [ $data_len -eq 0 ] ; then
       continue
   fi
 
   # Create directory name from Process information
   process_id=$(echo $processes | jq .[$i].processInstanceId)
   # jq -r prints raw strings, without the surrounding double quotes
   app_name=$(echo $processes | jq -r .[$i].processModelInfoName)
   process_title=$(echo $processes | jq -r .[$i].processInstanceTitle)
   # {process ID}_{app name}_({subject})
   directory_name="${process_id}_${app_name}_(${process_title})"
   # Create directory if it doesn't exist
   mkdir -p "./file/$directory_name"
 
   # About each file type item
   for j in $(seq 0 $(($data_len-1)))
   do
       # Extract information of uploaded file
       files=$(echo $data | jq .[$j].value[] | jq -s)
       # get number of elements for FOR loop
       files_len=$(echo $files | jq length)
       # About each file
       for k in $(seq 0 $(($files_len-1)))
       do
           # File ID
           file_id=$(echo $files | jq .[$k].id)
           # Process data ID
           process_data_id=$(echo $files | jq .[$k].processDataInstanceId)
           # File name (jq -r prints the raw string, without double quotes)
           file_name=$(echo $files | jq -r .[$k].name)
           # Save file (quoted so that names containing spaces work)
           file_path="./file/$directory_name/$file_name"
           # Script for file download
           # -o writes the response body (the binary data of the file) to the given path, overwriting it
           curl -u '{email address}:{Basic password}' \
                'https://example.questetra.net/API/OR/ProcessInstance/File/download' \
                --data-urlencode "id=$file_id" \
                --data-urlencode "processDataInstanceId=$process_data_id" \
                -o "$file_path"
       done
   done
done

Python code example

import os
import requests
import json
 
if __name__ == "__main__":
   # API endpoint
   process_search_url = 'https://example.questetra.net/API/OR/ProcessInstance/list'
   file_download_url = 'https://example.questetra.net/API/OR/ProcessInstance/File/download'
 
   # Basic Authentication information
   auth = ("{email address}", "{Basic password}")
 
   # load criteria
   with open('./process-criteria.txt', 'r') as f:
       criteria = f.read()
 
   # Parameters for Process search
   search_params = {
       'criteria': criteria,
       'start': 0,
       'limit': 10,
   }
 
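   try:
       # Execute Process search (POST submission)
       # timeout is in seconds; without it the Timeout handler below is never reached (30 is an arbitrary choice)
       r_search = requests.post(process_search_url, data=search_params, auth=auth, timeout=30)
       r_search.raise_for_status()
   except requests.exceptions.HTTPError as e:  # catch HTTP errors
       print('Error')
       print('Status Code: {0}'.format(r_search.status_code))
       print(r_search.text)
       raise SystemExit(1)  # stop here: there is no search result to process
   except requests.exceptions.Timeout as e:  # catch timeout
       print('Timeout')
       raise SystemExit(1)  # stop here: r_search was never assigned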
 
   # Load JSON returned by process search API
   processes = json.loads(r_search.text)['processInstances']
   # About each Process
   for process in processes:
       # About each Data Item
       for data in process['data'].values():
           # Continue if not File-type or no uploaded file
           if data['dataType'] != 'FILE2' or data['value'] is None:
               continue
           # Create directory name
           directory = './file_py/{0}_{1}_({2})'.format(process['processInstanceId'],
                                                        process['processModelInfoName'],
                                                        process['processInstanceTitle'])
           # Create directory if it doesn't exist
           os.makedirs(directory, exist_ok=True)
           # About each file
           for file_info in data['value']:
               # Parameters for file download
               download_params = {
                   'id': file_info['id'],
                   'processDataInstanceId': file_info['processDataInstanceId']
               }
 
               try:
                   # Execute file download
                   # timeout is in seconds (30 is an arbitrary choice)
                   r_download = requests.post(file_download_url, data=download_params, auth=auth, timeout=30)
                   r_download.raise_for_status()
               except requests.exceptions.HTTPError as e:  # catch HTTP errors
                   print('Error')
                   print('Status Code: {0}'.format(r_download.status_code))
                   print(r_download.text)
                   continue  # skip this file instead of saving the error response
               except requests.exceptions.Timeout as e:  # catch timeout
                   print('Timeout')
                   continue  # skip this file: r_download was never assigned
 
               # Write to file (open in binary mode)
               with open(os.path.join(directory, file_info['name']), 'wb') as f:
                   f.write(r_download.content)
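
Both examples request a single page of results (start=0, limit=10), so at most 10 Processes are handled per run. To cover every matching Process, the search could be repeated with an increasing start value, as in the sketch below. It assumes that start is an offset into the result list, as the parameter description in the shell script suggests, and that a page shorter than limit marks the end. The names iter_all_processes, make_fetch_page, and fetch_page are hypothetical.

```python
def iter_all_processes(fetch_page, limit=10):
    """Yield every Process by requesting consecutive pages until a short page arrives.

    fetch_page(start, limit) is a hypothetical callable that returns the
    'processInstances' list of one search request.
    """
    start = 0
    while True:
        page = fetch_page(start, limit)
        for process in page:
            yield process
        if len(page) < limit:
            break  # last page reached
        start += limit


def make_fetch_page(url, criteria, auth):
    """Build a fetch_page callable backed by the Process search endpoint."""
    import requests  # imported lazily; only needed for real API access

    def fetch_page(start, limit):
        params = {"criteria": criteria, "start": start, "limit": limit}
        r = requests.post(url, data=params, auth=auth, timeout=30)
        r.raise_for_status()
        return r.json()["processInstances"]

    return fetch_page
```

With these helpers, the loop over processes in the Python example could iterate over iter_all_processes(make_fetch_page(process_search_url, criteria, auth)) instead of a single response.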