Automating Data Aggregation with [Update CSV Data]

This article explains how to efficiently use SQL aggregate functions to analyze and update numerical data within your CSV files using the [Update CSV Data] process. We’ll cover the basics of aggregate functions and provide configuration examples that show you how to calculate multiple aggregate results in a single query, saving each result to individual Numeric-type Data Items.

Plus, the Appendix offers tips on using generative AI to create queries and troubleshoot common issues.

[Update CSV Data] Related Articles:

1. Understanding Aggregate Functions

SQL aggregate functions let you calculate totals, averages, minimums, maximums, and more from your CSV data. In the [Update CSV Data] process, numerical values from the CSV are initially treated as text. To perform calculations, you’ll need to convert them to a numeric type (like integer) within your SQL query using the `CAST` function.

  • SUM: Calculates the total
  • AVG: Calculates the average
  • MAX / MIN: Determines the maximum and minimum values
SELECT
  SUM(CAST("Sales" as integer))
FROM "SalesTable"

2. Aggregating Your Data

2-1. Calculating Daily Sales Totals

This example shows how to aggregate daily sales amounts from a “SalesTable”.

Configuration Example

Output Options

  • For [C4: How to save query results], select “Save the entire table as a single data item in CSV/TSV format”
  • For [C4b: Data Format When Saving], specify “Header Included (1 row: column names, data starts from row 2)”.

Query Example

From the “Sales Table”, that is set in [C1b: Table name when accessed by query], converts the values in the “Sales” column from the data into numbers, sums them for each date, and sorts them by date.

SELECT
  "Date",
  SUM(CAST("Sales" as integer)) AS "Total Sales"
FROM "Sales Table"
GROUP BY "Date"
ORDER BY "Date"
Daily Sales Total

2-2. Saving the Aggregated Results to a Numeric-type

By selecting the [Save each cell value to a separate data item] option, you can save multiple calculated aggregate values ​​to multiple Numeric-type Data Items simultaneously.

For instance, with a single query on your “Sales Table” data, you can calculate the maximum, average, and total sales, and then save each of these values to separate data items.

Configuration Example

Output Options

  • For [C4: How to save query results] , select “Save each cell value to a separate data item
  • For [C4a: Data item that will save query results], specify which data item should be used to store the value from each cell in the first row of the query result
Query Result CellData Item for Saving
First Cell (Highest Value: MAX)Numeric: Highest Sales
Second Cell (Average Value: AVG)Numeric: Average Sales
Third Cell (Total Value: SUM)Numeric: Total Sales

Query Example

From the “Sales Table”, that is set in [C1b: Table name when accessed by query], converts values in the “Sales” column into numbers and calculates them using functions. Each function is positioned so that the first cell outputs the maximum value, the second the average value, and the third the total value.

SELECT
   MAX(CAST("Sales" as integer)),
   AVG(CAST("Sales" as integer)),
   SUM(CAST("Sales" as integer)) 
FROM "Sales Table"
Saving to Three Separate Numeric-type Data Items

The [CSV Data Update] function allows you to specify [Text Type (Single Line)] data items as the input target using input options. This allows even a single string to be treated as a single-row, single-column CSV data entry.

Using this mechanism, you can read a number stored as STRING in [Text Type (Single Line)] and save them to [Numeric-Type] data items. In other words, you can convert numbers stored as STRING into NUMERIC data.

3. Summary

Data aggregation, which used to require scripting or complex configurations, can now be done simply with SQL. By saving the aggregation results to Numeric data items, you can immediately use them in subsequent processes for conditional branching (e.g. “If the total amount is $〇〇 or more, proceed to the Approval Flow”) or report generation.


Appendix: Creating Queries with Generative AI and Troubleshooting

1. Creating Queries with ChatGPT

You can use generative AI tools like ChatGPT to quickly create SQL queries by simply describing the task you want to accomplish. For example, you could ask it: “Join product master and sales data by product ID and extract only products with an amount of $100 or more.”

  • Even without SQL knowledge, you can express your desired processing steps
  • Including instructions like “a query suitable for an H2 Database” will help the AI provide a more accurate response
    • H2 Database is the RDB used internally
  • Remember to adjust table and column names to match your specific workflow application before using the generated SQL directly.

Here’s an example where ChatGPT (o1) generated SQL. The prompt was adjusted slightly based on the above instructions to improve the output. In this case, because “Product ID” was a string and not a number, type conversion (`CAST`) wasn’t needed. Simply removing that part allowed the query to work correctly and produce the expected result.

Example Prompt to ChatGPT and Generated SQL Query

2. Common Errors and Solutions

If an operation fails, check the error details in the [Error Notification Email] or the [Process Log] to troubleshoot the problem. You can access the [Process Log] by activating “Admin Mode” on the process details screen. (You need [Control Permission] on the target application to enable “Admin Mode”.)

  • Column Name Issues
    • Errors can occur if column names contain spaces, exceed 64 characters, or are duplicated
    • Double-check the header row (the first row) for typos
  • Query Syntax Errors
    • Common errors include forgetting double quotes or using the wrong order for `WHERE` and `FROM` clauses
    • Always double-check the syntax, even when using AI-generated SQL
Error Output Examples in [Process Log]

Scroll to Top

Discover more from Questetra Support

Subscribe now to keep reading and get access to the full archive.

Continue reading