TSV String, Create Cross Table for Numeric Column
Aggregates the values in a numeric column with 2 aggregation-key columns (pivot table). Sum, Percentage, Count and Average are tabulated as a cross table TSV. E.g. “Sales for each combination of Client-Y and Store-X” is aggregated from “Sales log TSV”.
Configs
  • A1: Set TSV *#{EL}
  • B1: Set Column ID of Numeric Field (eg “3” ) *#{EL}
  • B2: Set Column ID of Aggregation-Key Y-Field (eg “4” ) *#{EL}
  • B3: Set Column ID of Aggregation-Key X-Field (eg “5” ) *#{EL}
  • C1: To Sort by Numeric Value, Set DESC or ASC (eg “ASC” )#{EL}
  • C2: To Sort by Y-Agg-Key, Set DESC or ASC (eg “ASC” )#{EL}
  • C3: To Sort by X-Agg-Key, Set DESC or ASC (eg “ASC” )#{EL}
  • D1: Select STRING DATA that stores Total Cross TSV (update)
  • D2: Select STRING DATA that stores Percent Cross TSV (update)
  • D3: Select STRING DATA that stores Count Cross TSV (update)
  • D4: Select STRING DATA that stores Average Cross TSV (update)
Script (click to open)
// GraalJS Script (engine type: 2)

//////// START "main()" /////////////////////////////////////////////////////////////////
main();
function main(){ 

//// == Config Retrieving / 工程コンフィグの参照 ==
const strTsv         = configs.get( "StrConfA1" );           /// REQUIRED /////////////
  if( strTsv       === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {A1: Tsv} is empty \n" );
  }
  const arr2dTsv     = parseAsRectangular( strTsv );
const strNumField    = configs.get( "StrConfB1" );           /// REQUIRED /////////////
  if( strNumField  === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B1: NumField} is empty \n" );
  }
  const numNumField  = parseInt( strNumField, 10 );
  if( isNaN( numNumField ) || numNumField < 0 ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B1: NumField} must be a positive integer \n" );
  }
  if( numNumField   >= arr2dTsv[0].length ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B1: NumField} is larger than TsvWidth \n" );
  }
const strYaggField   = configs.get( "StrConfB2" );           /// REQUIRED /////////////
  if( strYaggField === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B2: Y-AggField} is empty \n" );
  }
  const numYaggField = parseInt( strYaggField, 10 );
  if( isNaN( numYaggField ) || numYaggField < 0 ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B2: YaggField} must be a positive integer \n" );
  }
  if( numYaggField  >= arr2dTsv[0].length ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B2: YaggField} is larger than TsvWidth \n" );
  }
const strXaggField   = configs.get( "StrConfB3" );           /// REQUIRED /////////////
  if( strXaggField === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B3: X-AggField} is empty \n" );
  }
  const numXaggField = parseInt( strXaggField, 10 );
  if( isNaN( numXaggField ) || numXaggField < 0 ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B3: XaggField} must be a positive integer \n" );
  }
  if( numXaggField  >= arr2dTsv[0].length ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B3: XaggField} is larger than TsvWidth \n" );
  }
const strNumSort     = configs.get( "StrConfC1" );           // NotRequired ///////////
const strYaggSort    = configs.get( "StrConfC2" );           // NotRequired ///////////
const strXaggSort    = configs.get( "StrConfC3" );           // NotRequired ///////////

const strPocketTotalPivot   = configs.getObject( "SelectConfD1" ); // NotRequired /////
const strPocketPercentPivot = configs.getObject( "SelectConfD2" ); // NotRequired /////
const strPocketCountPivot   = configs.getObject( "SelectConfD3" ); // NotRequired /////
const strPocketAveragePivot = configs.getObject( "SelectConfD4" ); // NotRequired /////


//// == Data Retrieving / ワークフローデータの参照 ==
// (Nothing. Retrieved via Expression Language in Config Retrieving)


//// == Calculating / 演算 ==

// omit repeated Keys in Y-Aggregation Field values (make Keys uniq)
let arrYaggKeys = [];
for( let i = 0; i < arr2dTsv.length; i++ ){
  if( arrYaggKeys.indexOf( arr2dTsv[i][ numYaggField ] ) === -1){
      arrYaggKeys.push(    arr2dTsv[i][ numYaggField ] );
  }
  // Array.indexOf(): strict equality `===` 
  // https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array/indexOf
}
// omit repeated Keys in X-Aggregation Field values (make Keys uniq)
let arrXaggKeys = [];
for( let i = 0; i < arr2dTsv.length; i++ ){
  if( arrXaggKeys.indexOf( arr2dTsv[i][ numXaggField ] ) === -1){
      arrXaggKeys.push(    arr2dTsv[i][ numXaggField ] );
  }
  // Array.indexOf(): strict equality `===` 
  // https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array/indexOf
}

// sort by Keys
if( strYaggSort === "ASC" ){ // ASC: ascending alphabetical from A to Z
  arrYaggKeys.sort( function( strA, strB ){
    if( strA > strB ){ return  1; }
    if( strA < strB ){ return -1; }
    return 0;
  });
}
if( strYaggSort === "DESC" ){ // DESC: descending from Z to A
  arrYaggKeys.sort( function( strA, strB ){
    if( strA < strB ){ return  1; }
    if( strA > strB ){ return -1; }
    return 0;
  });
}
if( strXaggSort === "ASC" ){ // ASC: ascending alphabetical from A to Z
  arrXaggKeys.sort( function( strA, strB ){
    if( strA > strB ){ return  1; }
    if( strA < strB ){ return -1; }
    return 0;
  });
}
if( strXaggSort === "DESC" ){ // DESC: descending from Z to A
  arrXaggKeys.sort( function( strA, strB ){
    if( strA < strB ){ return  1; }
    if( strA > strB ){ return -1; }
    return 0;
  });
}

// initialize arr2dTotalPivot[y][x]
let arr2dTotalPivot = [];
for( let i = 0; i < arrYaggKeys.length; i++ ){
  arr2dTotalPivot[i] = [];
  for( let j = 0; j < arrXaggKeys.length; j++ ){
    arr2dTotalPivot[i][j] = 0;
  }
}
// initialize arr2dPercentPivot[y][x]
let arr2dPercentPivot = [];
for( let i = 0; i < arrYaggKeys.length; i++ ){
  arr2dPercentPivot[i] = [];
  for( let j = 0; j < arrXaggKeys.length; j++ ){
    arr2dPercentPivot[i][j] = 0;
  }
}
// initialize arr2dCountPivot[y][x]
let arr2dCountPivot = [];
for( let i = 0; i < arrYaggKeys.length; i++ ){
  arr2dCountPivot[i] = [];
  for( let j = 0; j < arrXaggKeys.length; j++ ){
    arr2dCountPivot[i][j] = 0;
  }
}
// initialize arr2dAveragePivot[y][x]
let arr2dAveragePivot = [];
for( let i = 0; i < arrYaggKeys.length; i++ ){
  arr2dAveragePivot[i] = [];
  for( let j = 0; j < arrXaggKeys.length; j++ ){
    arr2dAveragePivot[i][j] = 0;
  }
}

// sum up
let numGrandTotal = 0;
for( let i = 0; i < arr2dTsv.length; i++ ){
  let numValue = parseFloat( arr2dTsv[i][numNumField].replace(/,/g, '') );
  if( isNaN(numValue) ){
    engine.log( " AutomatedTask StringWarning:" +
                " CellStr is not numeric at line: " + i );
    numValue = 0;
  }
  let strYKey = arr2dTsv[i][numYaggField];
  let strXKey = arr2dTsv[i][numXaggField];
  arr2dTotalPivot[ arrYaggKeys.indexOf( strYKey ) ][ arrXaggKeys.indexOf( strXKey ) ] += numValue;
  arr2dCountPivot[ arrYaggKeys.indexOf( strYKey ) ][ arrXaggKeys.indexOf( strXKey ) ] += 1;
  numGrandTotal += numValue;
}

//// --debug--
// engine.log( " arr2dTotalPivot:" );
// for( let i = 0; i < arr2dTotalPivot.length; i++ ){
//   for( let j= 0; j < arr2dTotalPivot[0].length; j++ ){
//     engine.log( "  " + i + "-" + j + ": " + arr2dTotalPivot[i][j] );
//   }
// }

// X-subtotal Total
let arrXsubtotalTotal = [];
  arrXsubtotalTotal.push( "total" );
for( let i = 0; i < arr2dTotalPivot[0].length; i++ ){
  let numSubtotal = 0;
  for( let j = 0; j < arr2dTotalPivot.length; j++ ){
    numSubtotal += arr2dTotalPivot[j][i];
  }
  arrXsubtotalTotal.push( numSubtotal );
}
arrXsubtotalTotal.push( numGrandTotal );

// X-subtotal Count
let arrXsubtotalCount = [];
  arrXsubtotalCount.push( "total" );
for( let i = 0; i < arr2dCountPivot[0].length; i++ ){
  let numSubtotal = 0;
  for( let j = 0; j < arr2dCountPivot.length; j++ ){
    numSubtotal += arr2dCountPivot[j][i];
  }
  arrXsubtotalCount.push( numSubtotal );
}
arrXsubtotalCount.push( arr2dTsv.length );

// add Y-key and Y-subtotal Total
for( let i = 0; i < arr2dTotalPivot.length; i++ ){
  let numSubtotal = 0;
  for( let j = 0; j < arr2dTotalPivot[i].length; j++ ){
    numSubtotal += arr2dTotalPivot[i][j];
  }
  arr2dTotalPivot[i].push( numSubtotal );          // add Y-total to the end of an array
  arr2dTotalPivot[i].unshift( arrYaggKeys[i] ); // add Y-key to the beginning of an array
}
// Y-subtotal Count
for( let i = 0; i < arr2dCountPivot.length; i++ ){
  let numSubtotal = 0;
  for( let j = 0; j < arr2dCountPivot[i].length; j++ ){
    numSubtotal += arr2dCountPivot[i][j];
  }
  arr2dCountPivot[i].push( numSubtotal );
  arr2dCountPivot[i].unshift( arrYaggKeys[i] );
}

// sort by Y-subtotal
if( strNumSort === "ASC" ){ // ASC: ascending alphabetical from 0 to 10
  arr2dTotalPivot.sort( function( arrA, arrB ){
    if( arrA[arrA.length-1] > arrB[arrB.length-1] ){ return  1; }
    if( arrA[arrA.length-1] < arrB[arrB.length-1] ){ return -1; }
    return 0;
  });
  arr2dCountPivot.sort( function( arrA, arrB ){
    if( arrA[arrA.length-1] > arrB[arrB.length-1] ){ return  1; }
    if( arrA[arrA.length-1] < arrB[arrB.length-1] ){ return -1; }
    return 0;
  });
}
if( strNumSort === "DESC" ){ // DESC: descending from 10 to 0
  arr2dTotalPivot.sort( function( arrA, arrB ){
    if( arrA[arrA.length-1] < arrB[arrB.length-1] ){ return  1; }
    if( arrA[arrA.length-1] > arrB[arrB.length-1] ){ return -1; }
    return 0;
  });
  arr2dCountPivot.sort( function( arrA, arrB ){
    if( arrA[arrA.length-1] < arrB[arrB.length-1] ){ return  1; }
    if( arrA[arrA.length-1] > arrB[arrB.length-1] ){ return -1; }
    return 0;
  });
}

// calc Percent and Average
for( let i = 0; i < arr2dTotalPivot.length; i++ ){
  arr2dPercentPivot[i][0] = arr2dTotalPivot[i][0]; // Y-key
  arr2dAveragePivot[i][0] = arr2dTotalPivot[i][0];
  for( let j = 1; j < arr2dTotalPivot[0].length; j++ ){
    arr2dPercentPivot[i][j] = (arr2dTotalPivot[i][j] / numGrandTotal).toFixed(3);
    arr2dAveragePivot[i][j] = (arr2dTotalPivot[i][j] / arr2dCountPivot[i][j]).toFixed(3);
  }
}

// X-subtotal Percent and Average
let arrXsubtotalPercent = [];
let arrXsubtotalAverage = [];
  arrXsubtotalPercent.push( "total" );
  arrXsubtotalAverage.push( "total" );
for( let i = 1; i < arrXsubtotalTotal.length; i++ ){
  arrXsubtotalPercent.push( (arrXsubtotalTotal[i] / numGrandTotal).toFixed(3) );
  arrXsubtotalAverage.push( (arrXsubtotalTotal[i] / arrXsubtotalCount[i] ).toFixed(3) );
}

// output Cross Tabulation
let strTotalPivot = "";
strTotalPivot += "cross\t" + arrXaggKeys.join("\t") + "\ttotal\n";
for( let i = 0; i < arr2dTotalPivot.length; i++ ){
  strTotalPivot += arr2dTotalPivot[i].join("\t") + "\n";
}
strTotalPivot += arrXsubtotalTotal.join("\t");

let strPercentPivot = "";
strPercentPivot += "cross\t" + arrXaggKeys.join("\t") + "\ttotal\n";
for( let i = 0; i < arr2dPercentPivot.length; i++ ){
  strPercentPivot += arr2dPercentPivot[i].join("\t") + "\n";
}
strPercentPivot += arrXsubtotalPercent.join("\t");

let strCountPivot = "";
strCountPivot += "cross\t" + arrXaggKeys.join("\t") + "\ttotal\n";
for( let i = 0; i < arr2dCountPivot.length; i++ ){
  strCountPivot += arr2dCountPivot[i].join("\t") + "\n";
}
strCountPivot += arrXsubtotalCount.join("\t");

let strAveragePivot = "";
strAveragePivot += "cross\t" + arrXaggKeys.join("\t") + "\ttotal\n";
for( let i = 0; i < arr2dAveragePivot.length; i++ ){
  strAveragePivot += arr2dAveragePivot[i].join("\t") + "\n";
}
strAveragePivot += arrXsubtotalAverage.join("\t");


//// == Data Updating / ワークフローデータへの代入 ==
if( strPocketTotalPivot !== null ){
  engine.setData( strPocketTotalPivot, strTotalPivot );
}
if( strPocketPercentPivot !== null ){
  engine.setData( strPocketPercentPivot, strPercentPivot );
}
if( strPocketCountPivot !== null ){
  engine.setData( strPocketCountPivot, strCountPivot );
}
if( strPocketAveragePivot !== null ){
  engine.setData( strPocketAveragePivot, strAveragePivot );
}


} //////// END "main()" /////////////////////////////////////////////////////////////////


// Parses TSV string as two-dimensional rectangular data matrix and creates a 2D array.
function parseAsRectangular( strTsv ){
  const arrTsv = strTsv.split("\n");

  /// Get numMinWidth and numMaxWidth (blank lines are excluded)
  let numMinWidth   = Infinity; // cf. String-Type Max: 1 million
  let numMaxWidth   = 0;
  let numBlanklines = 0;
  for( let i = 0; i < arrTsv.length; i++ ){
    if( arrTsv[i] === "" ){ // Skip blank lines
      numBlanklines += 1;
      continue;
    }
    let arrCells = arrTsv[i].split("\t");
    if( numMinWidth > arrCells.length ){ numMinWidth = arrCells.length; }
    if( numMaxWidth < arrCells.length ){ numMaxWidth = arrCells.length; }
  }
  engine.log( " AutomatedTask TsvDataCheck:" + 
              " MinWidth:" + numMinWidth +
              " MaxWidth:" + numMaxWidth +
              " Lines:" + arrTsv.length +
              " (BlankLines:" + numBlanklines + ")" );

  /// Get numMinWidth and numMaxWidth (blank lines are excluded)
  let arr2dTsv      = [];
  for( let i = 0; i < arrTsv.length; i++ ){
    if( arrTsv[i] === "" ){ // Skip blank lines
      continue;
    }
    let arrTmp = [];
    let arrCells = arrTsv[i].split("\t");
    for( let j = 0; j < numMaxWidth; j++ ){
      if( j < arrCells.length ){
        arrTmp[j] = arrCells[j];
      }else{
        arrTmp[j] = "";
      }
    }
    arr2dTsv.push( arrTmp );
  }

  return arr2dTsv;
}

/*
Notes:
- When the process reaches this automated task, TSV is automatically read.
    - TSV: Monthly sales, log for questionnaire with rating, etc.
- If there is a blank line in the input TSV text, it will be skipped.
    - The line feed code for the last line is not added either.
- The output TSV text (total table TSV, etc.) is rectangular data.
    - The number of rows in TSV is #of {Y aggregate Key types} plus 2 ("header" + "total").
    - The number of columns of TSV is #of {X aggregate Key types} plus 2 ("heading" + "total").

APPENDIX:
- TSV (Tab Separated Values) text assumes rectangular data.
    - Data that is not rectangular is automatically formatted with empty characters.
    - Blank lines (including the end) are ignored.
- Only "DESC" or "ASC" is valid for the sort config.
    - If no sort setting, it will be in the order of appearance.
    - Character sorting is in code order.
    - If both character sort and numeric sort are configured, numeric sort is performed after character sort.
- Specify the field column with the column ID (starting with zero).
    - Parsing numeric field values depends on JavaScript `parseFloat()`.
    - If commas in the numeric field, they are considered a thousands separator (The removed string is parsed).
    - A string that cannot be parsed is considered zero.
- Division by zero is output as NaN.

Notes-ja:
- 案件が自動処理工程に到達した際、文字列型データに保存されているTSVが自動的に読み込まれます。
    - TSV: 月次売上、評点付きアンケートのログ、など
- 入力TSVテキストに空行がある場合、スキップされます。
    - 出力TSVの最終行に改行コードは付与されません。
- 出力されるTSVテキスト(合計値テーブルTSV等)は、矩形データです。
    - TSVの行数は、{Y集約Keyの種類数} に2("ヘッダ行" + "合計行")を加えた数になります。
    - TSVの列数は、{X集約Keyの種類数} に2("見出し列" + "合計列")を加えた数になります。

APPENDIX-ja:
- TSV(Tab Separated Values)テキストは、矩形データを前提とします。
    - 矩形でないデータは、空文字によって自動整形されます。
    - 空行(末尾改行を含む)は無視されます。
- ソート設定は "DESC" もしくは "ASC" のみが有効です。
    - ソート設定がない場合、出現順になります。
    - 文字ソートは文字コード順になります。
    - 文字ソートと数値ソートがどちらも設定された場合、文字ソートの後に数値ソートが実行されます。
- フィールド列の指定は、列ID(ゼロ始まり)で設定してください。
    - 数値フィールド値の解析(パース)は JavaScript `parseFloat()` に依存します。
    - 数値フィールド値にカンマが存在する場合、桁区切り文字とみなされます(除去された文字列が解析されます)。
    - 解析できない文字列はゼロと見なされます。
- 0 の除算は NaN と出力されます。
*/

Download

2021-09-07 (C) Questetra, Inc. (MIT License)
https://support.questetra.com/addons/tsv-string-create-cross-table-for-numeric-column-2021/
The Add-on import feature is available with Professional edition.
Freely modifiable JavaScript (ECMAScript) code. No warranty of any kind.

Notes

  • When the process reaches this automated task, the TSV is automatically read.
    • TSV: Monthly sales, log for questionnaire with rating, etc.
  • If there is a blank line in the input TSV text, it will be skipped.
    • The line feed code for the last line is not added either.
      - The output TSV text (total table TSV, etc.) is rectangular data.
    • The number of rows in TSV is #of {Y aggregate Key types} plus 2 (“header” + “total”).
    • The number of columns of TSV is #of {X aggregate Key types} plus 2 (“heading” + “total”).

Capture

No CAggregates the values in a numeric column by 2 aggregation-key columns: Pivot table. Sum, Percentage, Count and Average are tabulated as a cross table TSV. Eg, "Sales for each combination of Client-Y and Store-X" is aggregated from "Sales log TSV".ode Crosstab
Aggregates the values in a numeric column by 2 aggregation-key columns: Pivot table. Sum, Percentage, Count and Average are tabulated as a cross table TSV. Eg, "Sales for each combination of Client-Y and Store-X" is aggregated from "Sales log TSV".

Appendix

  • TSV (Tab Separated Values) text assumes rectangular data.
    • Data that is not rectangular is automatically formatted with empty characters.
    • Blank lines (including the end) are ignored.
  • Only “DESC” or “ASC” is valid for the sort config.
    • If no sort settings are specified, it will be sorted in order of appearance.
    • Character sorting is in code order.
    • If both character sort and numeric sort are configured, numeric sort is performed after character sort.
  • Specify the field column with the column ID (starting with zero).
    • Parsing numeric field values depends on JavaScript parseFloat().
    • If there are commas in the numeric field, they are considered as thousands separators (The removed string is parsed).
    • A string that cannot be parsed is considered zero.
  • Division by zero is output as NaN.

See also

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: