TSV String, Create Cross Table for Numeric Column

TSV String, Create Cross Table for Numeric Column
Aggregates the values in a numeric column with 2 aggregation-key columns (pivot table). Sum, Percentage, Count and Average are tabulated as a cross table TSV. E.g. “Sales for each combination of Client-Y and Store-X” is aggregated from “Sales log TSV”.
Configs
  • A1: Set TSV *#{EL}
  • B1: Set Column ID of Numeric Field (eg “3” ) *#{EL}
  • B2: Set Column ID of Aggregation-Key Y-Field (eg “4” ) *#{EL}
  • B3: Set Column ID of Aggregation-Key X-Field (eg “5” ) *#{EL}
  • C1: To Sort by Numeric Value, Set DESC or ASC (eg “ASC” )#{EL}
  • C2: To Sort by Y-Agg-Key, Set DESC or ASC (eg “ASC” )#{EL}
  • C3: To Sort by X-Agg-Key, Set DESC or ASC (eg “ASC” )#{EL}
  • D1: Select STRING DATA that stores Total Cross TSV (update)
  • D2: Select STRING DATA that stores Percent Cross TSV (update)
  • D3: Select STRING DATA that stores Count Cross TSV (update)
  • D4: Select STRING DATA that stores Average Cross TSV (update)
Script (click to open)
// GraalJS Script (engine type: 2)

//////// START "main()" /////////////////////////////////////////////////////////////////
main();
function main(){ 

//// == Config Retrieving / 工程コンフィグの参照 ==
const strTsv         = configs.get( "StrConfA1" );           /// REQUIRED /////////////
  if( strTsv       === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {A1: Tsv} is empty \n" );
  }
  const arr2dTsv     = parseAsRectangular( strTsv );
const strNumField    = configs.get( "StrConfB1" );           /// REQUIRED /////////////
  if( strNumField  === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B1: NumField} is empty \n" );
  }
  const numNumField  = parseInt( strNumField, 10 );
  if( isNaN( numNumField ) || numNumField < 0 ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B1: NumField} must be a positive integer \n" );
  }
  if( numNumField   >= arr2dTsv[0].length ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B1: NumField} is larger than TsvWidth \n" );
  }
const strYaggField   = configs.get( "StrConfB2" );           /// REQUIRED /////////////
  if( strYaggField === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B2: Y-AggField} is empty \n" );
  }
  const numYaggField = parseInt( strYaggField, 10 );
  if( isNaN( numYaggField ) || numYaggField < 0 ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B2: YaggField} must be a positive integer \n" );
  }
  if( numYaggField  >= arr2dTsv[0].length ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B2: YaggField} is larger than TsvWidth \n" );
  }
const strXaggField   = configs.get( "StrConfB3" );           /// REQUIRED /////////////
  if( strXaggField === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B3: X-AggField} is empty \n" );
  }
  const numXaggField = parseInt( strXaggField, 10 );
  if( isNaN( numXaggField ) || numXaggField < 0 ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B3: XaggField} must be a positive integer \n" );
  }
  if( numXaggField  >= arr2dTsv[0].length ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B3: XaggField} is larger than TsvWidth \n" );
  }
const strNumSort     = configs.get( "StrConfC1" );           // NotRequired ///////////
const strYaggSort    = configs.get( "StrConfC2" );           // NotRequired ///////////
const strXaggSort    = configs.get( "StrConfC3" );           // NotRequired ///////////

const strPocketTotalPivot   = configs.getObject( "SelectConfD1" ); // NotRequired /////
const strPocketPercentPivot = configs.getObject( "SelectConfD2" ); // NotRequired /////
const strPocketCountPivot   = configs.getObject( "SelectConfD3" ); // NotRequired /////
const strPocketAveragePivot = configs.getObject( "SelectConfD4" ); // NotRequired /////


//// == Data Retrieving / ワークフローデータの参照 ==
// (Nothing. Retrieved via Expression Language in Config Retrieving)


//// == Calculating / 演算 ==

// omit repeated Keys in Y-Aggregation Field values (make Keys uniq)
let arrYaggKeys = [];
for( let i = 0; i < arr2dTsv.length; i++ ){
  if( arrYaggKeys.indexOf( arr2dTsv[i][ numYaggField ] ) === -1){
      arrYaggKeys.push(    arr2dTsv[i][ numYaggField ] );
  }
  // Array.indexOf(): strict equality `===` 
  // https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array/indexOf
}
// omit repeated Keys in X-Aggregation Field values (make Keys uniq)
let arrXaggKeys = [];
for( let i = 0; i < arr2dTsv.length; i++ ){
  if( arrXaggKeys.indexOf( arr2dTsv[i][ numXaggField ] ) === -1){
      arrXaggKeys.push(    arr2dTsv[i][ numXaggField ] );
  }
  // Array.indexOf(): strict equality `===` 
  // https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array/indexOf
}

// sort by Keys
if( strYaggSort === "ASC" ){ // ASC: ascending alphabetical from A to Z
  arrYaggKeys.sort( function( strA, strB ){
    if( strA > strB ){ return  1; }
    if( strA < strB ){ return -1; }
    return 0;
  });
}
if( strYaggSort === "DESC" ){ // DESC: descending from Z to A
  arrYaggKeys.sort( function( strA, strB ){
    if( strA < strB ){ return  1; }
    if( strA > strB ){ return -1; }
    return 0;
  });
}
if( strXaggSort === "ASC" ){ // ASC: ascending alphabetical from A to Z
  arrXaggKeys.sort( function( strA, strB ){
    if( strA > strB ){ return  1; }
    if( strA < strB ){ return -1; }
    return 0;
  });
}
if( strXaggSort === "DESC" ){ // DESC: descending from Z to A
  arrXaggKeys.sort( function( strA, strB ){
    if( strA < strB ){ return  1; }
    if( strA > strB ){ return -1; }
    return 0;
  });
}

// initialize arr2dTotalPivot[y][x]
let arr2dTotalPivot = [];
for( let i = 0; i < arrYaggKeys.length; i++ ){
  arr2dTotalPivot[i] = [];
  for( let j = 0; j < arrXaggKeys.length; j++ ){
    arr2dTotalPivot[i][j] = 0;
  }
}
// initialize arr2dPercentPivot[y][x]
let arr2dPercentPivot = [];
for( let i = 0; i < arrYaggKeys.length; i++ ){
  arr2dPercentPivot[i] = [];
  for( let j = 0; j < arrXaggKeys.length; j++ ){
    arr2dPercentPivot[i][j] = 0;
  }
}
// initialize arr2dCountPivot[y][x]
let arr2dCountPivot = [];
for( let i = 0; i < arrYaggKeys.length; i++ ){
  arr2dCountPivot[i] = [];
  for( let j = 0; j < arrXaggKeys.length; j++ ){
    arr2dCountPivot[i][j] = 0;
  }
}
// initialize arr2dAveragePivot[y][x]
let arr2dAveragePivot = [];
for( let i = 0; i < arrYaggKeys.length; i++ ){
  arr2dAveragePivot[i] = [];
  for( let j = 0; j < arrXaggKeys.length; j++ ){
    arr2dAveragePivot[i][j] = 0;
  }
}

// sum up
let numGrandTotal = 0;
for( let i = 0; i < arr2dTsv.length; i++ ){
  let numValue = parseFloat( arr2dTsv[i][numNumField].replace(/,/g, '') );
  if( isNaN(numValue) ){
    engine.log( " AutomatedTask StringWarning:" +
                " CellStr is not numeric at line: " + i );
    numValue = 0;
  }
  let strYKey = arr2dTsv[i][numYaggField];
  let strXKey = arr2dTsv[i][numXaggField];
  arr2dTotalPivot[ arrYaggKeys.indexOf( strYKey ) ][ arrXaggKeys.indexOf( strXKey ) ] += numValue;
  arr2dCountPivot[ arrYaggKeys.indexOf( strYKey ) ][ arrXaggKeys.indexOf( strXKey ) ] += 1;
  numGrandTotal += numValue;
}

//// --debug--
// engine.log( " arr2dTotalPivot:" );
// for( let i = 0; i < arr2dTotalPivot.length; i++ ){
//   for( let j= 0; j < arr2dTotalPivot[0].length; j++ ){
//     engine.log( "  " + i + "-" + j + ": " + arr2dTotalPivot[i][j] );
//   }
// }

// X-subtotal Total
let arrXsubtotalTotal = [];
  arrXsubtotalTotal.push( "total" );
for( let i = 0; i < arr2dTotalPivot[0].length; i++ ){
  let numSubtotal = 0;
  for( let j = 0; j < arr2dTotalPivot.length; j++ ){
    numSubtotal += arr2dTotalPivot[j][i];
  }
  arrXsubtotalTotal.push( numSubtotal );
}
arrXsubtotalTotal.push( numGrandTotal );

// X-subtotal Count
let arrXsubtotalCount = [];
  arrXsubtotalCount.push( "total" );
for( let i = 0; i < arr2dCountPivot[0].length; i++ ){
  let numSubtotal = 0;
  for( let j = 0; j < arr2dCountPivot.length; j++ ){
    numSubtotal += arr2dCountPivot[j][i];
  }
  arrXsubtotalCount.push( numSubtotal );
}
arrXsubtotalCount.push( arr2dTsv.length );

// add Y-key and Y-subtotal Total
for( let i = 0; i < arr2dTotalPivot.length; i++ ){
  let numSubtotal = 0;
  for( let j = 0; j < arr2dTotalPivot[i].length; j++ ){
    numSubtotal += arr2dTotalPivot[i][j];
  }
  arr2dTotalPivot[i].push( numSubtotal );          // add Y-total to the end of an array
  arr2dTotalPivot[i].unshift( arrYaggKeys[i] ); // add Y-key to the beginning of an array
}
// Y-subtotal Count
for( let i = 0; i < arr2dCountPivot.length; i++ ){
  let numSubtotal = 0;
  for( let j = 0; j < arr2dCountPivot[i].length; j++ ){
    numSubtotal += arr2dCountPivot[i][j];
  }
  arr2dCountPivot[i].push( numSubtotal );
  arr2dCountPivot[i].unshift( arrYaggKeys[i] );
}

// sort by Y-subtotal
if( strNumSort === "ASC" ){ // ASC: ascending alphabetical from 0 to 10
  arr2dTotalPivot.sort( function( arrA, arrB ){
    if( arrA[arrA.length-1] > arrB[arrB.length-1] ){ return  1; }
    if( arrA[arrA.length-1] < arrB[arrB.length-1] ){ return -1; }
    return 0;
  });
  arr2dCountPivot.sort( function( arrA, arrB ){
    if( arrA[arrA.length-1] > arrB[arrB.length-1] ){ return  1; }
    if( arrA[arrA.length-1] < arrB[arrB.length-1] ){ return -1; }
    return 0;
  });
}
if( strNumSort === "DESC" ){ // DESC: descending from 10 to 0
  arr2dTotalPivot.sort( function( arrA, arrB ){
    if( arrA[arrA.length-1] < arrB[arrB.length-1] ){ return  1; }
    if( arrA[arrA.length-1] > arrB[arrB.length-1] ){ return -1; }
    return 0;
  });
  arr2dCountPivot.sort( function( arrA, arrB ){
    if( arrA[arrA.length-1] < arrB[arrB.length-1] ){ return  1; }
    if( arrA[arrA.length-1] > arrB[arrB.length-1] ){ return -1; }
    return 0;
  });
}

// calc Percent and Average
for( let i = 0; i < arr2dTotalPivot.length; i++ ){
  arr2dPercentPivot[i][0] = arr2dTotalPivot[i][0]; // Y-key
  arr2dAveragePivot[i][0] = arr2dTotalPivot[i][0];
  for( let j = 1; j < arr2dTotalPivot[0].length; j++ ){
    arr2dPercentPivot[i][j] = (arr2dTotalPivot[i][j] / numGrandTotal).toFixed(3);
    arr2dAveragePivot[i][j] = (arr2dTotalPivot[i][j] / arr2dCountPivot[i][j]).toFixed(3);
  }
}

// X-subtotal Percent and Average
let arrXsubtotalPercent = [];
let arrXsubtotalAverage = [];
  arrXsubtotalPercent.push( "total" );
  arrXsubtotalAverage.push( "total" );
for( let i = 1; i < arrXsubtotalTotal.length; i++ ){
  arrXsubtotalPercent.push( (arrXsubtotalTotal[i] / numGrandTotal).toFixed(3) );
  arrXsubtotalAverage.push( (arrXsubtotalTotal[i] / arrXsubtotalCount[i] ).toFixed(3) );
}

// output Cross Tabulation
let strTotalPivot = "";
strTotalPivot += "cross\t" + arrXaggKeys.join("\t") + "\ttotal\n";
for( let i = 0; i < arr2dTotalPivot.length; i++ ){
  strTotalPivot += arr2dTotalPivot[i].join("\t") + "\n";
}
strTotalPivot += arrXsubtotalTotal.join("\t");

let strPercentPivot = "";
strPercentPivot += "cross\t" + arrXaggKeys.join("\t") + "\ttotal\n";
for( let i = 0; i < arr2dPercentPivot.length; i++ ){
  strPercentPivot += arr2dPercentPivot[i].join("\t") + "\n";
}
strPercentPivot += arrXsubtotalPercent.join("\t");

let strCountPivot = "";
strCountPivot += "cross\t" + arrXaggKeys.join("\t") + "\ttotal\n";
for( let i = 0; i < arr2dCountPivot.length; i++ ){
  strCountPivot += arr2dCountPivot[i].join("\t") + "\n";
}
strCountPivot += arrXsubtotalCount.join("\t");

let strAveragePivot = "";
strAveragePivot += "cross\t" + arrXaggKeys.join("\t") + "\ttotal\n";
for( let i = 0; i < arr2dAveragePivot.length; i++ ){
  strAveragePivot += arr2dAveragePivot[i].join("\t") + "\n";
}
strAveragePivot += arrXsubtotalAverage.join("\t");


//// == Data Updating / ワークフローデータへの代入 ==
if( strPocketTotalPivot !== null ){
  engine.setData( strPocketTotalPivot, strTotalPivot );
}
if( strPocketPercentPivot !== null ){
  engine.setData( strPocketPercentPivot, strPercentPivot );
}
if( strPocketCountPivot !== null ){
  engine.setData( strPocketCountPivot, strCountPivot );
}
if( strPocketAveragePivot !== null ){
  engine.setData( strPocketAveragePivot, strAveragePivot );
}


} //////// END "main()" /////////////////////////////////////////////////////////////////


// Parses TSV string as two-dimensional rectangular data matrix and creates a 2D array.
function parseAsRectangular( strTsv ){
  const arrTsv = strTsv.split("\n");

  /// Get numMinWidth and numMaxWidth (blank lines are excluded)
  let numMinWidth   = Infinity; // cf. String-Type Max: 1 million
  let numMaxWidth   = 0;
  let numBlanklines = 0;
  for( let i = 0; i < arrTsv.length; i++ ){
    if( arrTsv[i] === "" ){ // Skip blank lines
      numBlanklines += 1;
      continue;
    }
    let arrCells = arrTsv[i].split("\t");
    if( numMinWidth > arrCells.length ){ numMinWidth = arrCells.length; }
    if( numMaxWidth < arrCells.length ){ numMaxWidth = arrCells.length; }
  }
  engine.log( " AutomatedTask TsvDataCheck:" + 
              " MinWidth:" + numMinWidth +
              " MaxWidth:" + numMaxWidth +
              " Lines:" + arrTsv.length +
              " (BlankLines:" + numBlanklines + ")" );

  /// Get numMinWidth and numMaxWidth (blank lines are excluded)
  let arr2dTsv      = [];
  for( let i = 0; i < arrTsv.length; i++ ){
    if( arrTsv[i] === "" ){ // Skip blank lines
      continue;
    }
    let arrTmp = [];
    let arrCells = arrTsv[i].split("\t");
    for( let j = 0; j < numMaxWidth; j++ ){
      if( j < arrCells.length ){
        arrTmp[j] = arrCells[j];
      }else{
        arrTmp[j] = "";
      }
    }
    arr2dTsv.push( arrTmp );
  }

  return arr2dTsv;
}

/*
Notes:
- When the process reaches this automated task, TSV is automatically read.
    - TSV: Monthly sales, log for questionnaire with rating, etc.
- If there is a blank line in the input TSV text, it will be skipped.
    - The line feed code for the last line is not added either.
- The output TSV text (total table TSV, etc.) is rectangular data.
    - The number of rows in TSV is #of {Y aggregate Key types} plus 2 ("header" + "total").
    - The number of columns of TSV is #of {X aggregate Key types} plus 2 ("heading" + "total").

APPENDIX:
- TSV (Tab Separated Values) text assumes rectangular data.
    - Data that is not rectangular is automatically formatted with empty characters.
    - Blank lines (including the end) are ignored.
- Only "DESC" or "ASC" is valid for the sort config.
    - If no sort setting, it will be in the order of appearance.
    - Character sorting is in code order.
    - If both character sort and numeric sort are configured, numeric sort is performed after character sort.
- Specify the field column with the column ID (starting with zero).
    - Parsing numeric field values depends on JavaScript `parseFloat()`.
    - If commas in the numeric field, they are considered a thousands separator (The removed string is parsed).
    - A string that cannot be parsed is considered zero.
- Division by zero is output as NaN.

Notes-ja:
- 案件が自動処理工程に到達した際、文字列型データに保存されているTSVが自動的に読み込まれます。
    - TSV: 月次売上、評点付きアンケートのログ、など
- 入力TSVテキストに空行がある場合、スキップされます。
    - 出力TSVの最終行に改行コードは付与されません。
- 出力されるTSVテキスト(合計値テーブルTSV等)は、矩形データです。
    - TSVの行数は、{Y集約Keyの種類数} に2("ヘッダ行" + "合計行")を加えた数になります。
    - TSVの列数は、{X集約Keyの種類数} に2("見出し列" + "合計列")を加えた数になります。

APPENDIX-ja:
- TSV(Tab Separated Values)テキストは、矩形データを前提とします。
    - 矩形でないデータは、空文字によって自動整形されます。
    - 空行(末尾改行を含む)は無視されます。
- ソート設定は "DESC" もしくは "ASC" のみが有効です。
    - ソート設定がない場合、出現順になります。
    - 文字ソートは文字コード順になります。
    - 文字ソートと数値ソートがどちらも設定された場合、文字ソートの後に数値ソートが実行されます。
- フィールド列の指定は、列ID(ゼロ始まり)で設定してください。
    - 数値フィールド値の解析(パース)は JavaScript `parseFloat()` に依存します。
    - 数値フィールド値にカンマが存在する場合、桁区切り文字とみなされます(除去された文字列が解析されます)。
    - 解析できない文字列はゼロと見なされます。
- 0 の除算は NaN と出力されます。
*/

Download

The Add-on import feature is available with Professional edition.
Freely modifiable JavaScript (ECMAScript) code. No warranty of any kind.

Notes

  • When the process reaches this automated task, the TSV is automatically read.
    • TSV: Monthly sales, log for questionnaire with rating, etc.
  • If there is a blank line in the input TSV text, it will be skipped.
    • The line feed code for the last line is not added either.
      - The output TSV text (total table TSV, etc.) is rectangular data.
    • The number of rows in TSV is #of {Y aggregate Key types} plus 2 (“header” + “total”).
    • The number of columns of TSV is #of {X aggregate Key types} plus 2 (“heading” + “total”).

Capture

No CAggregates the values in a numeric column by 2 aggregation-key columns: Pivot table. Sum, Percentage, Count and Average are tabulated as a cross table TSV. Eg, "Sales for each combination of Client-Y and Store-X" is aggregated from "Sales log TSV".ode Crosstab
Aggregates the values in a numeric column by 2 aggregation-key columns: Pivot table. Sum, Percentage, Count and Average are tabulated as a cross table TSV. Eg, "Sales for each combination of Client-Y and Store-X" is aggregated from "Sales log TSV".

Appendix

  • TSV (Tab Separated Values) text assumes rectangular data.
    • Data that is not rectangular is automatically formatted with empty characters.
    • Blank lines (including the end) are ignored.
  • Only “DESC” or “ASC” is valid for the sort config.
    • If no sort settings are specified, it will be sorted in order of appearance.
    • Character sorting is in code order.
    • If both character sort and numeric sort are configured, numeric sort is performed after character sort.
  • Specify the field column with the column ID (starting with zero).
    • Parsing numeric field values depends on JavaScript parseFloat().
    • If there are commas in the numeric field, they are considered as thousands separators (The removed string is parsed).
    • A string that cannot be parsed is considered zero.
  • Division by zero is output as NaN.

See also

4 thoughts on “TSV String, Create Cross Table for Numeric Column”

  1. Pingback: TSV String, Create Cross Tab for Numerical Sum – Questetra Support

  2. Pingback: TSV String, Switch Rows and Columns – Questetra Support

  3. Pingback: TSV String, Create Cross Table of Count – Questetra Support

  4. Pingback: String, Batch Add Thousands Separators – Questetra Support

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Scroll to Top

Discover more from Questetra Support

Subscribe now to keep reading and get access to the full archive.

Continue reading