TSV String, Create Cross Table of Count

TSV String, Create Cross Table of Count

TSV String, Create Cross Table of Count
Aggregates the count of data by 2 aggregation-key columns (pivot table). Count and Percentage are tabulated as a cross table TSV. E.g. Survey Results TSV is automatically generated from Survey Records TSV. 2D frequency distribution.
Configs
  • A1: Set TSV *#{EL}
  • B2: Set Column ID of Aggregation-Key Y-Field (eg “4” ) *#{EL}
  • B3: Set Column ID of Aggregation-Key X-Field (eg “5” ) *#{EL}
  • C1: To Sort by Count, Set DESC or ASC (eg “ASC” )#{EL}
  • C2: To Sort by Y-Agg-Key, Set DESC or ASC (eg “ASC” )#{EL}
  • C3: To Sort by X-Agg-Key, Set DESC or ASC (eg “ASC” )#{EL}
  • D1: Select STRING DATA that stores Count Cross TSV (update)
  • D2: Select STRING DATA that stores Percent Cross TSV (update)
Script (click to open)
// GraalJS Script (engine type: 2)

//////// START "main()" /////////////////////////////////////////////////////////////////
main();
function main(){ 

//// == Config Retrieving / 工程コンフィグの参照 ==
const strTsv         = configs.get( "StrConfA1" );           /// REQUIRED /////////////
  if( strTsv       === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {A1: Tsv} is empty \n" );
  }
  const arr2dTsv     = parseAsRectangular( strTsv );
const strYaggField   = configs.get( "StrConfB2" );           /// REQUIRED /////////////
  if( strYaggField === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B2: Y-AggField} is empty \n" );
  }
  const numYaggField = parseInt( strYaggField, 10 );
  if( isNaN( numYaggField ) || numYaggField < 0 ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B2: YaggField} must be a positive integer \n" );
  }
  if( numYaggField  >= arr2dTsv[0].length ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B2: YaggField} is larger than TsvWidth \n" );
  }
const strXaggField   = configs.get( "StrConfB3" );           /// REQUIRED /////////////
  if( strXaggField === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B3: X-AggField} is empty \n" );
  }
  const numXaggField = parseInt( strXaggField, 10 );
  if( isNaN( numXaggField ) || numXaggField < 0 ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B3: XaggField} must be a positive integer \n" );
  }
  if( numXaggField  >= arr2dTsv[0].length ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B3: XaggField} is larger than TsvWidth \n" );
  }
const strNumSort     = configs.get( "StrConfC1" );           // NotRequired ///////////
const strYaggSort    = configs.get( "StrConfC2" );           // NotRequired ///////////
const strXaggSort    = configs.get( "StrConfC3" );           // NotRequired ///////////

const strPocketCountPivot   = configs.getObject( "SelectConfD1" ); // NotRequired /////
const strPocketPercentPivot = configs.getObject( "SelectConfD2" ); // NotRequired /////


//// == Data Retrieving / ワークフローデータの参照 ==
// (Nothing. Retrieved via Expression Language in Config Retrieving)


//// == Calculating / 演算 ==

// omit repeated Keys in Y-Aggregation Field values (make Keys uniq)
let arrYaggKeys = [];
for( let i = 0; i < arr2dTsv.length; i++ ){
  if( arrYaggKeys.indexOf( arr2dTsv[i][ numYaggField ] ) === -1){
      arrYaggKeys.push(    arr2dTsv[i][ numYaggField ] );
  }
  // Array.indexOf(): strict equality `===` 
  // https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array/indexOf
}
// omit repeated Keys in X-Aggregation Field values (make Keys uniq)
let arrXaggKeys = [];
for( let i = 0; i < arr2dTsv.length; i++ ){
  if( arrXaggKeys.indexOf( arr2dTsv[i][ numXaggField ] ) === -1){
      arrXaggKeys.push(    arr2dTsv[i][ numXaggField ] );
  }
  // Array.indexOf(): strict equality `===` 
  // https://developer.mozilla.org/docs/Web/JavaScript/Reference/Global_Objects/Array/indexOf
}

// sort by Keys
if( strYaggSort === "ASC" ){ // ASC: ascending alphabetical from A to Z
  arrYaggKeys.sort( function( strA, strB ){
    if( strA > strB ){ return  1; }
    if( strA < strB ){ return -1; }
    return 0;
  });
}
if( strYaggSort === "DESC" ){ // DESC: descending from Z to A
  arrYaggKeys.sort( function( strA, strB ){
    if( strA < strB ){ return  1; }
    if( strA > strB ){ return -1; }
    return 0;
  });
}
if( strXaggSort === "ASC" ){ // ASC: ascending alphabetical from A to Z
  arrXaggKeys.sort( function( strA, strB ){
    if( strA > strB ){ return  1; }
    if( strA < strB ){ return -1; }
    return 0;
  });
}
if( strXaggSort === "DESC" ){ // DESC: descending from Z to A
  arrXaggKeys.sort( function( strA, strB ){
    if( strA < strB ){ return  1; }
    if( strA > strB ){ return -1; }
    return 0;
  });
}

// initialize arr2dCountPivot[y][x]
let arr2dCountPivot = [];
for( let i = 0; i < arrYaggKeys.length; i++ ){
  arr2dCountPivot[i] = [];
  for( let j = 0; j < arrXaggKeys.length; j++ ){
    arr2dCountPivot[i][j] = 0;
  }
}
// initialize arr2dPercentPivot[y][x]
let arr2dPercentPivot = [];
for( let i = 0; i < arrYaggKeys.length; i++ ){
  arr2dPercentPivot[i] = [];
  for( let j = 0; j < arrXaggKeys.length; j++ ){
    arr2dPercentPivot[i][j] = 0;
  }
}

// count up
for( let i = 0; i < arr2dTsv.length; i++ ){
  let strYKey = arr2dTsv[i][numYaggField];
  let strXKey = arr2dTsv[i][numXaggField];
  arr2dCountPivot[ arrYaggKeys.indexOf( strYKey ) ][ arrXaggKeys.indexOf( strXKey ) ] += 1;
}

// X-subtotal Count
let arrXsubtotalCount = [];
  arrXsubtotalCount.push( "total" );
for( let i = 0; i < arr2dCountPivot[0].length; i++ ){
  let numSubtotal = 0;
  for( let j = 0; j < arr2dCountPivot.length; j++ ){
    numSubtotal += arr2dCountPivot[j][i];
  }
  arrXsubtotalCount.push( numSubtotal );
}
arrXsubtotalCount.push( arr2dTsv.length );

// add Y-key and Y-subtotal Count
for( let i = 0; i < arr2dCountPivot.length; i++ ){
  let numSubtotal = 0;
  for( let j = 0; j < arr2dCountPivot[i].length; j++ ){
    numSubtotal += arr2dCountPivot[i][j];
  }
  arr2dCountPivot[i].push( numSubtotal );       // add Y-total to the end of an array
  arr2dCountPivot[i].unshift( arrYaggKeys[i] ); // add Y-key to the beginning of an array
}

// sort by Y-subtotal
if( strNumSort === "ASC" ){ // ASC: ascending alphabetical from 0 to 10
  arr2dCountPivot.sort( function( arrA, arrB ){
    if( arrA[arrA.length-1] > arrB[arrB.length-1] ){ return  1; }
    if( arrA[arrA.length-1] < arrB[arrB.length-1] ){ return -1; }
    return 0;
  });
}
if( strNumSort === "DESC" ){ // DESC: descending from 10 to 0
  arr2dCountPivot.sort( function( arrA, arrB ){
    if( arrA[arrA.length-1] < arrB[arrB.length-1] ){ return  1; }
    if( arrA[arrA.length-1] > arrB[arrB.length-1] ){ return -1; }
    return 0;
  });
}

// calc Percent
for( let i = 0; i < arr2dCountPivot.length; i++ ){
  arr2dPercentPivot[i][0] = arr2dCountPivot[i][0]; // Y-key
  for( let j = 1; j < arr2dCountPivot[0].length; j++ ){
    arr2dPercentPivot[i][j] = (arr2dCountPivot[i][j] / arr2dTsv.length).toFixed(3);
  }
}

// X-subtotal Percent and Average
let arrXsubtotalPercent = [];
  arrXsubtotalPercent.push( "total" );
for( let i = 1; i < arrXsubtotalCount.length; i++ ){
  arrXsubtotalPercent.push( (arrXsubtotalCount[i] / arr2dTsv.length).toFixed(3) );
}

// output Cross Tabulation
let strCountPivot = "";
strCountPivot += "cross\t" + arrXaggKeys.join("\t") + "\ttotal\n";
for( let i = 0; i < arr2dCountPivot.length; i++ ){
  strCountPivot += arr2dCountPivot[i].join("\t") + "\n";
}
strCountPivot += arrXsubtotalCount.join("\t");

let strPercentPivot = "";
strPercentPivot += "cross\t" + arrXaggKeys.join("\t") + "\ttotal\n";
for( let i = 0; i < arr2dPercentPivot.length; i++ ){
  strPercentPivot += arr2dPercentPivot[i].join("\t") + "\n";
}
strPercentPivot += arrXsubtotalPercent.join("\t");


//// == Data Updating / ワークフローデータへの代入 ==
if( strPocketCountPivot !== null ){
  engine.setData( strPocketCountPivot, strCountPivot );
}
if( strPocketPercentPivot !== null ){
  engine.setData( strPocketPercentPivot, strPercentPivot );
}


} //////// END "main()" /////////////////////////////////////////////////////////////////


// Parses TSV string as two-dimensional rectangular data matrix and creates a 2D array.
function parseAsRectangular( strTsv ){
  const arrTsv = strTsv.split("\n");

  /// Get numMinWidth and numMaxWidth (blank lines are excluded)
  let numMinWidth   = Infinity; // cf. String-Type Max: 1 million
  let numMaxWidth   = 0;
  let numBlanklines = 0;
  for( let i = 0; i < arrTsv.length; i++ ){
    if( arrTsv[i] === "" ){ // Skip blank lines
      numBlanklines += 1;
      continue;
    }
    let arrCells = arrTsv[i].split("\t");
    if( numMinWidth > arrCells.length ){ numMinWidth = arrCells.length; }
    if( numMaxWidth < arrCells.length ){ numMaxWidth = arrCells.length; }
  }
  engine.log( " AutomatedTask TsvDataCheck:" + 
              " MinWidth:" + numMinWidth +
              " MaxWidth:" + numMaxWidth +
              " Lines:" + arrTsv.length +
              " (BlankLines:" + numBlanklines + ")" );

  /// Get numMinWidth and numMaxWidth (blank lines are excluded)
  let arr2dTsv      = [];
  for( let i = 0; i < arrTsv.length; i++ ){
    if( arrTsv[i] === "" ){ // Skip blank lines
      continue;
    }
    let arrTmp = [];
    let arrCells = arrTsv[i].split("\t");
    for( let j = 0; j < numMaxWidth; j++ ){
      if( j < arrCells.length ){
        arrTmp[j] = arrCells[j];
      }else{
        arrTmp[j] = "";
      }
    }
    arr2dTsv.push( arrTmp );
  }

  return arr2dTsv;
}

/*
TSV Example:
2021-08	Orange	200
2021-08	Apple	1,200
2021-09	Apple	1,100
2021-09	Orange	200
2021-09	Tomato	500
2021-09	Orange	200
2021-09	Tomato	500
2021-10	Tomato	500
2021-10	Tomato	500
2021-10	Tomato	450
2021-10	Tomato	450


Notes:
- When the process reaches this automated task, TSV is automatically read.
    - TSV: Sales Records, Survey Records, etc.
- If there is a blank line in the input TSV text, it will be skipped.
    - The line feed code for the last line is not added either.
- The output TSV text (total table TSV, etc.) is rectangular data.
    - The number of rows in TSV is #of {Y aggregate Key types} plus 2 ("header" + "total").
    - The number of columns of TSV is #of {X aggregate Key types} plus 2 ("heading" + "total").

APPENDIX:
- TSV (Tab Separated Values) text assumes rectangular data.
    - Data that is not rectangular is automatically formatted with empty characters.
    - Blank lines (including the end) are ignored.
- Only "DESC" or "ASC" is valid for the sort config.
    - If no sort setting, it will be in the order of appearance.
    - Character sorting is in code order.
    - If both character sort and numeric sort are configured, numeric sort is performed after character sort.

Notes-ja:
- 案件が自動処理工程に到達した際、文字列型データに保存されているTSVが自動的に読み込まれます。
    - TSV: 売上レコード、アンケートのログ、など
- 入力TSVテキストに空行がある場合、スキップされます。
    - 出力TSVの最終行に改行コードは付与されません。
- 出力されるTSVテキスト(合計値テーブルTSV等)は、矩形データです。
    - TSVの行数は、{Y集約Keyの種類数} に2("ヘッダ行" + "合計行")を加えた数になります。
    - TSVの列数は、{X集約Keyの種類数} に2("見出し列" + "合計列")を加えた数になります。

APPENDIX-ja:
- TSV(Tab Separated Values)テキストは、矩形データを前提とします。
    - 矩形でないデータは、空文字によって自動整形されます。
    - 空行(末尾改行を含む)は無視されます。
- ソート設定は "DESC" もしくは "ASC" のみが有効です。
    - ソート設定がない場合、出現順になります。
    - 文字ソートは文字コード順になります。
    - 文字ソートと数値ソートがどちらも設定された場合、文字ソートの後に数値ソートが実行されます。
*/

Download

2021-09-12 (C) Questetra, Inc. (MIT License)
https://support.questetra.com/addons/tsv-string-create-cross-table-of-count-2021/
The Add-on import feature is available with Professional edition.
Freely modifiable JavaScript (ECMAScript) code. No warranty of any kind.

Notes

  • When the process reaches this automated task, the TSV is read automatically.
    • TSV: Sales Records, Survey Records, etc.
  • If there is a blank line in the input TSV text, it will be skipped.
    • The line feed code for the last line is not added either.
      - The output TSV text (total table TSV, etc.) is rectangular data.
    • The number of rows in TSV is #of {Y aggregate Key types} plus 2 (“header” + “total”).
    • The number of columns of TSV is #of {X aggregate Key types} plus 2 (“heading” + “total”).

Capture

Aggregates the count of data by 2 aggregation-key columns: Pivot table. Count and Percentage are tabulated as a cross table TSV. Eg, Survey Results TSV is automatically generated from Survey Records TSV. 2D frequency distribution.
Aggregates the count of data by 2 aggregation-key columns: Pivot table. Count and Percentage are tabulated as a cross table TSV. Eg, Survey Results TSV is automatically generated from Survey Records TSV. 2D frequency distribution.

Appendix

  • TSV (Tab Separated Values) text assumes rectangular data.
    • Data that is not rectangular is automatically formatted with empty characters.
    • Blank lines (including the end) are ignored.
  • Only “DESC” or “ASC” is valid for the sort config.
    • If no sort setting is specified, it will be sorted in the order of appearance.
    • Character sorting is in code order.
    • If both character sort and numeric sort are configured, numeric sort is performed after character sort.

See also

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Scroll to Top

Discover more from Questetra Support

Subscribe now to keep reading and get access to the full archive.

Continue reading