TSV String: Extract Cells

TSV String, Extract Cells

translate TSV文字列, セルデータの抽出

Extracts the values of specified cells in A1 notation. The extracted string can be stored not only in the STRING type data but also in the NUMERIC type, DATE type, DATETIME type, SELECT type, USER type, and ORG type data if the format allows.

Auto Step icon
Configs for this Auto Step
StrConfA
A: Set TSV String *#{EL}
StrConfB
B: Set A1 notations of Extract Cells to each line (up to 8) *#{EL}
SelectConfC1
C1: Select DATA to store Extracted Cell String (update)
SelectConfC2
C2: Select DATA to store Extracted Cell String (update)
SelectConfC3
C3: Select DATA to store Extracted Cell String (update)
SelectConfC4
C4: Select DATA to store Extracted Cell String (update)
SelectConfC5
C5: Select DATA to store Extracted Cell String (update)
SelectConfC6
C6: Select DATA to store Extracted Cell String (update)
SelectConfC7
C7: Select DATA to store Extracted Cell String (update)
SelectConfC8
C8: Select DATA to store Extracted Cell String (update)
SelectConfD1
D1: Select NUMERIC for Number of TSV Lines (update)
SelectConfD2
D2: Select NUMERIC for Number of TSV Lines Trimmed (update)
Script (click to open)
// GraalJS Script (engine type: 3)


//////// START "main()" /////////////////////////////////////////////////////////////////
main();
function main(){ 

//// == Config Retrieving / 工程コンフィグの参照 ==
const strTsv        = configs.get       ( "StrConfA" );     // REQUIRED
  if( strTsv === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {A: TSV} is empty \n" );
  }
  const numTsvLines = strTsv.split("\n").length;
  const numTsvNoLf  = strTsv.replace(/[\n]*$/, "").split("\n").length;
                      // delete Line Feed codes at the end

  // get TSV as Array-Array (2d Array)
  const arr2dTsv    = parseAsRectangular( strTsv );  // [row,col]

const strExtractNotations   = configs.get       ( "StrConfB" );     // REQUIRED
  if( strExtractNotations === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B: IDs} not specified \n" );
  }

const multiPocketC1 = configs.getObject ( "SelectConfC1" );  // not required
const multiPocketC2 = configs.getObject ( "SelectConfC2" );  // not required
const multiPocketC3 = configs.getObject ( "SelectConfC3" );  // not required
const multiPocketC4 = configs.getObject ( "SelectConfC4" );  // not required
const multiPocketC5 = configs.getObject ( "SelectConfC5" );  // not required
const multiPocketC6 = configs.getObject ( "SelectConfC6" );  // not required
const multiPocketC7 = configs.getObject ( "SelectConfC7" );  // not required
const multiPocketC8 = configs.getObject ( "SelectConfC8" );  // not required

const arrMultiPocketC = [];
  arrMultiPocketC.push ( multiPocketC1 );
  arrMultiPocketC.push ( multiPocketC2 );
  arrMultiPocketC.push ( multiPocketC3 );
  arrMultiPocketC.push ( multiPocketC4 );
  arrMultiPocketC.push ( multiPocketC5 );
  arrMultiPocketC.push ( multiPocketC6 );
  arrMultiPocketC.push ( multiPocketC7 );
  arrMultiPocketC.push ( multiPocketC8 );

const numPocketD1 = configs.getObject ( "SelectConfD1" );  // not required
const numPocketD2 = configs.getObject ( "SelectConfD2" );  // not required



//// == Data Retrieving / ワークフローデータの参照 ==
// (nothing)



//// == Calculating / 演算 ==

/// Get A1Notations as Coordinate-Array for Extraction;
/// eg: "A1, C1, H1" -> [ [0,0], [2, 0], [7, 0] ]   // [colNum,rowNum] Array (0-start)
let   arrExtractCoords = [];
const regA1Notation = /^(?<colId>[A-Z]+)(?<rowNum>[0-9]+)$/; // RegExp
  // Named capturing group / 名前付きキャプチャグループ
  // https://developer.mozilla.org/docs/Web/JavaScript/Reference/Regular_expressions/Named_capturing_group

const arrExtractNotations = strExtractNotations.split("\n");
for ( let i = 0; i < arrExtractNotations.length; i++ ){
  if( arrExtractNotations[i] === "" ){ // Skip blank lines
    continue;
  }

  let objMatch = arrExtractNotations[i].match( regA1Notation );
  // engine.log( " debug:" + objMatch.groups.colId + " " + objMatch.groups.rowNum );
  let arrTmp = [ alpha2Decimal ( objMatch.groups.colId ) -1, objMatch.groups.rowNum -1 ]; // 0-start
  // engine.log( " debug:" + alpha2Decimal (objMatch.groups.colId) + " " + objMatch.groups.rowNum );

  arrExtractCoords.push ( arrTmp );
}



//// == Data Updating / ワークフローデータへの代入 ==
/// ref) Retrieving / Updating from ScriptTasks
/// https://questetra.zendesk.com/hc/en-us/articles/360024574771-R2301
/// https://questetra.zendesk.com/hc/en-us/articles/360024574771-R2301
// 
/// prepare
const regBpmsYMD     = /^\d{4}-\d{2}-\d{2}$/;                     // RegExp
const regEmail       = /^[\w][\w_\.\-\+]*@[\w\.\-]+\.[a-zA-Z]+$/; // RegExp
const regPositiveInt = /^([1-9]\d*|0)$/;                          // RegExp


///  Get Cell Values  and  Store (loop)

for ( let i = 0; i < arrExtractCoords.length; i++ ){
  let strCellValue = arr2dTsv[ arrExtractCoords[i][1] ][ arrExtractCoords[i][0] ];  // 0 start
  if( strCellValue === "" ){
    engine.log( " AutomatedTask CellString[" + arrExtractCoords[i][0]  + "][" + arrExtractCoords[i][1] + "]: (empty)" );
    engine.setData( arrMultiPocketC[i], null );
    return;
  }
  engine.log( " AutomatedTask CellString[" + arrExtractCoords[i][0]  + "][" + arrExtractCoords[i][1] + "]: " + strCellValue );

  // ▽▽Case: STRING▽▽
  if( arrMultiPocketC[i].matchDataType("STRING") ){
    engine.setData( arrMultiPocketC[i], strCellValue );

  }else if( arrMultiPocketC[i].matchDataType("DECIMAL") ){
    let numCellValue = parseFloat( strCellValue.replace(/,/g, "") );
    if( isNaN( numCellValue ) ){
      throw new Error( "\n AutomatedTask UnexpectedStringError:" +
                       " Invalid Numeric Value \n" );
    }
    engine.setData( arrMultiPocketC[i], new java.math.BigDecimal( numCellValue ) );

  // ▽▽Case: DATE▽▽
  }else if( arrMultiPocketC[i].matchDataType("DATE") ){
    if( regBpmsYMD.test( strCellValue ) ){
      engine.setData( arrMultiPocketC[i], java.sql.Date.valueOf( strCellValue ) );
    }else{
      engine.log( " AutomatedTask DateStringWarning:" +
                  " CellValue should be YYYY-MM-DD" );
      let numMillSec = Date.parse( strCellValue );
      if( isNaN( numMillSec ) ){
        throw new Error( "\n AutomatedTask UnexpectedStringError:" +
                         " Invalid Date Value \n" );
      }
      engine.setData( arrMultiPocketC[i], new java.sql.Date( numMillSec ) );
    }

  // ▽▽Case: DATETIME▽▽
  }else if( arrMultiPocketC[i].matchDataType("DATETIME") ){
    let strDatetimeTmp = strCellValue;
    if( regBpmsYMD.test( strDatetimeTmp ) ){
      strDatetimeTmp += "T00:00:00";
    }
    let numMillSec = Date.parse( strDatetimeTmp );
    if( isNaN( numMillSec ) ){
      throw new Error( "\n AutomatedTask UnexpectedStringError:" +
                       " Invalid Datetime Value \n" );
    }
    engine.setData( arrMultiPocketC[i], new java.sql.Timestamp( numMillSec ) );

  // ▽▽Case: SELECT_SINGLE▽▽
  }else if( arrMultiPocketC[i].matchDataType("SELECT_SINGLE") ){
    let arrTmp = new java.util.ArrayList();
    arrTmp.add( strCellValue );
    engine.setData( arrMultiPocketC[i], arrTmp );

  // ▽▽Case: QUSER▽▽
  }else if( arrMultiPocketC[i].matchDataType("QUSER") ){
    let quserTmp;
    if( regPositiveInt.test( strCellValue ) ){
      quserTmp = quserDao.findById( parseInt(strCellValue) );
      if( quserTmp === null ){
        throw new Error( "\n AutomatedTask UnexpectedStringError:" +
                         " Invalid User Value \n" );
      }
      engine.setData( arrMultiPocketC[i], quserTmp );
    }else if( regEmail.test( strCellValue ) ){
      quserTmp = quserDao.findByEmail( strCellValue );
      if( quserTmp === null ){
        throw new Error( "\n AutomatedTask UnexpectedStringError:" +
                         " Invalid User Value \n" );
      }
      engine.setData( arrMultiPocketC[i], quserTmp );
    }else{
      throw new Error( "\n AutomatedTask UnexpectedStringError:" +
                       " CellValue must be UserId or UserEmail \n" );
    }

  // ▽▽Case: QGROUP▽▽
  }else if( arrMultiPocketC[i].matchDataType("QGROUP") ){
    let qorgTmp;
    if( regPositiveInt.test( strCellValue ) ){
      qorgTmp = qgroupDao.findById( parseInt(strCellValue) );
      if( qorgTmp === null ){
        throw new Error( "\n AutomatedTask UnexpectedStringError:" +
                         " Invalid Organization Value \n" );
      }
      engine.setData( arrMultiPocketC[i], qorgTmp );
    }else{
      throw new Error( "\n AutomatedTask UnexpectedStringError:" +
                       " CellValue must be OrganizationId \n" );
    }

  // ▽▽Case: error▽▽
  }else{ // Unknown
    throw new Error( "\n AutomatedTask CriticalError:" +
                     " DataType Unmatched \n" );
  }
}

if ( numPocketD1 !== null ){ 
  engine.setData( numPocketD1, new java.math.BigDecimal( numTsvLines ) );
}
if ( numPocketD2 !== null ){ 
  engine.setData( numPocketD2, new java.math.BigDecimal( numTsvNoLf ) );
}

} //////// END "main()" /////////////////////////////////////////////////////////////////


function alpha2Decimal ( str ){
  // alpha2Decimal ( alpha: string ): number
  // alpha2Decimal ('a') -> 1
  // alpha2Decimal ('AA') -> 27

  str = str.toUpperCase();
  let numSum = 0;
  const numLen = str.length;
  let   i      = numLen;
  while( -- i > -1 ){
    numSum += ( str.charCodeAt( i ) - 64 ) * Math.pow( 26, numLen - 1 - i ); 
  }
  return numSum;
}


// Parses TSV string as two-dimensional rectangular data matrix and creates a 2D array.
function parseAsRectangular( strTsv ){
  const arrTsv = strTsv.split("\n");

  /// Get numMinWidth and numMaxWidth (blank lines are excluded)
  let numMinWidth   = Infinity; // cf. String-Type Max: 1 million
  let numMaxWidth   = 0;
  let numBlanklines = 0;
  for( let i = 0; i < arrTsv.length; i++ ){
    if( arrTsv[i] === "" ){ // Skip blank lines
      numBlanklines += 1;
      continue;
    }
    let arrCells = arrTsv[i].split("\t");
    if( numMinWidth > arrCells.length ){ numMinWidth = arrCells.length; }
    if( numMaxWidth < arrCells.length ){ numMaxWidth = arrCells.length; }
  }
  engine.log( " AutomatedTask TsvDataCheck:" + 
              " MinWidth:" + numMinWidth +
              " MaxWidth:" + numMaxWidth +
              " Lines:" + arrTsv.length +
              " (BlankLines:" + numBlanklines + ")" );

  /// Get numMinWidth and numMaxWidth (blank lines are excluded)
  let arr2dTsv      = [];
  for( let i = 0; i < arrTsv.length; i++ ){
    if( arrTsv[i] === "" ){ // Skip blank lines
      continue;
    }
    let arrTmp = [];
    let arrCells = arrTsv[i].split("\t");
    for( let j = 0; j < numMaxWidth; j++ ){
      if( j < arrCells.length ){
        arrTmp[j] = arrCells[j];
      }else{
        arrTmp[j] = "";
      }
    }
    arr2dTsv.push( arrTmp );
  }

  return arr2dTsv;
}




/*
NOTES
- If you place this "Automated Step" in the Workflow diagram, the Extraction will be executed.
- Extracts any cell in the TSV text (specified using A1 notation).
    - If specify `B3` in the config, the third row of column B (second column) will be extracted.
    - If specify `B#{#q_i}` in the config, the i-th row will be extracted.
    - Up to eight cells can be extracted at the same time.
- The number of lines of TSV text can also be extracted at the same time.
- "A1 notation" is a syntax used to define a cell (or cell range) using the column letter and row number.
    - The top left cell (cell in column A, first row) is specified as `A1`.
    - The cell to the right of that is specified as `B1`, and the cell below is specified as `A2`.
- An error will occur if the TSV string is an empty string.
    - An error will occur if the specified cell does not exist.
- If there is a blank line in the input TSV text, it will be skipped.
    - The line feed code for the last line is not added either.
- The data will be deleted if the extracted Cell string is empty. (overwritte with `null`)
- Numerical String is evaluated with `parseFloat()` after removing the comma.
    - An error will occur, if failed to parse.
    - Truncation after the decimal point depends on the format of DataItem definition
- Date or Datetime String is evaluated with `Date()`
    - An error will occur, if failed to parse.
    - `new Date('December 17, 1995 03:24:00')`
    - `new Date('1995-12-17T03:24:00')`
    - `new Date('1995-12-17 03:24')`
    - `new Date('1995-12-17')`
        - In principle, "format including time" is interpreted as local time.
        - In principle, "date only format" is interpreted as UTC time.
        - However, in case of "YYYY-MM-DD", "T00:00:00" is added and localized.
- SELECT String  will be stored only if match the Choice ID (option id)
    - NO ERROR will occur (cannot be detected), even if it does not match any ID.
- QUSER String will be stored only if match the user id or email address.
    - An error will occur, if user not found.
- QORG String will be stored only if match the organization id.
    - An error will occur, if organization not found.

NOTES-ja
- この[自動工程]をワークフロー図に配置すれば、案件到達の度に「抽出処理」が自動実行されます。
- TSVテキスト内の任意のセル(A1表記で指定)を自動的に抽出します。
    - [自動工程]のコンフィグにて `B3` と指定しておけば、B列(第2列目)の3行目が抽出されます。
    - [自動工程]のコンフィグにて `B#{#q_i}` と指定しておけば、B列(第2列目)のi行目が抽出されます。
    - 同時に8セルまで抽出できます。
- TSVテキストの行数も同時に抽出できます。
- "A1表記" とは、列文字と行番号を使用してセル (またはセルの範囲) を定義するために使用される構文です。
    - 最も左上のセル(A列1行目のセル)は `A1` です。
    - その右隣のセルは `B1` 、下隣のセルは `A2` です。
- 入力TSVテキストに空行がある場合、スキップされます。
    - 出力TSVの最終行に改行コードは付与されません。
- TSV文字列が空文字列の場合、エラーとなります。
    - 指定セルが存在しない場合、エラーとなります。
- 抽出した文字列が空文字列の場合、格納先のデータは削除されます(`null` で上書)
- 数値文字列は、カンマ(桁区切り文字)が除去された上で、yparseFloat()` 関数で評価されます
    - 文字列の解析に失敗した場合、エラーとなります
    - 小数点以下の切り捨て処理は、格納先データ項目の書式に寄ります
- 日付文字列および日時文字列は、`Date()` 関数で評価されます
    - 文字列の解析に失敗した場合、エラーとなります
    - `new Date('December 17, 1995 03:24:00')`
    - `new Date('1995-12-17T03:24:00')`
    - `new Date('1995-12-17 03:24')`
    - `new Date('1995-12-17')`
        - なお「時刻を含む形式」はローカルタイムとして解釈されます
        - また「日付のみの形式」は原則 UTC 時刻と解釈されます
        - ただし "YYYY-MM-DD" 書式は "T00:00:00" が自動付加(ローカルタイム化)されます
- 選択文字列は、選択肢IDに合致する場合に限り代入されます
    - なお、いずれのIDにも合致しない場合でも、エラーになりません(検知できません)
- ユーザを表す文字列は、ユーザIDもしくはメールアドレスに合致する場合に限り代入されます
    - ユーザが存在しない場合、エラーとなります
- 組織を表す文字列は、組織IDに合致する場合に限り代入されます
    - 組織が存在しない場合、エラーとなります
*/

/*
APPENDIX
- Error Log sample1
    - AutomatedTask ConfigError: Config {B:ID} must be less than TSV size
- Error Log sample2 (Wrong Email, Wrong Id, etc)
    - AutomatedTask UnexpectedStringError: Invalid User Value 
*/

Download

warning Freely modifiable JavaScript (ECMAScript) code. No warranty of any kind.
(Installing Addon Auto-Steps are available only on the Professional edition.)

Notes

  • If you place this Automated Step in the Workflow diagram, the extraction process will be executed automatically each time a token arrives.
  • Extracts any cell in the TSV text (specified using A1 notation).
    • If you specify B3 in the configs, the third row of column B (second column) will be extracted.
    • If you specify B#{#q_i} in the configs, the i-th row will be extracted.
    • Up to eight cells can be extracted at the same time.
  • The number of lines of TSV text can also be extracted at the same time.
  • “A1 notation” is a syntax used to define a cell (or cell range) using the column letter and row number.
    • The top left cell (cell in column A, first row) is specified as A1.
    • The cell to the right of that is specified as B1, and the cell below is specified as A2.
  • An error will occur if the TSV string is an empty string.
    • An error will occur if the specified cell does not exist.
  • If there is a blank line in the input TSV text, it will be skipped.
    • The line feed code for the last line is not added either.
  • The data will be deleted if the extracted Cell string is empty. (overwritten with null)
  • Numerical Strings are evaluated with parseFloat() after removing commas.
    • An error will occur if parsing the string fails.
    • Truncation after the decimal point depends on the format of the Data Item definition
  • Date or Datetime Strings are evaluated with Date()
    • An error will occur if parsing the string fails.
    • new Date('December 17, 1995 03:24:00')
    • new Date('1995-12-17T03:24:00')
    • new Date('1995-12-17 03:24')
    • new Date('1995-12-17')
      • Formats that include the time are interpreted as local time.
      • Formats that only include the date are generally interpreted as UTC time.
      • However, YYYYY-MM-DD format will automatically add “T00:00:00” (local time)
  • SELECT Strings will be stored only if they match the Choice ID (option id)
    • If it does not match any ID, NO ERROR will occur (cannot be detected)
  • QUSER Strings will be stored only if they match the user id or email address.
    • An error will occur if the user is not found.
  • QORG Strings will be stored only if they match the organization id.
    • An error will occur if the organization is not found.

Capture

Appendix

  • Error Log sample1
    • AutomatedTask ConfigError: Config {B:ID} must be less than TSV size
  • Error Log sample2 (Wrong Email, Wrong Id, etc)
    • AutomatedTask UnexpectedStringError: Invalid User Value

See Also

Discover more from Questetra Support

Subscribe now to keep reading and get access to the full archive.

Continue reading