TSV String, Filter by Keywords

TSV String, Filter by Keywords

TSV 文字列, Keywords フィルタ

Filters TSV text to column-matched lines only. The match condition can be either “equals”, “contains”, “begins with”, or “ends with”. A case-insensitive filter can also be specified. It is also possible to separately store the excluded rows (filtered rows).

Auto Step icon
Configs for this Auto Step
StrConfA1
A1: Set TSV Text *#{EL}
StrConfB0
B0: Set ID of COLUMN to be Matched (eg: “0”) *#{EL}
OptionConfB1
B1: Select OPERATOR for Matching Condition *
StrConfB2
B2: Set KEYWORD for Matching Condition (Multi Rows: OR-Search) *#{EL}
SelectConfC1
C1: Select STRING that stores Narrowed Text (update)
SelectConfC2
C2: Select STRING that stores Removed Lines (update)
Script (click to open)
// GraalJS Script (engine type: 2)

//////// START "main()" ////////////////////////////////////////////////////////////////
main();
function main(){ 

//// == Config Retrieving / 工程コンフィグの参照 ==
const strInput         = configs.get      ( "StrConfA1" );    /// REQUIRED
  if( strInput       === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {A1: String} is empty \n" );
  }
const strColMatchId    = configs.get      ( "StrConfB0" );    /// REQUIRED
  if( strColMatchId  === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B0: String} is empty \n" );
  }
const strOperator      = configs.get      ( "OptionConfB1" ); /// REQUIRED (SELECT_ITEM)
  // "equals",  "conains",  "stratswith",  "endswith",
  // "equals2", "conains2", "startswith2", "endswith2" (2:ignore case)
const strKeywords      = configs.get      ( "StrConfB2" );    /// REQUIRED
  if( strKeywords    === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B2: String} is empty \n" );
  }
const strPocketOutput  = configs.getObject( "SelectConfC1" ); // NotRequired
const strPocketRemoved = configs.getObject( "SelectConfC2" ); // NotRequired


//// == Data Retrieving / ワークフローデータの参照 ==
// (Nothing. Retrieved via Expression Language in Config Retrieving)


//// == Calculating / 演算 ==
const numColMatchId    = parseInt ( strColMatchId );

const arrInput     = strInput.split( '\n' );
const arrKeywords  = strKeywords.split( '\n' );

let   arrOutput    = [];
let   arrRemoved   = [];

if ( strOperator === "equals" ) {
  loopInput:
  for ( let i = 0; i < arrInput.length; i++ ) {
    for ( let j = 0; j < arrKeywords.length; j++ ) {
      if ( arrKeywords[j] === "" ){
        engine.log( " AutomatedTask RuntimeWorning: Empty in Keywords config" );
        continue;
      }
      if ( arrInput[i].split( '\t', (numColMatchId + 1) )?.[ numColMatchId ] ===
          arrKeywords[j] ){
          // optional chaining (?.) operator returns `undefined` instead of throwing an error.
          // https://developer.mozilla.org/docs/Web/JavaScript/Reference/Operators/Optional_chaining
        arrOutput.push ( arrInput[i] );
        continue loopInput;
      }
    }
    arrRemoved.push ( arrInput[i] );
  }
} else if ( strOperator === "equals2" ) {
  loopInput:
  for ( let i = 0; i < arrInput.length; i++ ) {
    for ( let j = 0; j < arrKeywords.length; j++ ) {
      if ( arrKeywords[j] === "" ){
        engine.log( " AutomatedTask RuntimeWorning: Empty in Keywords config" );
        continue;
      }
      if ( arrInput[i].split( '\t', (numColMatchId + 1) )?.[ numColMatchId ].toUpperCase() ===
           arrKeywords[j].toUpperCase() ){
        arrOutput.push ( arrInput[i] );
        continue loopInput;
      }
    }
    arrRemoved.push ( arrInput[i] );
  }
} else if ( strOperator === "contains" ) {
  loopInput:
  for ( let i = 0; i < arrInput.length; i++ ) {
    for ( let j = 0; j < arrKeywords.length; j++ ) {
      if ( arrKeywords[j] === "" ){
        engine.log( " AutomatedTask RuntimeWorning: Empty in Keywords config" );
        continue;
      }
      if ( arrInput[i].split( '\t', (numColMatchId + 1) )?.[ numColMatchId ]
           .includes( arrKeywords[j] ) ){
        arrOutput.push ( arrInput[i] );
        continue loopInput;
      }
    }
    arrRemoved.push ( arrInput[i] );
  }
} else if ( strOperator === "contains2" ) {
  loopInput:
  for ( let i = 0; i < arrInput.length; i++ ) {
    for ( let j = 0; j < arrKeywords.length; j++ ) {
      if ( arrKeywords[j] === "" ){
        engine.log( " AutomatedTask RuntimeWorning: Empty in Keywords config" );
        continue;
      }
      if ( arrInput[i].split( '\t', (numColMatchId + 1) )?.[ numColMatchId ].toUpperCase()
           .includes( (arrKeywords[j].toUpperCase()) ) ){
        arrOutput.push ( arrInput[i] );
        continue loopInput;
      }
    }
    arrRemoved.push ( arrInput[i] );
  }
} else if ( strOperator === "startswith" ) {
  loopInput:
  for ( let i = 0; i < arrInput.length; i++ ) {
    for ( let j = 0; j < arrKeywords.length; j++ ) {
      if ( arrKeywords[j] === "" ){
        engine.log( " AutomatedTask RuntimeWorning: Empty in Keywords config" );
        continue;
      }
      if ( arrInput[i].split( '\t', (numColMatchId + 1) )?.[ numColMatchId ]
           .startsWith( arrKeywords[j] ) ){
        arrOutput.push ( arrInput[i] );
        continue loopInput;
      }
    }
    arrRemoved.push ( arrInput[i] );
  }
} else if ( strOperator === "startswith2" ) {
  loopInput:
  for ( let i = 0; i < arrInput.length; i++ ) {
    for ( let j = 0; j < arrKeywords.length; j++ ) {
      if ( arrKeywords[j] === "" ){
        engine.log( " AutomatedTask RuntimeWorning: Empty in Keywords config" );
        continue;
      }
      if ( arrInput[i].split( '\t', (numColMatchId + 1) )?.[ numColMatchId ].toUpperCase()
           .startsWith( (arrKeywords[j].toUpperCase()) ) ){
        arrOutput.push ( arrInput[i] );
        continue loopInput;
      }
    }
    arrRemoved.push ( arrInput[i] );
  }
} else if ( strOperator === "endswith" ) {
  loopInput:
  for ( let i = 0; i < arrInput.length; i++ ) {
    for ( let j = 0; j < arrKeywords.length; j++ ) {
      if ( arrKeywords[j] === "" ){
        engine.log( " AutomatedTask RuntimeWorning: Empty in Keywords config" );
        continue;
      }
      if ( arrInput[i].split( '\t', (numColMatchId + 1) )?.[ numColMatchId ]
           .endsWith( arrKeywords[j] ) ){
        arrOutput.push ( arrInput[i] );
        continue loopInput;
      }
    }
    arrRemoved.push ( arrInput[i] );
  }
} else if ( strOperator === "endswith2" ) {
  loopInput:
  for ( let i = 0; i < arrInput.length; i++ ) {
    for ( let j = 0; j < arrKeywords.length; j++ ) {
      if ( arrKeywords[j] === "" ){
        engine.log( " AutomatedTask RuntimeWorning: Empty in Keywords config" );
        continue;
      }
      if ( arrInput[i].split( '\t', (numColMatchId + 1) )?.[ numColMatchId ].toUpperCase()
           .endsWith( (arrKeywords[j].toUpperCase()) ) ){
        arrOutput.push ( arrInput[i] );
        continue loopInput;
      }
    }
    arrRemoved.push ( arrInput[i] );
  }
} else {
  throw new Error( "\n AutomatedTask RuntimeError:" +
                   " Unexpected OPERATOR \n" );
}

let strOutput  = arrOutput?.join( '\n' ) ?? "";
let strRemoved = arrRemoved?.join( '\n' ) ?? "";
// https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Optional_chaining


//// == Data Updating / ワークフローデータへの代入 ==
if ( strPocketOutput !== null ) {
  engine.setData ( strPocketOutput,  strOutput );
}
if ( strPocketRemoved !== null ) {
  engine.setData ( strPocketRemoved, strRemoved );
}

} //////// END "main()" /////////////////////////////////////////////////////////////////



/*
Notes:
- When a process reaches this automated step, "TSV text" is automatically processed.
    - Column value in each line of TSV is compared to the Keyword, and only lines matching the condition are output.
        - If "0" is specified for the column ID, the first value of each line and Keyword are compared.
        - If "1" is specified for the column ID, the second value of each line and Keyword are compared.
    - The match condition can be one of the following: "equals", "contains", "starts with", or "ends with".
    - Note that it is possible to store not only filtered-down rows, but also filtered-out rows.
        - In other words, "does not equal", "does not contain", "does not start with" and "does not end with".
- In case of OR search, set each Keyword separately in each line.
    - OR search: "equals one of the Keywords", "contains one of", "starts with one of", "ends with one of"
    - For example, if you want to extract rows containing "Tokyo" or "Kyoto"
    - Set "Tokyo" on the first row and "Kyoto" on the second row. (OR-Search)
- For AND search, design a workflow diagram such as "Multiple automated steps" or "Place in loop structure".
    - For example, if you want to extract rows containing "Kyoto" and "Spa" design the process in two steps.
        - First, extract rows containing "Kyoto".
        - Then, extract rows containing "Spa".
- You can automate the work process in various business processes (workflow).
    - Confirm registration of product master
    - Confirm registration of e-mail address list
    - Extract only specific header information from mail headers
    - Extract only specific city addresses from a customer address list
    - Extract only the current week's records from server logs on a regular basis
    - Extract only the current month's log from attendance logs or absence logs on a regular basis

Appendix-en:
- In the Keyword setting field, blank lines (blank character settings) are skipped.
    - There is no way to match (extract) only blank lines.
        - In other words, blank lines are always filtered out.
    - If you want to remove blank lines in advance:
        - Use "Multiline String, Delete Empty Lines"
- If the amount of calculation is too large, the process will be terminated in the middle.
    - A large number of Lines and Keywords that requires more than 30 secs will result in a termination.
- No line feed code is inserted at the end of the output string.
    - "Narrowed Text" (filtered down)
    - "Removed Lines" (filtered out)
- Old Versions
    - TSV String, Filter by Text
        - https://support.questetra.com/ja/addons/tsv-string-filter-by-text-2021/
    - TSV String, Filter Out by Text
        - https://support.questetra.com/ja/addons/tsv-string-filter-out-by-text-2021/

Notes-ja:
- 案件がこの自動工程に到達した際、「TSVテキスト」が自動的に加工さます。
    - TSVテキスト各行の指定カラムがそれぞれ Keyword と比較され、条件にマッチした行だけが出力されます。
        - カラムIDに「0」が指定された場合、各行の先頭値と Keyword が比較されます。
        - カラムIDに「1」が指定された場合、各行の2番目の値と Keyword が比較されます。
    - マッチ条件には「Keyword と等しい」「を含む」「で始まる」「で終わる」のいずれかを指定します。
    - なお、フィルタ抽出された行だけでなく、フィルタ除去された行を格納することも可能です。
        - 言い換えれば「と等しくない」「を含まない」「で始まらない」「で終わらない」の抽出も可能です。
- OR 検索の場合は、それぞれの Keyword を各行に分けて設定してください。
    - OR 検索:「Keywords のいずれかと等しい」「のいずれかを含む」「のいずれかで始まる」「のいずれかで終わる」
    - たとえば「東京都」もしくは「京都府」を「含む行」を抽出したい場合は、
    - 「東京都」を1行目に、「京都府」を2行目に設定してください。(OR-Search)
- なお AND 検索は、「自動工程を複数配置」や「ループ構造内に配置」といったワークフロー図を設計してください。
    - たとえば「京都府」かつ「銭湯」を含む行を抽出したい場合は、2段階で処理されるように設計してください。
        - まず「京都府」を含む行を抽出する
        - さらに「銭湯」を含む行を抽出する
- 様々な業務プロセス(ワークフロー)で、作業工程を自動化(無人化)できます。
    - 商品マスターの登録を確認する
    - メールアドレスリストの登録を確認する
    - メールヘッダから特定のヘッダ情報だけを抽出する
    - 顧客住所リストから特定都市の住所だけを抽出する
    - サーバーログから当週レコードだけを定時抽出する
    - 出退勤ログや欠勤申請ログから当月ログだけを定時抽出する

Appendix-ja:
- Keyword 設定欄に "空行" があった場合、無視されます。
    - "空行" だけにマッチさせる設定方法(空行だけを抽出する設定方法)はありません。
        - 言い換えれば、"空行" は必ずフィルタ除去されます。
    - あらかじめ "空行" を除去しておきたい場合:
        - 『複数行文字列, 空行の削除』
- 計算量が多すぎる場合、途中終了します。
    - 行数が多い・Keywordが多いなど、30秒を超えるような処理は途中で終了する可能性があります。
- 出力文字列の末尾に改行コードは挿入されません。
    - "絞り込まれたテキスト"
    - "除去された行"
- 以前のバージョン
    - TSV文字列, テキストフィルタ
        - https://support.questetra.com/ja/addons/tsv-string-filter-by-text-2021/
    - TSV文字列, 除外テキストフィルタ
        - https://support.questetra.com/ja/addons/tsv-string-filter-out-by-text-2021/


Test Data for Debug:
"Process ID"	"Start User ID"	"勤務日"	"出勤時刻"	"退勤時刻"
1977470	458	"2023-01-13"	"2023-01-13 09:00"	"2023-01-13 18:00"
1977469	456	"2023-01-13"	"2023-01-13 09:00"	"2023-01-13 19:00"
1977468	14	"2023-01-13"	"2023-01-13 08:17"	"2023-01-13 18:05"
1977467	15	"2023-01-13"	"2023-01-13 09:00"	"2023-01-13 18:00"
1977466	16	"2023-01-13"	"2023-01-13 09:00"	"2023-01-13 17:00"
1977465	92	"2023-01-13"	"2023-01-13 09:00"	"2023-01-13 17:00"
1977462	89	"2023-01-13"	"2023-01-13 09:00"	"2023-01-13 18:00"
1977459	88	"2023-01-13"	"2023-01-13 09:00"	"2023-01-13 18:30"
1977451	85	"2023-01-13"	"2023-01-13 09:00"	"2023-01-13 17:00"
1976698	15	"2023-01-11"	"2023-01-11 09:00"	"2023-01-11 18:30"
1976697	15	"2023-01-10"	"2023-01-10 09:00"	"2023-01-10 19:00"
1976610	458	"2023-01-12"	"2023-01-12 09:00"	"2023-01-12 18:00"
1976609	456	"2023-01-12"	"2023-01-12 09:00"	"2023-01-12 19:00"
1976608	14	"2023-01-12"	"2023-01-12 08:16"	"2023-01-12 18:06"
1976607	15	"2023-01-12"	"2023-01-12 09:00"	"2023-01-12 19:00"
1976606	16	"2023-01-12"	"2023-01-12 09:00"	"2023-01-12 17:00"
1976601	89	"2023-01-12"	"2023-01-12 09:00"	"2023-01-12 18:00"
1976598	88	"2023-01-12"	"2023-01-12 09:00"	"2023-01-12 18:30"
1976589	85	"2023-01-12"	"2023-01-12 09:00"	"2023-01-12 17:00"
1975830	89	"2023-01-10"	"2023-01-10 09:00"	"2023-01-10 18:30"

*/

Download

warning Freely modifiable JavaScript (ECMAScript) code. No warranty of any kind.
(Installing Addon Auto-Steps are available only on the Professional edition.)

Notes

  • When a process reaches this automated step, TSV text is automatically processed.
    • Column value in each line of TSV is compared to the Keyword, and only lines matching the condition are output.
      • If “0” is specified for the column ID, the first value of each line and Keyword are compared.
      • If “1” is specified for the column ID, the second value of each line and Keyword are compared.
    • The match condition can be one of the following: “equals”, “contains”, “starts with”, or “ends with”.
    • Note that it is possible to store not only filtered-down rows, but also filtered-out rows.
      • In other words, “does not equal”, “does not contain”, “does not start with” and “does not end with”.
  • In case of OR search, set each Keyword separately in each line.
    • OR search: “equals one of the Keywords”, “contains one of”, “starts with one of”, “ends with one of”
    • For example, if you want to extract rows containing “Tokyo” or “Kyoto”
    • Set “Tokyo” on the first row and “Kyoto” on the second row. (OR-Search)
  • For AND searches, please design a workflow diagram with multiple automatic processes or place them in a loop structure.
    • For example, if you want to extract rows containing “Kyoto” and “Spa” design the process in two steps.
      • First, extract rows containing “Kyoto”.
      • Then, extract rows containing “Spa”.
  • You can automate the work process in various business processes (workflow).
    • Confirm registration of product master
    • Confirm registration of email address list
    • Extract only specific header information from mail headers
    • Extract only specific city addresses from a customer address list
    • Extract only the current week’s records from server logs on a regular basis
    • Extract only the current month’s log from attendance logs or absence logs on a regular basis

Capture

Narrows down to just the column-matched lines from the TSV text. The match condition operator can be set to "equals", "contains", "starts with", or "ends with". Case-insensitive also available. Also stores the filtered-out lines (removed lines).

Appendix

  • In the Keyword setting field, blank lines (blank character settings) are skipped.
    • There is no way to match (extract) only blank lines.
      • In other words, blank lines are always filtered out.
    • If you want to remove blank lines in advance:
      • Use “Multiline String, Delete Empty Lines”
  • If the amount of calculation is too large, the process will be terminated in the middle.
    • A large number of Lines and Keywords that requires more than 30 secs will result in a termination.
  • No line feed code is inserted at the end of the output string.
    • “Narrowed Text” (filtered down)
    • “Removed Lines” (filtered out)
  • Old Versions

See Also

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.

%d bloggers like this: