#String: Extract Subpattern by RegExp


Extracts the first string that matches the regular expression, together with its subpatterns (capturing groups). For example, if the regular expression (\d{4})-(\d{2})-(\d{2}) is set, the first complete match in the text (a date formatted as YYYY-MM-DD) is extracted, and the individual values for YYYY, MM, and DD are captured as well: "2025-05-19" → "2025", "05", "19".
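
The behavior described above can be sketched in plain JavaScript. This is a minimal illustration, not the addon's script: the helper name `extractFirst` and the sample text are assumptions. It relies on `String.prototype.match` without the `g` flag, which returns the first match together with its capture groups, or `null` when nothing matches.

```javascript
// Hypothetical helper for illustration; not part of the addon.
// Returns the first whole match and its capture groups,
// or empty values when nothing matches.
function extractFirst(text, pattern, ignoreCase = false) {
  const re = new RegExp(pattern, ignoreCase ? "i" : "");
  const m = text.match(re); // no "g" flag: first match plus groups, or null
  return {
    whole: m?.[0] ?? "",
    groups: m ? m.slice(1).map((g) => g ?? "") : [],
  };
}

const sample = "Reporting Period: 2025-04-01 to 2025-05-01";
const result = extractFirst(sample, "(\\d{4})-(\\d{2})-(\\d{2})");
console.log(result.whole);  // "2025-04-01"
console.log(result.groups); // [ "2025", "04", "01" ]
```

Only the first date is returned even though the text contains two, which mirrors the "first complete match" wording above.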

Configs for this Auto Step
StrConfA
A: Set Text to Search for *#{EL}
StrConfB1
B1: Set Regular Expression (eg: “(\d{4})-(\d{2})-(\d{2})”) *#{EL}
BoolConfB2
B2: Case Sensitive or Case should be Ignored
SelectConfC0
C0: Select DATA to store Extracted Strings Whole (update)
SelectConfC1
C1: Select DATA for Subpattern CapturingGroupID:1 (update)
SelectConfC2
C2: Select DATA for Subpattern CapturingGroupID:2 (update)
SelectConfC3
C3: Select DATA for Subpattern CapturingGroupID:3 (update)
SelectConfC4
C4: Select DATA for Subpattern CapturingGroupID:4 (update)
SelectConfC5
C5: Select DATA for Subpattern CapturingGroupID:5 (update)
SelectConfC6
C6: Select DATA for Subpattern CapturingGroupID:6 (update)
SelectConfC7
C7: Select DATA for Subpattern CapturingGroupID:7 (update)
SelectConfC8
C8: Select DATA for Subpattern CapturingGroupID:8 (update)
SelectConfC9
C9: Select DATA for Subpattern CapturingGroupID:9 (update)
SelectConfD1
D1: Select NUMERIC for Number of Text Lines (update)
Script
// Script Example of Business Process Automation
// for 'engine type: 3' ("GraalJS standard mode")
// cf. 'engine type: 2' ("GraalJS Nashorn compatible mode") (renamed from "GraalJS" at 20230526)


//////// START "main()" /////////////////////////////////////////////////////////////////
main();
function main(){ 

//// == Config Retrieving ==
const strInput       = configs.get       ( "StrConfA" );      // REQUIRED
  if( strInput     === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {A: InputText} is empty \n" );
  }
const numInputLines  = strInput.split("\n").length;

const strRegExp      = configs.get       ( "StrConfB1" );     // REQUIRED
const boolIgnoreCase = configs.getObject ( "BoolConfB2" );    // TOGGLE
  // https://questetra.zendesk.com/hc/ja/articles/360024574471-R2300 "Boolean object"
const strPocketC0    = configs.getObject ( "SelectConfC0" );  // not required
const strPocketC1    = configs.getObject ( "SelectConfC1" );  // not required
const strPocketC2    = configs.getObject ( "SelectConfC2" );  // not required
const strPocketC3    = configs.getObject ( "SelectConfC3" );  // not required
const strPocketC4    = configs.getObject ( "SelectConfC4" );  // not required
const strPocketC5    = configs.getObject ( "SelectConfC5" );  // not required
const strPocketC6    = configs.getObject ( "SelectConfC6" );  // not required
const strPocketC7    = configs.getObject ( "SelectConfC7" );  // not required
const strPocketC8    = configs.getObject ( "SelectConfC8" );  // not required
const strPocketC9    = configs.getObject ( "SelectConfC9" );  // not required
const numPocketD1    = configs.getObject ( "SelectConfD1" );  // not required



//// == Data Retrieving ==
// (nothing)



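//// == Calculating ==
// Without the "g" flag, match() returns only the first match:
// index 0 is the whole match, indexes 1.. are the capture groups (null if no match).
const regSearch  = boolIgnoreCase ?
                   new RegExp( strRegExp, 'i' ) : new RegExp( strRegExp );
const arrMatches = strInput.match ( regSearch );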


//// == Data Updating ==
/// ref) Retrieving / Updating from ScriptTasks
/// https://questetra.zendesk.com/hc/en-us/articles/360024574771-R2301
/// https://questetra.zendesk.com/hc/ja/articles/360024574771-R2301

// Store the whole match (index 0) and capture groups 1-9; missing values become "".
const arrPockets = [ strPocketC0, strPocketC1, strPocketC2, strPocketC3, strPocketC4,
                     strPocketC5, strPocketC6, strPocketC7, strPocketC8, strPocketC9 ];
arrPockets.forEach( ( pocket, i ) => {
  if ( pocket !== null ){
    engine.setData( pocket, arrMatches?.[i] ?? "" );
  }
});

if ( numPocketD1 !== null ){ 
  engine.setData( numPocketD1, new java.math.BigDecimal( numInputLines ) );
}

} //////// END "main()" /////////////////////////////////////////////////////////////////



/*
NOTES
- When the Process reaches this [Automated Step], the "Extraction" is automatically executed.
    - The first matching string within the input text, and its subpatterns, are extracted.
- The number of lines in the input text can also be recorded.
- Regular Expressions
    - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_expressions

▼Test Data for Debug:
Project Name: API Auto-Integration Improvement
Responsible Engineer: Ichiro Tanaka
Reporting Period: 2025-04-01 to 2025-05-01

APPENDIX
- A capture group is a part of a regular expression enclosed in "()".
    - It lets you extract and use a part of the matched string.
- Capture groups are very useful in the following situations:
    - Extracting and reusing a part of the matched string (e.g., splitting a date into year, month, and day)
    - Rearranging during replacement (e.g., format conversion like $1 year $2 month $3 day)
- Difference from non-capturing groups
    - Parentheses written as "(?:...)" form a non-capturing group.
    - That part still takes part in matching, but is not included in the groups extracted later.
    - This is useful when you need grouping (for alternation or repetition) but do not need the grouped text.
- RegExp Example
    - URL
        - `https?://[\w/:%#\$&\?\(\)~\.=\+\-]+`
    - Japanese Postal Code
        - `(\d{3}-\d{4})|(\d{7})`
        - `([0-9]{3}-[0-9]{4})|([0-9]{7})`
    - ISO Date from 2024-12-15 to 2025-01-06
        - `(2024-12-1[5-9])|(2024-12-[2-3][0-9])|(2025-01-0[1-6])`


*/
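
The capture-group notes in the script's APPENDIX comment can be tried out with a short standalone sketch. It is separate from the addon script, and the sample strings and variable names are assumptions chosen for illustration. It shows the `$1 year $2 month $3 day` style of replacement and how a non-capturing group `(?:...)` shifts the group numbering.

```javascript
// Capture groups can be referenced as $1, $2, ... in a replacement string.
const isoDate = "2025-05-19";
const rearranged = isoDate.replace(
  /(\d{4})-(\d{2})-(\d{2})/,
  "$1 year $2 month $3 day"
);
console.log(rearranged); // "2025 year 05 month 19 day"

// A capturing group gets a number; a non-capturing group (?:...) does not.
const url = "https://example.com";
const capturing = url.match(/(https?):\/\/([\w.-]+)/);
const nonCapturing = url.match(/(?:https?):\/\/([\w.-]+)/);

console.log(capturing[1]);    // "https"
console.log(capturing[2]);    // "example.com"
console.log(nonCapturing[1]); // "example.com" (the scheme is matched but not captured)
```

With the non-capturing form, the host name becomes group 1, so it would be stored through config C1 rather than C2.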

Download

Freely modifiable JavaScript (ECMAScript) code. No warranty of any kind.
(Installation of Addon Auto-Steps is available only on the Professional edition.)

Notes

Capture

Appendix

  • A capture group is a part of a regular expression enclosed in “()”.
    • It lets you extract and use a part of the matched string.
  • Capture groups are very useful in the following situations:
    • Extracting and reusing a part of the matched string (e.g., splitting a date into year, month, and day)
    • Rearranging during replacement (e.g., format conversion like $1 year $2 month $3 day)
  • Difference from non-capturing groups
    • Parentheses written as “(?:…)” form a non-capturing group.
    • That part still takes part in matching, but is not included in the groups extracted later.
    • This is useful when you need grouping (for alternation or repetition) but do not need the grouped text.
  • RegExp Examples
    • Extract Month
      • RegExp: (\d{4})-(\d{2})-(\d{2})
      • CapturingID: 2
      • Input: 2025 2020/04/16 1973-01-31 444
        • Output: 01
    • Extract File Extension
      • RegExp: \.(\w{2,5})
      • CapturingID: 1
      • Input: report.pdf、design.ai、image.jpeg
        • Output: pdf
    • Extract part of URL
      • RegExp: /d/([\w-]+)/
      • CapturingID: 1
      • Input: https://docs.google.com/document/d/12345abcde67890A-CDE1234_abcde67890ABCDE1234/edit?tab=t.0
        • Output: 12345abcde67890A-CDE1234_abcde67890ABCDE1234
      • Input: https://docs.google.com/presentation/d/12345abcde67890A-CDE1234_abcde67890ABCDE1234/edit#slide=id.p1
        • Output: 12345abcde67890A-CDE1234_abcde67890ABCDE1234
    • Extract Email Domain
      • RegExp: ([\w.-]+)@([\w.-]+\.\w+)
      • CapturingID: 2
      • Input: 担当者:yamada@example.com、連絡先:support@example2.co.jp
        • Output: example.com
    • Extract Registered Domain
      • RegExp: @(?:[\w-]+\.)*([\w-]+\.(?:[a-z]{2,}\.[a-z]{2}|[a-z]{3,}))
      • CapturingID: 1
      • Input: 連絡先はsupport@subdomain.example.co.jpです
        • Output: example.co.jp
      • Input: 連絡先はsupport@subdomain.example.comです
        • Output: example.com
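
The examples above can be reproduced outside the workflow with a small sketch. The helper `firstGroup` is a hypothetical stand-in for this Auto-Step (first match only, one group selected by its CapturingID), not the addon's own code; the patterns and inputs are copied from the list.

```javascript
// Hypothetical helper for illustration: first match only, one group by ID.
// Returns "" when there is no match or the group did not participate.
function firstGroup(text, pattern, groupId) {
  const m = text.match(new RegExp(pattern));
  return m?.[groupId] ?? "";
}

const month = firstGroup(
  "2025 2020/04/16 1973-01-31 444",
  "(\\d{4})-(\\d{2})-(\\d{2})", 2);

const extension = firstGroup(
  "report.pdf、design.ai、image.jpeg",
  "\\.(\\w{2,5})", 1);

const docId = firstGroup(
  "https://docs.google.com/document/d/12345abcde67890A-CDE1234_abcde67890ABCDE1234/edit?tab=t.0",
  "/d/([\\w-]+)/", 1);

const mailDomain = firstGroup(
  "担当者:yamada@example.com、連絡先:support@example2.co.jp",
  "([\\w.-]+)@([\\w.-]+\\.\\w+)", 2);

const registeredPattern =
  "@(?:[\\w-]+\\.)*([\\w-]+\\.(?:[a-z]{2,}\\.[a-z]{2}|[a-z]{3,}))";
const registeredJp = firstGroup(
  "連絡先はsupport@subdomain.example.co.jpです", registeredPattern, 1);
const registeredCom = firstGroup(
  "連絡先はsupport@subdomain.example.comです", registeredPattern, 1);

console.log(month);         // "01"
console.log(extension);     // "pdf"
console.log(docId);         // "12345abcde67890A-CDE1234_abcde67890ABCDE1234"
console.log(mailDomain);    // "example.com"
console.log(registeredJp);  // "example.co.jp"
console.log(registeredCom); // "example.com"
```

Note that backslashes are doubled here only because the patterns are written as JavaScript string literals; in the B1 config field the pattern is entered as shown in the list.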
