#String: Extract Subpattern by RegExp

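Extracts the first string that matches the regular expression, then extracts its subpatterns (capture groups). For example, if the regular expression “(\d{4})-(\d{2})-(\d{2})” is set, the first complete YYYY-MM-DD match in the text is extracted, and the YYYY, MM, and DD values are then extracted from it: “2025-05-19” → “2025”, “05”, “19”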
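
The behavior described above can be sketched in plain JavaScript, outside the workflow engine. The input text and variable names are made up for illustration; `configs` and `engine` are not used here.

```javascript
// Illustrative input; any text containing a YYYY-MM-DD date would do.
const text = "Reporting Period: 2025-05-19 to 2025-06-18";
const pattern = new RegExp("(\\d{4})-(\\d{2})-(\\d{2})");

// Without the "g" flag, match() returns only the first match:
// index 0 is the whole match, indexes 1.. are the capture groups.
const m = text.match(pattern);

const whole = m?.[0] ?? "";  // "2025-05-19"
const year  = m?.[1] ?? "";  // "2025"
const month = m?.[2] ?? "";  // "05"
const day   = m?.[3] ?? "";  // "19"
console.log(whole, year, month, day);
```

The second date in the text ("2025-06-18") is ignored, because only the first match is used.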

Configs for this Auto Step

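StrConfA: A: Set the text to search *^#{EL}
StrConfB1: B1: Set the regular expression (e.g. “(\d{4})-(\d{2})-(\d{2})”) *^#{EL}
BoolConfB2: B2: Case-sensitive ⇔ Ignore case
SelectConfC0: C0: Select the STRING data item that stores the entire extracted string (update)
SelectConfC1: C1: Select the STRING data item that stores the subpattern (capture ID: 1) (update)
SelectConfC2: C2: Select the STRING data item that stores the subpattern (capture ID: 2) (update)
SelectConfC3: C3: Select the STRING data item that stores the subpattern (capture ID: 3) (update)
SelectConfC4: C4: Select the STRING data item that stores the subpattern (capture ID: 4) (update)
SelectConfC5: C5: Select the STRING data item that stores the subpattern (capture ID: 5) (update)
SelectConfC6: C6: Select the STRING data item that stores the subpattern (capture ID: 6) (update)
SelectConfC7: C7: Select the STRING data item that stores the subpattern (capture ID: 7) (update)
SelectConfC8: C8: Select the STRING data item that stores the subpattern (capture ID: 8) (update)
SelectConfC9: C9: Select the STRING data item that stores the subpattern (capture ID: 9) (update)
SelectConfD1: D1: Select the NUMERIC data item that stores the number of text lines (update)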
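
The effect of the B2 toggle can be sketched as follows. `buildRegExp` is a hypothetical helper written only for this illustration; it assumes the toggle simply adds the `i` flag, which is how the script below builds its pattern.

```javascript
// Hypothetical helper: build the search pattern from the B1 string and the B2 toggle.
function buildRegExp(source, ignoreCase) {
  return ignoreCase ? new RegExp(source, "i") : new RegExp(source);
}

const input = "Responsible Engineer: Ichiro Tanaka";

// Case-sensitive (toggle off): lowercase "engineer" does not match "Engineer".
const strict = input.match(buildRegExp("engineer: (\\w+)", false));
// Ignore case (toggle on): the same pattern now matches.
const loose = input.match(buildRegExp("engineer: (\\w+)", true));

console.log(strict);      // null
console.log(loose?.[1]);  // "Ichiro"
```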

Script

// Script Example of Business Process Automation
// for 'engine type: 3' ("GraalJS standard mode")
// cf. 'engine type: 2' ("GraalJS Nashorn compatible mode") (renamed from "GraalJS" at 20230526)


//////// START "main()" /////////////////////////////////////////////////////////////////
main();
function main(){ 

//// == Config Retrieving ==
const strInput       = configs.get       ( "StrConfA" );      // REQUIRED
  if( strInput     === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {A: InputText} is empty \n" );
  }
const numInputLines  = strInput.split("\n").length;

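const strRegExp      = configs.get       ( "StrConfB1" );     // REQUIRED
  if( strRegExp    === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {B1: RegExp} is empty \n" );
  }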
const boolIgnoreCase = configs.getObject ( "BoolConfB2" );    // TOGGLE
  // https://questetra.zendesk.com/hc/ja/articles/360024574471-R2300 "Boolean object"
const strPocketC0    = configs.getObject ( "SelectConfC0" );  // not required
const strPocketC1    = configs.getObject ( "SelectConfC1" );  // not required
const strPocketC2    = configs.getObject ( "SelectConfC2" );  // not required
const strPocketC3    = configs.getObject ( "SelectConfC3" );  // not required
const strPocketC4    = configs.getObject ( "SelectConfC4" );  // not required
const strPocketC5    = configs.getObject ( "SelectConfC5" );  // not required
const strPocketC6    = configs.getObject ( "SelectConfC6" );  // not required
const strPocketC7    = configs.getObject ( "SelectConfC7" );  // not required
const strPocketC8    = configs.getObject ( "SelectConfC8" );  // not required
const strPocketC9    = configs.getObject ( "SelectConfC9" );  // not required
const numPocketD1    = configs.getObject ( "SelectConfD1" );  // not required



//// == Data Retrieving ==
// (nothing)



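//// == Calculating ==
// Without the "g" flag, match() returns the first match only:
// [0] is the whole match, [1]..[9] are the capture groups (null if no match).
const regSearch  = boolIgnoreCase ?
                   new RegExp( strRegExp, 'i' ) : new RegExp( strRegExp );
const arrMatches = strInput.match ( regSearch );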


//// == Data Updating ==
/// ref) Retrieving / Updating from ScriptTasks
/// https://questetra.zendesk.com/hc/ja/articles/360024574771-R2301

if ( strPocketC0 !== null ){ 
  engine.setData( strPocketC0, arrMatches?.[0] ?? "" );
}
if ( strPocketC1 !== null ){ 
  engine.setData( strPocketC1, arrMatches?.[1] ?? "" );
}
if ( strPocketC2 !== null ){ 
  engine.setData( strPocketC2, arrMatches?.[2] ?? "" );
}
if ( strPocketC3 !== null ){ 
  engine.setData( strPocketC3, arrMatches?.[3] ?? "" );
}
if ( strPocketC4 !== null ){ 
  engine.setData( strPocketC4, arrMatches?.[4] ?? "" );
}
if ( strPocketC5 !== null ){ 
  engine.setData( strPocketC5, arrMatches?.[5] ?? "" );
}
if ( strPocketC6 !== null ){ 
  engine.setData( strPocketC6, arrMatches?.[6] ?? "" );
}
if ( strPocketC7 !== null ){ 
  engine.setData( strPocketC7, arrMatches?.[7] ?? "" );
}
if ( strPocketC8 !== null ){ 
  engine.setData( strPocketC8, arrMatches?.[8] ?? "" );
}
if ( strPocketC9 !== null ){ 
  engine.setData( strPocketC9, arrMatches?.[9] ?? "" );
}

if ( numPocketD1 !== null ){ 
  engine.setData( numPocketD1, new java.math.BigDecimal( numInputLines ) );
}

} //////// END "main()" /////////////////////////////////////////////////////////////////



/*
NOTES
- When the Process reaches this [Automated Step], the "Extraction" is automatically executed.
    - "Match strings" within the input text are extracted.
- The number of lines in the input text can also be recorded.
- Regular Expressions
    - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_expressions

▼Test Data for Debug:
Project Name: API Auto-Integration Improvement
Responsible Engineer: Ichiro Tanaka
Reporting Period: 2025-04-01 to 2025-05-01

APPENDIX
- A capture group is a part enclosed in "()" in a regular expression.
    - You can extract and use a part of the matched string.
- Capture groups are very useful in the following situations:
    - Extract and reuse a part of the matched string (e.g., splitting up the year, month, and date)
    - Rearrange during replacement (e.g., format conversion like $1 year $2 month $3 day)
- Difference from non-capturing groups
    - Even inside parentheses, if you use syntax like "(?:...)", it becomes a non-capturing group.
    - That part will be matched, but will not be included in the group extracted later.
    - This is useful when you want to group parts you don't need.
- RegExp Example
    - URL
        - `https?://[\w/:%#\$&\?\(\)~\.=\+\-]+`
    - Japanese Postal Code
        - `(\d{3}-\d{4})|(\d{7})`
        - `([0-9]{3}-[0-9]{4})|([0-9]{7})`
    - ISO Date from 2024-12-15 to 2025-01-06
        - `(2024-12-1[5-9])|(2024-12-(?:2[0-9]|3[01]))|(2025-01-0[1-6])`

*/
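
With alternation patterns such as the postal-code example in the script's appendix, only one branch takes part in a match, so the capture group of the other branch is `undefined`. The script's `?? ""` fallback then stores an empty string for it. A minimal sketch of that behavior; the sample inputs and the `pick` helper are made up for illustration:

```javascript
const postal = /(\d{3}-\d{4})|(\d{7})/;

// Hyphenated form: branch 1 matches, group 2 does not participate.
const a = "ZIP 604-8141 Kyoto".match(postal);
// Plain 7-digit form: branch 2 matches, group 1 does not participate.
const b = "ZIP 6048141 Kyoto".match(postal);
// No postal code at all: match() returns null.
const c = "no code here".match(postal);

// Same fallback the script uses: a null match or an unset group becomes "".
const pick = (m, i) => m?.[i] ?? "";

console.log(pick(a, 1), "|", pick(a, 2));  // group 1 = "604-8141", group 2 = ""
console.log(pick(b, 1), "|", pick(b, 2));  // group 1 = "", group 2 = "6048141"
console.log(pick(c, 0), "|", pick(c, 1));  // both ""
```

When using such a pattern with this Auto Step, it may therefore be necessary to assign data items to both C1 and C2 and use whichever one is non-empty.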

Download

string-extract-subpattern-by-regexp-2025.xml
- 2025-04-07 (C) Questetra, Inc. (MIT License)
string-extract-subpattern-by-regexp-202505.xml
  - 2025-05-13
  - The match position index `arrMatches.index` can now be stored.

Warning: This is freely modifiable JavaScript (ECMAScript) code. It comes with no warranty of any kind.
(Installing Addon Auto-Steps is available only on the Professional edition)

Notes

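When a Process reaches this [Automated Step], the "Extraction" is executed automatically.
- "Match strings" within the input text are extracted.
The number of lines in the input text can also be recorded.
About regular expressions
- https://developer.mozilla.org/ja/docs/Web/JavaScript/Guide/Regular_expressions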
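
The line count is derived by splitting the input on "\n", as the script above does. A small sketch of what that implies for edge cases; the `countLines` name is hypothetical:

```javascript
// Hypothetical helper mirroring the script's strInput.split("\n").length.
const countLines = (s) => s.split("\n").length;

console.log(countLines("one line"));          // 1
console.log(countLines("line 1\nline 2"));    // 2
// A trailing newline yields an extra, empty last element.
console.log(countLines("line 1\nline 2\n"));  // 3
```

In other words, text that ends with a line break is counted as having one more line than it visibly contains.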

Appendix

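A capture group is a part of a regular expression enclosed in “()”.
- It lets you extract and use a part of the matched string.
Capture groups are very useful in situations such as:
- Extracting and reusing a part of the matched string (e.g. splitting a date into year, month, and day)
- Rearranging during replacement (e.g. a format conversion such as $1年$2月$3日)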
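
The rearrangement case can be sketched with the standard `String.prototype.replace`. Note that this is general JavaScript behavior shown for illustration; this Auto Step itself only stores the captured values and does not perform replacement.

```javascript
const iso = "2025-05-19";
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;

// $1, $2, $3 in the replacement string refer to capture groups 1 to 3.
const japanese = iso.replace(datePattern, "$1年$2月$3日");
const us = iso.replace(datePattern, "$2/$3/$1");

console.log(japanese);  // "2025年05月19日"
console.log(us);        // "05/19/2025"
```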
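Difference from non-capturing groups
- Even inside parentheses, the “(?:…)” syntax creates a non-capturing group.
- That part still takes part in the match, but is not included in the groups extracted afterwards.
- This is useful when you want to group parts that you do not need to extract.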
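
A short sketch of the difference, using the date pattern from the description. The variable names are illustrative only.

```javascript
const s = "2025-05-19";

// All three parts captured: groups 1, 2, 3.
const all = s.match(/(\d{4})-(\d{2})-(\d{2})/);
// Year and month are grouped with (?:...) but not captured: the day becomes group 1.
const dayOnly = s.match(/(?:\d{4})-(?:\d{2})-(\d{2})/);

console.log(all.length);      // 4 (whole match + 3 groups)
console.log(dayOnly.length);  // 2 (whole match + 1 group)
console.log(dayOnly[0]);      // "2025-05-19"
console.log(dayOnly[1]);      // "19"
```

Because non-capturing groups do not consume a capture ID, they help keep the values of interest within the C1 to C9 range.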
Extraction samples
- Extract Month
  - RegExp: (\d{4})-(\d{2})-(\d{2})
  - CapturingID: 2
  - Input: 2025 2020/04/16 1973-01-31 444
    - Output: 01
- Extract File Extension
  - RegExp: \.(\w{2,5})
  - CapturingID: 1
  - Input: report.pdf、design.ai、image.jpeg
    - Output: pdf
- Extract part of URL
  - RegExp: /d/([\w-]+)/
  - CapturingID: 1
  - Input: https://docs.google.com/document/d/12345abcde67890A-CDE1234_abcde67890ABCDE1234/edit?tab=t.0
    - Output: 12345abcde67890A-CDE1234_abcde67890ABCDE1234
  - Input: https://docs.google.com/presentation/d/12345abcde67890A-CDE1234_abcde67890ABCDE1234/edit#slide=id.p1
    - Output: 12345abcde67890A-CDE1234_abcde67890ABCDE1234
- Extract Email Domain
  - RegExp: ([\w.-]+)@([\w.-]+\.\w+)
  - CapturingID: 2
  - Input: 担当者：yamada@example.com、連絡先：support@example2.co.jp
    - Output: example.com
- Extract Registered Domain
  - RegExp: @(?:[\w-]+\.)*([\w-]+\.(?:[a-z]{2,}\.[a-z]{2}|[a-z]{3,}))
  - CapturingID: 1
  - Input: 連絡先はsupport@subdomain.example.co.jpです
    - Output: example.co.jp
  - Input: 連絡先はsupport@subdomain.example.comです
    - Output: example.com
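
The samples above can be checked outside the workflow engine with a small helper. `extract` is a hypothetical function that reproduces only the matching part of the script (first match, then the requested capture ID, with "" as the fallback); it does not touch `configs` or `engine`.

```javascript
// Hypothetical helper: first match of `source` in `input`, then capture group `id`.
function extract(source, input, id, ignoreCase = false) {
  const re = ignoreCase ? new RegExp(source, "i") : new RegExp(source);
  const m = input.match(re);
  return m?.[id] ?? "";
}

const month = extract("(\\d{4})-(\\d{2})-(\\d{2})",
  "2025 2020/04/16 1973-01-31 444", 2);
const ext = extract("\\.(\\w{2,5})",
  "report.pdf、design.ai、image.jpeg", 1);
const docId = extract("/d/([\\w-]+)/",
  "https://docs.google.com/document/d/12345abcde67890A-CDE1234_abcde67890ABCDE1234/edit?tab=t.0", 1);
const domain = extract("([\\w.-]+)@([\\w.-]+\\.\\w+)",
  "担当者：yamada@example.com、連絡先：support@example2.co.jp", 2);
const registered = extract(
  "@(?:[\\w-]+\\.)*([\\w-]+\\.(?:[a-z]{2,}\\.[a-z]{2}|[a-z]{3,}))",
  "連絡先はsupport@subdomain.example.co.jpです", 1);

console.log(month);       // "01"
console.log(ext);         // "pdf"
console.log(docId);       // "12345abcde67890A-CDE1234_abcde67890ABCDE1234"
console.log(domain);      // "example.com"
console.log(registered);  // "example.co.jp"
```

Backslashes are doubled here only because the patterns are written as JavaScript string literals; in the B1 config field they are entered as shown in the samples.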