Converter, CSV-String to TSV-String
Converts a CSV string to a TSV string. The TSV is output as the simplest tab-delimited string. Line breaks and tabs inside a field are replaced with spaces. If double-quotes inside the fields of the input CSV are not escaped (doubled), the output will not be as intended.
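The behavior described above can be tried outside the workflow engine with a minimal sketch like the following. The function name `csvToTsv` is hypothetical and not part of the add-on, and the pattern for the in-field line feed is written here so that ordinary characters may precede the unmatched double-quote; treat it as an illustration under those assumptions rather than the add-on's exact code.

```javascript
// Hypothetical standalone helper (not part of the add-on).
function csvToTsv(csv) {
  // Tabs anywhere in the input become spaces.
  let text = csv.replace(/\t/g, " ");
  // A line feed followed by an odd number of double-quotes is inside a quoted field.
  text = text.replace(/\n(?=(?:[^"]*"[^"]*")*[^"]*"[^"]*$)/g, " ");
  return text
    .split("\n")
    .filter((line) => line !== "") // blank lines are skipped
    .map((line) =>
      line
        // A comma followed by an even number of double-quotes separates fields.
        .replace(/,(?=(?:[^"]*"[^"]*")*[^"]*$)/g, "\t")
        .split("\t")
        // Strip the enclosing quotes, then turn each doubled quote into a single one.
        .map((cell) => (/^".*"$/.test(cell) ? cell.slice(1, -1) : cell).replace(/""/g, '"'))
        .join("\t")
    )
    .join("\n");
}

const sampleCsv = '2004,SEA\n\n2008,"3,000 hits"\n2012,"SEA\nNYY","say ""hi"""';
console.log(csvToTsv(sampleCsv));
```

The sample input contains a blank line, a quoted comma, a line feed inside a field, and escaped double-quotes, so it exercises each rule in the description once.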
Configs
  • A1: Set Original CSV Text *#{EL}
  • B1: Select STRING DATA that stores New TSV Text (update) *
Script
// GraalJS Script (engine type: 2)

//////// START "main()" /////////////////////////////////////////////////////////////////
main();
function main(){ 

//// == Config Retrieving ==
const strInputCsv      = configs.get( "StrConfA1" );         /// REQUIRED ///////////////
  if( strInputCsv    === "" ){
    throw new Error( "\n AutomatedTask ConfigError:" +
                     " Config {A1: CSV} is empty \n" );
  }

const strPocketOutputTsv    = configs.getObject( "SelectConfB1" ); /// REQUIRED /////////


//// == Data Retrieving ==
// (Nothing. Retrieved via Expression Language in Config Retrieving)


//// == Calculating ==

/// Replace each TAB with " "
let strTmp = strInputCsv.replace( /\t/g, " " );

/// Replace each '\n' inside a field with ' '
// '\n': a LINEBREAK followed by an odd number of double-quotes is inside a quoted field
// The final [^"]*" lets ordinary characters precede the unmatched double-quote
    strTmp = strTmp.replace( /\n(?=(?:[^"]*"[^"]*")*[^"]*"[^"]*$)/g, ' ' );

/// Split into each line
const regEnclosure = /^"(.*)"$/;
let strOutputTsv = "";

let arrTmpLines = strTmp.split( "\n" );
engine.log( " AutomatedTask CsvDataCheck: " + 
            arrTmpLines.length + " lines" );

for( let i = 0; i < arrTmpLines.length; i++ ){
  if( arrTmpLines[i] === "" ){ continue; } // Skip blank lines

  /// Replace each field-separating ',' with '\t'
  // ',': a COMMA followed by an even number of double-quotes is outside any quoted field
  let strTmpLine = arrTmpLines[i].replace( /,(?=(?:[^"]*"[^"]*")*[^"]*$)/g, '\t' );

  let arrTmpCells = strTmpLine.split( '\t' );
  engine.log( "  #" + i + ": " + arrTmpCells.length + " cells" );

  for( let j = 0; j < arrTmpCells.length; j++ ){
    /// Remove the enclosing '"' and unescape each '""' to '"'
    if( regEnclosure.test(arrTmpCells[j]) ){
      strOutputTsv += arrTmpCells[j].slice(1,-1).replace( /""/g, '"' );
    }else{
      strOutputTsv += arrTmpCells[j].replace( /""/g, '"' );
    }
    if( j !== arrTmpCells.length - 1 ){
      strOutputTsv += "\t";
    }
  }
  strOutputTsv += "\n";
}
strOutputTsv = strOutputTsv.slice( 0, -1 ); // delete last "\n"


//// == Data Updating ==
engine.setData( strPocketOutputTsv,    strOutputTsv );

} //////// END "main()" /////////////////////////////////////////////////////////////////

/*
Notes:
- When a process reaches this task, the CSV text stored in a String-type data item is automatically converted to TSV.
    - If the CSV exists as a file, its contents must be stored in a String-type data item in advance.
        - Converter (Text File to String type data)
            - https://support.questetra.com/bpmn-icons/converter-textfile-to-string/
        - Text Files, Convert Character Encoding
            - https://support.questetra.com/addons/text-files-convert-character-encoding-2021/
- TAB or line feed characters inside a CSV field do not cause an error.
    - Parsing a CSV with Line Breaks in the Data Fields
    - However, TAB and line feed characters are converted to `" "` (space).
        - `2004` ⇒ `2004`
        - `"3,000"` ⇒ `3,000`
        - `"SEA\nNYY"` ⇒ `SEA NYY`
        - `"In the interview, ""If I ever get fat"` ⇒ `In the interview, "If I ever get fat`
- The output TSV text is the simplest tab-delimited string.
    - MIME type: `text/tab-separated-values; charset=UTF-8`
    - Double-quote characters are kept as-is, without escaping.

APPENDIX:
- If an odd number of double-quotes follows a line feed, the line feed is judged to be "in the cell" and is replaced.
    - Regular expressions are used to replace line feed codes in cell data. (RFC 4180, Section 2, items 6 and 7)
        - 2004,SEA,161,762,262,262 hits in a single season
        - 2008,SEA,162,749,213,"3,000 top-level professional hits"
        - 2012,"SEA
          NYY",162,663,178,"In the interview, ""If I ever get fat, I'll quit baseball immediately."""
    - `[^"]*`: Characters other than double-quote, 0 or more times
    - `(?:[^"]*"[^"]*")*[^"]*$`: An even number of double-quotes remains before the end of the text
    - `(?:[^"]*"[^"]*")*[^"]*"[^"]*$`: An odd number of double-quotes remains before the end of the text
    - `(?:x)`: Non-capturing group: Matches "x" but does not remember the match.
- If there is a blank line in the input CSV text, it is skipped.
    - No line feed is appended after the last line of the output.


*/

Download

2021-08-24 (C) Questetra, Inc. (MIT License)
https://support.questetra.com/addons/converter-csv-string-to-tsv-string-2021/
The Add-on import feature is available with the Professional edition.
Freely modifiable JavaScript (ECMAScript) code. No warranty of any kind.

Notes

  • When a process reaches this task, the CSV text stored in a String-type data item is automatically converted to TSV.
  • Tab or line feed characters inside a field of the input CSV text do not cause an error.
    • Parsing a CSV with Line Breaks in the Data Fields
    • However, tab and line feed characters are converted to half-width spaces.
      • 2004 ⇒ 2004
      • "3,000" ⇒ 3,000
      • "SEA\nNYY" ⇒ SEA NYY
      • "In the interview, ""If I ever get fat" ⇒ In the interview, "If I ever get fat
  • The output TSV text is the simplest tab-delimited string.
    • MIME type: text/tab-separated-values; charset=UTF-8
    • Double-quote characters are kept as-is, without escaping.
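The per-field conversions listed above can be reproduced with a short sketch. The helper name `unquoteField` is hypothetical, and the sketch assumes each cell has already been separated from its record; it only illustrates the flattening and unescaping rules, not the record splitting.

```javascript
// Hypothetical helper: convert one CSV cell to its plain TSV form.
function unquoteField(cell) {
  // Tabs and line feeds inside the cell become half-width spaces.
  const flattened = cell.replace(/[\t\n]/g, " ");
  // Drop the enclosing double-quotes when the whole cell is quoted.
  const inner = /^".*"$/.test(flattened) ? flattened.slice(1, -1) : flattened;
  // An escaped double-quote ("") becomes a single double-quote.
  return inner.replace(/""/g, '"');
}

const samples = [
  '2004',
  '"3,000"',
  '"SEA\nNYY"',
  '"In the interview, ""If I ever get fat"',
];
const converted = samples.map(unquoteField);
// converted: ['2004', '3,000', 'SEA NYY', 'In the interview, "If I ever get fat']
console.log(converted);
```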


Appendix

  • If an odd number of double-quotes follows a line feed, it is judged to be a line feed inside a cell and is replaced.
    • Regular expressions are used to replace line feed codes in cell data. (RFC 4180, Section 2, items 6 and 7)
      • 2004,SEA,161,762,262,262 hits in a single season
        2008,SEA,162,749,213,"3,000 top-level professional hits"
        2012,"SEA
        NYY",162,663,178,"In the interview, ""If I ever get fat, I'll quit baseball immediately."""
    • [^"]*: Characters other than double-quote, 0 or more times
    • (?:[^"]*"[^"]*")*[^"]*$: An even number of double-quotes remains before the end of the text
    • (?:[^"]*"[^"]*")*[^"]*"[^"]*$: An odd number of double-quotes remains before the end of the text
    • (?:x): Non-capturing group: Matches “x” but does not remember the match.
  • If there is a blank line in the input CSV text, it is skipped.
    • No line feed is appended after the last line of the output.
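To see how the quote-parity test separates the two kinds of line feed, the sketch below applies it to the sample records from this appendix. The constant and function names are hypothetical, and the odd-count pattern is written here so that ordinary characters may precede the unmatched double-quote; this is one assumed way to express "an odd number of double-quotes remains".

```javascript
// Parity patterns, anchored so they can be tested against the rest of the text.
const EVEN_QUOTES_AHEAD = /^(?:[^"]*"[^"]*")*[^"]*$/;
const ODD_QUOTES_AHEAD = /^(?:[^"]*"[^"]*")*[^"]*"[^"]*$/;

// Hypothetical helper: label every line feed in `csv` as "in-cell" or "record-end".
function classifyLineFeeds(csv) {
  const result = [];
  for (let i = 0; i < csv.length; i++) {
    if (csv[i] !== "\n") continue;
    // Only the text after the line feed decides its role.
    const rest = csv.slice(i + 1);
    result.push(ODD_QUOTES_AHEAD.test(rest) ? "in-cell" : "record-end");
  }
  return result;
}

const csv = [
  '2004,SEA,161,762,262,262 hits in a single season',
  '2008,SEA,162,749,213,"3,000 top-level professional hits"',
  '2012,"SEA',
  'NYY",162,663,178,"In the interview, ""If I ever get fat, I\'ll quit baseball immediately."""',
].join("\n");

const kinds = classifyLineFeeds(csv);
// kinds: ['record-end', 'record-end', 'in-cell']
console.log(kinds);
```

Only the third line feed has an odd number of double-quotes after it (seven), so it is the one that would be replaced with a space.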
