Improvements to Split block for whitespace and lines:

* Split by whitespace now uses the built-in regex definition of whitespace, \s.
  This catches all characters defined as whitespace, see below:
  https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp
* Split by line now breaks on all Unicode-compliant line breaks. The biggest impact here is
  that OSX and Windows files will now split the same way.
The cr option is still around, but there's no longer a need for it, IMO.
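
To illustrate the line-splitting change, here is a minimal standalone sketch that applies the line-break pattern from this commit to text with Unix, Windows, and classic Mac line endings. The helper name `splitLines` and the sample strings are hypothetical and only for illustration; they are not part of Snap!.

```javascript
// The line-break pattern used by the 'line' case after this commit.
// '\r\n' is listed first so a Windows line ending counts as one break.
const LINE_BREAK = /\r\n|[\n\v\f\r\x85\u2028\u2029]/;

// Hypothetical helper (not part of Snap!): split text into lines.
function splitLines(text) {
    return text.split(LINE_BREAK);
}

const unixText = 'one\ntwo\nthree';
const windowsText = 'one\r\ntwo\r\nthree';
const classicMacText = 'one\rtwo\rthree';

console.log(splitLines(unixText));       // [ 'one', 'two', 'three' ]
console.log(splitLines(windowsText));    // [ 'one', 'two', 'three' ]
console.log(splitLines(classicMacText)); // [ 'one', 'two', 'three' ]

// For comparison, splitting on '\n' alone leaves a trailing '\r'
// on each Windows line except the last one.
console.log(windowsText.split('\n'));    // [ 'one\r', 'two\r', 'three' ]
```

All three inputs yield the same list, which is why a separate cr option should rarely be needed.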
Michael Ball 2014-10-18 23:00:11 -07:00
parent 5f3279990b
commit dbf2e6665b
1 changed file with 8 additions and 4 deletions

@@ -2134,15 +2134,17 @@ Process.prototype.reportTextSplit = function (string, delimiter) {
         str,
         del;
     if (!contains(types, strType)) {
-        throw new Error('expecting a text instad of a ' + strType);
+        throw new Error('expecting text instead of a ' + strType);
     }
     if (!contains(types, delType)) {
-        throw new Error('expecting a text delimiter instad of a ' + delType);
+        throw new Error('expecting a text delimiter instead of a ' + delType);
     }
     str = (string || '').toString();
     switch (this.inputOption(delimiter)) {
     case 'line':
-        del = '\n';
+        // Entirely Unicode Compliant Line Splitting (Platform independent)
+        // http://www.unicode.org/reports/tr18/#Line_Boundaries
+        del = /\r\n|[\n\v\f\r\x85\u2028\u2029]/;
         break;
     case 'tab':
         del = '\t';
@@ -2151,7 +2153,9 @@ Process.prototype.reportTextSplit = function (string, delimiter) {
         del = '\r';
         break;
     case 'whitespace':
-        return new List(str.trim().split(/[\t\r\n ]+/));
+        str = str.trim();
+        del = /\s+/;
+        break;
     case 'letter':
         del = '';
         break;
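
The effect of the 'whitespace' change can be sketched outside Snap! by comparing the old character class with \s on the same input. The two helper names and the sample string below are hypothetical; the sketch returns plain arrays rather than Snap!'s List.

```javascript
// Hypothetical helper mirroring the new 'whitespace' case:
// trim first, then split on runs of any \s character.
function splitWhitespaceNew(text) {
    return text.trim().split(/\s+/);
}

// Hypothetical helper mirroring the old behavior, which only
// recognized tab, carriage return, newline, and space.
function splitWhitespaceOld(text) {
    return text.trim().split(/[\t\r\n ]+/);
}

// Sample with a no-break space (\u00a0) and a form feed (\f),
// neither of which is in the old character class.
const sample = '  alpha\u00a0beta\fgamma \t delta\n';

console.log(splitWhitespaceNew(sample));
// [ 'alpha', 'beta', 'gamma', 'delta' ]

console.log(splitWhitespaceOld(sample));
// [ 'alpha\u00a0beta\fgamma', 'delta' ]
```

The old class misses the no-break space and the form feed, so it returns two items where \s+ returns four.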