Sam Jaffe 2 years ago
parent
commit
9c43366571
2 changed files with 85 additions and 7 deletions
  1. 84 6
      README.md
  2. 1 1
      include/string_utils/tokenizer.h

+ 84 - 6
README.md

@@ -8,12 +8,90 @@ Concatenate the elements of a container with a joining token. Uses ostreams.
 
 ## Tokenizer/Split
 
-Split a string into a vector of strings. There are two different versions of the tokenizer: normal 
-and escapable. The EscapableTokenizer cannot return string\_views, because it may have to doctor 
-the contents.
+Split a string into a vector of strings. There are two versions of the tokenizer: normal and escapable. The escapable version (`EscapedTokenizer`) cannot return string\_views, because it may have to rewrite the contents of a token (for example, to unescape a quote).
+
+Provides the following features:
+
+### Ignore Empty Tokens
+Discard any token that is the empty string. Enabled by default.
+
+``` c++
+string_utils::Tokenizer split(",");
+std::string_view const input = "A,B,C,,D";
+
+split(input); // [ "A", "B", "C", "D" ]
+
+split.ignore_empty_tokens(false);
+split(input); // [ "A", "B", "C", "", "D" ]
+```
+
+
+### Max Outputs
+Limit the number of outputs returned. The default is effectively _infinite_ (the maximum value of `size_t`).
+
+``` c++
+string_utils::Tokenizer split(",");
+std::string_view const input = "A,B,C,D";
+
+split(input).size(); // 4
+
+split.max_outputs(3);
+split(input).size(); // 3
+```
+
+### Truncate
+If the input contains more tokens than the maximum allows, you can either keep the unsplit remainder of the input in the last element (the default), or truncate the result and discard the remainder.
+
+``` c++
+string_utils::Tokenizer split(",");
+split.max_outputs(3);
+
+std::string_view const input = "A,B,C,D";
+
+split(input); // [ "A", "B", "C,D" ]
+
+split.truncate(true);
+split(input); // [ "A", "B", "C" ]
+```
+
+### Reverse Search Order
+Instead of tokenizing the string from front-to-back, do it from back-to-front.
+
+``` c++
+string_utils::Tokenizer split(",");
+split.max_outputs(3);
+split.reverse_search(true);
+
+std::string_view const input = "A,B,C,D";
+
+split(input); // [ "A,B", "C", "D" ]
+
+split.truncate(true);
+split(input); // [ "B", "C", "D" ]
+```
+
+### Quotes
+By providing a special quote character (with an optional escape sequence), it is possible to parse more complicated expressions. This is useful with CSV data, for example, where a comma may need to appear inside one of the fields.
+
+So that the regular tokenizer can still return a vector of string\_views, quote handling is implemented in a separate class.
+
+``` c++
+string_utils::Tokenizer split(",");
+// CSVs use a quotation mark for quotes, and we'll define doubled quotes as an escaped quote
+string_utils::EscapedTokenizer esplit = split.escapable({'"', R"("")"});
+
+std::string_view const input = R"(A,B,"C,D",""E"",F)";
+
+esplit(input); // [ "A", "B", "C,D", "\"E\"", "F" ]
+```
 
 ## Cast - Coercing types from strings
 
-In GoogleMock, if you don't want to define an ostream operator for your type, you can define a 
-function `PrintTo(T const &, std::ostream*)` in the same namespace as `T`. GoogleMock then uses ADL 
-to find that function and use it to print out the formatted version.
+In GoogleMock, if you don't want to define an ostream operator for your type, you can define a function `PrintTo(T const &, std::ostream*)` in the same namespace as `T`. GoogleMock then uses ADL to find that function and use it to print out the formatted version.
+
+Two functions are important here: the single-token cast and the multi-token cast.
+
+``` c++
+bool cast(T &out, std::string_view);
+bool cast(T &out, std::vector<std::string_view> const &);
+```
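
The README change above only lists the `cast` signatures, so the following is a minimal sketch of what a single-token cast for integers could look like. The name `cast_sketch`, the use of `std::from_chars`, and the assumption that `true` means "the whole token was parsed" are illustrative choices, not taken from the library.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>
#include <type_traits>

// Hypothetical stand-in for `bool cast(T &out, std::string_view)`.
// Assumption: the return value reports success, and `out` is left
// untouched when parsing fails.
template <typename T>
bool cast_sketch(T & out, std::string_view str) {
  static_assert(std::is_integral_v<T> && !std::is_same_v<T, bool>,
                "this sketch only handles non-bool integer types");
  T tmp{};
  char const * const first = str.data();
  char const * const last = first + str.size();
  auto const [ptr, ec] = std::from_chars(first, last, tmp);
  // Reject parse errors and trailing garbage such as "42x".
  if (ec != std::errc{} || ptr != last) { return false; }
  out = tmp;
  return true;
}
```

Parsing into a temporary first means a failed cast never leaves `out` half-written, which keeps the boolean return value easy to reason about at the call site.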

+ 1 - 1
include/string_utils/tokenizer.h

@@ -41,7 +41,7 @@ public:
   Tokenizer & truncate(bool new_truncate_overage);
   Tokenizer & ignore_empty_tokens(bool new_ignore_empty_tokens);
   Tokenizer & reverse_search(bool new_reverse);
-  EscapedTokenizer escapable(Quote quote = Quote{'\0', ""}) const;
+  [[nodiscard]] EscapedTokenizer escapable(Quote quote = Quote{'\0', ""}) const;
 
   std::vector<std::string> operator()(std::string && input) const;
   std::vector<std::string_view> operator()(std::string_view input) const;
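
To make the interaction between the `max_outputs`, `truncate`, and `ignore_empty_tokens` setters concrete, here is a small free-function sketch of forward splitting that reproduces the outputs shown in the README examples. `SplitOptions` and `split_sketch` are hypothetical names, and this is not the library's implementation; reverse search and quoting are left out.

```cpp
#include <cstddef>
#include <limits>
#include <string_view>
#include <vector>

// Hypothetical option bundle mirroring the Tokenizer setters.
struct SplitOptions {
  std::size_t max_outputs = std::numeric_limits<std::size_t>::max();
  bool truncate = false;
  bool ignore_empty_tokens = true;
};

std::vector<std::string_view> split_sketch(std::string_view input,
                                           std::string_view divider,
                                           SplitOptions const & opts = {}) {
  std::vector<std::string_view> rval;
  if (opts.max_outputs == 0 || divider.empty()) { return rval; }

  auto const keep = [&opts](std::string_view token) {
    return !token.empty() || !opts.ignore_empty_tokens;
  };

  std::size_t from = 0;
  while (from <= input.size()) {
    // Last free slot without truncation: store the unsplit remainder.
    if (!opts.truncate && rval.size() + 1 == opts.max_outputs) {
      std::string_view const rest = input.substr(from);
      if (keep(rest)) { rval.push_back(rest); }
      break;
    }
    std::size_t const to = input.find(divider, from);
    std::size_t const len =
        (to == std::string_view::npos) ? std::string_view::npos : to - from;
    std::string_view const token = input.substr(from, len);
    if (keep(token)) { rval.push_back(token); }
    // Stop at the end of the input, or once truncation has filled the result.
    if (to == std::string_view::npos || rval.size() == opts.max_outputs) { break; }
    from = to + divider.size();
  }
  return rval;
}
```

With `max_outputs = 3`, splitting `"A,B,C,D"` on `","` yields `A`, `B`, `C,D`; setting `truncate = true` as well yields `A`, `B`, `C`. The returned views point into `input`, so the caller must keep the underlying string alive while using them.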