
String Utilities in C++

A small set of utilities that make strings easier to work with: joining, splitting, and casting.

Join

Concatenate the elements of a container with a joining token. Uses ostreams.
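As a rough illustration of the ostream-based approach, here is a minimal standalone sketch. The name join_sketch and its signature are hypothetical and are not the library's actual interface; the only requirement assumed is that each element can be streamed with operator<<.

```cpp
#include <sstream>
#include <string>
#include <string_view>

// Hypothetical helper, not the library's API: streams every element into an
// ostringstream and writes the separator only between consecutive elements.
template <typename Container>
std::string join_sketch(Container const &items, std::string_view separator) {
  std::ostringstream out;
  bool first = true;
  for (auto const &item : items) {
    if (!first) {
      out << separator;
    }
    out << item;
    first = false;
  }
  return out.str();
}
```

Because the elements go through operator<<, the same sketch works for numbers, strings, or any user type that defines a stream operator.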

Tokenizer/Split

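Split a string into a vector of strings. There are two versions of the tokenizer: normal (Tokenizer) and escapable (EscapableTokenizer). The EscapableTokenizer cannot return string_views, because it may have to rewrite the contents of a token (for example, to replace an escape sequence with the quote it stands for).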

Provides the following features:

Ignore Empty Tokens

Discard any token that is the empty string. Enabled by default.

string_utils::Tokenizer split(",");
std::string_view const input = "A,B,C,,D";

split(input); // [ "A", "B", "C", "D" ]

split.ignore_empty_tokens(false);
split(input); // [ "A", "B", "C", "", "D" ]
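To make the behaviour concrete, the following standalone sketch reproduces the empty-token switch without using the library. The function split_sketch and its ignore_empty parameter are hypothetical names chosen for illustration; the real Tokenizer may be implemented quite differently.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical stand-in for the Tokenizer; only the empty-token switch is
// modelled. The returned views point into `input`, so `input` must outlive them.
std::vector<std::string_view> split_sketch(std::string_view input,
                                           std::string_view delim,
                                           bool ignore_empty = true) {
  std::vector<std::string_view> tokens;
  if (delim.empty()) {
    // An empty delimiter would never advance; treat the input as one token.
    tokens.push_back(input);
    return tokens;
  }
  std::size_t start = 0;
  while (true) {
    std::size_t const end = input.find(delim, start);
    bool const last = (end == std::string_view::npos);
    std::string_view const token =
        last ? input.substr(start) : input.substr(start, end - start);
    if (!(ignore_empty && token.empty())) {
      tokens.push_back(token);
    }
    if (last) {
      break;
    }
    start = end + delim.size();
  }
  return tokens;
}
```

With the input "A,B,C,,D", the sketch yields four tokens by default and five when ignore_empty is false, mirroring the example above.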

Max Outputs

Limit the number of outputs returned. The default is effectively unlimited (std::numeric_limits<size_t>::max()).

string_utils::Tokenizer split(",");
std::string_view const input = "A,B,C,D";

split(input).size(); // 4

split.max_outputs(3);
split(input).size(); // 3

Truncate

If the input contains more tokens than the maximum allows, you can choose how the last element is filled: either it holds the entire unsplit remainder of the input (the default), or, with truncate enabled, it holds only the Nth token and the remainder is discarded.

string_utils::Tokenizer split(",");
split.max_outputs(3);

std::string_view const input = "A,B,C,D";

split(input); // [ "A", "B", "C,D" ]

split.truncate(true);
split(input); // [ "A", "B", "C" ]
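The interaction between the output limit and truncation can be sketched as a standalone function. Everything here (the name split_limited, the parameter order, returning an empty result for an empty delimiter) is an assumption made for illustration, and empty tokens are not filtered in this sketch.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical sketch of "max outputs" plus "truncate"; not the library's code.
std::vector<std::string_view> split_limited(std::string_view input,
                                            std::string_view delim,
                                            std::size_t max_outputs,
                                            bool truncate = false) {
  std::vector<std::string_view> tokens;
  if (delim.empty() || max_outputs == 0) {
    return tokens;  // nothing sensible to produce in this sketch
  }
  std::size_t start = 0;
  // Split normally while more than one output slot is still free.
  while (tokens.size() + 1 < max_outputs) {
    std::size_t const end = input.find(delim, start);
    if (end == std::string_view::npos) {
      tokens.push_back(input.substr(start));
      return tokens;  // ran out of input before reaching the limit
    }
    tokens.push_back(input.substr(start, end - start));
    start = end + delim.size();
  }
  // One slot left: keep the whole remainder, or only its first token.
  std::string_view rest = input.substr(start);
  if (truncate) {
    rest = rest.substr(0, rest.find(delim));
  }
  tokens.push_back(rest);
  return tokens;
}
```

For "A,B,C,D" with a limit of 3, this produces A, B, "C,D" without truncation and A, B, C with it, matching the results listed above.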

Reverse Search Order

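Instead of tokenizing the string from front to back, tokenize it from back to front. The tokens are still returned in their original order; when the output limit is reached, the unsplit remainder becomes the first element rather than the last.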

string_utils::Tokenizer split(",");
split.max_outputs(3);
split.reverse_search(true);

std::string_view const input = "A,B,C,D";

split(input); // [ "A,B", "C", "D" ]

split.truncate(true);
split(input); // [ "B", "C", "D" ]
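One way to picture the reverse search is to peel tokens off the end of the input with rfind and reverse the result at the end. The sketch below is a standalone approximation with hypothetical names (rsplit_limited and its parameters); it is not taken from the library.

```cpp
#include <algorithm>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical sketch of back-to-front tokenizing with an output limit.
std::vector<std::string_view> rsplit_limited(std::string_view input,
                                             std::string_view delim,
                                             std::size_t max_outputs,
                                             bool truncate = false) {
  std::vector<std::string_view> tokens;
  if (delim.empty() || max_outputs == 0) {
    return tokens;
  }
  std::string_view rest = input;  // the prefix that has not been split yet
  while (tokens.size() + 1 < max_outputs) {
    std::size_t const pos = rest.rfind(delim);
    if (pos == std::string_view::npos) {
      break;  // no delimiter left; `rest` is the final token
    }
    tokens.push_back(rest.substr(pos + delim.size()));
    rest = rest.substr(0, pos);
  }
  if (truncate) {
    // Keep only the token nearest the end of the remaining prefix.
    std::size_t const pos = rest.rfind(delim);
    if (pos != std::string_view::npos) {
      rest = rest.substr(pos + delim.size());
    }
  }
  tokens.push_back(rest);
  // Tokens were collected back to front; restore the original order.
  std::reverse(tokens.begin(), tokens.end());
  return tokens;
}
```

With "A,B,C,D" and a limit of 3, the sketch returns "A,B", C, D without truncation and B, C, D with it, in line with the example above.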

Quotes

By providing a special quote character (with an optional escape sequence), it is possible to parse more complicated expressions. This is useful, for example, with CSV data, where a field may itself contain a comma.

So that the regular Tokenizer can keep returning a vector of string_views, quote handling lives in a separate class, EscapableTokenizer.

string_utils::Tokenizer split(",");
// CSVs use a quotation mark for quotes, and we'll define doubled quotes as an escaped quote
string_utils::EscapableTokenizer esplit = split.escapable({'"', R"("")"});

std::string_view const input = R"(A,B,"C,D",""E"",F)";

esplit(input); // [ "A", "B", "C,D", "\"E\"", "F" ]
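The sketch below illustrates why a quote-aware tokenizer has to return owning strings: the token "E" with escaped quotes never appears verbatim in the input, so it cannot be a view into it. The function split_quoted_sketch is hypothetical, handles only a single-character delimiter, and is an approximation rather than the library's algorithm.

```cpp
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical sketch of quote-aware splitting; returns owning strings because
// escape sequences are rewritten while scanning.
std::vector<std::string> split_quoted_sketch(std::string_view input, char delim,
                                             char quote,
                                             std::string_view escaped_quote) {
  std::vector<std::string> tokens;
  std::string current;
  bool in_quotes = false;
  std::size_t i = 0;
  while (i < input.size()) {
    if (!escaped_quote.empty() &&
        input.substr(i, escaped_quote.size()) == escaped_quote) {
      current.push_back(quote);  // escape sequence becomes a literal quote
      i += escaped_quote.size();
    } else if (input[i] == quote) {
      in_quotes = !in_quotes;    // quote characters themselves are dropped
      ++i;
    } else if (input[i] == delim && !in_quotes) {
      tokens.push_back(std::move(current));
      current.clear();           // reset the moved-from string before reuse
      ++i;
    } else {
      current.push_back(input[i]);
      ++i;
    }
  }
  tokens.push_back(std::move(current));
  return tokens;
}
```

One limitation of this simplified scan is that an empty quoted field written as two adjacent quotes is read as an escaped quote; a real CSV parser needs more context to tell the two apart.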

Cast - Coercing types from strings

In GoogleMock, if you don't want to define an ostream operator for your type, you can instead define a function PrintTo(T const &, std::ostream*) in the same namespace as T. GoogleMock then finds that function through argument-dependent lookup (ADL) and uses it to print the formatted value.

Two overloads matter here: the single-token cast and the multi-token cast.

bool cast(T &out, std::string_view);
bool cast(T &out, std::vector<std::string_view> const &);
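Assuming cast is meant to be extended the same way PrintTo is, by defining an overload in the namespace of your own type, the pattern might look like the sketch below. The namespaces sketch and geometry, the Point type, and the parse_tokens helper are all hypothetical; only the two signature shapes come from the declarations above.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>
#include <vector>

namespace sketch {
// Single-token cast for int: succeeds only if the whole token is a number.
inline bool cast(int &out, std::string_view token) {
  if (token.empty()) {
    return false;
  }
  char const *first = token.data();
  char const *last = first + token.size();
  auto const [ptr, ec] = std::from_chars(first, last, out);
  return ec == std::errc{} && ptr == last;
}
}  // namespace sketch

namespace geometry {
struct Point {
  int x = 0;
  int y = 0;
};

// Multi-token cast for a user type, declared next to the type so that an
// unqualified call can find it through ADL.
inline bool cast(Point &out, std::vector<std::string_view> const &tokens) {
  return tokens.size() == 2 && sketch::cast(out.x, tokens[0]) &&
         sketch::cast(out.y, tokens[1]);
}
}  // namespace geometry

// Generic caller: the unqualified call considers sketch::cast through the
// using-declaration and geometry::cast through ADL on Point.
template <typename T>
bool parse_tokens(T &out, std::vector<std::string_view> const &tokens) {
  using sketch::cast;
  return cast(out, tokens);
}
```

The boolean return value reports failure without exceptions; note that in this sketch the output may be partially modified when a cast fails.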