maybe some experienced people can correct me
Well first thing, I'd recommend saving the gist with a .rs extension so we get syntax highlighting :)
You should convert your loop to iterate over input.chars() instead. Your current loop will have issues if someone writes, for example, "naïveté" due to some of those letters being multiple bytes long in UTF-8 (which is the encoding str and String use). What you can do is:
let mut input = input.chars().peekable();
while let Some(ch) = input.next() {
    // ...
}
This also lets you call input.peek() in the body to look at the next character without consuming it from the iterator. input.peek().is_some() tells you whether there's more data (and input.peek().is_none() tells you that you're at the end of the input).
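To make the peek part concrete, here's a rough sketch of how I'd collect a multi-digit number with it. split_numbers is a name I made up for illustration, it isn't from your gist, and I'm assuming you only care about ASCII digits:

```rust
// Hypothetical helper (not from the original gist): splits the input into
// runs of ASCII digits and single non-whitespace characters.
fn split_numbers(input: &str) -> Vec<String> {
    let mut out = Vec::new();
    let mut chars = input.chars().peekable();
    while let Some(ch) = chars.next() {
        if ch.is_ascii_digit() {
            let mut number = String::from(ch);
            // Look ahead without consuming; stop at the first non-digit.
            while let Some(&next) = chars.peek() {
                if !next.is_ascii_digit() {
                    break;
                }
                number.push(next);
                chars.next(); // now actually consume the digit we peeked at
            }
            out.push(number);
        } else if !ch.is_whitespace() {
            out.push(ch.to_string());
        }
    }
    out
}
```

Calling split_numbers("12+ï3") should give you "12", "+", "ï", "3". The ï comes through as one item because chars() walks characters, not bytes. For comparison, "naïveté" (with precomposed accents) is 7 chars but 9 bytes, which is exactly where byte indexing goes wrong.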
Even in C, I'd have made the big if-chain use if-else to avoid evaluating conditions that are known to be false. However, here I'd convert it to a big match statement:
match ch {
    ' ' => {}
    ';' => {
        // ...
    }
    _ if ch.is_numeric() => {
        // ...
    }
    // ...
}
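Here's a small complete version of that match so you can see the arms doing something. The Kind enum and classify are placeholders I invented; your real TokenType will look different:

```rust
// Hypothetical token kinds; the real gist's TokenType may look different.
#[derive(Debug, PartialEq)]
enum Kind {
    Semicolon,
    Digit,
    Other,
}

// Hypothetical helper: tags every non-space character with a Kind.
fn classify(input: &str) -> Vec<(Kind, char)> {
    let mut tokens = Vec::new();
    for ch in input.chars() {
        match ch {
            // Spaces are skipped entirely.
            ' ' => {}
            ';' => tokens.push((Kind::Semicolon, ch)),
            // A guard arm: only taken when the condition holds.
            _ if ch.is_numeric() => tokens.push((Kind::Digit, ch)),
            // Catch-all, needed because match must be exhaustive.
            _ => tokens.push((Kind::Other, ch)),
        }
    }
    tokens
}
```

Arms are tried top to bottom and only the first matching one runs, so you get the if-else behaviour for free. One thing to watch: char::is_numeric is also true for non-ASCII numerals like '¾', so if you only want 0 through 9, ch.is_ascii_digit() is probably the better guard.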
As an optional (and more advanced) thing, you can return slices into the input instead of copies of those slices:
fn tokenize(input: &str) -> Vec<(TokenType, &str)>
If you want to do this, you should use .char_indices() instead of .chars() so you know the byte offsets at which to slice the input string.
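A sketch of the slicing approach, again with a made-up function (digit_runs) and only handling runs of ASCII digits to keep it short:

```rust
// Hypothetical sketch: returns each run of ASCII digits as a slice of
// `input`, without allocating any new strings.
fn digit_runs(input: &str) -> Vec<&str> {
    let mut runs = Vec::new();
    let mut chars = input.char_indices().peekable();
    while let Some((start, ch)) = chars.next() {
        if !ch.is_ascii_digit() {
            continue;
        }
        // `end` is the byte offset just past the last digit seen so far.
        let mut end = start + ch.len_utf8();
        while let Some(&(idx, next)) = chars.peek() {
            if !next.is_ascii_digit() {
                break;
            }
            end = idx + next.len_utf8();
            chars.next();
        }
        // Both offsets come from char_indices, so they sit on char boundaries.
        runs.push(&input[start..end]);
    }
    runs
}
```

The offsets from char_indices() are byte positions, which is what string slicing expects, so &input[start..end] won't panic even when there are multi-byte characters earlier in the string. The returned slices borrow from input, so the input has to outlive the token list.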
Otherwise, you can use std::mem::take(&mut current_token) to replace current_token with an empty string and take (without cloning) the existing value out of that variable:
use std::mem::take;
tokens.push((blah, take(&mut current_token)));
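In context that looks roughly like this. split_on_semicolons is just a toy I made up to show the pattern, not your tokenizer:

```rust
use std::mem::take;

// Hypothetical sketch: splits on ';' and moves each finished buffer
// into the output instead of cloning it.
fn split_on_semicolons(input: &str) -> Vec<String> {
    let mut tokens = Vec::new();
    let mut current_token = String::new();
    for ch in input.chars() {
        if ch == ';' {
            // Moves the String out and leaves an empty String behind.
            tokens.push(take(&mut current_token));
        } else {
            current_token.push(ch);
        }
    }
    // Don't forget whatever is left after the last separator.
    if !current_token.is_empty() {
        tokens.push(current_token);
    }
    tokens
}
```

take works for any type that implements Default, and for String the default is the empty string, so current_token is immediately ready to collect the next token.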