Commit Graph

7 Commits

Author SHA1 Message Date
5c4c8bcb80 lexer: Consume invalid characters 2023-10-25 19:28:54 -05:00
0e917837c0 identify_tokens: Use Display on token Data 2023-10-23 21:02:09 -05:00
b5abd2bff1 Lexer rewrite:
- Scan the input string *linearly*, without backtracking
  - Peek at most one character (unicode code-point) ahead
- Store data (unescaped string literals and chars, identifiers, integers, floats) inside Token
  - This unfortunately makes tokens non-Copy
- Refactor Parser to accommodate these changes
  - On the bright side, Parser no longer needs a reference to the text!
- Write a new set of lexer tests
  - TODO: write a new set of token tests using tokendata

Every day, we get closer to parsing `dummy.cl`!
2023-10-22 18:28:20 -05:00
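The approach described in b5abd2bff1 (a linear scan with at most one code point of lookahead, and tokens that own their data) might look roughly like the sketch below. It is not the repository's code: the `Token` variants, the `Lexer` struct, and the `consume_while` helper are hypothetical names chosen for illustration, and the escape handling is deliberately minimal. Holding a `String` is what makes `Token` non-Copy, and because each token owns its data, a consumer such as a parser does not need a reference to the source text.

```rust
use std::iter::Peekable;
use std::str::Chars;

/// Hypothetical token type. Owning a `String` makes it `Clone` but not `Copy`.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Identifier(String),
    Integer(u64),
    Str(String),
    /// An unrecognized character, consumed so the scan always advances.
    Invalid(char),
}

/// Hypothetical lexer: `Peekable` gives exactly one code point of lookahead.
pub struct Lexer<'t> {
    chars: Peekable<Chars<'t>>,
}

impl<'t> Lexer<'t> {
    pub fn new(text: &'t str) -> Self {
        Self { chars: text.chars().peekable() }
    }

    /// Appends characters to `out` while `pred` holds, never backtracking.
    fn consume_while(&mut self, out: &mut String, pred: impl Fn(char) -> bool) {
        while let Some(&c) = self.chars.peek() {
            if !pred(c) {
                break;
            }
            out.push(c);
            self.chars.next();
        }
    }
}

impl Iterator for Lexer<'_> {
    type Item = Token;

    fn next(&mut self) -> Option<Token> {
        // Skip leading whitespace.
        while let Some(&c) = self.chars.peek() {
            if !c.is_whitespace() {
                break;
            }
            self.chars.next();
        }
        let c = self.chars.next()?;
        Some(match c {
            c if c.is_alphabetic() || c == '_' => {
                let mut s = String::from(c);
                self.consume_while(&mut s, |c| c.is_alphanumeric() || c == '_');
                Token::Identifier(s)
            }
            c if c.is_ascii_digit() => {
                let mut s = String::from(c);
                self.consume_while(&mut s, |c| c.is_ascii_digit());
                match s.parse::<u64>() {
                    Ok(n) => Token::Integer(n),
                    // Overflow: report the literal as invalid (an assumption).
                    Err(_) => Token::Invalid(c),
                }
            }
            '"' => {
                // Store the *unescaped* contents of the string literal.
                let mut s = String::new();
                loop {
                    match self.chars.next() {
                        Some('"') => break Token::Str(s),
                        Some('\\') => match self.chars.next() {
                            Some('n') => s.push('\n'),
                            Some(e) => s.push(e),
                            None => break Token::Invalid('\\'),
                        },
                        Some(ch) => s.push(ch),
                        None => break Token::Invalid('"'),
                    }
                }
            }
            other => Token::Invalid(other),
        })
    }
}
```
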
1f0725d2c8 examples/identify_tokens.rs: Overhaul user experience 2023-10-16 22:26:32 -05:00
545483dae6 token: Add Display impl for Type
This improves readability in identify_tokens
2023-09-28 01:34:56 -05:00
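For illustration, a `Display` impl of the kind 545483dae6 mentions could look like the sketch below. The variants of `Type` and the printed names are assumptions, not the repository's actual definitions. Using `Formatter::pad` rather than `write!` lets callers apply width and alignment, which helps when printing tokens in columns.

```rust
use std::fmt;

/// Hypothetical token type enum; the real variants may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Type {
    Identifier,
    Integer,
    String,
    Invalid,
}

impl fmt::Display for Type {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        let name = match self {
            Type::Identifier => "identifier",
            Type::Integer => "integer",
            Type::String => "string",
            Type::Invalid => "invalid",
        };
        // `pad` honors width/alignment flags such as `{:<12}`.
        f.pad(name)
    }
}
```
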
097e2c4f11 conlang: Rename literals; split, compose, and document Rules
- Renamed literal Types to reflect their literal nature
  - This allows for consistent naming across future non-literal Types
- Complicated lexer Rules have been split into composable sub-rules,
  and moved into the Rule struct.
  - This improves modularity, and allows sharing of sub-rules across rules.
- Documented each lexer rule with (at least) a one-line blurb
  describing its function
2023-09-27 18:13:01 -05:00
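One way to picture the "composable sub-rules" from 097e2c4f11 is a small builder-style matcher in which each sub-rule is a method on a `Rule` value and larger rules chain them. The sketch below is an assumption about the general shape only: the fields (`text`, `taken`, `ok`) and the methods (`char_fn`, `any`, `end`, and the named rules) are hypothetical and may not match the repository's `Rule` struct.

```rust
/// Hypothetical matcher state: the input, bytes matched so far, and success.
#[derive(Clone, Copy, Debug)]
pub struct Rule<'t> {
    text: &'t str,
    taken: usize,
    ok: bool,
}

impl<'t> Rule<'t> {
    pub fn new(text: &'t str) -> Self {
        Self { text, taken: 0, ok: true }
    }

    /// Returns the number of bytes matched, or `None` if the rule failed.
    pub fn end(self) -> Option<usize> {
        self.ok.then_some(self.taken)
    }

    /// Matches a single character that satisfies `pred`.
    pub fn char_fn(self, pred: impl Fn(char) -> bool) -> Self {
        if !self.ok {
            return self;
        }
        match self.text[self.taken..].chars().next() {
            Some(c) if pred(c) => Self { taken: self.taken + c.len_utf8(), ..self },
            _ => Self { ok: false, ..self },
        }
    }

    /// Applies `rule` zero or more times, keeping the last successful state.
    pub fn any(self, rule: impl Fn(Self) -> Self) -> Self {
        if !self.ok {
            return self;
        }
        let mut this = self;
        loop {
            let next = rule(this);
            if !next.ok || next.taken == this.taken {
                return this;
            }
            this = next;
        }
    }

    // Sub-rules, shared by the composed rules below.

    /// Matches one ASCII digit.
    pub fn digit(self) -> Self {
        self.char_fn(|c| c.is_ascii_digit())
    }
    /// Matches the first character of an identifier.
    pub fn ident_start(self) -> Self {
        self.char_fn(|c| c.is_alphabetic() || c == '_')
    }
    /// Matches any later character of an identifier.
    pub fn ident_continue(self) -> Self {
        self.char_fn(|c| c.is_alphanumeric() || c == '_')
    }

    // Composed rules.

    /// Matches an identifier: a start character, then any continue characters.
    pub fn identifier(self) -> Self {
        self.ident_start().any(Rule::ident_continue)
    }
    /// Matches an integer: one or more digits.
    pub fn integer(self) -> Self {
        self.digit().any(Rule::digit)
    }
}
```

Because every sub-rule takes and returns a `Rule`, a sub-rule such as `digit` can be reused by any composed rule without duplicating the matching logic.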
46e72e4889 conlang: add toy program to interact with the tokenizer 2023-09-25 18:06:03 -05:00