Commit Graph

524 Commits

SHA1 Message Date
0445598ae8 grammar: Remove Ignore expression to avoid parsing ambiguity :'( 2023-10-26 14:36:55 -05:00
b1f90ca4e9 conlang: Add statements
TODO: Parse `let` statements
2023-10-26 14:33:56 -05:00
d1b5c48aac constr/lerox: remove 2023-10-26 12:23:26 -05:00
5c4c8bcb80 lexer: Consume invalid characters 2023-10-25 19:28:54 -05:00
32bde2f749 Minor documentation fixes 2023-10-23 23:43:11 -05:00
0e1beca43d Cargo.toml: wrong key ("publish") 2023-10-23 23:42:39 -05:00
9196d3b1da Add package metadata 2023-10-23 21:38:06 -05:00
1b14a43129 Add LICENSE 2023-10-23 21:05:51 -05:00
a339dfd549 dummy.cl: Use unicode escape 🦈 2023-10-23 21:02:45 -05:00
0e917837c0 identify_tokens: Use Display on token Data 2023-10-23 21:02:09 -05:00
8b1a1534f3 docs: Improve and update documentation 2023-10-23 21:01:32 -05:00
c43ecf00d1 token::token_data::Data: Renamed from token::TokenData
& Moved into its own module
+ Added token::preamble; common imports when working with Token
2023-10-23 19:43:16 -05:00
2d1b6d82db Update readme 2023-10-22 18:41:10 -05:00
ccfddcc09a Conlang: Add range operators
TODO: Limit range operators to at most 2 operands
2023-10-22 18:30:00 -05:00
b5abd2bff1 Lexer rewrite:
- Scan the input string *linearly*, without backtracking
  - Peek at most one character (unicode code-point) ahead
- Store data (unescaped string literals and chars, identifiers, integers, floats) inside Token
  - This unfortunately makes tokens non-Copy
- Refactor Parser to accommodate these changes
  - On the bright side, Parser no longer needs a reference to the text!
- Write a new set of lexer tests
  - TODO: write a new set of token tests using tokendata

Every day, we get closer to parsing `dummy.cl`!
2023-10-22 18:28:20 -05:00
feb5cc5dd0 AST: Refactor binary operations, fix Walk trait
- Unified math operations into a single self-referential enum
- Walk now visits the children of a node, rather than the node itself
  - The old behavior was super confusing, and led to numerous stack overflows.
2023-10-21 12:24:52 -05:00
4ec91ff806 constr: switch to Rust-style unicode parsing 2023-10-21 12:20:22 -05:00
1e5f7149d9 Token: Add range operators, rename for consistency 2023-10-20 17:09:14 -05:00
34525ec77a lexer: Only allow lines up to 4GB in size. Saves 8 bytes per token 2023-10-20 15:34:54 -05:00
a26a01fc17 constr: Misuse iterators to parse tokens 2023-10-20 15:33:02 -05:00
7f7393d2c6 conlang: Grammatical cleanup; merge control flow expressions 2023-10-19 14:40:03 -05:00
331141356a tests: Move module into file 2023-10-17 13:39:05 -05:00
73f7337fd5 pretty_printer: Move module into file 2023-10-17 13:38:30 -05:00
59617d805a parser: Move module into file 2023-10-17 13:36:21 -05:00
3d0698ba60 lexer: Move module into file 2023-10-17 13:33:07 -05:00
b1d9e4fd03 ast: Move module into file 2023-10-17 13:30:16 -05:00
5d9c45c26a token: Move module into file 2023-10-17 13:23:34 -05:00
71f43d852a readme.md: Clarify goals 2023-10-16 23:01:09 -05:00
d6a80e4961 dummy.cl: Update to include while expressions 2023-10-16 22:55:07 -05:00
b707bcef81 examples: Add parse_input REPL for fun! 2023-10-16 22:51:59 -05:00
8d1107cb57 Conlang: Add a pretty printer for the AST 2023-10-16 22:51:07 -05:00
b89ed307a2 parser: Add recursive descent parser for AST. TODO: Error recovery and synchronization. 2023-10-16 22:50:15 -05:00
0a99a37bd8 ast: Disallow unused imports 2023-10-16 22:47:16 -05:00
c7d1aa4d2c lexer: Make the Lexer IntoIter'able 2023-10-16 22:46:38 -05:00
2ce89eec82 lexer: unmatched delimiters? in my commits? Shame! Shame for a thousand years! 2023-10-16 22:42:32 -05:00
4870ff17ba Token: whoops, these fields need to be public (for now) 2023-10-16 22:37:04 -05:00
3866a2e9f2 libconlang: Define the AST (TODO: clean up the AST) 2023-10-16 22:36:26 -05:00
cdb8c28e64 grammar.ebnf: Initial prototype grammar 2023-10-16 22:33:38 -05:00
e4c827e429 Update .gitignore 2023-10-16 22:32:22 -05:00
8eb1f46a65 Update readme.md 2023-10-16 22:32:06 -05:00
1f0725d2c8 examples/identify_tokens.rs: Overhaul user experience 2023-10-16 22:26:32 -05:00
50bb03ae18 token/lexer: Keywords are identifiers. Add missing operators. Fix <<=/>>=. 2023-10-16 22:25:48 -05:00
9c993b31a1 token: Rename literal tokens & assoc. functions 2023-10-16 22:14:08 -05:00
c15490847a token & lexer: add line & col to tokens 2023-10-13 13:05:14 -05:00
545483dae6 token: Add Display impl for Type
This improves readability in identify_tokens
2023-09-28 01:34:56 -05:00
48f5e5248c conlang: Add break, true, false keywords, and example file 2023-09-28 01:31:46 -05:00
f25685560a conlang: Tokenize punctuation (resolves #9)
Things left to consider:
- token::Type enum is getting fairly large.
  breaking it up could invoke substantial code bloat
- Compound operators might make more sense at the parser level
  - Compound-assign operators are ripe for syntactic desugaring,
    but there must be some reason it's done separately in other languages.
  - Operators like FatArrow may still make sense at the tokenizer level, regardless.
  - What is a lexer? A miserable pile of parsers!
- Operator overloading, or user-defined operators? Hmm...
2023-09-28 00:11:31 -05:00
d3b1e65e35 lexer: Add documentation blurbs to token-class functions 2023-09-28 00:09:47 -05:00
b1621f2522 tests: Add tests for token-class functions 2023-09-28 00:08:51 -05:00
0661789d42 Lexer: reduce code duplication in Rule mapping
produce_token renamed to map_rule
- Skips leading whitespace before any Rule evaluations
- Handles creation and destruction of Rule
2023-09-27 22:51:51 -05:00