Because the node uses a flexible array member, it should not be
instantiated on the stack. Hence a separate type for t_tree is needed so
that the user may keep a tree on the stack (as it is convenient).
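A minimal sketch of why the split helps. The types t_fam_node_sketch and
t_tree_sketch and the function fam_node_new_sketch are hypothetical and
are not the Libft definitions; they only show a heap-only node next to a
fixed-size handle.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node: the flexible array member makes its real size
 * depend on the allocation, so it has to live on the heap. */
typedef struct s_fam_node_sketch
{
    size_t  value_count;
    int     values[];   /* flexible array member */
}   t_fam_node_sketch;

/* Hypothetical handle: fixed size, safe to declare on the stack. */
typedef struct s_tree_sketch
{
    t_fam_node_sketch   *root;
}   t_tree_sketch;

/* Allocates a node with room for count values, all set to 0.
 * Returns NULL when the allocation fails. */
t_fam_node_sketch   *fam_node_new_sketch(size_t count)
{
    t_fam_node_sketch   *node;

    assert(count > 0);
    node = malloc(sizeof(*node) + count * sizeof(node->values[0]));
    if (node == NULL)
        return (NULL);
    node->value_count = count;
    for (size_t i = 0; i < count; ++i)
        node->values[i] = 0;
    return (node);
}
```

The caller declares a t_tree_sketch as a local variable and only the
node it points to is allocated dynamically.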
The change turns t_tree from a pointer to t_tree_node back into just
t_tree_node. I don't yet have much of a grasp on the ergonomics of this,
but I think this is the better choice.
It was moved to a separate project because it feels more like a
standalone tool than a useful part of a library.
The project can be found at
git://ljiriste.work/parsing_table_generator
Lukas Jiriste [Wed, 26 Mar 2025 19:55:47 +0000 (20:55 +0100)]
Make Libft C++ aware and compilable
Though compilation is only possible with the -fpermissive flag.
It is not a great idea to use everything (or even most things) in C++,
but it is better to be consistent and it also makes it easier to migrate
a project using Libft from C to C++.
Lukas Jiriste [Thu, 28 Nov 2024 14:31:39 +0000 (15:31 +0100)]
Make parse_tree traversal a little bit easier
The ft_get_node_child function is very simple but may save a lot of
code, because ft_vec_access returns void* which needs to be cast before
further actions can be done.
(And casting to (t_parse_tree_node *) is a lot of code).
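A rough sketch of the idea, not the Libft code. The types
t_vec_sketch and t_pnode_sketch and both functions are hypothetical
stand-ins; the only thing taken from the message above is that the
generic access returns void * and the helper hides the cast.

```c
#include <stddef.h>

/* Hypothetical stand-in for a generic vector. */
typedef struct s_vec_sketch
{
    void    *data;
    size_t  elem_size;
    size_t  size;
}   t_vec_sketch;

/* Hypothetical stand-in for a parse tree node. */
typedef struct s_pnode_sketch
{
    int             token_id;
    t_vec_sketch    children;   /* elements are t_pnode_sketch */
}   t_pnode_sketch;

/* Generic access: returns void *, or NULL when out of range. */
void    *vec_access_sketch(t_vec_sketch *vec, size_t index)
{
    if (vec == NULL || index >= vec->size)
        return (NULL);
    return ((char *)vec->data + index * vec->elem_size);
}

/* The helper performs the cast once, so callers can chain accesses. */
t_pnode_sketch  *get_node_child_sketch(t_pnode_sketch *node, size_t index)
{
    if (node == NULL)
        return (NULL);
    return ((t_pnode_sketch *)vec_access_sketch(&node->children, index));
}
```

With such a helper a grandchild lookup becomes
get_node_child_sketch(get_node_child_sketch(root, 0), 1) instead of two
casts spread over several lines.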
Lukas Jiriste [Thu, 28 Nov 2024 10:03:09 +0000 (11:03 +0100)]
Add -fPIE flag to Makefile
The -fPIE flag makes the compiler generate position-independent code
for executables. I think I've encountered this problem before when using
ft_printf, because of its ability to print pointer values (addresses).
Lukas Jiriste [Fri, 2 Aug 2024 13:09:59 +0000 (15:09 +0200)]
Improve Makefile debug functionality
Makefile now creates .debug file when "make debug" is executed.
This makes the Makefile able to remember the debug mode for further
builds and rebuilds. The debug mode can be disabled by removing the
.debug file or by executing "make nodebug", which will also build the
project.
Lukas Jiriste [Fri, 2 Aug 2024 12:14:20 +0000 (14:14 +0200)]
Extend the input possibilities for parsing table
This is done so that a parsing table may be included inside a source
file as a string.
Adding a new (long) filename to the Makefile led to a change of
alignment depth.
Lukas Jiriste [Thu, 1 Aug 2024 10:03:40 +0000 (12:03 +0200)]
Fix invalid read in ft_split
This read happens when the last word ends with the terminating '\0'.
After advancing the index to the '\0' it is then increased once more,
into memory that is not accessible. This is fixed by simply skipping the
++i after a word has been processed.
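The loop shape can be sketched as follows. This is not the ft_split
source; count_words_sketch is a hypothetical word counter that only
illustrates where the extra increment must be avoided.

```c
#include <assert.h>
#include <stddef.h>

/* Counts the words in s that are separated by sep. */
size_t  count_words_sketch(const char *s, char sep)
{
    size_t  i;
    size_t  count;

    assert(s != NULL);
    i = 0;
    count = 0;
    while (s[i] != '\0')
    {
        if (s[i] == sep)
        {
            ++i;    /* step over a single separator */
            continue ;
        }
        ++count;
        while (s[i] != '\0' && s[i] != sep)
            ++i;
        /* No ++i here: i may already sit on the terminating '\0',
         * and one more step would make the loop test read past it. */
    }
    return (count);
}
```

An unconditional ++i at the end of the loop body would be harmless after
a separator but would step over the terminator after the last word.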
This file is then readable by ft_parsing_table_load.
Some lazy checks are written to check that something was written to the
file, but if only a partial write happens, those will not notice.
The previous commit solved some weird things around the add_first
function. These were created because of my misunderstanding of the
process but also (almost) prevented infinite recursion.
The infinite recursion occurs when a search for the first tokens of a
nonterminal encounters the original token, because it then searches for
it again and is able to find the token again along the search.
The solution is to make the algorithm log which nonterminals it has
already gone through, so that it does not try to find their first tokens
during the search for their first tokens.
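A simplified sketch of such a guarded search. The grammar layout, the
names t_rule_sketch, t_grammar_sketch and add_first_sketch, and the
SYMBOL_COUNT limit are all hypothetical; the sketch also assumes that no
rule has an empty right-hand side, so only the first symbol of each rule
matters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SYMBOL_COUNT 8

/* Hypothetical rule: lhs -> first_rhs ... (the rest is irrelevant here). */
typedef struct s_rule_sketch
{
    int lhs;
    int first_rhs;
}   t_rule_sketch;

/* Symbols below first_nonterminal are terminals, the rest nonterminals. */
typedef struct s_grammar_sketch
{
    const t_rule_sketch *rules;
    size_t              rule_count;
    int                 first_nonterminal;
}   t_grammar_sketch;

/* Marks in first[] every terminal that can start symbol.
 * visited[] logs the nonterminals already expanded, which stops the
 * recursion when the search reaches a nonterminal it came from. */
void    add_first_sketch(const t_grammar_sketch *g, int symbol,
            bool first[SYMBOL_COUNT], bool visited[SYMBOL_COUNT])
{
    assert(symbol >= 0 && symbol < SYMBOL_COUNT);
    if (symbol < g->first_nonterminal)
    {
        first[symbol] = true;
        return ;
    }
    if (visited[symbol])
        return ;
    visited[symbol] = true;
    for (size_t i = 0; i < g->rule_count; ++i)
        if (g->rules[i].lhs == symbol)
            add_first_sketch(g, g->rules[i].first_rhs, first, visited);
}
```

Without the visited check, a left-recursive rule such as E -> E + T
would call the function on E forever.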
Removing the zeroth rule before the translation makes all the items
point one rule ahead. So instead the rule number is decreased by one and
the zeroth rule reduce is converted to accept.
The initialization was caused by an inappropriate use of
ft_vec_insert_range. I was not sure whether it fills the range with a
single element or whether it inserts a range, and I misread the source
code.
The translation function accessed t_generator_state * instead of the
correct t_generator_state**.
The gotos vector in the table rows is indexed from 0 but uses the tokens
after the eof_token hence the index needs to be offset.
When add_predictions inside fill_closure has to enlarge the closure to
add a new item, the original item pointer points to freed memory.
This is solved by making a copy.
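The hazard and the fix can be sketched like this. The types
t_item_sketch and t_items_sketch and both functions are hypothetical and
much simpler than the generator code; the sketch only shows that the
source item is copied by value before the storage may be reallocated.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct s_item_sketch
{
    int rule;
    int dot;
}   t_item_sketch;

typedef struct s_items_sketch
{
    t_item_sketch   *data;
    size_t          size;
    size_t          capacity;
}   t_items_sketch;

/* Appends item, growing the storage when full. Returns 0 on success. */
int items_push_sketch(t_items_sketch *vec, t_item_sketch item)
{
    t_item_sketch   *grown;
    size_t          new_capacity;

    if (vec->size == vec->capacity)
    {
        new_capacity = vec->capacity ? vec->capacity * 2 : 1;
        grown = realloc(vec->data, new_capacity * sizeof(*grown));
        if (grown == NULL)
            return (1);
        vec->data = grown;      /* old pointers into data are now invalid */
        vec->capacity = new_capacity;
    }
    vec->data[vec->size++] = item;
    return (0);
}

/* Appends an item derived from the one at index. The source item is
 * copied first, so the realloc above cannot invalidate it. */
int add_prediction_sketch(t_items_sketch *vec, size_t index)
{
    t_item_sketch   copy;

    assert(index < vec->size);
    copy = vec->data[index];    /* a copy, not a pointer into vec->data */
    copy.rule += 1;
    copy.dot = 0;
    return (items_push_sketch(vec, copy));
}
```

Holding a t_item_sketch * into vec->data across the push would be the
bug described above.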
Because goto_tokens used tokens from the table instead of their copies,
and these were freed when the generator states were cleared, the table
tokens then contained pointers to freed memory that were later used.
There are two ways to fix this: either do not free the tokens in
goto_tokens, or append copies. I opted for the latter because every part
of the generator code uses copies too. It may be rewritten to use the
table tokens later on to save memory.
Change the way tokens are compared for zeroth rule
The zeroth rule contains a token whose type is NULL. Because the
function cmp_token_type is used when comparing rules (items), it has to
be able to handle this NULL.
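One possible shape of a NULL-aware comparison, as a sketch only. The
struct t_token_sketch, its field names and the function
cmp_token_type_sketch are hypothetical, and sorting a NULL type before
every string is an arbitrary choice made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token with a type name and its matched text. */
typedef struct s_token_sketch
{
    const char  *type;
    const char  *str;
}   t_token_sketch;

/* Compares tokens by type. A NULL type equals only another NULL type
 * and is ordered before any non-NULL type. */
int cmp_token_type_sketch(const t_token_sketch *a, const t_token_sketch *b)
{
    assert(a != NULL && b != NULL);
    if (a->type == NULL && b->type == NULL)
        return (0);
    if (a->type == NULL)
        return (-1);
    if (b->type == NULL)
        return (1);
    return (strcmp(a->type, b->type));
}
```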
The supporting changes consist of changing the ft_token_dup function to
take a pointer to a token as input. Taking a global static token caused
some error, which was solved by passing a pointer.
Functions that free structs now handle NULL gracefully. This was
encountered because not all used structures are initialized yet.
The zeroth rule is added to generate a table that returns the original
first token as a root of the table. The type of NULL cannot be achieved
any other way inside the table.rules.
This commit mainly concerns itself with the construction of the closure
table - the closures and the "first" tokens.
The first tokens could be precalculated, but I did not want to do that
as it would require either a function with some static variable (which
is meh to clean up) or I would have to create another structure and pass
it everywhere. It can be added later on.
This function could have been named ft_vec_insert_unique,
but that would invoke the need for an insertion index, which
I wanted to avoid because the vector is used as a set.
Lukas Jiriste [Fri, 28 Jun 2024 11:51:06 +0000 (13:51 +0200)]
Add some set functions defined on t_vec
The functions are defined on t_vec instead of implementing t_set
because the t_set struct would not be very useful. The lack of
structure is easy to emulate with t_vec. The only advantage would be
uniqueness of elements.
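A small sketch of set-style insertion on top of a plain growable array.
The type t_ivec_sketch and both functions are hypothetical and work on
int only, unlike the generic t_vec; they show that uniqueness needs
nothing more than a membership check before appending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef struct s_ivec_sketch
{
    int     *data;
    size_t  size;
    size_t  capacity;
}   t_ivec_sketch;

/* Linear search; fine for the small sets a vector can emulate. */
bool    ivec_contains_sketch(const t_ivec_sketch *vec, int value)
{
    for (size_t i = 0; i < vec->size; ++i)
        if (vec->data[i] == value)
            return (true);
    return (false);
}

/* Appends value only when it is not present yet, so no insertion
 * index is needed. Returns 0 on success (including "already there")
 * and 1 when the allocation fails. */
int ivec_set_insert_sketch(t_ivec_sketch *vec, int value)
{
    int     *grown;
    size_t  new_capacity;

    assert(vec != NULL);
    if (ivec_contains_sketch(vec, value))
        return (0);
    if (vec->size == vec->capacity)
    {
        new_capacity = vec->capacity ? vec->capacity * 2 : 4;
        grown = realloc(vec->data, new_capacity * sizeof(*grown));
        if (grown == NULL)
            return (1);
        vec->data = grown;
        vec->capacity = new_capacity;
    }
    vec->data[vec->size++] = value;
    return (0);
}
```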
Lukas Jiriste [Fri, 21 Jun 2024 09:49:45 +0000 (11:49 +0200)]
Make ft_parse inputs const, move token_free
After playing a little with the functions I thought it stupid to have
all tokens have their own memory for types that repeat so much.
The token_free function I implemented frees both members of a token,
which leads to multiple frees when reusing a string for the type.
This is why I think it will be better to hide token_free again
and let the user decide what to allocate and what to free.
For this to be possible I have to guarantee the tokens vector is not
changed inside the ft_parse function, so I've rewritten it a little
to use const.
Lukas Jiriste [Fri, 21 Jun 2024 08:57:50 +0000 (10:57 +0200)]
Fix bug, where NULLs appear inside tree
This bug was caused because the tree nodes are filled from the end
when reducing. Instead of just inserting to the start it was inserting
at the final position.
When inserting into a vector outside of its range, all the nonexistent
entries are initialized to 0. If another insertion happens after that,
the 0 entries are not filled but moved to make space for the new entry.
Lukas Jiriste [Fri, 21 Jun 2024 07:43:38 +0000 (09:43 +0200)]
Fix the code to actually produce a tree
As the previous commit was only checked to compile, there were some
minor defects. Notably the parser stack was not initialized properly.
It has to return state_num 0 when empty, which was solved by inserting
a dummy state 0 with no node attached.
Lukas Jiriste [Thu, 20 Jun 2024 15:54:43 +0000 (17:54 +0200)]
Implement the ft_parse function
I did not yet try whether it actually functions (it probably does not).
Because managing ft_parse_inner.h was harder than I imagined
and I wanted to focus on the ft_parse function, I returned
almost everything to ft_parse.h.