th-libs: annotate th_regex.c @ 705:dee28d507da7
Plug a memory leak.
author | Matti Hamalainen <ccr@tnsp.org> |
---|---|
date | Mon, 27 Apr 2020 00:11:28 +0300 |
parents | 7493d4c9ff77 |
children | c91902120e79 |
rev | line source |
---|---|
605 (566e6ef41f9d, Matti Hamalainen <ccr@tnsp.org>: "Initial commit of the highly experimental and unfinished regular expression") | 1 /* |
605 | 2 * Simple regular expression matching functionality |
605 | 3 * Programmed and designed by Matti 'ccr' Hamalainen |
605 | 4 * (C) Copyright 2020 Tecnic Software productions (TNSP) |
605 | 5 * |
605 | 6 * Please read file 'COPYING' for information on license and distribution. |
605 | 7 */ |
605 | 8 #include "th_regex.h" |
605 | 9 |
605 | 10 |
635 (d191ded8a790, parent 614, Matti Hamalainen <ccr@tnsp.org>: "Improve the experimental regex matching debugging macros.") | 11 #ifdef TH_EXPERIMENTAL_REGEX_DEBUG |
651 | 12 th_ioctx *th_dbg_fh = NULL; |
647 | 13 |
651 | 14 # define DBG_RE_PRINT(...) do { \ |
651 | 15 if (th_dbg_fh != NULL) \ |
647 | 16 { \ |
651 | 17 th_regex_dump_indent(th_dbg_fh, level); \ |
651 | 18 thfprintf(th_dbg_fh, __VA_ARGS__); \ |
647 | 19 } \ |
647 | 20 } while (0) |
645 (b897995101b7, parent 643, Matti Hamalainen <ccr@tnsp.org>: "More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.") | 21 #else |
651 | 22 # define DBG_RE_PRINT(...) |
645 | 23 #endif |
645 | 24 |
645 | 25 |
655 | 26 /// @cond |
645 | 27 enum |
645 | 28 { |
645 | 29 TH_RE_MATCH_ONCE, |
645 | 30 TH_RE_MATCH_COUNT, |
645 | 31 TH_RE_MATCH_ANCHOR_START, |
645 | 32 TH_RE_MATCH_ANCHOR_END, |
645 | 33 }; |
645 | 34 |
645 | 35 |
645 | 36 enum |
645 | 37 { |
645 | 38 TH_RE_TYPE_CHAR, |
645 | 39 TH_RE_TYPE_STR, |
645 | 40 TH_RE_TYPE_ANY_CHAR, |
645 | 41 TH_RE_TYPE_LIST, |
645 | 42 TH_RE_TYPE_LIST_REVERSE, |
645 | 43 TH_RE_TYPE_SUBEXPR, |
645 | 44 }; |
645 | 45 |
640 (9e1f9e1d1487, parent 639, Matti Hamalainen <ccr@tnsp.org>: "Aaand some more work. Still just a broken concept.") | 46 |
640 | 47 static const char *re_match_modes[] = |
640 | 48 { |
640 | 49 "ONCE", |
643 | 50 "COUNT", |
640 | 51 "ANCHOR START", |
640 | 52 "ANCHOR END", |
640 | 53 }; |
640 | 54 |
645 | 55 |
640 | 56 static const char *re_match_types[] = |
640 | 57 { |
640 | 58 "CHAR", |
645 | 59 "STR", |
640 | 60 "ANY", |
640 | 61 "LIST", |
640 | 62 "LIST REVERSE", |
640 | 63 "SUBEXPR", |
640 | 64 }; |
640 | 65 |
605 | 66 |
640 | 67 typedef struct |
640 | 68 { |
640 | 69 int type; |
664 (c5aa9ada1051, parent 657, Matti Hamalainen <ccr@tnsp.org>: "s/th_regex_char_t/th_char_t/g") | 70 th_char_t start, end; |
640 | 71 |
640 | 72 size_t nchars; |
664 | 73 th_char_t *chars; |
640 | 74 } th_regex_list_item_t; |
640 | 75 |
640 | 76 |
640 | 77 typedef struct |
640 | 78 { |
640 | 79 size_t nitems, itemssize; |
640 | 80 th_regex_list_item_t *items; |
640 | 81 } th_regex_list_t; |
640 | 82 |
640 | 83 |
640 | 84 typedef struct |
640 | 85 { |
640 | 86 int mode, type; |
640 | 87 ssize_t repeatMin, repeatMax; |
640 | 88 |
640 | 89 struct { |
664 | 90 th_char_t chr; |
664 | 91 th_char_t *str; |
640 | 92 th_regex_list_t list; |
645 | 93 |
645 | 94 th_regex_t *expr; |
640 | 95 } match; |
640 | 96 } th_regex_node_t; |
640 | 97 |
640 | 98 |
640 | 99 typedef struct |
605 | 100 { |
664 | 101 const th_char_t *pattern; |
640 | 102 size_t offs; |
605 | 103 |
640 | 104 th_regex_t *data; |
640 | 105 |
640 | 106 size_t nstack, stacksize; |
640 | 107 th_regex_t **stack; |
645 | 108 |
664 | 109 th_char_t *buf; |
645 | 110 size_t bufSize, bufPos; |
645 | 111 |
640 | 112 } th_regex_parse_ctx_t; |
605 | 113 |
605 | 114 |
655 | 115 struct th_regex_t |
655 | 116 { |
655 | 117 size_t nnodes, nodessize; |
655 | 118 th_regex_node_t *nodes; |
655 | 119 }; |
655 | 120 |
655 | 121 /// @endcond |
655 | 122 |
655 | 123 |
640 | 124 static void th_regex_node_init(th_regex_node_t *node) |
605 | 125 { |
640 | 126 memset(node, 0, sizeof(th_regex_node_t)); |
605 | 127 node->mode = TH_RE_MATCH_ONCE; |
605 | 128 } |
605 | 129 |
605 | 130 |
664 | 131 static int th_regex_strndup(th_char_t **pdst, |
664 | 132 const th_char_t *src, const size_t len) |
605 | 133 { |
605 | 134 if (pdst == NULL) |
605 | 135 return THERR_NULLPTR; |
605 | 136 |
664 | 137 if (UINTPTR_MAX / sizeof(th_char_t) < len + 1) |
605 | 138 return THERR_BOUNDS; |
605 | 139 |
664 | 140 if ((*pdst = (th_char_t *) |
664 | 141 th_malloc((len + 1) * sizeof(th_char_t))) == NULL) |
605 | 142 return THERR_MALLOC; |
605 | 143 |
664 | 144 memcpy(*pdst, src, len * sizeof(th_char_t)); |
605 | 145 (*pdst)[len] = 0; |
605 | 146 |
605 | 147 return THERR_OK; |
605 | 148 } |
605 | 149 |
605 | 150 |
640 | 151 static int th_regex_parse_ctx_get_prev_node( |
640 | 152 th_regex_parse_ctx_t *ctx, th_regex_node_t **pnode) |
605 | 153 { |
605 | 154 if (ctx->data != NULL && ctx->data->nnodes > 0) |
605 | 155 { |
605 | 156 *pnode = &ctx->data->nodes[ctx->data->nnodes - 1]; |
605 | 157 return THERR_OK; |
605 | 158 } |
605 | 159 else |
605 | 160 return THERR_INVALID_DATA; |
605 | 161 } |
605 | 162 |
605 | 163 |
640 | 164 static int th_regex_parse_ctx_push(th_regex_parse_ctx_t *ctx) |
605 | 165 { |
605 | 166 if (ctx->stack == NULL || ctx->nstack + 1 >= ctx->stacksize) |
605 | 167 { |
605 | 168 ctx->stacksize += 16; |
605 | 169 |
605 | 170 if ((ctx->stack = th_realloc(ctx->stack, |
640 | 171 ctx->stacksize * sizeof(th_regex_node_t *))) == NULL) |
605 | 172 return THERR_MALLOC; |
605 | 173 } |
605 | 174 |
605 | 175 ctx->stack[ctx->nstack] = ctx->data; |
605 | 176 ctx->nstack++; |
605 | 177 ctx->data = NULL; |
605 | 178 |
605 | 179 return THERR_OK; |
605 | 180 } |
605 | 181 |
605 | 182 |
640 | 183 static int th_regex_parse_ctx_pop(th_regex_parse_ctx_t *ctx, th_regex_t **data) |
605 | 184 { |
605 | 185 if (ctx->nstack > 0) |
605 | 186 { |
605 | 187 *data = ctx->data; |
605 | 188 ctx->nstack--; |
605 | 189 ctx->data = ctx->stack[ctx->nstack]; |
605 | 190 return THERR_OK; |
605 | 191 } |
605 | 192 else |
605 | 193 return THERR_INVALID_DATA; |
605 | 194 } |
605 | 195 |
605 | 196 |
640 | 197 static int th_regex_parse_ctx_node_commit(th_regex_parse_ctx_t *ctx, th_regex_node_t *node) |
605 | 198 { |
705 | 199 th_regex_t *data; |
605 | 200 |
705 | 201 if (ctx->data == NULL) |
605 | 202 { |
640 | 203 if ((data = ctx->data = th_malloc0(sizeof(th_regex_t))) == NULL) |
605 | 204 return THERR_MALLOC; |
605 | 205 } |
705 | 206 else |
705 | 207 data = ctx->data; |
605 | 208 |
605 | 209 if (data->nodes == NULL || data->nnodes + 1 >= data->nodessize) |
605 | 210 { |
605 | 211 data->nodessize += 16; |
605 | 212 if ((data->nodes = th_realloc(data->nodes, |
640 | 213 data->nodessize * sizeof(th_regex_node_t))) == NULL) |
605 | 214 return THERR_MALLOC; |
605 | 215 } |
605 | 216 |
640 | 217 memcpy(&data->nodes[data->nnodes], node, sizeof(th_regex_node_t)); |
605 | 218 data->nnodes++; |
605 | 219 |
605 | 220 return THERR_OK; |
605 | 221 } |
605 | 222 |
605 | 223 |
// Scan str from offset start for the character delim. Returns TRUE
// with *offs at the position of delim, or FALSE with *offs at the
// terminating NUL if delim was not found.
static BOOL th_regex_find_next(const th_char_t *str,
    const size_t start, size_t *offs,
    const th_char_t delim)
{
    for (*offs = start; str[*offs] != 0; (*offs)++)
    {
        if (str[*offs] == delim)
            return TRUE;
    }
    return FALSE;
}


// Parse an optionally negative decimal integer from str into *value.
// Returns FALSE if any non-digit character is encountered.
static BOOL th_regex_parse_ssize_t(const th_char_t *str,
    ssize_t *value)
{
    th_char_t ch;
    BOOL neg;

    // Is the value negative?
    if (*str == '-')
    {
        str++;
        neg = TRUE;
    }
    else
        neg = FALSE;

    // Start from zero so that the result does not depend on
    // whatever the caller left in *value
    *value = 0;

    // Accumulate the decimal digits
    while ((ch = *str++))
    {
        if (ch >= '0' && ch <= '9')
        {
            *value *= 10;
            *value += ch - '0';
        }
        else
            return FALSE;
    }

    if (neg)
        *value = -(*value);

    return TRUE;
}


static void th_regex_list_item_init(th_regex_list_item_t *item)
{
    memset(item, 0, sizeof(th_regex_list_item_t));
}


static int th_regex_list_add_item(th_regex_list_t *list, th_regex_list_item_t *item)
{
    // Grow the item array in steps of 16 entries when it is
    // unallocated or nearly full
    if (list->items == NULL || list->nitems + 1 >= list->itemssize)
    {
        list->itemssize += 16;

        if ((list->items = th_realloc(list->items,
            list->itemssize * sizeof(th_regex_list_item_t))) == NULL)
            return THERR_MALLOC;
    }

    // Shallow copy: the list takes over the item's chars pointer
    memcpy(&list->items[list->nitems], item, sizeof(th_regex_list_item_t));
    list->nitems++;

    return THERR_OK;
}


static void th_regex_list_free(th_regex_list_t *list)
{
    if (list != NULL)
    {
        for (size_t n = 0; n < list->nitems; n++)
        {
            th_free(list->items[n].chars);
        }
        th_free(list->items);
    }
}


static int th_regex_parse_list(const th_char_t *str,
    const size_t slen, th_regex_list_t *list)
{
    th_char_t *tmp = NULL;
    th_regex_list_item_t item;
    int res;

    // Work on a private copy, as consumed characters get zeroed out
    if ((res = th_regex_strndup(&tmp, str, slen)) != THERR_OK)
        goto out;

    // Handle ranges like [A-Z]
    for (size_t offs = 0; offs < slen; offs++)
    {
        th_char_t
            *prev = (offs > 0) ? tmp + offs - 1 : NULL,
            *curr = tmp + offs,
            *next = (offs + 1 < slen) ? tmp + offs + 1 : NULL;

        if (*curr == '-')
        {
            if (prev != NULL && next != NULL)
            {
                // Range
                th_regex_list_item_init(&item);
                item.type = 1;
                item.start = *prev;
                item.end = *next;

                // Reject empty or reversed ranges
                if (item.start >= item.end)
                {
                    res = THERR_INVALID_DATA;
                    goto out;
                }

                // Mark the three range characters as consumed
                *curr = *prev = *next = 0;

                if ((res = th_regex_list_add_item(list, &item)) != THERR_OK)
                    goto out;
            }
            else
            if (next != NULL)
            {
                // A leading '-' followed by more characters is an error;
                // a trailing '-' falls through and stays a literal
                res = THERR_INVALID_DATA;
                goto out;
            }
        }
    }

    // Count number of remaining characters
    th_regex_list_item_init(&item);
    item.type = 0;
    item.nchars = 0;

    for (size_t offs = 0; offs < slen; offs++)
    {
        th_char_t curr = tmp[offs];
        if (curr != 0)
            item.nchars++;
    }

    if (item.nchars > 0)
    {
        if ((item.chars = th_malloc(sizeof(th_char_t) * item.nchars)) == NULL)
        {
            res = THERR_MALLOC;
            goto out;
        }

        // Collect the remaining literal characters into one item
        for (size_t offs = 0, n = 0; offs < slen; offs++)
        {
            th_char_t curr = tmp[offs];
            if (curr != 0)
            {
                item.chars[n] = curr;
                n++;
            }
        }

        if ((res = th_regex_list_add_item(list, &item)) != THERR_OK)
        {
            th_free(item.chars);
            goto out;
        }
    }

out:
    th_free(tmp);
    return res;
}


// Commit the buffer contents as a string node if it holds more than
// one character, otherwise as a single character node.
static int th_regex_parse_ctx_node_commit_strchr_do(th_regex_parse_ctx_t *ctx,
    const th_char_t *buf, const size_t bufLen)
{
    th_regex_node_t node;
    th_regex_node_init(&node);

    if (bufLen > 1)
    {
        int res;
        node.type = TH_RE_TYPE_STR;
        if ((res = th_regex_strndup(&node.match.str, buf, bufLen)) != THERR_OK)
            return res;
    }
    else
    {
        node.type = TH_RE_TYPE_CHAR;
        node.match.chr = buf[0];
    }

    return th_regex_parse_ctx_node_commit(ctx, &node);
}


static int th_regex_parse_ctx_node_commit_strchr(th_regex_parse_ctx_t *ctx, const BOOL split)
{
    int res = THERR_OK;

    if (ctx->bufPos > 0)
    {
        if (ctx->bufPos > 1 && split)
        {
            // Commit everything except the last character first,
            // then the last character as a node of its own
            if ((res = th_regex_parse_ctx_node_commit_strchr_do(ctx, ctx->buf, ctx->bufPos - 1)) != THERR_OK)
                return res;

            res = th_regex_parse_ctx_node_commit_strchr_do(ctx, ctx->buf + ctx->bufPos - 1, 1);
        }
        else
        {
            res = th_regex_parse_ctx_node_commit_strchr_do(ctx, ctx->buf, ctx->bufPos);
        }
        ctx->bufPos = 0;
    }

    return res;
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
442 } |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
443 |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
444 |
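The split behaviour of the function above can be illustrated in isolation. The following is only a minimal sketch under stated assumptions: `sketch_node_t`, `sketch_commit()` and `sketch_flush()` are hypothetical stand-ins invented for this illustration and are not part of th-libs; the real code commits nodes through `th_regex_parse_ctx_node_commit_strchr_do()`, whose internals are not shown here.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for a committed string node: a bounded copy of the text. */
typedef struct
{
    char text[32];
} sketch_node_t;

/* Append buf[0..len) as one node. Returns the new node count,
 * or -1 if the node array is full or the text does not fit. */
int sketch_commit(sketch_node_t *nodes, int count, int max, const char *buf, size_t len)
{
    assert(nodes != NULL && buf != NULL);

    if (count >= max || len >= sizeof(nodes[0].text))
        return -1;

    memcpy(nodes[count].text, buf, len);
    nodes[count].text[len] = 0;
    return count + 1;
}

/* Flush the buffered literal text. With split set and more than one
 * character buffered, the last character becomes a node of its own. */
int sketch_flush(sketch_node_t *nodes, int count, int max, const char *buf, size_t len, int split)
{
    if (len == 0)
        return count;

    if (len > 1 && split)
    {
        count = sketch_commit(nodes, count, max, buf, len - 1);
        if (count < 0)
            return count;

        return sketch_commit(nodes, count, max, buf + len - 1, 1);
    }

    return sketch_commit(nodes, count, max, buf, len);
}
```

Calling `sketch_flush(nodes, 0, 4, "abc", 3, 1)` leaves two nodes, "ab" and "c", so a repeat operator that follows would bind to "c" alone; with `split` set to 0 the same call leaves the single node "abc".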
/**
 * Parse given regular expression @p pattern string into compiled/tokenized
 * form as @c th_regex_t structures. Returns @c THERR_OK if successful,
 * or other @c THERR_* return value if not. In either case, the @p pexpr
 * may have been allocated and must be freed via th_regex_free().
 * @param[in,out] pexpr pointer to a pointer of @c th_regex_t structures to be
 * @param[in] pattern regular expression pattern string
 * @returns @c THERR_* return value indicating success or failure
 */
int th_regex_compile(th_regex_t **pexpr, const th_char_t *pattern)
{
    int res = THERR_OK;
    th_regex_parse_ctx_t ctx;
    th_regex_node_t node, *pnode;
    th_char_t *tmp = NULL;
    size_t start;

    // Check pointers
    if (pexpr == NULL || pattern == NULL)
    {
        res = THERR_NULLPTR;
        goto out;
    }

    // Initialize parsing context
    memset(&ctx, 0, sizeof(ctx));
    ctx.pattern = pattern;
    ctx.bufSize = 256;

    if ((ctx.buf = th_malloc(ctx.bufSize * sizeof(th_char_t))) == NULL)
    {
        res = THERR_MALLOC;
        goto out;
    }

    // Start parsing the pattern
    for (; ctx.pattern[ctx.offs] != 0; ctx.offs++)
    {
        th_char_t cch = ctx.pattern[ctx.offs];

        switch (cch)
        {
            case '?':
            case '*':
            case '+':
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, TRUE)) != THERR_OK)
                    goto out;

                if ((res = th_regex_parse_ctx_get_prev_node(&ctx, &pnode)) != THERR_OK)
                    goto out;

                if (cch == '?')
                {
                    // Previous token is optional (repeat 0-1 times) (non-greedy matching)
                    pnode->mode = TH_RE_MATCH_COUNT;
                    pnode->repeatMin = 0;
                    pnode->repeatMax = 1;
                }
                else
                {
                    // Check if previous was a count ("**", "*+", etc.)
                    if (pnode->mode == TH_RE_MATCH_COUNT)
                    {
                        res = THERR_INVALID_DATA;
                        goto out;
                    }

                    pnode->mode = TH_RE_MATCH_COUNT;

                    if (cch == '*')
                    {
                        // Previous token can repeat 0 or more times
                        pnode->repeatMin = 0;
                        pnode->repeatMax = -1;
                    }
                    else
                    {
                        // Previous token must repeat 1 or more times
                        pnode->repeatMin = 1;
                        pnode->repeatMax = -1;
                    }
                }
                break;

            case '{':
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, TRUE)) != THERR_OK)
                    goto out;

                // {n} | {min,max}
                start = ctx.offs + 1;
                if (!th_regex_find_next(ctx.pattern, start, &ctx.offs, '}'))
                {
                    // End not found
                    res = THERR_INVALID_DATA;
                    goto out;
                }

                // Release the buffer of any earlier {..} and reset the pointer,
                // so that an early "goto out" never sees a dangling value
                th_free(tmp);
                tmp = NULL;

                if ((res = th_regex_parse_ctx_get_prev_node(&ctx, &pnode)) != THERR_OK ||
                    (res = th_regex_strndup(&tmp, ctx.pattern + start,
                    ctx.offs - start)) != THERR_OK)
                    goto out;

                pnode->mode = TH_RE_MATCH_COUNT;

                if (th_regex_find_next(tmp, 0, &start, ','))
                {
                    // {min,max}: terminate at the comma and parse both halves
                    tmp[start] = 0;
                    if (!th_regex_parse_ssize_t(tmp, &pnode->repeatMin) ||
                        !th_regex_parse_ssize_t(tmp + start + 1, &pnode->repeatMax))
                    {
                        res = THERR_INVALID_DATA;
                        goto out;
                    }
                }
                else
                {
                    // {n}: exact repeat count
                    if (!th_regex_parse_ssize_t(tmp, &pnode->repeatMin))
                    {
                        res = THERR_INVALID_DATA;
                        goto out;
                    }
                    pnode->repeatMax = pnode->repeatMin;
                }

                if (pnode->repeatMin < 0 || pnode->repeatMax < 1 ||
                    pnode->repeatMax < pnode->repeatMin)
                {
                    // Invalid repeat counts
                    res = THERR_INVALID_DATA;
                    goto out;
                }
                break;

/*
            case '|':
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK)
                    goto out;

                // Alt pattern .. how to handle these?
                break;
*/

            case '(':
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK)
                    goto out;

                // Start of subpattern
                if ((res = th_regex_parse_ctx_push(&ctx)) != THERR_OK)
                    goto out;
                break;

            case ')':
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK)
                    goto out;

                // End of subpattern
                th_regex_node_init(&node);
                node.type = TH_RE_TYPE_SUBEXPR;

                if ((res = th_regex_parse_ctx_pop(&ctx, &node.match.expr)) != THERR_OK ||
                    (res = th_regex_parse_ctx_node_commit(&ctx, &node)) != THERR_OK)
                    goto out;
                break;

            case '^':
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK)
                    goto out;

                // Start of line anchor
                th_regex_node_init(&node);
                node.mode = TH_RE_MATCH_ANCHOR_START;

                if ((res = th_regex_parse_ctx_node_commit(&ctx, &node)) != THERR_OK)
                    goto out;
                break;

            case '$':
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK)
                    goto out;

                // End of line anchor
                th_regex_node_init(&node);
                node.mode = TH_RE_MATCH_ANCHOR_END;

                if ((res = th_regex_parse_ctx_node_commit(&ctx, &node)) != THERR_OK)
                    goto out;
                break;

            case '[':
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK)
                    goto out;

                // Start of char list
                start = ctx.offs + 1;
                if (!th_regex_find_next(ctx.pattern, start, &ctx.offs, ']') ||
                    ctx.offs == start)
                {
                    res = THERR_INVALID_DATA;
                    goto out;
                }

                th_regex_node_init(&node);
                if (ctx.pattern[start] == '^')
                {
                    node.type = TH_RE_TYPE_LIST_REVERSE;
                    start++;
                }
                else
                    node.type = TH_RE_TYPE_LIST;

                if ((res = th_regex_parse_list(ctx.pattern + start,
                    ctx.offs - start, &node.match.list)) != THERR_OK ||
                    (res = th_regex_parse_ctx_node_commit(&ctx, &node)) != THERR_OK)
                    goto out;
                break;

663 case '.': |
645
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
664 if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK) |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
665 goto out; |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
666 |
605
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
667 // Any single character matches |
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
668 th_regex_node_init(&node); |
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
669 node.type = TH_RE_TYPE_ANY_CHAR; |
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
670 |
640
9e1f9e1d1487
Aaand some more work. Still just a broken concept.
Matti Hamalainen <ccr@tnsp.org>
parents:
639
diff
changeset
|
671 if ((res = th_regex_parse_ctx_node_commit(&ctx, &node)) != THERR_OK) |
645
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
672 goto out; |
605
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
673 break; |
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
674 |
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
675 case '\\': |
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
676 // Literal escape |
611
d895b0fd6ad6
Combine code from th_regex_compile() to th_regex_compile_do().
Matti Hamalainen <ccr@tnsp.org>
parents:
610
diff
changeset
|
677 ctx.offs++; |
d895b0fd6ad6
Combine code from th_regex_compile() to th_regex_compile_do().
Matti Hamalainen <ccr@tnsp.org>
parents:
610
diff
changeset
|
678 if (ctx.pattern[ctx.offs] == 0) |
605
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
679 { |
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
680 // End of pattern, error |
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
681 res = THERR_INVALID_DATA; |
645
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
682 goto out; |
605
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
683 } |
640
9e1f9e1d1487
Aaand some more work. Still just a broken concept.
Matti Hamalainen <ccr@tnsp.org>
parents:
639
diff
changeset
|
684 // fall-through |
605
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
685 |
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
686 default: |
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
687 // Given character must match |
645
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
688 if (ctx.bufPos < ctx.bufSize) |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
689 ctx.buf[ctx.bufPos++] = ctx.pattern[ctx.offs]; |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
690 else |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
691 if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK) |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
692 goto out; |
605
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
693 break; |
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
694 } |
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
695 } |
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
696 |
645
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
697 // Commit last string/char if any |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
698 if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK) |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
699 goto out; |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
700 |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
701 // Create root node |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
702 th_regex_node_init(&node); |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
703 node.type = TH_RE_TYPE_SUBEXPR; |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
704 node.match.expr = ctx.data; |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
705 ctx.data = NULL; |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
706 |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
707 if ((res = th_regex_parse_ctx_node_commit(&ctx, &node)) != THERR_OK) |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
708 goto out; |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
709 |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
710 out: |
611
d895b0fd6ad6
Combine code from th_regex_compile() to th_regex_compile_do().
Matti Hamalainen <ccr@tnsp.org>
parents:
610
diff
changeset
|
711 *pexpr = ctx.data; |
645
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
712 |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
713 // Free temporary buffers |
605
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
714 th_free(tmp); |
645
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
715 th_free(ctx.buf); |
605
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
716 return res; |
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
717 } |
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
718 |
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
719 |
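The character list branch above first locates the closing `]`, rejects an empty list, and then checks for a leading `^` before handing the remaining span to the list parser. The following is a minimal standalone sketch of just that delimiting step. All `demo_*` names are hypothetical and not part of th-libs, and the helper `demo_find_next()` only assumes what `th_regex_find_next()` might do (a plain forward search); the real function may behave differently, for example with respect to escapes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: find the next 'delim' at or after 'start'.
 * On success, stores its index in *offs and returns true. */
static bool demo_find_next(const char *pattern, const size_t start,
    size_t *offs, const char delim)
{
    assert(pattern != NULL && offs != NULL);

    for (size_t pos = start; pattern[pos] != 0; pos++)
    {
        if (pattern[pos] == delim)
        {
            *offs = pos;
            return true;
        }
    }
    return false;
}

/* Hypothetical result type: the span of list contents inside "[...]". */
typedef struct
{
    bool negate;    /* true for "[^...]" */
    size_t start;   /* index of first list character */
    size_t len;     /* number of list characters */
} demo_class_t;

/* 'open' must be the index of '['. On success, fills in 'cls' and
 * sets *end to the index of the closing ']'. */
static bool demo_delimit_class(const char *pattern, const size_t open,
    demo_class_t *cls, size_t *end)
{
    size_t start = open + 1;

    assert(pattern[open] == '[');

    // Unterminated list, or empty "[]"
    if (!demo_find_next(pattern, start, end, ']') || *end == start)
        return false;

    cls->negate = (pattern[start] == '^');
    if (cls->negate)
        start++;

    cls->start = start;
    cls->len = *end - start;
    return true;
}
```

With the pattern `"x[^a-z]y"` and `open` set to 1, this sketch reports a negated list covering the three characters `a-z`, with the closing bracket at index 6.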

/**
 * Deallocate the given regular expression structure @p expr.
 * All associated data will be freed, though pointers may not
 * be NULLed.
 *
 * @param[in] expr structure to be deallocated
 */
void th_regex_free(th_regex_t *expr)
{
    if (expr != NULL)
    {
        for (size_t nnode = 0; nnode < expr->nnodes; nnode++)
        {
            th_regex_node_t *node = &expr->nodes[nnode];

            th_free(node->match.str);
            th_regex_free(node->match.expr);
            th_regex_list_free(&node->match.list);
        }

        th_free(expr->nodes);
    }
}

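The deallocation above walks every node and recurses into nested subexpressions. A minimal sketch of the same recursive pattern, using only the standard C library, could look like the following. The `demo_*` types and functions are hypothetical stand-ins rather than th-libs API. Two things are choices made for this example only: the sketch releases the container struct itself, and it returns a node count so the traversal can be checked.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct demo_expr_t demo_expr_t;

/* Hypothetical node: may own a string and/or a nested expression. */
typedef struct
{
    char *str;           /* owned string, or NULL */
    demo_expr_t *expr;   /* owned subexpression, or NULL */
} demo_node_t;

struct demo_expr_t
{
    size_t nnodes;
    demo_node_t *nodes;
};

/* Duplicate a string with malloc(); returns NULL on failure. */
static char *demo_strdup(const char *src)
{
    assert(src != NULL);

    const size_t len = strlen(src) + 1;
    char *dst = malloc(len);
    if (dst != NULL)
        memcpy(dst, src, len);
    return dst;
}

/* Allocate an expression with 'nnodes' empty nodes (nnodes > 0). */
static demo_expr_t *demo_expr_new(const size_t nnodes)
{
    assert(nnodes > 0);

    demo_expr_t *expr = malloc(sizeof(*expr));
    if (expr == NULL)
        return NULL;

    expr->nodes = malloc(nnodes * sizeof(demo_node_t));
    if (expr->nodes == NULL)
    {
        free(expr);
        return NULL;
    }

    expr->nnodes = nnodes;
    for (size_t n = 0; n < nnodes; n++)
    {
        expr->nodes[n].str = NULL;
        expr->nodes[n].expr = NULL;
    }
    return expr;
}

/* Free a whole expression tree. Returns the number of nodes released. */
static size_t demo_expr_free(demo_expr_t *expr)
{
    size_t count = 0;

    if (expr != NULL)
    {
        for (size_t n = 0; n < expr->nnodes; n++)
        {
            demo_node_t *node = &expr->nodes[n];

            free(node->str);    // free(NULL) is a no-op
            count += 1 + demo_expr_free(node->expr);
        }

        free(expr->nodes);
        free(expr);
    }
    return count;
}
```

Because `free(NULL)` does nothing and the recursive call checks for `NULL` itself, the loop body needs no per-field checks.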

static void th_regex_dump_indent(th_ioctx *fh, const int level)
{
    for (int indent = 0; indent < level; indent++)
        thfputs(" ", fh);
}


static void th_regex_dump_node(th_ioctx *fh, const th_regex_node_t *node)
{
    thfprintf(fh,
        "%s %s ",
        re_match_modes[node->mode],
        re_match_types[node->type]);

    if (node->mode == TH_RE_MATCH_COUNT)
    {
        thfprintf(fh, "min=%" PRId_SSIZE_T ", max=%" PRId_SSIZE_T " : ",
            node->repeatMin, node->repeatMax);
    }

    switch (node->type)
    {
        case TH_RE_TYPE_CHAR:
            thfprintf(fh, "'%c'", node->match.chr);
            break;

        case TH_RE_TYPE_STR:
            thfprintf(fh, "\"%s\"", node->match.str);
            break;

        case TH_RE_TYPE_ANY_CHAR:
            thfprintf(fh, ".");
            break;

        case TH_RE_TYPE_LIST:
        case TH_RE_TYPE_LIST_REVERSE:
            thfputs("[ ", fh);
            for (size_t n = 0; n < node->match.list.nitems; n++)
            {
                const th_regex_list_item_t *li = &node->match.list.items[n];
                if (li->type)
                {
                    thfprintf(fh, "'%c-%c' ", li->start, li->end);
                }
                else
                {
                    for (size_t i = 0; i < li->nchars; i++)
                        thfprintf(fh, "'%c' ", li->chars[i]);
                }
            }
            thfputs("]", fh);
            break;
    }
}


/**
 * Print out the contents of given regular expression structure @p expr
 * in "human-readable" format to specified @c th_ioctx context. Typically
 * useful for debugging purposes only.
 *
 * @param[in,out] fh th_ioctx handle to be used for output, must be writable.
 * @param[in] level starting whitespace indentation level
 * @param[in] expr regular expression structure to be "dumped"
 */
void th_regex_dump(th_ioctx *fh, const int level, const th_regex_t *expr)
{
    if (expr != NULL)
    {
        for (size_t nnode = 0; nnode < expr->nnodes; nnode++)
        {
            th_regex_node_t *node = &expr->nodes[nnode];

            th_regex_dump_indent(fh, level);

            thfprintf(fh,
                "[%" PRIu_SIZE_T "/%" PRIu_SIZE_T "] ",
                nnode + 1, expr->nnodes);

            th_regex_dump_node(fh, node);
            thfputs("\n", fh);

            if (node->type == TH_RE_TYPE_SUBEXPR)
                th_regex_dump(fh, level + 1, node->match.expr);
        }
    }
}


static BOOL th_regex_match_list(const th_regex_list_t *list, const th_char_t cch)
{
    // Could be optimized, perhaps .. sort match.chars, binary search etc?
    for (size_t nitem = 0; nitem < list->nitems; nitem++)
    {
        const th_regex_list_item_t *item = &list->items[nitem];

        if (item->type == 0)
        {
            for (size_t n = 0; n < item->nchars; n++)
            {
                if (item->chars[n] == cch)
                    return TRUE;
            }
        }
        else
        {
            if (cch >= item->start && cch <= item->end)
                return TRUE;
        }
    }

    return FALSE;
}

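The list matcher above distinguishes two item kinds: explicit character sets and inclusive ranges. Negation for `[^...]` is applied by the caller. The following standalone sketch shows the same idea with standard C types. The `demo_*` names and the struct layout are assumptions made for illustration and do not mirror the actual `th_regex_list_t` definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list item: either an inclusive range or a set of chars. */
typedef struct
{
    bool is_range;       /* true: start..end, false: chars[0..nchars-1] */
    char start, end;     /* inclusive range bounds */
    const char *chars;   /* explicit characters */
    size_t nchars;
} demo_item_t;

typedef struct
{
    const demo_item_t *items;
    size_t nitems;
} demo_list_t;

/* Return true if 'ch' matches any item of the list (linear scan). */
static bool demo_match_list(const demo_list_t *list, const char ch)
{
    assert(list != NULL);

    for (size_t i = 0; i < list->nitems; i++)
    {
        const demo_item_t *item = &list->items[i];

        if (item->is_range)
        {
            if (ch >= item->start && ch <= item->end)
                return true;
        }
        else
        {
            for (size_t n = 0; n < item->nchars; n++)
            {
                if (item->chars[n] == ch)
                    return true;
            }
        }
    }
    return false;
}

/* Test 'ch' against the class [a-f_.], or [^a-f_.] when 'negate' is set. */
static bool demo_in_class(const char ch, const bool negate)
{
    static const demo_item_t items[] =
    {
        { true, 'a', 'f', NULL, 0 },
        { false, 0, 0, "_.", 2 },
    };
    const demo_list_t list = { items, sizeof(items) / sizeof(items[0]) };
    const bool res = demo_match_list(&list, ch);

    // A negated class simply inverts the plain list result
    return negate ? !res : res;
}
```

Keeping negation outside the scan means one list representation serves both the plain and the reversed node type.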

static BOOL th_regex_match_expr(
    const th_char_t *haystack,
    size_t *offs,
    const th_regex_t *expr,
    const size_t startnode,
    const int flags,
    const int level
    );


static BOOL th_regex_match_one(
    const th_char_t *haystack,
    size_t *offs,
    const th_regex_node_t *node,
    const int flags,
    const int level
    )
{
    th_char_t cch;
    BOOL res = FALSE;

    switch (node->type)
    {
        case TH_RE_TYPE_SUBEXPR:
            res = th_regex_match_expr(haystack, offs, node->match.expr, 0, flags, level + 1);
            break;

        case TH_RE_TYPE_LIST:
        case TH_RE_TYPE_LIST_REVERSE:
            if ((cch = haystack[*offs]) == 0)
                res = FALSE;
            else
            {
                res = th_regex_match_list(&node->match.list, cch);

                if (node->type == TH_RE_TYPE_LIST_REVERSE)
                    res = !res;

                (*offs)++;
            }
            break;

        case TH_RE_TYPE_ANY_CHAR:
            if ((cch = haystack[*offs]) == 0)
                res = FALSE;
            else
            {
                res = TRUE;
                (*offs)++;
            }
            break;

        case TH_RE_TYPE_CHAR:
            if ((cch = haystack[*offs]) == 0)
                res = FALSE;
            else
            {
                res = (cch == node->match.chr);
                (*offs)++;
            }
            break;

        case TH_RE_TYPE_STR:
            res = TRUE;
            for (th_char_t *str = node->match.str;
                res && *str != 0;
                str++, (*offs)++)
            {
                if (haystack[*offs] != *str)
                    res = FALSE;
            }
            break;
    }

    return res;
}

static BOOL th_regex_match_count(
    const th_char_t *haystack,
    size_t *offs,
    const th_regex_t *expr,
    const th_regex_node_t *node,
    size_t *nnode,
    const int flags,
    const int level
    )
{
    size_t toffs = *offs, last_offs = *offs;
    ssize_t count = 0;

    do
    {
        // Attempt to match the repeated node once
        size_t poffs = toffs;
        if (th_regex_match_one(haystack, &poffs, node, flags, level))
        {
            // Matched, increase count of repeats
            count++;
            //DBG_RE_PRINT("#%" PRId_SSIZE_T "\n", count);

            // poffs should now be at position + 1 from match
        }
        else
        {
            // Did not match, get out if repeatMin > 0
            if (node->repeatMin > 0)
                break;
        }

        // Attempt to match rest of the expression
        size_t qoffs1 = poffs, qoffs2 = toffs;
        DBG_RE_PRINT("try rest '%s' :: '%s'\n", haystack + qoffs1, haystack + qoffs2);
        if (th_regex_match_expr(haystack, &qoffs1, expr, *nnode + 1, flags, level + 1))
        {
            // Matched
            toffs = last_offs = qoffs1;

            DBG_RE_PRINT(" yes1: count=%" PRId_SSIZE_T " [%" PRId_SSIZE_T " .. %" PRId_SSIZE_T "]\n", count, node->repeatMin, node->repeatMax);

            // Check min repeats and if we are "not greedy".
            if (count >= node->repeatMin && node->repeatMax == 1)
                break;

            // Check max repeats
            if (node->repeatMax > 0 && count >= node->repeatMax)
                break;
        }
        else
        if (node->repeatMin == 0 &&
            th_regex_match_expr(haystack, &qoffs2, expr, *nnode + 1, flags, level + 1))
        {
            // Matched
            toffs = last_offs = qoffs2;

            DBG_RE_PRINT(" yes2: count=%" PRId_SSIZE_T " [%" PRId_SSIZE_T " .. %" PRId_SSIZE_T "]\n", count, node->repeatMin, node->repeatMax);

            // Check min repeats and if we are "not greedy".
            if (count >= node->repeatMin && node->repeatMax == 1)
                break;

            // Check max repeats
            if (node->repeatMax > 0 && count >= node->repeatMax)
                break;

        }
        else
        {
            // Rest of expression did not match, try again
            DBG_RE_PRINT(" no\n");
            toffs = poffs;
        }


    } while (haystack[toffs] != 0);

    // Check results
    BOOL res = count >= node->repeatMin ||
        (node->repeatMax > 0 && count >= node->repeatMax);

    if (res)
    {
        *offs = last_offs;
        *nnode = expr->nnodes;
    }

    DBG_RE_PRINT("RESULT: %s : offs=%" PRIu_SIZE_T "='%s'\n",
        res ? "YES" : "NO",
        *offs, haystack + *offs);

    return res;
}


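/*
 * Illustration only: a minimal, self-contained sketch of the idea used by
 * th_regex_match_count() above, i.e. consume one more repeat, try the rest
 * of the pattern, and remember the last offset at which the rest matched.
 * It assumes a much simpler setting than the real code: the repeated item
 * is a single character and the "rest" is a plain literal string. All
 * names prefixed with sketch_ are hypothetical and not part of th-libs.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

// Hypothetical helper: does 'text' begin with the literal 'rest'?
static bool sketch_match_rest(const char *text, const char *rest)
{
    return strncmp(text, rest, strlen(rest)) == 0;
}

// Match zero or more 'c' followed by the literal 'rest' at the start
// of 'text'. Returns the length of the longest such match, or -1.
static long sketch_match_repeat(const char *text, char c, const char *rest)
{
    long last = -1;

    for (size_t offs = 0; ; offs++)
    {
        // Try the rest of the pattern at the current offset and
        // remember the offset where it ended, if it matched
        if (sketch_match_rest(text + offs, rest))
            last = (long) (offs + strlen(rest));

        // Stop when the repeated character can not be consumed anymore
        if (text[offs] == 0 || text[offs] != c)
            break;
    }

    return last;
}
```

/*
 * For example, sketch_match_repeat("aaab", 'a', "b") walks over the three
 * 'a' characters before the literal "b" finally matches at offset 3.
 */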
static BOOL th_regex_match_expr(
    const th_char_t *haystack,
    size_t *offs,
    const th_regex_t *expr,
    const size_t startnode,
    const int flags,
    const int level
    )
{
    BOOL res = TRUE;
    size_t soffs = *offs;

    for (size_t nnode = startnode; res && nnode < expr->nnodes; nnode++)
    {
        const th_regex_node_t *node = &expr->nodes[nnode];

#ifdef TH_EXPERIMENTAL_REGEX_DEBUG
        if (th_dbg_fh != NULL)
        {
            th_regex_dump_indent(th_dbg_fh, level);

            thfprintf(th_dbg_fh,
                "[%" PRIu_SIZE_T "/%" PRIu_SIZE_T "] ",
                nnode + 1, expr->nnodes);

            th_regex_dump_node(th_dbg_fh, node);

            thfprintf(th_dbg_fh, " <-> \"%s\"\n",
                haystack + soffs);
        }
#endif

        switch (node->mode)
        {
            case TH_RE_MATCH_ONCE:
                res = th_regex_match_one(haystack, &soffs, node, flags, level);
                break;

            case TH_RE_MATCH_COUNT:
                res = th_regex_match_count(haystack, &soffs, expr, node, &nnode, flags, level);
                break;

            case TH_RE_MATCH_ANCHOR_START:
                res = (soffs == 0);
                break;

            case TH_RE_MATCH_ANCHOR_END:
                res = (haystack[soffs] == 0);
                break;
        }
    }

    if (res)
        *offs = soffs;

    return res;
}


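/*
 * Illustration only: a reduced sketch of the node dispatch loop in
 * th_regex_match_expr() above, assuming just three node kinds (a single
 * character and the two anchors). The caller's offset is only updated
 * when the whole sequence matches, as in the function above. The sketch_
 * and SK_ names are hypothetical and not part of th-libs.
 */

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum
{
    SK_CHAR,
    SK_ANCHOR_START,
    SK_ANCHOR_END
} sketch_mode_t;

typedef struct
{
    sketch_mode_t mode;
    char chr;           // Used only by SK_CHAR nodes
} sketch_node_t;

// Match 'nodes' in sequence against 'haystack' starting at '*offs'.
// On success '*offs' is set to the end of the match, on failure
// it is left untouched.
static bool sketch_match_seq(const char *haystack, size_t *offs,
    const sketch_node_t *nodes, const size_t nnodes)
{
    bool res = true;
    size_t soffs = *offs;

    for (size_t n = 0; res && n < nnodes; n++)
    {
        switch (nodes[n].mode)
        {
            case SK_CHAR:
                // Never match the terminating NUL of the haystack
                res = haystack[soffs] != 0 && haystack[soffs] == nodes[n].chr;
                if (res)
                    soffs++;
                break;

            case SK_ANCHOR_START:
                res = (soffs == 0);
                break;

            case SK_ANCHOR_END:
                res = (haystack[soffs] == 0);
                break;
        }
    }

    if (res)
        *offs = soffs;

    return res;
}
```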
/**
 * Match the specified string @p haystack against the specified compiled
 * regular expression @p expr and return results in optional variables:
 * @p pnmatches for the number of matches and/or @p pmatches for a list of
 * @c th_regex_match_t structures describing the matching sequences.
 * If @p pmatches is used, the resulting linked list should eventually be
 * freed via th_regex_free_matches().
 *
 * @param[in] expr regular expression structure to be matched
 * @param[in] haystack string to be matched against
 * @param[out] pnmatches pointer to variable to be set to number of found matches, or @c NULL if the information is not desired
 * @param[out] pmatches pointer to a pointer of @c th_regex_match_t structures, or @c NULL if the information is not desired
 * @param[in] maxmatches maximum number of matches until bailing out, or @c 0 if no limit
 * @param[in] flags additional flags, see @c TH_REF_*
 */
int th_regex_match(const th_regex_t *expr, const th_char_t *haystack,
    size_t *pnmatches, th_regex_match_t **pmatches, const size_t maxmatches,
    const int flags)
{
    size_t nmatches = 0;
    int level = 0;
    (void) flags;

    if (pnmatches != NULL)
        *pnmatches = 0;

    // Check given pattern and string
    if (expr == NULL || haystack == NULL)
        return THERR_NULLPTR;

    // Start matching
    // XXX NOTE .. lots to think about and to take into account:
    // - anchored and unanchored expressions
    // - how to check if the expression has consumed all possibilities?
    // ..
    for (size_t soffs = 0; haystack[soffs] != 0; )
    {
        size_t coffs = soffs;

        if (th_regex_match_expr(haystack, &coffs, expr, 0, flags, level))
        {
            // A match was found, increase count
            nmatches++;

            // Deliver to caller if required
            if (pnmatches != NULL)
                *pnmatches = nmatches;

            if (pmatches != NULL)
            {
                // Add the match region to the list
                th_regex_match_t *match = th_malloc0(sizeof(th_regex_match_t));
                if (match == NULL)
                    return THERR_MALLOC;

                match->type = TH_RE_MATCH_EXPR;
                match->start = soffs;
                match->len = coffs - soffs;

                th_llist_append_node((th_llist_t **) pmatches, (th_llist_t *) match);
            }

            // Check match count limit, if set
            if (maxmatches > 0 && nmatches >= maxmatches)
                break;

            // If offset was not advanced, increase by one
            // otherwise use end of match offset as new start
            if (soffs == coffs)
                soffs++;
            else
                soffs = coffs;
        }
        else
        {
            soffs++;
        }
    }

    return THERR_OK;
}


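/*
 * Illustration only: a self-contained sketch of the scanning loop in
 * th_regex_match() above, assuming the "expression" is just a literal
 * needle. It shows the same offset handling: after a match continue from
 * the end of the match, but always advance by at least one so that an
 * empty match can not stall the loop. sketch_count_matches() is a
 * hypothetical name and not part of th-libs.
 */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Count non-overlapping occurrences of 'needle' in 'haystack'.
// Stops after 'maxmatches' matches, or never if 'maxmatches' is 0.
static size_t sketch_count_matches(const char *haystack,
    const char *needle, const size_t maxmatches)
{
    const size_t nlen = strlen(needle);
    size_t nmatches = 0;

    for (size_t soffs = 0; haystack[soffs] != 0; )
    {
        if (strncmp(haystack + soffs, needle, nlen) == 0)
        {
            // A match was found, increase count
            nmatches++;

            // Check match count limit, if set
            if (maxmatches > 0 && nmatches >= maxmatches)
                break;

            // If the match was empty, advance by one,
            // otherwise continue from the end of the match
            soffs += (nlen > 0) ? nlen : 1;
        }
        else
            soffs++;
    }

    return nmatches;
}
```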
static void th_regex_free_match(th_regex_match_t *node)
{
    (void) node;
    // Nothing to do here at the moment
}


/**
 * Deallocate the given set of @c th_regex_match_t
 * linked list structures pointed to by @p matches.
 * All associated data will be freed.
 *
 * @param[in] matches structure to be deallocated
 */
void th_regex_free_matches(th_regex_match_t *matches)
{
    th_llist_free_func_node((th_llist_t *) matches,
        (void (*)(th_llist_t *)) th_regex_free_match);
}
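
/*
 * Illustration only: a self-contained sketch of the ownership pattern
 * described above, where matches are appended to a caller-owned linked
 * list and later released in one call. It uses a plain singly linked
 * list and the C standard allocator as stand-ins; it does not reproduce
 * the th_llist_* API, and all sketch_ names are hypothetical.
 */

```c
#include <assert.h>
#include <stdlib.h>

typedef struct sketch_match_t
{
    struct sketch_match_t *next;
    size_t start, len;
} sketch_match_t;

// Append a match region to the end of '*list'.
// Returns 0 on success, -1 if the allocation failed.
static int sketch_match_append(sketch_match_t **list,
    const size_t start, const size_t len)
{
    sketch_match_t *node = calloc(1, sizeof(*node));
    if (node == NULL)
        return -1;

    node->start = start;
    node->len = len;

    // Walk to the tail link and attach the new node there
    while (*list != NULL)
        list = &(*list)->next;

    *list = node;
    return 0;
}

// Free every node of the list, returns the number of nodes freed.
static size_t sketch_match_free_all(sketch_match_t *list)
{
    size_t nfreed = 0;

    while (list != NULL)
    {
        sketch_match_t *next = list->next;
        free(list);
        list = next;
        nfreed++;
    }

    return nfreed;
}
```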