annotate th_regex.c @ 674:dfabc7eef3dd
Add new functions th_split_string() and th_join_string().
author | Matti Hamalainen <ccr@tnsp.org> |
---|---|
date | Tue, 25 Feb 2020 05:16:42 +0200 |
parents | 7493d4c9ff77 |
children | dee28d507da7 |
/*
 * Simple regular expression matching functionality
 * Programmed and designed by Matti 'ccr' Hamalainen
 * (C) Copyright 2020 Tecnic Software productions (TNSP)
 *
 * Please read file 'COPYING' for information on license and distribution.
 */
#include "th_regex.h"


#ifdef TH_EXPERIMENTAL_REGEX_DEBUG
th_ioctx *th_dbg_fh = NULL;

/* Note: DBG_RE_PRINT() expects a variable named 'level' to be in scope
 * at the call site, and prints nothing while th_dbg_fh is NULL.
 */
#  define DBG_RE_PRINT(...) do { \
        if (th_dbg_fh != NULL) \
        { \
            th_regex_dump_indent(th_dbg_fh, level); \
            thfprintf(th_dbg_fh, __VA_ARGS__); \
        } \
    } while (0)
#else
#  define DBG_RE_PRINT(...)
#endif


/// @cond
enum
{
    TH_RE_MATCH_ONCE,
    TH_RE_MATCH_COUNT,
    TH_RE_MATCH_ANCHOR_START,
    TH_RE_MATCH_ANCHOR_END,
};


enum
{
    TH_RE_TYPE_CHAR,
    TH_RE_TYPE_STR,
    TH_RE_TYPE_ANY_CHAR,
    TH_RE_TYPE_LIST,
    TH_RE_TYPE_LIST_REVERSE,
    TH_RE_TYPE_SUBEXPR,
};


static const char *re_match_modes[] =
{
    "ONCE",
    "COUNT",
    "ANCHOR START",
    "ANCHOR END",
};


static const char *re_match_types[] =
{
    "CHAR",
    "STR",
    "ANY",
    "LIST",
    "LIST REVERSE",
    "SUBEXPR",
};


typedef struct
{
    int type;
    th_char_t start, end;

    size_t nchars;
    th_char_t *chars;
} th_regex_list_item_t;


typedef struct
{
    size_t nitems, itemssize;
    th_regex_list_item_t *items;
} th_regex_list_t;


typedef struct
{
    int mode, type;
    ssize_t repeatMin, repeatMax;

    struct {
        th_char_t chr;
        th_char_t *str;
        th_regex_list_t list;

        th_regex_t *expr;
    } match;
} th_regex_node_t;


typedef struct
{
    const th_char_t *pattern;
    size_t offs;

    th_regex_t *data;

    size_t nstack, stacksize;
    th_regex_t **stack;

    th_char_t *buf;
    size_t bufSize, bufPos;

} th_regex_parse_ctx_t;


struct th_regex_t
{
    size_t nnodes, nodessize;
    th_regex_node_t *nodes;
};

/// @endcond


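The name tables `re_match_modes` and `re_match_types` list their entries in the same order as the `TH_RE_MATCH_*` and `TH_RE_TYPE_*` enums, which suggests they are indexed by those values when printing debug output. The code that does so is not part of this excerpt, so the following is only a minimal standalone sketch of such a lookup, not the th-libs implementation. All `demo_*` and `DEMO_*` names are hypothetical, as is the trailing count entry used for the bounds check.

```c
#include <stddef.h>

/* Hypothetical node types, in the same order as the TH_RE_TYPE_* enum */
typedef enum
{
    DEMO_TYPE_CHAR,
    DEMO_TYPE_STR,
    DEMO_TYPE_ANY_CHAR,
    DEMO_TYPE_LIST,
    DEMO_TYPE_LIST_REVERSE,
    DEMO_TYPE_SUBEXPR,
    DEMO_NTYPES /* number of types, must stay last */
} demo_type_t;

/* Printable names, one per enum value and in the same order */
static const char *demo_type_names[DEMO_NTYPES] =
{
    "CHAR",
    "STR",
    "ANY",
    "LIST",
    "LIST REVERSE",
    "SUBEXPR",
};

/* Return the printable name of a type, or "?" if the value is out of range */
static const char *demo_type_name(const int type)
{
    if (type < 0 || type >= DEMO_NTYPES)
        return "?";

    return demo_type_names[type];
}
```

Sizing the table with the count entry means a name forgotten after adding a new enum value shows up as a NULL slot rather than as an out-of-bounds read.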
static void th_regex_node_init(th_regex_node_t *node)
{
    memset(node, 0, sizeof(th_regex_node_t));
    node->mode = TH_RE_MATCH_ONCE;
}


/* Duplicate exactly 'len' characters of 'src' into a newly allocated,
 * zero-terminated buffer stored in '*pdst'.
 */
static int th_regex_strndup(th_char_t **pdst,
    const th_char_t *src, const size_t len)
{
    if (pdst == NULL)
        return THERR_NULLPTR;

    // Guard against overflow of the allocation size computed below
    if (UINTPTR_MAX / sizeof(th_char_t) < len + 1)
        return THERR_BOUNDS;

    if ((*pdst = (th_char_t *)
        th_malloc((len + 1) * sizeof(th_char_t))) == NULL)
        return THERR_MALLOC;

    memcpy(*pdst, src, len * sizeof(th_char_t));
    (*pdst)[len] = 0;

    return THERR_OK;
}


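As a rough illustration of the length-bounded duplication in `th_regex_strndup()`, here is a standalone sketch that uses only the standard C library. It is not the th-libs code: `demo_char_t`, `demo_strndup()` and the `DEMO_*` status codes are hypothetical stand-ins for `th_char_t`, `th_regex_strndup()` and the `THERR_*` values, and the bounds check is written against `SIZE_MAX` so that it does not itself depend on `len + 1` fitting in a `size_t`.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes standing in for the THERR_* values */
enum { DEMO_OK = 0, DEMO_NULLPTR, DEMO_BOUNDS, DEMO_MALLOC };

/* Hypothetical character type standing in for th_char_t */
typedef char demo_char_t;

/* Copy exactly 'len' characters from 'src' into a new zero-terminated buffer */
static int demo_strndup(demo_char_t **pdst, const demo_char_t *src, const size_t len)
{
    if (pdst == NULL || src == NULL)
        return DEMO_NULLPTR;

    /* Reject lengths for which (len + 1) * sizeof(demo_char_t) would wrap */
    if (len >= SIZE_MAX / sizeof(demo_char_t))
        return DEMO_BOUNDS;

    if ((*pdst = malloc((len + 1) * sizeof(demo_char_t))) == NULL)
        return DEMO_MALLOC;

    memcpy(*pdst, src, len * sizeof(demo_char_t));
    (*pdst)[len] = 0;

    return DEMO_OK;
}
```

Like the function it mirrors, the sketch copies `len` characters unconditionally and does not stop at a terminator inside `src`, so the caller has to make sure that `src` really holds at least `len` characters. The result must be released with `free()`.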
/* Get a pointer to the most recently committed node of the expression
 * that is currently being parsed, if there is one.
 */
static int th_regex_parse_ctx_get_prev_node(
    th_regex_parse_ctx_t *ctx, th_regex_node_t **pnode)
{
    if (ctx->data != NULL && ctx->data->nnodes > 0)
    {
        *pnode = &ctx->data->nodes[ctx->data->nnodes - 1];
        return THERR_OK;
    }
    else
        return THERR_INVALID_DATA;
}


/* Save the expression currently being built on the stack and
 * start with an empty one.
 */
static int th_regex_parse_ctx_push(th_regex_parse_ctx_t *ctx)
{
    if (ctx->stack == NULL || ctx->nstack + 1 >= ctx->stacksize)
    {
        ctx->stacksize += 16;

        // The stack holds th_regex_t pointers, so size the elements accordingly
        if ((ctx->stack = th_realloc(ctx->stack,
            ctx->stacksize * sizeof(th_regex_t *))) == NULL)
            return THERR_MALLOC;
    }

    ctx->stack[ctx->nstack] = ctx->data;
    ctx->nstack++;
    ctx->data = NULL;

    return THERR_OK;
}


/* Hand the current expression to the caller via '*data' and
 * restore the one most recently saved by the push function.
 */
static int th_regex_parse_ctx_pop(th_regex_parse_ctx_t *ctx, th_regex_t **data)
{
    if (ctx->nstack > 0)
    {
        *data = ctx->data;
        ctx->nstack--;
        ctx->data = ctx->stack[ctx->nstack];
        return THERR_OK;
    }
    else
        return THERR_INVALID_DATA;
}


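The push and pop functions keep a stack of expression pointers that grows in steps of 16 entries. A generic version of that pattern can be sketched with the standard C library as shown here. This is a simplified illustration rather than the th-libs code: `demo_stack_t`, `demo_stack_push()` and `demo_stack_pop()` are hypothetical names, the stack stores plain `void *` items, and the result of `realloc()` is first placed in a temporary so that the old buffer stays reachable if the allocation fails (whether `th_realloc()` needs the same care is not visible from this file).

```c
#include <stdlib.h>

/* Hypothetical growable stack of pointers */
typedef struct
{
    size_t nitems, size;
    void **items;
} demo_stack_t;

/* Push an item, growing the storage by 16 slots when it is full.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int demo_stack_push(demo_stack_t *st, void *item)
{
    if (st->nitems >= st->size)
    {
        const size_t nsize = st->size + 16;
        void **tmp = realloc(st->items, nsize * sizeof(void *));
        if (tmp == NULL)
            return -1; /* old buffer and size are left untouched */

        st->items = tmp;
        st->size = nsize;
    }

    st->items[st->nitems++] = item;
    return 0;
}

/* Pop the most recently pushed item into '*item'.
 * Returns 0 on success, -1 if the stack is empty.
 */
static int demo_stack_pop(demo_stack_t *st, void **item)
{
    if (st->nitems == 0)
        return -1;

    *item = st->items[--st->nitems];
    return 0;
}
```

A stack that starts out as `{ 0, 0, NULL }` works without a separate init step, because `realloc()` with a NULL pointer behaves like `malloc()`. The owner releases the storage with `free(st.items)` when done.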
/* Append a copy of 'node' to the expression currently being built,
 * allocating the expression itself on first use.
 */
static int th_regex_parse_ctx_node_commit(th_regex_parse_ctx_t *ctx, th_regex_node_t *node)
{
    th_regex_t *data = ctx->data;

    if (data == NULL)
    {
        if ((data = ctx->data = th_malloc0(sizeof(th_regex_t))) == NULL)
            return THERR_MALLOC;
    }

    // Grow the node array in steps of 16 nodes
    if (data->nodes == NULL || data->nnodes + 1 >= data->nodessize)
    {
        data->nodessize += 16;
        if ((data->nodes = th_realloc(data->nodes,
            data->nodessize * sizeof(th_regex_node_t))) == NULL)
            return THERR_MALLOC;
    }

    memcpy(&data->nodes[data->nnodes], node, sizeof(th_regex_node_t));
    data->nnodes++;

    return THERR_OK;
}


664 | 222 static BOOL th_regex_find_next(const th_char_t *str, |
605 | 223 const size_t start, size_t *offs, |
664 | 224 const th_char_t delim) |
605 | 225 { |
605 | 226 for (*offs = start; str[*offs] != 0; (*offs)++) |
605 | 227 { |
605 | 228 if (str[*offs] == delim) |
605 | 229 return TRUE; |
605 | 230 } |
605 | 231 return FALSE; |
605 | 232 } |
605 | 233 |
605 | 234 |
664 | 235 static BOOL th_regex_parse_ssize_t(const th_char_t *str, |
605 | 236 ssize_t *value) |
605 | 237 { |
664 | 238 th_char_t ch; |
605 | 239 BOOL neg; |
605 | 240 |
605 | 241 if (*str == '-') |
605 | 242 { |
605 | 243 str++; |
605 | 244 neg = TRUE; |
605 | 245 } |
605 | 246 else |
605 | 247 neg = FALSE; |
605 | 248 |
605 | 249 // Parse the digits (the sign, if any, was consumed above) |
605 | 250 while ((ch = *str++)) |
605 | 251 { |
605 | 252 if (ch >= '0' && ch <= '9') |
605 | 253 { |
605 | 254 *value *= 10; |
605 | 255 *value += ch - '0'; |
605 | 256 } |
605 | 257 else |
605 | 258 return FALSE; |
605 | 259 } |
605 | 260 |
605 | 261 if (neg) |
605 | 262 *value = -(*value); |
605 | 263 |
605 | 264 return TRUE; |
605 | 265 } |
605 | 266 |
605 | 267 |
640 | 268 static void th_regex_list_item_init(th_regex_list_item_t *item) |
639 | 269 { |
640 | 270 memset(item, 0, sizeof(th_regex_list_item_t)); |
639 | 271 } |
639 | 272 |
639 | 273 |
640 | 274 static int th_regex_list_add_item(th_regex_list_t *list, th_regex_list_item_t *item) |
639 | 275 { |
639 | 276 if (list->items == NULL || list->nitems + 1 >= list->itemssize) |
639 | 277 { |
639 | 278 list->itemssize += 16; |
639 | 279 |
639 | 280 if ((list->items = th_realloc(list->items, |
640 | 281 list->itemssize * sizeof(th_regex_list_item_t))) == NULL) |
639 | 282 return THERR_MALLOC; |
639 | 283 } |
639 | 284 |
640 | 285 memcpy(&list->items[list->nitems], item, sizeof(th_regex_list_item_t)); |
639 | 286 list->nitems++; |
639 | 287 |
639 | 288 return THERR_OK; |
639 | 289 } |
639 | 290 |
639 | 291 |
640 | 292 static void th_regex_list_free(th_regex_list_t *list) |
639 | 293 { |
639 | 294 if (list != NULL) |
639 | 295 { |
639 | 296 for (size_t n = 0; n < list->nitems; n++) |
639 | 297 { |
639 | 298 th_free(list->items[n].chars); |
639 | 299 } |
639 | 300 th_free(list->items); |
639 | 301 } |
639 | 302 } |
639 | 303 |
639 | 304 |
664 | 305 static int th_regex_parse_list(const th_char_t *str, |
640 | 306 const size_t slen, th_regex_list_t *list) |
639 | 307 { |
664 | 308 th_char_t *tmp = NULL; |
640 | 309 th_regex_list_item_t item; |
639 | 310 int res; |
639 | 311 |
639 | 312 if ((res = th_regex_strndup(&tmp, str, slen)) != THERR_OK) |
639 | 313 goto out; |
639 | 314 |
639 | 315 // Handle ranges like [A-Z] |
639 | 316 for (size_t offs = 0; offs < slen; offs++) |
639 | 317 { |
664 | 318 th_char_t |
639 | 319 *prev = (offs > 0) ? tmp + offs - 1 : NULL, |
639 | 320 *curr = tmp + offs, |
639 | 321 *next = (offs + 1 < slen) ? tmp + offs + 1 : NULL; |
639 | 322 |
639 | 323 if (*curr == '-') |
639 | 324 { |
639 | 325 if (prev != NULL && next != NULL) |
639 | 326 { |
639 | 327 // Range |
639 | 328 th_regex_list_item_init(&item); |
639 | 329 item.type = 1; |
639 | 330 item.start = *prev; |
639 | 331 item.end = *next; |
639 | 332 |
645 | 333 if (item.start >= item.end) |
639 | 334 { |
639 | 335 res = THERR_INVALID_DATA; |
639 | 336 goto out; |
639 | 337 } |
639 | 338 |
639 | 339 *curr = *prev = *next = 0; |
639 | 340 |
639 | 341 if ((res = th_regex_list_add_item(list, &item)) != THERR_OK) |
639 | 342 goto out; |
639 | 343 } |
639 | 344 else |
639 | 345 if (next != NULL) |
639 | 346 { |
639 | 347 res = THERR_INVALID_DATA; |
639 | 348 goto out; |
639 | 349 } |
639 | 350 } |
639 | 351 } |
639 | 352 |
639 | 353 // Count number of remaining characters |
639 | 354 th_regex_list_item_init(&item); |
639 | 355 item.type = 0; |
639 | 356 item.nchars = 0; |
639 | 357 |
639 | 358 for (size_t offs = 0; offs < slen; offs++) |
639 | 359 { |
664 | 360 th_char_t curr = tmp[offs]; |
639 | 361 if (curr != 0) |
639 | 362 item.nchars++; |
639 | 363 } |
639 | 364 |
639 | 365 if (item.nchars > 0) |
639 | 366 { |
664 | 367 if ((item.chars = th_malloc(sizeof(th_char_t) * item.nchars)) == NULL) |
639 | 368 { |
639 | 369 res = THERR_MALLOC; |
639 | 370 goto out; |
639 | 371 } |
639 | 372 |
639 | 373 for (size_t offs = 0, n = 0; offs < slen; offs++) |
639 | 374 { |
664 | 375 th_char_t curr = tmp[offs]; |
639 | 376 if (curr != 0) |
639 | 377 { |
639 | 378 item.chars[n] = curr; |
639 | 379 n++; |
639 | 380 } |
639 | 381 } |
639 | 382 |
639 | 383 if ((res = th_regex_list_add_item(list, &item)) != THERR_OK) |
639 | 384 { |
639 | 385 th_free(item.chars); |
639 | 386 goto out; |
639 | 387 } |
639 | 388 } |
639 | 389 |
639 | 390 out: |
639 | 391 th_free(tmp); |
639 | 392 return res; |
639 | 393 } |
639 | 394 |
639 | 395 |
645 | 396 static int th_regex_parse_ctx_node_commit_strchr_do(th_regex_parse_ctx_t *ctx, |
664 | 397 const th_char_t *buf, const size_t bufLen) |
645 | 398 { |
645 | 399 th_regex_node_t node; |
645 | 400 th_regex_node_init(&node); |
645 | 401 |
645 | 402 if (bufLen > 1) |
645 | 403 { |
645 | 404 int res; |
645 | 405 node.type = TH_RE_TYPE_STR; |
645 | 406 if ((res = th_regex_strndup(&node.match.str, buf, bufLen)) != THERR_OK) |
645 | 407 return res; |
645 | 408 } |
645 | 409 else |
645 | 410 { |
645 | 411 node.type = TH_RE_TYPE_CHAR; |
645 | 412 node.match.chr = buf[0]; |
645 | 413 } |
645 | 414 |
645 | 415 return th_regex_parse_ctx_node_commit(ctx, &node); |
645 | 416 } |
645 | 417 |
645 | 418 |
645 | 419 static int th_regex_parse_ctx_node_commit_strchr(th_regex_parse_ctx_t *ctx, const BOOL split) |
645 | 420 { |
645 | 421 int res = THERR_OK; |
645 | 422 |
645 | 423 if (ctx->bufPos > 0) |
645 | 424 { |
645 | 425 if (ctx->bufPos > 1 && split) |
645 | 426 { |
645 | 427 if ((res = th_regex_parse_ctx_node_commit_strchr_do(ctx, ctx->buf, ctx->bufPos - 1)) != THERR_OK) |
645 | 428 return res; |
645 | 429 |
645 | 430 res = th_regex_parse_ctx_node_commit_strchr_do(ctx, ctx->buf + ctx->bufPos - 1, 1); |
645 | 431 } |
645 | 432 else |
645 | 433 { |
645 | 434 res = th_regex_parse_ctx_node_commit_strchr_do(ctx, ctx->buf, ctx->bufPos); |
645 | 435 } |
645 | 436 ctx->bufPos = 0; |
645 | 437 } |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
438 |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
439 return res; |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
440 } |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
441 |
b897995101b7
More fiddling and twiddling. Add parsing to string nodes instead of separate character nodes.
Matti Hamalainen <ccr@tnsp.org>
parents:
643
diff
changeset
|
442 |
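The split behaviour of th_regex_parse_ctx_node_commit_strchr() can be illustrated with a minimal standalone sketch. It does not use the th-libs API: demo_node_t, demo_ctx_t, demo_commit_do() and demo_commit() are hypothetical stand-ins, and nodes are reduced to plain text copies. It assumes, judging from the callers that pass TRUE before handling '?', '*', '+' and '{', that the split exists so that a repeat operator applies only to the last buffered character.

```c
#include <stddef.h>
#include <string.h>

enum { DEMO_MAX_NODES = 8, DEMO_MAX_TEXT = 32 };

/* Hypothetical stand-in for a committed node: just a copy of the text. */
typedef struct
{
    char text[DEMO_MAX_TEXT];
} demo_node_t;

/* Hypothetical stand-in for the parsing context. */
typedef struct
{
    char buf[DEMO_MAX_TEXT];            /* pending literal characters */
    size_t bufPos;                      /* number of pending characters */
    demo_node_t nodes[DEMO_MAX_NODES];  /* committed nodes, in order */
    size_t nnodes;                      /* number of committed nodes */
} demo_ctx_t;

/* Append one node holding 'len' characters of 'str'. Returns 0 on success. */
static int demo_commit_do(demo_ctx_t *ctx, const char *str, size_t len)
{
    if (ctx->nnodes >= DEMO_MAX_NODES || len >= DEMO_MAX_TEXT)
        return -1;

    memcpy(ctx->nodes[ctx->nnodes].text, str, len);
    ctx->nodes[ctx->nnodes].text[len] = 0;
    ctx->nnodes++;
    return 0;
}

/* Commit the pending buffer. With 'split' set, the last character
 * becomes its own node, separate from the characters before it. */
static int demo_commit(demo_ctx_t *ctx, int split)
{
    int res = 0;

    if (ctx->bufPos > 0)
    {
        if (ctx->bufPos > 1 && split)
        {
            if ((res = demo_commit_do(ctx, ctx->buf, ctx->bufPos - 1)) != 0)
                return res;

            res = demo_commit_do(ctx, ctx->buf + ctx->bufPos - 1, 1);
        }
        else
        {
            res = demo_commit_do(ctx, ctx->buf, ctx->bufPos);
        }
        ctx->bufPos = 0;
    }

    return res;
}
```

With "abc" pending in a zeroed demo_ctx_t, demo_commit(&ctx, 1) leaves two nodes, "ab" and "c", while demo_commit(&ctx, 0) leaves the single node "abc".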
/**
 * Parse the given regular expression @p pattern string into compiled/tokenized
 * form as @c th_regex_t structures. Returns @c THERR_OK if successful,
 * or another @c THERR_* return value if not. In either case, @p pexpr
 * may have been allocated and must be freed via th_regex_free().
 * @param[in,out] pexpr pointer to a pointer of @c th_regex_t structures to be allocated
 * @param[in] pattern regular expression pattern string
 * @returns @c THERR_* return value indicating success or failure
 */
int th_regex_compile(th_regex_t **pexpr, const th_char_t *pattern)
{
    int res = THERR_OK;
    th_regex_parse_ctx_t ctx;
    th_regex_node_t node, *pnode;
    th_char_t *tmp = NULL;
    size_t start;

    // Check pointers
    if (pexpr == NULL || pattern == NULL)
    {
        res = THERR_NULLPTR;
        goto out;
    }

    // Initialize parsing context
    memset(&ctx, 0, sizeof(ctx));
    ctx.pattern = pattern;
    ctx.bufSize = 256;

    if ((ctx.buf = th_malloc(ctx.bufSize * sizeof(th_char_t))) == NULL)
    {
        res = THERR_MALLOC;
        goto out;
    }

    // Start parsing the pattern
    for (; ctx.pattern[ctx.offs] != 0; ctx.offs++)
    {
        th_char_t cch = ctx.pattern[ctx.offs];

        switch (cch)
        {
            case '?':
            case '*':
            case '+':
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, TRUE)) != THERR_OK)
                    goto out;

                if ((res = th_regex_parse_ctx_get_prev_node(&ctx, &pnode)) != THERR_OK)
                    goto out;

                if (cch == '?')
                {
                    // Previous token is optional (repeat 0-1 times) (non-greedy matching)
                    pnode->mode = TH_RE_MATCH_COUNT;
                    pnode->repeatMin = 0;
                    pnode->repeatMax = 1;
                }
                else
                {
                    // Check if previous was a count ("**", "*+", etc.)
                    if (pnode->mode == TH_RE_MATCH_COUNT)
                    {
                        res = THERR_INVALID_DATA;
                        goto out;
                    }

                    pnode->mode = TH_RE_MATCH_COUNT;

                    if (cch == '*')
                    {
                        // Previous token can repeat 0 or more times
                        pnode->repeatMin = 0;
                        pnode->repeatMax = -1;
                    }
                    else
                    {
                        // Previous token must repeat 1 or more times
                        pnode->repeatMin = 1;
                        pnode->repeatMax = -1;
                    }
                }
                break;

            case '{':
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, TRUE)) != THERR_OK)
                    goto out;

                // {n} | {min,max}
                start = ctx.offs + 1;
                if (!th_regex_find_next(ctx.pattern, start, &ctx.offs, '}'))
                {
                    // End not found
                    res = THERR_INVALID_DATA;
                    goto out;
                }

                th_free(tmp);

                if ((res = th_regex_parse_ctx_get_prev_node(&ctx, &pnode)) != THERR_OK ||
                    (res = th_regex_strndup(&tmp, ctx.pattern + start,
                    ctx.offs - start)) != THERR_OK)
                    goto out;

                pnode->mode = TH_RE_MATCH_COUNT;

                if (th_regex_find_next(tmp, 0, &start, ','))
                {
                    tmp[start] = 0;
                    if (!th_regex_parse_ssize_t(tmp, &pnode->repeatMin) ||
                        !th_regex_parse_ssize_t(tmp + start + 1, &pnode->repeatMax))
                    {
                        res = THERR_INVALID_DATA;
                        goto out;
                    }
                }
                else
                {
                    if (!th_regex_parse_ssize_t(tmp, &pnode->repeatMin))
                    {
                        res = THERR_INVALID_DATA;
                        goto out;
                    }
                    pnode->repeatMax = pnode->repeatMin;
                }

                if (pnode->repeatMin < 0 || pnode->repeatMax < 1 ||
                    pnode->repeatMax < pnode->repeatMin)
                {
                    // Invalid repeat counts
                    res = THERR_INVALID_DATA;
                    goto out;
                }
                break;

/*
            case '|':
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK)
                    goto out;

                // Alt pattern .. how to handle these?
                break;
*/

            case '(':
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK)
                    goto out;

                // Start of subpattern
                if ((res = th_regex_parse_ctx_push(&ctx)) != THERR_OK)
                    goto out;
                break;

            case ')':
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK)
                    goto out;

                // End of subpattern
                th_regex_node_init(&node);
                node.type = TH_RE_TYPE_SUBEXPR;

                if ((res = th_regex_parse_ctx_pop(&ctx, &node.match.expr)) != THERR_OK ||
                    (res = th_regex_parse_ctx_node_commit(&ctx, &node)) != THERR_OK)
                    goto out;
                break;

            case '^':
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK)
                    goto out;

                // Start of line anchor
                th_regex_node_init(&node);
                node.mode = TH_RE_MATCH_ANCHOR_START;

                if ((res = th_regex_parse_ctx_node_commit(&ctx, &node)) != THERR_OK)
                    goto out;
                break;

            case '$':
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK)
                    goto out;

                // End of line anchor
                th_regex_node_init(&node);
                node.mode = TH_RE_MATCH_ANCHOR_END;

                if ((res = th_regex_parse_ctx_node_commit(&ctx, &node)) != THERR_OK)
                    goto out;
                break;

            case '[':
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK)
                    goto out;

                // Start of char list
                start = ctx.offs + 1;
                if (!th_regex_find_next(ctx.pattern, start, &ctx.offs, ']') ||
                    ctx.offs == start)
                {
                    res = THERR_INVALID_DATA;
                    goto out;
                }

                th_regex_node_init(&node);
                if (ctx.pattern[start] == '^')
                {
                    node.type = TH_RE_TYPE_LIST_REVERSE;
                    start++;
                }
                else
                    node.type = TH_RE_TYPE_LIST;

                if ((res = th_regex_parse_list(ctx.pattern + start,
                    ctx.offs - start, &node.match.list)) != THERR_OK ||
                    (res = th_regex_parse_ctx_node_commit(&ctx, &node)) != THERR_OK)
                    goto out;
                break;
566e6ef41f9d
Initial commit of the highly experimental and unfinished regular expression
Matti Hamalainen <ccr@tnsp.org>
parents:
diff
changeset
|
660 |
            case '.':
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK)
                    goto out;

                // Any single character matches
                th_regex_node_init(&node);
                node.type = TH_RE_TYPE_ANY_CHAR;

                if ((res = th_regex_parse_ctx_node_commit(&ctx, &node)) != THERR_OK)
                    goto out;
                break;

            case '\\':
                // Literal escape
                ctx.offs++;
                if (ctx.pattern[ctx.offs] == 0)
                {
                    // End of pattern, error
                    res = THERR_INVALID_DATA;
                    goto out;
                }
                // fall-through

            default:
                // Given character must match
                if (ctx.bufPos < ctx.bufSize)
                    ctx.buf[ctx.bufPos++] = ctx.pattern[ctx.offs];
                else
                if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK)
                    goto out;
                break;
        }
    }

    // Commit last string/char if any
    if ((res = th_regex_parse_ctx_node_commit_strchr(&ctx, FALSE)) != THERR_OK)
        goto out;

    // Create root node
    th_regex_node_init(&node);
    node.type = TH_RE_TYPE_SUBEXPR;
    node.match.expr = ctx.data;
    ctx.data = NULL;

    if ((res = th_regex_parse_ctx_node_commit(&ctx, &node)) != THERR_OK)
        goto out;

out:
    *pexpr = ctx.data;

    // Free temporary buffers
    th_free(tmp);
    th_free(ctx.buf);
    return res;
}


/**
 * Deallocate the given regular expression structure @p expr.
 * All associated data will be freed, though pointers may not
 * be NULLed.
 *
 * @param[in] expr structure to be deallocated
 */
void th_regex_free(th_regex_t *expr)
{
    if (expr != NULL)
    {
        for (size_t nnode = 0; nnode < expr->nnodes; nnode++)
        {
            th_regex_node_t *node = &expr->nodes[nnode];

            th_regex_free(node->match.expr);
            th_regex_list_free(&node->match.list);
        }

        th_free(expr->nodes);
    }
}


static void th_regex_dump_indent(th_ioctx *fh, const int level)
{
    for (int indent = 0; indent < level; indent++)
        thfputs(" ", fh);
}


static void th_regex_dump_node(th_ioctx *fh, const th_regex_node_t *node)
{
    thfprintf(fh,
        "%s %s ",
        re_match_modes[node->mode],
        re_match_types[node->type]);

    if (node->mode == TH_RE_MATCH_COUNT)
    {
        thfprintf(fh, "min=%" PRId_SSIZE_T ", max=%" PRId_SSIZE_T " : ",
            node->repeatMin, node->repeatMax);
    }

    switch (node->type)
    {
        case TH_RE_TYPE_CHAR:
            thfprintf(fh, "'%c'", node->match.chr);
            break;

        case TH_RE_TYPE_STR:
            thfprintf(fh, "\"%s\"", node->match.str);
            break;

        case TH_RE_TYPE_ANY_CHAR:
            thfprintf(fh, ".");
            break;

        case TH_RE_TYPE_LIST:
        case TH_RE_TYPE_LIST_REVERSE:
            thfputs("[ ", fh);
            for (size_t n = 0; n < node->match.list.nitems; n++)
            {
                const th_regex_list_item_t *li = &node->match.list.items[n];
                if (li->type)
                {
                    thfprintf(fh, "'%c-%c' ", li->start, li->end);
                }
                else
                {
                    for (size_t i = 0; i < li->nchars; i++)
                        thfprintf(fh, "'%c' ", li->chars[i]);
                }
            }
            thfputs("]", fh);
            break;
    }
}


/**
 * Print out the contents of given regular expression structure @p expr
 * in "human-readable" format to specified @c th_ioctx context. Typically
 * useful for debugging purposes only.
 *
 * @param[in,out] fh th_ioctx handle to be used for output, must be writable.
 * @param[in] level starting whitespace indentation level
 * @param[in] expr regular expression structure to be "dumped"
 */
void th_regex_dump(th_ioctx *fh, const int level, const th_regex_t *expr)
{
    if (expr != NULL)
    {
        for (size_t nnode = 0; nnode < expr->nnodes; nnode++)
        {
            th_regex_node_t *node = &expr->nodes[nnode];

            th_regex_dump_indent(fh, level);

            thfprintf(fh,
                "[%" PRIu_SIZE_T "/%" PRIu_SIZE_T "] ",
                nnode + 1, expr->nnodes);

            th_regex_dump_node(fh, node);
            thfputs("\n", fh);

            if (node->type == TH_RE_TYPE_SUBEXPR)
                th_regex_dump(fh, level + 1, node->match.expr);
        }
    }
}


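The recursive, level-indented output that th_regex_dump() produces can be illustrated in isolation. The following is only a minimal sketch using plain stdio instead of th_ioctx; the `ex_node_t` type, the `ex_dump()` function and the two-space indent are assumptions made for this example and are not part of th-libs.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical, simplified node: a label plus optional child nodes.
 * This is NOT th_regex_node_t, only a stand-in for illustration. */
typedef struct ex_node_t
{
    const char *label;
    const struct ex_node_t *children;
    size_t nchildren;
} ex_node_t;

/* Print each node on its own line as "[index/total] label",
 * indenting by two spaces (an assumed width) per nesting level. */
static void ex_dump(FILE *fh, const int level,
    const ex_node_t *nodes, const size_t nnodes)
{
    for (size_t n = 0; n < nnodes; n++)
    {
        for (int indent = 0; indent < level; indent++)
            fputs("  ", fh);

        fprintf(fh, "[%zu/%zu] %s\n", n + 1, nnodes, nodes[n].label);

        /* Recurse into children one level deeper, as th_regex_dump()
         * does for sub-expression nodes. */
        if (nodes[n].nchildren > 0)
            ex_dump(fh, level + 1, nodes[n].children, nodes[n].nchildren);
    }
}
```

Calling `ex_dump(stdout, 0, root, nroot)` on a tree prints one line per node, with children nested under their parent by indentation.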
static BOOL th_regex_match_list(const th_regex_list_t *list, const th_char_t cch)
{
    // Could be optimized, perhaps .. sort match.chars, binary search etc?
    for (size_t nitem = 0; nitem < list->nitems; nitem++)
    {
        const th_regex_list_item_t *item = &list->items[nitem];

        if (item->type == 0)
        {
            for (size_t n = 0; n < item->nchars; n++)
            {
                if (item->chars[n] == cch)
                    return TRUE;
            }
        }
        else
        {
            if (cch >= item->start && cch <= item->end)
                return TRUE;
        }
    }

    return FALSE;
}


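The comment in th_regex_match_list() suggests sorting the characters and using a binary search. A minimal sketch of that idea follows, assuming the character set is sorted once (for example when the pattern is compiled). The names `ex_char_t`, `ex_charset_sort()` and `ex_charset_contains()` are hypothetical and do not exist in th-libs, and `ex_char_t` is only a stand-in for `th_char_t`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for th_char_t; the real typedef lives in th-libs. */
typedef char ex_char_t;

/* qsort() comparator ordering characters by their unsigned value. */
static int ex_char_cmp(const void *pa, const void *pb)
{
    const unsigned char a = *(const unsigned char *) pa;
    const unsigned char b = *(const unsigned char *) pb;
    return (a > b) - (a < b);
}

/* Sort the character set once, e.g. at pattern compile time. */
static void ex_charset_sort(ex_char_t *chars, const size_t nchars)
{
    qsort(chars, nchars, sizeof(ex_char_t), ex_char_cmp);
}

/* Binary search over a set previously sorted with ex_charset_sort(). */
static bool ex_charset_contains(const ex_char_t *chars,
    const size_t nchars, const ex_char_t cch)
{
    const unsigned char key = (unsigned char) cch;
    size_t lo = 0, hi = nchars;

    while (lo < hi)
    {
        const size_t mid = lo + (hi - lo) / 2;
        const unsigned char cur = (unsigned char) chars[mid];

        if (cur == key)
            return true;
        else if (cur < key)
            lo = mid + 1;
        else
            hi = mid;
    }

    return false;
}
```

For the short character lists typical of a `[...]` expression, the linear scan is likely just as fast. The sorted variant would only be expected to pay off for large sets.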
static BOOL th_regex_match_expr(
    const th_char_t *haystack,
    size_t *offs,
    const th_regex_t *expr,
    const size_t startnode,
    const int flags,
    const int level
    );


static BOOL th_regex_match_one(
    const th_char_t *haystack,
    size_t *offs,
    const th_regex_node_t *node,
    const int flags,
    const int level
    )
{
    th_char_t cch;
    BOOL res = FALSE;

    switch (node->type)
    {
        case TH_RE_TYPE_SUBEXPR:
            res = th_regex_match_expr(haystack, offs, node->match.expr, 0, flags, level + 1);
            break;

        case TH_RE_TYPE_LIST:
        case TH_RE_TYPE_LIST_REVERSE:
            if ((cch = haystack[*offs]) == 0)
                res = FALSE;
            else
            {
                res = th_regex_match_list(&node->match.list, cch);

                if (node->type == TH_RE_TYPE_LIST_REVERSE)
                    res = !res;

                (*offs)++;
            }
            break;

        case TH_RE_TYPE_ANY_CHAR:
            if ((cch = haystack[*offs]) == 0)
                res = FALSE;
            else
            {
                res = TRUE;
                (*offs)++;
            }
            break;

        case TH_RE_TYPE_CHAR:
            if ((cch = haystack[*offs]) == 0)
                res = FALSE;
            else
            {
                res = (cch == node->match.chr);
                (*offs)++;
            }
            break;

        case TH_RE_TYPE_STR:
            res = TRUE;
            for (th_char_t *str = node->match.str;
                res && *str != 0;
                str++, (*offs)++)
            {
                if (haystack[*offs] != *str)
                    res = FALSE;
            }
            break;
    }

    return res;
}


static BOOL th_regex_match_count(
    const th_char_t *haystack,
    size_t *offs,
    const th_regex_t *expr,
    const th_regex_node_t *node,
    size_t *nnode,
    const int flags,
    const int level
    )
{
    size_t toffs = *offs, last_offs = *offs;
    ssize_t count = 0;

    do
    {
        // Attempt to match the repeated node once
        size_t poffs = toffs;
        if (th_regex_match_one(haystack, &poffs, node, flags, level))
        {
            // Matched, increase count of repeats
            count++;
            //DBG_RE_PRINT("#%" PRId_SSIZE_T "\n", count);

            // poffs should now point just past the matched input
        }
        else
        {
            // Did not match, give up if at least one repeat is required
            if (node->repeatMin > 0)
                break;
        }

        // Attempt to match rest of the expression
        size_t qoffs1 = poffs, qoffs2 = toffs;
        DBG_RE_PRINT("try rest '%s' :: '%s'\n", haystack + qoffs1, haystack + qoffs2);
        if (th_regex_match_expr(haystack, &qoffs1, expr, *nnode + 1, flags, level + 1))
        {
            // Rest matched after this repeat
            toffs = last_offs = qoffs1;

            DBG_RE_PRINT(" yes1: count=%" PRId_SSIZE_T " [%" PRId_SSIZE_T " .. %" PRId_SSIZE_T "]\n", count, node->repeatMin, node->repeatMax);

            // Check min repeats and if we are "not greedy".
            if (count >= node->repeatMin && node->repeatMax == 1)
                break;

            // Check max repeats
            if (node->repeatMax > 0 && count >= node->repeatMax)
                break;
        }
        else
        if (node->repeatMin == 0 &&
            th_regex_match_expr(haystack, &qoffs2, expr, *nnode + 1, flags, level + 1))
        {
            // Rest matched from the offset before this repeat attempt
            toffs = last_offs = qoffs2;

            DBG_RE_PRINT(" yes2: count=%" PRId_SSIZE_T " [%" PRId_SSIZE_T " .. %" PRId_SSIZE_T "]\n", count, node->repeatMin, node->repeatMax);

            // Check min repeats and if we are "not greedy".
            if (count >= node->repeatMin && node->repeatMax == 1)
                break;

            // Check max repeats
            if (node->repeatMax > 0 && count >= node->repeatMax)
                break;

        }
        else
        {
            // Rest of expression did not match, try again
            DBG_RE_PRINT(" no\n");
            toffs = poffs;
        }


    } while (haystack[toffs] != 0);

    // Check results
    BOOL res = count >= node->repeatMin ||
        (node->repeatMax > 0 && count >= node->repeatMax);

    if (res)
    {
        *offs = last_offs;
        *nnode = expr->nnodes;
    }

    DBG_RE_PRINT("RESULT: %s : offs=%" PRIu_SIZE_T "='%s'\n",
        res ? "YES" : "NO",
        *offs, haystack + *offs);

    return res;
}
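The repeat handling above can be compared with the textbook greedy-then-backtrack strategy. The following is only a minimal sketch of that technique, not the approach used by th_regex_match_count(), which tries the rest of the expression after every single repeat. The helper name sketch_match_repeat() and the convention that max == 0 means "no limit" are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: match `ch` repeated min..max times (max == 0 is
 * taken to mean "no upper limit") followed by the literal string `rest`,
 * starting at text[0]. On success the end offset is stored in *end. */
static bool sketch_match_repeat(const char *text, char ch,
    size_t min, size_t max, const char *rest, size_t *end)
{
    assert(text != NULL && rest != NULL && end != NULL);

    /* Greedy phase: consume as many repeats as allowed */
    size_t count = 0;
    while (text[count] != 0 && text[count] == ch && (max == 0 || count < max))
        count++;

    /* Backtracking phase: give repeats back until the rest matches */
    const size_t rlen = strlen(rest);
    for (;;)
    {
        if (count >= min && strncmp(text + count, rest, rlen) == 0)
        {
            *end = count + rlen;
            return true;
        }

        /* Cannot give back more without violating the minimum */
        if (count <= min)
            return false;

        count--;
    }
}
```

With the input "aaab", the character 'a', min = 1 and rest = "ab", the greedy phase consumes three repeats and the backtracking phase gives one back so that "ab" can match.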


static BOOL th_regex_match_expr(
    const th_char_t *haystack,
    size_t *offs,
    const th_regex_t *expr,
    const size_t startnode,
    const int flags,
    const int level
    )
{
    BOOL res = TRUE;
    size_t soffs = *offs;

    for (size_t nnode = startnode; res && nnode < expr->nnodes; nnode++)
    {
        const th_regex_node_t *node = &expr->nodes[nnode];

#ifdef TH_EXPERIMENTAL_REGEX_DEBUG
        if (th_dbg_fh != NULL)
        {
            th_regex_dump_indent(th_dbg_fh, level);

            thfprintf(th_dbg_fh,
                "[%" PRIu_SIZE_T "/%" PRIu_SIZE_T "] ",
                nnode + 1, expr->nnodes);

            th_regex_dump_node(th_dbg_fh, node);

            thfprintf(th_dbg_fh, " <-> \"%s\"\n",
                haystack + soffs);
        }
#endif

        switch (node->mode)
        {
            case TH_RE_MATCH_ONCE:
                res = th_regex_match_one(haystack, &soffs, node, flags, level);
                break;

            case TH_RE_MATCH_COUNT:
                res = th_regex_match_count(haystack, &soffs, expr, node, &nnode, flags, level);
                break;

            case TH_RE_MATCH_ANCHOR_START:
                res = (soffs == 0);
                break;

            case TH_RE_MATCH_ANCHOR_END:
                res = (haystack[soffs] == 0);
                break;
        }
    }

    if (res)
        *offs = soffs;

    return res;
}
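The two anchor cases above reduce to simple offset checks: a start anchor only holds at offset 0, and an end anchor only holds where the haystack terminator sits. A minimal standalone sketch of the same checks around a literal match follows. The function sketch_match_anchored() and its parameters are hypothetical and exist only for this illustration; the caller is assumed to pass an offset that lies within the string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: match `literal` at haystack[offs], optionally
 * requiring the match to begin at the start and/or finish at the end of
 * the haystack. Assumes offs <= strlen(haystack). */
static bool sketch_match_anchored(const char *haystack, size_t offs,
    const char *literal, bool anchor_start, bool anchor_end, size_t *end)
{
    assert(haystack != NULL && literal != NULL && end != NULL);

    /* Start anchor: only offset 0 qualifies */
    if (anchor_start && offs != 0)
        return false;

    /* Literal part of the expression */
    const size_t len = strlen(literal);
    if (strncmp(haystack + offs, literal, len) != 0)
        return false;
    offs += len;

    /* End anchor: the next character must be the terminator */
    if (anchor_end && haystack[offs] != 0)
        return false;

    *end = offs;
    return true;
}
```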


/**
 * Match the string @p haystack against the compiled regular expression
 * @p expr. Results are returned through two optional output arguments:
 * @p pnmatches receives the number of matches found, and @p pmatches
 * receives a linked list of @c th_regex_match_t structures describing
 * the matched regions. If @p pmatches is used, the resulting linked list
 * should eventually be freed via th_regex_free_matches().
 *
 * @param[in] expr regular expression structure to be matched
 * @param[in] haystack string to be matched against
 * @param[out] pnmatches pointer to a variable to be set to the number of matches found, or @c NULL if the information is not desired
 * @param[out] pmatches pointer to a pointer to @c th_regex_match_t structures, or @c NULL if the information is not desired
 * @param[in] maxmatches maximum number of matches before bailing out, or @c 0 for no limit
 * @param[in] flags additional flags, see @c TH_REF_*
 * @returns @c THERR_OK on success, @c THERR_NULLPTR if @p expr or @p haystack is @c NULL, or @c THERR_MALLOC if allocating a match structure fails
 */
int th_regex_match(const th_regex_t *expr, const th_char_t *haystack,
    size_t *pnmatches, th_regex_match_t **pmatches, const size_t maxmatches,
    const int flags)
{
    size_t nmatches = 0;
    int level = 0;
    (void) flags;

    if (pnmatches != NULL)
        *pnmatches = 0;

    // Check given pattern and string
    if (expr == NULL || haystack == NULL)
        return THERR_NULLPTR;

    // Start matching
    // XXX NOTE .. lots to think about and to take into account:
    // - anchored and unanchored expressions
    // - how to check if the expression has consumed all possibilities?
    // ..
    for (size_t soffs = 0; haystack[soffs] != 0; )
    {
        size_t coffs = soffs;

        if (th_regex_match_expr(haystack, &coffs, expr, 0, flags, level))
        {
            // A match was found, increase count
            nmatches++;

            // Deliver to caller if required
            if (pnmatches != NULL)
                *pnmatches = nmatches;

            if (pmatches != NULL)
            {
                // Add the match region to the list
                th_regex_match_t *match = th_malloc0(sizeof(th_regex_match_t));
                if (match == NULL)
                    return THERR_MALLOC;

                match->type = TH_RE_MATCH_EXPR;
                match->start = soffs;
                match->len = coffs - soffs;

                th_llist_append_node((th_llist_t **) pmatches, (th_llist_t *) match);
            }

            // Check match count limit, if set
            if (maxmatches > 0 && nmatches >= maxmatches)
                break;

            // If offset was not advanced, increase by one
            // otherwise use end of match offset as new start
            if (soffs == coffs)
                soffs++;
            else
                soffs = coffs;
        }
        else
        {
            soffs++;
        }
    }

    return THERR_OK;
}
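The scanning loop in th_regex_match() collects non-overlapping matches and steps forward by one character whenever a match consumed nothing, so that zero-length matches cannot stall the scan. A minimal sketch of the same loop shape is given below, with a plain literal comparison standing in for th_regex_match_expr(). The names sketch_match_at() and sketch_count_matches() are assumptions made for this illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the expression matcher: match `needle`
 * at haystack[*offs] and advance *offs past it on success. */
static bool sketch_match_at(const char *haystack, size_t *offs, const char *needle)
{
    const size_t len = strlen(needle);
    if (strncmp(haystack + *offs, needle, len) != 0)
        return false;

    *offs += len;
    return true;
}

/* Count non-overlapping matches, stopping at maxmatches if it is non-zero */
static size_t sketch_count_matches(const char *haystack, const char *needle,
    size_t maxmatches)
{
    assert(haystack != NULL && needle != NULL);
    size_t nmatches = 0;

    for (size_t soffs = 0; haystack[soffs] != 0; )
    {
        size_t coffs = soffs;

        if (sketch_match_at(haystack, &coffs, needle))
        {
            nmatches++;

            if (maxmatches > 0 && nmatches >= maxmatches)
                break;

            /* A zero-length match must still move the scan forward */
            if (soffs == coffs)
                soffs++;
            else
                soffs = coffs;
        }
        else
            soffs++;
    }

    return nmatches;
}
```

Note that, as in the loop above, the scan never attempts a match at the terminator itself, so an empty haystack yields no matches even for a needle that can match zero characters.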


static void th_regex_free_match(th_regex_match_t *node)
{
    (void) node;
    // Nothing to do here at the moment
}


/**
 * Deallocate the linked list of @c th_regex_match_t structures
 * pointed to by @p matches. All associated data will be freed.
 *
 * @param[in] matches linked list of match structures to be deallocated
 */
void th_regex_free_matches(th_regex_match_t *matches)
{
    th_llist_free_func_node((th_llist_t *) matches,
        (void (*)(th_llist_t *)) th_regex_free_match);
}
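The ownership rule documented above (the matcher appends match regions to a caller-owned list, and the caller frees the whole list afterwards) can be illustrated without the th_llist API. The sketch below uses a hypothetical singly linked sketch_match_t with only the start and len fields seen in th_regex_match(); the real th_regex_match_t layout and the th_llist functions are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical match record: list link first, then the matched region */
typedef struct sketch_match_t
{
    struct sketch_match_t *next;
    size_t start, len;
} sketch_match_t;

/* Append a zero-initialized node describing one match to the list tail */
static bool sketch_append_match(sketch_match_t **list, size_t start, size_t len)
{
    assert(list != NULL);

    sketch_match_t *node = calloc(1, sizeof(*node));
    if (node == NULL)
        return false;

    node->start = start;
    node->len = len;

    /* Walk to the end of the list and link the new node there */
    while (*list != NULL)
        list = &(*list)->next;
    *list = node;

    return true;
}

/* Free every node in the list, returning how many nodes were released */
static size_t sketch_free_matches(sketch_match_t *list)
{
    size_t nfreed = 0;
    while (list != NULL)
    {
        sketch_match_t *next = list->next;
        free(list);
        list = next;
        nfreed++;
    }
    return nfreed;
}
```

Because nodes are appended to whatever the list pointer already holds, a caller following this pattern would start with the list pointer set to NULL.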