annotate urllog.tcl @ 240:302faa300510
urllog: Fix 100L.
author | Matti Hamalainen <ccr@tnsp.org> |
---|---|
date | Mon, 19 Jan 2015 16:32:47 +0200 |
parents | 059660980388 |
children | 669842725e2f |
rev | line source |
---|---|
0 | 1 ########################################################################## |
2 # | |
222 | 3 # URLLog v2.3.0 by Matti 'ccr' Hamalainen <ccr@tnsp.org> |
176 | 4 # (C) Copyright 2000-2014 Tecnic Software productions (TNSP) |
0 | 5 # |
113 | 6 # This script is freely distributable under GNU GPL (version 2) license. |
7 # | |
0 | 8 ########################################################################## |
9 # | |
50 | 10 # URL-logger script for EggDrop IRC robot, utilizing SQLite3 database |
81 | 11 # This script requires SQLite TCL extension. Under Debian, you need: |
12 # tcl8.5 libsqlite3-tcl (and eggdrop eggdrop-data, of course) | |
50 | 13 # |
81 | 14 # NOTICE! If you are upgrading to URLLog v2.0+ from any 1.x version, you |
50 | 15 # may want to run a conversion script against your URL-database file, |
16 # if you wish to preserve the old data. | |
0 | 17 # |
50 | 18 # See convert_urllog_db.tcl for more information. |
19 # | |
81 | 20 # If you are doing a fresh install, you will need to create the |
50 | 21 # initial SQLite3 database with the required table schemas. You |
22 # can do that by running: create_urllog_db.tcl | |
0 | 23 # |
24 ########################################################################## | |
13 | 25 |
0 | 26 ### |
27 ### HTTP options | |
28 ### | |
81 | 29 # Set to 1 if you want to enable use of HTTP proxy. |
30 # If you do, you MUST set the proxy settings below too. | |
0 | 31 set http_proxy 0 |
32 | |
33 # Proxy host and port number (only used if enabled above) | |
34 set http_proxy_host "" | |
35 set http_proxy_port 8080 | |
36 | |
81 | 37 # Enable _experimental_ TLS/SSL support. This may not work at all. |
38 # If unsure, leave this option disabled (0). | |
104 | 39 set http_tls_support 1 |
0 | 40 |
89 | 41 set http_tls_cadir "/usr/share/ca-certificates/mozilla" |
42 | |
50 | 43 |
0 | 44 ### |
45 ### General options | |
46 ### | |
47 | |
219 | 48 # Channels where URLLog records links/URLs |
49 # set urllog_log_channels "#foobar;#baz" | |
50 # You can use * to match substrings or everything | |
227 | 51 set urllog_log_channels "#tnsp;#fireball;#mazmlame" |
219 | 52 |
53 | |
50 | 54 # Filename of the SQLite URL database file |
13 | 55 set urllog_db_file "urllog.sqlite" |
0 | 56 |
57 | |
58 # 1 = Verbose: Say messages when URL is OK, bad, etc. | |
59 # 0 = Quiet : Be quiet (only speak if asked with !urlfind, etc) | |
60 set urllog_verbose 1 | |
61 | |
62 | |
50 | 63 # 1 = Enable logging of various script actions into bot's log |
0 | 64 # 0 = Don't. |
65 set urllog_logmsg 1 | |
66 | |
67 | |
68 # 1 = Check URLs for validity and existence before adding. | |
69 # 0 = No checks. Add _anything_ that looks like an URL to the database. | |
70 set urllog_check 1 | |
71 | |
72 | |
73 ### | |
74 ### Search related settings | |
75 ### | |
76 | |
219 | 77 # Channels where !urlfind and other commands can be used. |
78 # By default this is set to be the same as urllog_log_channels | |
79 set urllog_search_channels $urllog_log_channels | |
0 | 80 |
81 | 81 # Maximum number of URLs the "!urlfind" command shows on a channel. |
0 | 82 set urllog_showmax_pub 3 |
83 | |
81 | 84 # Same as above, but for private message search. |
0 | 85 set urllog_showmax_priv 6 |
86 | |
87 | |
88 ### | |
89 ### ShortURL-settings | |
90 ### | |
181 | 91 # To enable ShortURL functionality, you need to set up the |
92 # URL redirector PHP script (urlredirect.php) correctly, and | |
93 # change the settings in it and below appropriately. | |
94 # See urlredirect.php.txt for more information. | |
95 # | |
96 # You will also need SQLite3 support for PHP and access to | |
97 # change .htaccess file(s) on your web server. The PHP | |
98 # script will also need access to the SQLite3 database this | |
99 # script uses. | |
100 # | |
0 | 101 |
73 | 102 # 1 = Enable showing of ShortURLs |
103 # 0 = ShortURLs not shown in any bot actions | |
0 | 104 set urllog_shorturl 1 |
105 | |
73 | 106 # Max length of original URL to be shown, rest is chopped |
107 # off if the URL is longer than the specified amount. | |
0 | 108 set urllog_shorturl_orig 30 |
109 | |
73 | 110 # Web server URL that handles redirects of ShortURLs |
0 | 111 set urllog_shorturl_prefix "http://tnsp.org/u/" |
112 | |
113 | |
114 ### | |
81 | 115 ### Message texts (informal, errors, etc.) |
0 | 116 ### |
117 | |
118 # No such host was found | |
119 set urlmsg_nosuchhost "ei tommosta oo!" | |
120 | |
121 # Could not connect host (I/O errors etc) | |
122 set urlmsg_ioerror "kraak, virhe yhdynnässä." | |
123 | |
124 # HTTP timeout | |
125 set urlmsg_timeout "ei jaksa ootella" | |
126 | |
127 # No such document was found | |
128 set urlmsg_errorgettingdoc "siitosvirhe" | |
129 | |
130 # URL was already known (was in database) | |
131 set urlmsg_alreadyknown "wanha!" | |
132 #set urlmsg_alreadyknown "Empiiristen havaintojen perusteella ja tällä sovellutusalueella esiintyneisiin aikaisempiin kontekstuaalisiin ilmaisuihin viitaten uskallan todeta, että sovellukseen ilmoittamasi tietoverkko-osoite oli kronologisti ajatellen varsin postpresentuaalisesti sopimaton ja ennestään hyvin tunnettu." | |
133 | |
134 # No match was found when searched with !urlfind or other command | |
135 set urlmsg_nomatch "Ei osumia." | |
136 | |
137 | |
138 ### | |
139 ### Things that you usually don't need to touch ... | |
140 ### | |
141 | |
142 # What IRC "command" should we use to send messages: | |
143 # (Valid alternatives are "PRIVMSG" and "NOTICE") | |
144 set urllog_preferredmsg "PRIVMSG" | |
145 | |
146 # The valid known Top Level Domains (TLDs), but not the country code TLDs | |
147 # (Now includes the new IANA published TLDs) | |
90 | 148 set urllog_tlds "org,com,net,mil,gov,biz,edu,coop,aero,info,museum,name,pro,int,xxx" |
0 | 149 |
150 | |
151 ########################################################################## | |
152 # No need to look below this line | |
153 ########################################################################## | |
154 set urllog_name "URLLog" | |
222 | 155 set urllog_version "2.3.0" |
0 | 156 |
157 set urllog_tlds [split $urllog_tlds ","] | |
158 set urllog_httprep [split "\@|%40|{|%7B|}|%7D|\[|%5B|\]|%5D" "|"] | |
159 | |
102 | 160 |
131
161 set urllog_ent_str "-|-|'|'|—|-|‏||—|-|–|--|‪||‬|" |
162 append urllog_ent_str "|‎||å|Ã¥|Å|Ã…|é|é|:|:| | " |
133 | 163 append urllog_ent_str "|”|\"|“|\"|«|<<|»|>>|"|\"" |
131
164 append urllog_ent_str "|ä|ä|ö|ö|Ä|Ä|Ö|Ö|&|&|<|<|>|>" |
165 append urllog_ent_str "|ä|ä|å|ö|—|-|'|'|–|-|"|\"" |
166 append urllog_ent_str "|||-|’|'|ü|ü|Ü|Ãœ|•|*|€|€" |
223
167 append urllog_ent_str "|”|\"|‘|'" |
102 | 168 set urllog_html_ent [split [encoding convertfrom "utf-8" $urllog_ent_str] "|"] |
0 | 169 |
13 | 170 ### Require packages |
171 package require sqlite3 | |
0 | 172 package require http |
7 | 173 |
0 | 174 ### Binding initializations |
219 | 175 bind pub - !urlfind urllog_pub_urlfind |
176 bind msg - !urlfind urllog_msg_urlfind | |
0 | 177 bind pubm - *.* urllog_checkmsg |
178 bind topc - *.* urllog_checkmsg | |
179 | |
180 | |
181 ### Initialization messages | |
176 | 182 set urllog_message "$urllog_name v$urllog_version (C) 2000-2014 ccr/TNSP" |
0 | 183 putlog "$urllog_message" |
184 | |
13 | 185 ### HTTP module initialization |
186 ::http::config -useragent "$urllog_name/$urllog_version" | |
187 if {$http_proxy != 0} { | |
28 | 188 ::http::config -proxyhost $http_proxy_host -proxyport $http_proxy_port |
13 | 189 } |
190 | |
191 if {$http_tls_support != 0} { | |
28 | 192 package require tls |
235 | 193 ::http::register https 443 [list ::tls::socket -request 1 -require 1 -tls1 1 -cadir $http_tls_cadir] |
13 | 194 } |
195 | |
196 ### SQLite database initialization | |
197 if {[catch {sqlite3 urldb $urllog_db_file} uerrmsg]} { | |
28 | 198 putlog " Could not open SQLite3 database '$urllog_db_file': $uerrmsg" |
199 exit 2 | |
13 | 200 } |
201 | |
202 | |
0 | 203 if {$http_proxy != 0} { |
28 | 204 putlog " (Using proxy $http_proxy_host:$http_proxy_port)" |
0 | 205 } |
206 | |
207 if {$urllog_check != 0} { | |
28 | 208 putlog " (Additional URL validity checks enabled)" |
0 | 209 } |
210 | |
211 if {$urllog_verbose != 0} { | |
28 | 212 putlog " (Verbose mode enabled)" |
0 | 213 } |
214 | |
215 #------------------------------------------------------------------------- | |
216 ### Utility functions | |
217 proc urllog_log {arg} { | |
28 | 218 global urllog_logmsg urllog_name |
0 | 219 |
28 | 220 if {$urllog_logmsg != 0} { |
221 putlog "$urllog_name: $arg" | |
222 } | |
0 | 223 } |
224 | |
225 | |
152 | 226 proc urllog_ctime {utime} { |
28 | 227 if {$utime == "" || $utime == "*"} { |
228 set utime 0 | |
229 } | |
230 return [clock format $utime -format "%d.%m.%Y %H:%M"] | |
0 | 231 } |
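
The timestamp format used by `urllog_ctime` can be tried outside the bot. The following is a minimal sketch, not part of the script: the name `demo_ctime` is hypothetical, and `-gmt 1` is an added assumption so that the result does not depend on the host's time zone (the script itself formats in local time).

```tcl
# Hypothetical stand-alone variant of urllog_ctime.
# -gmt 1 is an assumption added for reproducible output.
proc demo_ctime {utime} {
    # Empty or "*" timestamps fall back to the Unix epoch, as in the script.
    if {$utime eq "" || $utime eq "*"} {
        set utime 0
    }
    return [clock format $utime -format "%d.%m.%Y %H:%M" -gmt 1]
}

puts [demo_ctime ""]      ;# 01.01.1970 00:00
puts [demo_ctime 86400]   ;# 02.01.1970 00:00
```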
232 | |
233 | |
234 proc urllog_isnumber {uarg} { | |
28 | 235 foreach i [split $uarg {}] { |
65 | 236 if {![string match \[0-9\] $i]} { return 0 } |
28 | 237 } |
65 | 238 return 1 |
0 | 239 } |
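
The character-by-character test in `urllog_isnumber` could also be written as one anchored regular expression. This is only a sketch under that assumption; `demo_isnumber` is a hypothetical name and is not used by the script. Like the loop version, it accepts the empty string.

```tcl
# Hypothetical one-line equivalent of urllog_isnumber:
# 1 if the argument consists solely of ASCII digits (or is empty), else 0.
proc demo_isnumber {uarg} {
    return [regexp {^[0-9]*$} $uarg]
}

puts [demo_isnumber "12345"]   ;# 1
puts [demo_isnumber "12a45"]   ;# 0
puts [demo_isnumber ""]        ;# 1
```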
240 | |
241 | |
242 proc urllog_msg {apublic anick achan amsg} { | |
28 | 243 global urllog_preferredmsg |
0 | 244 |
28 | 245 if {$apublic == 1} { |
246 putserv "$urllog_preferredmsg $achan :$amsg" | |
247 } else { | |
248 putserv "$urllog_preferredmsg $anick :$amsg" | |
249 } | |
0 | 250 } |
251 | |
252 | |
253 proc urllog_verb_msg {anick achan amsg} { | |
28 | 254 global urllog_verbose |
0 | 255 |
28 | 256 if {$urllog_verbose != 0} { |
257 urllog_msg 1 $anick $achan $amsg | |
258 } | |
0 | 259 } |
260 | |
261 | |
262 proc urllog_convert_ent {udata} { | |
28 | 263 global urllog_html_ent |
115 | 264 return [string map -nocase $urllog_html_ent [string map $urllog_html_ent $udata]] |
0 | 265 } |
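
To illustrate the table-driven replacement in `urllog_convert_ent`, here is a small sketch with a three-entry table in the same `entity|replacement` layout. The names `demo_ent_str`, `demo_html_ent` and `demo_convert_ent` are hypothetical, and the sketch does a single case-insensitive pass, whereas the script runs an exact pass first and a `-nocase` pass second.

```tcl
# Hypothetical mini entity table: pairs of "entity|replacement".
set demo_ent_str "&lt;|<|&gt;|>|&amp;|&"
set demo_html_ent [split $demo_ent_str "|"]

proc demo_convert_ent {udata} {
    global demo_html_ent
    # string map walks the input once and substitutes each matching key.
    return [string map -nocase $demo_html_ent $udata]
}

puts [demo_convert_ent "a &lt; b &AMP; c"]   ;# a < b & c
```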
266 | |
267 | |
13 | 268 proc urllog_escape { str } { |
28 | 269 return [string map {' ''} $str] |
13 | 270 } |
271 | |
116 | 272 |
273 proc urllog_sanitize_encoding {uencoding} { | |
274 regsub -- {^[a-z][a-z]_[A-Z][A-Z]\.} $uencoding "" uencoding | |
275 set uencoding [string tolower $uencoding] | |
276 regsub -- "^iso-" $uencoding "iso" uencoding | |
277 return $uencoding | |
278 } | |
279 | |
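
As an illustration of what `urllog_sanitize_encoding` is for, the sketch below normalizes two typical charset labels into the names Tcl's `encoding` command expects. The name `demo_sanitize_encoding` is hypothetical; the patterns are brace-quoted here so that `\.` reaches the regular expression engine as a literal dot.

```tcl
# Hypothetical stand-alone copy for experimentation.
proc demo_sanitize_encoding {uencoding} {
    # Drop a leading locale prefix such as "fi_FI.".
    regsub -- {^[a-z][a-z]_[A-Z][A-Z]\.} $uencoding "" uencoding
    set uencoding [string tolower $uencoding]
    # Tcl names its encodings "iso8859-15", not "iso-8859-15".
    regsub -- {^iso-} $uencoding "iso" uencoding
    return $uencoding
}

puts [demo_sanitize_encoding "fi_FI.ISO-8859-15"]   ;# iso8859-15
puts [demo_sanitize_encoding "UTF-8"]               ;# utf-8
```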
121 | 280 proc urllog_clean_title {utitle} { |
281 if {[catch {set utitle [encoding convertto "iso8859-15" $utitle]} cerrmsg]} { | |
282 putlog "Could not convert title encoding: $cerrmsg" | |
283 } | |
284 return $utitle | |
285 } | |
116 | 286 |
0 | 287 #------------------------------------------------------------------------- |
150 | 288 set urllog_shorturl_str "ABCDEFGHIJKLNMOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789" |
13 | 289 |
150 | 290 proc urllog_get_short {utime} { |
291 global urllog_shorturl_prefix urllog_shorturl_str | |
292 | |
293 set ulen [string length $urllog_shorturl_str] | |
0 | 294 |
28 | 295 set u1 [expr {$utime / ($ulen * $ulen)}] |
296 set utmp [expr {$utime % ($ulen * $ulen)}] | |
297 set u2 [expr {$utmp / $ulen}] | |
298 set u3 [expr {$utmp % $ulen}] | |
0 | 299 |
150 | 300 return "\[ $urllog_shorturl_prefix[string index $urllog_shorturl_str $u1][string index $urllog_shorturl_str $u2][string index $urllog_shorturl_str $u3] \]" |
301 } | |
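
The ShortURL code is the row ID written as three digits in base 62, using the alphabet string above. The sketch below reproduces only that arithmetic so it can be checked by hand; `demo_alphabet` and `demo_short_code` are hypothetical names, and the prefix and brackets added by `urllog_get_short` are left out.

```tcl
# Same 62-character alphabet as urllog_shorturl_str (note "N" precedes "M").
set demo_alphabet "ABCDEFGHIJKLNMOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789"

proc demo_short_code {id} {
    global demo_alphabet
    set n [string length $demo_alphabet]      ;# 62
    set d1 [expr {$id / ($n * $n)}]           ;# most significant digit
    set rest [expr {$id % ($n * $n)}]
    set d2 [expr {$rest / $n}]
    set d3 [expr {$rest % $n}]                ;# least significant digit
    return "[string index $demo_alphabet $d1][string index $demo_alphabet $d2][string index $demo_alphabet $d3]"
}

puts [demo_short_code 0]        ;# AAA
puts [demo_short_code 1]        ;# AAB
puts [demo_short_code 62]       ;# ABA
puts [demo_short_code 238327]   ;# 999
```

Three base-62 digits cover IDs 0 through 238327 (62^3 - 1). For larger IDs the first index falls outside the alphabet, so in this sketch `string index` returns an empty string and the code comes back shorter than three characters.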
0 | 302 |
303 | |
304 #------------------------------------------------------------------------- | |
305 proc urllog_chop_url {url} { | |
28 | 306 global urllog_shorturl_orig |
68 | 307 |
28 | 308 if {[string length $url] > $urllog_shorturl_orig} { |
309 return "[string range $url 0 [expr {$urllog_shorturl_orig - 1}]]..." | |
310 } else { | |
311 return $url | |
312 } | |
0 | 313 } |
314 | |
315 #------------------------------------------------------------------------- | |
83 | 316 proc urllog_exists {urlStr urlNick urlHost urlChan} { |
28 | 317 global urldb urlmsg_alreadyknown urllog_shorturl |
0 | 318 |
83 | 319 set usql "SELECT id AS uid, utime AS utime, url AS uurl, user AS uuser, host AS uhost, chan AS uchan, title AS utitle FROM urls WHERE url='[urllog_escape $urlStr]'" |
320 urldb eval $usql { | |
28 | 321 urllog_log "URL said by $urlNick ($urlStr) already known" |
322 if {$urllog_shorturl != 0} { | |
83 | 323 set qstr "[urllog_get_short $uid] " |
28 | 324 } else { |
325 set qstr "" | |
326 } | |
327 append qstr "($uuser/$uchan@[urllog_ctime $utime])" | |
83 | 328 if {[string length $utitle] > 0} { |
121 | 329 set qstr "$urlmsg_alreadyknown - '[urllog_clean_title $utitle]' $qstr" |
28 | 330 } else { |
331 set qstr "$urlmsg_alreadyknown $qstr" | |
332 } | |
333 urllog_verb_msg $urlNick $urlChan $qstr | |
334 return 0 | |
335 } | |
83 | 336 return 1 |
337 } | |
0 | 338 |
18 | 339 |
83 | 340 #------------------------------------------------------------------------- |
341 proc urllog_addurl {urlStr urlNick urlHost urlChan urlTitle} { | |
342 global urldb urllog_shorturl | |
343 | |
93 | 344 if {$urlTitle == ""} { |
345 set uins "NULL" | |
346 } else { | |
347 set uins "'[urllog_escape $urlTitle]'" | |
348 } | |
349 set usql "INSERT INTO urls (utime,url,user,host,chan,title) VALUES ([unixtime], '[urllog_escape $urlStr]', '[urllog_escape $urlNick]', '[urllog_escape $urlHost]', '[urllog_escape $urlChan]', $uins)" | |
83 | 350 if {[catch {urldb eval $usql} uerrmsg]} { |
351 urllog_log "$uerrmsg on SQL:\n$usql" | |
28 | 352 return 0 |
353 } | |
82 | 354 set uid [urldb last_insert_rowid] |
28 | 355 urllog_log "Added URL ($urlNick@$urlChan): $urlStr" |
0 | 356 |
357 | |
28 | 358 ### Let's say something, to confirm that everything went well. |
359 if {$urllog_shorturl != 0} { | |
82
1bbc79f41a1c
urllog: Rename few variables for clarity.
Matti Hamalainen <ccr@tnsp.org>
parents:
81
diff
changeset
|
360 set qstr "[urllog_get_short $uid] " |
28 | 361 } else { |
362 set qstr "" | |
363 } | |
364 if {[string length $urlTitle] > 0} { | |
121
bec98a9f8695
Convert the title encoding when outputting to channel.
Matti Hamalainen <ccr@tnsp.org>
parents:
120
diff
changeset
|
365 urllog_verb_msg $urlNick $urlChan "'[urllog_clean_title $urlTitle]' ([urllog_chop_url $urlStr]) $qstr" |
28 | 366 } else { |
367 urllog_verb_msg $urlNick $urlChan "[urllog_chop_url $urlStr] $qstr" | |
368 } | |
0 | 369 |
28 | 370 return 1 |
0 | 371 } |
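The INSERT in urllog_addurl is assembled by string concatenation and depends on urllog_escape for quoting. As a hedged alternative, the sketch below lets the SQLite Tcl interface bind Tcl variables itself, which it does when the SQL is passed in braces. The names urllog_addurl_bound and demo_db are hypothetical, the table layout is an assumption based only on the column list in the INSERT above, and [clock seconds] stands in for Eggdrop's unixtime.

```tcl
package require sqlite3

# Hypothetical in-memory database. The column names come from the INSERT
# in urllog_addurl; the column types are an assumption.
sqlite3 demo_db :memory:
demo_db eval {
  CREATE TABLE urls (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    utime INTEGER, url TEXT, user TEXT, host TEXT, chan TEXT, title TEXT
  )
}

# Hypothetical variant of urllog_addurl that binds values instead of
# escaping them. Returns the new row id, or 0 on failure.
proc urllog_addurl_bound {urlStr urlNick urlHost urlChan urlTitle} {
  set now [clock seconds]
  # The :name placeholders are bound to the local variables of this proc.
  # NULLIF maps an empty title to SQL NULL.
  if {[catch {
    demo_db eval {
      INSERT INTO urls (utime, url, user, host, chan, title)
      VALUES (:now, :urlStr, :urlNick, :urlHost, :urlChan, NULLIF(:urlTitle, ''))
    }
  } uerrmsg]} {
    return 0
  }
  return [demo_db last_insert_rowid]
}
```

Because the values never pass through the SQL parser as text, a URL containing a single quote needs no special treatment in this variant.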


#-------------------------------------------------------------------------
proc urllog_checkurl {urlStr urlNick urlHost urlChan} {
  global urllog_tlds urllog_check urlmsg_nosuchhost urlmsg_ioerror
  global urlmsg_timeout urlmsg_errorgettingdoc urllog_httprep
  global urllog_shorturl_prefix urllog_shorturl urllog_encoding
  global http_tls_support

  ### Try to guess the URL protocol component (if it is missing)
  set u_checktld 1
  if {[string match "*www.*" $urlStr] && ![string match "http://*" $urlStr] && ![string match "https://*" $urlStr]} {
    set urlStr "http://$urlStr"
  } elseif {[string match "*ftp.*" $urlStr] && ![string match "ftp://*" $urlStr]} {
    set urlStr "ftp://$urlStr"
  }

  ### Handle URLs that have an IPv4-address
  if {[regexp "(\[a-z\]+)://(\[0-9\]{1,3})\\.(\[0-9\]{1,3})\\.(\[0-9\]{1,3})\\.(\[0-9\]{1,3})" $urlStr u_match u_proto ni1 ni2 ni3 ni4]} {
    # Check if the IP is on local network
    if {$ni1 == 127 || $ni1 == 10 || ($ni1 == 192 && $ni2 == 168) || $ni1 == 0} {
      urllog_log "URL pointing to local or invalid network, ignored ($urlStr)."
      return 0
    }
    # Skip TLD check for URLs with IP address
    set u_checktld 0
  }

  ### Check now if we have a ShortURL here ...
  if {[string match "$urllog_shorturl_prefix*" $urlStr]} {
    urllog_log "Ignoring ShortURL from $urlNick: $urlStr"
    # NOTE: uud is left empty here, so the lookup below is an unfinished placeholder.
    set uud ""
    set usql "SELECT id AS uid, url AS uurl, user AS uuser, host AS uhost, chan AS uchan, title AS utitle FROM urls WHERE utime=$uud"
    urldb eval $usql {

    }
    return 0
  }

  ### Get URL protocol component
  set u_proto ""
  regexp "(\[a-z\]+)://" $urlStr u_match u_proto

  ### Check the PORT (if the ":" is there)
  set u_record [split $urlStr "/"]
  set u_hostname [lindex $u_record 2]
  set u_port [lindex [split $u_hostname ":"] end]

  if {![urllog_isnumber $u_port] && $u_port != "" && $u_port != $u_hostname} {
    urllog_log "Broken URL from $urlNick: ($urlStr) illegal port $u_port"
    return 0
  }

  # Default to port 80 (HTTP)
  if {![urllog_isnumber $u_port]} {
    set u_port 80
  }

  ### Is it a http or ftp url? (FIX ME!)
  if {$u_proto != "http" && $u_proto != "https" && $u_proto != "ftp"} {
    urllog_log "Broken URL from $urlNick: ($urlStr) UNSUPPORTED protocol class ($u_proto)."
    return 0
  }

  ### Check the Top Level Domain (TLD) validity
  if {$u_checktld != 0} {
    set u_sane [lindex [split $u_hostname "."] end]
    set u_tld [lindex [split $u_sane ":"] 0]
    set u_found 0

    if {[string length $u_tld] == 2} {
      # Assume all 2-letter domains to be valid :)
      set u_found 1
    } else {
      # Check our list of known TLDs
      foreach itld $urllog_tlds {
        if {[string match $itld $u_tld]} {
          set u_found 1
        }
      }
    }

    if {$u_found == 0} {
      urllog_log "Broken URL from $urlNick: ($urlStr) illegal TLD: $u_tld."
      return 0
    }
  }

  set urlStr [string map $urllog_httprep $urlStr]

  ### Does the URL already exist?
  if {![urllog_exists $urlStr $urlNick $urlHost $urlChan]} {
    return 1
  }

  ### Do we perform additional optional checks?
  if {$urllog_check == 0 || !(($http_tls_support != 0 && $u_proto == "https") || $u_proto == "http")} {
    # No optional checks, or it's not http/https.
    # Just add the URL, if it does not exist already.
    urllog_addurl $urlStr $urlNick $urlHost $urlChan ""
    return 1
  }

  ### Does the document pointed to by the URL exist?
  if {[catch {set utoken [::http::geturl $urlStr -timeout 6000 -headers {Accept-Encoding identity}]} uerrmsg]} {
    urllog_verb_msg $urlNick $urlChan "$urlmsg_ioerror ($uerrmsg)"
    urllog_log "HTTP request failed: $uerrmsg"
    return 0
  }

  set ustatus [::http::status $utoken]
  if {$ustatus == "timeout"} {
    # Release the HTTP token before bailing out
    ::http::cleanup $utoken
    urllog_verb_msg $urlNick $urlChan "$urlmsg_timeout"
    urllog_log "HTTP request timed out ($urlStr)"
    return 0
  }

  if {$ustatus != "ok"} {
    # Fetch the error text first, then release the HTTP token
    set uerrmsg [::http::error $utoken]
    ::http::cleanup $utoken
    urllog_verb_msg $urlNick $urlChan "$urlmsg_errorgettingdoc ($uerrmsg)"
    urllog_log "Error in HTTP transaction: $uerrmsg ($urlStr)"
    return 0
  }

  # Fixme! Handle redirects!
  set ustatus [::http::status $utoken]
  set uscode [::http::code $utoken]
  set ucode [::http::ncode $utoken]
  set udata [::http::data $utoken]
  array set umeta [::http::meta $utoken]
  ::http::cleanup $utoken

  if {$ucode >= 200 && $ucode <= 309} {
    set uenc_doc ""
    set uenc_http ""
    set uencoding ""

    # Get information about specified character encodings
    if {[info exists umeta(Content-Type)] && [regexp -nocase {charset\s*=\s*([a-z0-9._-]+)} $umeta(Content-Type) umatches uenc_http]} {
      # Found character set encoding information in HTTP headers
    }

    if {[regexp -nocase -- "<meta.\*\?content=\"text/html.\*\?charset=(\[^\"\]*)\".\*\?/\?>" $udata umatches uenc_doc]} {
      # Found old style HTML meta tag with character set information
    } elseif {[regexp -nocase -- "<meta.\*\?charset=\"(\[^\"\]*)\".\*\?/\?>" $udata umatches uenc_doc]} {
      # Found HTML5 style meta tag with character set information
    }

    # Make sanitized versions of the encoding strings
    set uenc_http2 [urllog_sanitize_encoding $uenc_http]
    set uenc_doc2 [urllog_sanitize_encoding $uenc_doc]

    # KLUDGE!
    set uencoding $uenc_http2

    # putlog "got charsets : http='$uenc_http', doc='$uenc_doc' / sanitized http='$uenc_http2', doc='$uenc_doc2'"


    # Check if the document has specified encoding
    if {$uenc_doc != ""} {
      # Does it differ from what HTTP says?
      if {$uenc_http != "" && $uenc_doc != $uenc_http && $uenc_doc2 != $uenc_http2} {
        # Yes, we will try reconverting
        set uencoding $uenc_doc2
      }
    } elseif {$uenc_http == ""} {
      # If _NO_ known encoding of any kind, assume the default of iso8859-1
      set uencoding "iso8859-1"
    }

    # Get the document title, if any
    set urlTitle ""
    if {[regexp -nocase -- "<title>(.\*\?)</title>" $udata umatches urlTitle]} {
      # If character set conversion is required, do it now
      if {$uencoding != ""} {
        # putlog "conversion requested from $uencoding"
        if {[catch {set urlTitle [encoding convertfrom $uencoding $urlTitle]} cerrmsg]} {
          urllog_log "Error in charset conversion: $cerrmsg"
        }
      }

      # putlog "xxx: $uencoding : '$urlTitle'"
      # return 0

      # Convert some HTML entities to plaintext and do some cleanup
      set utmp [urllog_convert_ent $urlTitle]
      regsub -all "\r|\n|\t" $utmp " " utmp
      # Collapse runs of spaces into a single space
      regsub -all " +" $utmp " " utmp
      set urlTitle [string trim $utmp]
    }

    # Rasiatube hack
    if {[string match "*/rasiatube/view*" $urlStr]} {
      set rasia 0
      if {[regexp -nocase -- "<link rel=\"video_src\"\.\*\?file=(http://\[^&\]+)&" $udata umatches utmp]} {
        regsub -all "\/v\/" $utmp "\/watch\?v=" urlStr
        set rasia 1
      } else {
        if {[regexp -nocase -- "SWFObject.\"(\[^\"\]+)\", *\"flashvideo" $udata umatches utmp]} {
          regsub "http:\/\/www.dailymotion.com\/swf\/" $utmp "http:\/\/www.dailymotion.com\/video\/" urlStr
          set rasia 1
        }
      }
      if {$rasia != 0} {
        urllog_log "RasiaTube mangler: $urlStr"
        # The Finnish message below means "Fixing a stinky rasiatube link"
        urllog_verb_msg $urlNick $urlChan "Korjataan haiseva rasiatube-linkki: $urlStr"
      }
    }

    # Check if the URL already exists, just in case we had some redirects
    if {[urllog_exists $urlStr $urlNick $urlHost $urlChan]} {
      urllog_addurl $urlStr $urlNick $urlHost $urlChan $urlTitle
    }
    return 1
  } else {
    urllog_verb_msg $urlNick $urlChan "$urlmsg_errorgettingdoc ($ucode)"
    urllog_log "Error fetching document: status=$ustatus, code=$ucode, scode=$uscode, url=$urlStr"
  }
}
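The local-network test in urllog_checkurl compares the first one or two octets against 127, 10, 192.168 and 0. The sketch below shows a slightly broader check under stated assumptions: the helper name urllog_is_local_ipv4 is hypothetical, and the extra ranges (172.16.0.0/12 and 169.254.0.0/16) are additions that the script above does not test for.

```tcl
# Hypothetical helper. Returns 1 if the dotted-quad address is loopback,
# private, link-local or in the 0.x.x.x block, 0 if it looks public, and
# -1 if the string does not parse as four decimal octets in 0..255.
proc urllog_is_local_ipv4 {addr} {
  # scan with %d reads decimal numbers, so "010" is read as 10.
  # The trailing %c only converts if unexpected extra characters follow.
  if {[scan $addr "%d.%d.%d.%d%c" a b c d extra] != 4} {
    return -1
  }
  foreach octet [list $a $b $c $d] {
    if {$octet < 0 || $octet > 255} { return -1 }
  }
  if {$a == 0 || $a == 10 || $a == 127} { return 1 }
  if {$a == 192 && $b == 168} { return 1 }
  if {$a == 172 && $b >= 16 && $b <= 31} { return 1 }
  if {$a == 169 && $b == 254} { return 1 }
  return 0
}
```

Parsing the octets with scan rather than comparing the regexp captures directly also sidesteps Tcl 8.x treating a capture such as "08" as an invalid octal number inside expr.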


#-------------------------------------------------------------------------
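The character set and title handling in urllog_checkurl can be tried in isolation, without Eggdrop or a live HTTP request. The following is a minimal sketch, assuming two hypothetical helpers (urllog_charset_from_header and urllog_extract_title) that reuse the same regular expressions as the script; they are not part of the original code.

```tcl
# Hypothetical helper: pull the charset parameter out of a Content-Type
# header value, lower-cased, or return "" if none is present.
proc urllog_charset_from_header {ctype} {
  if {[regexp -nocase {charset\s*=\s*([a-z0-9._-]+)} $ctype -> cs]} {
    return [string tolower $cs]
  }
  return ""
}

# Hypothetical helper: extract the first <title> element, optionally
# convert it from the given encoding, and collapse whitespace.
proc urllog_extract_title {html {enc ""}} {
  if {![regexp -nocase -- {<title>(.*?)</title>} $html -> title]} {
    return ""
  }
  if {$enc ne ""} {
    # On an unknown encoding name the title is kept unconverted,
    # as in urllog_checkurl.
    catch {set title [encoding convertfrom $enc $title]}
  }
  regsub -all {\s+} $title " " title
  return [string trim $title]
}
```

A possible use: `urllog_extract_title $udata [urllog_charset_from_header $umeta(Content-Type)]`, bearing in mind that the script additionally passes charset names through urllog_sanitize_encoding before converting.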


proc urllog_checkmsg {unick uhost uhand uchan utext} {
  global urllog_log_channels

  ### Check the nick
  if {$unick == "*"} {
    urllog_log "urllog_checkmsg: nick was a wildcard (*), this should not happen."
    return 0
  }

  ### Check the channel
  foreach akey [split $urllog_log_channels ";"] {
    if {[string match $akey $uchan]} {
      ### Do the URL checking
      foreach str [split $utext " "] {
        if {[regexp "((ftp|http|https)://\[^\[:space:\]\]+|^(www|ftp)\\.\[^\[:space:\]\]+)" $str ulink]} {
          urllog_checkurl $str $unick $uhost $uchan
        }
      }
      return 0
    }
  }

  return 0
}
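urllog_checkmsg splits the message on spaces and tests each word against a URL pattern, then passes the whole word on. As an illustration only, the sketch below collects the matched substrings instead, so surrounding punctuation such as a leading "<" is dropped. The helper name urllog_extract_links is hypothetical; the pattern is the one used above, written in braces so that no backslash doubling is needed.

```tcl
# Hypothetical helper: return the list of URL-like substrings found in a
# message, one per space-separated word.
proc urllog_extract_links {utext} {
  set links {}
  foreach str [split $utext " "] {
    if {[regexp {((ftp|http|https)://[^[:space:]]+|^(www|ftp)\.[^[:space:]]+)} $str ulink]} {
      lappend links $ulink
    }
  }
  return $links
}
```

Whether to hand the matched substring or the full word to urllog_checkurl is a design choice; the script above passes the full word.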
619 | |
620 | |
621 #------------------------------------------------------------------------- | |
622 ### Parse arguments, find and show the results | |
623 proc urllog_find {unick uhand uchan utext upublic} { | |
62
6428b1bcb34b
urllog: Remove some global variable references where they are not used.
Matti Hamalainen <ccr@tnsp.org>
parents:
50
diff
changeset
|
624 global urllog_shorturl urldb |
28 | 625 global urllog_showmax_pub urllog_showmax_priv urlmsg_nomatch |
0 | 626 |
28 | 627 if {$upublic == 0} { |
628 set ulimit 5 | |
629 } else { | |
630 set ulimit 3 | |
631 } | |
19
9cf22053e5da
Repair !urlfind functionality.
Matti Hamalainen <ccr@tnsp.org>
parents:
18
diff
changeset
|
632 |
28 | 633 ### Parse the given command |
634 urllog_log "$unick/$uhand searched URL: $utext" | |
0 | 635 |
28 | 636 set ftokens [split $utext " "] |
637 set fpatlist "" | |
638 foreach ftoken $ftokens { | |
639 set fprefix [string range $ftoken 0 0] | |
640 set fpattern [string range $ftoken 1 end] | |
128
0d21b9d1d2b9
urllog: Improve search functionality.
Matti Hamalainen <ccr@tnsp.org>
parents:
127
diff
changeset
|
641 set qpattern "'%[urllog_escape $fpattern]%'" |
0 | 642 |
28 | 643 if {$fprefix == "-"} { |
128
0d21b9d1d2b9
urllog: Improve search functionality.
Matti Hamalainen <ccr@tnsp.org>
parents:
127
diff
changeset
|
644 lappend fpatlist "(url NOT LIKE $qpattern OR title NOT LIKE $qpattern)" |
28 | 645 } elseif {$fprefix == "%"} { |
128
0d21b9d1d2b9
urllog: Improve search functionality.
Matti Hamalainen <ccr@tnsp.org>
parents:
127
diff
changeset
|
646 lappend fpatlist "user LIKE $qpattern" |
28 | 647 } elseif {$fprefix == "@"} { |
648 # foo | |
112
fae3dd7a8b20
urllog: Oops, a typo in variable name. Fixed.
Matti Hamalainen <ccr@tnsp.org>
parents:
111
diff
changeset
|
649 } elseif {$fprefix == "+"} { |
128
0d21b9d1d2b9
urllog: Improve search functionality.
Matti Hamalainen <ccr@tnsp.org>
parents:
127
diff
changeset
|
650 lappend fpatlist "(url LIKE $qpattern OR title LIKE $qpattern)" |
28 | 651 } else { |
128
0d21b9d1d2b9
urllog: Improve search functionality.
Matti Hamalainen <ccr@tnsp.org>
parents:
127
diff
changeset
|
652 set qpattern "'%[urllog_escape $ftoken]%'" |
0d21b9d1d2b9
urllog: Improve search functionality.
Matti Hamalainen <ccr@tnsp.org>
parents:
127
diff
changeset
|
653 lappend fpatlist "(url LIKE $qpattern OR title LIKE $qpattern)" |
28 | 654 } |
655 } | |
19 | 656 |
27 | 657 if {[llength $fpatlist] > 0} { |
658 set fquery "WHERE [join $fpatlist " AND "]" |
659 } else { |
660 set fquery "" |
661 } |
68 | 662 |
28 | 663 set iresults 0 |
82 | 664 set usql "SELECT id AS uid, utime AS utime, url AS uurl, user AS uuser, host AS uhost FROM urls $fquery ORDER BY utime DESC LIMIT $ulimit" |
68 | 665 urldb eval $usql { |
28 | 666 incr iresults |
667 set shortURL $uurl | |
82 | 668 if {$urllog_shorturl != 0 && $uid != ""} { |
669 set shortURL "$shortURL [urllog_get_short $uid]" |
28 | 670 } |
671 urllog_msg $upublic $unick $uchan "#$iresults: $shortURL ($uuser@[urllog_ctime $utime])" | |
672 } | |
673 | |
674 if {$iresults == 0} { | |
675 # If no URLs were found | |
676 urllog_msg $upublic $unick $uchan $urlmsg_nomatch | |
677 } | |
0 | 678 |
28 | 679 return 0 |
0 | 680 } |
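
The token handling in `urllog_find` can be tried outside the bot with a small stand-alone sketch. Everything named `sketch_*` below is hypothetical and not part of URLLog. In particular, `sketch_escape` only assumes that `urllog_escape` doubles single quotes, and the `-` branch reflects this sketch's reading of exclusion (neither the URL nor the title may match). Empty tokens are skipped here, which the listing above does not do.

```tcl
# Hypothetical stand-in for urllog_escape (assumption: it doubles
# single quotes so the value can sit inside an SQL string literal).
proc sketch_escape {str} {
  return [string map {' ''} $str]
}

# Build a WHERE clause from a search string such as "foo -bar %nick".
# Prefixes: "-" exclude, "%" match user, "+" or none match url/title,
# "@" is accepted but ignored.
proc sketch_build_where {utext} {
  set fpatlist {}
  foreach ftoken [split $utext " "] {
    # Skip empty tokens produced by repeated spaces
    if {$ftoken eq ""} { continue }
    set fprefix [string index $ftoken 0]
    set qpattern "'%[sketch_escape [string range $ftoken 1 end]]%'"
    switch -exact -- $fprefix {
      "-" { lappend fpatlist "(url NOT LIKE $qpattern AND IFNULL(title, '') NOT LIKE $qpattern)" }
      "%" { lappend fpatlist "user LIKE $qpattern" }
      "@" { }
      "+" { lappend fpatlist "(url LIKE $qpattern OR title LIKE $qpattern)" }
      default {
        set qpattern "'%[sketch_escape $ftoken]%'"
        lappend fpatlist "(url LIKE $qpattern OR title LIKE $qpattern)"
      }
    }
  }
  if {[llength $fpatlist] > 0} {
    return "WHERE [join $fpatlist { AND }]"
  }
  return ""
}
```

Calling `sketch_build_where "%ccr"` returns `WHERE user LIKE '%ccr%'`, and an empty search string returns an empty clause, so the query falls back to listing the most recent URLs.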
681 | |
682 | |
683 #------------------------------------------------------------------------- | |
684 ### Finding bound functions | |
685 proc urllog_pub_urlfind {unick uhost uhand uchan utext} { | |
219 | 686 global urllog_search_channels |
687 |
688 foreach akey [split $urllog_search_channels ";"] { |
689 if {[string match $akey $uchan]} { |
690 return [urllog_find $unick $uhand $uchan $utext 1] |
691 } |
692 } |
28 | 693 return 0 |
0 | 694 } |
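
The channel check in `urllog_pub_urlfind` treats `urllog_search_channels` as a `;`-separated list of glob patterns. A minimal sketch of that matching, runnable in a plain `tclsh`, might look as follows. The proc name `sketch_channel_allowed` and the example channel names are hypothetical. The `-nocase` flag is an addition of this sketch (IRC channel names are case-insensitive); the listing above matches case-sensitively.

```tcl
# Return 1 if uchan matches any glob pattern in a ";"-separated list,
# otherwise 0. An empty list matches nothing.
proc sketch_channel_allowed {patterns uchan} {
  foreach akey [split $patterns ";"] {
    # string match does glob-style matching (*, ?, [chars])
    if {[string match -nocase $akey $uchan]} {
      return 1
    }
  }
  return 0
}
```

With the pattern list `"#foo;#bar*"`, the channel `#Bar-dev` is accepted and `#foobar` is rejected; a list of just `"*"` accepts every channel.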
695 | |
696 | |
697 proc urllog_msg_urlfind {unick uhost uhand utext} { | |
28 | 698 urllog_find $unick $uhand "" $utext 0 |
699 return 0 | |
3 | 700 } |
0 | 701 |
702 # end of script |