annotate update.sh @ 160:a84b40bc2a99

Add parselist.pl utility and use it in update script.
author Matti Hamalainen <ccr@tnsp.org>
date Mon, 17 Aug 2015 22:02:22 +0300
parents e97705171c3c
children b4a07ea2d739
rev   line source
6
7fca87c41e17 Added data fetching and updating shellscript.
Matti Hamalainen <ccr@tnsp.org>
parents:
diff changeset
1 #!/bin/sh
7fca87c41e17 Added data fetching and updating shellscript.
Matti Hamalainen <ccr@tnsp.org>
parents:
diff changeset
2 URLSUFFIX=".htm"
7fca87c41e17 Added data fetching and updating shellscript.
Matti Hamalainen <ccr@tnsp.org>
parents:
diff changeset
3
151
ca012374190c Fix update script to work with dash/non-bash shells.
Matti Hamalainen <ccr@tnsp.org>
parents: 134
diff changeset
4 createdir()
76
d1b65d9903ab Clean up the update script slightly.
Matti Hamalainen <ccr@tnsp.org>
parents: 75
diff changeset
5 {
125
22872b46eee9 Cleanup.
Matti Hamalainen <ccr@tnsp.org>
parents: 124
diff changeset
6 test ! -d "$1" && mkdir -p "$1" && chmod 751 "$1"
76
d1b65d9903ab Clean up the update script slightly.
Matti Hamalainen <ccr@tnsp.org>
parents: 75
diff changeset
7 }
d1b65d9903ab Clean up the update script slightly.
Matti Hamalainen <ccr@tnsp.org>
parents: 75
diff changeset
8
125
22872b46eee9 Cleanup.
Matti Hamalainen <ccr@tnsp.org>
parents: 124
diff changeset
9
151
ca012374190c Fix update script to work with dash/non-bash shells.
Matti Hamalainen <ccr@tnsp.org>
parents: 134
diff changeset
10 fetch()
101
891bd3c93f96 Modularize update script and fetch list of classes data separately from the
Matti Hamalainen <ccr@tnsp.org>
parents: 97
diff changeset
11 {
124
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
12 URLPREFIX="$1"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
13 CLASSFILE="$2"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
14 LISTFILE="$2.tmp"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
15 PATPREFIX="$3"
101
891bd3c93f96 Modularize update script and fetch list of classes data separately from the
Matti Hamalainen <ccr@tnsp.org>
parents: 97
diff changeset
16
124
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
17 if wget -q -O "$LISTFILE" "$URLPREFIX"; then
160
a84b40bc2a99 Add parselist.pl utility and use it in update script.
Matti Hamalainen <ccr@tnsp.org>
parents: 157
diff changeset
18 perl -w ./parselist.pl "$PATPREFIX" "$URLSUFFIX" < "$LISTFILE" > "$CLASSFILE"
152
4085ea7aa7a6 Few minor improvements to the update script.
Matti Hamalainen <ccr@tnsp.org>
parents: 151
diff changeset
19 rm -f "$LISTFILE"
124
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
20 printf '* Fetched classfile %s: ' "$CLASSFILE"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
21 wc -l < "$CLASSFILE"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
22 fi
101
891bd3c93f96 Modularize update script and fetch list of classes data separately from the
Matti Hamalainen <ccr@tnsp.org>
parents: 97
diff changeset
23 }
891bd3c93f96 Modularize update script and fetch list of classes data separately from the
Matti Hamalainen <ccr@tnsp.org>
parents: 97
diff changeset
24
891bd3c93f96 Modularize update script and fetch list of classes data separately from the
Matti Hamalainen <ccr@tnsp.org>
parents: 97
diff changeset
25
151
ca012374190c Fix update script to work with dash/non-bash shells.
Matti Hamalainen <ccr@tnsp.org>
parents: 134
diff changeset
26 parse()
24
1b8260151e99 Get updates / data from upcoming work-in-progress timetables for next period also.
Matti Hamalainen <ccr@tnsp.org>
parents: 22
diff changeset
27 {
124
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
28 # Create cache directories, if they do not exist
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
29 OLDCACHEDIR="${CACHEDIR}old/"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
30 createdir "${OLDCACHEDIR}"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
31 createdir "${CACHEDIR}"
46
9b5fa0f3812b Improvements in the update logic.
Matti Hamalainen <ccr@tnsp.org>
parents: 42
diff changeset
32
124
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
33 URLPREFIX="$1"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
34 CLASSFILE="$2"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
35 PATPREFIX="$3"
6
7fca87c41e17 Added data fetching and updating shellscript.
Matti Hamalainen <ccr@tnsp.org>
parents:
diff changeset
36
124
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
37 echo "<? \$origBaseURI = \"${URLPREFIX}${PATPREFIX}\"; \$origBaseExt = \"${URLSUFFIX}\"; ?>" > "${CACHEDIR}baseuri.data"
82
c553ad61e9c2 Make original data links work for both current and next period mode.
Matti Hamalainen <ccr@tnsp.org>
parents: 76
diff changeset
38
124
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
39 test -e "$CLASSFILE" || return 0
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
40
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
41 echo "* Processing $CLASSFILE ..."
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
42 cat "$CLASSFILE" | while read -r i; do
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
43 parse=no
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
44 INFILE="${CACHEDIR}${i}.html"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
45 ONFILE="${OLDCACHEDIR}${i}.html"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
46 DATAFILE="${i}.data"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
47 wget -q -O "${INFILE}.new" "${URLPREFIX}${PATPREFIX}${i}${URLSUFFIX}" || rm -f "${INFILE}.new"
6
7fca87c41e17 Added data fetching and updating shellscript.
Matti Hamalainen <ccr@tnsp.org>
parents:
diff changeset
48
124
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
49 if test -e "${INFILE}.new"; then
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
50 # New data fetched, does old file exist?
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
51 if test -e "$INFILE"; then
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
52 # Yes, do a diff
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
53 if ! diff -u "$INFILE" "$INFILE.new" > "$INFILE.diff"; then
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
54 # There were differences, do a parse
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
55 parse=yes
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
56 mv "$INFILE" "$ONFILE" && \
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
57 mv "$INFILE.new" "$INFILE"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
58 else
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
59 # No changes, apparently .. remove the new one
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
60 rm -f "$INFILE.new" "$INFILE.diff"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
61 fi
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
62 else
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
63 # No old file, parse new data
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
64 mv "$INFILE.new" "$INFILE"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
65 parse=yes
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
66 fi
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
67 fi
6
7fca87c41e17 Added data fetching and updating shellscript.
Matti Hamalainen <ccr@tnsp.org>
parents:
diff changeset
68
124
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
69 # No new file fetched, does datafile exist?
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
70 if test ! -e "${CACHEDIR}${DATAFILE}"; then
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
71 # No, try to parse it if old file input exists
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
72 parse=yes
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
73 fi
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
74
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
75 # Parsing of old data requested?
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
76 if test "x$parse" = "xyes" && test -e "$INFILE"; then
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
77 if test -e "${CACHEDIR}${DATAFILE}"; then
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
78 mv "${CACHEDIR}${DATAFILE}" "${OLDCACHEDIR}${DATAFILE}"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
79 fi
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
80 echo "Parsing $i"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
81 perl parsedata.pl -php "$INFILE" -o "${CACHEDIR}${DATAFILE}"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
82 #perl parsedata.pl -xml "$INFILE" -o "${CACHEDIR}${i}.xml"
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
83 fi
4547a239b0fd Indentation cleanup, use spaces instead of tabs.
Matti Hamalainen <ccr@tnsp.org>
parents: 122
diff changeset
84 done
24
1b8260151e99 Get updates / data from upcoming work-in-progress timetables for next period also.
Matti Hamalainen <ccr@tnsp.org>
parents: 22
diff changeset
85 }
6
7fca87c41e17 Added data fetching and updating shellscript.
Matti Hamalainen <ccr@tnsp.org>
parents:
diff changeset
86
152
4085ea7aa7a6 Few minor improvements to the update script.
Matti Hamalainen <ccr@tnsp.org>
parents: 151
diff changeset
87 fix_permissions()
4085ea7aa7a6 Few minor improvements to the update script.
Matti Hamalainen <ccr@tnsp.org>
parents: 151
diff changeset
88 {
4085ea7aa7a6 Few minor improvements to the update script.
Matti Hamalainen <ccr@tnsp.org>
parents: 151
diff changeset
89 chmod 751 . && chmod 644 *.php *.css && chmod 600 coursecache.txt classes.txt classes_next.txt
4085ea7aa7a6 Few minor improvements to the update script.
Matti Hamalainen <ccr@tnsp.org>
parents: 151
diff changeset
90 }
76
d1b65d9903ab Clean up the update script slightly.
Matti Hamalainen <ccr@tnsp.org>
parents: 75
diff changeset
91
152
4085ea7aa7a6 Few minor improvements to the update script.
Matti Hamalainen <ccr@tnsp.org>
parents: 151
diff changeset
92 fix_permissions
35
4d9354abda73 Update fetching URLs and parameters to match the latest changes on OAMK's web.
Matti Hamalainen <ccr@tnsp.org>
parents: 34
diff changeset
93 CACHEDIR="cache/"
134
6f7a2f9dcad4 Fix some errors due to undefined variables in case the class data is not
Matti Hamalainen <ccr@tnsp.org>
parents: 125
diff changeset
94 fetch "http://www.oamk.fi/~heikkim/riihi1/luokat/" "classes.txt" "Ryh._"
6f7a2f9dcad4 Fix some errors due to undefined variables in case the class data is not
Matti Hamalainen <ccr@tnsp.org>
parents: 125
diff changeset
95 parse "http://www.oamk.fi/~heikkim/riihi1/luokat/" "classes.txt" "Ryh._"
6f7a2f9dcad4 Fix some errors due to undefined variables in case the class data is not
Matti Hamalainen <ccr@tnsp.org>
parents: 125
diff changeset
96 #parse "http://www.oamk.fi/tyojarjestykset/otek/luokat/" "classes.txt" "OR_"
29
ac51fc10414f Add support for URL prefix pattern in updates.
Matti Hamalainen <ccr@tnsp.org>
parents: 24
diff changeset
97
122
8cbd07999b66 Update to latest upcoming.
Matti Hamalainen <ccr@tnsp.org>
parents: 116
diff changeset
98
31
dbe7ff545293 Add support for fetching and showing data for next/upcoming period.
Matti Hamalainen <ccr@tnsp.org>
parents: 29
diff changeset
99 CACHEDIR="cache-next/"
134
6f7a2f9dcad4 Fix some errors due to undefined variables in case the class data is not
Matti Hamalainen <ccr@tnsp.org>
parents: 125
diff changeset
100 #fetch "http://www.oamk.fi/~heikkim/riihi1/luokat/" "classes.txt" "Ryh._"
6f7a2f9dcad4 Fix some errors due to undefined variables in case the class data is not
Matti Hamalainen <ccr@tnsp.org>
parents: 125
diff changeset
101 #parse "http://www.oamk.fi/~heikkim/riihi1/luokat/" "classes.txt" "Ryh._"
102
bf7a9f63bd82 Add some comments to the update script.
Matti Hamalainen <ccr@tnsp.org>
parents: 101
diff changeset
102
152
4085ea7aa7a6 Few minor improvements to the update script.
Matti Hamalainen <ccr@tnsp.org>
parents: 151
diff changeset
103 fix_permissions
157
e97705171c3c Show diff of class files.
Matti Hamalainen <ccr@tnsp.org>
parents: 152
diff changeset
104 diff -u "classes.txt" "classes_next.txt"
102
bf7a9f63bd82 Add some comments to the update script.
Matti Hamalainen <ccr@tnsp.org>
parents: 101
diff changeset
105
bf7a9f63bd82 Add some comments to the update script.
Matti Hamalainen <ccr@tnsp.org>
parents: 101
diff changeset
106 # http://www.oamk.fi/~heikkim/riihi[1-5]/
bf7a9f63bd82 Add some comments to the update script.
Matti Hamalainen <ccr@tnsp.org>
parents: 101
diff changeset
107 # http://www.oamk.fi/~heikkim/Luhti[1-5]/
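
The cache handling inside parse() (fetch to a ".new" file, compare with the cached copy, move the previous copy to the old cache directory, then decide whether to re-parse) can be sketched as a standalone function. This is only an illustration under stated assumptions: update_cached and its three arguments are hypothetical names that do not appear in update.sh, cmp -s is swapped in for the script's diff -u (so no ".diff" file is kept), and an empty download is treated as a failed fetch.

```shell
#!/bin/sh
# Hypothetical helper, not part of update.sh.
# Installs a freshly fetched file into the cache and keeps the
# previous copy in a separate directory.
# Exit status: 0 = cache changed (caller should re-parse),
#              1 = nothing to do, 2 = rotation of the old copy failed.
update_cached()
{
    new="$1"      # freshly downloaded file, e.g. cache/X.html.new
    cur="$2"      # current cached copy, e.g. cache/X.html
    olddir="$3"   # directory that receives the previous copy

    # A missing or empty download counts as a failed fetch.
    if test ! -s "$new"; then
        rm -f "$new"
        return 1
    fi

    if test -e "$cur"; then
        if cmp -s "$cur" "$new"; then
            # Identical content: discard the download.
            rm -f "$new"
            return 1
        fi
        # Content differs: move the current copy out of the way first.
        mkdir -p "$olddir" && mv "$cur" "$olddir/" || return 2
    fi

    # Either there was no cached copy or it has been rotated out.
    mv "$new" "$cur"
}
```

A caller could then write `if update_cached "$INFILE.new" "$INFILE" "$OLDCACHEDIR"; then ... fi` and run the parser only inside the branch.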