# HG changeset patch
# User Matti Hamalainen
# Date 1350118706 -10800
# Node ID 9b5fa0f3812b08c1e5a31c7cda2ba874a67870a1
# Parent  d8a1e85b8dda48e70641f53270e9566976709f1b
Improvements in the update logic.

diff -r d8a1e85b8dda -r 9b5fa0f3812b update.sh
--- a/update.sh	Mon Sep 24 16:45:40 2012 +0300
+++ b/update.sh	Sat Oct 13 11:58:26 2012 +0300
@@ -3,6 +3,15 @@
 
 function parse()
 {
+# Create cache directories, if they do not exist
+if test ! -d "${OLDCACHEDIR}"; then
+	mkdir -p "${OLDCACHEDIR}"
+fi
+
+if test ! -d "${CACHEDIR}"; then
+	mkdir -p "${CACHEDIR}"
+fi
+
 URLPREFIX="$1"
 CLASSFILE="$2"
 LISTFILE="$2.tmp"
@@ -18,47 +27,52 @@
 	cat "$CLASSFILE" | while read i; do
 		parse=no
 		INFILE="${CACHEDIR}${i}.html"
-		wget -q -O "$INFILE.new" "${URLPREFIX}${PATPREFIX}${i}${URLSUFFIX}"
+		ONFILE="${OLDCACHEDIR}${i}.html"
+		DATAFILE="${i}.data"
+		wget -q -O "${INFILE}.new" "${URLPREFIX}${PATPREFIX}${i}${URLSUFFIX}"
 
-		if test -e "$INFILE.new"; then
+		if test -e "${INFILE}.new"; then
 			# New data fetched, does old file exist?
 			if test -e "$INFILE"; then
 				# Yes, do a diff
 				if ! diff -u "$INFILE" "$INFILE.new" > "$INFILE.diff"; then
 					# There were differences, do a parse
 					parse=yes
-					mv "$INFILE" "$INFILE.old"
+					mv "$INFILE" "$ONFILE" && \
 					mv "$INFILE.new" "$INFILE"
+				else
+					# No changes, apparently .. remove the new one
+					rm -f "$INFILE.new" "$INFILE.diff"
 				fi
 			else
 				# No old file, parse new data
 				mv "$INFILE.new" "$INFILE"
 				parse=yes
 			fi
-		else
-			# No new file fetched, does datafile exist?
-			if test ! -e "$i.data"; then
-				# No, try to parse it if old file input exists
-				parse=yes
-			fi
+		fi
+		# No new file fetched, does datafile exist?
+		if test ! -e "${CACHEDIR}${DATAFILE}"; then
+			# No, try to parse it if old file input exists
+			parse=yes
 		fi
 
 		# Parsing of old data requested?
 		if test "x$parse" = "xyes" -a -e "$INFILE"; then
-			OUTFILE="${CACHEDIR}/$i.data"
-			if test -e "$OUTFILE"; then
-				mv "$OUTFILE" "$OUTFILE.old"
+			if test -e "${CACHEDIR}${DATAFILE}"; then
+				mv "${CACHEDIR}${DATAFILE}" "${OLDCACHEDIR}${DATAFILE}"
 			fi
 			echo "Parsing $i"
-			perl parsedata.pl -php "$INFILE" -o "$OUTFILE"
+			perl parsedata.pl -php "$INFILE" -o "${CACHEDIR}${DATAFILE}"
 		fi
 	done
 fi
 }
 
 CACHEDIR="cache/"
-#parse "http://www.oamk.fi/tyojarjestykset/otek/luokat/" "luokat.txt" "OR_"
-parse "http://www.oamk.fi/~heikkim/riihi1/luokat/" "luokat.txt" "Ryh._"
+OLDCACHEDIR="cache-old/"
+parse "http://www.oamk.fi/tyojarjestykset/otek/luokat/" "luokat.txt" "OR_"
+#parse "http://www.oamk.fi/~heikkim/riihi2/Oppilaat/" "luokat.txt" "Ryh._"
 
 CACHEDIR="cache-next/"
-parse "http://www.oamk.fi/~heikkim/riihi2/luokat/" "luokat_next.txt" "Ryh._"
+OLDCACHEDIR="cache-next-old/"
+parse "http://www.oamk.fi/~heikkim/riihi2/Oppilaat/" "luokat_next.txt" "Ryh._"