/* File : pcfpcmp.c Author : Richard A. O'Keefe Updated: 9/16/99 Purpose: Compare a file containing numbers against a model USAGE pcfpcmp Model ! This is a comment. It can be used for including version control information. It can also be used for joining long lines, e.g. abcde$! fghi matches the same instance text that abcdefghi does. $ This matches literal characters. The second character must be $, no matter what the character is at the time. The numeric argument is the number of characters to match; zero or omitted has the same effect as 1. = The numeric argument is the code of the character to use as the magic number. This is a bit clumsy, but it is not something the contestants do, and it only needs to be done if you want to use a different magic character. For some reason, the commercial "at" sign appears to be rare in programming contests, so $64=@! Set the magic character to '@' could be useful. c If c is 0, heed alphabetic case when matching. This is the initial state. If c is 1, ignore alphabetic case when matching, on a character-by-character basis. This means that æ will match either æ or Æ but not ae, Ae, or AE. Numbers other than 0 or 1 are reserved for specifying cleverer case handling. As an example, if "Can't read filename.txt" is to be matched, with the suspicion that the operating system might force letters to the wrong case (yes, ISO 10646 is quite explicit that the only case folding scheme that works is folding to _lower_ case, not the _upper_ case that DOS likes) before the program sees them, you might check for Can't read $1cfilename.txt$c s If the number is 0, match zero or more spaces, as many as possible. If the number is not zero, match exactly that many spaces. # Many of the New Zealand and South Pacific programming contest problems use '#' to indicate end of file, rather than rely on language-dependent end of file detection schemes. The rules for the New Zealand programming contest allow any number of extra lines of output, as long as they are completely blank (contain only space characters). # is meant for checking that. If the number is 0, it matches zero or more spaces and newlines. if the number is n > 0, it matches any number of spaces, stopping when the nth newline is matched. If there are not that many newlines, it does not match. Note that this pattern doesn't _have_ to appear at the end of the model; if you want to allow a block of blank lines at any point, you can include this. Beware that it will swallow spaces at the start of the next line too unless you use a non-zero count. matches floating point numbers. If the numeric argument is 0, it matches as many characters as it can consistent with the pattern [ ]*[-+]?[_0-9]*([.][_0-9]*)?([Ee][-+]?[_0-9]*) and then insists that there must be at least one digit in the scale factor (if there is one), and at least one digit ignoring the scale factor. If the numeric argument is not 0, it matches exactly that many characters, as long as they match the pattern [ ]*[-+]?[_0-9]*([.][_0-9]*)?([Ee][-+]?[_0-9]*)[ ]* and have enough digits. In either case, the characters must represent a floating point number that is in the specified interval. An interval can be [ ] {x | a <= x <= z} [ ) {x | a <= x < z} ( ) {x | a < x < z} ( ] {x | a < x <= z} where a range can be L,U a = L, z = U M*R a = min(M/R,M*R), z = max(M/R,M*R) M+D a = M-D, z = M+D At all times we must have R > 0, D >= 0. Each of the numbers L (initially 0), U (initially 1), M (initially 1), R (initially 1), D (initially 0) may be omitted, in which case it retains the value it had last time. The initial values may be set using l u m r d which set the lower bound, upper bound, mid-point, ratio, and difference respectively. For these commands the numbers may be floating point numbers; the other commands forbid signs, decimal points, and scale factors. The numbers here and in a range use the same syntax as the pattern matches. For example, to set a relative error of one part in ten thousand, and then match the numbers 1, 2, -3, 4.5, you might do $1e-4r$[1*]$[2*]$[-3*]$[4.5*] I had intended to allow Common-Lisp-style D, F, S, and L exponent letters, which would also be nice for Fortran, but that conflicted with the `l' and `d' commands here. Nor do we accept the floating point suffices of ISO C. FUTURE DIRECTIONS Implement our own `bigfloat' package with sign {-1,0,+1} scale factor {-LONG_MAX..+LONG_MAX} NDIGITS decimal digits such that sign = 0 => scale factor = 0 and all digits 0, sign != 0 => first digit != 0. The operations needed are - reading - multiplying - adding - subtracting - comparing (less than or equal to will do nicely) We don't need division, which is fortunate. The point of this is to ensure exactly the same results on all platforms. For this year, the tolerances we set should take the fuzziness of this program's calculations into account too. As one example, 0.49999999999999999999 was accepted as being in the interval [0.5,2.0], which it clearly is not. Exact calculations are the only way to prevent that. RETURNS The program may write one of three messages to standard output. accepted The instance definitely matches the model. In this case no messages will be written to stderr. Status = EX_OK. rejected The instance definitely does not match the model. In this case an explanatory message will be written to stderr showing the location of the first mismatch. Status = EX_DATAERR. unknown Thanks to a bad command line, missing or unreadable file, operating system problem, bad model file, or internal error, the program could not determine whether the instance matches the model or not. In this case an explanatory message will be written to stderr describing the nature of the problem and giving file locations, where relevant. Status = EX_USAGE (bad command line) | EX_NOINPUT (some input file can't be read) | EX_RESOURCE (insufficient memory) | EX_IOERR (operating system reading error) | EX_OSFILE (the model file is malformed) | EX_SOFTWARE (some internal error; a BUG) The status codes are normal UNIX status codes; you will only get them if you compile the program with -Dunix. Non-UNIX systems will use EXIT_SUCCESS for "accepted" and EXIT_FAILURE for all "rejected" and "unknown" cases. */ #include #include #include #include #include #include /* Exit status codes taken from */ #define EX_OK 0 /* successful termination */ #define EX_USAGE 64 /* command line usage error */ #define EX_DATAERR 65 /* Instance data bad */ #define EX_NOINPUT 66 /* cannot open Model or Instance */ #define EX_SOFTWARE 70 /* internal software error */ #define EX_OSFILE 72 /* Model data bad */ #define EX_IOERR 74 /* input/output error */ #define EX_RESOURCE 79 /* resource limit hit */ void my_exit(int r) { switch (r) { case EX_OK: printf("accepted\n"); break; case EX_DATAERR: printf("rejected\n"); break; default: r = EX_SOFTWARE; case EX_SOFTWARE: case EX_OSFILE: case EX_NOINPUT: case EX_USAGE: case EX_RESOURCE: case EX_IOERR: printf("unknown\n"); break; } #ifdef unix exit(r); #else exit(r == EX_OK ? EXIT_SUCCESS : EXIT_FAILURE); #endif } void my_assert(int c, char const *name, int line) { if (!c) { fprintf(stderr, "%s line %d: internal error\n", name, line); my_exit(EX_SOFTWARE); } } #ifdef NDEBUG #define assert(c) (void)0 #else #define assert(c) my_assert((c), __FILE__, __LINE__) #endif #define INITIAL_MAGIC '$' #define INITIAL_LOWER 0.0 #define INITIAL_UPPER 1.0 #define INITIAL_MIDPT 1.0 #define INITIAL_RATIO 1.0 #define INITIAL_DELTA 0.0 struct Source { FILE *stream; /* where to get characters */ char const *name; /* what the file was called */ long line; /* what line it is at */ int col; /* what column it is at */ int which; /* which of several internal files is it? */ }; typedef struct Source SOURCE; void show_source( SOURCE *source, int ch ) { fprintf(stderr, "%s line %ld col %d%c ", source->name, source->line, source->col, ch); } SOURCE *open_source( char const *name, int which ) { SOURCE *result; assert(name != 0); result = (SOURCE *)malloc(sizeof *result); if (result == 0) { fprintf(stderr, "Can't allocate enough memory\n"); my_exit(EX_RESOURCE); } if (0 == strcmp(name, "-")) { result->stream = stdin; } else { result->stream = fopen(name, "r"); if (result->stream == 0) { fprintf(stderr, "Can't open '%s' for input\n", name); my_exit(EX_NOINPUT); } } result->name = name; assert(result != 0 && result->stream != 0 && result->name != 0); result->line = 1; result->col = 1; result->which = which; return result; } void close_source( SOURCE *source ) { assert(source != 0); if (source->stream != stdin) fclose(source->stream); free(source); } void unread_char( SOURCE *source, int c ) { assert(source != 0); ungetc(c, source->stream); if (c == '\n') source->line--; } int read_char( SOURCE *source ) { int c; assert(source != 0); c = getc(source->stream); if (c < 0 && !feof(source->stream)) { show_source(source, ':'); fprintf(stderr, "I/O system error\n"); my_exit(EX_IOERR); } if (c == '\n') { source->line++; source->col = 1; } else { source->col++; } return c; } int asis(int c) { assert(0 <= c && c <= (int)UCHAR_MAX); return c; } int aslc(int c) { assert(0 <= c && c <= (int)UCHAR_MAX); c = tolower(c); assert(0 <= c && c <= (int)UCHAR_MAX); return c; } #define MAX_DIGITS 40 struct NumBuf { SOURCE source; /* where it came from */ char has_sign; /* was there a leading + or -? */ char has_point; /* was there a decimal point? */ char has_scale; /* was there an E scaling factor? */ char was_empty; /* nothing but blanks */ char nodig_sig; /* no digit in significand? */ char nodig_exp; /* no digit in exponent? */ char buffer[sizeof "-.E-99999" + MAX_DIGITS]; }; /* intval is only called to read model integers */ /* which may be omitted, so we ignore nodig_sig */ int intval(struct NumBuf *p, int l, int u) { char *end; long r; errno = 0; r = strtol(p->buffer, &end, 10); if (errno != 0 || *end != '\0' || p->has_sign || (p->nodig_sig && !p->was_empty) ) { show_source(&p->source, ':'); fprintf(stderr, "bad integer format: %s\n", p->buffer); my_exit(p->source.which ? EX_DATAERR : EX_OSFILE); } if (r < l || r > u) { show_source(&p->source, ':'); fprintf(stderr, "integer %s not in range %d..%d\n", p->buffer, l, u); my_exit(p->source.which ? EX_DATAERR : EX_OSFILE); } assert(INT_MIN <= l && l <= r && r <= u && u <= INT_MAX); return (int)r; } double dblval(struct NumBuf *p, int bound) { /* bound < 0 => all ok, = 0 => must be >= 0, > 0 => must be > 0 */ char *end; double r; errno = 0; r = strtod(p->buffer, &end); if (errno != 0 || *end != '\0' || p->nodig_sig || p->nodig_exp) { show_source(&p->source, ':'); fprintf(stderr, "bad number format: %s\n", p->buffer); my_exit(p->source.which ? EX_DATAERR : EX_OSFILE); } if ((bound > 0 && r <= 0) || (bound >= 0 && r < 0)) { show_source(&p->source, ':'); fprintf(stderr, "number %s not %s 0\n", p->buffer, (bound > 0 ? ">" : ">=")); my_exit(p->source.which ? EX_DATAERR : EX_OSFILE); } return r; } void read_number( SOURCE *source, /* where to read from */ struct NumBuf *p, /* where to put the answer */ int leadok, /* leading blanks ok? */ int trailok, /* trailing blanks ok? */ int width /* how many characters to read */ ) { char *d; int c; p->source = *source; p->has_sign = p->has_point = p->has_scale = 0; p->nodig_sig = p->nodig_exp = p->was_empty = 1; d = p->buffer; c = read_char(source); if (leadok) { while (c == ' ') { if (--width == 0) goto quit; c = read_char(source); } } if (c == '-') { p->has_sign = 1; p->was_empty = 0; *d++ = c; if (--width == 0) goto quit; c = read_char(source); } else if (c == '+') { p->has_sign = 1; p->was_empty = 0; if (--width == 0) goto quit; c = read_char(source); } for (;;) { if (isdigit(c)) { p->nodig_sig = 0; *d++ = c; } else if (c != '_') { break; } p->was_empty = 0; if (--width == 0) goto quit; c = read_char(source); } if (p->nodig_sig) *d++ = '0'; if (c == '.') { p->has_point = 1; p->was_empty = 0; *d++ = c; if (--width == 0) goto quit; c = read_char(source); for (;;) { if (isdigit(c)) { p->nodig_sig = 0; *d++ = c; } else if (c != '_') { break; } if (--width == 0) goto quit; c = read_char(source); } if (!isdigit(*(unsigned char const *)(d-1))) *d++ = '0'; } if (c != 'E' && c != 'e') { p->nodig_exp = 0; } else { p->has_scale = 1; p->was_empty = 0; *d++ = 'E'; if (--width == 0) goto quit; c = read_char(source); if (c == '-') { p->has_sign = 1; *d++ = c; if (--width == 0) goto quit; c = read_char(source); } else if (c == '+') { p->has_sign = 1; if (--width == 0) goto quit; c = read_char(source); } for (;;) { if (isdigit(c)) { p->nodig_exp = 0; *d++ = c; } else if (c != '_') { break; } if (--width == 0) goto quit; c = read_char(source); } if (p->nodig_exp) *d++ = '0'; } if (trailok) { while (c == ' ') { if (--width == 0) goto quit; c = read_char(source); } } quit: unread_char(source, c); *d = '\0'; } int check( char const *model_file_name, char const *instance_file_name ) { SOURCE *model; SOURCE *instance; int magic = INITIAL_MAGIC; double lower = INITIAL_LOWER; double upper = INITIAL_UPPER; double midpt = INITIAL_MIDPT; double ratio = INITIAL_RATIO; double delta = INITIAL_DELTA; double a, z, t; int left_strict, right_strict; int (*shift)(int) = asis; int w; /* width */ int c, d; /* characters, usually */ struct NumBuf num; assert(model_file_name != 0 && instance_file_name != 0); if (0 == strcmp(instance_file_name, "-") && 0 == strcmp(model_file_name, "-") ) { fprintf(stderr, "model and instance are both '-'\n"); my_exit(EX_USAGE); } model = open_source(model_file_name, 0); instance = open_source(instance_file_name, 1); while ((c = read_char(model)) >= 0) { if (c != magic) { d = read_char(instance); if (d < 0) { show_source(model, ','); show_source(instance, ':'); fprintf(stderr, "Instance ran out\n"); goto fail; } if (shift(d) != shift(c)) { show_source(model, ','); show_source(instance, ':'); fprintf(stderr, "Model has %c (%d) " "but instance has %c (%d)\n", c, c, d, d); goto fail; } } else { read_number(model, &num, 0, 0, 0); c = read_char(model); switch (c) { case '$': w = intval(&num, 0, INT_MAX); if (w == 0) w = 1; c = magic; while (w-- != 0) { d = read_char(instance); if (d < 0) { show_source(model, ','); show_source(instance, ':'); fprintf(stderr, "Instance ran out\n"); goto fail; } if (shift(d) != shift(c)) { show_source(model, ','); show_source(instance, ':'); fprintf(stderr, "Model has %c (%d) " "but instance has %c (%d)\n", c, c, d, d); goto fail; } } break; case '=': magic = intval(&num, ' '+1, UCHAR_MAX); break; case '!': do { c = read_char(model); } while (c >= 0 && c != '\n'); break; case 'c': case 'C': shift = intval(&num, 0, 1) ? aslc : asis; break; case 'l': case 'L': lower = dblval(&num, -1); break; case 'u': case 'U': upper = dblval(&num, -1); break; case 'm': case 'M': midpt = dblval(&num, -1); break; case 'r': case 'R': ratio = dblval(&num, 1); break; case 'd': case 'D': delta = dblval(&num, 0); break; case 's': case 'S': case ' ': w = intval(&num, 0, INT_MAX); if (w == 0) { do { d = read_char(instance); } while (d == ' '); unread_char(instance, d); } else { do { d = read_char(instance); if (d != ' ') { show_source(model, ','); show_source(instance, ':'); fprintf(stderr, "missing space\n"); goto fail; } w--; } while (w > 0); } break; case 'e': case 'E': case '#': w = intval(&num, 0, INT_MAX); if (w == 0) { do { d = read_char(instance); } while (d == ' ' || d == '\n'); unread_char(instance, d); } else { do { d = read_char(instance); if (d == '\n') { w--; } else if (d != ' ') { show_source(model, ','); show_source(instance, ':'); fprintf(stderr, "missing blank lines\n"); goto fail; } } while (w > 0); } break; case '[': case '(': w = intval(&num, 0, INT_MAX); left_strict = c != '['; read_number(model, &num, 1, 1, 0); c = read_char(model); switch (c) { case ',': if (!num.was_empty) lower = dblval(&num, -1); a = lower; read_number(model, &num, 1, 1, 0); if (!num.was_empty) upper = dblval(&num, -1); z = upper; break; case '*': if (!num.was_empty) midpt = dblval(&num, -1); a = midpt; read_number(model, &num, 1, 1, 0); if (!num.was_empty) ratio = dblval(&num, 1); z = a*ratio; a = a/ratio; break; case '+': if (!num.was_empty) midpt = dblval(&num, -1); a = midpt; read_number(model, &num, 1, 1, 0); if (!num.was_empty) delta = dblval(&num, 0); z = a+delta; a = a-delta; break; default: show_source(model, ':'); fprintf(stderr, "bad interval %c%s%c...\n", "[("[left_strict], num.buffer, c); goto error; } c = read_char(model); switch (c) { case ']': right_strict = 0; break; case ')': right_strict = 1; break; default: show_source(model, ':'); fprintf(stderr, "bad interval %c...%s%c\n", "[("[left_strict], num.buffer, c); goto error; } if (a > z) { t = a; a = z; z = t; } read_number(instance, &num, 1, 0, w); t = dblval(&num, -1); if (!( (left_strict ? a < t : a <= t) && (right_strict ? t < z : t <= z) )) { show_source(model, ','); show_source(instance, ':'); fprintf(stderr, "%s not in interval %c%g,%g%c\n", num.buffer, "[("[left_strict], a, z, "])"[right_strict]); goto fail; } break; default: show_source(model, ':'); fprintf(stderr, "bad magic sequence %c%c\n", magic, c); goto error; } } } d = read_char(instance); if (d >= 0) { show_source(model, ','); show_source(instance, ':'); fprintf(stderr, "Model ran out but instance didn't\n"); goto fail; } close_source(model); close_source(instance); return EX_OK; fail: close_source(model); close_source(instance); return EX_DATAERR; error: close_source(model); close_source(instance); return EX_OSFILE; } int main(int argc, char **argv) { int r; switch (argc) { case 2: r = check(argv[1], "-"); break; case 3: r = check(argv[1], argv[2]); break; default: fprintf(stderr, "Usage: pcfpcmp Model {Instance |