Handling overinput with scanf()

3419 views c
-3

With scanf(), I would like to make sure that nothing beyond what the format specifies is entered. My code:

if (scanf("< %ld ; %ld > %c", &lo, &hi, &control) == 3)
    {

    }

I am scanning input of the form <0;100> s, where <0;100> represents an interval, and I would like to make sure no one can enter something like <0;100> sasdf. In this case scanf() reports success, because it performed 3 successful conversions and did not signal an error. I really only want to accept that one char after the interval. How can I achieve this?

Not directly related to your question, but if you are going to read directly from stdin with scanf, you might also want to put a space before the first < to consume the newline left in the input by any previous attempt.
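To illustrate the point about the leading space, here is a small sketch that runs both formats over the same text with sscanf(). The helper name scan_interval and its skip_leading_ws flag are made up for this example; the input starts with a newline to mimic what a previous read would leave behind.

```c
#include <stdio.h>

/* Hypothetical helper: returns the number of conversions sscanf() performed.
   skip_leading_ws selects the format with or without the leading space. */
int scan_interval(const char *input, int skip_leading_ws)
{
    long lo = 0, hi = 0;
    char control = 0;

    if (skip_leading_ws) {
        /* The leading space skips any whitespace, including a stray newline. */
        return sscanf(input, " < %ld ; %ld > %c", &lo, &hi, &control);
    }
    /* A literal '<' does not skip whitespace, so a leading newline is a mismatch. */
    return sscanf(input, "< %ld ; %ld > %c", &lo, &hi, &control);
}
```

Called on "\n<0;100> s", the version without the leading space stops at the newline and converts nothing, while the version with it converts all three fields.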

2 Answers

10

A scanset can be used to capture any extra input that follows the single character after the closing >. If the next character is a newline, the scanset fails and scanf returns 3, the number of values scanned. If there is anything other than a newline, scanf returns 4.

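#include <stdio.h>

int main ( void){
    char extra[100] = "";
    char control = 0;
    long int lo = 0;
    long int hi = 0;
    int result = 0;
    int ch = 0;

    do {
        printf("Please enter <x;y> s " );
        result = scanf(" <%ld ;%ld > %c%99[^\n]", &lo, &hi, &control, extra);
        if ( EOF == result) {
            return 0;
        }
        if ( result != 3) {
            /* discard the rest of the rejected line; otherwise the same
               unmatched input would make scanf fail again on every pass */
            while ( ( ch = getchar ( )) != '\n' && ch != EOF) {
            }
        }
    } while ( result != 3);

    printf("lo=%ld hi=%ld control=%c\n", lo, hi, control);
    return 0;
}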
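The 3-versus-4 behaviour is easy to check without typing at a prompt by feeding the same format to sscanf(). This is only a sketch; count_fields is a name invented for the illustration.

```c
#include <stdio.h>

/* Hypothetical helper: applies the answer's format to one line of text and
   returns the conversion count (3 = clean input, 4 = extra characters). */
int count_fields(const char *line)
{
    char extra[100] = "";
    char control = 0;
    long lo = 0, hi = 0;

    return sscanf(line, " <%ld ;%ld > %c%99[^\n]", &lo, &hi, &control, extra);
}
```

With this, "<0;100> s\n" gives 3 and "<0;100> sasdf\n" gives 4. Note that a trailing blank after the control character, as in "<0;100> s \n", also gives 4, because the scanset accepts anything that is not a newline.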

9

Read the input as a line using e.g. fgets(). Then use sscanf() to try and parse it:

char  buffer[1024];
char *line;
long  lo, hi;
char  control, dummy;

line = fgets(buffer, sizeof buffer, stdin);
if (!line) {
    /* No input at all. */
    exit(EXIT_FAILURE);
}

if (sscanf(line, " < %ld ; %ld > %c %c", &lo, &hi, &control, &dummy) == 3) {
    /* Format is good, we have lo, hi, and control. */
} else {
    /* Not this format; complain. */
}

Note that the final %c should be preceded by a space; that way any trailing whitespace, including the newline, is consumed before the final %c. The final %c is a dummy conversion that does not happen when the format is correct, which is why we expect a result of 3 and not 4. A result of 3 means the first three conversions succeeded and the fourth, dummy one did not, so the input matched the desired pattern with nothing else on the line.
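Put together as a complete unit, the approach might look like the sketch below. The function names parse_interval and read_interval, the error message, and the choice to retry on bad lines are assumptions of this example rather than part of the answer.

```c
#include <stdio.h>

/* Hypothetical helper: returns 1 if line has the form "<lo;hi> c" and nothing
   else but whitespace, 0 otherwise. Outputs are written only on success. */
int parse_interval(const char *line, long *lo, long *hi, char *control)
{
    long tmp_lo, tmp_hi;
    char tmp_control, dummy;

    /* Exactly 3 means the trailing dummy %c found nothing left to convert. */
    if (sscanf(line, " < %ld ; %ld > %c %c",
               &tmp_lo, &tmp_hi, &tmp_control, &dummy) != 3) {
        return 0;
    }
    *lo = tmp_lo;
    *hi = tmp_hi;
    *control = tmp_control;
    return 1;
}

/* Hypothetical helper: reads lines from in until one parses.
   Returns 1 on success, 0 on end of input or read error. */
int read_interval(FILE *in, long *lo, long *hi, char *control)
{
    char buffer[1024];

    while (fgets(buffer, sizeof buffer, in)) {
        if (parse_interval(buffer, lo, hi, control)) {
            return 1;
        }
        fputs("Expected input of the form <lo;hi> c\n", stderr);
    }
    return 0;
}
```

A caller would use it as read_interval(stdin, &lo, &hi, &control). Because each attempt consumes a whole line, a rejected line cannot leak into the next attempt.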

This way, you can also support multiple different formats at the same time. For example,

if (sscanf(line, " < %ld ; %ld > %c %c", &lo, &hi, &control, &dummy) == 3 ||
    sscanf(line, " ( %ld %*1[,;] %ld ) %c %c", &lo, &hi, &control, &dummy) == 3 ||
    sscanf(line, " %ld %ld %c %c", &lo, &hi, &control, &dummy) == 3) {
    /* Format is good, we have lo, hi, and control. */
} else {
    /* Not this format; complain. */
}

accepts for example <1;2>x, < 1 ; 2 > x, (3,4)y, (5;6)z, 7+8w and 9 10 c.

The %*1[,;] part is funky. The asterisk * says the match does not count as a conversion (in the return value) and is not stored (so there is no corresponding parameter); it must come directly after the %, before the field width. The 1 says that at most one character is accepted. The [ conversion specifier gives a list of accepted (or rejected, if ^ is the first character) characters, up to the closing ]. So, in English, we might describe it thus: "accept one comma or semicolon, but don't count it as a conversion or store it". (If it were %*3[,;/], it would be "accept between one and three characters, where each character is a comma, a semicolon, or a slash". Very useful for cases like this.)
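A small sketch of the suppressed scanset on its own may help. In the C standard's grammar for conversion specifications the * precedes the field width, so the example writes the directive as %*1[,;]. The helper name parse_pair is invented for this illustration.

```c
#include <stdio.h>

/* Hypothetical helper: parses "(a,b)" or "(a;b)" and returns sscanf()'s
   conversion count. The separator is matched by %*1[,;] but is neither
   stored nor counted, so reading both numbers yields 2, not 3. */
int parse_pair(const char *text, long *a, long *b)
{
    return sscanf(text, " ( %ld %*1[,;] %ld )", a, b);
}
```

Both "(3,4)" and "(5;6)" return 2. A missing separator, as in "(5 6)", or a doubled one, as in "(3,,4)", stops the scan after the first number and returns 1. The return value cannot tell whether the closing ) matched, since no conversion follows it here; in the answer's formats the %c after it serves that purpose.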
