1

I am using fscanf in C++ as follows:

Here is my text file:

abcd
efgh

There is no space or new line after "efgh". Now here is my loop:

FILE* fp;
char eof = 0;
do
{
    char str[20];
    fscanf(fp, "%s", str);
    std::cout<<str<<std::endl;
}
while((eof = fgetc(fp)) != eof)

The output i expect is:

abcd
efgh

But the actual output i get is:

abcd
efgh
efgh

I debugged a found that after reading "efgh" the value read into eof is '\n' and not EOF. The environment is linux mint.I want to know why always the last string is read 2 times.Please advice

Nikson Kanti Paul
  • 3,394
  • 1
  • 35
  • 51
Rampal Chaudhary
  • 707
  • 1
  • 8
  • 18

3 Answers3

2

The last string is not being read twice. The problem is the loop's test for continuing:

(eof = fgetc(fp)) != eof

This assigns fgetc()'s return value to eof and checks that it is not equal to eof. Something which is hard to do logically. However, when fgetc() is called when the file is at EOF, it returns -1. That is cast to a char for the assignment, but the sub-expression in parenthesis retains the value -1 (due to type promotion rules). Comparing -1 to 255 or -127 (depending on whether char is signed or unsigned) finally terminates the loop.

The third time through the loop, fscanf() fails and does not update str: that's why the same value seems to have been read twice.

To fix it, the most straightforward technique would be:

do {
 ...
} while (!feof (fp));

However, on many operating systems, feof() doesn't work very well with fscanf() because the end of file indication isn't reliably set until fscanf() fails. A more reliable, O/S-resistent technique is to use

do {
    int result = fscanf (fp, ...whatever...);
    if (result < 0)   // end of file or i/o error?
         break;
} while (!feof (fp));
wallyk
  • 56,922
  • 16
  • 83
  • 148
  • This is incorrect. I'm not sure what his code is actually doing, but the results of `eof = fgetc()` is the lvalue `eof` (in C++, anyway---in C, it's an rvalue, but it's still the value you would get by reading `eof`). The result is that `(eof = fgetc()) != eof` can never be true. – James Kanze Oct 11 '12 at 07:38
  • 1
    And of course, just as you never use `ios_base::eof()` in iostream, you never use `feof` in C style streams. The value is not reliable until an input has failed. The important thing it to check the results of `fscanf` (which should be 1) to control the loop. And the idiomatic way of doing this is to do it as the condition of a `while`. (One can argue about the appropriateness of having such side effects in a condition, but the style is ubiquitous, to the point where anything else will cause the reader to ask "why?".) – James Kanze Oct 11 '12 at 07:43
  • @JamesKanze But then again `!=` is not a sequence point, so he might just be comparing `eof`'s old value to the new one. – Christian Rau Oct 11 '12 at 08:31
  • @ChristianRau The sequence point isn't relevant here. The `=` operator is in one of the arguments to the `!=`, so the expression must be evaluated before the `!=`. The side effects are not required to have taken place, but the `!=` doesn't use them; it uses the _value_ of the assignment expression. Which is defined to be the value which was actually assigned, with the same type. (Or is it. This was clearly the case in C, and I'm fairly sure was the intent in C++. But the standard says the results are the lvalue which was assigned to; `!=` implies lvalue->rvalue conversion.) – James Kanze Oct 11 '12 at 09:40
  • @ChristianRau So does lvalue->rvalue converion imply reading (and thus depending on the side effects). I'd say that the current C++ standard might need some clarification here, since you could certainly interpret it that way. – James Kanze Oct 11 '12 at 09:42
  • @JamesKanze But couldn't it just evaluate the `eof` of the right side before doing any assignment on the left side and thus compare the newly assigned value (left) with the old one (right)? – Christian Rau Oct 11 '12 at 09:44
  • @ChristianRau That's a good point. I don't know why I missed it: there's clearly undefined behavior, because the expression modifies an object on the left side of the `!=`, and accesses it other than to determine the value to be stored on the right. An obvious case of undefined behavior. – James Kanze Oct 11 '12 at 12:03
1

[Following up on the comments of Christian Rau in another thread, I've changed my first point to correspond to what I now realize]

There are several problems with your code. Some of the most obvious are:

  • The condition at the end of your do...while has undefined behavior. In the expression eof = fgetc(fp)) != eof, you modify an object (eof), and you access it elsewhere in the expression other than to determine the value to be stored. As far as the standard is concerned, anything may happen, and in fact, different compilers will do different things.

  • You are assigning the results of fgetc to a char, rather than to an int. The return value of fgetc is either in the range [0...UCHAR_MAX] or is EOF (which is guaranteed to be negative). In other words, it can take on one more value than will fit in a char. The results of then comparing the char with EOF depend on whether plain char is signed or not. If it's not signed, it can never have a negative value, and thus, will never be equal to EOF. If it's signed, then on particular character code (0xFF, or 'ÿ' in latin-1) will be detected as end of file. The return value of fgetc should always be assigned to an int, and the conversion to char should only be done after the test for EOF.

  • You're using the results of fscanf without checking that the function has succeeded. In C++, IO, be it iostream or FILE* is not predictive. Because of the way the interface is defined, it is impossible to tell in advance whether you will encounter end of file. You must try to read, and then test whether the read succeeded.

  • You're using fscanf into a char[] without limiting the input length. This is a buffer overflow waiting to happen.

In C++, the most natural way fo writing what you're doing would be:

std::string word;
while ( anIStream >> word ) {
    //  ...
}

Using the older, C compatible streams, you would write:

char word[20];
while ( fscanf( fp, "%19s", word ) == 1 ) {
    //  ...
}

In both cases, the check for success controls the loop; in the case of the C interface, you use a format width specifier to ensure that you don't overrun the buffer. (And in both cases, you must define the variables being read outside the loop, even though you will only use them in the loop.)

James Kanze
  • 150,581
  • 18
  • 184
  • 329
0

you are checking eof with eof. try to check like this

while( (eof = fgetc(fp)) != EOF)
Nikson Kanti Paul
  • 3,394
  • 1
  • 35
  • 51
  • 1
    I don't think he is; I think the second `eof` in his expression is a typo, and not his actual code, since otherwise, the expression can never be true. – James Kanze Oct 11 '12 at 07:47