I'm using an in-house file parsing library to parse a gnarly reporting file generated by a legacy system. The library allows you to define Linq queries which are applied successively to return an enumerable set of structures in the file.

A typical example would be something like the below.

var OrderReportParser = 
    from blanks in Rep(BlankLine) // One or more blank lines
    from header1 in TextLine      // A header text line with no useful information
    from hyphenLine in WhiteSpace.And(Rep(Char('-'))).And(BlankLine)
                                  // A line containing only hyphens
    /* ... Snip ... lots of other from clauses */
    from orderId in WhiteSpace.And(AlphaNumeric) // The id of this record
    from price in WhiteSpace.And(Decimal)        // Order price
    from quantity in WhiteSpace.And(Integer)     // Order quantity
    select new OrderLine(orderId, price, quantity);

Because much of my file is simply text, many of the intermediate results generated by a statement such as the above are not required in the output (such as the variables blanks, header1, hyphenLine in the example above).

Is there any mechanism in C# to avoid creating variables for the intermediate results, or do I always have to create a variable for each?

I am thinking of examples such as F#'s _ variable, which can be used in this fashion. See "F#'s underscore: why not just create a variable name?" for an example in the context of tuples.

MonkeyPushButton
  • Is this a public parser library? I assume it uses "altered" LINQ semantics because what is shown here is a big cross product. – usr Sep 02 '13 at 13:56
  • I don't think we can remove the bridge to the other end, it's required to reach to the other end. I can't also imagine how complicated the relationship your collections have. – King King Sep 02 '13 at 13:57
  • Comments on the library not required, as my hands are tied (I almost put this as part of the question). I happen to quite like it though, having dealt with it for a while. It's quite a functional approach to the problem, which is why it may look odd at first sight. – MonkeyPushButton Sep 02 '13 at 14:02
  • Looks like you can combine all the unnecessary stuff with `And`. That just leaves you with one dummy variable. – Dax Fohl Sep 02 '13 at 14:15
  • `_` is a valid variable name. I use it when I want to show that I will not use that variable (for instance, as in a LINQ expression `(_ => CallSomeMethod())`). Would that work for you? – default Sep 02 '13 at 14:22
  • @Default He would only be able to use `_` once, so it's not quite the same thing. – Magnus Grindal Bakken Sep 02 '13 at 14:50
  • @MagnusGrindalBakken he could also go with `_01`, `_02`, .., `_nn`. It would still be a way to show "I am not using these variables". I guess this has to do with readability, because I fail to see this to be any kind of performance hit (he still needs to select from it). – default Sep 02 '13 at 16:41

1 Answer

If you're asking if it's possible to do something like this:

var OrderReportParser = 
    from Rep(BlankLine)
    from TextLine
    from WhiteSpace.And(Rep(Char('-'))).And(BlankLine)
    ...

...then the answer is no. The designers of Linq probably never imagined that people would want to select something and then immediately throw it away without looking at it, since with most other Linq providers this parser syntax would create a huge Cartesian product. (Or if they did think about it they didn't consider it to be a useful enough feature.)

Why do you want to get rid of the variable names anyway? Personally I think including the variable names is likely to make the intent of the code clearer. If the unused variables bother you so much I guess you can name them something like _1, _2 or dummy1, dummy2, etc. That should make it pretty clear that they aren't used for anything. But they have to be there.
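To make the naming suggestion concrete, here is a minimal sketch. The in-house library isn't public, so `Parser<T>`, `SelectMany` and `TextLine` below are hypothetical stand-ins written only for this example; the real library's types will differ. The point is just that `_1` and `_2` are ordinary, legal range variable names that signal "never read".

```csharp
using System;

// Hypothetical stand-in for the in-house library: a parser maps the remaining
// input to (Result, Rest) on success, or null on failure.
public delegate (T Result, string Rest)? Parser<T>(string input);

public static class ParserLinq
{
    // SelectMany is the method the compiler calls for each additional `from` clause.
    public static Parser<V> SelectMany<T, U, V>(
        this Parser<T> first, Func<T, Parser<U>> next, Func<T, U, V> project) =>
        input =>
        {
            var r1 = first(input);
            if (r1 == null) return null;                    // first parser failed
            var r2 = next(r1.Value.Result)(r1.Value.Rest);  // run the next parser on what is left
            if (r2 == null) return null;                    // second parser failed
            return (project(r1.Value.Result, r2.Value.Result), r2.Value.Rest);
        };
}

public static class Demo
{
    // Consumes one line of text (a hypothetical equivalent of TextLine in the question).
    static readonly Parser<string> TextLine = input =>
    {
        if (input.Length == 0) return null;
        int nl = input.IndexOf('\n');
        if (nl < 0) return (input, "");
        return (input.Substring(0, nl), input.Substring(nl + 1));
    };

    public static void Main()
    {
        var orderIdParser =
            from _1 in TextLine   // header line, never read
            from _2 in TextLine   // line of hyphens, never read
            from id in TextLine   // the only value that is kept
            select id.Trim();

        var result = orderIdParser("Order report\n------------\n  A123\n");
        Console.WriteLine(result?.Result);  // prints "A123"
    }
}
```

The unused names cost nothing at runtime beyond what the query already does; they only exist so that the query syntax has something to bind each `from` clause to.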

Edit: I had an inkling that the "anonymous variable" _ from languages like F# might've been what you were driving at. The answer is still no, I'm afraid. You could name the first variable _, but you wouldn't be allowed to redefine it on the second line, so that would be it. Also, the _ variable wouldn't have the special semantics that it has in F#, so you would essentially be pretending that C# has a feature that it doesn't. Keep in mind that C# is fundamentally an imperative language. It has a lot of functional-style features, such as Linq, but it's still very much a language in the C/Java tradition, where these sorts of pattern matching features have not yet made many inroads. I like those features too, but you have to think a little differently when writing C#.
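A small sketch of the "only once" limitation, using plain LINQ to Objects rather than the parser library (the arrays `xs` and `ys` are made up for the example). The failing query is left commented out so the rest compiles. The last two lines assume a C# 7.0 or later compiler, which postdates this answer: that version added real `_` discards for deconstruction, `out` arguments and patterns, but as far as I know they do not extend to query range variables.

```csharp
using System;
using System.Linq;

public static class Program
{
    public static void Main()
    {
        var xs = new[] { 1, 2 };
        var ys = new[] { 10, 20 };

        // One range variable named _ compiles: here it is an ordinary identifier.
        var q = from _ in xs
                from y in ys
                select y;
        Console.WriteLine(string.Join(",", q));  // prints "10,20,10,20"

        // A second _ in the same query is rejected with error CS1930
        // (the range variable '_' has already been declared):
        // var bad = from _ in xs
        //           from _ in ys
        //           select 0;

        // Assumes C# 7.0+: in a deconstruction, _ is a true discard and may repeat.
        var (_, _, third) = (1, 2, 3);
        Console.WriteLine(third);  // prints "3"
    }
}
```

The first query also shows the Cartesian product mentioned above: with an ordinary LINQ provider, every extra `from` multiplies the number of results.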

Magnus Grindal Bakken