
Suppose I have data like this:

Sample input

How can I convert the data to this kind of output?

wanted output

I need to implement this in C++, but I am just a beginner and I am stuck on this project.

Andy
gnvsdnvjds
  • These are two different problems. 1) Read CSV data in C++ and 2) how to create a Pivot table in C++. – JimmyNJ Aug 04 '20 at 23:02
  • Welcome to Stack Overflow. Please read the [About](http://stackoverflow.com/tour) page soon and also visit the links describing [How to Ask a Question](http://stackoverflow.com/questions/how-to-ask) and [How to create a Minimal, Complete, and Verifiable example](http://stackoverflow.com/help/mcve) (MCVE). Providing the necessary details, including your MCVE, compiler warnings and associated errors, if any, will allow everyone here to help you with your question. – David C. Rankin Aug 04 '20 at 23:19
  • [C++ reading csv file and assigning values to array](https://stackoverflow.com/a/53148642/3422102) may help with reading the file. There are many answers related to csv files and C++ on StackOverflow. – David C. Rankin Aug 04 '20 at 23:21
  • Hello David, thanks for your reference and this is very helpful. I will try to ask the question more clearly in the future. – gnvsdnvjds Aug 05 '20 at 14:25

1 Answer

Your question has two parts. It also calls for a short digression on data models and Excel.

98% of Excel people use Excel wrong. Most data that is processed is relational data, and relational data should be stored in relational databases. Many features, like subtotals or pivot tables, have been added to Excel to mimic database functionality. That is basically the wrong approach. I could go on for hours, but people love their Excel (me too), so let them be.

Your data is stored in a so-called denormalized form, or flat table. That is very good, because then you can use pivot tables for easy analysis.

So, first we need to read your data, which is stored in a CSV file. I have already posted tons of examples on how to do that, but let me explain again.

We will read the data from any stream, e.g. an open file stream or an std::istringstream.

We do this with the function std::getline. We read a complete line and then split the line into tokens. For that we use a std::vector (to store the tokens) and a std::sregex_token_iterator to iterate over the substrings between the commas. That part is rather simple.

Next, and this is the most important part, similar to selecting the correct data model or database schema, we need to think about how to store the data.

If we look at the input, we have x-values. Each x-value can have 0, 1 or more y- and z-values. So, we have a relation. By the way, not every x-value needs to have every y-value. So, there may be input data like this:
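As a minimal sketch of the splitting step on its own: the helper name `splitLine` is my own and not part of the code further down, and it assumes plain comma-separated fields without quotes or embedded commas.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: split one CSV line at the commas.
// Assumption: no quoted fields and no commas inside a field.
std::vector<std::string> splitLine(const std::string& line) {
    static const std::regex separator{ "," };

    // The submatch index -1 selects the text BETWEEN the separator matches,
    // which are exactly the fields of the line.
    const std::sregex_token_iterator first(line.begin(), line.end(), separator, -1);
    const std::sregex_token_iterator last{};

    // Each sub_match converts to std::string while the vector is built
    return std::vector<std::string>(first, last);
}
```

Calling `splitLine("10,5,0.2")` should give the three strings `10`, `5` and `0.2`, which can then be converted with std::stoi and std::stod.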

10,5,0.2
10,6,0.5
10,7,0.6
20,5,0.7
20,6,0.8
20,7,0.9
30,2,0.7
20,8,0.8

The equivalent of relational data in the STL can be designed with an associative container, like a std::map or std::unordered_map. So, we will select the data type std::map<int, std::map<int, double>> to handle the source data properly. The key of the outer std::map is the x-value. The value associated with it is again a std::map, which maps each y-value to its z-value.

After reading the values and storing the data, we get the following hierarchical structure:
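To make the data model a bit more tangible, here is a small sketch that holds the sample input in that type. The alias `PivotData` and the helpers `makeSample` and `lookup` are names I made up for illustration; it needs C++17 because of std::optional.

```cpp
#include <map>
#include <optional>

// Outer key: x-value. Inner map: y-value -> z-value.
using PivotData = std::map<int, std::map<int, double>>;

// Hypothetical helper: the sample input from above, already stored
PivotData makeSample() {
    return {
        { 10, { { 5, 0.2 }, { 6, 0.5 }, { 7, 0.6 } } },
        { 20, { { 5, 0.7 }, { 6, 0.8 }, { 7, 0.9 }, { 8, 0.8 } } },
        { 30, { { 2, 0.7 } } },
    };
}

// Hypothetical helper: look up z for (x, y) without modifying the map.
// Returns an empty optional if that combination was never stored.
std::optional<double> lookup(const PivotData& data, int x, int y) {
    const auto row = data.find(x);
    if (row == data.end()) return std::nullopt;

    const auto cell = row->second.find(y);
    if (cell == row->second.end()) return std::nullopt;

    return cell->second;
}
```

With this, `lookup(data, 20, 8)` holds a value, while `lookup(data, 30, 5)` is empty, which corresponds to an empty cell in the pivot table.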

content of map

Then, for printing, we first need to extract all DISTINCT y-values and store them temporarily. A std::set is the appropriate container for distinct values. Then, we will print the y-values as a header.

Next, we will print each x-value and, for every y-value in the header, check whether a z-value exists for this combination. If so, we will print it. Otherwise the cell stays empty.

And all this can be done with a small piece of code:

#include <iostream>
#include <sstream>
#include <string>
#include <vector>
#include <map>
#include <regex>
#include <set>

const std::regex re{ "," };

std::istringstream sourceFileStream{R"(10,5,0.2
10,6,0.5
10,7,0.6
20,5,0.7
20,6,0.8
20,7,0.9
30,2,0.7
20,8,0.8
)"};

int main() {

    std::map<int, std::map<int, double>> data;


    // Read values -----------------------------------------------------------------------------------
    // Read all lines from source file
    for (std::string line{}; std::getline(sourceFileStream, line); ) {

        // Split the line into tokens
        std::vector token(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {});

        // Convert strings to values
        const int x{ std::stoi(token[0]) };
        const int y{ std::stoi(token[1]) };
        const double z{ std::stod(token[2]) };

        // Store values into map
        data[x][y] = z;
    }

    // Print values ----------------------------------------------------------------------------------

    // Extract all distinct y values. To get unique values, we will use a std::set
    std::set<int> yValues{};
    for (const auto& [x, yz] : data) for (const auto& [y, z] : yz) yValues.insert(y);

    // Print all y values
    for (const int y : yValues) std::cout << '\t' << y;
    std::cout << '\n';

    // Now iterate over all x values
    for (const auto& [x, yz] : data) {

        // Print the x-value for this line
        std::cout << x;

        // Search the z Value for this x value and the y value of the current column
        for (const int y : yValues) {
            std::cout << '\t';

            // If we have a z value for this x value and the current column y value
            if (auto it = yz.find(y); it != yz.end()) std::cout << it->second;
        }
        std::cout << '\n';
    }
    return 0;
}

Please be aware that the line data[x][y] = z; is, under the hood, a fairly complex statement. Please take your time to analyze and understand it.

If you have any questions, I am happy to answer them.
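One way to look at that statement is to spell out its steps. The following is only a sketch of what the two chained operator[] calls roughly do, written with try_emplace (C++17); the function name `storeStepByStep` and the alias `PivotData` are my own.

```cpp
#include <map>

using PivotData = std::map<int, std::map<int, double>>;

// Hypothetical helper: roughly what `data[x][y] = z;` does, one step at a time
void storeStepByStep(PivotData& data, int x, int y, double z) {
    // 1. Outer operator[]: find the entry for x. If x is not yet a key,
    //    insert it together with an empty inner map.
    std::map<int, double>& row = data.try_emplace(x).first->second;

    // 2. Inner operator[]: find the entry for y. If y is not yet a key,
    //    insert it with a value-initialized double (0.0).
    double& cell = row.try_emplace(y).first->second;

    // 3. Assign the z-value to that element
    cell = z;
}
```

The consequence is that operator[] may insert elements even if you only wanted to read. That is why the printing part uses find, which never modifies the map.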

A M
  • Hi Armin, thanks for your explanation and I truly believe that I could learn a lot from your post. As for the part of splitting the line into tokens, I get an error that says "argument list for class template 'std::vector' is missing". I was wondering if you could tell me how to fix it? Much appreciate your help. – gnvsdnvjds Aug 05 '20 at 14:56
  • I am using the CTAD feature available in C++17. Please compile with C++17 enabled. Or, write: `std::vector<std::string> token(std::sregex_token_iterator(line.begin(), line.end(), re, -1), {});` – A M Aug 05 '20 at 14:58
  • Yeah, I just added that and the problem for this part has been solved. Thanks for your explanation. – gnvsdnvjds Aug 05 '20 at 15:17
  • Actually I also have a problem with the set part. It says "x" in the for loop is undefined, and so are y and yz below. Any hint for solving this problem? Thanks for your time – gnvsdnvjds Aug 05 '20 at 15:29
  • I am using range based for and structured bindings, and, as said above, I use C++17. Please set the corresponding option for your compiler. The code above has been compiled and tested without any problem with C++17. I am using the free MS Visual Studio 2019 Community Edition – A M Aug 06 '20 at 05:23