
I want to find the first and last occurrence of each value among the rows of a CSV file using Python. The value I want to compare is the date, which is row[1].

Input:

ABC, 12/2/2017 ,9:21 AM
ABC, 12/2/2017 ,1:15 PM
ABC, 12/2/2017 ,6:38 PM
ABC, 12/4/2017 ,9:21 AM
ABC, 12/4/2017 ,1:01 PM
ABC, 12/7/2017 ,11:59 AM
ABC, 12/8/2017 ,9:33 AM
ABC, 12/8/2017 ,11:15 AM
ABC, 12/8/2017 ,5:15 PM

Output:

ABC, 12/2/2017 ,9:21 AM
ABC, 12/2/2017 ,6:38 PM
ABC, 12/4/2017 ,9:21 AM
ABC, 12/4/2017 ,1:01 PM
ABC, 12/7/2017 ,11:59 AM
ABC, 12/8/2017 ,9:33 AM
ABC, 12/8/2017 ,5:15 PM

Thanks in advance

hc_dev
  • look here https://stackoverflow.com/questions/20296955/attributeerror-when-trying-to-use-seek-to-get-last-row-of-csv-file – Nikos Tavoularis Jan 05 '18 at 15:10
  • Welcome to Stack Overflow! Pure code-writing requests are off-topic on Stack Overflow -- we expect questions here to relate to specific programming problems -- but we will happily help you write it yourself! Tell us [what you've tried](https://stackoverflow.com/help/how-to-ask), and where you are stuck. This will also help us answer your question better. – WhatsThePoint Jan 05 '18 at 15:15
  • Is your data already sorted like your Input Example ? – Mr Rubix Jan 05 '18 at 15:19
  • The question is not clear. The first row will be row[0] and not row[1]. Explain what you want to do clearly and also what code have you tried till now. – rnso Jan 05 '18 at 23:42
  • @NikosTavoularis If you label the given link (find last row of csv file), it would be clear that this does not answer the question for content-filtering ( _last timestamp_ ) – hc_dev Feb 17 '21 at 22:44

2 Answers


This presumes your data is already sorted as in your example. operator.itemgetter(1) returns a function that allows itertools.groupby to group rows on the 2nd item (i.e. row[1]).

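import itertools
import operator
import csv

with open('some.csv', newline='') as f:
    reader = csv.reader(f)
    result = []
    # Group consecutive rows that share the same date (row[1]).
    for k, g in itertools.groupby(reader, operator.itemgetter(1)):
        group = list(g)
        result.append(group[0])       # first row for this date
        if len(group) > 1:            # avoid adding a single-row group twice
            result.append(group[-1])  # last row for this date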
Steven Rumbalski
  • This is a strong _presumption_ on ❗(semantically) sorted rows, but will achieve questioned filtering: last row per day-group == latest day-time – hc_dev Feb 19 '21 at 06:55
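To illustrate why the sorting presumption matters, here is a small sketch. itertools.groupby only merges consecutive rows with equal keys, so a date that shows up in two separate runs produces two groups. The sample rows, the helper names group_keys and timestamp, and the date/time format strings are assumptions made for this illustration, not part of the answer above.

```python
import itertools
import operator
from datetime import datetime

# Sample rows in the question's layout; the 12/2 rows are deliberately split up.
rows = [
    ["ABC", " 12/2/2017 ", "9:21 AM"],
    ["ABC", " 12/4/2017 ", "9:21 AM"],
    ["ABC", " 12/2/2017 ", "6:38 PM"],
]

def group_keys(rows):
    """Return the (stripped) key of every group that groupby produces."""
    grouped = itertools.groupby(rows, operator.itemgetter(1))
    return [key.strip() for key, _ in grouped]

def timestamp(row):
    """Combine date and time columns into one datetime (assumed format)."""
    text = f"{row[1].strip()} {row[2].strip()}"
    return datetime.strptime(text, "%m/%d/%Y %I:%M %p")

# Unsorted input: 12/2/2017 ends up in two separate groups.
unsorted_keys = group_keys(rows)

# Sorting chronologically first yields one group per date.
sorted_keys = group_keys(sorted(rows, key=timestamp))
```

If the input file is not guaranteed to be ordered, sorting with a key like timestamp before grouping is one way to make the groupby approach safe.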

Helping you to express your problem:

  • helps you to ask clearly, which in turn
  • enables you to write pseudo-code, which in turn
  • can be the minimal example (what you tried already), which in turn
  • should be posted by you in the question, which in turn
  • enables us to guide you to the solution

Problem

Find the first and last occurrence (by time within a day) among the rows of the given CSV file. The program needs to be written in Python and must read the rows from the CSV file. I want to read each row split into 3 columns (in particular the second column, the date, and the third column, the time) so that their values can be compared with the respective values of other rows. The comparison should happen within a date group, so only rows of the same date are compared.

date value which is row[1].

Given Input

A simple example of the CSV file has the following 3 rows, which represent a single day (second column with value 12/2/2017) with 3 times (third column with the 3 distinct values 9:21 AM, 1:15 PM and 6:38 PM):

ABC, 12/2/2017 ,9:21 AM
ABC, 12/2/2017 ,1:15 PM
ABC, 12/2/2017 ,6:38 PM

Pseudo code

Suppose I don't know Python and have no clue where to start with coding; here is the logical flow I would like to program in Python:

  1. read all rows of the CSV file
  2. group the rows by second column (date) values to a list of times within a single day
  3. compare the list of times (grouped per day) by third column (time) values to filter only 2 rows into the result: the first occurrence (earliest time) and the last occurrence (latest time)
  4. write the filtered result as rows to the CSV output file
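The four steps above could be sketched roughly as follows. This is only a minimal sketch under some assumptions: the time column is taken to follow the "%I:%M %p" format seen in the example, the function names parse_time and first_and_last_per_day are made up for illustration, and io.StringIO stands in for the real input and output files so the snippet runs on its own.

```python
import csv
import io
from datetime import datetime

def parse_time(row):
    """Turn the third column (e.g. '6:38 PM') into a comparable value."""
    return datetime.strptime(row[2].strip(), "%I:%M %p")

def first_and_last_per_day(rows):
    """Keep only the earliest and latest row for each date (second column)."""
    groups = {}
    for row in rows:
        if not row:  # skip blank lines
            continue
        # Step 2: group rows by date; the dict keeps first-seen order.
        groups.setdefault(row[1].strip(), []).append(row)

    result = []
    for day_rows in groups.values():
        # Step 3: compare times within one day.
        earliest = min(day_rows, key=parse_time)
        latest = max(day_rows, key=parse_time)
        result.append(earliest)
        if latest is not earliest:  # a day with one row is kept once
            result.append(latest)
    return result

sample = """ABC, 12/2/2017 ,9:21 AM
ABC, 12/2/2017 ,1:15 PM
ABC, 12/2/2017 ,6:38 PM
ABC, 12/7/2017 ,11:59 AM
"""

# Step 1: read all rows (StringIO is a stand-in for an opened file).
rows = list(csv.reader(io.StringIO(sample)))
filtered = first_and_last_per_day(rows)

# Step 4: write the filtered rows as CSV.
out = io.StringIO()
csv.writer(out, lineterminator="\n").writerows(filtered)
```

With real files, the two io.StringIO objects would be replaced by open(..., newline='') calls for the input and output paths. Because the times are parsed and compared, this version does not rely on the rows being sorted.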

Expected Output

After this the expected output CSV would contain:

ABC, 12/2/2017 ,9:21 AM
ABC, 12/2/2017 ,6:38 PM

Note that one row has been filtered-out (removed from result):

ABC, 12/2/2017 ,1:15 PM
hc_dev