Python Pandas - remove rows based on a given value in a cell?

I have the following script, which removes rows in which the date in the DATE OF OPERATION column starts with 202110. I understood that a space in a column name is not allowed, so the script also replaces each space with _ and, after the rows are removed, puts the spaces back. For some reason I couldn't attach the CSV here, so please see the example below:

Column1  DATE OF OPERATION  NAVpUnits
dsasa  20211201  24324
dsasa  20211022  23232
sd  20211022  232
sd  20211022  3-2802.6667


The code is below, and the error I'm getting is: KeyError: 'DATE_OF_OPERATION'

Could you advise what the cause of the error is? Thank you.

import os
import glob
import pandas as pd
from pathlib import Path
source_files = sorted(Path(r'/Users/maciejgrzeszczuk/Downloads/').glob('*.csv'))

for file in source_files:
    df = pd.read_csv(file)
    df.columns = df.columns.str.replace(' ', '_')
    df = df[~df['DATE_OF_OPERATION'].astype(str).str.startswith('202110')]
    df.columns = df.columns.str.replace('_', ' ')
    name, ext = file.name.split('.')
    df.to_csv(f'{name}.{ext}', index=0)

Hi @grzeszczukmaciek,

Thank you for your participation in the forum. Does the reply below resolve your query?
If so, could you please click the 'Accept' text next to the appropriate reply? This will guide all community members who have a similar question.

Thanks,
AHS

Hi @grzeszczukmaciek,

Please be informed that a reply has been verified as correct in answering the question, and has been marked as such.

Thanks,

AHS

1 Answer


Hi @grzeszczukmaciek,

This is a pure Python question (and not an issue related to our APIs), but let me propose a solution.

Spaces are allowed in column names, so your code could be simplified to:

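for file in source_files:
    df = pd.read_csv(file)
    # Keep only the rows whose date does not start with '202110'
    df = df[~df['DATE OF OPERATION'].astype(str).str.startswith('202110')]
    # Write to the current working directory under the same file name;
    # file.name also copes with names that contain more than one dot
    df.to_csv(file.name, index=False)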
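As for the KeyError itself: it generally means that the label is not among df.columns for the file being read, for instance when one of the CSV files picked up by the glob has a different header, or when the header carries stray whitespace. I can't tell which applies to your files, so the following is only a sketch of a check under that assumption. The helper name drop_rows_with_prefix and its parameters are made up for illustration and are not part of pandas:

```python
import io
import pandas as pd

def drop_rows_with_prefix(df, column, prefix):
    """Return df without the rows whose `column` value starts with `prefix`.

    Returns None when `column` is missing, so the caller can skip the file.
    (Hypothetical helper, not part of pandas.)
    """
    # Strip stray whitespace around header names before looking up the column
    df = df.rename(columns=lambda c: str(c).strip())
    if column not in df.columns:
        print(f"column {column!r} not found; columns are {list(df.columns)}")
        return None
    mask = df[column].astype(str).str.startswith(prefix)
    return df[~mask]

# One input whose header has trailing whitespace, and one without the column
good = pd.read_csv(io.StringIO("DATE OF OPERATION ,ASK\n20211001,11\n20211101,12\n"))
other = pd.read_csv(io.StringIO("SOMETHING ELSE,ASK\n1,2\n"))

filtered = drop_rows_with_prefix(good, "DATE OF OPERATION", "202110")
skipped = drop_rows_with_prefix(other, "DATE OF OPERATION", "202110")
```

Printing list(df.columns) for the file that fails is usually the quickest way to see what pandas actually read from the header.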

On my side, the simplified loop worked with the following file.csv content:

DATE OF OPERATION,ASK,BID,TRADE PRICE
20210901,10,12,11
20211001,11,13,12
20211101,12,14,13
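To reproduce this without touching any files, the same filter can be run on the sample content above loaded from a string. This is only a sketch: io.StringIO stands in for file.csv, and the variable names are chosen for illustration.

```python
import io
import pandas as pd

# Same content as the sample file.csv, held in memory
sample = """DATE OF OPERATION,ASK,BID,TRADE PRICE
20210901,10,12,11
20211001,11,13,12
20211101,12,14,13
"""

df = pd.read_csv(io.StringIO(sample))

# The dates are parsed as integers, so convert to str before the prefix test
is_october = df['DATE OF OPERATION'].astype(str).str.startswith('202110')
result = df[~is_october]

# index=False keeps the row index out of the CSV text
print(result.to_csv(index=False))
```

Only the 20211001 row is dropped; the column names keep their spaces throughout.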