Read:
import pandas as pd contents = pd.DataFrame(pd.read_csv('products.csv', delimiter=',',)) print(contents)
Read and name first row:
contents=pd.read_csv('products.csv', delimiter=',', names = ['product_id', 'product_title', 'product_description', 'price'])
print(contents)
Print first 5 rows:
contents=pd.read_csv('products.csv', delimiter=',', names = ['product_id', 'product_title', 'product_description', 'price'])
print(contents.head())
Print last 5 rows:
contents=pd.read_csv('products.csv', delimiter=',', names = ['product_id', 'product_title', 'product_description', 'price'])
print(contents.tail())
Print random 5 rows:
contents=pd.read_csv('products.csv', delimiter=',', names = ['product_id', 'product_title', 'product_description', 'price'])
print(contents.sample())
Print specific columns:
contents=pd.read_csv('products.csv', delimiter=',', names = ['product_id', 'product_title', 'product_description', 'price'])
print(contents[['product_title','price']])
Filter specific values:
print(contents[contents.product_title == 'Products ABC'])
Counts of rows:
print(contents.count())
Sum of data:
print(contents.price.sum())
Min of data:
print(contents.price.min())
Max of data:
print(contents.price.max())
Mean of data:
print(contents.price.mean())
Median of data:
print(contents.price.median())
Grouping of data(ex:mean of group):
print(contents.groupby('price').mean())
Merge of two dataset
print(contents.merge(price))
Sorting:
contents.sort_values('product_title')
Reset index after sorting:
#with deleting former indexing
contents.sort_values(by = ['product_title'], ascending = False).reset_index(drop = True)
#without deleting former indexing
contents.sort_values(by = ['product_title'], ascending = False).reset_index()
Discribe data:
- For numeric columns, describe() returns basic statistics: the value count, mean, standard deviation, minimum, maximum, and 25th, 50th, and 75th quantiles for the data in a column.
- For string columns, describe() returns the value count, the number of unique entries, the most frequently occurring value (‘top’), and the number of times the top value occurs (‘freq’)
contents['price'].describe()
Delete column:
contents = contents.drop(columns='price')contents = contents.drop('price', axis=1) #axis=1 means columns
contents = contents.drop([0], axis=1) # delete according to column index
Delete Rows
Delete rows:
contents = contents.drop([0], axis=0) # axis=0 means rows, delete according to row index
Exporting data:
Exporting Data:
# Output data to a CSV file# Typically, I don't want row numbers in my output file, hence index=False.# To avoid character issues, I typically use utf8 encoding for input/output.
data.to_csv("output_filename.csv", index=False, encoding='utf8')
# Output data to an Excel file.# For the excel output to work, you may need to install the "xlsxwriter" package.
data.to_csv("output_excel_file.xlsx", sheet_name="Sheet 1", index=False)
Be First to Comment