Data Analysis
pandas basic 01
jaehwi0823
2019. 10. 10. 20:04
1. pandas basic elements
index = myData.index
columns = myData.columns
data = myData.values
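A minimal sketch of the three elements on a toy DataFrame. The data and the column names V1 and V2 are made up for illustration.

```python
import pandas as pd

# Hypothetical toy data, only for illustration
myData = pd.DataFrame(
    {"V1": [10, 20, 30], "V2": [1.5, 2.5, 3.5]},
    index=["a", "b", "c"],
)

index = myData.index      # row labels (a pandas Index)
columns = myData.columns  # column labels (also an Index)
data = myData.values      # the underlying values as a NumPy array

print(list(index))
print(list(columns))
print(data.shape)
```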
2. Data types
# check all data types
myData.dtypes
# count columns per dtype (get_dtype_counts() was deprecated in pandas 0.25 and removed in 1.0)
myData.dtypes.value_counts()
3. Handling a Series
Select a column
# either form works; attribute access needs a column name that is a valid Python identifier
myData['column_name']
myData.column_name
To treat the Series as a DataFrame:
mySeries.to_frame()
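A small sketch of the two access styles and of to_frame(). The column names here are hypothetical; "unit count" is included only to show a name that attribute access cannot reach.

```python
import pandas as pd

# Hypothetical toy data, only for illustration
myData = pd.DataFrame({"price": [100, 200], "unit count": [1, 2]})

by_key = myData["price"]       # bracket access works for any column name
by_attr = myData.price         # attribute access: same Series
spaced = myData["unit count"]  # a name with a space only works with brackets

as_frame = by_key.to_frame()   # one-column DataFrame named after the Series
print(type(by_key).__name__, type(as_frame).__name__)
```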
Check frequencies
# total number of elements, nulls included
mySeries.size
mySeries.shape   # a one-element tuple, (n,)
len(mySeries)
# non-null values only
mySeries.count()
mySeries.notnull().sum()
# counts per distinct value (nulls are excluded by default)
mySeries.value_counts()
# relative frequencies instead of counts
mySeries.value_counts(normalize=True)
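To see how the totals and the non-null counts differ, here is a sketch on a made-up Series with one missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing value
mySeries = pd.Series(["a", "b", "a", np.nan, "a"])

total = mySeries.size                           # counts every element, NaN included
non_null = mySeries.count()                     # skips NaN
counts = mySeries.value_counts()                # NaN is dropped by default
shares = mySeries.value_counts(normalize=True)  # fractions of the non-null values

print(total, non_null)
print(counts)
print(shares)
```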
Statistics
# summary
mySeries.describe()
# quantiles (given as fractions between 0 and 1)
mySeries.quantile([.1, .2, .3, .5, .8, .9])
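A quick sketch on made-up numbers. quantile() with a list returns a Series indexed by the requested fractions, using linear interpolation by default.

```python
import pandas as pd

# Hypothetical toy data
mySeries = pd.Series([1, 2, 3, 4, 5])

summary = mySeries.describe()                # count, mean, std, min, quartiles, max
q = mySeries.quantile([0.25, 0.5, 0.75])     # Series indexed by 0.25, 0.5, 0.75

print(summary)
print(q)
```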
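Handle nulls
# check for nulls
mySeries.notnull().all()   # True only if there are no nulls
mySeries.isnull().sum()    # number of nulls
mySeries.hasnans           # True if there is at least one null
# fill nulls with a value (returns a new Series)
mySeries.fillna(0)
# or drop them (also returns a new Series)
mySeries.dropna()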
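A sketch of the checks and fixes on a made-up Series. Note that fillna() and dropna() return new objects here, so the original keeps its missing value unless the result is assigned back.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing value
mySeries = pd.Series([1.0, np.nan, 3.0])

no_missing = mySeries.notnull().all()  # False: one value is missing
n_missing = mySeries.isnull().sum()    # number of missing values
has_nan = mySeries.hasnans             # True

filled = mySeries.fillna(0)            # new Series, NaN replaced by 0
dropped = mySeries.dropna()            # new Series, NaN row removed, index labels kept

print(filled.tolist(), dropped.tolist())
```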
Change dtype
# raises an error if the Series still contains nulls
mySeries.astype(int)
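Casting to int fails when the Series contains NaN. The sketch below shows two possible workarounds on made-up data: filling first, or using the nullable "Int64" dtype. Which one fits depends on whether 0 is a sensible stand-in for a missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one missing value
mySeries = pd.Series([1.0, np.nan, 3.0])

try:
    mySeries.astype(int)          # NaN cannot be represented as a plain int
    failed = False
except ValueError:
    failed = True

as_int = mySeries.fillna(0).astype(int)   # option 1: fill, then cast
as_nullable = mySeries.astype("Int64")    # option 2: nullable integer, keeps the gap as <NA>

print(failed)
print(as_int.tolist())
print(as_nullable)
```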
4. Index
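Set index
myData.set_index('column')
# or set it while reading the file
import pandas as pd
myData = pd.read_csv('./data/d.csv', index_col='index_column')
# read_csv has no drop argument; to keep the index column as a regular column too, use set_index
myData = pd.read_csv('./data/d.csv').set_index('index_column', drop=False)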
Move the index back to a column
myData.reset_index()
Rename index and column labels
newData = myData.rename(index={'old_idx':'new_idx'},
                        columns={'old_col':'new_col'})
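The index operations in this section can be tried without a file by reading CSV text from memory. The CSV content and the names id, name, and score are made up; keeping the index column as a regular column is done here with set_index(..., drop=False).

```python
import io

import pandas as pd

# Hypothetical CSV content, read from memory instead of a file
csv_text = "id,name,score\nx1,kim,90\nx2,lee,85\n"

myData = pd.read_csv(io.StringIO(csv_text), index_col="id")            # "id" becomes the index
kept = pd.read_csv(io.StringIO(csv_text)).set_index("id", drop=False)  # "id" is index and column

restored = myData.reset_index()   # the index goes back to being a column
renamed = myData.rename(index={"x1": "first"}, columns={"score": "points"})

print(restored)
print(renamed)
```

rename() returns a new DataFrame, so myData still has its original labels afterwards.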
5. Column insert / delete
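# insert a new column right after 'myCol' (insert() modifies myData in place)
idx = myData.columns.get_loc('myCol')
myData.insert(loc=idx+1,
              column='newCol',
              value=myData.V1 - myData.V2)
# delete a column (drop() returns a new DataFrame)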
myData = myData.drop('myCol', axis=1)
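An end-to-end sketch of the insert and delete steps on made-up data. The column name is passed as the string "newCol" here, which is an assumption about what the new column should be called.

```python
import pandas as pd

# Hypothetical toy data
myData = pd.DataFrame({"V1": [5, 7], "myCol": ["a", "b"], "V2": [2, 3]})

idx = myData.columns.get_loc("myCol")   # integer position of 'myCol'
# insert() changes myData in place and returns None
myData.insert(loc=idx + 1, column="newCol", value=myData.V1 - myData.V2)
after_insert = list(myData.columns)

# drop() returns a new DataFrame, so the result is assigned back
myData = myData.drop("myCol", axis=1)

print(after_insert)
print(myData)
```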