  • pandas basic 01
    Data Analysis 2019. 10. 10. 20:04

    1. pandas basic elements

    index = myData.index
    columns = myData.columns
    data = myData.values
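The three attributes above can be tried on a small frame. This is only a sketch: the sample data and its column names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; any DataFrame behaves the same way.
myData = pd.DataFrame(
    {"V1": [10, 20, 30], "V2": [1.5, 2.5, 3.5]},
    index=["a", "b", "c"],
)

index = myData.index      # row labels (an Index object)
columns = myData.columns  # column labels (also an Index object)
data = myData.values      # underlying data as a NumPy array

print(list(index))
print(list(columns))
print(data.shape)
```
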

    2. Data types

    # check all data types
    myData.dtypes
    
    # count columns per dtype
    # (get_dtype_counts() was removed in pandas 1.0)
    myData.dtypes.value_counts()
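A minimal sketch of counting columns per dtype with `dtypes.value_counts()`, which works across pandas versions. The sample frame is an assumption made for illustration.

```python
import pandas as pd

# Hypothetical frame: two integer columns, one float column, one text column.
myData = pd.DataFrame(
    {"a": [1, 2], "b": [0.5, 1.5], "c": ["x", "y"], "d": [3, 4]}
)

# dtypes is itself a Series (column name -> dtype),
# so value_counts() tallies how many columns share each dtype.
dtype_counts = myData.dtypes.value_counts()
print(dtype_counts)
```
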

    3. Handling a Series

    Select a column

    # either form works; attribute access needs a name that is a valid Python identifier
    myData['column_name']
    myData.column_name

    If you want to treat it as a DataFrame:

    mySeries.to_frame()
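As a rough illustration, both selection styles return the same Series, and `to_frame()` wraps it in a one-column DataFrame. The sample data is hypothetical.

```python
import pandas as pd

# Hypothetical sample data.
myData = pd.DataFrame({"column_name": [1, 2, 3], "other": [4, 5, 6]})

mySeries = myData["column_name"]  # bracket access, works for any column name
same = myData.column_name         # attribute access, same result here

# to_frame() keeps the Series name as the single column label.
asFrame = mySeries.to_frame()
print(type(mySeries).__name__, type(asFrame).__name__)
```
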

    Check frequencies

    # total
    mySeries.size
    mySeries.shape
    len(mySeries)
    
    # non-null values only
    mySeries.count()
    mySeries.notnull().sum()
    
    # counts per item
    mySeries.value_counts()
    mySeries.value_counts(normalize=True)
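The difference between the total size and the non-null count shows up as soon as the Series contains a missing value. A small sketch with made-up data:

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value.
mySeries = pd.Series(["a", "b", "a", np.nan, "a"])

total = mySeries.size        # counts every element, including NaN
non_null = mySeries.count()  # skips NaN

# value_counts() drops NaN by default; normalize=True returns proportions
# of the non-null values instead of raw counts.
counts = mySeries.value_counts()
shares = mySeries.value_counts(normalize=True)

print(total, non_null)
print(counts)
print(shares)
```

Passing `dropna=False` to `value_counts()` would include NaN as its own row.
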

    Statistics

    # summary
    mySeries.describe()
    
    # percentile
    mySeries.quantile([.1, .2, .3, .5, .8, .9])
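A short sketch of what these two calls return, using a made-up numeric Series. `describe()` gives a Series of summary statistics, and `quantile()` with a list gives a Series indexed by the requested quantiles.

```python
import pandas as pd

# Hypothetical numeric data.
mySeries = pd.Series([1, 2, 3, 4, 5])

summary = mySeries.describe()          # count, mean, std, min, 25%, 50%, 75%, max
q = mySeries.quantile([0.1, 0.5, 0.9])  # linear interpolation by default

print(summary)
print(q)
```
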

    Handle nulls

    # check null
    mySeries.notnull().all()
    mySeries.isnull().sum()
    mySeries.hasnans
    
    # fill it
    mySeries.fillna(0)
    
    # or remove it
    mySeries.dropna()
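The null checks and fixes can be tried together on a tiny Series. This is a sketch with made-up data; note that `fillna` and `dropna` return new objects and leave the original unchanged.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value.
mySeries = pd.Series([1.0, np.nan, 3.0])

all_present = mySeries.notnull().all()  # True only if nothing is missing
n_missing = mySeries.isnull().sum()     # number of missing values
has_nan = mySeries.hasnans              # quick boolean check

filled = mySeries.fillna(0)  # replace missing values with 0
dropped = mySeries.dropna()  # remove missing values

print(all_present, n_missing, has_nan)
print(filled.tolist(), dropped.tolist())
```
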

    Change dtype

    mySeries.astype(int)
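One caveat: `astype(int)` raises an error when the Series still contains NaN. The sketch below, using made-up data, shows two possible workarounds: filling the gaps first, or casting to the nullable `"Int64"` dtype. Which one fits depends on whether 0 is a sensible stand-in for a missing value.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with one missing value.
mySeries = pd.Series([1.0, np.nan, 3.0])

# A plain integer dtype cannot hold NaN, so this cast fails.
try:
    mySeries.astype(int)
    failed = False
except (ValueError, TypeError):
    failed = True

# Option 1: fill missing values, then cast.
as_int = mySeries.fillna(0).astype(int)

# Option 2: nullable integer dtype keeps the gap as <NA>.
nullable = mySeries.astype("Int64")

print(failed, as_int.tolist(), nullable.dtype)
```
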

    4. Index

    Set index

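    import pandas as pd

    # use an existing column as the index (returns a new DataFrame)
    myData.set_index('column')

    # or set the index while reading the file
    myData = pd.read_csv('./data/d.csv', index_col='index_column')
    # read_csv has no drop parameter; to keep the column too, use set_index
    myData = pd.read_csv('./data/d.csv').set_index('index_column', drop=False)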

    Reset the index (move it back into a column)

    myData.reset_index()
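A self-contained sketch of the index round trip. It reads from an in-memory string instead of `./data/d.csv`, and the CSV content is made up. `drop=False` is an argument of `set_index`, not of `read_csv`, so the variant that keeps the column goes through `set_index`.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a file on disk.
csv_text = "index_column,V1\nx,1\ny,2\n"

# Set the index at read time; the column leaves the regular columns.
myData = pd.read_csv(io.StringIO(csv_text), index_col="index_column")

# Keep the column in the frame as well as using it for the index.
kept = pd.read_csv(io.StringIO(csv_text)).set_index("index_column", drop=False)

# reset_index() moves the index back into a column and restores 0..n-1.
restored = myData.reset_index()

print(list(myData.columns), list(kept.columns), list(restored.columns))
```
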

    Rename index and column labels

    newData = myData.rename(index={'old_idx':'new_idx'},
                            columns={'old_col':'new_col'})
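A quick sketch of `rename` on made-up data. Labels not listed in the mapping are left alone, and the original frame is not modified.

```python
import pandas as pd

# Hypothetical frame using the same placeholder labels as above.
myData = pd.DataFrame({"old_col": [1, 2]}, index=["old_idx", "other"])

newData = myData.rename(index={"old_idx": "new_idx"},
                        columns={"old_col": "new_col"})

print(list(newData.index), list(newData.columns))
```
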

    5. Column insert / delete

    # insert a new column right after 'myCol'
    idx = myData.columns.get_loc('myCol')
    myData.insert(loc=idx+1,
                  column='newCol',
                  value=myData.V1 - myData.V2)
    
    # Delete
    myData = myData.drop('myCol', axis=1)
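The insert and delete steps can be run end to end on a small frame. This is a sketch with made-up values, and the new column name is passed as the string `'newCol'`. Note that `insert` modifies the frame in place, while `drop` returns a new one.

```python
import pandas as pd

# Hypothetical frame with the column names used above.
myData = pd.DataFrame({"V1": [5, 7], "myCol": [0, 0], "V2": [2, 3]})

# Find the position of 'myCol' and insert the new column right after it.
idx = myData.columns.get_loc("myCol")
myData.insert(loc=idx + 1, column="newCol", value=myData.V1 - myData.V2)
after_insert = list(myData.columns)

# drop returns a new DataFrame, so reassign to keep the result.
myData = myData.drop("myCol", axis=1)

print(after_insert)
print(list(myData.columns))
```
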
