
「Python」 A Simple Guide to Using pandas

python
import numpy as np
import pandas as pd

1. Data types

1.1 Series

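A Series is essentially a one-dimensional labeled array. It is created with pd.Series(data, index=index), where the index argument lets you supply your own labels. Three ways of creating a Series are covered here: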

1.1.1 From ndarray

If data is an ndarray, index must be the same length as data. If no index is passed, one will be created having values [0, ..., len(data) - 1].

python
s = pd.Series(np.random.randn(5), index=["a", "b", "c", "d", "e"])
# a    0.469112
# b   -0.282863
# c   -1.509059
# d   -1.135632
# e    1.212112
# dtype: float64

s.index
# Index(['a', 'b', 'c', 'd', 'e'], dtype='object')

pd.Series(np.random.randn(5))
# 0   -0.173215
# 1    0.119209
# 2   -1.044236
# 3   -0.861849
# 4   -2.104569
# dtype: float64
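
The length rule above is easy to check directly. The following is a minimal sketch, with made-up values and labels chosen only for illustration: a matching index is accepted, a shorter one raises ValueError, and omitting the index gives the default 0..len(data) - 1 labels.

```python
import numpy as np
import pandas as pd

values = np.arange(3.0)  # illustrative data: 0.0, 1.0, 2.0

# Matching lengths: each value gets the label at the same position.
ok = pd.Series(values, index=["x", "y", "z"])

# Mismatched lengths: pandas rejects the construction with a ValueError.
try:
    pd.Series(values, index=["x", "y"])
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

# No index passed: a default integer index is created.
default_index = list(pd.Series(values).index)
```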

1.1.2 From dict

Series can be instantiated from dicts:

python
d = {"b": 1, "a": 0, "c": 2}
pd.Series(d)
# b    1
# a    0
# c    2
# dtype: int64
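
When an index is passed together with a dict, pandas pulls out the values for those labels in the order given, and a label that is not a key in the dict becomes NaN. A small sketch of that behavior, reusing the dict above (the name s2 and the extra label "d" are only for illustration):

```python
import pandas as pd

d = {"b": 1, "a": 0, "c": 2}

# Labels come from `index`; "d" is not a key in the dict, so its value is NaN.
# The NaN forces the dtype from int64 to float64.
s2 = pd.Series(d, index=["b", "c", "d", "a"])
```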

1.1.3 From scalar value

If data is a scalar value, an index must be provided. The value will be repeated to match the length of index.

python
pd.Series(5.0, index=["a", "b", "c", "d", "e"])
# a    5.0
# b    5.0
# c    5.0
# d    5.0
# e    5.0
# dtype: float64

1.1.4 Usage

python
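# s[0] relies on a positional fallback that recent pandas versions deprecate;
# .iloc makes the positional lookup explicit
s.iloc[0]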
# Out[13]: 0.4691122999071863

s[:3]
# Out[14]: 
# a    0.469112
# b   -0.282863
# c   -1.509059
# dtype: float64

s[s > s.median()]
# Out[15]: 
# a    0.469112
# e    1.212112
# dtype: float64

# select by position with .iloc, since the index labels here are strings
s.iloc[[4, 3, 1]]
# Out[16]: 
# e    1.212112
# d   -1.135632
# b   -0.282863
# dtype: float64

np.exp(s)
# Out[17]: 
# a    1.598575
# b    0.753623
# c    0.221118
# d    0.321219
# e    3.360575
# dtype: float64

s.array
# Out[19]: 
# <PandasArray>
# [ 0.4691122999071863, -0.2828633443286633, -1.5090585031735124,
#  -1.1356323710171934,  1.2121120250208506]
# Length: 5, dtype: float64

s.to_numpy()
# Out[20]: array([ 0.4691, -0.2829, -1.5091, -1.1356,  1.2121])

s["a"]
# Out[21]: 0.4691122999071863

s["e"] = 12.0
s
# Out[23]: 
# a     0.469112
# b    -0.282863
# c    -1.509059
# d    -1.135632
# e    12.000000
# dtype: float64

"e" in s
# Out[24]: True

"f" in s
# Out[25]: False

In use, a Series behaves much like both an ndarray and a dict. Call Series.to_numpy() to convert it to a NumPy array.
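
To make the comparison concrete, here is a short sketch with a small made-up Series (the values and labels are illustrative, not the random data used earlier). It shows one dict-like call, one ndarray-like operation, and the conversion to NumPy.

```python
import numpy as np
import pandas as pd

s = pd.Series([1.0, 2.0, 3.0], index=["a", "b", "c"])

# dict-like: .get returns a default instead of raising KeyError for a missing label
missing = s.get("f", np.nan)

# ndarray-like: arithmetic is applied element by element, and labels are kept
doubled = s * 2

# conversion: to_numpy() returns a plain numpy.ndarray without the index
arr = s.to_numpy()
```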

1.2 DataFrame

A DataFrame is similar to a two-dimensional table, with labeled rows and columns.
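
As a minimal sketch of the table idea, the example below builds a DataFrame from a dict of columns. The column names, row labels, and values are hypothetical sample data, not taken from this post.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
df = pd.DataFrame(
    {"name": ["x", "y", "z"], "score": [1.5, 2.0, 3.5]},
    index=["r1", "r2", "r3"],
)

shape = df.shape               # (number of rows, number of columns)
col = df["score"]              # selecting one column returns a Series
cell = df.loc["r2", "score"]   # one cell, addressed by row label and column label
```

Each column of the table is itself a Series, so everything in section 1.1 applies to df["score"] as well.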

2. Importing data
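
One common way to import tabular data is pd.read_csv. The sketch below is an assumption about what this section would cover, not something the post spells out; it reads CSV text from an in-memory buffer so that it runs without any file on disk. The CSV content is made up for illustration.

```python
import io

import pandas as pd

# Made-up CSV text; with a real file you would pass its path to read_csv instead.
csv_text = "name,score\nx,1.5\ny,2.0\n"

# read_csv accepts a path or any file-like object and returns a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))
```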

References: [1] https://pandas.pydata.org/pandas-docs/stable/user_guide