pandas Data Alignment

import pandas as pd
import numpy as np

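Broadcasting

Arithmetic between DataFrame objects automatically aligns on both the columns and the index (row labels). The result has the union of the column labels and the union of the row labels; any label present in only one operand yields NaN in the corresponding cells.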

df1 = pd.DataFrame(np.random.randn(10, 4), columns=["A", "B", "C", "D"])
df1

          A         B         C         D
0 -0.205518 -1.079244  0.466754  0.697197
1  0.373488 -0.704131 -0.562608 -0.747020
2  0.727782  1.176981  1.748998 -2.381428
3  0.693188 -0.236861 -0.045300  1.116772
4  0.679781 -0.113023 -0.237897  0.991700
5  0.301815 -0.736582 -0.327876  1.240366
6 -0.362792  0.409796 -0.559092  1.486711
7  1.158763  2.898433 -1.066979  1.109399
8  1.801811 -0.150998  0.256121  0.626933
9  0.248901 -0.846965  0.728085 -0.302709
df2 = pd.DataFrame(np.random.randn(7, 3), columns=["A", "B", "C"])
df2

          A         B         C
0  0.656296  0.948401  1.203846
1  0.576805  2.193866 -1.295344
2  0.107251 -1.394675  1.322735
3  0.278274 -0.398505 -0.894721
4  0.798450 -0.817746  1.933429
5 -0.856174  0.212137 -0.323455
6 -0.148207  2.293063  0.164304
df1 + df2

          A         B         C         D
0  0.450778 -0.130843  1.670600       NaN
1  0.950293  1.489735 -1.857952       NaN
2  0.835033 -0.217694  3.071733       NaN
3  0.971461 -0.635366 -0.940020       NaN
4  1.478231 -0.930769  1.695532       NaN
5 -0.554358 -0.524445 -0.651331       NaN
6 -0.510998  2.702860 -0.394789       NaN
7       NaN       NaN       NaN       NaN
8       NaN       NaN       NaN       NaN
9       NaN       NaN       NaN       NaN
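Because the frames above are filled with random numbers, the exact values differ on every run. The following is a minimal sketch of the same alignment rule on small hand-written frames; the names `left`, `right`, and `total` and all values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data: `left` has columns A, B and rows 0..2;
# `right` has columns A, C and rows 0..1.
left = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [10.0, 20.0, 30.0]})
right = pd.DataFrame({"A": [100.0, 200.0], "C": [5.0, 6.0]})

total = left + right

# The result carries the union of the labels from both operands.
result_columns = list(total.columns)  # A, B and C
result_index = list(total.index)      # 0, 1 and 2

# Only cells whose row label AND column label exist in both frames
# receive a value; every other cell is NaN.
shared_cells = total.loc[[0, 1], "A"].tolist()  # 1 + 100 and 2 + 200
missing_mask = total.isna()

print(total)
```

Here only column A in rows 0 and 1 is present in both operands, so those two cells are the only non-missing values in the result.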

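When an operation involves a DataFrame and a Series, the default behavior is to align the Series index with the DataFrame columns, so the Series is broadcast row-wise: it is applied to every row of the DataFrame. For example: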

df1 - df1.iloc[0]

          A         B         C         D
0  0.000000  0.000000  0.000000  0.000000
1  0.579007  0.375113 -1.029362 -1.444217
2  0.933301  2.256226  1.282244 -3.078625
3  0.898706  0.842383 -0.512054  0.419576
4  0.885300  0.966221 -0.704651  0.294503
5  0.507334  0.342662 -0.794630  0.543169
6 -0.157273  1.489041 -1.025846  0.789514
7  1.364281  3.977677 -1.533733  0.412202
8  2.007330  0.928246 -0.210632 -0.070263
9  0.248901 -0.846965  0.728085 -0.302709
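The row-wise broadcast can also be checked on a tiny deterministic frame. This is an illustrative sketch only; `frame`, `first_row`, and `shuffled` are hypothetical names and the values are made up. It also shows that the Series is matched to the columns by label, not by position.

```python
import pandas as pd

# Hypothetical two-row frame with columns A and B.
frame = pd.DataFrame({"A": [1.0, 4.0], "B": [2.0, 6.0]})

# iloc[0] returns a Series whose index is the column labels A and B.
first_row = frame.iloc[0]

# The Series is lined up with the columns and subtracted from every row.
diff = frame - first_row

# A Series with the same labels in a different order gives the same
# values, because matching is done by label rather than by position.
shuffled = pd.Series({"B": 2.0, "A": 1.0})
diff_shuffled = frame - shuffled

print(diff)
```

Row 0 of `diff` is all zeros, since the first row is subtracted from itself, which mirrors the first row of the output above.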