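Pandas provides various facilities for easily combining Series, DataFrame, and Panel objects. (Panel was removed in pandas 0.25; the examples here use DataFrame.)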
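pd.concat(objs, axis=0, join='outer', join_axes=None, ignore_index=False)
where:
objs: a sequence or mapping of Series, DataFrame, or Panel objects.
axis: {0, 1, ...}, default 0. The axis to concatenate along.
join: {'inner', 'outer'}, default 'outer'. How to handle indexes on the other axis: outer takes the union, inner takes the intersection.
ignore_index: boolean, default False. If True, the index values on the concatenation axis are not used, and the resulting axis is labeled 0, ..., n-1.
join_axes: a list of Index objects. Specific indexes to use for the other (n-1) axes instead of performing inner/outer set logic. This parameter was removed in pandas 1.0.
The concat() function does all the heavy lifting of performing concatenation along an axis. The code below creates two objects and concatenates them.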
import pandas as pd
one = pd.DataFrame({
'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
'subject_id':['sub1','sub2','sub4','sub6','sub5'],
'Marks_scored':[98,90,87,69,78]},
index=[1,2,3,4,5])
two = pd.DataFrame({
'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
'subject_id':['sub2','sub4','sub3','sub6','sub5'],
'Marks_scored':[89,80,79,97,88]},
index=[1,2,3,4,5])
rs = pd.concat([one,two])
print(rs)
Running the example above produces the following output:
   Marks_scored    Name  subject_id
1            98    Alex        sub1
2            90     Amy        sub2
3            87   Allen        sub4
4            69   Alice        sub6
5            78  Ayoung        sub5
1            89   Billy        sub2
2            80   Brian        sub4
3            79    Bran        sub3
4            97   Bryce        sub6
5            88   Betty        sub5
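The join parameter described earlier only matters when the inputs do not share the same columns. The sketch below uses two small hypothetical frames (left and right are not part of the original example) to show the difference between the union and the intersection of columns.

```python
import pandas as pd

# Hypothetical frames that share only the 'Name' column.
left = pd.DataFrame({'Name': ['Alex', 'Amy'], 'Marks_scored': [98, 90]})
right = pd.DataFrame({'Name': ['Billy', 'Brian'], 'subject_id': ['sub2', 'sub4']})

# join='outer' (the default) keeps the union of the columns;
# cells with no source value are filled with NaN.
outer = pd.concat([left, right], join='outer', sort=False)

# join='inner' keeps only the columns present in every input.
inner = pd.concat([left, right], join='inner')

print(list(outer.columns))
print(list(inner.columns))
```

With join='inner', only Name survives because it is the only column common to both frames.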
Suppose we want to associate a specific key with each of the DataFrame pieces being concatenated. This can be done with the keys argument:
import pandas as pd
one = pd.DataFrame({
'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
'subject_id':['sub1','sub2','sub4','sub6','sub5'],
'Marks_scored':[98,90,87,69,78]},
index=[1,2,3,4,5])
two = pd.DataFrame({
'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
'subject_id':['sub2','sub4','sub3','sub6','sub5'],
'Marks_scored':[89,80,79,97,88]},
index=[1,2,3,4,5])
rs = pd.concat([one,two],keys=['x','y'])
print(rs)
Running the example above produces the following output:
     Marks_scored    Name  subject_id
x 1            98    Alex        sub1
  2            90     Amy        sub2
  3            87   Allen        sub4
  4            69   Alice        sub6
  5            78  Ayoung        sub5
y 1            89   Billy        sub2
  2            80   Brian        sub4
  3            79    Bran        sub3
  4            97   Bryce        sub6
  5            88   Betty        sub5
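The keys become the outer level of a hierarchical index, which makes it possible to recover each original piece afterwards. A minimal sketch, using shortened versions of the frames above (the variable from_y is introduced here for illustration):

```python
import pandas as pd

one = pd.DataFrame({'Name': ['Alex', 'Amy'], 'Marks_scored': [98, 90]}, index=[1, 2])
two = pd.DataFrame({'Name': ['Billy', 'Brian'], 'Marks_scored': [89, 80]}, index=[1, 2])

rs = pd.concat([one, two], keys=['x', 'y'])

# The result carries a two-level index: (key, original index label).
print(rs.index.nlevels)

# Selecting by the outer key returns the rows that came from that piece.
from_y = rs.loc['y']
print(from_y)

# A single cell can be addressed with a (key, label) tuple.
print(rs.loc[('x', 2), 'Name'])
```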
The index of the result contains duplicates: every index label is repeated. If the resulting object should get its own fresh index instead, set ignore_index to True, as in the following example:
import pandas as pd
one = pd.DataFrame({
'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
'subject_id':['sub1','sub2','sub4','sub6','sub5'],
'Marks_scored':[98,90,87,69,78]},
index=[1,2,3,4,5])
two = pd.DataFrame({
'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
'subject_id':['sub2','sub4','sub3','sub6','sub5'],
'Marks_scored':[89,80,79,97,88]},
index=[1,2,3,4,5])
rs = pd.concat([one,two],keys=['x','y'],ignore_index=True)
print(rs)
Running the example above produces the following output:
   Marks_scored    Name  subject_id
0            98    Alex        sub1
1            90     Amy        sub2
2            87   Allen        sub4
3            69   Alice        sub6
4            78  Ayoung        sub5
5            89   Billy        sub2
6            80   Brian        sub4
7            79    Bran        sub3
8            97   Bryce        sub6
9            88   Betty        sub5
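Observe that the index has changed completely and that the keys have been overridden as well. If two objects are concatenated along axis=1, new columns are added instead of new rows.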
import pandas as pd
one = pd.DataFrame({
'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
'subject_id':['sub1','sub2','sub4','sub6','sub5'],
'Marks_scored':[98,90,87,69,78]},
index=[1,2,3,4,5])
two = pd.DataFrame({
'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
'subject_id':['sub2','sub4','sub3','sub6','sub5'],
'Marks_scored':[89,80,79,97,88]},
index=[1,2,3,4,5])
rs = pd.concat([one,two],axis=1)
print(rs)
Running the example above produces the following output:
   Marks_scored    Name  subject_id  Marks_scored   Name  subject_id
1            98    Alex        sub1            89  Billy        sub2
2            90     Amy        sub2            80  Brian        sub4
3            87   Allen        sub4            79   Bran        sub3
4            69   Alice        sub6            97  Bryce        sub6
5            78  Ayoung        sub5            88  Betty        sub5
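In the example above both frames have identical index labels, so the rows line up exactly. When the indexes differ, concatenating along axis=1 aligns rows by label. The sketch below uses two hypothetical frames (marks and names) with partially overlapping indexes to show this.

```python
import pandas as pd

# Hypothetical frames whose indexes overlap only on labels 2 and 3.
marks = pd.DataFrame({'Marks_scored': [98, 90, 87]}, index=[1, 2, 3])
names = pd.DataFrame({'Name': ['Amy', 'Allen', 'Alice']}, index=[2, 3, 4])

# Default outer join: every label from either frame appears,
# and missing cells are filled with NaN.
wide_outer = pd.concat([marks, names], axis=1)

# Inner join: only labels present in both frames are kept.
wide_inner = pd.concat([marks, names], axis=1, join='inner')

print(wide_outer)
print(wide_inner)
```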
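A useful shortcut for concatenation is the append instance method on Series and DataFrame. These methods actually predate concat(). They concatenate along axis=0, that is, along the index. Note that append was deprecated in pandas 1.4 and removed in pandas 2.0, so the following code only runs on older versions; on current versions use pd.concat([one, two]) instead.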
import pandas as pd
one = pd.DataFrame({
'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
'subject_id':['sub1','sub2','sub4','sub6','sub5'],
'Marks_scored':[98,90,87,69,78]},
index=[1,2,3,4,5])
two = pd.DataFrame({
'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
'subject_id':['sub2','sub4','sub3','sub6','sub5'],
'Marks_scored':[89,80,79,97,88]},
index=[1,2,3,4,5])
rs = one.append(two)
print(rs)
Running the example above produces the following output:
   Marks_scored    Name  subject_id
1            98    Alex        sub1
2            90     Amy        sub2
3            87   Allen        sub4
4            69   Alice        sub6
5            78  Ayoung        sub5
1            89   Billy        sub2
2            80   Brian        sub4
3            79    Bran        sub3
4            97   Bryce        sub6
5            88   Betty        sub5
The append() method can also take multiple objects:
import pandas as pd
one = pd.DataFrame({
'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
'subject_id':['sub1','sub2','sub4','sub6','sub5'],
'Marks_scored':[98,90,87,69,78]},
index=[1,2,3,4,5])
two = pd.DataFrame({
'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
'subject_id':['sub2','sub4','sub3','sub6','sub5'],
'Marks_scored':[89,80,79,97,88]},
index=[1,2,3,4,5])
rs = one.append([two,one,two])
print(rs)
Running the example above produces the following output:
   Marks_scored    Name  subject_id
1            98    Alex        sub1
2            90     Amy        sub2
3            87   Allen        sub4
4            69   Alice        sub6
5            78  Ayoung        sub5
1            89   Billy        sub2
2            80   Brian        sub4
3            79    Bran        sub3
4            97   Bryce        sub6
5            88   Betty        sub5
1            98    Alex        sub1
2            90     Amy        sub2
3            87   Allen        sub4
4            69   Alice        sub6
5            78  Ayoung        sub5
1            89   Billy        sub2
2            80   Brian        sub4
3            79    Bran        sub3
4            97   Bryce        sub6
5            88   Betty        sub5
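On pandas 2.0 and later, where DataFrame.append is no longer available, the same stacking can be written with concat(). The sketch below reproduces the multi-object append shown above, assuming the same one and two frames.

```python
import pandas as pd

one = pd.DataFrame({
    'Name': ['Alex', 'Amy', 'Allen', 'Alice', 'Ayoung'],
    'subject_id': ['sub1', 'sub2', 'sub4', 'sub6', 'sub5'],
    'Marks_scored': [98, 90, 87, 69, 78]},
    index=[1, 2, 3, 4, 5])
two = pd.DataFrame({
    'Name': ['Billy', 'Brian', 'Bran', 'Bryce', 'Betty'],
    'subject_id': ['sub2', 'sub4', 'sub3', 'sub6', 'sub5'],
    'Marks_scored': [89, 80, 79, 97, 88]},
    index=[1, 2, 3, 4, 5])

# Equivalent of one.append([two, one, two]): list the caller first,
# followed by the objects that would have been appended.
rs = pd.concat([one, two, one, two])
print(rs)
```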
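Pandas provides a powerful, relatively compact, and self-contained set of tools for working with time series data, especially in the financial domain. The examples that follow cover getting the current time, creating timestamps, generating ranges of times, and converting values to datetimes.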
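Timestamp.now() returns the current date and time. (Older code calls pd.datetime.now(), but the pd.datetime alias was removed in pandas 2.0.)
import pandas as pd
print(pd.Timestamp.now())
Running the code above produces output like the following: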
2017-11-03 02:17:45.997992
Timestamped data is the most basic type of time series data; it associates values with points in time. For pandas objects, this means using points in time as the index or as values. For example:
import pandas as pd
time = pd.Timestamp('2018-11-01')
print(time)
Running the example above produces the following output:
2018-11-01 00:00:00
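A Timestamp exposes its components as attributes, which is often the reason for creating one in the first place. A brief sketch using the same date as above (the variable name ts is chosen here for illustration):

```python
import pandas as pd

ts = pd.Timestamp('2018-11-01')

# Calendar components are available as plain attributes.
print(ts.year, ts.month, ts.day)

# day_name() returns the weekday name for the timestamp.
print(ts.day_name())
```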
Integer or float epoch times can also be converted. The default unit for these is nanoseconds (since that is how Timestamps are stored). However, epochs are often stored in another unit, which can be specified. Another example:
import pandas as pd
time = pd.Timestamp(1588686880,unit='s')
print(time)
Running the example above produces the following output:
2020-05-05 13:54:40
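To make the role of unit concrete, the sketch below builds the same instant from seconds, milliseconds, and the default nanoseconds. The variable names are illustrative only.

```python
import pandas as pd

# The same instant expressed in three different epoch units.
from_seconds = pd.Timestamp(1588686880, unit='s')
from_millis = pd.Timestamp(1588686880000, unit='ms')
from_nanos = pd.Timestamp(1588686880 * 10**9)  # no unit given: nanoseconds

print(from_seconds)
print(from_seconds == from_millis == from_nanos)
```

Passing the seconds value without unit='s' would be read as nanoseconds and land a couple of seconds after 1970-01-01, which is a common source of wrong dates.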
The date_range function can generate a range of times, here every 30 minutes from 12:00 to 23:59:
import pandas as pd
time = pd.date_range("12:00", "23:59", freq="30min").time
print(time)
Running the example above produces the following output:
[datetime.time(12, 0) datetime.time(12, 30) datetime.time(13, 0)
datetime.time(13, 30) datetime.time(14, 0) datetime.time(14, 30)
datetime.time(15, 0) datetime.time(15, 30) datetime.time(16, 0)
datetime.time(16, 30) datetime.time(17, 0) datetime.time(17, 30)
datetime.time(18, 0) datetime.time(18, 30) datetime.time(19, 0)
datetime.time(19, 30) datetime.time(20, 0) datetime.time(20, 30)
datetime.time(21, 0) datetime.time(21, 30) datetime.time(22, 0)
datetime.time(22, 30) datetime.time(23, 0) datetime.time(23, 30)]
Changing the frequency to hourly:
import pandas as pd
time = pd.date_range("12:00", "23:59", freq="H").time
print(time)
Running the example above produces the following output:
[datetime.time(12, 0) datetime.time(13, 0) datetime.time(14, 0)
datetime.time(15, 0) datetime.time(16, 0) datetime.time(17, 0)
datetime.time(18, 0) datetime.time(19, 0) datetime.time(20, 0)
datetime.time(21, 0) datetime.time(22, 0) datetime.time(23, 0)]
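Instead of giving both endpoints, date_range can also be driven by a start point and a number of periods. A minimal sketch, with dates chosen only for illustration:

```python
import pandas as pd

# Five consecutive calendar days starting on 2018-11-01.
days = pd.date_range('2018-11-01', periods=5, freq='D')

# Four half-hour steps starting at noon on the same day.
half_hours = pd.date_range('2018-11-01 12:00', periods=4, freq='30min')

print(days)
print(half_hours)
```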
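To convert a Series or list-like object of date-like values (for example strings, epochs, or a mixture), use the to_datetime function. When passed a Series, it returns a Series (with the same index), while a list-like is converted to a DatetimeIndex. On pandas 2.0 and later, inputs whose elements use different date formats, like the ones in these examples, may need format='mixed' to parse. Consider the following example: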
import pandas as pd
time = pd.to_datetime(pd.Series(['Jul 31, 2009','2019-10-10', None]))
print(time)
Running the example above produces the following output:
0 2009-07-31
1 2019-10-10
2 NaT
dtype: datetime64[ns]
NaT means Not a Time (the datetime equivalent of NaN).
Another example:
import pandas as pd
time = pd.to_datetime(['2009/11/23', '2019.12.31', None])
print(time)
Running the example above produces the following output:
DatetimeIndex(['2009-11-23', '2019-12-31', 'NaT'], dtype='datetime64[ns]', freq=None)
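Real data often contains values that cannot be parsed, or dates in a layout that is ambiguous without a hint. The sketch below shows two to_datetime options that help with this: errors='coerce' and an explicit format. The input values are made up for illustration.

```python
import pandas as pd

# errors='coerce' turns unparseable entries into NaT instead of raising.
raw = pd.Series(['2019-10-10', 'not a date', None])
parsed = pd.to_datetime(raw, errors='coerce')
print(parsed)

# An explicit format removes ambiguity, here day/month/year.
explicit = pd.to_datetime(['23/11/2009', '31/12/2019'], format='%d/%m/%Y')
print(explicit)
```

Without errors='coerce', the 'not a date' entry would raise an exception rather than produce NaT.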