Pandas :: to_datetime type conversion error with web-scraped data


How to fix TypeError: Argument 'date_string' has incorrect type (expected str, got NavigableString)

 

🙄 Error

 

I collected date data by web scraping and put the values into a pandas DataFrame as they were, in string form.

data
Out[216]: 
['2020.08.26 10:14',
 '2020.08.26 11:47',
 '2020.08.26 13:40',
 '2020.08.26 12:17',
 '2020.08.26 12:08',
 '2020.08.27 13:15',
 '2020.08.26 13:57',
 '2020.08.26 09:18',
 '2020.08.26 09:04',
 '2020.08.26 10:04']
 
 
df = pd.DataFrame({'datetime': data})

df
Out[218]: 
           datetime
0  2020.08.26 10:14
1  2020.08.26 11:47
2  2020.08.26 13:40
3  2020.08.26 12:17
4  2020.08.26 12:08
5  2020.08.27 13:15
6  2020.08.26 13:57
7  2020.08.26 09:18
8  2020.08.26 09:04
9  2020.08.26 10:04

 

The datetime column has the object dtype, as shown below.

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 1 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   datetime  10 non-null     object
dtypes: object(1)
memory usage: 208.0+ bytes

 

However, converting this column to the datetime type raises a TypeError.

df.datetime = pd.to_datetime(df.datetime, format = '%Y.%m.%d %H:%M')
Traceback (most recent call last):

  File "pandas\_libs\tslib.pyx", line 605, in pandas._libs.tslib.array_to_datetime

TypeError: Expected unicode, got NavigableString


During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "<ipython-input-220-f80b1cd7e09c>", line 1, in <module>
    df.datetime = pd.to_datetime(df.datetime, format = '%Y.%m.%d %H:%M')

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py", line 728, in to_datetime
    values = convert_listlike(arg._values, format)

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\tools\datetimes.py", line 440, in _convert_listlike_datetimes
    result, tz_parsed = objects_to_datetime64ns(

  File "C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\arrays\datetimes.py", line 1848, in objects_to_datetime64ns
    result, tz_parsed = tslib.array_to_datetime(

  File "pandas\_libs\tslib.pyx", line 481, in pandas._libs.tslib.array_to_datetime

  File "pandas\_libs\tslib.pyx", line 703, in pandas._libs.tslib.array_to_datetime

  File "pandas\_libs\tslib.pyx", line 828, in pandas._libs.tslib.array_to_datetime_object

TypeError: Argument 'date_string' has incorrect type (expected str, got NavigableString)

 

 

A NavigableString object is the text inside a tag. It is a subclass of str, but as the traceback shows, pandas does not accept it here where a plain str is expected.

 

type(df.datetime[0])
Out[226]: bs4.element.NavigableString
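
As a minimal sketch of where these objects come from, the snippet below parses a small HTML fragment with BeautifulSoup. The HTML, the `li` tags, and the `date` class are made up for illustration and are not the page I actually scraped. It shows that `.string` returns a NavigableString that still passes an `isinstance(..., str)` check, while `str(...)` and `get_text()` give plain built-in str values.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML fragment standing in for the scraped page.
html = (
    '<ul>'
    '<li class="date">2020.08.26 10:14</li>'
    '<li class="date">2020.08.26 11:47</li>'
    '</ul>'
)
soup = BeautifulSoup(html, "html.parser")
tags = soup.find_all("li", class_="date")

# .string returns the text node itself, which is a bs4 NavigableString.
raw = [tag.string for tag in tags]
print(type(raw[0]).__name__)      # class name of the scraped value
print(isinstance(raw[0], str))    # it subclasses str, so this check passes

# str(...) and get_text() both produce plain built-in str objects.
as_str = [str(s) for s in raw]
as_text = [tag.get_text() for tag in tags]
print(type(as_str[0]) is str, type(as_text[0]) is str)
```

So the values look and print like ordinary strings, which is why the problem is easy to miss until something checks for the exact str type.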

 

 


💡 Solution

To fix this, convert each value to str before putting it into the DataFrame.

df = pd.DataFrame({'datetime': [str(i) for i in data]})  # ⭐⭐⭐

 

 

The Dtype is still object, as before, but you can confirm that each element is now a str.

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 1 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   datetime  10 non-null     object
dtypes: object(1)
memory usage: 208.0+ bytes

type(df.datetime[0])
Out[229]: str

 

 

Applying to_datetime now runs without any problems.

df.datetime = pd.to_datetime(df.datetime, format = '%Y.%m.%d %H:%M')

df.datetime
Out[231]: 
0   2020-08-26 10:14:00
1   2020-08-26 11:47:00
2   2020-08-26 13:40:00
3   2020-08-26 12:17:00
4   2020-08-26 12:08:00
5   2020-08-27 13:15:00
6   2020-08-26 13:57:00
7   2020-08-26 09:18:00
8   2020-08-26 09:04:00
9   2020-08-26 10:04:00
Name: datetime, dtype: datetime64[ns]

df.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10 entries, 0 to 9
Data columns (total 1 columns):
 #   Column    Non-Null Count  Dtype         
---  ------    --------------  -----         
 0   datetime  10 non-null     datetime64[ns]
dtypes: datetime64[ns](1)
memory usage: 208.0 bytes
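
If the DataFrame has already been built from the scraped values, the same conversion can probably be done on the column afterwards. The sketch below is one way to do it, not something I ran on the original data. `ScrapedText` is a hypothetical stand-in for NavigableString (just a str subclass) so that the snippet runs without bs4. `Series.map(str)` calls `str()` on every element before `to_datetime` parses them.

```python
import pandas as pd


class ScrapedText(str):
    """Hypothetical stand-in for bs4's NavigableString (a str subclass)."""


# Values that behave like scraped text nodes rather than plain str.
data = [ScrapedText("2020.08.26 10:14"), ScrapedText("2020.08.27 13:15")]
df = pd.DataFrame({"datetime": data})

# Turn every element into a plain str, then parse with the known format.
plain = df["datetime"].map(str)
df["datetime"] = pd.to_datetime(plain, format="%Y.%m.%d %H:%M")

print(df["datetime"])
```

I used `map(str)` here because it explicitly calls `str()` on each element, which is the same thing the list comprehension does before the DataFrame is created.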