head 1.59; access; symbols pkgsrc-2023Q4:1.56.0.2 pkgsrc-2023Q4-base:1.56 pkgsrc-2023Q3:1.48.0.2 pkgsrc-2023Q3-base:1.48 pkgsrc-2023Q2:1.44.0.2 pkgsrc-2023Q2-base:1.44 pkgsrc-2023Q1:1.43.0.2 pkgsrc-2023Q1-base:1.43 pkgsrc-2022Q4:1.41.0.2 pkgsrc-2022Q4-base:1.41 pkgsrc-2022Q3:1.39.0.4 pkgsrc-2022Q3-base:1.39 pkgsrc-2022Q2:1.39.0.2 pkgsrc-2022Q2-base:1.39 pkgsrc-2022Q1:1.37.0.2 pkgsrc-2022Q1-base:1.37 pkgsrc-2021Q4:1.35.0.2 pkgsrc-2021Q4-base:1.35 pkgsrc-2021Q3:1.33.0.4 pkgsrc-2021Q3-base:1.33 pkgsrc-2021Q2:1.33.0.2 pkgsrc-2021Q2-base:1.33 pkgsrc-2021Q1:1.31.0.4 pkgsrc-2021Q1-base:1.31 pkgsrc-2020Q4:1.31.0.2 pkgsrc-2020Q4-base:1.31 pkgsrc-2020Q3:1.30.0.6 pkgsrc-2020Q3-base:1.30 pkgsrc-2020Q2:1.30.0.4 pkgsrc-2020Q2-base:1.30 pkgsrc-2020Q1:1.30.0.2 pkgsrc-2020Q1-base:1.30 pkgsrc-2019Q4:1.28.0.8 pkgsrc-2019Q4-base:1.28 pkgsrc-2019Q3:1.28.0.4 pkgsrc-2019Q3-base:1.28 pkgsrc-2019Q2:1.28.0.2 pkgsrc-2019Q2-base:1.28 pkgsrc-2019Q1:1.27.0.6 pkgsrc-2019Q1-base:1.27 pkgsrc-2018Q4:1.27.0.4 pkgsrc-2018Q4-base:1.27 pkgsrc-2018Q3:1.27.0.2 pkgsrc-2018Q3-base:1.27 pkgsrc-2018Q2:1.23.0.2 pkgsrc-2018Q2-base:1.23 pkgsrc-2018Q1:1.21.0.2 pkgsrc-2018Q1-base:1.21 pkgsrc-2017Q4:1.19.0.2 pkgsrc-2017Q4-base:1.19 pkgsrc-2017Q3:1.17.0.4 pkgsrc-2017Q3-base:1.17 pkgsrc-2017Q2:1.16.0.2 pkgsrc-2017Q2-base:1.16 pkgsrc-2017Q1:1.14.0.2 pkgsrc-2017Q1-base:1.14 pkgsrc-2016Q4:1.13.0.4 pkgsrc-2016Q4-base:1.13 pkgsrc-2016Q3:1.13.0.2 pkgsrc-2016Q3-base:1.13 pkgsrc-2016Q2:1.10.0.2 pkgsrc-2016Q2-base:1.10 pkgsrc-2016Q1:1.9.0.2 pkgsrc-2016Q1-base:1.9 pkgsrc-2015Q4:1.8.0.4 pkgsrc-2015Q4-base:1.8 pkgsrc-2015Q3:1.8.0.2 pkgsrc-2015Q3-base:1.8 pkgsrc-2015Q2:1.7.0.8 pkgsrc-2015Q2-base:1.7 pkgsrc-2015Q1:1.7.0.6 pkgsrc-2015Q1-base:1.7 pkgsrc-2014Q4:1.7.0.4 pkgsrc-2014Q4-base:1.7 pkgsrc-2014Q3:1.7.0.2 pkgsrc-2014Q3-base:1.7 pkgsrc-2014Q2:1.6.0.4 pkgsrc-2014Q2-base:1.6 pkgsrc-2014Q1:1.6.0.2 pkgsrc-2014Q1-base:1.6 pkgsrc-2013Q4:1.5.0.2 pkgsrc-2013Q4-base:1.5 pkgsrc-2013Q3:1.4.0.4 pkgsrc-2013Q3-base:1.4 
pkgsrc-2013Q2:1.4.0.2 pkgsrc-2013Q2-base:1.4 pkgsrc-2013Q1:1.3.0.2 pkgsrc-2013Q1-base:1.3 pkgsrc-2012Q4:1.1.0.2 pkgsrc-2012Q4-base:1.1; locks; strict; comment @# @; 1.59 date 2024.03.06.18.56.35; author adam; state Exp; branches; next 1.58; commitid apwENQsquiZge81F; 1.58 date 2024.01.24.16.31.15; author wiz; state Exp; branches; next 1.57; commitid m2Rbvho7calcMIVE; 1.57 date 2024.01.20.08.18.55; author adam; state Exp; branches; next 1.56; commitid plWDn8ZkflgbbaVE; 1.56 date 2023.12.15.09.29.59; author adam; state Exp; branches; next 1.55; commitid zbbBgnjRjwchJxQE; 1.55 date 2023.11.11.10.04.38; author adam; state Exp; branches; next 1.54; commitid zQQAPODC2hs11bME; 1.54 date 2023.10.29.17.39.51; author adam; state Exp; branches; next 1.53; commitid q0MMQlHP2YK2XxKE; 1.53 date 2023.10.28.19.57.11; author wiz; state Exp; branches; next 1.52; commitid jP8MYROLWZ3yJqKE; 1.52 date 2023.10.23.06.37.48; author wiz; state Exp; branches; next 1.51; commitid 4YdPmMYgk9hutIJE; 1.51 date 2023.10.15.00.05.44; author gutteridge; state Exp; branches; next 1.50; commitid CsMrbslinEgCyEIE; 1.50 date 2023.10.05.04.46.05; author gutteridge; state Exp; branches; next 1.49; commitid A393n6Vj0upproHE; 1.49 date 2023.09.28.16.01.24; author adam; state Exp; branches; next 1.48; commitid PHdrGnk8C8v2pyGE; 1.48 date 2023.09.02.07.19.56; author adam; state Exp; branches; next 1.47; commitid uZXNSo2g9itYlaDE; 1.47 date 2023.08.28.10.34.02; author adam; state Exp; branches; next 1.46; commitid Qvcvmncue7M9AxCE; 1.46 date 2023.08.01.23.20.47; author wiz; state Exp; branches; next 1.45; commitid lyjXpsSeA6xpH8zE; 1.45 date 2023.07.01.08.37.40; author wiz; state Exp; branches; next 1.44; commitid OGZpaIgVtdY8O4vE; 1.44 date 2023.04.25.13.51.49; author jperkin; state Exp; branches; next 1.43; commitid trWrwpi6ic7zHumE; 1.43 date 2023.01.28.19.47.54; author he; state Exp; branches; next 1.42; commitid ZiIHzlsUDZWvplbE; 1.42 date 2023.01.25.14.05.16; author adam; state Exp; branches; next 1.41; 
commitid Pxa7m9za1TInFVaE; 1.41 date 2022.12.05.22.42.54; author adam; state Exp; branches; next 1.40; commitid bZmaW00hIv6Oaq4E; 1.40 date 2022.11.28.21.46.51; author adam; state Exp; branches; next 1.39; commitid kRIIrqmbmCJm5w3E; 1.39 date 2022.04.10.00.57.14; author gutteridge; state Exp; branches; next 1.38; commitid 53ZWhv2DKvpNCAzD; 1.38 date 2022.04.09.21.33.50; author gutteridge; state Exp; branches; next 1.37; commitid iEhTUVgx5qUHuzzD; 1.37 date 2022.01.04.20.54.15; author wiz; state Exp; branches; next 1.36; commitid CYyhdK9qtoffkmnD; 1.36 date 2021.12.30.13.05.37; author adam; state Exp; branches; next 1.35; commitid w23rFuQ4pTWhUFmD; 1.35 date 2021.12.12.20.30.49; author adam; state Exp; branches; next 1.34; commitid v9tlmxP2iEpOWokD; 1.34 date 2021.11.21.16.31.26; author ryoon; state Exp; branches; next 1.33; commitid g3o4PpeSKt8EiGhD; 1.33 date 2021.05.06.04.39.03; author adam; state Exp; branches; next 1.32; commitid s4D9C9Pp7yGHJ2SC; 1.32 date 2021.04.09.14.41.35; author tnn; state Exp; branches; next 1.31; commitid UfqIfcWkKjrgXCOC; 1.31 date 2020.10.12.21.52.03; author bacon; state Exp; branches; next 1.30; commitid 568C66J21E1N0FrC; 1.30 date 2020.02.14.16.21.55; author minskim; state Exp; branches; next 1.29; commitid YZPROZPA2dQP0FWB; 1.29 date 2020.01.26.17.31.40; author rillig; state Exp; branches; next 1.28; commitid 4fBBvoSLJaGd0eUB; 1.28 date 2019.06.16.19.14.52; author adam; state Exp; branches; next 1.27; commitid cxDFtA82awiWLrrB; 1.27 date 2018.08.10.09.00.36; author adam; state Exp; branches; next 1.26; commitid ik4skKD4qjGllyNA; 1.26 date 2018.07.09.08.22.45; author adam; state Exp; branches; next 1.25; commitid VcA5GRljVG8carJA; 1.25 date 2018.07.05.01.21.05; author minskim; state Exp; branches; next 1.24; commitid nAn5UhLcLUdxXSIA; 1.24 date 2018.07.04.06.50.04; author adam; state Exp; branches; next 1.23; commitid 7EiP3cxc0BJsOMIA; 1.23 date 2018.06.18.07.08.23; author adam; state Exp; branches; next 1.22; commitid 
5f4VZmoJo4TzqJGA; 1.22 date 2018.05.30.07.56.30; author adam; state Exp; branches; next 1.21; commitid GynzKIWaQSFJiiEA; 1.21 date 2018.01.30.09.21.44; author adam; state Exp; branches; next 1.20; commitid BJXBvFMKX0SnDSoA; 1.20 date 2018.01.05.16.13.51; author adam; state Exp; branches; next 1.19; commitid N73JztUbRj2sIHlA; 1.19 date 2017.12.14.13.37.59; author adam; state Exp; branches; next 1.18; commitid 7UAKwdWWRH5JyRiA; 1.18 date 2017.11.02.09.41.38; author adam; state Exp; branches; next 1.17; commitid 2z9pUtR8tynrBrdA; 1.17 date 2017.07.14.10.17.02; author adam; state Exp; branches; next 1.16; commitid 5vzZRH0nvNEhmbZz; 1.16 date 2017.06.07.08.13.56; author adam; state Exp; branches; next 1.15; commitid tIsHkxBK6bnpSpUz; 1.15 date 2017.05.21.08.54.33; author adam; state Exp; branches; next 1.14; commitid qBxZiO8LHpRbEeSz; 1.14 date 2017.02.20.17.00.36; author wiz; state Exp; branches; next 1.13; commitid 1rcVYtkuiSEheIGz; 1.13 date 2016.08.19.07.57.26; author wiz; state Exp; branches; next 1.12; commitid viJaEeqShscqaTiz; 1.12 date 2016.08.16.03.22.12; author maya; state Exp; branches; next 1.11; commitid YlwpqeUWYu2VItiz; 1.11 date 2016.07.15.07.24.06; author wiz; state Exp; branches; next 1.10; commitid 2kWgpQ0EeUvZ6oez; 1.10 date 2016.06.08.17.43.35; author wiz; state Exp; branches; next 1.9; commitid z4yEulWexjFaJG9z; 1.9 date 2015.12.28.14.35.02; author wiz; state Exp; branches; next 1.8; commitid LcJx48MOVdP4VIOy; 1.8 date 2015.07.21.19.44.45; author bad; state Exp; branches; next 1.7; commitid 7Phx94ISCVTsLbuy; 1.7 date 2014.07.19.13.17.46; author bad; state Exp; branches; next 1.6; commitid IAssOy1lVUUWpZIx; 1.6 date 2014.01.16.10.41.53; author wiz; state Exp; branches; next 1.5; commitid TlG85y817smpuklx; 1.5 date 2013.12.10.13.00.30; author bad; state Exp; branches; next 1.4; commitid M120slErhBp7rAgx; 1.4 date 2013.05.16.23.10.16; author bad; state Exp; branches; next 1.3; commitid XXy8WBZJi6fkvUPw; 1.3 date 2013.02.16.00.02.19; author bad; state 
Exp; branches; next 1.2; 1.2 date 2013.01.07.23.18.35; author bad; state Exp; branches; next 1.1; 1.1 date 2012.11.22.00.15.13; author bad; state Exp; branches; next ; desc @@ 1.59 log @py-pandas: updated to 2.2.1 What’s new in 2.2.1 (February 22, 2024) These are the changes in pandas 2.2.1. See Release notes for a full changelog including other versions of pandas. Enhancements Added pyarrow pip extra so users can install pandas and pyarrow with pip with pip install pandas[pyarrow] (GH 54466) Fixed regressions Fixed memory leak in read_csv() (GH 57039) Fixed performance regression in Series.combine_first() (GH 55845) Fixed regression causing overflow for near-minimum timestamps (GH 57150) Fixed regression in concat() changing long-standing behavior that always sorted the non-concatenation axis when the axis was a DatetimeIndex (GH 57006) Fixed regression in merge_ordered() raising TypeError for fill_method="ffill" and how="left" (GH 57010) Fixed regression in pandas.testing.assert_series_equal() defaulting to check_exact=True when checking the Index (GH 57067) Fixed regression in read_json() where an Index would be returned instead of a RangeIndex (GH 57429) Fixed regression in wide_to_long() raising an AttributeError for string columns (GH 57066) Fixed regression in DataFrameGroupBy.idxmin(), DataFrameGroupBy.idxmax(), SeriesGroupBy.idxmin(), SeriesGroupBy.idxmax() ignoring the skipna argument (GH 57040) Fixed regression in DataFrameGroupBy.idxmin(), DataFrameGroupBy.idxmax(), SeriesGroupBy.idxmin(), SeriesGroupBy.idxmax() where values containing the minimum or maximum value for the dtype could produce incorrect results (GH 57040) Fixed regression in CategoricalIndex.difference() raising KeyError when other contains null values other than NaN (GH 57318) Fixed regression in DataFrame.groupby() raising ValueError when grouping by a Series in some cases (GH 57276) Fixed regression in DataFrame.loc() raising IndexError for non-unique, masked dtype indexes where result 
has more than 10,000 rows (GH 57027) Fixed regression in DataFrame.loc() which was unnecessarily throwing “incompatible dtype warning” when expanding with partial row indexer and multiple columns (see PDEP6) (GH 56503) Fixed regression in DataFrame.map() with na_action="ignore" not being respected for NumPy nullable and ArrowDtypes (GH 57316) Fixed regression in DataFrame.merge() raising ValueError for certain types of 3rd-party extension arrays (GH 57316) Fixed regression in DataFrame.query() with an all-NaT column with object dtype (GH 57068) Fixed regression in DataFrame.shift() raising AssertionError for axis=1 and empty DataFrame (GH 57301) Fixed regression in DataFrame.sort_index() not producing a stable sort for an index with duplicates (GH 57151) Fixed regression in DataFrame.to_dict() with orient='list' and datetime or timedelta types returning integers (GH 54824) Fixed regression in DataFrame.to_json() converting nullable integers to floats (GH 57224) Fixed regression in DataFrame.to_sql() when method="multi" is passed and the dialect type is not Oracle (GH 57310) Fixed regression in DataFrame.transpose() with nullable extension dtypes not having F-contiguous data potentially causing exceptions when used (GH 57315) Fixed regression in DataFrame.update() emitting incorrect warnings about downcasting (GH 57124) Fixed regression in ExtensionArray.to_numpy() raising for non-numeric masked dtypes (GH 56991) Fixed regression in Index.join() raising TypeError when joining an empty index to a non-empty index containing mixed dtype values (GH 57048) Fixed regression in 
Series.astype() introducing decimals when converting from integer with missing values to string dtype (GH 57418) Fixed regression in Series.pct_change() raising a ValueError for an empty Series (GH 57056) Fixed regression in Series.to_numpy() when dtype is given as float and the data contains NaNs (GH 57121) Fixed regression in addition or subtraction of DateOffset objects with millisecond components to datetime64 Index, Series, or DataFrame (GH 57529) Bug fixes Fixed bug in pandas.api.interchange.from_dataframe() which was raising for Nullable integers (GH 55069) Fixed bug in pandas.api.interchange.from_dataframe() which was raising for empty inputs (GH 56700) Fixed bug in pandas.api.interchange.from_dataframe() which wasn’t converting columns names to strings (GH 55069) Fixed bug in DataFrame.__getitem__() for empty DataFrame with Copy-on-Write enabled (GH 57130) Fixed bug in PeriodIndex.asfreq() which was silently converting frequencies which are not supported as period frequencies instead of raising an error (GH 56945) @ text @# $NetBSD: Makefile,v 1.58 2024/01/24 16:31:15 wiz Exp $ DISTNAME= pandas-2.2.1 PKGNAME= ${PYPKGPREFIX}-${DISTNAME} CATEGORIES= math graphics python MASTER_SITES= ${MASTER_SITE_PYPI:=p/pandas/} MAINTAINER= bad@@NetBSD.org HOMEPAGE= https://pandas.pydata.org/ COMMENT= Python Data Analysis Library LICENSE= modified-bsd TOOL_DEPENDS+= ${PYPKGPREFIX}-cython>=0.29.33:../../devel/py-cython # Package directly expresses a meson minimum; we need higher to pick up our # multi-version build fixes. 
TOOL_DEPENDS+= meson>=1.2.2nb1:../../devel/meson TOOL_DEPENDS+= ${PYPKGPREFIX}-meson_python>=0.13.1:../../devel/py-meson_python TOOL_DEPENDS+= ${PYPKGPREFIX}-versioneer-[0-9]*:../../devel/py-versioneer DEPENDS+= ${PYPKGPREFIX}-dateutil>=2.8.2:../../time/py-dateutil DEPENDS+= ${PYPKGPREFIX}-pytz>=2020.1:../../time/py-pytz DEPENDS+= ${PYPKGPREFIX}-tzdata>=2022.1:../../time/py-tzdata TEST_DEPENDS+= ${PYPKGPREFIX}-hypothesis>=6.34.2:../../devel/py-hypothesis TEST_DEPENDS+= ${PYPKGPREFIX}-test-asyncio>=0.17.0:../../devel/py-test-asyncio TEST_DEPENDS+= ${PYPKGPREFIX}-test-xdist>=2.2.0:../../devel/py-test-xdist USE_LANGUAGES= c c++ USE_TOOLS+= pkg-config USE_CXX_FEATURES= c++11 # __has_builtin GCC_REQD+= 10 SUBST_CLASSES+= python SUBST_STAGE.python= pre-configure SUBST_MESSAGE.python= Fixing python binary name. SUBST_FILES.python= meson.build SUBST_VARS.python= TOOL_PYTHONBIN PYTHON_VERSIONS_INCOMPATIBLE= 27 38 # This would otherwise be installed, causing PLIST mismatch post-patch: cd ${WRKSRC} && ${RM} -f pandas/_libs/window/aggregations.pyx.orig .include "../../lang/python/batteries-included.mk" .include "../../lang/python/wheel.mk" BUILDLINK_API_DEPENDS.py-numpy+= ${PYPKGPREFIX}-numpy>=1.23.2 .include "../../math/py-numpy/buildlink3.mk" .include "../../mk/bsd.pkg.mk" @ 1.58 log @py-pandas: needs at least gcc 10 because of __has_builtin @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.57 2024/01/20 08:18:55 adam Exp $ d3 1 a3 1 DISTNAME= pandas-2.2.0 @ 1.57 log @py-pandas: updated to 2.2.0 Pandas 2.2.0 This release includes some new features, bug fixes, and performance improvements. We recommend that all users upgrade to this version. 
https://pandas.pydata.org/pandas-docs/version/2.2.0/whatsnew/v2.2.0.html @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.56 2023/12/15 09:29:59 adam Exp $ d31 3 @ 1.56 log @py-pandas: updated to 2.1.4 What’s new in 2.1.4 Fixed regressions Fixed regression when trying to read a pickled pandas DataFrame from pandas 1.3 Bug fixes Bug in Series constructor raising DeprecationWarning when index is a list of Series Bug in Series when trying to cast date-like string inputs to ArrowDtype of pyarrow.timestamp Bug in DataFrame.apply() where passing raw=True ignored args passed to the applied function Bug in Index.__getitem__() returning wrong result for Arrow dtypes and negative stepsize Fixed bug in to_numeric() converting to extension dtype for string[pyarrow_numpy] dtype Fixed bug in DataFrameGroupBy.min() and DataFrameGroupBy.max() not preserving extension dtype for empty object Fixed bug in DataFrame.__setitem__() casting Index with object-dtype to PyArrow backed strings when infer_string option is set Fixed bug in DataFrame.to_hdf() raising when columns have StringDtype Fixed bug in Index.insert() casting object-dtype to PyArrow backed strings when infer_string option is set Fixed bug in Series.__ne__() resulting in False for comparison between NA and string value for dtype="string[pyarrow_numpy]" Fixed bug in Series.mode() not keeping object dtype when infer_string is set Fixed bug in Series.reset_index() not preserving object dtype when infer_string is set Fixed bug in Series.str.split() and Series.str.rsplit() when pat=None for ArrowDtype with pyarrow.string Fixed bug in Series.str.translate() losing object dtype when string option is set @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.55 2023/11/11 10:04:38 adam Exp $ d3 1 a3 1 DISTNAME= pandas-2.1.4 @ 1.55 log @py-pandas: updated to 2.1.3 Pandas 2.1.3 This is a patch release in the 2.1.x series and includes some regression and bug fixes, and a security fix. We recommend that all users upgrade to this version. 
@ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.54 2023/10/29 17:39:51 adam Exp $ d3 1 a3 1 DISTNAME= pandas-2.1.3 @ 1.54 log @py-pandas: updated to 2.1.2 2.1.2 Deprecations Reverted deprecation of fill_method=None in DataFrame.pct_change(), Series.pct_change(), DataFrameGroupBy.pct_change(), and SeriesGroupBy.pct_change(); the values 'backfill', 'bfill', 'pad', and 'ffill' are still deprecated (GH 53491) Fixed regressions Fixed regression in DataFrame.join() where result has missing values and dtype is arrow backed string (GH 55348) Fixed regression in rolling() where non-nanosecond index or on column would produce incorrect results (GH 55026, GH 55106, GH 55299) Fixed regression in DataFrame.resample() which was extrapolating back to origin when origin was outside its bounds (GH 55064) Fixed regression in DataFrame.sort_index() which was not sorting correctly when the index was a sliced MultiIndex (GH 55379) Fixed regression in DataFrameGroupBy.agg() and SeriesGroupBy.agg() where if the option compute.use_numba was set to True, groupby methods not supported by the numba engine would raise a TypeError (GH 55520) Fixed performance regression with wide DataFrames, typically involving methods where all columns were accessed individually (GH 55256, GH 55245) Fixed regression in merge_asof() raising TypeError for by with datetime and timedelta dtypes (GH 55453) Fixed regression in read_parquet() when reading a file with a string column consisting of more than 2 GB of string data and using the "string" dtype (GH 55606) Fixed regression in DataFrame.to_sql() not roundtripping datetime columns correctly for sqlite when using detect_types (GH 55554) Fixed regression in construction of certain DataFrame or Series subclasses (GH 54922) Bug fixes Fixed bug in DataFrameGroupBy reductions not preserving object dtype when infer_string is set (GH 55620) Fixed bug in SeriesGroupBy.value_counts() returning incorrect dtype for string columns (GH 55627) Fixed bug in Categorical.equals() if 
other has arrow backed string dtype (GH 55364) Fixed bug in DataFrame.__setitem__() not inferring string dtype for zero-dimensional array with infer_string=True (GH 55366) Fixed bug in DataFrame.idxmin() and DataFrame.idxmax() raising for arrow dtypes (GH 55368) Fixed bug in DataFrame.interpolate() raising incorrect error message (GH 55347) Fixed bug in Index.insert() raising when inserting None into Index with dtype="string[pyarrow_numpy]" (GH 55365) Fixed bug in Series.all() and Series.any() not treating missing values correctly for dtype="string[pyarrow_numpy]" (GH 55367) Fixed bug in Series.floordiv() for ArrowDtype (GH 55561) Fixed bug in Series.mode() not sorting values for arrow backed string dtype (GH 55621) Fixed bug in Series.rank() for string[pyarrow_numpy] dtype (GH 55362) Fixed bug in Series.str.extractall() for ArrowDtype dtype being converted to object (GH 53846) Fixed bug where PDEP-6 warning about setting an item of an incompatible dtype was being shown when creating a new conditional column (GH 55025) Silence Period[B] warnings introduced by GH 53446 during normal plotting activity (GH 55138) Fixed bug in Series constructor not inferring string dtype when NA is the first value and infer_string is set (GH 55655) Other Fixed non-working installation of optional dependency group output_formatting. Replacing underscore _ with a dash - fixes broken dependency resolution. The correct usage is now pip install pandas[output-formatting]. @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.53 2023/10/28 19:57:11 wiz Exp $ d3 1 a3 1 DISTNAME= pandas-2.1.2 a22 1 TEST_DEPENDS+= ${PYPKGPREFIX}-test>=7.3.2:../../devel/py-test a42 3 do-test: cd ${WRKSRC} && ${SETENV} ${TEST_ENV} pytest-${PYVERSSUFFIX} pandas @ 1.53 log @python/wheel.mk: simplify a lot, and switch to 'installer' for installation This follows the recommended bootstrap method (flit_core, build, installer). 
However, installer installs different files than pip, so update PLISTs for all packages using wheel.mk and bump their PKGREVISIONs. @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.52 2023/10/23 06:37:48 wiz Exp $ d3 1 a3 1 DISTNAME= pandas-2.1.1 a4 1 PKGREVISION= 2 @ 1.52 log @*: update for Python base package change Instead of depending on one of the removed packages (that are now included in the base Python packages), include batteries-included.mk to require a Python version that supplies them. Remove now included packages. Bump PKGREVISION. @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.51 2023/10/15 00:05:44 gutteridge Exp $ d5 1 a5 1 PKGREVISION= 1 @ 1.51 log @py-pandas: fix minimum meson dependency pattern We need to force a minimum with the most recent Python multi-version patching. @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.50 2023/10/05 04:46:05 gutteridge Exp $ d5 1 a21 1 DEPENDS+= ${PYPKGPREFIX}-sqlite3-[0-9]*:../../databases/py-sqlite3 d48 1 @ 1.50 log @py-pandas: fix (sandboxed) non-default Python builds Another issue where Meson isn't versioned in pkgsrc, so we end up with it "helpfully" supplying the path to Python it believes is correct, which is wrong for any non-default Python version. (The 2.1.0 version of this package carried a similar fix, which was removed in the update to 2.1.1; a variation of it is restored here.) Separately, this package directly expresses a minimum Meson version, so reflect that as well. @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.49 2023/09/28 16:01:24 adam Exp $ d14 3 a16 1 TOOL_DEPENDS+= meson>=1.2.1:../../devel/meson # Directly expresses meson minimum @ 1.49 log @py-pandas: updated to 2.1.1 What’s new in 2.1.1 (September 20, 2023) These are the changes in pandas 2.1.1. See Release notes for a full changelog including other versions of pandas. 
Fixed regressions Fixed regression in concat() when DataFrames have two different extension dtypes (GH 54848) Fixed regression in merge() when merging over a PyArrow string index (GH 54894) Fixed regression in read_csv() when usecols is given and dtypes is a dict for engine="python" (GH 54868) Fixed regression in read_csv() when delim_whitespace is True (GH 54918, GH 54931) Fixed regression in GroupBy.get_group() raising for axis=1 (GH 54858) Fixed regression in DataFrame.__setitem__() raising AssertionError when setting a Series with a partial MultiIndex (GH 54875) Fixed regression in DataFrame.filter() not respecting the order of elements for filter (GH 54980) Fixed regression in DataFrame.to_sql() not roundtripping datetime columns correctly for sqlite (GH 54877) Fixed regression in DataFrameGroupBy.agg() when aggregating a DataFrame with duplicate column names using a dictionary (GH 55006) Fixed regression in MultiIndex.append() raising when appending overlapping IntervalIndex levels (GH 54934) Fixed regression in Series.drop_duplicates() for PyArrow strings (GH 54904) Fixed regression in Series.interpolate() raising when fill_value was given (GH 54920) Fixed regression in Series.value_counts() raising for numeric data if bins was specified (GH 54857) Fixed regression in comparison operations for PyArrow backed columns not propagating exceptions correctly (GH 54944) Fixed regression when comparing a Series with datetime64 dtype with None (GH 54870) Bug fixes Fixed bug for ArrowDtype raising NotImplementedError for fixed-size list (GH 55000) Fixed bug in DataFrame.stack() with future_stack=True and columns a non-MultiIndex consisting of tuples (GH 54948) Fixed bug in Series.dt.tz() with ArrowDtype where a string was returned instead of a tzinfo object (GH 55003) Fixed bug in Series.pct_change() and DataFrame.pct_change() showing unnecessary FutureWarning (GH 54981) Other Reverted the deprecation that disallowed Series.apply() returning a DataFrame when the 
passed-in callable returns a Series object (GH 52116) @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.48 2023/09/02 07:19:56 adam Exp $ d14 1 d31 6 @ 1.48 log @py-pandas: updated to 2.1.0 https://pandas.pydata.org/docs/whatsnew/v2.1.0.html @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.47 2023/08/28 10:34:02 adam Exp $ d3 1 a3 1 DISTNAME= pandas-2.1.0 a29 6 SUBST_CLASSES+= python SUBST_STAGE.python= pre-configure SUBST_MESSAGE.python= Fixing python binary name. SUBST_FILES.python= meson.build SUBST_SED.python= -e "s,\['python',\['python${PYVERSSUFFIX}'," @ 1.47 log @py-pandas: updated to 2.0.3 2.0.3 Fixed regressions Bug in Timestamp.weekday() was returning incorrect results before '0000-02-29' (GH53738) Fixed performance regression in merging on datetime-like columns (GH53231) Fixed regression when DataFrame.to_string() creates extra space for string dtypes (GH52690) Bug fixes Bug in DataFrame.convert_dtypes() and Series.convert_dtypes() when trying to convert ArrowDtype with dtype_backend="numpy_nullable" (GH53648) Bug in RangeIndex.union() when using sort=True with another RangeIndex (GH53490) Bug in Series.reindex() when expanding a non-nanosecond datetime or timedelta Series would not fill with NaT correctly (GH53497) Bug in read_csv() when defining dtype with bool[pyarrow] for the "c" and "python" engines (GH53390) Bug in Series.str.split() and Series.str.rsplit() with expand=True for ArrowDtype with pyarrow.string (GH53532) Bug in indexing methods (e.g. 
DataFrame.__getitem__()) where taking the entire DataFrame/Series would raise an OverflowError when Copy-on-Write was enabled and the length of the array was over the maximum size a 32-bit integer can hold (GH53616) Bug when constructing a DataFrame with columns of an ArrowDtype with a pyarrow.dictionary type that reindexes the data (GH53617) Bug when indexing a DataFrame or Series with an Index with a timestamp ArrowDtype would raise an AttributeError (GH53644) 2.0.2 Fixed regressions Fixed performance regression in GroupBy.apply() (GH53195) Fixed regression in merge() on Windows when dtype is np.intc (GH52451) Fixed regression in read_sql() dropping columns with duplicated column names (GH53117) Fixed regression in DataFrame.loc() losing MultiIndex name when enlarging object (GH53053) Fixed regression in DataFrame.to_string() printing a backslash at the end of the first row of data, instead of headers, when the DataFrame doesn’t fit the line width (GH53054) Fixed regression in MultiIndex.join() returning levels in wrong order (GH53093) Bug fixes Bug in arrays.ArrowExtensionArray incorrectly assigning dict instead of list for .type with pyarrow.map_ and raising a NotImplementedError with pyarrow.struct (GH53328) Bug in api.interchange.from_dataframe() was raising IndexError on empty categorical data (GH53077) Bug in api.interchange.from_dataframe() was returning DataFrames of incorrect sizes when called on slices (GH52824) Bug in api.interchange.from_dataframe() was unnecessarily raising on bitmasks (GH49888) Bug in merge() when merging on datetime columns on different resolutions (GH53200) Bug in read_csv() raising OverflowError for engine="pyarrow" and parse_dates set (GH53295) Bug in to_datetime() was inferring format to contain "%H" instead of "%I" if date contained “AM” / “PM” tokens (GH53147) Bug in DataFrame.convert_dtypes() ignoring convert_* keywords set to False when dtype_backend="pyarrow" (GH52872) Bug in DataFrame.convert_dtypes() losing timezone for 
tz-aware dtypes and dtype_backend="pyarrow" (GH53382) Bug in DataFrame.sort_values() raising for PyArrow dictionary dtype (GH53232) Bug in Series.describe() treating pyarrow-backed timestamps and timedeltas as categorical data (GH53001) Bug in Series.rename() not making a lazy copy when Copy-on-Write is enabled when a scalar is passed to it (GH52450) Bug in pd.array() raising for NumPy array and pa.large_string or pa.large_binary (GH52590) Bug in DataFrame.__getitem__() not preserving dtypes for MultiIndex partial keys (GH51895) 2.0.1 Fixed regressions Fixed regression for subclassed Series when constructing from a dictionary (GH52445) Fixed regression in SeriesGroupBy.agg() failing when grouping with categorical data, multiple groupings, as_index=False, and a list of aggregations (GH52760) Fixed regression in DataFrame.pivot() changing Index name of input object (GH52629) Fixed regression in DataFrame.resample() raising on a DataFrame with no columns (GH52484) Fixed regression in DataFrame.sort_values() not resetting index when DataFrame is already sorted and ignore_index=True (GH52553) Fixed regression in MultiIndex.isin() raising TypeError for Generator (GH52568) Fixed regression in Series.describe() showing RuntimeWarning for extension dtype Series with one element (GH52515) Fixed regression when adding a new column to a DataFrame when the DataFrame.columns was a RangeIndex and the new key was hashable but not a scalar (GH52652) Bug fixes Bug in Series.dt.days that would overflow int32 number of days (GH52391) Bug in arrays.DatetimeArray constructor returning an incorrect unit when passed a non-nanosecond numpy datetime array (GH52555) Bug in ArrowExtensionArray with duration dtype overflowing when constructed from data containing numpy NaT (GH52843) Bug in Series.dt.round() when passing a freq of equal or higher resolution compared to the Series would raise a ZeroDivisionError (GH52761) Bug in Series.median() with ArrowDtype returning an approximate median 
(GH52679) Bug in api.interchange.from_dataframe() was unnecessarily raising on categorical dtypes (GH49889) Bug in api.interchange.from_dataframe() was unnecessarily raising on large string dtypes (GH52795) Bug in pandas.testing.assert_series_equal() where check_dtype=False would still raise for datetime or timedelta types with different resolutions (GH52449) Bug in read_csv() casting PyArrow datetimes to NumPy when dtype_backend="pyarrow" and parse_dates is set causing a performance bottleneck in the process (GH52546) Bug in to_datetime() and to_timedelta() when trying to convert numeric data with an ArrowDtype (GH52425) Bug in to_numeric() with errors='coerce' and dtype_backend='pyarrow' with ArrowDtype data (GH52588) Bug in ArrowDtype.__from_arrow__() not respecting if dtype is explicitly given (GH52533) Bug in DataFrame.describe() not respecting ArrowDtype in include and exclude (GH52570) Bug in DataFrame.max() and related casting different Timestamp resolutions always to nanoseconds (GH52524) Bug in Series.describe() not returning ArrowDtype with pyarrow.float64 type with numeric data (GH52427) Bug in Series.dt.tz_localize() incorrectly localizing timestamps with ArrowDtype (GH52677) Bug in arithmetic between np.datetime64 and np.timedelta64 NaT scalars with units always returning nanosecond resolution (GH52295) Bug in logical and comparison operations between ArrowDtype and numpy masked types (e.g. "boolean") (GH52625) Fixed bug in merge() when merging with ArrowDtype on one side and a NumPy dtype on the other side (GH52406) Fixed segfault in Series.to_numpy() with null[pyarrow] dtype (GH52443) Other DataFrame created from empty dicts had columns of dtype object. It is now a RangeIndex (GH52404) Series created from empty dicts had index of dtype object. 
It is now a RangeIndex (GH52404) Implemented Series.str.split() and Series.str.rsplit() for ArrowDtype with pyarrow.string (GH52401) Implemented most str accessor methods for ArrowDtype with pyarrow.string (GH52401) Supplying a non-integer hashable key that tests False in api.types.is_scalar() now raises a KeyError for RangeIndex.get_loc(), like it does for Index.get_loc(). Previously it raised an InvalidIndexError (GH52652). @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.46 2023/08/01 23:20:47 wiz Exp $ d3 1 a3 1 DISTNAME= pandas-2.0.3 d14 1 d25 2 a26 1 USE_LANGUAGES= c c++11 d28 7 a34 1 GCC_REQD+= 8 a37 3 do-test: cd ${WRKSRC} && ${SETENV} ${TEST_ENV} pytest-${PYVERSSUFFIX} pandas d42 3 @ 1.46 log @*: remove more references to Python 3.7 @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.45 2023/07/01 08:37:40 wiz Exp $ d3 1 a3 1 DISTNAME= pandas-1.5.3 d13 3 a15 1 DEPENDS+= ${PYPKGPREFIX}-dateutil>=2.8.1:../../time/py-dateutil d18 5 a22 4 TEST_DEPENDS+= ${PYPKGPREFIX}-hypothesis>=5.5.3:../../devel/py-hypothesis TEST_DEPENDS+= ${PYPKGPREFIX}-test>=6.0:../../devel/py-test TEST_DEPENDS+= ${PYPKGPREFIX}-test-asyncio-[0-9]*:../../devel/py-test-asyncio TEST_DEPENDS+= ${PYPKGPREFIX}-test-xdist>=1.31:../../devel/py-test-xdist d35 1 a35 1 cd ${WRKSRC} && rm -f pandas/_libs/window/aggregations.pyx.orig d37 1 a37 1 .include "../../lang/python/egg.mk" @ 1.45 log @*: restrict py-numpy users to 3.9+ in preparation for update @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.44 2023/04/25 13:51:49 jperkin Exp $ d25 1 a25 1 PYTHON_VERSIONS_INCOMPATIBLE= 27 37 38 @ 1.44 log @*: GCC_REQD must always be appended to. @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.43 2023/01/28 19:47:54 he Exp $ d25 1 a25 1 PYTHON_VERSIONS_INCOMPATIBLE= 27 37 # py-scipy @ 1.43 log @math/py-pandas: note upstream pull request, and remove .orig file. The .orig file would otherwise be installed, causing a PLIST mismatch.
@ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.42 2023/01/25 14:05:16 adam Exp $ d23 1 a23 1 GCC_REQD= 8 @ 1.42 log @py-pandas: updated to 1.5.3 What's new in 1.5.3 (January 18, 2023) -------------------------------------- These are the changes in pandas 1.5.3. See :ref:`release` for a full changelog including other versions of pandas. Fixed regressions ~~~~~~~~~~~~~~~~~ - Fixed performance regression in :meth:`Series.isin` when ``values`` is empty (:issue:`49839`) - Fixed regression in :meth:`DataFrame.memory_usage` showing unnecessary ``FutureWarning`` when :class:`DataFrame` is empty (:issue:`50066`) - Fixed regression in :meth:`.DataFrameGroupBy.transform` when used with ``as_index=False`` (:issue:`49834`) - Enforced reversion of ``color`` as an alias for ``c`` and ``size`` as an alias for ``s`` in function :meth:`DataFrame.plot.scatter` (:issue:`49732`) - Fixed regression in :meth:`.SeriesGroupBy.apply` setting a ``name`` attribute on the result if the result was a :class:`DataFrame` (:issue:`49907`) - Fixed performance regression in setting with the :meth:`~DataFrame.at` indexer (:issue:`49771`) - Fixed regression in the methods ``apply``, ``agg``, and ``transform`` when used with NumPy functions that informed users to supply ``numeric_only=True`` if the operation failed on non-numeric dtypes; such columns must be dropped prior to using these methods (:issue:`50538`) - Fixed regression in :func:`to_datetime` raising ``ValueError`` when parsing array of ``float`` containing ``np.nan`` (:issue:`50237`) Bug fixes ~~~~~~~~~ - Bug in the Copy-on-Write implementation losing track of views when indexing a :class:`DataFrame` with another :class:`DataFrame` (:issue:`50630`) - Bug in :meth:`.Styler.to_excel` leading to error when unrecognized ``border-style`` (e.g. 
``"hair"``) provided to Excel writers (:issue:`48649`) - Bug in :meth:`Series.quantile` emitting warning from NumPy when :class:`Series` has only ``NA`` values (:issue:`50681`) - Bug when chaining several :meth:`.Styler.concat` calls, only the last styler was concatenated (:issue:`49207`) - Fixed bug when instantiating a :class:`DataFrame` subclass inheriting from ``typing.Generic`` that triggered a ``UserWarning`` on python 3.11 (:issue:`49649`) - Bug in :func:`pivot_table` with NumPy 1.24 or greater when the :class:`DataFrame` columns has nested elements (:issue:`50342`) - Bug in :func:`pandas.testing.assert_series_equal` (and equivalent ``assert_`` functions) when having nested data and using numpy >= 1.25 (:issue:`50360`) Other ~~~~~ If you are using :meth:`DataFrame.to_sql`, :func:`read_sql`, :func:`read_sql_table`, or :func:`read_sql_query` with SQLAlchemy 1.4.46 or greater, you may see a ``sqlalchemy.exc.RemovedIn20Warning``. These warnings can be safely ignored for the SQLAlchemy 1.4.x releases as pandas works toward compatibility with SQLAlchemy 2.0. - Reverted deprecation (:issue:`45324`) of behavior of :meth:`Series.__getitem__` and :meth:`Series.__setitem__` slicing with an integer :class:`Index`; this will remain positional (:issue:`49612`) - A ``FutureWarning`` raised when attempting to set values inplace with :meth:`DataFrame.loc` or :meth:`DataFrame.iloc` has been changed to a ``DeprecationWarning`` (:issue:`48673`) @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.41 2022/12/05 22:42:54 adam Exp $ d30 4 @ 1.41 log @py-pandas: needs C++ and GCC >= 8 @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.40 2022/11/28 21:46:51 adam Exp $ d3 1 a3 1 DISTNAME= pandas-1.5.2 @ 1.40 log @py-pandas: updated to 1.5.2 What's new in 1.5.2 (November 21, 2022) --------------------------------------- These are the changes in pandas 1.5.2. See :ref:`release` for a full changelog including other versions of pandas. 
Fixed regressions ~~~~~~~~~~~~~~~~~ - Fixed regression in :meth:`MultiIndex.join` for extension array dtypes (:issue:`49277`) - Fixed regression in :meth:`Series.replace` raising ``RecursionError`` with numeric dtype and when specifying ``value=None`` (:issue:`45725`) - Fixed regression in arithmetic operations for :class:`DataFrame` with :class:`MultiIndex` columns with different dtypes (:issue:`49769`) - Fixed regression in :meth:`DataFrame.plot` preventing :class:`~matplotlib.colors.Colormap` instance from being passed using the ``colormap`` argument if Matplotlib 3.6+ is used (:issue:`49374`) - Fixed regression in :func:`date_range` returning an invalid set of periods for ``CustomBusinessDay`` frequency and ``start`` date with timezone (:issue:`49441`) - Fixed performance regression in groupby operations (:issue:`49676`) - Fixed regression in :class:`Timedelta` constructor returning object of wrong type when subclassing ``Timedelta`` (:issue:`49579`) Bug fixes ~~~~~~~~~ - Bug in the Copy-on-Write implementation losing track of views in certain chained indexing cases (:issue:`48996`) - Fixed memory leak in :meth:`.Styler.to_excel` (:issue:`49751`) Other ~~~~~ - Reverted ``color`` as an alias for ``c`` and ``size`` as an alias for ``s`` in function :meth:`DataFrame.plot.scatter` (:issue:`49732`) @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.39 2022/04/10 00:57:14 gutteridge Exp $ d21 3 a23 1 USE_LANGUAGES= c c++ @ 1.39 log @Fix build breakage from py-scipy now being Python >= 3.8 @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.38 2022/04/09 21:33:50 gutteridge Exp $ d3 1 a3 1 DISTNAME= pandas-1.3.5 a4 1 PKGREVISION= 1 d13 2 a14 6 DEPENDS+= ${PYPKGPREFIX}-bottleneck-[0-9]*:../../math/py-bottleneck DEPENDS+= ${PYPKGPREFIX}-dateutil>=2.7.3:../../time/py-dateutil DEPENDS+= ${PYPKGPREFIX}-matplotlib-[0-9]*:../../graphics/py-matplotlib DEPENDS+= ${PYPKGPREFIX}-numexpr-[0-9]*:../../math/py-numexpr DEPENDS+= ${PYPKGPREFIX}-pytz>=2017.3:../../time/py-pytz DEPENDS+= 
${PYPKGPREFIX}-scipy>=0.7:../../math/py-scipy d16 4 a19 5 DEPENDS+= ${PYPKGPREFIX}-tables>=2.2:../../math/py-tables BUILD_DEPENDS+= ${PYPKGPREFIX}-test-runner-[0-9]*:../../devel/py-test-runner TEST_DEPENDS+= ${PYPKGPREFIX}-hypothesis>=3.58:../../devel/py-hypothesis TEST_DEPENDS+= ${PYPKGPREFIX}-test>=5.0.1:../../devel/py-test TEST_DEPENDS+= ${PYPKGPREFIX}-test-xdist-[0-9]*:../../devel/py-test-xdist d23 1 a23 1 PYSETUPTESTTARGET= pytest d25 2 a26 1 PYTHON_VERSIONS_INCOMPATIBLE= 27 37 # py-scipy d29 1 a29 1 BUILDLINK_API_DEPENDS.py-numpy+= ${PYPKGPREFIX}-numpy>=1.16.5 @ 1.38 log @py-pandas: fix BUILDLINK_API_DEPENDS for py-numpy @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.37 2022/01/04 20:54:15 wiz Exp $ d31 1 a31 1 PYTHON_VERSIONS_INCOMPATIBLE= 27 @ 1.37 log @*: bump PKGREVISION for egg.mk users They now have a tool dependency on py-setuptools instead of a DEPENDS @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.36 2021/12/30 13:05:37 adam Exp $ d34 1 a34 1 BUILDLINK_API_DEPENDS.pynumpy+= ${PYPKGPREFIX}-numpy>=1.16.5 @ 1.36 log @Forget about Python 3.6 @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.35 2021/12/12 20:30:49 adam Exp $ d5 1 @ 1.35 log @py-pandas: updated to 1.3.5 What's new in 1.3.5 (December 12, 2021) --------------------------------------- Fixed regressions ~~~~~~~~~~~~~~~~~ - Fixed regression in :meth:`Series.equals` when comparing floats with dtype object to None (:issue:`44190`) - Fixed regression in :func:`merge_asof` raising error when array was supplied as join key (:issue:`42844`) - Fixed regression when resampling :class:`DataFrame` with :class:`DateTimeIndex` with empty groups and ``uint8``, ``uint16`` or ``uint32`` columns incorrectly raising ``RuntimeError`` (:issue:`43329`) - Fixed regression in creating a :class:`DataFrame` from a timezone-aware :class:`Timestamp` scalar near a Daylight Savings Time transition (:issue:`42505`) - Fixed performance regression in :func:`read_csv` (:issue:`44106`) - Fixed regression in :meth:`Series.duplicated` and 
:meth:`Series.drop_duplicates` when Series has :class:`Categorical` dtype with boolean categories (:issue:`44351`) - Fixed regression in :meth:`.GroupBy.sum` with ``timedelta64[ns]`` dtype containing ``NaT`` failing to treat that value as NA (:issue:`42659`) - Fixed regression in :meth:`.RollingGroupby.cov` and :meth:`.RollingGroupby.corr` when ``other`` had the same shape as each group would incorrectly return superfluous groups in the result (:issue:`42915`) @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.34 2021/11/21 16:31:26 ryoon Exp $ d30 1 a30 1 PYTHON_VERSIONS_INCOMPATIBLE= 27 36 @ 1.34 log @py-pandas: Update to 1.3.4 Changelog: What's new in 1.3.4 (October 17, 2021) These are the changes in pandas 1.3.4. See Release notes for a full changelog including other versions of pandas. ------------------------------------------------------------------------------- Fixed regressions * Fixed regression in DataFrame.convert_dtypes() incorrectly converts byte strings to strings (GH43183) * Fixed regression in GroupBy.agg() where it was failing silently with mixed data types along axis=1 and MultiIndex (GH43209) * Fixed regression in merge() with integer and NaN keys failing with outer merge (GH43550) * Fixed regression in DataFrame.corr() raising ValueError with method= "spearman" on 32-bit platforms (GH43588) * Fixed performance regression in MultiIndex.equals() (GH43549) * Fixed performance regression in GroupBy.first() and GroupBy.last() with StringDtype (GH41596) * Fixed regression in Series.cat.reorder_categories() failing to update the categories on the Series (GH43232) * Fixed regression in Series.cat.categories() setter failing to update the categories on the Series (GH43334) * Fixed regression in read_csv() raising UnicodeDecodeError exception when memory_map=True (GH43540) * Fixed regression in DataFrame.explode() raising AssertionError when column is any scalar which is not a string (GH43314) * Fixed regression in Series.aggregate() attempting to pass args and 
kwargs multiple times to the user supplied func in certain cases (GH43357) * Fixed regression when iterating over a DataFrame.groupby.rolling object causing the resulting DataFrames to have an incorrect index if the input groupings were not sorted (GH43386) * Fixed regression in DataFrame.groupby.rolling.cov() and DataFrame.groupby.rolling.corr() computing incorrect results if the input groupings were not sorted (GH43386) ------------------------------------------------------------------------------- Bug fixes * Fixed bug in pandas.DataFrame.groupby.rolling() and pandas.api.indexers.FixedForwardWindowIndexer leading to segfaults and window endpoints being mixed across groups (GH43267) * Fixed bug in GroupBy.mean() with datetimelike values including NaT values returning incorrect results (GH43132) * Fixed bug in Series.aggregate() not passing the first args to the user supplied func in certain cases (GH43357) * Fixed memory leaks in Series.rolling.quantile() and Series.rolling.median() (GH43339) ------------------------------------------------------------------------------- Other * The minimum version of Cython needed to compile pandas is now 0.29.24 ( GH43729) What's new in 1.3.3 (September 12, 2021) These are the changes in pandas 1.3.3. See Release notes for a full changelog including other versions of pandas. 
------------------------------------------------------------------------------- Fixed regressions * Fixed regression in DataFrame constructor failing to broadcast for defined Index and len one list of Timestamp (GH42810) * Fixed regression in GroupBy.agg() incorrectly raising in some cases (GH42390) * Fixed regression in GroupBy.apply() where nan values were dropped even with dropna=False (GH43205) * Fixed regression in GroupBy.quantile() which was failing with pandas.NA (GH42849) * Fixed regression in merge() where on columns with ExtensionDtype or bool data types were cast to object in right and outer merge (GH40073) * Fixed regression in RangeIndex.where() and RangeIndex.putmask() raising AssertionError when result did not represent a RangeIndex (GH43240) * Fixed regression in read_parquet() where the fastparquet engine would not work properly with fastparquet 0.7.0 (GH43075) * Fixed regression in DataFrame.loc.__setitem__() raising ValueError when setting array as cell value (GH43422) * Fixed regression in is_list_like() where objects with __iter__ set to None would be identified as iterable (GH43373) * Fixed regression in DataFrame.__getitem__() raising error for slice of DatetimeIndex when index is non monotonic (GH43223) * Fixed regression in Resampler.aggregate() when used after column selection would raise if func is a list of aggregation functions (GH42905) * Fixed regression in DataFrame.corr() where Kendall correlation would produce incorrect results for columns with repeated values (GH43401) * Fixed regression in DataFrame.groupby() where aggregation on columns with object types dropped results on those columns (GH42395, GH43108) * Fixed regression in Series.fillna() raising TypeError when filling float Series with list-like fill value having a dtype which couldn't cast losslessly (like float32 filled with float64) (GH43424) * Fixed regression in read_csv() raising AttributeError when the file handle is a tempfile.SpooledTemporaryFile object
(GH43439) * Fixed performance regression in core.window.ewm.ExponentialMovingWindow.mean() (GH42333) ------------------------------------------------------------------------------- Performance improvements * Performance improvement for DataFrame.__setitem__() when the key or value is not a DataFrame, or key is not list-like (GH43274) ------------------------------------------------------------------------------- Bug fixes * Fixed bug in DataFrameGroupBy.agg() and DataFrameGroupBy.transform() with engine="numba" where index data was not being correctly passed into func ( GH43133) What's new in 1.3.2 (August 15, 2021) These are the changes in pandas 1.3.2. See Release notes for a full changelog including other versions of pandas. ------------------------------------------------------------------------------- Fixed regressions * Performance regression in DataFrame.isin() and Series.isin() for nullable data types (GH42714) * Regression in updating values of Series using boolean index, created by using DataFrame.pop() (GH42530) * Regression in DataFrame.from_records() with empty records (GH42456) * Fixed regression in DataFrame.shift() where TypeError occurred when shifting DataFrame created by concatenation of slices and fills with values (GH42719) * Regression in DataFrame.agg() when the func argument returned lists and axis=1 (GH42727) * Regression in DataFrame.drop() does nothing if MultiIndex has duplicates and indexer is a tuple or list of tuples (GH42771) * Fixed regression where read_csv() raised a ValueError when parameters names and prefix were both set to None (GH42387) * Fixed regression in comparisons between Timestamp object and datetime64 objects outside the implementation bounds for nanosecond datetime64 ( GH42794) * Fixed regression in Styler.highlight_min() and Styler.highlight_max() where pandas.NA was not successfully ignored (GH42650) * Fixed regression in concat() where copy=False was not honored in axis=1 Series concatenation (GH42501) * 
Regression in Series.nlargest() and Series.nsmallest() with nullable integer or float dtype (GH42816) * Fixed regression in Series.quantile() with Int64Dtype (GH42626) * Fixed regression in Series.groupby() and DataFrame.groupby() where supplying the by argument with a Series named with a tuple would incorrectly raise (GH42731) ------------------------------------------------------------------------------- Bug fixes * Bug in read_excel() modifies the dtypes dictionary when reading a file with duplicate columns (GH42462) * 1D slices over extension types turn into N-dimensional slices over ExtensionArrays (GH42430) * Fixed bug in Series.rolling() and DataFrame.rolling() not calculating window bounds correctly for the first row when center=True and window is an offset that covers all the rows (GH42753) * Styler.hide_columns() now hides the index name header row as well as column headers (GH42101) * Styler.set_sticky() has amended CSS to control the column/index names and ensure the correct sticky positions (GH42537) * Bug in de-serializing datetime indexes in PYTHONOPTIMIZED mode (GH42866) What's new in 1.3.1 (July 25, 2021) These are the changes in pandas 1.3.1. See Release notes for a full changelog including other versions of pandas. 
------------------------------------------------------------------------------- Fixed regressions * Pandas could not be built on PyPy (GH42355) * DataFrame constructed with an older version of pandas could not be unpickled (GH42345) * Performance regression in constructing a DataFrame from a dictionary of dictionaries (GH42248) * Fixed regression in DataFrame.agg() dropping values when the DataFrame had an Extension Array dtype, a duplicate index, and axis=1 (GH42380) * Fixed regression in DataFrame.astype() changing the order of noncontiguous data (GH42396) * Performance regression in DataFrame in reduction operations requiring casting such as DataFrame.mean() on integer data (GH38592) * Performance regression in DataFrame.to_dict() and Series.to_dict() when orient argument one of 'records', 'dict', or 'split' (GH42352) * Fixed regression in indexing with a list subclass incorrectly raising TypeError (GH42433, GH42461) * Fixed regression in DataFrame.isin() and Series.isin() raising TypeError with nullable data containing at least one missing value (GH42405) * Regression in concat() between objects with bool dtype and integer dtype casting to object instead of to integer (GH42092) * Bug in Series constructor not accepting a dask.Array (GH38645) * Fixed regression for SettingWithCopyWarning displaying incorrect stacklevel (GH42570) * Fixed regression for merge_asof() raising KeyError when one of the by columns is in the index (GH34488) * Fixed regression in to_datetime() returning pd.NaT for inputs that produce duplicated values, when cache=True (GH42259) * Fixed regression in SeriesGroupBy.value_counts() that resulted in an IndexError when called on a Series with one row (GH42618) ------------------------------------------------------------------------------- Bug fixes * Fixed bug in DataFrame.transpose() dropping values when the DataFrame had an Extension Array dtype and a duplicate index (GH42380) * Fixed bug in DataFrame.to_xml() raising KeyError when called 
with index= False and an offset index (GH42458) * Fixed bug in Styler.set_sticky() not handling index names correctly for single index columns case (GH42537) * Fixed bug in DataFrame.copy() failing to consolidate blocks in the result ( GH42579) What's new in 1.3.0 (July 2, 2021) These are the changes in pandas 1.3.0. See Release notes for a full changelog including other versions of pandas. Warning When reading new Excel 2007+ (.xlsx) files, the default argument engine=None to read_excel() will now result in using the openpyxl engine in all cases when the option io.excel.xlsx.reader is set to "auto". Previously, some cases would use the xlrd engine instead. See What's new 1.2.0 for background on this change. ------------------------------------------------------------------------------- Enhancements ------------------------------------------------------------------------------- Custom HTTP(s) headers when reading csv or json files When reading from a remote URL that is not handled by fsspec (e.g. HTTP and HTTPS) the dictionary passed to storage_options will be used to create the headers included in the request. This can be used to control the User-Agent header or send other custom headers (GH36688). For example: In [1]: headers = {"User-Agent": "pandas"} In [2]: df = pd.read_csv( ...: "https://download.bls.gov/pub/time.series/cu/cu.item", ...: sep="\t", ...: storage_options=headers ...: ) ...: ------------------------------------------------------------------------------- Read and write XML documents We added I/O support to read and render shallow versions of XML documents with read_xml() and DataFrame.to_xml(). Using lxml as parser, both XPath 1.0 and XSLT 1.0 are available. 
(GH27554)

In [1]: xml = """<?xml version='1.0' encoding='utf-8'?>
   ...: <data>
   ...:  <row>
   ...:     <shape>square</shape>
   ...:     <degrees>360</degrees>
   ...:     <sides>4.0</sides>
   ...:  </row>
   ...:  <row>
   ...:     <shape>circle</shape>
   ...:     <degrees>360</degrees>
   ...:     <sides/>
   ...:  </row>
   ...:  <row>
   ...:     <shape>triangle</shape>
   ...:     <degrees>180</degrees>
   ...:     <sides>3.0</sides>
   ...:  </row>
   ...: </data>"""

In [2]: df = pd.read_xml(xml)

In [3]: df
Out[3]:
      shape  degrees  sides
0    square      360    4.0
1    circle      360    NaN
2  triangle      180    3.0

In [4]: df.to_xml()
Out[4]:
<?xml version='1.0' encoding='utf-8'?>
<data>
  <row>
    <index>0</index>
    <shape>square</shape>
    <degrees>360</degrees>
    <sides>4.0</sides>
  </row>
  <row>
    <index>1</index>
    <shape>circle</shape>
    <degrees>360</degrees>
    <sides/>
  </row>
  <row>
    <index>2</index>
    <shape>triangle</shape>
    <degrees>180</degrees>
    <sides>3.0</sides>
  </row>
</data>

For more, see Writing XML in the user guide on IO tools. ------------------------------------------------------------------------------- Styler enhancements We provided some focused development on Styler. See also the Styler documentation which has been revised and improved (GH39720, GH39317, GH40493). + The method Styler.set_table_styles() can now accept more natural CSS language for arguments, such as 'color:red;' instead of [('color', 'red')] (GH39563) + The methods Styler.highlight_null(), Styler.highlight_min(), and Styler.highlight_max() now allow custom CSS highlighting instead of the default background coloring (GH40242) + Styler.apply() now accepts functions that return an ndarray when axis=None, making it now consistent with the axis=0 and axis=1 behavior (GH39359) + When incorrectly formatted CSS is given via Styler.apply() or Styler.applymap(), an error is now raised upon rendering (GH39660) + Styler.format() now accepts the keyword argument escape for optional HTML and LaTeX escaping (GH40388, GH41619) + Styler.background_gradient() has gained the argument gmap to supply a specific gradient map for shading (GH22727) + Styler.clear() now clears Styler.hidden_index and Styler.hidden_columns as well (GH40484) + Added the method Styler.highlight_between() (GH39821) + Added the method Styler.highlight_quantile() (GH40926) + Added the method Styler.text_gradient() (GH41098) + Added the method Styler.set_tooltips() to allow hover tooltips; this can be used to enhance interactive displays (GH21266, GH40284) + Added the parameter precision to the method Styler.format() to control the display of floating point
numbers (GH40134) + Styler rendered HTML output now follows the w3 HTML Style Guide ( GH39626) + Many features of the Styler class are now either partially or fully usable on a DataFrame with a non-unique indexes or columns (GH41143) + One has greater control of the display through separate sparsification of the index or columns using the new styler options, which are also usable via option_context() (GH41142) + Added the option styler.render.max_elements to avoid browser overload when styling large DataFrames (GH40712) + Added the method Styler.to_latex() (GH21673, GH42320), which also allows some limited CSS conversion (GH40731) + Added the method Styler.to_html() (GH13379) + Added the method Styler.set_sticky() to make index and column headers permanently visible in scrolling HTML frames (GH29072) ------------------------------------------------------------------------------- DataFrame constructor honors copy=False with dict When passing a dictionary to DataFrame with copy=False, a copy will no longer be made (GH32960). In [3]: arr = np.array([1, 2, 3]) In [4]: df = pd.DataFrame({"A": arr, "B": arr.copy()}, copy=False) In [5]: df Out[5]: A B 0 1 1 1 2 2 2 3 3 df["A"] remains a view on arr: In [6]: arr[0] = 0 In [7]: assert df.iloc[0, 0] == 0 The default behavior when not passing copy will remain unchanged, i.e. a copy will be made. ------------------------------------------------------------------------------- PyArrow backed string data type We've enhanced the StringDtype, an extension type dedicated to string data. ( GH39908) It is now possible to specify a storage keyword option to StringDtype. Use pandas options or specify the dtype using dtype='string[pyarrow]' to allow the StringArray to be backed by a PyArrow array instead of a NumPy array of Python objects. The PyArrow backed StringArray requires pyarrow 1.0.0 or greater to be installed. Warning string[pyarrow] is currently considered experimental. 
The implementation and parts of the API may change without warning. In [8]: pd.Series(['abc', None, 'def'], dtype=pd.StringDtype(storage="pyarrow")) Out[8]: 0 abc 1 2 def dtype: string You can use the alias "string[pyarrow]" as well. In [9]: s = pd.Series(['abc', None, 'def'], dtype="string[pyarrow]") In [10]: s Out[10]: 0 abc 1 2 def dtype: string You can also create a PyArrow backed string array using pandas options. In [11]: with pd.option_context("string_storage", "pyarrow"): ....: s = pd.Series(['abc', None, 'def'], dtype="string") ....: In [12]: s Out[12]: 0 abc 1 2 def dtype: string The usual string accessor methods work. Where appropriate, the return type of the Series or columns of a DataFrame will also have string dtype. In [13]: s.str.upper() Out[13]: 0 ABC 1 2 DEF dtype: string In [14]: s.str.split('b', expand=True).dtypes Out[14]: 0 string 1 string dtype: object String accessor methods returning integers will return a value with Int64Dtype In [15]: s.str.count("a") Out[15]: 0 1 1 2 0 dtype: Int64 ------------------------------------------------------------------------------- Centered datetime-like rolling windows When performing rolling calculations on DataFrame and Series objects with a datetime-like index, a centered datetime-like window can now be used (GH38780). For example: In [16]: df = pd.DataFrame( ....: {"A": [0, 1, 2, 3, 4]}, index=pd.date_range("2020", periods=5, freq="1D") ....: ) ....: In [17]: df Out[17]: A 2020-01-01 0 2020-01-02 1 2020-01-03 2 2020-01-04 3 2020-01-05 4 In [18]: df.rolling("2D", center=True).mean() Out[18]: A 2020-01-01 0.5 2020-01-02 1.5 2020-01-03 2.5 2020-01-04 3.5 2020-01-05 4.0 ------------------------------------------------------------------------------- Other enhancements * DataFrame.rolling(), Series.rolling(), DataFrame.expanding(), and Series.expanding() now support a method argument with a 'table' option that performs the windowing operation over an entire DataFrame. 
See Window Overview for performance and functional benefits (GH15095, GH38995) * ExponentialMovingWindow now supports an online method that can perform mean calculations in an online fashion. See Window Overview (GH41673) * Added MultiIndex.dtypes() (GH37062) * Added end and end_day options for the origin argument in DataFrame.resample() (GH37804) * Improved error message when usecols and names do not match for read_csv() and engine="c" (GH29042) * Improved consistency of error messages when passing an invalid win_type argument in Window methods (GH15969) * read_sql_query() now accepts a dtype argument to cast the columnar data from the SQL database based on user input (GH10285) * read_csv() now raises a ParserWarning if length of header or given names does not match length of data when usecols is not specified (GH21768) * Improved integer type mapping from pandas to SQLAlchemy when using DataFrame.to_sql() (GH35076) * to_numeric() now supports downcasting of nullable ExtensionDtype objects (GH33013) * Added support for dict-like names in MultiIndex.set_names and MultiIndex.rename (GH20421) * read_excel() can now auto-detect .xlsb files and older .xls files (GH35416, GH41225) * ExcelWriter now accepts an if_sheet_exists parameter to control the behavior of append mode when writing to existing sheets (GH40230) * Rolling.sum(), Expanding.sum(), Rolling.mean(), Expanding.mean(), ExponentialMovingWindow.mean(), Rolling.median(), Expanding.median(), Rolling.max(), Expanding.max(), Rolling.min(), and Expanding.min() now support Numba execution with the engine keyword (GH38895, GH41267) * DataFrame.apply() can now accept NumPy unary operators as strings, e.g. df.apply("sqrt"), which was already the case for Series.apply() (GH39116) * DataFrame.apply() can now accept non-callable DataFrame properties as strings, e.g.
df.apply("size"), which was already the case for Series.apply () (GH39116) * DataFrame.applymap() can now accept kwargs to pass on to the user-provided func (GH39987) * Passing a DataFrame indexer to iloc is now disallowed for Series.__getitem__() and DataFrame.__getitem__() (GH39004) * Series.apply() can now accept list-like or dictionary-like arguments that aren't lists or dictionaries, e.g. ser.apply(np.array(["sum", "mean"])), which was already the case for DataFrame.apply() (GH39140) * DataFrame.plot.scatter() can now accept a categorical column for the argument c (GH12380, GH31357) * Series.loc() now raises a helpful error message when the Series has a MultiIndex and the indexer has too many dimensions (GH35349) * read_stata() now supports reading data from compressed files (GH26599) * Added support for parsing ISO 8601-like timestamps with negative signs to Timedelta (GH37172) * Added support for unary operators in FloatingArray (GH38749) * RangeIndex can now be constructed by passing a range object directly e.g. 
pd.RangeIndex(range(3)) (GH12067) * Series.round() and DataFrame.round() now work with nullable integer and floating dtypes (GH38844) * read_csv() and read_json() expose the argument encoding_errors to control how encoding errors are handled (GH39450) * GroupBy.any() and GroupBy.all() use Kleene logic with nullable data types ( GH37506) * GroupBy.any() and GroupBy.all() return a BooleanDtype for columns with nullable data types (GH33449) * GroupBy.any() and GroupBy.all() raising with object data containing pd.NA even when skipna=True (GH37501) * GroupBy.rank() now supports object-dtype data (GH38278) * Constructing a DataFrame or Series with the data argument being a Python iterable that is not a NumPy ndarray consisting of NumPy scalars will now result in a dtype with a precision the maximum of the NumPy scalars; this was already the case when data is a NumPy ndarray (GH40908) * Add keyword sort to pivot_table() to allow non-sorting of the result ( GH39143) * Add keyword dropna to DataFrame.value_counts() to allow counting rows that include NA values (GH41325) * Series.replace() will now cast results to PeriodDtype where possible instead of object dtype (GH41526) * Improved error message in corr and cov methods on Rolling, Expanding, and ExponentialMovingWindow when other is not a DataFrame or Series (GH41741) * Series.between() can now accept left or right as arguments to inclusive to include only the left or right boundary (GH40245) * DataFrame.explode() now supports exploding multiple columns. Its column argument now also accepts a list of str or tuples for exploding on multiple columns at the same time (GH39240) * DataFrame.sample() now accepts the ignore_index argument to reset the index after sampling, similar to DataFrame.drop_duplicates() and DataFrame.sort_values() (GH38581) ------------------------------------------------------------------------------- Notable bug fixes These are bug fixes that might have notable behavior changes. 
-------------------------------------------------------------------------------
Categorical.unique now always maintains same dtype as original

Previously, when calling Categorical.unique() with categorical data, unused categories in the new array would be removed, making the dtype of the new array different from the original (GH18291).

As an example of this, given:

In [19]: dtype = pd.CategoricalDtype(['bad', 'neutral', 'good'], ordered=True)
In [20]: cat = pd.Categorical(['good', 'good', 'bad', 'bad'], dtype=dtype)
In [21]: original = pd.Series(cat)
In [22]: unique = original.unique()

Previous behavior:

In [1]: unique
['good', 'bad']
Categories (2, object): ['bad' < 'good']

In [2]: original.dtype == unique.dtype
False

New behavior:

In [23]: unique
Out[23]:
['good', 'bad']
Categories (3, object): ['bad' < 'neutral' < 'good']

In [24]: original.dtype == unique.dtype
Out[24]: True

-------------------------------------------------------------------------------
Preserve dtypes in DataFrame.combine_first()

DataFrame.combine_first() will now preserve dtypes (GH7509)

In [25]: df1 = pd.DataFrame({"A": [1, 2, 3], "B": [1, 2, 3]}, index=[0, 1, 2])

In [26]: df1
Out[26]:
   A  B
0  1  1
1  2  2
2  3  3

In [27]: df2 = pd.DataFrame({"B": [4, 5, 6], "C": [1, 2, 3]}, index=[2, 3, 4])

In [28]: df2
Out[28]:
   B  C
2  4  1
3  5  2
4  6  3

In [29]: combined = df1.combine_first(df2)

Previous behavior:

In [1]: combined.dtypes
Out[1]:
A    float64
B    float64
C    float64
dtype: object

New behavior:

In [30]: combined.dtypes
Out[30]:
A    float64
B      int64
C    float64
dtype: object

-------------------------------------------------------------------------------
Groupby methods agg and transform no longer change return dtype for callables

Previously the methods DataFrameGroupBy.aggregate(), SeriesGroupBy.aggregate(), DataFrameGroupBy.transform(), and SeriesGroupBy.transform() might cast the result dtype when the argument func is callable, possibly leading to undesirable results (GH21240).
The cast would occur if the result is numeric and casting back to the input dtype does not change any values as measured by np.allclose. Now no such casting occurs.

In [31]: df = pd.DataFrame({'key': [1, 1], 'a': [True, False], 'b': [True, True]})

In [32]: df
Out[32]:
   key      a     b
0    1   True  True
1    1  False  True

Previous behavior:

In [5]: df.groupby('key').agg(lambda x: x.sum())
Out[5]:
        a  b
key
1    True  2

New behavior:

In [33]: df.groupby('key').agg(lambda x: x.sum())
Out[33]:
     a  b
key
1    1  2

-------------------------------------------------------------------------------
float result for GroupBy.mean(), GroupBy.median(), and GroupBy.var()

Previously, these methods could result in different dtypes depending on the input values. Now, these methods will always return a float dtype. (GH41137)

In [34]: df = pd.DataFrame({'a': [True], 'b': [1], 'c': [1.0]})

Previous behavior:

In [5]: df.groupby(df.index).mean()
Out[5]:
      a  b    c
0  True  1  1.0

New behavior:

In [35]: df.groupby(df.index).mean()
Out[35]:
     a    b    c
0  1.0  1.0  1.0

-------------------------------------------------------------------------------
Try operating inplace when setting values with loc and iloc

When setting an entire column using loc or iloc, pandas will try to insert the values into the existing data rather than create an entirely new array.

In [36]: df = pd.DataFrame(range(3), columns=["A"], dtype="float64")
In [37]: values = df.values
In [38]: new = np.array([5, 6, 7], dtype="int64")
In [39]: df.loc[[0, 1, 2], "A"] = new

In both the new and old behavior, the data in values is overwritten, but in the old behavior the dtype of df["A"] changed to int64.
Previous behavior:

In [1]: df.dtypes
Out[1]:
A    int64
dtype: object

In [2]: np.shares_memory(df["A"].values, new)
Out[2]: False

In [3]: np.shares_memory(df["A"].values, values)
Out[3]: False

In pandas 1.3.0, df continues to share data with values.

New behavior:

In [40]: df.dtypes
Out[40]:
A    float64
dtype: object

In [41]: np.shares_memory(df["A"], new)
Out[41]: False

In [42]: np.shares_memory(df["A"], values)
Out[42]: True

-------------------------------------------------------------------------------
Never operate inplace when setting frame[keys] = values

When setting multiple columns using frame[keys] = values, new arrays will replace the pre-existing arrays for these keys; the pre-existing arrays are not overwritten (GH39510). As a result, the columns will retain the dtype(s) of values, never casting to the dtypes of the existing arrays.

In [43]: df = pd.DataFrame(range(3), columns=["A"], dtype="float64")
In [44]: df[["A"]] = 5

In the old behavior, 5 was cast to float64 and inserted into the existing array backing df:

Previous behavior:

In [1]: df.dtypes
Out[1]:
A    float64

In the new behavior, we get a new array, and retain an integer-dtyped 5:

New behavior:

In [45]: df.dtypes
Out[45]:
A    int64
dtype: object

-------------------------------------------------------------------------------
Consistent casting with setting into Boolean Series

Setting non-boolean values into a Series with dtype=bool now consistently casts to dtype=object (GH38709)

In [46]: orig = pd.Series([True, False])
In [47]: ser = orig.copy()
In [48]: ser.iloc[1] = np.nan
In [49]: ser2 = orig.copy()
In [50]: ser2.iloc[1] = 2.0

Previous behavior:

In [1]: ser
Out[1]:
0    1.0
1    NaN
dtype: float64

In [2]: ser2
Out[2]:
0    True
1     2.0
dtype: object

New behavior:

In [51]: ser
Out[51]:
0    True
1     NaN
dtype: object

In [52]: ser2
Out[52]:
0    True
1     2.0
dtype: object

-------------------------------------------------------------------------------
GroupBy.rolling no longer returns grouped-by column in values

The group-by column will now
be dropped from the result of a groupby.rolling operation (GH32262)

In [53]: df = pd.DataFrame({"A": [1, 1, 2, 3], "B": [0, 1, 2, 3]})

In [54]: df
Out[54]:
   A  B
0  1  0
1  1  1
2  2  2
3  3  3

Previous behavior:

In [1]: df.groupby("A").rolling(2).sum()
Out[1]:
       A    B
A
1 0  NaN  NaN
  1  2.0  1.0
2 2  NaN  NaN
3 3  NaN  NaN

New behavior:

In [55]: df.groupby("A").rolling(2).sum()
Out[55]:
       B
A
1 0  NaN
  1  1.0
2 2  NaN
3 3  NaN

-------------------------------------------------------------------------------
Removed artificial truncation in rolling variance and standard deviation

Rolling.std() and Rolling.var() will no longer artificially truncate results that are less than ~1e-8 and ~1e-15 respectively to zero (GH37051, GH40448, GH39872).

However, floating point artifacts may now exist in the results when rolling over larger values.

In [56]: s = pd.Series([7, 5, 5, 5])

In [57]: s.rolling(3).var()
Out[57]:
0             NaN
1             NaN
2    1.333333e+00
3    4.440892e-16
dtype: float64

-------------------------------------------------------------------------------
GroupBy.rolling with MultiIndex no longer drops levels in the result

GroupBy.rolling() will no longer drop levels of a DataFrame with a MultiIndex in the result. This can lead to a perceived duplication of levels in the resulting MultiIndex, but this change restores the behavior that was present in version 1.1.3 (GH38787, GH38523).
In [58]: index = pd.MultiIndex.from_tuples([('idx1', 'idx2')], names=['label1', 'label2'])
In [59]: df = pd.DataFrame({'a': [1], 'b': [2]}, index=index)

In [60]: df
Out[60]:
               a  b
label1 label2
idx1   idx2    1  2

Previous behavior:

In [1]: df.groupby('label1').rolling(1).sum()
Out[1]:
          a    b
label1
idx1    1.0  2.0

New behavior:

In [61]: df.groupby('label1').rolling(1).sum()
Out[61]:
                        a    b
label1 label1 label2
idx1   idx1   idx2    1.0  2.0

-------------------------------------------------------------------------------
Backwards incompatible API changes

-------------------------------------------------------------------------------
Increased minimum versions for dependencies

Some minimum supported versions of dependencies were updated. If installed, we now require:

Package          Minimum Version  Required  Changed
numpy            1.17.3           X         X
pytz             2017.3           X
python-dateutil  2.7.3            X
bottleneck       1.2.1
numexpr          2.7.0                      X
pytest (dev)     6.0                        X
mypy (dev)       0.812                      X
setuptools       38.6.0                     X

For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.

Package         Minimum Version  Changed
beautifulsoup4  4.6.0
fastparquet     0.4.0            X
fsspec          0.7.4
gcsfs           0.6.0
lxml            4.3.0
matplotlib      2.2.3
numba           0.46.0
openpyxl        3.0.0            X
pyarrow         0.17.0           X
pymysql         0.8.1            X
pytables        3.5.1
s3fs            0.4.0
scipy           1.2.0
sqlalchemy      1.3.0            X
tabulate        0.8.7            X
xarray          0.12.0
xlrd            1.2.0
xlsxwriter      1.0.2
xlwt            1.3.0
pandas-gbq      0.12.0

See Dependencies and Optional dependencies for more.

-------------------------------------------------------------------------------
Other API changes

* Partially initialized CategoricalDtype objects (i.e.
  those with categories=None) will no longer compare as equal to fully initialized dtype objects (GH38516)
* Accessing _constructor_expanddim on a DataFrame and _constructor_sliced on a Series now raise an AttributeError. Previously a NotImplementedError was raised (GH38782)
* Added new engine and **engine_kwargs parameters to DataFrame.to_sql() to support other future 'SQL engines'. Currently only SQLAlchemy is used under the hood, but support for more engines, such as turbodbc, is planned (GH36893)
* Removed redundant freq from PeriodIndex string representation (GH41653)
* ExtensionDtype.construct_array_type() is now a required method instead of an optional one for ExtensionDtype subclasses (GH24860)
* Calling hash on non-hashable pandas objects will now raise TypeError with the built-in error message (e.g. unhashable type: 'Series'). Previously it would raise a custom message such as 'Series' objects are mutable, thus they cannot be hashed. Furthermore, isinstance(<Series>, abc.collections.Hashable) will now return False (GH40013)
* Styler.from_custom_template() now has two new arguments for template names, and the old name was removed, due to template inheritance having been introduced for better parsing (GH42053). Subclassing modifications to Styler attributes are also needed.

-------------------------------------------------------------------------------
Build

* Documentation in .pptx and .pdf formats is no longer included in wheels or source distributions. (GH30741)

-------------------------------------------------------------------------------
Deprecations

-------------------------------------------------------------------------------
Deprecated dropping nuisance columns in DataFrame reductions and DataFrameGroupBy operations

When calling a reduction (e.g. .min, .max, .sum) on a DataFrame with numeric_only=None (the default), columns where the reduction raises a TypeError are silently ignored and dropped from the result.

This behavior is deprecated.
In a future version, the TypeError will be raised, and users will need to select only valid columns before calling the function.

For example:

In [62]: df = pd.DataFrame({"A": [1, 2, 3, 4], "B": pd.date_range("2016-01-01", periods=4)})

In [63]: df
Out[63]:
   A          B
0  1 2016-01-01
1  2 2016-01-02
2  3 2016-01-03
3  4 2016-01-04

Old behavior:

In [3]: df.prod()
Out[3]:
A    24
dtype: int64

Future behavior:

In [4]: df.prod()
...
TypeError: 'DatetimeArray' does not implement reduction 'prod'

In [5]: df[["A"]].prod()
Out[5]:
A    24
dtype: int64

Similarly, when applying a function to DataFrameGroupBy, columns on which the function raises TypeError are currently silently ignored and dropped from the result.

This behavior is deprecated. In a future version, the TypeError will be raised, and users will need to select only valid columns before calling the function.

For example:

In [64]: df = pd.DataFrame({"A": [1, 2, 3, 4], "B": pd.date_range("2016-01-01", periods=4)})
In [65]: gb = df.groupby([1, 1, 2, 2])

Old behavior:

In [4]: gb.prod(numeric_only=False)
Out[4]:
    A
1   2
2  12

Future behavior:

In [5]: gb.prod(numeric_only=False)
...
TypeError: datetime64 type does not support prod operations

In [6]: gb[["A"]].prod(numeric_only=False)
Out[6]:
    A
1   2
2  12

-------------------------------------------------------------------------------
Other Deprecations

* Deprecated allowing scalars to be passed to the Categorical constructor (GH38433)
* Deprecated constructing CategoricalIndex without passing list-like data (GH38944)
* Deprecated allowing subclass-specific keyword arguments in the Index constructor, use the specific subclass directly instead (GH14093, GH21311, GH22315, GH26974)
* Deprecated the astype() method of datetimelike (timedelta64[ns], datetime64[ns], DatetimeTZDtype, PeriodDtype) to convert to integer dtypes, use values.view(...)
  instead (GH38544)
* Deprecated MultiIndex.is_lexsorted() and MultiIndex.lexsort_depth(), use MultiIndex.is_monotonic_increasing() instead (GH32259)
* Deprecated keyword try_cast in Series.where(), Series.mask(), DataFrame.where(), DataFrame.mask(); cast results manually if desired (GH38836)
* Deprecated comparison of Timestamp objects with datetime.date objects. Instead of e.g. ts <= mydate use ts <= pd.Timestamp(mydate) or ts.date() <= mydate (GH36131)
* Deprecated Rolling.win_type returning "freq" (GH38963)
* Deprecated Rolling.is_datetimelike (GH38963)
* Deprecated DataFrame indexer for Series.__setitem__() and DataFrame.__setitem__() (GH39004)
* Deprecated ExponentialMovingWindow.vol() (GH39220)
* Using .astype to convert between datetime64[ns] dtype and DatetimeTZDtype is deprecated and will raise in a future version, use obj.tz_localize or obj.dt.tz_localize instead (GH38622)
* Deprecated casting datetime.date objects to datetime64 when used as fill_value in DataFrame.unstack(), DataFrame.shift(), Series.shift(), and DataFrame.reindex(), pass pd.Timestamp(dateobj) instead (GH39767)
* Deprecated Styler.set_na_rep() and Styler.set_precision() in favor of Styler.format() with na_rep and precision as existing and new input arguments respectively (GH40134, GH40425)
* Deprecated Styler.where() in favor of using an alternative formulation with Styler.applymap() (GH40821)
* Deprecated allowing partial failure in Series.transform() and DataFrame.transform() when func is list-like or dict-like and raises anything but TypeError; func raising anything but a TypeError will raise in a future version (GH40211)
* Deprecated arguments error_bad_lines and warn_bad_lines in read_csv() and read_table() in favor of argument on_bad_lines (GH15122)
* Deprecated support for np.ma.mrecords.MaskedRecords in the DataFrame constructor, pass {name: data[name] for name in data.dtype.names} instead (GH40363)
* Deprecated using merge(), DataFrame.merge(), and DataFrame.join() on a
  different number of levels (GH34862)
* Deprecated the use of **kwargs in ExcelWriter; use the keyword argument engine_kwargs instead (GH40430)
* Deprecated the level keyword for DataFrame and Series aggregations; use groupby instead (GH39983)
* Deprecated the inplace parameter of Categorical.remove_categories(), Categorical.add_categories(), Categorical.reorder_categories(), Categorical.rename_categories(), and Categorical.set_categories(); it will be removed in a future version (GH37643)
* Deprecated merge() producing duplicated columns through the suffixes keyword and already existing columns (GH22818)
* Deprecated setting Categorical._codes, create a new Categorical with the desired codes instead (GH40606)
* Deprecated the convert_float optional argument in read_excel() and ExcelFile.parse() (GH41127)
* Deprecated behavior of DatetimeIndex.union() with mixed timezones; in a future version both will be cast to UTC instead of object dtype (GH39328)
* Deprecated using usecols with out of bounds indices for read_csv() with engine="c" (GH25623)
* Deprecated special treatment of lists with first element a Categorical in the DataFrame constructor; pass as pd.DataFrame({col: categorical, ...}) instead (GH38845)
* Deprecated behavior of DataFrame constructor when a dtype is passed and the data cannot be cast to that dtype. In a future version, this will raise instead of being silently ignored (GH24435)
* Deprecated the Timestamp.freq attribute. For the properties that use it (is_month_start, is_month_end, is_quarter_start, is_quarter_end, is_year_start, is_year_end), when you have a freq, use e.g. freq.is_month_start(ts) (GH15146)
* Deprecated construction of Series or DataFrame with DatetimeTZDtype data and datetime64[ns] dtype.
  Use Series(data).dt.tz_localize(None) instead (GH41555, GH33401)
* Deprecated behavior of Series construction with large-integer values and small-integer dtype silently overflowing; use Series(data).astype(dtype) instead (GH41734)
* Deprecated behavior of DataFrame construction with floating data and integer dtype casting even when lossy; in a future version this will remain floating, matching Series behavior (GH41770)
* Deprecated inference of timedelta64[ns], datetime64[ns], or DatetimeTZDtype dtypes in Series construction when data containing strings is passed and no dtype is passed (GH33558)
* In a future version, constructing Series or DataFrame with datetime64[ns] data and DatetimeTZDtype will treat the data as wall-times instead of as UTC times (matching DatetimeIndex behavior). To treat the data as UTC times, use pd.Series(data).dt.tz_localize("UTC").dt.tz_convert(dtype.tz) or pd.Series(data.view("int64"), dtype=dtype) (GH33401)
* Deprecated passing lists as key to DataFrame.xs() and Series.xs() (GH41760)
* Deprecated boolean arguments of inclusive in Series.between() to have {"left", "right", "neither", "both"} as standard argument values (GH40628)
* Deprecated passing arguments as positional for all of the following, with exceptions noted (GH41485):
    + concat() (other than objs)
    + read_csv() (other than filepath_or_buffer)
    + read_table() (other than filepath_or_buffer)
    + DataFrame.clip() and Series.clip() (other than upper and lower)
    + DataFrame.drop_duplicates() (except for subset), Series.drop_duplicates(), Index.drop_duplicates() and MultiIndex.drop_duplicates()
    + DataFrame.drop() (other than labels) and Series.drop()
    + DataFrame.dropna() and Series.dropna()
    + DataFrame.ffill(), Series.ffill(), DataFrame.bfill(), and Series.bfill()
    + DataFrame.fillna() and Series.fillna() (apart from value)
    + DataFrame.interpolate() and Series.interpolate() (other than method)
    + DataFrame.mask() and Series.mask() (other than cond and other)
    + DataFrame.reset_index()
      (other than level) and Series.reset_index()
    + DataFrame.set_axis() and Series.set_axis() (other than labels)
    + DataFrame.set_index() (other than keys)
    + DataFrame.sort_index() and Series.sort_index()
    + DataFrame.sort_values() (other than by) and Series.sort_values()
    + DataFrame.where() and Series.where() (other than cond and other)
    + Index.set_names() and MultiIndex.set_names() (except for names)
    + MultiIndex.set_codes() (except for codes)
    + MultiIndex.set_levels() (except for levels)
    + Resampler.interpolate() (other than method)

-------------------------------------------------------------------------------
Performance improvements

* Performance improvement in IntervalIndex.isin() (GH38353)
* Performance improvement in Series.mean() for nullable data types (GH34814)
* Performance improvement in Series.isin() for nullable data types (GH38340)
* Performance improvement in DataFrame.fillna() with method="pad" or method="backfill" for nullable floating and nullable integer dtypes (GH39953)
* Performance improvement in DataFrame.corr() for method=kendall (GH28329)
* Performance improvement in DataFrame.corr() for method=spearman (GH40956, GH41885)
* Performance improvement in Rolling.corr() and Rolling.cov() (GH39388)
* Performance improvement in RollingGroupby.corr(), ExpandingGroupby.corr(), and ExpandingGroupby.cov() (GH39591)
* Performance improvement in unique() for object data type (GH37615)
* Performance improvement in json_normalize() for basic cases (including separators) (GH40035, GH15621)
* Performance improvement in ExpandingGroupby aggregation methods (GH39664)
* Performance improvement in Styler where render times are reduced by more than 50% and now match DataFrame.to_html() (GH39972, GH39952, GH40425)
* The method Styler.set_td_classes() is now as performant as Styler.apply() and Styler.applymap(), and even more so in some cases (GH40453)
* Performance improvement in ExponentialMovingWindow.mean() with times (GH39784)
* Performance
  improvement in GroupBy.apply() when requiring the Python fallback implementation (GH40176)
* Performance improvement in the conversion of a PyArrow Boolean array to a pandas nullable Boolean array (GH41051)
* Performance improvement for concatenation of data with type CategoricalDtype (GH40193)
* Performance improvement in GroupBy.cummin() and GroupBy.cummax() with nullable data types (GH37493)
* Performance improvement in Series.nunique() with nan values (GH40865)
* Performance improvement in DataFrame.transpose(), Series.unstack() with DatetimeTZDtype (GH40149)
* Performance improvement in Series.plot() and DataFrame.plot() with entry point lazy loading (GH41492)

-------------------------------------------------------------------------------
Bug fixes

-------------------------------------------------------------------------------
Categorical

* Bug in CategoricalIndex incorrectly failing to raise TypeError when scalar data is passed (GH38614)
* Bug in CategoricalIndex.reindex failing when the Index passed was not categorical but its values were all labels in the category (GH28690)
* Bug where constructing a Categorical from an object-dtype array of date objects did not round-trip correctly with astype (GH38552)
* Bug in constructing a DataFrame from an ndarray and a CategoricalDtype (GH38857)
* Bug in setting categorical values into an object-dtype column in a DataFrame (GH39136)
* Bug in DataFrame.reindex() raising an IndexError when the new index contained duplicates and the old index was a CategoricalIndex (GH38906)
* Bug in Categorical.fillna() with a tuple-like category raising NotImplementedError instead of ValueError when filling with a non-category tuple (GH41914)

-------------------------------------------------------------------------------
Datetimelike

* Bug in DataFrame and Series constructors sometimes dropping nanoseconds from Timestamp (resp. Timedelta) data, with dtype=datetime64[ns] (resp.
  timedelta64[ns]) (GH38032)
* Bug in DataFrame.first() and Series.first() with an offset of one month returning an incorrect result when the first day is the last day of a month (GH29623)
* Bug in constructing a DataFrame or Series with mismatched datetime64 data and timedelta64 dtype, or vice-versa, failing to raise a TypeError (GH38575, GH38764, GH38792)
* Bug in constructing a Series or DataFrame with a datetime object out of bounds for datetime64[ns] dtype or a timedelta object out of bounds for timedelta64[ns] dtype (GH38792, GH38965)
* Bug in DatetimeIndex.intersection(), DatetimeIndex.symmetric_difference(), PeriodIndex.intersection(), PeriodIndex.symmetric_difference() always returning object-dtype when operating with CategoricalIndex (GH38741)
* Bug in DatetimeIndex.intersection() giving incorrect results with non-Tick frequencies with n != 1 (GH42104)
* Bug in Series.where() incorrectly casting datetime64 values to int64 (GH37682)
* Bug in Categorical incorrectly typecasting datetime object to Timestamp (GH38878)
* Bug in comparisons between Timestamp object and datetime64 objects just outside the implementation bounds for nanosecond datetime64 (GH39221)
* Bug in Timestamp.round(), Timestamp.floor(), Timestamp.ceil() for values near the implementation bounds of Timestamp (GH39244)
* Bug in Timedelta.round(), Timedelta.floor(), Timedelta.ceil() for values near the implementation bounds of Timedelta (GH38964)
* Bug in date_range() incorrectly creating DatetimeIndex containing NaT instead of raising OutOfBoundsDatetime in corner cases (GH24124)
* Bug in infer_freq() incorrectly failing to infer 'H' frequency of DatetimeIndex if the latter has a timezone and crosses DST boundaries (GH39556)
* Bug in Series backed by DatetimeArray or TimedeltaArray sometimes failing to set the array's freq to None (GH41425)

-------------------------------------------------------------------------------
Timedelta

* Bug in constructing Timedelta from np.timedelta64 objects with
  non-nanosecond units that are out of bounds for timedelta64[ns] (GH38965)
* Bug in constructing a TimedeltaIndex incorrectly accepting np.datetime64("NaT") objects (GH39462)
* Bug in constructing Timedelta from an input string with only symbols and no digits failing to raise an error (GH39710)
* Bug in TimedeltaIndex and to_timedelta() failing to raise when passed non-nanosecond timedelta64 arrays that overflow when converting to timedelta64[ns] (GH40008)

-------------------------------------------------------------------------------
Timezones

* Bug in different tzinfo objects representing UTC not being treated as equivalent (GH39216)
* Bug in dateutil.tz.gettz("UTC") not being recognized as equivalent to other UTC-representing tzinfos (GH39276)

-------------------------------------------------------------------------------
Numeric

* Bug in DataFrame.quantile() and DataFrame.sort_values() causing incorrect subsequent indexing behavior (GH38351)
* Bug in DataFrame.sort_values() raising an IndexError for empty by (GH40258)
* Bug in DataFrame.select_dtypes() with include=np.number dropping numeric ExtensionDtype columns (GH35340)
* Bug in DataFrame.mode() and Series.mode() not keeping consistent integer Index for empty input (GH33321)
* Bug in DataFrame.rank() when the DataFrame contained np.inf (GH32593)
* Bug in DataFrame.rank() with axis=0 and columns holding incomparable types raising an IndexError (GH38932)
* Bug in Series.rank(), DataFrame.rank(), and GroupBy.rank() treating the most negative int64 value as missing (GH32859)
* Bug in DataFrame.select_dtypes() behaving differently on Windows and Linux with include="int" (GH36596)
* Bug in DataFrame.apply() and DataFrame.agg() when passed the argument func="size" operating on the entire DataFrame instead of rows or columns (GH39934)
* Bug in DataFrame.transform() raising a SpecificationError when passed a dictionary and columns were missing; it will now raise a KeyError instead (GH40004)
* Bug in
  GroupBy.rank() giving incorrect results with pct=True and equal values between consecutive groups (GH40518)
* Bug in Series.count() returning an int32 result on 32-bit platforms when argument level=None (GH40908)
* Bug in Series and DataFrame reductions with methods any and all not returning Boolean results for object data (GH12863, GH35450, GH27709)
* Bug in Series.clip() failing if the Series contains NA values and has nullable int or float as a data type (GH40851)
* Bug in UInt64Index.where() and UInt64Index.putmask() with an np.int64 dtype other incorrectly raising TypeError (GH41974)
* Bug in DataFrame.agg() not sorting the aggregated axis in the order of the provided aggregation functions when one or more aggregation functions fail to produce results (GH33634)
* Bug in DataFrame.clip() not interpreting missing values as no threshold (GH40420)

-------------------------------------------------------------------------------
Conversion

* Series.to_dict() with orient='records' now returns Python native types (GH25969)
* Bug in Series.view() and Index.view() when converting between datetime-like (datetime64[ns], datetime64[ns, tz], timedelta64, period) dtypes (GH39788)
* Bug in creating a DataFrame from an empty np.recarray not retaining the original dtypes (GH40121)
* Bug in DataFrame failing to raise a TypeError when constructing from a frozenset (GH40163)
* Bug in Index construction silently ignoring a passed dtype when the data cannot be cast to that dtype (GH21311)
* Bug in StringArray.astype() falling back to NumPy and raising when converting to dtype='categorical' (GH40450)
* Bug in factorize() where, when given an array with a numeric NumPy dtype lower than int64, uint64 and float64, the unique values did not keep their original dtype (GH41132)
* Bug in DataFrame construction with a dictionary containing an array-like with ExtensionDtype and copy=True failing to make a copy (GH38939)
* Bug in qcut() raising an error when taking Float64Dtype as
  input (GH40730)
* Bug in DataFrame and Series construction with datetime64[ns] data and dtype=object resulting in datetime objects instead of Timestamp objects (GH41599)
* Bug in DataFrame and Series construction with timedelta64[ns] data and dtype=object resulting in np.timedelta64 objects instead of Timedelta objects (GH41599)
* Bug in DataFrame construction when given a two-dimensional object-dtype np.ndarray of Period or Interval objects failing to cast to PeriodDtype or IntervalDtype, respectively (GH41812)
* Bug in constructing a Series from a list and a PandasDtype (GH39357)
* Bug in creating a Series from a range object that does not fit in the bounds of int64 dtype (GH30173)
* Bug in creating a Series from a dict with all-tuple keys and an Index that requires reindexing (GH41707)
* Bug in infer_dtype() not recognizing Series, Index, or array with a Period dtype (GH23553)
* Bug in infer_dtype() raising an error for general ExtensionArray objects. It will now return "unknown-array" instead of raising (GH37367)
* Bug in DataFrame.convert_dtypes() incorrectly raising a ValueError when called on an empty DataFrame (GH40393)

-------------------------------------------------------------------------------
Strings

* Bug in the conversion from pyarrow.ChunkedArray to StringArray when the original had zero chunks (GH41040)
* Bug in Series.replace() and DataFrame.replace() ignoring replacements with regex=True for StringDtype data (GH41333, GH35977)
* Bug in Series.str.extract() with StringArray returning object dtype for an empty DataFrame (GH41441)
* Bug in Series.str.replace() where the case argument was ignored when regex=False (GH41602)

-------------------------------------------------------------------------------
Interval

* Bug in IntervalIndex.intersection() and IntervalIndex.symmetric_difference() always returning object-dtype when operating with CategoricalIndex (GH38653, GH38741)
* Bug in IntervalIndex.intersection() returning duplicates when at least
  one of the Index objects has duplicates which are present in the other (GH38743)
* IntervalIndex.union(), IntervalIndex.intersection(), IntervalIndex.difference(), and IntervalIndex.symmetric_difference() now cast to the appropriate dtype instead of raising a TypeError when operating with another IntervalIndex with incompatible dtype (GH39267)
* PeriodIndex.union(), PeriodIndex.intersection(), PeriodIndex.symmetric_difference(), PeriodIndex.difference() now cast to object dtype instead of raising IncompatibleFrequency when operating with another PeriodIndex with incompatible dtype (GH39306)
* Bug in IntervalIndex.is_monotonic(), IntervalIndex.get_loc(), IntervalIndex.get_indexer_for(), and IntervalIndex.__contains__() when NA values are present (GH41831)

-------------------------------------------------------------------------------
Indexing

* Bug in Index.union() and MultiIndex.union() dropping duplicate Index values when Index was not monotonic or sort was set to False (GH36289, GH31326, GH40862)
* Bug in CategoricalIndex.get_indexer() failing to raise InvalidIndexError when non-unique (GH38372)
* Bug in IntervalIndex.get_indexer() when target has CategoricalDtype and both the index and the target contain NA values (GH41934)
* Bug in Series.loc() raising a ValueError when input was filtered with a Boolean list and values to set were a list with lower dimension (GH20438)
* Bug in inserting many new columns into a DataFrame causing incorrect subsequent indexing behavior (GH38380)
* Bug in DataFrame.__setitem__() raising a ValueError when setting multiple values to duplicate columns (GH15695)
* Bug in DataFrame.loc(), Series.loc(), DataFrame.__getitem__() and Series.__getitem__() returning incorrect elements for non-monotonic DatetimeIndex for string slices (GH33146)
* Bug in DataFrame.reindex() and Series.reindex() with timezone aware indexes raising a TypeError for method="ffill" and method="bfill" and specified tolerance (GH38566)
* Bug in DataFrame.reindex()
with datetime64[ns] or timedelta64[ns] incorrectly casting to integers when the fill_value requires casting to object dtype (GH39755) * Bug in DataFrame.__setitem__() raising a ValueError when setting on an empty DataFrame using specified columns and a nonempty DataFrame value ( GH38831) * Bug in DataFrame.loc.__setitem__() raising a ValueError when operating on a unique column when the DataFrame has duplicate columns (GH38521) * Bug in DataFrame.iloc.__setitem__() and DataFrame.loc.__setitem__() with mixed dtypes when setting with a dictionary value (GH38335) * Bug in Series.loc.__setitem__() and DataFrame.loc.__setitem__() raising KeyError when provided a Boolean generator (GH39614) * Bug in Series.iloc() and DataFrame.iloc() raising a KeyError when provided a generator (GH39614) * Bug in DataFrame.__setitem__() not raising a ValueError when the right hand side is a DataFrame with wrong number of columns (GH38604) * Bug in Series.__setitem__() raising a ValueError when setting a Series with a scalar indexer (GH38303) * Bug in DataFrame.loc() dropping levels of a MultiIndex when the DataFrame used as input has only one row (GH10521) * Bug in DataFrame.__getitem__() and Series.__getitem__() always raising KeyError when slicing with existing strings where the Index has milliseconds (GH33589) * Bug in setting timedelta64 or datetime64 values into numeric Series failing to cast to object dtype (GH39086, GH39619) * Bug in setting Interval values into a Series or DataFrame with mismatched IntervalDtype incorrectly casting the new values to the existing dtype ( GH39120) * Bug in setting datetime64 values into a Series with integer-dtype incorrectly casting the datetime64 values to integers (GH39266) * Bug in setting np.datetime64("NaT") into a Series with Datetime64TZDtype incorrectly treating the timezone-naive value as timezone-aware (GH39769) * Bug in Index.get_loc() not raising KeyError when key=NaN and method is specified but NaN is not in the Index (GH39382) * Bug 
in DatetimeIndex.insert() when inserting np.datetime64("NaT") into a timezone-aware index incorrectly treating the timezone-naive value as timezone-aware (GH39769) * Bug in incorrectly raising in Index.insert(), when setting a new column that cannot be held in the existing frame.columns, or in Series.reset_index() or DataFrame.reset_index() instead of casting to a compatible dtype (GH39068) * Bug in RangeIndex.append() where a single object of length 1 was concatenated incorrectly (GH39401) * Bug in RangeIndex.astype() where, when converting to CategoricalIndex, the categories became an Int64Index instead of a RangeIndex (GH41263) * Bug in setting numpy.timedelta64 values into an object-dtype Series using a Boolean indexer (GH39488) * Bug in setting numeric values into a boolean-dtype Series using at or iat failing to cast to object-dtype (GH39582) * Bug in DataFrame.__setitem__() and DataFrame.iloc.__setitem__() raising ValueError when trying to index with a row-slice and setting a list as values (GH40440) * Bug in DataFrame.loc() not raising KeyError when the key was not found in MultiIndex and the levels were not fully specified (GH41170) * Bug in DataFrame.loc.__setitem__() when setting-with-expansion incorrectly raising when the index in the expanding axis contained duplicates (GH40096) * Bug in DataFrame.loc.__getitem__() with MultiIndex casting to float when at least one index column has float dtype and we retrieve a scalar (GH41369) * Bug in DataFrame.loc() incorrectly matching non-Boolean index elements (GH20432) * Bug in indexing with np.nan on a Series or DataFrame with a CategoricalIndex incorrectly raising KeyError when np.nan keys are present (GH41933) * Bug in Series.__delitem__() with ExtensionDtype incorrectly casting to ndarray (GH40386) * Bug in DataFrame.at() with a CategoricalIndex returning incorrect results when passed integer keys (GH41846) * Bug in DataFrame.loc() returning a MultiIndex in the wrong order if an indexer has
duplicates (GH40978) * Bug in DataFrame.__setitem__() raising a TypeError when using a str subclass as the column name with a DatetimeIndex (GH37366) * Bug in PeriodIndex.get_loc() failing to raise a KeyError when given a Period with a mismatched freq (GH41670) * Bug in .loc.__getitem__ with a UInt64Index and negative-integer keys raising OverflowError instead of KeyError in some cases, wrapping around to positive integers in others (GH41777) * Bug in Index.get_indexer() failing to raise ValueError in some cases with invalid method, limit, or tolerance arguments (GH41918) * Bug when slicing a Series or DataFrame with a TimedeltaIndex when passing an invalid string raising ValueError instead of a TypeError (GH41821) * Bug in Index constructor sometimes silently ignoring a specified dtype (GH38879) * Index.where() behavior now mirrors Index.putmask() behavior, i.e. index.where(mask, other) matches index.putmask(~mask, other) (GH39412) ------------------------------------------------------------------------------- Missing * Bug in Grouper not correctly propagating the dropna argument; DataFrameGroupBy.transform() now correctly handles missing values for dropna=True (GH35612) * Bug in isna(), Series.isna(), Index.isna(), DataFrame.isna(), and the corresponding notna functions not recognizing Decimal("NaN") objects (GH39409) * Bug in DataFrame.fillna() not accepting a dictionary for the downcast keyword (GH40809) * Bug in isna() not returning a copy of the mask for nullable types, causing any subsequent mask modification to change the original array (GH40935) * Bug in DataFrame construction with float data containing NaN and an integer dtype casting instead of retaining the NaN (GH26919) * Bug in Series.isin() and MultiIndex.isin() not treating all NaN values as equivalent if they were in tuples (GH41836) ------------------------------------------------------------------------------- MultiIndex * Bug in DataFrame.drop() raising a TypeError when the MultiIndex is non-unique
and level is not provided (GH36293) * Bug in MultiIndex.intersection() duplicating NaN in the result (GH38623) * Bug in MultiIndex.equals() incorrectly returning True when the MultiIndex contained NaN even when they are differently ordered (GH38439) * Bug in MultiIndex.intersection() always returning an empty result when intersecting with CategoricalIndex (GH38653) * Bug in MultiIndex.difference() incorrectly raising TypeError when indexes contain non-sortable entries (GH41915) * Bug in MultiIndex.reindex() raising a ValueError when used on an empty MultiIndex and indexing only a specific level (GH41170) * Bug in MultiIndex.reindex() raising TypeError when reindexing against a flat Index (GH41707) ------------------------------------------------------------------------------- I/O * Bug in Index.__repr__() when display.max_seq_items=1 (GH38415) * Bug in read_csv() not recognizing scientific notation if the argument decimal is set and engine="python" (GH31920) * Bug in read_csv() interpreting NA value as comment, when NA does contain the comment string fixed for engine="python" (GH34002) * Bug in read_csv() raising an IndexError with multiple header columns and index_col is specified when the file has no data rows (GH38292) * Bug in read_csv() not accepting usecols with a different length than names for engine="python" (GH16469) * Bug in read_csv() returning object dtype when delimiter="," with usecols and parse_dates specified for engine="python" (GH35873) * Bug in read_csv() raising a TypeError when names and parse_dates is specified for engine="c" (GH33699) * Bug in read_clipboard() and DataFrame.to_clipboard() not working in WSL ( GH38527) * Allow custom error values for the parse_dates argument of read_sql(), read_sql_query() and read_sql_table() (GH35185) * Bug in DataFrame.to_hdf() and Series.to_hdf() raising a KeyError when trying to apply for subclasses of DataFrame or Series (GH33748) * Bug in HDFStore.put() raising a wrong TypeError when saving a DataFrame 
with non-string dtype (GH34274) * Bug in json_normalize() resulting in the first element of a generator object not being included in the returned DataFrame (GH35923) * Bug in read_csv() applying the thousands separator to date columns when the column should be parsed for dates and usecols is specified for engine= "python" (GH39365) * Bug in read_excel() forward filling MultiIndex names when multiple header and index columns are specified (GH34673) * Bug in read_excel() not respecting set_option() (GH34252) * Bug in read_csv() not switching true_values and false_values for nullable Boolean dtype (GH34655) * Bug in read_json() when orient="split" not maintaining a numeric string index (GH28556) * read_sql() returned an empty generator if chunksize was non-zero and the query returned no results. Now returns a generator with a single empty DataFrame (GH34411) * Bug in read_hdf() returning unexpected records when filtering on categorical string columns using the where parameter (GH39189) * Bug in read_sas() raising a ValueError when datetimes were null (GH39725) * Bug in read_excel() dropping empty values from single-column spreadsheets ( GH39808) * Bug in read_excel() loading trailing empty rows/columns for some filetypes (GH41167) * Bug in read_excel() raising an AttributeError when the excel file had a MultiIndex header followed by two empty rows and no index (GH40442) * Bug in read_excel(), read_csv(), read_table(), read_fwf(), and read_clipboard() where one blank row after a MultiIndex header with no index would be dropped (GH40442) * Bug in DataFrame.to_string() misplacing the truncation column when index= False (GH40904) * Bug in DataFrame.to_string() adding an extra dot and misaligning the truncation row when index=False (GH40904) * Bug in read_orc() always raising an AttributeError (GH40918) * Bug in read_csv() and read_table() silently ignoring prefix if names and prefix are defined, now raising a ValueError (GH39123) * Bug in read_csv() and read_excel() not 
respecting the dtype for a duplicated column name when mangle_dupe_cols is set to True (GH35211) * Bug in read_csv() silently ignoring sep if delimiter and sep are defined, now raising a ValueError (GH39823) * Bug in read_csv() and read_table() misinterpreting arguments when sys.setprofile had been previously called (GH41069) * Bug in the conversion from PyArrow to pandas (e.g. for reading Parquet) with nullable dtypes and a PyArrow array whose data buffer size is not a multiple of the dtype size (GH40896) * Bug in read_excel() would raise an error when pandas could not determine the file type even though the user specified the engine argument (GH41225) * Bug in read_clipboard() copying from an Excel file shifts values into the wrong column if there are null values in the first column (GH41108) * Bug in DataFrame.to_hdf() and Series.to_hdf() raising a TypeError when trying to append a string column to an incompatible column (GH41897) ------------------------------------------------------------------------------- Period * Comparisons of Period objects or Index, Series, or DataFrame with mismatched PeriodDtype now behave like other mismatched-type comparisons, returning False for equals, True for not-equal, and raising TypeError for inequality checks (GH39274) ------------------------------------------------------------------------------- Plotting * Bug in plotting.scatter_matrix() raising when 2d ax argument passed (GH16253) * Prevent warnings when Matplotlib's constrained_layout is enabled (GH25261) * Bug in DataFrame.plot() was showing the wrong colors in the legend if the function was called repeatedly and some calls used yerr while others didn't (GH39522) * Bug in DataFrame.plot() was showing the wrong colors in the legend if the function was called repeatedly and some calls used secondary_y and others used legend=False (GH40044) * Bug in DataFrame.plot.box() when dark_background theme was selected, caps or min/max markers for the plot were not visible (GH40769)
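The Period entry above (GH39274) describes comparison semantics that are easy to misread, so here is a minimal sketch of the scalar case, assuming pandas 1.3 or later is installed. The variable names are illustrative only. The note says inequality checks raise TypeError; the sketch catches ValueError as well, on the assumption that the concrete exception class for scalar Period objects may differ between pandas versions.

```python
import pandas as pd

# Two Period scalars with mismatched frequencies (monthly vs. daily).
monthly = pd.Period("2021-01", freq="M")
daily = pd.Period("2021-01-01", freq="D")

# Equality checks no longer raise on a frequency mismatch.
is_equal = monthly == daily
is_not_equal = monthly != daily

# Ordering comparisons are still rejected. The release note names TypeError;
# ValueError is caught too as a hedge, since the exact class is an assumption.
try:
    monthly < daily
    ordering_raised = False
except (TypeError, ValueError):
    ordering_raised = True

print(is_equal, is_not_equal, ordering_raised)
```

Code that compared periods of different frequencies inside a try/except purely to detect the mismatch can use a plain equality test instead.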
------------------------------------------------------------------------------- Groupby/resample/rolling * Bug in GroupBy.agg() with PeriodDtype columns incorrectly casting results too aggressively (GH38254) * Bug in SeriesGroupBy.value_counts() where unobserved categories in a grouped categorical Series were not tallied (GH38672) * Bug in SeriesGroupBy.value_counts() where an error was raised on an empty Series (GH39172) * Bug in GroupBy.indices() would contain non-existent indices when null values were present in the groupby keys (GH9304) * Fixed bug in GroupBy.sum() causing a loss of precision by now using Kahan summation (GH38778) * Fixed bug in GroupBy.cumsum() and GroupBy.mean() causing loss of precision through using Kahan summation (GH38934) * Bug in Resampler.aggregate() and DataFrame.transform() raising a TypeError instead of SpecificationError when missing keys had mixed dtypes (GH39025) * Bug in DataFrameGroupBy.idxmin() and DataFrameGroupBy.idxmax() with ExtensionDtype columns (GH38733) * Bug in Series.resample() would raise when the index was a PeriodIndex consisting of NaT (GH39227) * Bug in RollingGroupby.corr() and ExpandingGroupby.corr() where the groupby column would return 0 instead of np.nan when providing other that was longer than each group (GH39591) * Bug in ExpandingGroupby.corr() and ExpandingGroupby.cov() where 1 would be returned instead of np.nan when providing other that was longer than each group (GH39591) * Bug in GroupBy.mean(), GroupBy.median() and DataFrame.pivot_table() not propagating metadata (GH28283) * Bug in Series.rolling() and DataFrame.rolling() not calculating window bounds correctly when window is an offset and dates are in descending order (GH40002) * Bug in Series.groupby() and DataFrame.groupby() on an empty Series or DataFrame would lose index, columns, and/or data types when directly using the methods idxmax, idxmin, mad, min, max, sum, prod, and skew or using them through apply, aggregate, or resample (GH26411) * 
Bug in GroupBy.apply() where a MultiIndex would be created instead of an Index when used on a RollingGroupby object (GH39732) * Bug in DataFrameGroupBy.sample() where an error was raised when weights was specified and the index was an Int64Index (GH39927) * Bug in DataFrameGroupBy.aggregate() and Resampler.aggregate() would sometimes raise a SpecificationError when passed a dictionary and columns were missing; will now always raise a KeyError instead (GH40004) * Bug in DataFrameGroupBy.sample() where column selection was not applied before computing the result (GH39928) * Bug in ExponentialMovingWindow when calling __getitem__ would incorrectly raise a ValueError when providing times (GH40164) * Bug in ExponentialMovingWindow when calling __getitem__ would not retain com, span, alpha or halflife attributes (GH40164) * ExponentialMovingWindow now raises a NotImplementedError when specifying times with adjust=False due to an incorrect calculation (GH40098) * Bug in ExponentialMovingWindowGroupby.mean() where the times argument was ignored when engine='numba' (GH40951) * Bug in ExponentialMovingWindowGroupby.mean() where the wrong times were used in the case of multiple groups (GH40951) * Bug in ExponentialMovingWindowGroupby where the times vector and values became out of sync for non-trivial groups (GH40951) * Bug in Series.asfreq() and DataFrame.asfreq() dropping rows when the index was not sorted (GH39805) * Bug in aggregation functions for DataFrame not respecting numeric_only argument when level keyword was given (GH40660) * Bug in SeriesGroupBy.aggregate() where using a user-defined function to aggregate a Series with an object-typed Index causes an incorrect Index shape (GH40014) * Bug in RollingGroupby where as_index=False argument in groupby was ignored (GH39433) * Bug in GroupBy.any() and GroupBy.all() raising a ValueError when using with nullable type columns holding NA even with skipna=True (GH40585) * Bug in GroupBy.cummin() and GroupBy.cummax()
incorrectly rounding integer values near the int64 implementation bounds (GH40767) * Bug in GroupBy.rank() with nullable dtypes incorrectly raising a TypeError (GH41010) * Bug in GroupBy.cummin() and GroupBy.cummax() computing wrong result with nullable data types too large to roundtrip when casting to float (GH37493) * Bug in DataFrame.rolling() returning mean zero for all NaN window with min_periods=0 if calculation is not numerically stable (GH41053) * Bug in DataFrame.rolling() returning sum not zero for all NaN window with min_periods=0 if calculation is not numerically stable (GH41053) * Bug in SeriesGroupBy.agg() failing to retain ordered CategoricalDtype on order-preserving aggregations (GH41147) * Bug in GroupBy.min() and GroupBy.max() with multiple object-dtype columns and numeric_only=False incorrectly raising a ValueError (GH41111) * Bug in DataFrameGroupBy.rank() with the GroupBy object's axis=0 and the rank method's keyword axis=1 (GH41320) * Bug in DataFrameGroupBy.__getitem__() with non-unique columns incorrectly returning a malformed SeriesGroupBy instead of DataFrameGroupBy (GH41427) * Bug in DataFrameGroupBy.transform() with non-unique columns incorrectly raising an AttributeError (GH41427) * Bug in Resampler.apply() with non-unique columns incorrectly dropping duplicated columns (GH41445) * Bug in Series.groupby() aggregations incorrectly returning empty Series instead of raising TypeError on aggregations that are invalid for its dtype, e.g.
.prod with datetime64[ns] dtype (GH41342) * Bug in DataFrameGroupBy aggregations incorrectly failing to drop columns with invalid dtypes for that aggregation when there are no valid columns ( GH41291) * Bug in DataFrame.rolling.__iter__() where on was not assigned to the index of the resulting objects (GH40373) * Bug in DataFrameGroupBy.transform() and DataFrameGroupBy.agg() with engine= "numba" where *args were being cached with the user passed function ( GH41647) * Bug in DataFrameGroupBy methods agg, transform, sum, bfill, ffill, pad, pct_change, shift, ohlc dropping .columns.names (GH41497) ------------------------------------------------------------------------------- Reshaping * Bug in merge() raising error when performing an inner join with partial index and right_index=True when there was no overlap between indices ( GH33814) * Bug in DataFrame.unstack() with missing levels led to incorrect index names (GH37510) * Bug in merge_asof() propagating the right Index with left_index=True and right_on specification instead of left Index (GH33463) * Bug in DataFrame.join() on a DataFrame with a MultiIndex returned the wrong result when one of both indexes had only one level (GH36909) * merge_asof() now raises a ValueError instead of a cryptic TypeError in case of non-numerical merge columns (GH29130) * Bug in DataFrame.join() not assigning values correctly when the DataFrame had a MultiIndex where at least one dimension had dtype Categorical with non-alphabetically sorted categories (GH38502) * Series.value_counts() and Series.mode() now return consistent keys in original order (GH12679, GH11227 and GH39007) * Bug in DataFrame.stack() not handling NaN in MultiIndex columns correctly ( GH39481) * Bug in DataFrame.apply() would give incorrect results when the argument func was a string, axis=1, and the axis argument was not supported; now raises a ValueError instead (GH39211) * Bug in DataFrame.sort_values() not reshaping the index correctly after sorting on columns 
when ignore_index=True (GH39464) * Bug in DataFrame.append() returning incorrect dtypes with combinations of ExtensionDtype dtypes (GH39454) * Bug in DataFrame.append() returning incorrect dtypes when used with combinations of datetime64 and timedelta64 dtypes (GH39574) * Bug in DataFrame.append() with a DataFrame with a MultiIndex and appending a Series whose Index is not a MultiIndex (GH41707) * Bug in DataFrame.pivot_table() returning a MultiIndex for a single value when operating on an empty DataFrame (GH13483) * Index can now be passed to the numpy.all() function (GH40180) * Bug in DataFrame.stack() not preserving CategoricalDtype in a MultiIndex ( GH36991) * Bug in to_datetime() raising an error when the input sequence contained unhashable items (GH39756) * Bug in Series.explode() preserving the index when ignore_index was True and values were scalars (GH40487) * Bug in to_datetime() raising a ValueError when Series contains None and NaT and has more than 50 elements (GH39882) * Bug in Series.unstack() and DataFrame.unstack() with object-dtype values containing timezone-aware datetime objects incorrectly raising TypeError ( GH41875) * Bug in DataFrame.melt() raising InvalidIndexError when DataFrame has duplicate columns used as value_vars (GH41951) ------------------------------------------------------------------------------- Sparse * Bug in DataFrame.sparse.to_coo() raising a KeyError with columns that are a numeric Index without a 0 (GH18414) * Bug in SparseArray.astype() with copy=False producing incorrect results when going from integer dtype to floating dtype (GH34456) * Bug in SparseArray.max() and SparseArray.min() would always return an empty result (GH40921) ------------------------------------------------------------------------------- ExtensionArray * Bug in DataFrame.where() when other is a Series with an ExtensionDtype ( GH38729) * Fixed bug where Series.idxmax(), Series.idxmin(), Series.argmax(), and Series.argmin() would fail when the 
underlying data is an ExtensionArray (GH32749, GH33719, GH36566) * Fixed bug where some properties of subclasses of PandasExtensionDtype were improperly cached (GH40329) * Bug in DataFrame.mask() where masking a DataFrame with an ExtensionDtype raises a ValueError (GH40941) ------------------------------------------------------------------------------- Styler * Bug in Styler where the subset argument in methods raised an error for some valid MultiIndex slices (GH33562) * Styler rendered HTML output has seen minor alterations to support w3 good code standards (GH39626) * Bug in Styler where rendered HTML was missing a column class identifier for certain header cells (GH39716) * Bug in Styler.background_gradient() where text-color was not determined correctly (GH39888) * Bug in Styler.set_table_styles() where multiple elements in CSS-selectors of the table_styles argument were not correctly added (GH34061) * Bug in Styler where copying from Jupyter dropped the top left cell and misaligned headers (GH12147) * Bug in Styler.where where kwargs were not passed to the applicable callable (GH40845) * Bug in Styler causing CSS to duplicate on multiple renders (GH39395, GH40334) ------------------------------------------------------------------------------- Other * inspect.getmembers(Series) no longer raises an AbstractMethodError (GH38782) * Bug in Series.where() with numeric dtype and other=None not casting to nan (GH39761) * Bug in assert_series_equal(), assert_frame_equal(), assert_index_equal() and assert_extension_array_equal() incorrectly raising when an attribute has an unrecognized NA type (GH39461) * Bug in assert_index_equal() with exact=True not raising when comparing CategoricalIndex instances with Int64Index and RangeIndex categories (GH41263) * Bug in DataFrame.equals(), Series.equals(), and Index.equals() with object-dtype containing np.datetime64("NaT") or np.timedelta64("NaT") (GH39650) * Bug in show_versions() where console JSON output was not proper
JSON ( GH39701) * pandas can now compile on z/OS when using xlc (GH35826) * Bug in pandas.util.hash_pandas_object() not recognizing hash_key, encoding and categorize when the input object type is a DataFrame (GH41404) What's new in 1.2.5 (June 22, 2021) These are the changes in pandas 1.2.5. See Release notes for a full changelog including other versions of pandas. ------------------------------------------------------------------------------- Fixed regressions * Fixed regression in concat() between two DataFrame where one has an Index that is all-None and the other is DatetimeIndex incorrectly raising ( GH40841) * Fixed regression in DataFrame.sum() and DataFrame.prod() when min_count and numeric_only are both given (GH41074) * Fixed regression in read_csv() when using memory_map=True with an non-UTF8 encoding (GH40986) * Fixed regression in DataFrame.replace() and Series.replace() when the values to replace is a NumPy float array (GH40371) * Fixed regression in ExcelFile() when a corrupt file is opened but not closed (GH41778) * Fixed regression in DataFrame.astype() with dtype=str failing to convert NaN in categorical columns (GH41797) @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.33 2021/05/06 04:39:03 adam Exp $ d3 1 a3 1 DISTNAME= pandas-1.3.4 @ 1.33 log @py-pandas: updated to 1.2.4 What's new in 1.2.4 (April 12, 2021) Fixed regressions - Fixed regression in :meth:`DataFrame.sum` when ``min_count`` greater than the :class:`DataFrame` shape was passed resulted in a ``ValueError`` (:issue:`39738`) - Fixed regression in :meth:`DataFrame.to_json` raising ``AttributeError`` when run on PyPy (:issue:`39837`) - Fixed regression in (in)equality comparison of ``pd.NaT`` with a non-datetimelike numpy array returning a scalar instead of an array (:issue:`40722`) - Fixed regression in :meth:`DataFrame.where` not returning a copy in the case of an all True condition (:issue:`39595`) - Fixed regression in :meth:`DataFrame.replace` raising ``IndexError`` when ``regex`` was a 
multi-key dictionary (:issue:`39338`) - Fixed regression in repr of floats in an ``object`` column not respecting ``float_format`` when printed in the console or outputted through :meth:`DataFrame.to_string`, :meth:`DataFrame.to_html`, and :meth:`DataFrame.to_latex` (:issue:`40024`) - Fixed regression in NumPy ufuncs such as ``np.add`` not passing through all arguments for :class:`DataFrame` What's new in 1.2.3 (March 02, 2021) Fixed regressions - Fixed regression in :meth:`~DataFrame.to_excel` raising ``KeyError`` when giving duplicate columns with ``columns`` attribute (:issue:`39695`) - Fixed regression in nullable integer unary ops propagating mask on assignment (:issue:`39943`) - Fixed regression in :meth:`DataFrame.__setitem__` not aligning :class:`DataFrame` on right-hand side for boolean indexer (:issue:`39931`) - Fixed regression in :meth:`~DataFrame.to_json` failing to use ``compression`` with URL-like paths that are internally opened in binary mode or with user-provided file objects that are opened in binary mode (:issue:`39985`) - Fixed regression in :meth:`Series.sort_index` and :meth:`DataFrame.sort_index`, which exited with an ungraceful error when having kwarg ``ascending=None`` passed. Passing ``ascending=None`` is still considered invalid, and the improved error message suggests a proper usage (``ascending`` must be a boolean or a list-like of boolean) (:issue:`39434`) - Fixed regression in :meth:`DataFrame.transform` and :meth:`Series.transform` giving incorrect column labels when passed a dictionary with a mix of list and non-list values (:issue:`40018`) What's new in 1.2.2 (February 09, 2021) --------------------------------------- These are the changes in pandas 1.2.2. See :ref:`release` for a full changelog including other versions of pandas. {{ header }} .. --------------------------------------------------------------------------- .. 
_whatsnew_122.regressions: Fixed regressions ~~~~~~~~~~~~~~~~~ - Fixed regression in :func:`read_excel` that caused it to raise ``AttributeError`` when checking version of older xlrd versions (:issue:`38955`) - Fixed regression in :class:`DataFrame` constructor reordering element when construction from datetime ndarray with dtype not ``"datetime64[ns]"`` (:issue:`39422`) - Fixed regression in :meth:`DataFrame.astype` and :meth:`Series.astype` not casting to bytes dtype (:issue:`39474`) - Fixed regression in :meth:`~DataFrame.to_pickle` failing to create bz2/xz compressed pickle files with ``protocol=5`` (:issue:`39002`) - Fixed regression in :func:`pandas.testing.assert_series_equal` and :func:`pandas.testing.assert_frame_equal` always raising ``AssertionError`` when comparing extension dtypes (:issue:`39410`) - Fixed regression in :meth:`~DataFrame.to_csv` opening ``codecs.StreamWriter`` in binary mode instead of in text mode and ignoring user-provided ``mode`` (:issue:`39247`) - Fixed regression in :meth:`Categorical.astype` casting to incorrect dtype when ``np.int32`` is passed to dtype argument (:issue:`39402`) - Fixed regression in :meth:`~DataFrame.to_excel` creating corrupt files when appending (``mode="a"``) to an existing file (:issue:`39576`) - Fixed regression in :meth:`DataFrame.transform` failing in case of an empty DataFrame or Series (:issue:`39636`) - Fixed regression in :meth:`~DataFrame.groupby` or :meth:`~DataFrame.resample` when aggregating an all-NaN or numeric object dtype column (:issue:`39329`) - Fixed regression in :meth:`.Rolling.count` where the ``min_periods`` argument would be set to ``0`` after the operation (:issue:`39554`) - Fixed regression in :func:`read_excel` that incorrectly raised when the argument ``io`` was a non-path and non-buffer and the ``engine`` argument was specified (:issue:`39528`) .. --------------------------------------------------------------------------- .. 
_whatsnew_122.bug_fixes: Bug fixes ~~~~~~~~~ - :func:`pandas.read_excel` error message when a specified ``sheetname`` does not exist is now uniform across engines (:issue:`39250`) - Fixed bug in :func:`pandas.read_excel` producing incorrect results when the engine ``openpyxl`` is used and the excel file is missing or has incorrect dimension information; the fix requires ``openpyxl`` >= 3.0.0, prior versions may still fail (:issue:`38956`, :issue:`39001`) - Fixed bug in :func:`pandas.read_excel` sometimes producing a ``DataFrame`` with trailing rows of ``np.nan`` when the engine ``openpyxl`` is used (:issue:`39181`) What's new in 1.2.1 (January 20, 2021) -------------------------------------- These are the changes in pandas 1.2.1. See :ref:`release` for a full changelog including other versions of pandas. {{ header }} .. --------------------------------------------------------------------------- .. _whatsnew_121.regressions: Fixed regressions ~~~~~~~~~~~~~~~~~ - Fixed regression in :meth:`~DataFrame.to_csv` that created corrupted zip files when there were more rows than ``chunksize`` (:issue:`38714`) - Fixed regression in :meth:`~DataFrame.to_csv` opening ``codecs.StreamReaderWriter`` in binary mode instead of in text mode (:issue:`39247`) - Fixed regression in :meth:`read_csv` and other read functions where the encoding error policy (``errors``) did not default to ``"replace"`` when no encoding was specified (:issue:`38989`) - Fixed regression in :func:`read_excel` with non-rawbyte file handles (:issue:`38788`) - Fixed regression in :meth:`DataFrame.to_stata` not removing the created file when an error occurred (:issue:`39202`) - Fixed regression in ``DataFrame.__setitem__`` raising ``ValueError`` when expanding :class:`DataFrame` and new column is of type ``"0 - name"`` (:issue:`39010`) - Fixed regression in setting with :meth:`DataFrame.loc` raising ``ValueError`` when :class:`DataFrame` has unsorted :class:`MultiIndex` columns and indexer is a scalar
(:issue:`38601`) - Fixed regression in setting with :meth:`DataFrame.loc` raising ``KeyError`` with :class:`MultiIndex` and list-like columns indexer enlarging :class:`DataFrame` (:issue:`39147`) - Fixed regression in :meth:`~DataFrame.groupby()` with :class:`Categorical` grouping column not showing unused categories for ``grouped.indices`` (:issue:`38642`) - Fixed regression in :meth:`.GroupBy.sem` where the presence of non-numeric columns would cause an error instead of being dropped (:issue:`38774`) - Fixed regression in :meth:`.DataFrameGroupBy.diff` raising for ``int8`` and ``int16`` columns (:issue:`39050`) - Fixed regression in :meth:`DataFrame.groupby` when aggregating an ``ExtensionDType`` that could fail for non-numeric values (:issue:`38980`) - Fixed regression in :meth:`.Rolling.skew` and :meth:`.Rolling.kurt` modifying the object inplace (:issue:`38908`) - Fixed regression in :meth:`DataFrame.any` and :meth:`DataFrame.all` not returning a result for tz-aware ``datetime64`` columns (:issue:`38723`) - Fixed regression in :meth:`DataFrame.apply` with ``axis=1`` using str accessor in apply function (:issue:`38979`) - Fixed regression in :meth:`DataFrame.replace` raising ``ValueError`` when :class:`DataFrame` has dtype ``bytes`` (:issue:`38900`) - Fixed regression in :meth:`Series.fillna` that raised ``RecursionError`` with ``datetime64[ns, UTC]`` dtype (:issue:`38851`) - Fixed regression in comparisons between ``NaT`` and ``datetime.date`` objects incorrectly returning ``True`` (:issue:`39151`) - Fixed regression in calling NumPy :func:`~numpy.ufunc.accumulate` ufuncs on DataFrames, e.g. 
``np.maximum.accumulate(df)`` (:issue:`39259`) - Fixed regression in repr of float-like strings of an ``object`` dtype having trailing 0's truncated after the decimal (:issue:`38708`) - Fixed regression that raised ``AttributeError`` with PyArrow versions [0.16.0, 1.0.0) (:issue:`38801`) - Fixed regression in :func:`pandas.testing.assert_frame_equal` raising ``TypeError`` with ``check_like=True`` when :class:`Index` or columns have mixed dtype (:issue:`39168`) We have reverted a commit that resulted in several plotting related regressions in pandas 1.2.0 (:issue:`38969`, :issue:`38736`, :issue:`38865`, :issue:`38947` and :issue:`39126`). As a result, bugs reported as fixed in pandas 1.2.0 related to inconsistent tick labeling in bar plots are again present (:issue:`26186` and :issue:`11465`) What's new in 1.2.0 (December 26, 2020) Performance improvements - Performance improvements when creating DataFrame or Series with dtype ``str`` or :class:`StringDtype` from array with many string elements (:issue:`36304`, :issue:`36317`, :issue:`36325`, :issue:`36432`, :issue:`37371`) - Performance improvement in :meth:`.GroupBy.agg` with the ``numba`` engine (:issue:`35759`) - Performance improvements when creating :meth:`Series.map` from a huge dictionary (:issue:`34717`) - Performance improvement in :meth:`.GroupBy.transform` with the ``numba`` engine (:issue:`36240`) - :class:`.Styler` uuid method altered to compress data transmission over web whilst maintaining reasonably low table collision probability (:issue:`36345`) - Performance improvement in :func:`to_datetime` with non-ns time unit for ``float`` ``dtype`` columns (:issue:`20445`) - Performance improvement in setting values on an :class:`IntervalArray` (:issue:`36310`) - The internal index method :meth:`~Index._shallow_copy` now makes the new index and original index share cached attributes, avoiding creating these again, if created on either. 
This can speed up operations that depend on creating copies of existing indexes (:issue:`36840`) - Performance improvement in :meth:`.RollingGroupby.count` (:issue:`35625`) - Small performance decrease to :meth:`.Rolling.min` and :meth:`.Rolling.max` for fixed windows (:issue:`36567`) - Reduced peak memory usage in :meth:`DataFrame.to_pickle` when using ``protocol=5`` in python 3.8+ (:issue:`34244`) - Faster ``dir`` calls when the object has many index labels, e.g. ``dir(ser)`` (:issue:`37450`) - Performance improvement in :class:`ExpandingGroupby` (:issue:`37064`) - Performance improvement in :meth:`Series.astype` and :meth:`DataFrame.astype` for :class:`Categorical` (:issue:`8628`) - Performance improvement in :meth:`DataFrame.groupby` for ``float`` ``dtype`` (:issue:`28303`), changes of the underlying hash-function can lead to changes in float based indexes sort ordering for ties (e.g. :meth:`Index.value_counts`) - Performance improvement in :meth:`pd.isin` for inputs with more than 1e6 elements (:issue:`36611`) - Performance improvement for :meth:`DataFrame.__setitem__` with list-like indexers (:issue:`37954`) - :meth:`read_json` now avoids reading entire file into memory when chunksize is specified (:issue:`34548`) Bug fixes Categorical - :meth:`Categorical.fillna` will always return a copy, validate a passed fill value regardless of whether there are any NAs to fill, and disallow an ``NaT`` as a fill value for numeric categories (:issue:`36530`) - Bug in :meth:`Categorical.__setitem__` that incorrectly raised when trying to set a tuple value (:issue:`20439`) - Bug in :meth:`CategoricalIndex.equals` incorrectly casting non-category entries to ``np.nan`` (:issue:`37667`) - Bug in :meth:`CategoricalIndex.where` incorrectly setting non-category entries to ``np.nan`` instead of raising ``TypeError`` (:issue:`37977`) - Bug in :meth:`Categorical.to_numpy` and ``np.array(categorical)`` with tz-aware ``datetime64`` categories incorrectly dropping the time zone 
information instead of casting to object dtype (:issue:`38136`) Datetime-like - Bug in :meth:`DataFrame.combine_first` that would convert datetime-like column on other :class:`DataFrame` to integer when the column is not present in original :class:`DataFrame` (:issue:`28481`) - Bug in :attr:`.DatetimeArray.date` where a ``ValueError`` would be raised with a read-only backing array (:issue:`33530`) - Bug in ``NaT`` comparisons failing to raise ``TypeError`` on invalid inequality comparisons (:issue:`35046`) - Bug in :class:`.DateOffset` where attributes reconstructed from pickle files differ from original objects when input values exceed normal ranges (e.g. months=12) (:issue:`34511`) - Bug in :meth:`.DatetimeIndex.get_slice_bound` where ``datetime.date`` objects were not accepted or naive :class:`Timestamp` with a tz-aware :class:`.DatetimeIndex` (:issue:`35690`) - Bug in :meth:`.DatetimeIndex.slice_locs` where ``datetime.date`` objects were not accepted (:issue:`34077`) - Bug in :meth:`.DatetimeIndex.searchsorted`, :meth:`.TimedeltaIndex.searchsorted`, :meth:`PeriodIndex.searchsorted`, and :meth:`Series.searchsorted` with ``datetime64``, ``timedelta64`` or :class:`Period` dtype placement of ``NaT`` values being inconsistent with NumPy (:issue:`36176`, :issue:`36254`) - Inconsistency in :class:`.DatetimeArray`, :class:`.TimedeltaArray`, and :class:`.PeriodArray` method ``__setitem__`` casting arrays of strings to datetime-like scalars but not scalar strings (:issue:`36261`) - Bug in :meth:`.DatetimeArray.take` incorrectly allowing ``fill_value`` with a mismatched time zone (:issue:`37356`) - Bug in :class:`.DatetimeIndex.shift` incorrectly raising when shifting empty indexes (:issue:`14811`) - :class:`Timestamp` and :class:`.DatetimeIndex` comparisons between tz-aware and tz-naive objects now follow the standard library ``datetime`` behavior, returning ``True``/``False`` for ``!=``/``==`` and raising for inequality comparisons (:issue:`28507`) - Bug in 
:meth:`.DatetimeIndex.equals` and :meth:`.TimedeltaIndex.equals` incorrectly considering ``int64`` indexes as equal (:issue:`36744`) - :meth:`Series.to_json`, :meth:`DataFrame.to_json`, and :meth:`read_json` now implement time zone parsing when orient structure is ``table`` (:issue:`35973`) - :meth:`astype` now attempts to convert to ``datetime64[ns, tz]`` directly from ``object`` with inferred time zone from string (:issue:`35973`) - Bug in :meth:`.TimedeltaIndex.sum` and :meth:`Series.sum` with ``timedelta64`` dtype on an empty index or series returning ``NaT`` instead of ``Timedelta(0)`` (:issue:`31751`) - Bug in :meth:`.DatetimeArray.shift` incorrectly allowing ``fill_value`` with a mismatched time zone (:issue:`37299`) - Bug in adding a :class:`.BusinessDay` with nonzero ``offset`` to a non-scalar other (:issue:`37457`) - Bug in :func:`to_datetime` with a read-only array incorrectly raising (:issue:`34857`) - Bug in :meth:`Series.isin` with ``datetime64[ns]`` dtype and :meth:`.DatetimeIndex.isin` incorrectly casting integers to datetimes (:issue:`36621`) - Bug in :meth:`Series.isin` with ``datetime64[ns]`` dtype and :meth:`.DatetimeIndex.isin` failing to consider tz-aware and tz-naive datetimes as always different (:issue:`35728`) - Bug in :meth:`Series.isin` with ``PeriodDtype`` dtype and :meth:`PeriodIndex.isin` failing to consider arguments with different ``PeriodDtype`` as always different (:issue:`37528`) - Bug in :class:`Period` constructor now correctly handles nanoseconds in the ``value`` argument (:issue:`34621` and :issue:`17053`) Timedelta - Bug in :class:`.TimedeltaIndex`, :class:`Series`, and :class:`DataFrame` floor-division with ``timedelta64`` dtypes and ``NaT`` in the denominator (:issue:`35529`) - Bug in parsing of ISO 8601 durations in :class:`Timedelta` and :func:`to_datetime` (:issue:`29773`, :issue:`36204`) - Bug in :func:`to_timedelta` with a read-only array incorrectly raising (:issue:`34857`) - Bug in :class:`Timedelta` incorrectly 
truncating to sub-second portion of a string input when it has precision higher than nanoseconds (:issue:`36738`) Timezones - Bug in :func:`date_range` was raising ``AmbiguousTimeError`` for valid input with ``ambiguous=False`` (:issue:`35297`) - Bug in :meth:`Timestamp.replace` was losing fold information (:issue:`37610`) Numeric - Bug in :func:`to_numeric` where float precision was incorrect (:issue:`31364`) - Bug in :meth:`DataFrame.any` with ``axis=1`` and ``bool_only=True`` ignoring the ``bool_only`` keyword (:issue:`32432`) - Bug in :meth:`Series.equals` where a ``ValueError`` was raised when NumPy arrays were compared to scalars (:issue:`35267`) - Bug in :class:`Series` where two Series each have a :class:`.DatetimeIndex` with different time zones having those indexes incorrectly changed when performing arithmetic operations (:issue:`33671`) - Bug in :mod:`pandas.testing` module functions when used with ``check_exact=False`` on complex numeric types (:issue:`28235`) - Bug in :meth:`DataFrame.__rmatmul__` error handling reporting transposed shapes (:issue:`21581`) - Bug in :class:`Series` flex arithmetic methods where the result when operating with a ``list``, ``tuple`` or ``np.ndarray`` would have an incorrect name (:issue:`36760`) - Bug in :class:`.IntegerArray` multiplication with ``timedelta`` and ``np.timedelta64`` objects (:issue:`36870`) - Bug in :class:`MultiIndex` comparison with tuple incorrectly treating tuple as array-like (:issue:`21517`) - Bug in :meth:`DataFrame.diff` with ``datetime64`` dtypes including ``NaT`` values failing to fill ``NaT`` results correctly (:issue:`32441`) - Bug in :class:`DataFrame` arithmetic ops incorrectly accepting keyword arguments (:issue:`36843`) - Bug in :class:`.IntervalArray` comparisons with :class:`Series` not returning Series (:issue:`36908`) - Bug in :class:`DataFrame` allowing arithmetic operations with list of array-likes with undefined results. 
Behavior changed to raising ``ValueError`` (:issue:`36702`) - Bug in :meth:`DataFrame.std` with ``timedelta64`` dtype and ``skipna=False`` (:issue:`37392`) - Bug in :meth:`DataFrame.min` and :meth:`DataFrame.max` with ``datetime64`` dtype and ``skipna=False`` (:issue:`36907`) - Bug in :meth:`DataFrame.idxmax` and :meth:`DataFrame.idxmin` with mixed dtypes incorrectly raising ``TypeError`` (:issue:`38195`) Conversion - Bug in :meth:`DataFrame.to_dict` with ``orient='records'`` now returns python native datetime objects for datetime-like columns (:issue:`21256`) - Bug in :meth:`Series.astype` conversion from ``string`` to ``float`` raised in presence of ``pd.NA`` values (:issue:`37626`) Strings - Bug in :meth:`Series.to_string`, :meth:`DataFrame.to_string`, and :meth:`DataFrame.to_latex` adding a leading space when ``index=False`` (:issue:`24980`) - Bug in :func:`to_numeric` raising a ``TypeError`` when attempting to convert a string dtype Series containing only numeric strings and ``NA`` (:issue:`37262`) Interval - Bug in :meth:`DataFrame.replace` and :meth:`Series.replace` where :class:`Interval` dtypes would be converted to object dtypes (:issue:`34871`) - Bug in :meth:`IntervalIndex.take` with negative indices and ``fill_value=None`` (:issue:`37330`) - Bug in :meth:`IntervalIndex.putmask` with datetime-like dtype incorrectly casting to object dtype (:issue:`37968`) - Bug in :meth:`IntervalArray.astype` incorrectly dropping dtype information with a :class:`CategoricalDtype` object (:issue:`37984`) Indexing - Bug in :meth:`PeriodIndex.get_loc` incorrectly raising ``ValueError`` on non-datelike strings instead of ``KeyError``, causing similar errors in :meth:`Series.__getitem__`, :meth:`Series.__contains__`, and :meth:`Series.loc.__getitem__` (:issue:`34240`) - Bug in :meth:`Index.sort_values` where, when empty values were passed, the method would break by trying to compare missing values instead of pushing them to the end of the sort order (:issue:`35584`) - Bug in 
:meth:`Index.get_indexer` and :meth:`Index.get_indexer_non_unique` where ``int64`` arrays are returned instead of ``intp`` (:issue:`36359`) - Bug in :meth:`DataFrame.sort_index` where parameter ascending passed as a list on a single level index gives wrong result (:issue:`32334`) - Bug in :meth:`DataFrame.reset_index` was incorrectly raising a ``ValueError`` for input with a :class:`MultiIndex` with missing values in a level with ``Categorical`` dtype (:issue:`24206`) - Bug in indexing with boolean masks on datetime-like values sometimes returning a view instead of a copy (:issue:`36210`) - Bug in :meth:`DataFrame.__getitem__` and :meth:`DataFrame.loc.__getitem__` with :class:`IntervalIndex` columns and a numeric indexer (:issue:`26490`) - Bug in :meth:`Series.loc.__getitem__` with a non-unique :class:`MultiIndex` and an empty-list indexer (:issue:`13691`) - Bug in indexing on a :class:`Series` or :class:`DataFrame` with a :class:`MultiIndex` and a level named ``"0"`` (:issue:`37194`) - Bug in :meth:`Series.__getitem__` when using an unsigned integer array as an indexer giving incorrect results or segfaulting instead of raising ``KeyError`` (:issue:`37218`) - Bug in :meth:`Index.where` incorrectly casting numeric values to strings (:issue:`37591`) - Bug in :meth:`DataFrame.loc` returning empty result when indexer is a slice with negative step size (:issue:`38071`) - Bug in :meth:`Series.loc` and :meth:`DataFrame.loc` raises when the index was of ``object`` dtype and the given numeric label was in the index (:issue:`26491`) - Bug in :meth:`DataFrame.loc` returned requested key plus missing values when ``loc`` was applied to single level from a :class:`MultiIndex` (:issue:`27104`) - Bug in indexing on a :class:`Series` or :class:`DataFrame` with a :class:`CategoricalIndex` using a list-like indexer containing NA values (:issue:`37722`) - Bug in :meth:`DataFrame.loc.__setitem__` expanding an empty :class:`DataFrame` with mixed dtypes (:issue:`37932`) - Bug in 
:meth:`DataFrame.xs` ignored ``droplevel=False`` for columns (:issue:`19056`) - Bug in :meth:`DataFrame.reindex` raising ``IndexingError`` wrongly for empty DataFrame with ``tolerance`` not ``None`` or ``method="nearest"`` (:issue:`27315`) - Bug in indexing on a :class:`Series` or :class:`DataFrame` with a :class:`CategoricalIndex` using list-like indexer that contains elements that are in the index's ``categories`` but not in the index itself failing to raise ``KeyError`` (:issue:`37901`) - Bug on inserting a boolean label into a :class:`DataFrame` with a numeric :class:`Index` columns incorrectly casting to integer (:issue:`36319`) - Bug in :meth:`DataFrame.iloc` and :meth:`Series.iloc` aligning objects in ``__setitem__`` (:issue:`22046`) - Bug in :meth:`MultiIndex.drop` does not raise if labels are partially found (:issue:`37820`) - Bug in :meth:`DataFrame.loc` did not raise ``KeyError`` when missing combination was given with ``slice(None)`` for remaining levels (:issue:`19556`) - Bug in :meth:`DataFrame.loc` raising ``TypeError`` when non-integer slice was given to select values from :class:`MultiIndex` (:issue:`25165`, :issue:`24263`) - Bug in :meth:`Series.at` returning :class:`Series` with one element instead of scalar when index is a :class:`MultiIndex` with one level (:issue:`38053`) - Bug in :meth:`DataFrame.loc` returning and assigning elements in wrong order when indexer is differently ordered than the :class:`MultiIndex` to filter (:issue:`31330`, :issue:`34603`) - Bug in :meth:`DataFrame.loc` and :meth:`DataFrame.__getitem__` raising ``KeyError`` when columns were :class:`MultiIndex` with only one level (:issue:`29749`) - Bug in :meth:`Series.__getitem__` and :meth:`DataFrame.__getitem__` raising blank ``KeyError`` without missing keys for :class:`IntervalIndex` (:issue:`27365`) - Bug in setting a new label on a :class:`DataFrame` or :class:`Series` with a :class:`CategoricalIndex` incorrectly raising ``TypeError`` when the new label is not among the 
index's categories (:issue:`38098`) - Bug in :meth:`Series.loc` and :meth:`Series.iloc` raising ``ValueError`` when inserting a list-like ``np.array``, ``list`` or ``tuple`` in an ``object`` Series of equal length (:issue:`37748`, :issue:`37486`) - Bug in :meth:`Series.loc` and :meth:`Series.iloc` setting all the values of an ``object`` Series with those of a list-like ``ExtensionArray`` instead of inserting it (:issue:`38271`) Missing - Bug in :meth:`.SeriesGroupBy.transform` now correctly handles missing values for ``dropna=False`` (:issue:`35014`) - Bug in :meth:`Series.nunique` with ``dropna=True`` was returning incorrect results when both ``NA`` and ``None`` missing values were present (:issue:`37566`) - Bug in :meth:`Series.interpolate` where kwarg ``limit_area`` and ``limit_direction`` had no effect when using methods ``pad`` and ``backfill`` (:issue:`31048`) MultiIndex - Bug in :meth:`DataFrame.xs` when used with :class:`IndexSlice` raises ``TypeError`` with message ``"Expected label or tuple of labels"`` (:issue:`35301`) - Bug in :meth:`DataFrame.reset_index` with ``NaT`` values in index raises ``ValueError`` with message ``"cannot convert float NaN to integer"`` (:issue:`36541`) - Bug in :meth:`DataFrame.combine_first` when used with :class:`MultiIndex` containing string and ``NaN`` values raises ``TypeError`` (:issue:`36562`) - Bug in :meth:`MultiIndex.drop` dropped ``NaN`` values when non existing key was given as input (:issue:`18853`) - Bug in :meth:`MultiIndex.drop` dropping more values than expected when index has duplicates and is not sorted (:issue:`33494`) I/O - :func:`read_sas` no longer leaks resources on failure (:issue:`35566`) - Bug in :meth:`DataFrame.to_csv` and :meth:`Series.to_csv` caused a ``ValueError`` when it was called with a filename in combination with ``mode`` containing a ``b`` (:issue:`35058`) - Bug in :meth:`read_csv` with ``float_precision='round_trip'`` did not handle ``decimal`` and ``thousands`` parameters (:issue:`35365`) 
- :meth:`to_pickle` and :meth:`read_pickle` were closing user-provided file objects (:issue:`35679`) - :meth:`to_csv` passes compression arguments for ``'gzip'`` always to ``gzip.GzipFile`` (:issue:`28103`) - :meth:`to_csv` did not support zip compression for binary file object not having a filename (:issue:`35058`) - :meth:`to_csv` and :meth:`read_csv` did not honor ``compression`` and ``encoding`` for path-like objects that are internally converted to file-like objects (:issue:`35677`, :issue:`26124`, :issue:`32392`) - :meth:`DataFrame.to_pickle`, :meth:`Series.to_pickle`, and :meth:`read_pickle` did not support compression for file-objects (:issue:`26237`, :issue:`29054`, :issue:`29570`) - Bug in :func:`LongTableBuilder.middle_separator` was duplicating LaTeX longtable entries in the List of Tables of a LaTeX document (:issue:`34360`) - Bug in :meth:`read_csv` with ``engine='python'`` truncating data if multiple items present in first row and first element started with BOM (:issue:`36343`) - Removed ``private_key`` and ``verbose`` from :func:`read_gbq` as they are no longer supported in ``pandas-gbq`` (:issue:`34654`, :issue:`30200`) - Bumped minimum pytables version to 3.5.1 to avoid a ``ValueError`` in :meth:`read_hdf` (:issue:`24839`) - Bug in :func:`read_table` and :func:`read_csv` when ``delim_whitespace=True`` and ``sep=default`` (:issue:`36583`) - Bug in :meth:`DataFrame.to_json` and :meth:`Series.to_json` when used with ``lines=True`` and ``orient='records'`` the last line of the record is not appended with 'new line character' (:issue:`36888`) - Bug in :meth:`read_parquet` with fixed offset time zones. 
String representation of time zones was not recognized (:issue:`35997`, :issue:`36004`) - Bug in :meth:`DataFrame.to_html`, :meth:`DataFrame.to_string`, and :meth:`DataFrame.to_latex` ignoring the ``na_rep`` argument when ``float_format`` was also specified (:issue:`9046`, :issue:`13828`) - Bug in output rendering of complex numbers showing too many trailing zeros (:issue:`36799`) - Bug in :class:`HDFStore` threw a ``TypeError`` when exporting an empty DataFrame with ``datetime64[ns, tz]`` dtypes with a fixed HDF5 store (:issue:`20594`) - Bug in :class:`HDFStore` was dropping time zone information when exporting a Series with ``datetime64[ns, tz]`` dtypes with a fixed HDF5 store (:issue:`20594`) - :func:`read_csv` was closing user-provided binary file handles when ``engine="c"`` and an ``encoding`` was requested (:issue:`36980`) - Bug in :meth:`DataFrame.to_hdf` was not dropping missing rows with ``dropna=True`` (:issue:`35719`) - Bug in :func:`read_html` was raising a ``TypeError`` when supplying a ``pathlib.Path`` argument to the ``io`` parameter (:issue:`37705`) - :meth:`DataFrame.to_excel`, :meth:`Series.to_excel`, :meth:`DataFrame.to_markdown`, and :meth:`Series.to_markdown` now support writing to fsspec URLs such as S3 and Google Cloud Storage (:issue:`33987`) - Bug in :func:`read_fwf` with ``skip_blank_lines=True`` was not skipping blank lines (:issue:`37758`) - Parse missing values using :func:`read_json` with ``dtype=False`` to ``NaN`` instead of ``None`` (:issue:`28501`) - :meth:`read_fwf` was inferring compression with ``compression=None`` which was not consistent with the other ``read_*`` functions (:issue:`37909`) - :meth:`DataFrame.to_html` was ignoring ``formatters`` argument for ``ExtensionDtype`` columns (:issue:`36525`) - Bumped minimum xarray version to 0.12.3 to avoid reference to the removed ``Panel`` class (:issue:`27101`, :issue:`37983`) - :meth:`DataFrame.to_csv` was re-opening file-like handles that also implement ``os.PathLike`` 
(:issue:`38125`) - Bug in the conversion of a sliced ``pyarrow.Table`` with missing values to a DataFrame (:issue:`38525`) - Bug in :func:`read_sql_table` raising a ``sqlalchemy.exc.OperationalError`` when column names contained a percentage sign (:issue:`37517`) Period - Bug in :meth:`DataFrame.replace` and :meth:`Series.replace` where :class:`Period` dtypes would be converted to object dtypes (:issue:`34871`) Plotting - Bug in :meth:`DataFrame.plot` was rotating xticklabels when ``subplots=True``, even if the x-axis wasn't an irregular time series (:issue:`29460`) - Bug in :meth:`DataFrame.plot` where a marker letter in the ``style`` keyword sometimes caused a ``ValueError`` (:issue:`21003`) - Bug in :meth:`DataFrame.plot.bar` and :meth:`Series.plot.bar` where ticks positions were assigned by value order instead of using the actual value for numeric or a smart ordering for string (:issue:`26186`, :issue:`11465`). This fix has been reverted in pandas 1.2.1, see :doc:`v1.2.1` - Twinned axes were losing their tick labels which should only happen to all but the last row or column of 'externally' shared axes (:issue:`33819`) - Bug in :meth:`Series.plot` and :meth:`DataFrame.plot` was throwing a :exc:`ValueError` when the Series or DataFrame was indexed by a :class:`.TimedeltaIndex` with a fixed frequency and the x-axis lower limit was greater than the upper limit (:issue:`37454`) - Bug in :meth:`.DataFrameGroupBy.boxplot` when ``subplots=False`` would raise a ``KeyError`` (:issue:`16748`) - Bug in :meth:`DataFrame.plot` and :meth:`Series.plot` was overwriting matplotlib's shared y axes behavior when no ``sharey`` parameter was passed (:issue:`37942`) - Bug in :meth:`DataFrame.plot` was raising a ``TypeError`` with ``ExtensionDtype`` columns (:issue:`32073`) Styler - Bug in :meth:`Styler.render` HTML was generated incorrectly because of formatting error in ``rowspan`` attribute, it now matches with w3 syntax (:issue:`38234`) Groupby/resample/rolling - Bug in 
:meth:`.DataFrameGroupBy.count` and :meth:`SeriesGroupBy.sum` returning ``NaN`` for missing categories when grouped on multiple ``Categoricals``. Now returning ``0`` (:issue:`35028`) - Bug in :meth:`.DataFrameGroupBy.apply` that would sometimes throw an erroneous ``ValueError`` if the grouping axis had duplicate entries (:issue:`16646`) - Bug in :meth:`DataFrame.resample` that would throw a ``ValueError`` when resampling from ``"D"`` to ``"24H"`` over a transition into daylight savings time (DST) (:issue:`35219`) - Bug when combining methods :meth:`DataFrame.groupby` with :meth:`DataFrame.resample` and :meth:`DataFrame.interpolate` raising a ``TypeError`` (:issue:`35325`) - Bug in :meth:`.DataFrameGroupBy.apply` where a non-nuisance grouping column would be dropped from the output columns if another groupby method was called before ``.apply`` (:issue:`34656`) - Bug when subsetting columns on a :class:`~pandas.core.groupby.DataFrameGroupBy` (e.g. ``df.groupby('a')[['b']]``) would reset the attributes ``axis``, ``dropna``, ``group_keys``, ``level``, ``mutated``, ``sort``, and ``squeeze`` to their default values (:issue:`9959`) - Bug in :meth:`.DataFrameGroupBy.tshift` failing to raise ``ValueError`` when a frequency cannot be inferred for the index of a group (:issue:`35937`) - Bug in :meth:`DataFrame.groupby` does not always maintain column index name for ``any``, ``all``, ``bfill``, ``ffill``, ``shift`` (:issue:`29764`) - Bug in :meth:`.DataFrameGroupBy.apply` raising error with ``np.nan`` group(s) when ``dropna=False`` (:issue:`35889`) - Bug in :meth:`.Rolling.sum` returned wrong values when dtypes were mixed between float and integer and ``axis=1`` (:issue:`20649`, :issue:`35596`) - Bug in :meth:`.Rolling.count` returned ``np.nan`` with :class:`~pandas.api.indexers.FixedForwardWindowIndexer` as window, ``min_periods=0`` and only missing values in the window (:issue:`35579`) - Bug where :class:`pandas.core.window.Rolling` produces incorrect window sizes when
using a ``PeriodIndex`` (:issue:`34225`) - Bug in :meth:`.DataFrameGroupBy.ffill` and :meth:`.DataFrameGroupBy.bfill` where a ``NaN`` group would return filled values instead of ``NaN`` when ``dropna=True`` (:issue:`34725`) - Bug in :meth:`.RollingGroupby.count` where a ``ValueError`` was raised when specifying the ``closed`` parameter (:issue:`35869`) - Bug in :meth:`.DataFrameGroupBy.rolling` returning wrong values with partial centered window (:issue:`36040`) - Bug in :meth:`.DataFrameGroupBy.rolling` returned wrong values with time aware window containing ``NaN``. Now raises ``ValueError`` because the windows are not monotonic (:issue:`34617`) - Bug in :meth:`.Rolling.__iter__` where a ``ValueError`` was not raised when ``min_periods`` was larger than ``window`` (:issue:`37156`) - Using :meth:`.Rolling.var` instead of :meth:`.Rolling.std` avoids numerical issues for :meth:`.Rolling.corr` when :meth:`.Rolling.var` is still within floating point precision while :meth:`.Rolling.std` is not (:issue:`31286`) - Bug in :meth:`.DataFrameGroupBy.quantile` and :meth:`.Resampler.quantile` raised ``TypeError`` when values were of type ``Timedelta`` (:issue:`29485`) - Bug in :meth:`.Rolling.median` and :meth:`.Rolling.quantile` returned wrong values for :class:`.BaseIndexer` subclasses with non-monotonic starting or ending points for windows (:issue:`37153`) - Bug in :meth:`DataFrame.groupby` dropped ``nan`` groups from result with ``dropna=False`` when grouping over a single column (:issue:`35646`, :issue:`35542`) - Bug in :meth:`.DataFrameGroupBy.head`, :meth:`DataFrameGroupBy.tail`, :meth:`SeriesGroupBy.head`, and :meth:`SeriesGroupBy.tail` would raise when used with ``axis=1`` (:issue:`9772`) - Bug in :meth:`.DataFrameGroupBy.transform` would raise when used with ``axis=1`` and a transformation kernel (e.g.
"shift") (:issue:`36308`) - Bug in :meth:`.DataFrameGroupBy.resample` using ``.agg`` with sum produced different result than just calling ``.sum`` (:issue:`33548`) - Bug in :meth:`.DataFrameGroupBy.apply` dropped values on ``nan`` group when returning the same axes with the original frame (:issue:`38227`) - Bug in :meth:`.DataFrameGroupBy.quantile` couldn't handle array-like ``q`` when grouping by columns (:issue:`33795`) - Bug in :meth:`DataFrameGroupBy.rank` with ``datetime64tz`` or period dtype incorrectly casting results to those dtypes instead of returning ``float64`` dtype (:issue:`38187`) Reshaping - Bug in :func:`crosstab` was returning incorrect results on inputs with duplicate row names, duplicate column names or duplicate names between row and column labels (:issue:`22529`) - Bug in :meth:`DataFrame.pivot_table` with ``aggfunc='count'`` or ``aggfunc='sum'`` returning ``NaN`` for missing categories when pivoted on a ``Categorical``. Now returning ``0`` (:issue:`31422`) - Bug in :func:`concat` and :class:`DataFrame` constructor where input index names are not preserved in some cases (:issue:`13475`) - Bug in :func:`crosstab` when using multiple columns with ``margins=True`` and ``normalize=True`` (:issue:`35144`) - Bug in :meth:`DataFrame.stack` where stacking an empty DataFrame would raise an error (:issue:`36113`). Now returning an empty Series with empty MultiIndex. - Bug in :meth:`Series.unstack`.
Now a Series with single level of Index trying to unstack would raise a ``ValueError`` (:issue:`36113`) - Bug in :meth:`DataFrame.agg` with ``func={'name':}`` incorrectly raising ``TypeError`` when ``DataFrame.columns==['Name']`` (:issue:`36212`) - Bug in :meth:`Series.transform` would give incorrect results or raise when the argument ``func`` was a dictionary (:issue:`35811`) - Bug in :meth:`DataFrame.pivot` did not preserve :class:`MultiIndex` level names for columns when rows and columns are both multiindexed (:issue:`36360`) - Bug in :meth:`DataFrame.pivot` modified ``index`` argument when ``columns`` was passed but ``values`` was not (:issue:`37635`) - Bug in :meth:`DataFrame.join` returned a non deterministic level-order for the resulting :class:`MultiIndex` (:issue:`36910`) - Bug in :meth:`DataFrame.combine_first` caused wrong alignment with dtype ``string`` and one level of ``MultiIndex`` containing only ``NA`` (:issue:`37591`) - Fixed regression in :func:`merge` on merging :class:`.DatetimeIndex` with empty DataFrame (:issue:`36895`) - Bug in :meth:`DataFrame.apply` not setting index of return value when ``func`` return type is ``dict`` (:issue:`37544`) - Bug in :meth:`DataFrame.merge` and :meth:`pandas.merge` returning inconsistent ordering in result for ``how=right`` and ``how=left`` (:issue:`35382`) - Bug in :func:`merge_ordered` couldn't handle list-like ``left_by`` or ``right_by`` (:issue:`35269`) - Bug in :func:`merge_ordered` returned wrong join result when length of ``left_by`` or ``right_by`` equals to the rows of ``left`` or ``right`` (:issue:`38166`) - Bug in :func:`merge_ordered` didn't raise when elements in ``left_by`` or ``right_by`` not exist in ``left`` columns or ``right`` columns (:issue:`38167`) - Bug in :func:`DataFrame.drop_duplicates` not validating bool dtype for ``ignore_index`` keyword (:issue:`38274`) ExtensionArray - Fixed bug where :class:`DataFrame` column set to scalar extension type via a dict instantiation was considered an 
object type rather than the extension type (:issue:`35965`) - Fixed bug where ``astype()`` with equal dtype and ``copy=False`` would return a new object (:issue:`28488`) - Fixed bug when applying a NumPy ufunc with multiple outputs to an :class:`.IntegerArray` returning ``None`` (:issue:`36913`) - Fixed an inconsistency in :class:`.PeriodArray`'s ``__init__`` signature to those of :class:`.DatetimeArray` and :class:`.TimedeltaArray` (:issue:`37289`) - Reductions for :class:`.BooleanArray`, :class:`.Categorical`, :class:`.DatetimeArray`, :class:`.FloatingArray`, :class:`.IntegerArray`, :class:`.PeriodArray`, :class:`.TimedeltaArray`, and :class:`.PandasArray` are now keyword-only methods (:issue:`37541`) - Fixed a bug where a ``TypeError`` was wrongly raised if a membership check was made on an ``ExtensionArray`` containing nan-like values (:issue:`37867`) Other - Bug in :meth:`DataFrame.replace` and :meth:`Series.replace` incorrectly raising an ``AssertionError`` instead of a ``ValueError`` when invalid parameter combinations are passed (:issue:`36045`) - Bug in :meth:`DataFrame.replace` and :meth:`Series.replace` with numeric values and string ``to_replace`` (:issue:`34789`) - Fixed metadata propagation in :meth:`Series.abs` and ufuncs called on Series and DataFrames (:issue:`28283`) - Bug in :meth:`DataFrame.replace` and :meth:`Series.replace` incorrectly casting from ``PeriodDtype`` to object dtype (:issue:`34871`) - Fixed bug in metadata propagation incorrectly copying DataFrame columns as metadata when the column name overlaps with the metadata name (:issue:`37037`) - Fixed metadata propagation in the :class:`Series.dt`, :class:`Series.str` accessors, :class:`DataFrame.duplicated`, :class:`DataFrame.stack`, :class:`DataFrame.unstack`, :class:`DataFrame.pivot`, :class:`DataFrame.append`, :class:`DataFrame.diff`, :class:`DataFrame.applymap` and :class:`DataFrame.update` methods (:issue:`28283`, :issue:`37381`) - Fixed metadata propagation when selecting columns 
with ``DataFrame.__getitem__`` (:issue:`28283`) - Bug in :meth:`Index.intersection` with non-:class:`Index` failing to set the correct name on the returned :class:`Index` (:issue:`38111`) - Bug in :meth:`RangeIndex.intersection` failing to set the correct name on the returned :class:`Index` in some corner cases (:issue:`38197`) - Bug in :meth:`Index.difference` failing to set the correct name on the returned :class:`Index` in some corner cases (:issue:`38268`) - Bug in :meth:`Index.union` behaving differently depending on whether operand is an :class:`Index` or other list-like (:issue:`36384`) - Bug in :meth:`Index.intersection` with non-matching numeric dtypes casting to ``object`` dtype instead of minimal common dtype (:issue:`38122`) - Bug in :meth:`IntervalIndex.union` returning an incorrectly-typed :class:`Index` when empty (:issue:`38282`) - Passing an array with 2 or more dimensions to the :class:`Series` constructor now raises the more specific ``ValueError`` rather than a bare ``Exception`` (:issue:`35744`) - Bug in ``dir`` where ``dir(obj)`` wouldn't show attributes defined on the instance for pandas objects (:issue:`37173`) - Bug in :meth:`Index.drop` raising ``InvalidIndexError`` when index has duplicates (:issue:`38051`) - Bug in :meth:`RangeIndex.difference` returning :class:`Int64Index` in some cases where it should return :class:`RangeIndex` (:issue:`38028`) - Fixed bug in :func:`assert_series_equal` when comparing a datetime-like array with an equivalent non extension dtype array (:issue:`37609`) - Bug in :func:`.is_bool_dtype` would raise when passed a valid string such as ``"boolean"`` (:issue:`38386`) - Fixed regression in logical operators raising ``ValueError`` when columns of :class:`DataFrame` are a :class:`CategoricalIndex` with unused categories (:issue:`38367`) @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.32 2021/04/09 14:41:35 tnn Exp $ d3 1 a3 1 DISTNAME= pandas-1.2.4 @ 1.32 log @revert wrong fix for py-scipy python 3.6 deprecation, fix 
properly @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.31 2020/10/12 21:52:03 bacon Exp $ d3 1 a3 1 DISTNAME= pandas-0.25.3 a4 1 PKGREVISION= 1 d14 1 a14 1 DEPENDS+= ${PYPKGPREFIX}-dateutil>=2.6.1:../../time/py-dateutil d17 1 a17 1 DEPENDS+= ${PYPKGPREFIX}-pytz>=2017.2:../../time/py-pytz d23 1 a23 1 TEST_DEPENDS+= ${PYPKGPREFIX}-test>=4.0.2:../../devel/py-test a24 3 # 20 test failures as of 0.18.1, see # https://github.com/pydata/pandas/issues/12337 # https://github.com/pydata/pandas/issues/14043 d28 1 a28 1 PYTHON_VERSIONS_INCOMPATIBLE= 36 27 # py-scipy d30 1 a30 1 PYSETUPTESTTARGET= pytest d33 1 a33 1 BUILDLINK_API_DEPENDS.pynumpy+= ${PYPKGPREFIX}-numpy>=1.13.3 @ 1.31 log @math/blas, math/lapack: Install interchangeable BLAS system Install the new interchangeable BLAS system created by Thomas Orgis, currently supporting Netlib BLAS/LAPACK, OpenBLAS, cblas, lapacke, and Apple's Accelerate.framework. This system allows the user to select any BLAS implementation without modifying packages or using package options, by setting PKGSRC_BLAS_TYPES in mk.conf. See mk/blas.buildlink3.mk for details. This commit should not alter behavior of existing packages as the system defaults to Netlib BLAS/LAPACK, which until now has been the only supported implementation. 
Details: Add new mk/blas.buildlink3.mk for inclusion in dependent packages Install compatible Netlib math/blas and math/lapack packages Update math/blas and math/lapack MAINTAINER approved by adam@@ OpenBLAS, cblas, and lapacke will follow in separate commits Update direct dependents to use mk/blas.buildlink3.mk Perform recursive revbump @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.30 2020/02/14 16:21:55 minskim Exp $ d32 1 a32 1 PYTHON_VERSIONS_INCOMPATIBLE= 27 @ 1.30 log @math/py-pandas: Update to 0.25.3 Highlights: - Groupby aggregation with relabeling - Better repr for MultiIndex - Better truncated repr for Series and DataFrame - Series.explode to split list-like values to rows @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.29 2020/01/26 17:31:40 rillig Exp $ d5 1 @ 1.29 log @all: migrate homepages from http to https pkglint -r --network --only "migrate" As a side-effect of migrating the homepages, pkglint also fixed a few indentations in unrelated lines. These and the new homepages have been checked manually. @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.28 2019/06/16 19:14:52 adam Exp $ d3 1 a3 1 DISTNAME= pandas-0.24.2 d14 1 a14 1 DEPENDS+= ${PYPKGPREFIX}-dateutil>=2.5.0:../../time/py-dateutil d17 1 a17 1 DEPENDS+= ${PYPKGPREFIX}-pytz>=2011:../../time/py-pytz d22 3 a24 2 TEST_DEPENDS+= ${PYPKGPREFIX}-hypothesis-[0-9]*:../../devel/py-hypothesis TEST_DEPENDS+= ${PYPKGPREFIX}-test-[0-9]*:../../devel/py-test d31 1 a31 1 PYTHON_VERSIONS_INCOMPATIBLE= 27 # py-matplotlib, py-scipy d36 1 a36 1 BUILDLINK_API_DEPENDS.pynumpy+= ${PYPKGPREFIX}-numpy>=1.12.0 @ 1.28 log @py-pandas: updated to 0.24.2 Whats New in 0.24.2 Fixed Regressions Bug Fixes Whats New in 0.24.1 Changing the sort parameter for Index set operations Fixed Regressions Bug Fixes What’s New in 0.24.0 This is a major release from 0.23.4 and includes a number of API changes, new features, enhancements, and performance improvements along with a large number of bug fixes. 
Highlights include: * Optional Integer NA Support * New APIs for accessing the array backing a Series or Index * A new top-level method for creating arrays * Store Interval and Period data in a Series or DataFrame * Support for joining on two MultiIndexes @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.27 2018/08/10 09:00:36 adam Exp $ d9 1 a9 1 HOMEPAGE= http://pandas.pydata.org/ @ 1.27 log @py-pandas: updated to 0.23.4 v0.23.4: This is a minor bug-fix release in the 0.23.x series and includes some regression fixes, bug fixes, and performance improvements. We recommend that all users upgrade to this version. @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.26 2018/07/09 08:22:45 adam Exp $ d3 1 a3 1 DISTNAME= pandas-0.23.4 d17 1 a17 1 DEPENDS+= ${PYPKGPREFIX}-pytz>=1.5:../../time/py-pytz d21 3 a23 1 TEST_DEPENDS+= ${PYPKGPREFIX}-nose-[0-9]*:../../devel/py-nose d30 4 d35 1 a35 1 BUILDLINK_API_DEPENDS.pynumpy+= ${PYPKGPREFIX}-numpy>=1.7.0 @ 1.26 log @py-pandas: updated to 0.23.3 0.23.3: This is a minor bug-fix release in the 0.23.x series and includes a fix for the source distribution on Python 3.7. We recommend that all users upgrade to this version. @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.25 2018/07/05 01:21:05 minskim Exp $ d3 1 a3 1 DISTNAME= pandas-0.23.3 a25 1 PLIST_SUBST+= PYPKGPREFIX=${PYPKGPREFIX} @ 1.25 log @Update path to math/py-tables @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.24 2018/07/04 06:50:04 adam Exp $ d3 1 a3 2 DISTNAME= pandas-0.23.1 PKGREVISION= 1 @ 1.24 log @py-pandas: revbump for py-tables @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.23 2018/06/18 07:08:23 adam Exp $ d21 1 a21 1 DEPENDS+= ${PYPKGPREFIX}-tables>=2.2:../../math/py-pytables @ 1.23 log @py-pandas: updated to 0.23.1 pandas 0.23.1 This is a minor release from 0.23.0 and includes a number of bug fixes and performance improvements. 
@ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.22 2018/05/30 07:56:30 adam Exp $ d4 1 a17 1 DEPENDS+= ${PYPKGPREFIX}-pytables>=2.2:../../math/py-pytables d21 1 @ 1.22 log @py-pandas: updated to 0.23.0 v0.23.0: This is a major release from 0.22.0 and includes a number of API changes, deprecations, new features, enhancements, and performance improvements along with a large number of bug fixes. We recommend that all users upgrade to this version. Highlights include: - Round-trippable JSON format with 'table' orient - Instantiation from dicts respects order for Python 3.6+ - Dependent column arguments for assign - Merging / sorting on a combination of columns and index levels - Extending Pandas with custom types - Excluding unobserved categories from groupby - Changes to make output shape of DataFrame.apply consistent @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.21 2018/01/30 09:21:44 adam Exp $ d3 1 a3 1 DISTNAME= pandas-0.23.0 d14 1 a14 1 DEPENDS+= ${PYPKGPREFIX}-dateutil-[0-9]*:../../time/py-dateutil @ 1.21 log @Now DEPENDS on py-matplotlib rather than buildlinking @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.20 2018/01/05 16:13:51 adam Exp $ d3 1 a3 1 DISTNAME= pandas-0.22.0 d21 1 a21 2 # TEST_DEPENDS BUILD_DEPENDS+= ${PYPKGPREFIX}-nose-[0-9]*:../../devel/py-nose @ 1.20 log @py-pandas: updated to 0.22.0 v0.22.0: This is a major release from 0.21.1 and includes a single, API-breaking change. We recommend that all users upgrade to this version after carefully reading the release note. The only changes are: * The sum of an empty or all-NA Series is now 0 * The product of an empty or all-NA Series is now 1 * We’ve added a min_count parameter to .sum() and .prod() controlling the minimum number of valid values for the result to be valid. If fewer than min_count non-NA values are present, the result is NA. The default is 0. To return NaN, the 0.21 behavior, use min_count=1. 
@ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.19 2017/12/14 13:37:59 adam Exp $ d15 1 a29 1 .include "../../graphics/py-matplotlib/buildlink3.mk" @ 1.19 log @py-pandas: updated to 0.21.1 v0.21.1: Restore Matplotlib datetime Converter Registration New features - Improvements to the Parquet IO functionality - Other Enhancements Deprecations Performance Improvements Bug Fixes - Conversion - Indexing - I/O - Plotting - Groupby/Resample/Rolling - Reshaping - Numeric - Categorical - String @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.18 2017/11/02 09:41:38 adam Exp $ d3 1 a3 1 DISTNAME= pandas-0.21.1 @ 1.18 log @py-pandas: updated to 0.21.0 v0.21.0 Final: This is a major release from 0.20.3 and includes a number of API changes, deprecations, new features, enhancements, and performance improvements along with a large number of bug fixes. We recommend that all users upgrade to this version. Highlights include: * Integration with Apache Parquet, including a new top-level read_parquet function and DataFrame.to_parquet method, see here. * New user-facing dtype pandas.api.types.CategoricalDtype for specifying categoricals independent of the data, see here. * The behavior of sum and prod on all-NaN Series/DataFrames is now consistent and no longer depends on whether bottleneck is installed, see here. * Compatibility fixes for pypy, see here. * Additions to the drop, reindex and rename API to make them more consistent, see here. * Addition of the new methods DataFrame.infer_objects (see here) and GroupBy.pipe (see here). 
* Indexing with a list of labels, where one or more of the labels is missing, is deprecated and will raise a KeyError in a future version @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.17 2017/07/14 10:17:02 adam Exp $ d3 1 a3 1 DISTNAME= pandas-0.21.0 @ 1.17 log @0.20.3 Bug Fixes * Fixed a bug in failing to compute rolling computations of a column-MultiIndexed DataFrame * Fixed a pytest marker failing downstream packages’ test suites Conversion * Bug in pickle compat prior to the v0.20.x series, when UTC is a timezone in a Series/DataFrame/Index * Bug in Series construction when passing a Series with dtype='category'. * Bug in DataFrame.astype() when passing a Series as the dtype kwarg. Indexing * Bug in Float64Index causing an empty array instead of None to be returned from .get(np.nan) on a Series whose index did not contain any NaNs * Bug in MultiIndex.isin causing an error when passing an empty iterable * Fixed a bug in slicing a DataFrame/Series that has a TimedeltaIndex I/O * Bug in read_csv() in which files weren’t opened as binary files by the C engine on Windows, causing EOF characters mid-field, which would fail * Bug in read_hdf() in which reading a Series saved to an HDF file in ‘fixed’ format fails when an explicit mode='r' argument is supplied * Bug in DataFrame.to_latex() where bold_rows was wrongly specified to be True by default, whereas in reality row labels remained non-bold whatever parameter provided.
* Fixed an issue with DataFrame.style() where generated element ids were not unique * Fixed loading a DataFrame with a PeriodIndex, from a format='fixed' HDFStore, in Python 3, that was written in Python 2 Plotting * Fixed regression that prevented RGB and RGBA tuples from being used as color arguments * Fixed an issue with DataFrame.plot.scatter() that incorrectly raised a KeyError when categorical data is used for plotting Reshaping * PeriodIndex / TimedeltaIndex.join was missing the sort= kwarg * Bug in joining on a MultiIndex with a category dtype for a level. * Bug in merge() when merging/joining with multiple categorical columns Categorical * Bug in DataFrame.sort_values not respecting the kind parameter with categorical data @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.16 2017/06/07 08:13:56 adam Exp $ d3 1 a3 1 DISTNAME= pandas-0.20.3 @ 1.16 log @v0.20.2: This is a minor bug-fix release in the 0.20.x series and includes some small regression fixes, bug fixes and performance improvements. We recommend that all users upgrade to this version. @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.15 2017/05/21 08:54:33 adam Exp $ d3 1 a3 1 DISTNAME= pandas-0.20.2 @ 1.15 log @Changes 0.20.1: New .agg() API for Series/DataFrame similar to the groupby-rolling-resample API’s, see here Integration with the feather-format, including a new top-level pd.read_feather() and DataFrame.to_feather() method, see here. 
The .ix indexer has been deprecated, see here Panel has been deprecated, see here Addition of an IntervalIndex and Interval scalar type, see here Improved user API when grouping by index levels in .groupby(), see here Improved support for UInt64 dtypes, see here A new orient for JSON serialization, orient='table', that uses the Table Schema spec and that gives the possibility for a more interactive repr in the Jupyter Notebook, see here Experimental support for exporting styled DataFrames (DataFrame.style) to Excel, see here Window binary corr/cov operations now return a MultiIndexed DataFrame rather than a Panel, as Panel is now deprecated, see here Support for S3 handling now uses s3fs, see here Google BigQuery support now uses the pandas-gbq library, see here @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.14 2017/02/20 17:00:36 wiz Exp $ d3 1 a3 1 DISTNAME= pandas-0.20.1 d13 1 d16 1 a18 1 DEPENDS+= ${PYPKGPREFIX}-pytables>=2.2:../../math/py-pytables a19 1 BUILDLINK_API_DEPENDS.pynumpy+= ${PYPKGPREFIX}-numpy>=1.7.0 d29 1 a29 4 # XXX Avoid picking up other compilers when installed .include "../../mk/compiler.mk" # XXX want py-bottleneck d31 1 a32 1 .include "../../graphics/py-matplotlib/buildlink3.mk" @ 1.14 log @Switch py-dateutils to plain DEPENDS. It supports both python 2 and 3 nowadays. @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.13 2016/08/19 07:57:26 wiz Exp $ d3 1 a3 1 DISTNAME= pandas-0.18.1 a4 1 PKGREVISION= 1 @ 1.13 log @Prefer egg.mk to distutils.mk. Clean up. Add missing dependency on py-sqlite3. Add missing test dependency on py-nose. Add comments with links to bug reports about test failures. Bump PKGREVISION for dependency change. @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.12 2016/08/16 03:22:12 maya Exp $ d14 1 a26 2 PYTHON_VERSIONED_DEPENDENCIES= dateutil a34 1 .include "../../lang/python/versioned_dependencies.mk" @ 1.12 log @Update py-pandas to 0.18.1 Highlights in changelog: v0.18.1: .groupby(...) 
has been enhanced to provide convenient syntax when working with .rolling(..), .expanding(..) and .resample(..) per group, see here pd.to_datetime() has gained the ability to assemble dates from a DataFrame, see here Method chaining improvements, see here. Custom business hour offset, see here. Many bug fixes in the handling of sparse, see here Expanded the Tutorials section with a feature on modern pandas, courtesy of @@TomAugsburger. (GH13045). v0.18.0: Moving and expanding window functions are now methods on Series and DataFrame, similar to .groupby, see here. Adding support for a RangeIndex as a specialized form of the Int64Index for memory savings, see here. API breaking change to the .resample method to make it more .groupby like, see here. Removal of support for positional indexing with floats, which was deprecated since 0.14.0. This will now raise a TypeError, see here. The .to_xarray() function has been added for compatibility with the xarray package, see here. The read_sas function has been enhanced to read sas7bdat files, see here. Addition of the .str.extractall() method, and API changes to the .str.extract() method and .str.cat() method. pd.test() top-level nose test runner is available (GH4327). Update by K.I.A.Derouiche in PR pkg/51272 Slightly modified. @ text @d1 1 a1 1 # $NetBSD$ d5 1 d18 1 d20 5 a27 1 PYDISTUTILSPKG= yes d35 1 a35 1 .include "../../lang/python/distutils.mk" @ 1.11 log @Do not include py-numexpr/bl3.mk, just DEPEND on it. @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.10 2016/06/08 17:43:35 wiz Exp $ d3 1 a3 1 DISTNAME= pandas-0.17.1 d29 1 a29 1 .include "../../lang/python/egg.mk" @ 1.10 log @Switch to MASTER_SITES_PYPI. @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.9 2015/12/28 14:35:02 wiz Exp $ d13 1 a31 1 .include "../../math/py-numexpr/buildlink3.mk" @ 1.9 log @Update py-pandas to 0.17.1. 
0.17.1 This is a minor bug-fix release from 0.17.0 and includes a large number of bug fixes along with several new features, enhancements, and performance improvements. We recommend that all users upgrade to this version. Highlights include: Support for Conditional HTML Formatting, see here Releasing the GIL on the csv reader & other ops, see here Fixed regression in DataFrame.drop_duplicates from 0.16.2, causing incorrect results on integer values (GH11376) 0.17.0 This is a major release from 0.16.2 and includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. We recommend that all users upgrade to this version. Highlights include: Release the Global Interpreter Lock (GIL) on some cython operations, see here Plotting methods are now available as attributes of the .plot accessor, see here The sorting API has been revamped to remove some long-time inconsistencies, see here Support for a datetime64[ns] with timezones as a first-class dtype, see here The default for to_datetime will now be to raise when presented with unparseable formats, previously this would return the original input. Also, date parse functions now return consistent results. See here The default for dropna in HDFStore has changed to False, to store by default all rows even if they are all NaN, see here Datetime accessor (dt) now supports Series.dt.strftime to generate formatted strings for datetime-likes, and Series.dt.total_seconds to generate each duration of the timedelta in seconds. See here Period and PeriodIndex can handle multiplied freq like 3D, which corresponds to a 3-day span.
See here Development installed versions of pandas will now have PEP440 compliant version strings (GH9518) Development support for benchmarking with the Air Speed Velocity library (GH8361) Support for reading SAS xport files, see here Documentation comparing SAS to pandas, see here Removal of the automatic TimeSeries broadcasting, deprecated since 0.8.0, see here Display format with plain text can optionally align with Unicode East Asian Width, see here Compatibility with Python 3.5 (GH11097) Compatibility with matplotlib 1.5.0 (GH11111) @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.8 2015/07/21 19:44:45 bad Exp $ d6 1 a6 1 MASTER_SITES= http://pypi.python.org/packages/source/p/pandas/ @ 1.8 log @Update py-pandas to 0.16.2. Closes PR pkg/49958 by matthewd. Changes since 0.14.1 (for a full list see http://pandas.pydata.org/pandas-docs/stable/whatsnew.html): v 0.16.2 This is a minor bug-fix release from 0.16.1 and includes a large number of bug fixes along with some new features (pipe() method), enhancements, and performance improvements. We recommend that all users upgrade to this version. Highlights include: A new pipe method Documentation on how to use numba with pandas v 0.16.1 This is a minor bug-fix release from 0.16.0 and includes a large number of bug fixes along with several new features, enhancements, and performance improvements. We recommend that all users upgrade to this version. Highlights include: Support for a CategoricalIndex, a category based index New section on how-to-contribute to pandas Revised “Merge, join, and concatenate” documentation, including graphical examples to make it easier to understand each operation New method sample for drawing random samples from Series, DataFrames and Panels.
The default Index printing has changed to a more uniform format BusinessHour datetime-offset is now supported Further enhancement to the .str accessor to make string operations easier v0.16.0 (March 22, 2015) This is a major release from 0.15.2 and includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. We recommend that all users upgrade to this version. Highlights include: DataFrame.assign method Series.to_coo/from_coo methods to interact with scipy.sparse Backwards incompatible change to Timedelta to conform the .seconds attribute with datetime.timedelta Changes to the .loc slicing API to conform with the behavior of .ix Changes to the default for ordering in the Categorical constructor Enhancement to the .str accessor to make string operations easier The pandas.tools.rplot, pandas.sandbox.qtpandas and pandas.rpy modules are deprecated. We refer users to external packages like seaborn, pandas-qt and rpy2 for similar or equivalent functionality, see here v0.15.0 (October 18, 2014) This is a major release from 0.14.1 and includes a small number of API changes, several new features, enhancements, and performance improvements along with a large number of bug fixes. We recommend that all users upgrade to this version. Warning pandas >= 0.15.0 will no longer support compatibility with NumPy versions < 1.7.0. 
If you want to use the latest versions of pandas, please upgrade to NumPy >= 1.7.0 (GH7711) Highlights include: The Categorical type was integrated as a first-class pandas type New scalar type Timedelta, and a new index type TimedeltaIndex New datetimelike properties accessor .dt for Series, see Datetimelike Properties New DataFrame default display for df.info() to include memory usage, see Memory Usage read_csv will now by default ignore blank lines when parsing API change in using Indexes in set operations Enhancements in the handling of timezones A lot of improvements to the rolling and expanding moment functions Internal refactoring of the Index class to no longer sub-class ndarray, see Internal Refactoring dropping support for PyTables less than version 3.0.0, and numexpr less than version 2.1 (GH7990) Split indexing documentation into Indexing and Selecting Data and MultiIndex / Advanced Indexing Split out string methods documentation into Working with Text Data @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.7 2014/07/19 13:17:46 bad Exp $ d3 1 a3 1 DISTNAME= pandas-0.16.2 @ 1.7 log @Update math/py-pandas to 0.14.1. This is two major releases since 0.12.0. Changes include API changes, new features, enhancements, and performance improvements along with a large number of bug fixes. For the detailed list of changes see http://pandas.pydata.org/pandas-docs/stable/whatsnew.html @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.6 2014/01/16 10:41:53 wiz Exp $ d3 1 a3 1 DISTNAME= pandas-0.14.1 d16 1 a16 1 BUILDLINK_API_DEPENDS.pynumpy+= ${PYPKGPREFIX}-numpy>=1.6.1 d21 2 a22 2 PLIST_SUBST+= PYPKGPREFIX=${PYPKGPREFIX} USE_LANGUAGES+= c c++ @ 1.6 log @Convert to use versioned_dependencies.mk. @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.5 2013/12/10 13:00:30 bad Exp $ d3 1 a3 1 DISTNAME= pandas-0.12.0 @ 1.5 log @Update pandas to 0.12.0. This is a major release from 0.11.0 and includes several new features and enhancements along with a large number of bug fixes.
Highlights include a consistent I/O API naming scheme, routines to read html, write multi-indexes to csv files, read & write STATA data files, read & write JSON format files, Python 3 support for HDFStore, filtering of groupby expressions via filter, and a revamped replace routine that accepts regular expressions. For detailed changes see: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.4 2013/05/16 23:10:16 bad Exp $ a12 1 DEPENDS+= ${PYPKGPREFIX}-dateutil>=1.5:../../time/py-dateutil d18 2 d29 1 @ 1.4 log @Update py-pandas to 0.11.0. Summary of changes since 0.10.1: This is a major release from 0.10.1 and includes many new features and enhancements along with a large number of bug fixes. The methods of Selecting Data have had quite a number of additions, and Dtype support is now full-fledged. There are also a number of important API changes that long-time pandas users should pay close attention to. * New precision indexing fields loc, iloc, at, and iat, to reduce occasional ambiguity in the catch-all hitherto ix method. * Expanded support for NumPy data types in DataFrame. * NumExpr integration to accelerate various operator evaluation. * Improved DataFrame to CSV exporting performance. For a full list refer to the "what's new" page. Also fixes PLIST errors introduced in last update. @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.3 2013/02/16 00:02:19 bad Exp $ d3 1 a3 1 DISTNAME= pandas-0.11.0 d17 1 @ 1.3 log @Update pandas to 0.10.1.
Release date: 2013-01-22 New features: Add data interface to World Bank WDI pandas.io.wb (GH2592) API Changes: Restored inplace=True behavior returning self (same object) with deprecation warning until 0.11 (GH1893) HDFStore refactored HDFStore to deal with non-table stores as objects, will allow future enhancements removed keyword compression from put (replaced by keyword complib to be consistent across library) warn PerformanceWarning if you are attempting to store types that will be pickled by PyTables Improvements to existing features: HDFStore enables storing of multi-index dataframes (closes GH1277) support data column indexing and selection, via data_columns keyword in append support write chunking to reduce memory footprint, via chunksize keyword to append support automagic indexing via index keyword to append support expectedrows keyword in append to inform PyTables about the expected tablesize support start and stop keywords in select to limit the row selection space added get_store context manager to automatically import with pandas added column filtering via columns keyword in select added methods append_to_multiple/select_as_multiple/ select_as_coordinates to do multiple-table append/selection added support for datetime64 in columns added method unique to select the unique values in an indexable or data column added method copy to copy an existing store (and possibly upgrade) show the shape of the data on disk for non-table stores when printing the store added ability to read PyTables flavor tables (allows compatibility to other HDF5 systems) Add logx option to DataFrame/Series.plot (GH2327, GH2565) Support reading gzipped data from file-like object pivot_table aggfunc can be anything used in GroupBy.aggregate (GH2643) Implement DataFrame merges in case where set cardinalities might overflow 64-bit integer (GH2690) Raise exception in C file parser if integer dtype specified and have NA values.
(GH2631) Attempt to parse ISO8601 format dates when parse_dates=True in read_csv for major performance boost in such cases (GH2698) Add methods neg and inv to Series Implement kind option in ExcelFile to indicate whether it's an XLS or XLSX file (GH2613) Bug fixes: Fix read_csv/read_table multithreading issues (GH2608) HDFStore correctly handle nan elements in string columns; serialize via the nan_rep keyword to append raise correctly on non-implemented column types (unicode/date) handle correctly Term passed types (e.g. index<1000, when index is Int64), (closes GH512) handle Timestamp correctly in data_columns (closes GH2637) contains correctly matches on non-natural names correctly store float32 dtypes in tables (if not other float types in the same table) Fix DataFrame.info bug with UTF8-encoded columns. (GH2576) Fix DatetimeIndex handling of FixedOffset tz (GH2604) More robust detection of being in IPython session for wide DataFrame console formatting (GH2585) Fix platform issues with file:/// in unit test (GH2564) Fix bug and possible segfault when grouping by hierarchical level that contains NA values (GH2616) Ensure that MultiIndex tuples can be constructed with NAs (GH2616) Fix int64 overflow issue when unstacking MultiIndex with many levels (GH2616) Exclude non-numeric data from DataFrame.quantile by default (GH2625) Fix a Cython C int64 boxing issue causing read_csv to return incorrect results (GH2599) Fix groupby summing performance issue on boolean data (GH2692) Don't bork Series containing datetime64 values with to_datetime (GH2699) Fix DataFrame.from_records corner case when passed columns, index column, but empty record list (GH2633) Fix C parser-tokenizer bug with trailing fields. 
(GH2668) Don't exclude non-numeric data from GroupBy.max/min (GH2700) Don't lose time zone when calling DatetimeIndex.drop (GH2621) Fix setitem on a Series with a boolean key and a non-scalar as value (GH2686) Box datetime64 values in Series.apply/map (GH2627, GH2689) Upconvert datetime + datetime64 values when concatenating frames (GH2624) Raise a more helpful error message in merge operations when one DataFrame has duplicate columns (GH2649) Fix partial date parsing issue occurring only when code is run at EOM (GH2618) Prevent MemoryError when using counting sort in sortlevel with high-cardinality MultiIndex objects (GH2684) Fix Period resampling bug when all values fall into a single bin (GH2070) Fix buggy interaction with usecols argument in read_csv when there is an implicit first index column (GH2654) @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.2 2013/01/07 23:18:35 bad Exp $ d3 1 a3 1 DISTNAME= pandas-0.10.1 d25 1 d28 1 @ 1.2 log @Update pandas to 0.10.0. pkgsrc change: depend on math/py-pytables. Changes since 0.9.1: * Delimited file parsing engine rewritten to use a fraction of memory while being 40%+ faster. - Much-improved Unicode handling via the encoding option. - Column filtering (usecols) - Dtype specification (dtype argument) - Ability to specify strings to be recognized as True/False - Ability to yield NumPy record arrays (as_recarray) - High performance delim_whitespace option - Decimal format (e.g. European format) specification - Easier CSV dialect options: escapechar, lineterminator, quotechar, etc. - More robust handling of many exceptional kinds of files observed in the wild * API changes - Deprecated DataFrame BINOP TimeSeries special case behavior - Altered resample default behavior - Infinity and negative infinity are no longer treated as NA by isnull and notnull. - Methods with the inplace option now all return None instead of the calling object. - pandas.merge no longer sorts the group keys (sort=False) by default.
- The default column names for a file with no header have been changed. - Values like 'Yes' and 'No' are not interpreted as boolean by default. - The file parsers will not recognize non-string values arising from a converter function as NA. - Calling fillna on Series or DataFrame with no arguments is no longer valid code. - Series.apply will now operate on a returned value from the applied function. - New API functions for working with pandas options. * New features - Wide DataFrame Printing. - Updated PyTables Support. * Enhancements - added ability to hierarchical keys. - added mixed-dtype support! - performance improvements on table writing. - support for arbitrarily indexed dimensions. - SparseSeries now has a density property. * Bug fixes - added Term method of specifying where conditions. - del store['df'] now call store.remove('df') for store deletion. - deleting of consecutive rows is much faster than before. - in_itemsize parameter can be specified in table creation to force a minimum size for indexing columns. - indexing support via create_table_index (requires PyTables >= 2.3) - appending on a store would fail if the table was not first created via put. - fixed issue with missing attributes after loading a pickled dataframe. - minor change to select and remove: require a table ONLY if where is also provided. * Compatibility - 0.10 of HDFStore is backwards compatible for reading tables created in a prior version of pandas, however, query terms using the prior (undocumented) methodology are unsupported. * N Dimensional Panels (Experimental) @ text @d1 1 a1 1 # $NetBSD: Makefile,v 1.1 2012/11/22 00:15:13 bad Exp $ d3 1 a3 1 DISTNAME= pandas-0.10.0 @ 1.1 log @Initial import of pandas 0.9.1. pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language. @ text @d1 1 a1 1 # $NetBSD$ d3 1 a3 1 DISTNAME= pandas-0.9.1 d16 1 @