BUG: 'ArrowExtensionArray' object has no attribute 'max' when passing pyarrow-backed series to .iloc #61311


Open
3 tasks done
MarcoGorelli opened this issue Apr 19, 2025 · 0 comments · May be fixed by #61321
Labels: Bug, Needs Triage

Comments

@MarcoGorelli
Member

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({"a": [1, 2], "c": [0, 2], "d": ["c", "a"]})

In [3]: df.iloc[:, df['c']]  # works fine
Out[3]:
   a  d
0  1  c
1  2  a

In [4]: df = pd.DataFrame({"a": [1, 2], "c": [0, 2], "d": ["c", "a"]}).convert_dtypes(dtype_backend='pyarrow')

In [5]: df.iloc[:, df['c']]  # now, it raises
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[5], line 1
----> 1 df.iloc[:, df['c']]

File ~/pandas-dev/pandas/core/indexing.py:1189, in _LocationIndexer.__getitem__(self, key)
   1187     if self._is_scalar_access(key):
   1188         return self.obj._get_value(*key, takeable=self._takeable)
-> 1189     return self._getitem_tuple(key)
   1190 else:
   1191     # we by definition only have the 0th axis
   1192     axis = self.axis or 0

File ~/pandas-dev/pandas/core/indexing.py:1692, in _iLocIndexer._getitem_tuple(self, tup)
   1691 def _getitem_tuple(self, tup: tuple):
-> 1692     tup = self._validate_tuple_indexer(tup)
   1693     with suppress(IndexingError):
   1694         return self._getitem_lowerdim(tup)

File ~/pandas-dev/pandas/core/indexing.py:975, in _LocationIndexer._validate_tuple_indexer(self, key)
    973 for i, k in enumerate(key):
    974     try:
--> 975         self._validate_key(k, i)
    976     except ValueError as err:
    977         raise ValueError(
    978             f"Location based indexing can only have [{self._valid_types}] types"
    979         ) from err

File ~/pandas-dev/pandas/core/indexing.py:1613, in _iLocIndexer._validate_key(self, key, axis)
   1610         raise IndexError(f".iloc requires numeric indexers, got {arr}")
   1612     # check that the key does not exceed the maximum size of the index
-> 1613     if len(arr) and (arr.max() >= len_axis or arr.min() < -len_axis):
   1614         raise IndexError("positional indexers are out-of-bounds")
   1615 else:

AttributeError: 'ArrowExtensionArray' object has no attribute 'max'
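
The last frame shows the bounds check calling .max() and .min() directly on the key's backing array. As a rough sketch only (validate_positions is a hypothetical helper, not pandas code and not necessarily what #61321 does), the same check can be written against a NumPy view of the key, so it no longer depends on which array type backs the Series:

```python
import numpy as np


def validate_positions(key, len_axis: int) -> np.ndarray:
    """Hypothetical stand-in for the bounds check in the traceback.

    Coerces the key to a NumPy array first, so the reductions below
    are always NumPy's, whatever container the key arrived in.
    """
    arr = np.asarray(key)

    # Positional indexers must be integers (signed or unsigned).
    if arr.size and arr.dtype.kind not in "iu":
        raise IndexError(f".iloc requires numeric indexers, got {arr}")

    # Same condition as in the traceback: every position must fall
    # within [-len_axis, len_axis).
    if arr.size and (arr.max() >= len_axis or arr.min() < -len_axis):
        raise IndexError("positional indexers are out-of-bounds")

    return arr


# Positions 0 and 2 are valid for an axis of length 3.
checked = validate_positions([0, 2], len_axis=3)
```

Whether coercing to NumPy is the right fix inside pandas itself (as opposed to using Arrow-native reductions) is a design question this sketch does not settle.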

Issue Description

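df.iloc[:, df['c']] works for NumPy-backed DataFrames but raises AttributeError for pyarrow-backed ones: the bounds check in _iLocIndexer._validate_key calls arr.max() on an ArrowExtensionArray, which has no such method.

Spotted in narwhals.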
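
Until this is fixed, a possible user-side workaround is to convert the indexer to a NumPy integer array before handing it to .iloc. The helper below is a minimal sketch: iloc_columns is a hypothetical name, and it assumes the indexer contains no missing values. It is demonstrated on the NumPy-backed frame from the example; the same call on the pyarrow-backed frame should take the same path, since the key that reaches .iloc is a plain NumPy array either way.

```python
import numpy as np
import pandas as pd


def iloc_columns(df: pd.DataFrame, positions) -> pd.DataFrame:
    """Select columns by position, coercing the indexer to NumPy first.

    Hypothetical helper, not part of pandas. Assumes `positions`
    holds integers with no missing values.
    """
    # np.asarray accepts lists and Series alike, so .iloc only ever
    # sees a NumPy int64 array.
    key = np.asarray(positions, dtype="int64")
    return df.iloc[:, key]


df = pd.DataFrame({"a": [1, 2], "c": [0, 2], "d": ["c", "a"]})

# Column "c" holds the positions 0 and 2, i.e. columns "a" and "d".
result = iloc_columns(df, df["c"])
```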

Expected Behavior

   a  d
0  1  c
1  2  a

Installed Versions

INSTALLED VERSIONS

commit : 57fd502
python : 3.10.12
python-bits : 64
OS : Linux
OS-release : 5.15.167.4-microsoft-standard-WSL2
Version : #1 SMP Tue Nov 5 00:21:55 UTC 2024
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : C.UTF-8
LOCALE : en_US.UTF-8

pandas : 3.0.0.dev0+1979.g57fd50221e
numpy : 1.26.4
dateutil : 2.9.0.post0
pip : 25.0.1
Cython : 3.0.12
sphinx : 8.1.3
IPython : 8.33.0
adbc-driver-postgresql: None
adbc-driver-sqlite : None
bs4 : 4.13.3
blosc : None
bottleneck : 1.4.2
fastparquet : 2024.11.0
fsspec : 2025.2.0
html5lib : 1.1
hypothesis : 6.127.5
gcsfs : 2025.2.0
jinja2 : 3.1.5
lxml.etree : 5.3.1
matplotlib : 3.10.1
numba : 0.61.0
numexpr : 2.10.2
odfpy : None
openpyxl : 3.1.5
psycopg2 : 2.9.10
pymysql : 1.4.6
pyarrow : 19.0.1
pyreadstat : 1.2.8
pytest : 8.3.5
python-calamine : None
pytz : 2025.1
pyxlsb : 1.0.10
s3fs : 2025.2.0
scipy : 1.15.2
sqlalchemy : 2.0.38
tables : 3.10.1
tabulate : 0.9.0
xarray : 2024.9.0
xlrd : 2.0.1
xlsxwriter : 3.2.2
zstandard : 0.23.0
tzdata : 2025.1
qtpy : None
pyqt5 : None
